Introduction to Linux

Overview

These notes present a guide for learning how to use Linux. There are always exceptions to rules and I have omitted many of these and other minor details for the sake of clarity.

What is Linux?

  • Colloquially, the term Linux refers to an operating system, like Windows, macOS, or OpenBSD.
  • Operating systems typically consist of a kernel, which is the main interface to the physical hardware, and a user space, the set of utilities that users use to interact with the system.
  • Technically, Linux is the name of a kernel, and the user space is provided by other organizations (most commonly GNU).
  • The source code of Linux is free, so anybody is able to read, modify, and distribute it, unlike Windows or (most of) macOS.

Components of Linux

  • Kernel: Drivers for hardware, memory management, filesystems, and task scheduling.
    • Special filesystems to interact with the kernel are /dev, /proc, and /sys.
  • User-Space Programs (every program not the kernel).
    • Many were made by the GNU project, hence Linux is sometimes referred to as GNU/Linux.
  • Init System: systemd is the first program that runs and schedules all the daemons (programs that run in the background).
  • Network Client: We'll use NetworkManager, but there are many other options (including systemd).
  • Graphical System: X Windows (the Xorg server implementation). A more modern alternative is Wayland.
  • Desktop Environment: On Ubuntu, by default, this is Gnome 3.
    • There are Ubuntu derivatives (Kubuntu, Lubuntu, Xubuntu) whose main difference is having a different default desktop environment.
    • Window Manager: manages drawing windows for programs, often part of a desktop environment.
  • Login Manager: gdm, manages logins graphically. It is possible to use Linux without a graphical login.

What is Ubuntu?

  • Ubuntu is a Linux distribution, a collection of software containing the Linux kernel, user space utilities, and other programs.
  • Ubuntu is created by Canonical, a for-profit company that releases Ubuntu for free as part of its business.
  • There are many other Linux distributions, many of which distribute software through a package manager.
  • We use Ubuntu because it is the primary Linux distribution supported by the Robot Operating System (ROS).
  • There are pros and cons to other Linux distributions relative to Ubuntu.

Linux Distributions

There are several properties that distinguish different Linux distributions from each other.

  • Organizational structure: Corporation (e.g., Ubuntu, Red Hat), Community (e.g., Archlinux, Debian), Individual (e.g., Slackware).
  • Updates: Rolling (e.g., Archlinux), Release-Based (e.g., Ubuntu).
  • Package Manager: Declarative (e.g., NixOS, GuixSD), Simple Tarball (e.g., Slackware), Compile everything yourself (e.g., Gentoo), Debian Packages.
  • Software Allowed: Some closed-source (e.g., Archlinux, Debian), Free Software Only (e.g., Parabola).
  • Philosophy: Minimalistic (e.g., Alpine), easy to use (Ubuntu, Manjaro), for power users (e.g., Archlinux, Gentoo).

How to Learn

Experience

  • The best way to learn Linux is to use it every day.
  • Use your Linux computer for everything, even if it is at first less convenient.
  • The long-term benefits of having full control of your computer far outweigh any minor inconveniences.
  • Remember, a robot is a computer, and you are in the business of controlling robots.
  • Every time you fix something on your computer, you are getting better at robotics!

Read the Manuals

  • Sometimes, you need just a quick solution anywhere you can find it. But don't stop there:
    • Find the primary source for the solution and read it.
    • Don't be afraid to read source code to learn how a program works.
  • Knowledge gleaned from reading more deeply into problems compounds over time.
    • The more experience you have, the more quickly you can skim manuals to focus on the relevant parts.

Learn Incrementally

  • Each day, learn a new keyboard shortcut or command.
  • Integrate what you learned into your daily workflow slowly.
  • Frequently used commands will soon become second nature.

Customization versus Defaults

Everything on Linux can be customized and most parts can be substituted with alternative programs. There are advantages and disadvantages to customizing versus sticking with the defaults.

Advantages
  • Makes your system work how you want it to work.
  • The process of customization helps you learn the system.
Disadvantages
  • The farther from the mainstream you go, the less help there is.
  • Makes it harder to use other people's computers.
  • Makes it harder for others to use your computer (maybe that's actually an advantage?)

Daily usage

  • Use the command line for most everyday tasks.
    • Once it becomes comfortable, you will find it faster, more widely applicable, and easier to explain than other methods.
    • Don't use the graphical file manager. Instead learn to manage files from the command line.
  • Ability to use the command line fluently is a crucial skill for a robotics engineer.
    • Many robots don't have a nice GUI interface to work with so there is no choice.
    • Some problems on computers cause graphical displays to malfunction, which means the command line is the only choice.
    • The more frequently you do something, the more it makes sense to learn the command-line version, which is often a first step to automating the task.

The command line

People use many terms to refer to the command line, the place where you type commands to control your computer. Below is a brief overview of the technical differences between these terms, which are often used in a looser sense.

  1. You type commands into a shell which interprets and executes them. Commands are case-sensitive (in most shells).
  2. The shell runs inside a terminal (tty for short), a special "hardware" device that gets keyboard/mouse input and renders output.
  3. A tty used to be a physical hardware device (also known as a console or teletypewriter).
  4. In modern times, the terminal is usually emulated in software.
  5. Linux creates virtual consoles to serve as virtual tty devices. To switch to these try C-M-F4. Use C-M-F2 to return. (C means Control, M means "meta" which is now the Alt key, the - indicates they should be pressed simultaneously).
  6. Within your graphical system, special programs called terminal emulators mimic the behavior of a physical terminal using pseudo-terminal devices.

Bottom line:

  • Unless you are doing something specific, the terms console, terminal, shell, and command line all refer to the place where you enter text commands to the computer and see the output.

Why use the command line?

  1. Easier to automate tasks.
    • If you can accomplish something with a series of commands, you can put the commands into a script to automate them.
  2. Faster for certain tasks.
    • For example, rename all files ending in .txt to .md
  3. Composability: Can chain an arbitrary number of commands together, feeding the output of one into the input of the next.
  4. Easier to explain to others.
    • Commands can be replicated without resorting to screenshots or vague descriptions.
    • Commands tend to be more stable than GUIs, which seem to change with every update.
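As a rough sketch of the automation, speed, and composability points above, the following script renames every file ending in .txt to .md and then pipes one command into another. The directory and file names (demo_dir, notes.txt, and so on) are made up for illustration, and the for loop, mktemp, and the ${f%.txt} expansion are shell features that these notes have not introduced.

```shell
# Work in a throwaway directory so that no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch notes.txt todo.txt readme.txt

# Rename every file ending in .txt so that it ends in .md instead.
# ${f%.txt} is the file name with the trailing .txt removed.
for f in *.txt; do
    mv -- "$f" "${f%.txt}.md"
done

# Composability: the output of ls becomes the input of grep,
# which counts the names that end in .md.
md_count=$(ls | grep -c '\.md$')
echo "$md_count files now end in .md"

# Remove the throwaway directory again.
cd / && rm -r "$demo_dir"
```

Saving these lines in a file turns a one-off task into a script that can be rerun or shared, which is hard to do with a sequence of mouse clicks.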

Why use a GUI?

  1. Easier to use and discover new features.
  2. Some tasks may benefit from a mouse.
  3. Easier to present information graphically and interactively.

Developing command line skills

  1. Practice by using it every day.
  2. Look up how to do tasks via the command line even if you can just use the GUI.
  3. Be incremental. Each day, focus on adding a new task or learning a new shortcut or customization.
  4. Eventually, you can start trying to simplify tasks that you find tedious or are repeating a lot.
  5. Don't be afraid to modify configuration files.

The shell

  • Program that runs inside the terminal and lets you interact with the operating system.
  • Reads your keyboard commands, performs an action, and displays output.
  • Usually doubles as an interpreter for a scripting language.
  • This is the User Interface (UI) of the command line.

Many types of shells

  • bash - This is the most common and I will assume its use in examples.
  • dash - basic, POSIX compliant shell, invoked with /bin/sh. Used for maximum portability. Not a good shell to use interactively, but it is always there as a lowest-common-denominator.
  • zsh - Mostly bash compatible, but with many extra features.
  • tcsh - C shell.
  • fish - A "user friendly" shell.
  • nushell - A shell with advanced features for parsing and interpreting the output of commands.
  • Oil Shell - A shell designed with an easier-to-use scripting language.
  • For ROS, the most supported shells are dash, bash and zsh.

The Bash Shell

The Prompt

When you first start the shell you are presented with a prompt and are able to type commands.

  • The prompt (by default) displays <user>@<hostname>:~$ followed by a cursor, where the commands you type are displayed.
  • Every command occurs in the context of a working directory.
    • By default the prompt shows the current working directory.
    • The ~ symbol expands to /home/<username>, which is usually where the shell starts when it is first opened.
    • The pwd command (print working directory) displays the path to the current working directory.

Readline

  1. Tab completion - Press TAB to complete the rest of the command. If there is no unique match press TAB again to view possible matches.
    • There are many options to customize this behavior if you want.
    • When in doubt, press TAB.
    • Usually when TAB does not do what you expect it is because your expectations are incorrect.
  2. Bash keeps a history of your commands. Use the up arrow to scroll through previous commands.
  3. Reverse search: type part of a command, then press C-r to start searching through your history for a similar command.
  4. Bash uses the readline library to enable these keyboard shortcuts and other behavior related to processing your input.
    • readline is also used by other programs and can be customized. See info readline for more details.

Getting Help

Although the internet is useful, your system comes with built-in documentation that is quite thorough. When solving a Linux problem, it often helps to consult this built-in documentation in conjunction with any online instructions that you may find, as a way to understand what you are doing and why it may help. The advantage of using built-in help is that it is the help for the same versions of the software that are installed on your computer.

Command Line flags

  • -h or --help. These are mostly standard and often cause programs to output useful information.

Commands with Subcommands

  • Many modern Linux programs (e.g., git) follow a command subcommand pattern.
  • Often, help is a subcommand, and -h or --help can be added before or after each subcommand to get more details.

Man pages

Manual pages are the primary built-in documentation system.

  • man - Manual pages reader program.
  • man man to see a description of man.
  • man <program> Manual page for <program>.
  • man (by default) uses vi style keys:
    • / searches for text.
    • j scrolls down and k scrolls up.
    • h for more help.
    • q to quit.
  • man X topic views section X of the manual for the given topic.
    • man cd: Does not work because cd is a shell built-in (type cd can tell you this information).
      • On some systems there is a man page for cd.
    • man does not have hyperlinks, since it was invented before hyperlinks became prevalent!
      • So you have to manually follow references.
  • apropos topic searches descriptions in manpages for the topic you are looking for.

bash help

  • help - This is a shell built-in that provides help on shell built-ins.
  • help help - Information on how to use help.

info

  • The GNU project has its own documentation: info pages.
  • Most of the basic user tools on Linux are GNU projects and have extensive info pages.
  • Info pages are available using the info command. They are also usually hosted online as html.
  • The command itself works like man except the viewer is more capable, having modern innovations such as hyperlinks.

    How to navigate info pages?

    • Emacs style keys (by default).
    • ctrl-n next line.
    • ctrl-p previous line.
    • enter - follow link.
    • u up a level.
    • if you run info by itself, it describes how to do a tutorial or get help.

    Info pages tend to be more informative than man pages and easier to navigate.

Exercises

  1. Run the following commands, being sure to press the correct keys/commands to exit each program that may be launched:

    man man
    man --help # comments start with #
    help exit
    bash    # you can start one instance of a shell inside another
    cd /tmp # move to the /tmp directory
    pwd     # print the current directory
    # Use a command to exit the shell you just launched
    # What do you think the current directory is?
    # What command can you use to confirm? Try it.
    
  2. Use the up and down arrow keys to get a feel for navigating your history.
  3. Type m then press up. What command is shown?
    • Press C-a to move to the start of the line.
    • Press C-e to move to the end of the line.
    • Press C-u to clear the current line.
    • The above keys are the standard emacs navigation shortcuts.
  4. Type m then C-r a few times. What happens?
  5. Many find C-r cumbersome and prefer using the arrow keys to search through history based on what they have already typed (so the search finds every command beginning with that text). If you want to enable this behavior, run the following code to create ~/.inputrc, a configuration file that controls readline:

    echo  '"\e[A":history-search-backward' >> ~/.inputrc
    echo  '"\e[B":history-search-forward' >> ~/.inputrc
    

    Restart bash to apply the changes. If you run less ~/.inputrc you should see that the lines above are in the file.

  6. Type m then press up, what command is shown?
    • Clear the current line.
  7. Try to rerun the command man man using as few keystrokes as possible. What were they?

The Virtual Filesystem

Structure

  • The Virtual File System (VFS) is the primary organizing structure on a Linux system.
  • Collections of related data are called files and are stored in the VFS as nodes in a tree.
    • Everything on Linux is a file, including hardware devices!
    • Filenames are case sensitive.
    • Although Linux supports using spaces (' ') in file names, I recommend against this practice because the shell requires you to place the file name in quotation marks or escape the space by inserting a backslash \ character (e.g. The\ File\ With\ Spaces).
  • The base node of the VFS tree is called the root directory and is referred to as / (slash).
  • A directory contains multiple files and other directories.
    • Technically, a directory is a special type of file containing pointers to the elements it contains (everything on Linux is a file).
    • This document treats files and directories separately unless treating a directory as a file is relevant.
    • Every directory contains the directory . (dot), which refers to itself, and .. (dot-dot), which refers to the parent directory (/.. refers to /).
  • An absolute path starts with a / and contains a sequence of directories, optionally terminated with a file.
    • For example /home/user, /, /home/user/./../user/file
    • Such a path refers to a specific file or directory, regardless of context.
    • When used at the beginning of a path, the shell expands the ~ (tilde) symbol to /home/username. Thus, paths starting with a ~ are also absolute paths.
  • A relative path is a sequence of directory names, optionally terminated with a file, that does not start with a /.
    • For example usr/bin, home/user/.., ./myscript
    • The entity that is referenced by a relative path depends on context.
      • Typically relative paths are relative to the working directory which is the current directory that is active in the shell.
      • The path is resolved by appending the relative path to the path of the working directory.
      • Other programs may use relative paths where the path is relative to some other directory in the system (such as a directory the program uses for configuration).
  • Directory structure is standardized according to the Linux Filesystem Hierarchy Standard (although not every Linux distribution or program follows this standard perfectly).
    • /bin - basic commands for executing
    • /sbin - tools that should generally only be used by root
    • /lib - shared libraries and kernel modules
    • /usr/bin - files to execute
    • /usr/lib - shared libraries for most programs
    • /etc - global configuration files
    • /tmp - temporary files; nowadays it is often stored in RAM and cleared on every reboot
    • /run - temporary runtime data (some things that used to be stored in /var)
    • /var - data that programs write, log files, etc.
    • /home - users' home directories
    • /root - root's home directory
    • /media - automatically mounted filesystems (more about mounting later)
    • /mnt - the area for mount points
    • /sys - access information from the kernel
    • /proc - enables processes and kernel to interact
  • Ubuntu and Debian have their own additions to the File System Hierarchy:
    • /snap - used for Ubuntu snaps, not part of the standard.
    • /bin.usr-is-merged, /sbin.usr-is-merged, /lib.usr-is-merged:
      • On modern systems /bin, /sbin, and /lib point to /usr/bin, /usr/sbin, and /usr/lib respectively.
      • The /{bin,sbin,lib}.usr-is-merged entries are present on Debian-derived systems (like Ubuntu) to indicate that the system has done this.
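The rules for absolute and relative paths described above can be tried out with a few commands. This is only a sketch: it assumes the standard /usr/bin directory exists, and the variable names (relative_result, dotted_result, home_path) are made up for illustration.

```shell
cd /usr || exit 1                # absolute path: means the same from anywhere

cd bin || exit 1                 # relative path: resolved against /usr
relative_result=$(pwd)
echo "$relative_result"          # /usr/bin

cd /usr/bin/./../bin || exit 1   # . stays in place, .. goes up one level
dotted_result=$(pwd)
echo "$dotted_result"            # /usr/bin

home_path=~                      # the shell replaces ~ with your home directory
echo "$home_path"
```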

Navigation

Here are some commands for traversing the VFS from the command line.

  • pwd - shows the current working directory.
  • ls - List files in a directory.
    • By default, ls does not show files that start with a ., causing these files to be hidden.
    • Use ls -a to show hidden files.
    • Use ls -l to show more information about the files.
      • ls | less lets you page through the listing (more on this later).
      • ls b* uses globbing to list all files starting with b (see man 7 glob).
  • cd - Change the working directory.
    • cd - (cd followed by a hyphen) changes back to the previous working directory.
    • cd (with no arguments) changes to the home directory.

Example

Here is an example that uses some commands to explore the filesystem.

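pwd          # print the working directory
ls           # list the visible files there
ls -a        # also show hidden files (names starting with .)
ls -l        # long listing with owners, sizes, and permissions
cd /         # change to the root directory
ls
cd /usr/bin  # change to a directory using an absolute path
ls
cd           # return to the home directory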

Manipulation

Many commands modify the filesystem structure.

  • mkdir creates a new directory.
  • rmdir removes a directory (the directory must be empty).
  • cp copies a file. Use cp -i to warn you if copying will overwrite an existing file.
  • mv moves a file and is also used for quick renaming.
    • Use mv -i to warn you if moving will overwrite a file.
    • Use mv -n to prevent you from overwriting a file.
    • mv can be dangerous: if you move one file over another, the overwritten file is lost forever.
  • rm deletes files permanently (use with caution).
    • rm -r deletes directories recursively (use with extreme caution).
    • rm -rf deletes directories recursively and does not prompt for confirmation before deleting, even for write-protected files.
    • trash (install with sudo apt install trash-cli) provides an alternative to rm that allows you to recover files deleted accidentally.
  • ln creates links (like a shortcut in Windows but better).
    • To create a symbolic link (symlink) use ln -s.
    • This command creates a file that points to another file.
    • Opening the symlink is (mostly) equivalent to opening the file it points to.
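The commands above can be tried safely inside a throwaway directory. The following is a minimal sketch; the names project, docs, a.txt, and shortcut are hypothetical, and mktemp, echo with >, cat, and readlink are used here without having been introduced in these notes.

```shell
# Build everything inside a throwaway directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

mkdir project
mkdir project/docs
echo "hello" > project/docs/a.txt

cp -i project/docs/a.txt project/docs/b.txt   # copy; -i asks before overwriting
mv -n project/docs/b.txt project/docs/c.txt   # rename; -n never overwrites

ln -s project/docs/a.txt shortcut             # create a symlink named shortcut
link_target=$(readlink shortcut)              # the path the symlink points to
link_content=$(cat shortcut)                  # reading the link reads the target
echo "$link_target contains: $link_content"

# rmdir only removes empty directories, so this attempt is refused.
rmdir project/docs 2>/dev/null || echo "rmdir refused: directory is not empty"

# Remove the throwaway directory again.
cd / && rm -r "$demo_dir"
```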

Mounts

  • Mounts are used to make a set of files appear somewhere in the VFS.
  • The data that compose files are physically organized on their storage medium (e.g., a hard drive) by a filesystem such as:
    • ext4: the basic journaling filesystem for Linux, and most likely the one your hard drive uses.
    • NTFS (Windows partitions).
    • APFS (macOS).
    • FAT32 (small usb drives or sd cards).
    • exFat (large usb drives or sd cards).
  • Filesystems usually reside on a storage device such as a hard drive or CD-ROM.
    • In general these devices on Linux are called block devices.
  • A block device can be divided into one or more partitions.
    • Each partition is a section of the device that can contain a different filesystem.
  • Each block device and each partition is treated as a separate device in Linux and has its own file under /dev.
  • Each partition can be mounted to a subdirectory under /.
    • Drivers in the Linux kernel then display the contents of the device as a familiar directory system.
  • The /etc/fstab file configures mounts (e.g., mounting your hard drive to / on boot).
  • Use lsblk to see the names of devices that can store filesystems.
  • Loopback devices (/dev/loop*) enable Linux to mount a file containing a disk image (i.e., a bit-for-bit copy of a filesystem) and pretend it is a physical device.
  • These devices are just files!
    • You can read the raw data from them (that is the bytes that make up the filesystem) and write data to them (probably corrupting the disk).
  • The mount -t <filesystem> <device> <location> command mounts a device with a given filesystem to a given location.
    • The filesystem can often be omitted as it is automatically detected.
  • umount <device|location> unmounts a device or location.
  • df -h Shows free disk space by device.
  • du -h Shows disk space used by individual files or directories.
  • On a modern Linux system, the desktop environment (e.g., GNOME on Ubuntu) usually mounts disks automatically when they are attached to your computer (e.g., when inserting a USB flash drive).
    • The mounts appear under /media/$USER (where $USER is your username).

Exercises

When doing these exercises, use TAB completion as much as possible.

  1. Create a hidden directory within your home directory to use as the trash.
    • These notes refer to the directory as trash, but you need to name it so that it is hidden.
  2. Using ls, verify that the directory exists and is hidden.
  3. Create a directory /home/$USER/hackathon/ex1 by typing as few characters as possible.
    • Hint: this step can be done with one mkdir command…
  4. Change to the root directory / and list the files there.
  5. List the files in your home directory without changing your current directory.
  6. List the files in /usr/bin.
  7. Change your current directory to /usr/bin.
  8. List the files in a directory in reverse order.
    • Hint: use the manual for ls and search for the word "reverse" by pressing / in man.
  9. Change the working directory to ~/hackathon/ and verify that the working directory has changed.
  10. Rename ex1 to example1, using relative paths.
  11. Copy a few files from /usr/bin/ to the example1 directory.
  12. Move the example1 directory to your trash directory.
  13. Create a symlink to your trash directory under ~/hackathon and use it to cd into the trash directory.
    • What is the output of pwd?
    • What is the output of ls ..?
  14. Use rm to permanently and irrevocably delete the contents of trash without deleting the directory itself.
  15. Use a command other than rm to delete the trash directory.
  16. Use a command to determine the name of the file corresponding to the physical device that contains the data for the root file system.
    • What command did you use and what is the name of the device?

Users, Groups, Permissions

Concepts

  • In Linux, everything is a file (including directories and devices).
  • Every file is owned by one user and one group.
  • Every user belongs to one or more groups.
  • Permissions control what operations can be performed on a file, based on the user who is trying to perform the action and the groups that user belongs to.
  • The file owner controls the file permissions.

Utilities

  • whoami retrieves your username
  • groups shows what groups you are in
  • users shows what users are logged in
  • w shows information about logged in users
  • id shows the numbers corresponding to users and groups
    • Every user has a unique identifier called a UID
    • Every group has a unique identifier called a GID
  • ls -l shows who owns what and the permissions
  • chown user:group file Changes the user and group that owns file

Linux file permissions

There are (mostly) three types of permissions:

  • r - read: enables you to view the contents of a file or directory
  • w - write: enables you to modify the contents of a file or directory
  • x - execute: enables you to run the file or enter into a directory

Permissions apply to users, groups, and everyone else (other).

User  Group  Other
rwx   rwx    rwx

  • Think of each set of permissions as three bits (000), one each for read, write, and execute.
  • Each set of permissions is represented by one octal (base 8) digit.
  • If a bit is 1 the corresponding permission is allowed, otherwise it is denied.

Use chmod to change permissions on files.

  • You can provide an octal number to specify all permissions at once:
  • chmod 755 file (common permissions for executables, rwxr-xr-x)
  • chmod 644 file (universally readable file that only the owning user can write, rw-r--r--)

Or modify using letter codes for (u)ser, (g)roup, (o)ther, and (a)ll:

  • chmod a+x file (adds the execute permission for everyone)
  • chmod a-x file (removes the execute permission from everyone)
  • Using letters with chmod is advantageous when you only want to change some permissions while leaving others unmodified.
  • Using octals with chmod is advantageous when you want to specify all the permissions of the file at one time.
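To make the two styles concrete, here is a small sketch that applies both to a scratch file and prints the result after each step. It assumes the stat command from GNU coreutils (as shipped with Ubuntu), whose -c option prints the octal and the letter form of the permissions; the variable names are made up for illustration.

```shell
demo_file=$(mktemp)        # a scratch file to experiment on

# Octal: each digit is the sum of r=4, w=2, x=1 for user, group, and other.
# 7 = 4+2+1 (rwx) and 5 = 4+1 (r-x), so 755 means rwxr-xr-x.
chmod 755 "$demo_file"
after_octal=$(stat -c '%a %A' "$demo_file")
echo "$after_octal"        # 755 -rwxr-xr-x

# Letters: change only the named bits and leave everything else alone.
chmod a-x "$demo_file"     # remove execute for user, group, and other
after_letters=$(stat -c '%a %A' "$demo_file")
echo "$after_letters"      # 644 -rw-r--r--

rm "$demo_file"
```

Note that a-x turned 755 into 644 without the read and write bits having to be restated, which is the advantage of the letter form.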

Advanced Permissions

There are more advanced permissions than discussed above, but they are rarely encountered in robotics.

The following three variants of executable bits (corresponding to user, group, and other respectively) generally have significant security implications and should only be used after seriously considering various threat scenarios.

  • setuid: if set makes an executable run with the privileges of the user who owns the file.
    • Files that have this bit and the user execute bit set have an s where the x usually is when displayed with ls -l (e.g., /usr/bin/sudo)
    • If a file owned by root is setuid it executes with root permissions, regardless of who runs it (very dangerous).
  • setgid: if set makes an executable run with privileges of the group who owns the file.
    • If set on a directory, any files created within the directory are owned by the group of the directory instead of the primary group of the user who created them.
  • "sticky bit": when set on a directory, files in the directory can only be removed by their owner, the directory's owner, or root (e.g., /tmp).
    • The sticky bit is displayed as a t in place of the execute permission for other.
    • Normally, permission to remove files from a directory is related to the ownership and permissions of the directory, not the file itself. The sticky bit changes this relationship.
  • Access Control Lists (ACLs) enable fine-grained permissions (such as enabling specific users to access a directory). These mainly exist to overcome limitations imposed by files only being able to be owned by a single user and group.
  • Extended Attributes (xattr) provide extra attributes to files and directories (e.g., ACLs are implemented with them).

Directories

Directories are files. They have owners and permissions too.

  • In the output of ls -l, a d in the first permission column indicates that a file is a directory (e.g. drwxr-xr-x)
  • To see permissions for a specific directory do ls -l -d <dirname>.
    • The -d flag makes ls list the directory itself instead of its contents.
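As a quick illustration, the following sketch inspects /tmp, which exists on essentially every Linux system. The variable name dir_line is made up, and the exact owner, size, and date in the output will differ from system to system.

```shell
# Without -d, ls would list the files inside /tmp.
# With -d, ls describes /tmp itself; the leading d marks it as a directory.
dir_line=$(ls -l -d /tmp)
echo "$dir_line"
```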

Symlinks

  • A symlink is a file that points to another file (called the target).
  • The symlink acts as if it were the target file.
  • Create a symlink with ln -s TARGET LINK_NAME
  • Symlinks have an l in the first column of ls -l output (e.g., lrwxrwxrwx).
    • Symlinks do not have their own permissions. All chmod operations on a symlink happen to the target file.

root

  • root is the all powerful user. root can do anything. Be careful when you run with root permissions.
    • By default, root can read and write to any file in the system, regardless of ownership or permissions
    • root can change ownership of any file, and become any user without needing a password.
    • root cannot execute files that do not have the execute permission. However, root can always set this permission bit.
  • sudo lets you temporarily execute commands as root by entering your own user password.
    • This command lets you administer your system without directly becoming root.
    • The root user controls which users have access to sudo.
  • su lets you change to a different user. By default that user is root, but you will need the root password.
    • On Ubuntu, you do not know the root password. (This is a choice that Ubuntu has made; other distributions work differently).
    • You can use sudo to set a root password if you want (arguably this is less secure and not a good idea).

Security Notes

The concept of permissions is enforced by the running system, which has the following implications:

  1. If the data-storage device that holds the files is not encrypted, anybody with physical access to the device can put it into their own machine (where they have root access) and read all the files.
  2. If the hard-drive is not encrypted, anybody with physical access to the computer can boot into the machine as the root user, without needing to know the root password.

Exercises

  1. What are the permissions for /usr/bin/ls?
  2. What are the permissions of your home directory?
  3. Create a new file. What are its permissions?
  4. Remove the write permission from the file you just created and try to modify it. What happens?
  5. Create a new directory and a file in that directory. Remove the execute permission. What happens when you try to cd into it?
  6. Remove the read permission from the directory. What happens when you ls the directory?
  7. Restore the execute permission and try to cd into the directory. What happens?
  8. Set permissions so that you can list the files in the directory, cd into it, but not add or remove files from it.
  9. Determine the octal permission numbers for the following common permissions:
    1. rwxr-xr-x (usually used for scripts)
    2. rw-r--r-- (a file you want to share with others but that only you modify)
    3. rw-rw-rw- (everyone can read and write the file)

Process Management

Processes

  • The basic unit of execution on Linux is called a process. Most programs that you run consist of a single process but a few (such as firefox) consist of multiple coordinated processes.
  • Each process has a unique process identifier (PID), that can be used to interact with it.
  • You can launch processes from the command line: in fact when you ran many of the commands in the previous section you were actually causing the Linux kernel to load and execute a process.
  • Processes are organized hierarchically into a tree structure. Processes on Linux can spawn child processes. By default the processes you run from the shell are child processes of the shell (and so the shell is their parent process). When the parent process terminates, its children typically also stop running.
  • Processes can also be launched as daemons. These are background processes not associated with a shell and will continue running even when the shell exits.
  • For example, when you run ls, this command launches a process that lists files and then terminates.
  • The shell, bash, is also a process. Some shell built-ins (such as cd) are a part of bash and don't launch separate processes.

Viewing Processes

  • ps reports information about processes.
    • ps by default shows processes run from your current shell.
    • ps aux shows all processes by all users regardless of whether they are associated with a shell.
      • The a adds processes from all users not just your user.
      • The u prints the username and some other information next to the processes.
      • The x adds processes not associated with the shell.
  • pstree shows the process tree.
    • If process A launches process B then A is the parent of B and B is the child of A.
  • top provides interactive process monitoring in the terminal. It includes CPU and memory usage reports.
  • htop (install with sudo apt install htop) is a fancier version of top with an easier-to-interpret display and easier to use interface.
  • free -h shows the amount of RAM and swap memory that is used and free on your system, in a human-readable format.
  • pgrep searches for processes that meet certain criteria (such as name).

Signals

  • The user can communicate with processes using signals.
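  • Each signal invokes a different behavior (see man 7 signal).
    • Each signal has a name and a number (some numbers depend on the processor architecture).
    • Four important signals are:
      1. SIGKILL Tell the process to terminate immediately. The process cannot catch or ignore this signal, so it gets no chance to clean up.
      2. SIGTSTP Tell the process to stop executing code but remain in memory.
      3. SIGCONT Tell the process to resume.
      4. SIGQUIT Tell the process to quit and, by default, dump core. Unlike SIGKILL, the process can catch or ignore it.
  • kill sends a signal (by default SIGTERM, which asks the process to exit) to a process with a given PID.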
    • kill -KILL 1435 will send SIGKILL to PID 1435.
    • kill -9 1435 will also send SIGKILL to PID 1435, since (on x86) 9 is the number for SIGKILL.
  • pkill sends a signal to processes matching certain criteria.
    • You probably want to pgrep before pkill to see what it's going to do.
  • killall sends a signal to processes that exactly match a given name.
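
The following sketch shows kill being used to suspend, resume, and finally kill a process by PID. The sleep command simply stands in for any long-running program, and the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Start a long-running process in the background and remember its PID.
sleep 60 &
pid=$!

kill -TSTP "$pid"   # ask the process to suspend (it stays in memory)
kill -CONT "$pid"   # let it resume
kill -KILL "$pid"   # terminate it immediately; cannot be caught or ignored

# When a process is killed by a signal, wait reports 128 + the signal number.
status=0
wait "$pid" 2>/dev/null || status=$?
echo "exit status: $status"

# kill -l translates a signal number back into its name.
signal_name=$(kill -l 9)
echo "signal 9 is SIG$signal_name"
```

Checking the exit status this way is a quick test for whether a program exited on its own or was terminated by a signal.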

Jobs

Jobs are processes that you run from the shell. The shell keeps track of these processes and helps you manage them, which is often more convenient than managing processes based on PIDs. As jobs are a shell-specific concept, this information primarily pertains to bash (although other shells have similar functionality).

Normally, when you run a command, it starts the process in the foreground. The shell will not offer a prompt to read new keyboard input until the process terminates.

You can also start a process in the background by appending an ampersand & to the command (e.g. command &). The shell prints the job number and PID of the process for your reference and then provides another prompt, letting you continue running commands as the process you started runs.

Here are some commands and keyboard sequences that relate to jobs. Jobs are the primary way that a user can run multiple programs simultaneously from a single shell.

  • jobs lists the jobs. Each job has a number. The first job is 1 and they increment. Refer to the job as %NUMBER (e.g., %1) in commands.
  • Pressing C-c will send SIGINT to the current foreground process, which should clean up and terminate.
  • Pressing C-\ will send SIGQUIT to the current foreground process and tell it to quit immediately (use sparingly: processes can sometimes ignore SIGINT, but in most circumstances SIGQUIT will get through; however, the process will forgo its cleanup routines and may dump core).
  • Pressing C-z will send SIGTSTP to the current foreground process, which will suspend it (it remains in RAM but does not run).
  • bg %<JOB> will resume <JOB> in the background (e.g., bg %1).
  • fg %<JOB> will bring <JOB> into the foreground and resume it if it had been suspended.
  • disown %<JOB> will detach the job from the shell (so it will not be terminated when the shell closes).
  • You can also use %<JOB> with kill to send a signal to the specified job.
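
A small sketch of job numbers in action. The key sequences and the fg and bg commands need an interactive shell, so this script only shows starting background jobs, listing them, and signalling them by job number; sleep is a stand-in for a real program.

```shell
#!/usr/bin/env bash
# Start two background jobs; the shell numbers them %1 and %2.
sleep 30 &
first_pid=$!
sleep 31 &
second_pid=$!

# List the jobs the shell is tracking.
jobs

# Signal jobs by job number instead of PID (kill sends SIGTERM by default).
kill %1
kill %2

# Collect each exit status: 128 + 15 (the number of SIGTERM) = 143.
first_status=0
wait "$first_pid" 2>/dev/null || first_status=$?
second_status=0
wait "$second_pid" 2>/dev/null || second_status=$?
echo "job 1 exited with $first_status, job 2 with $second_status"
```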

Exercises

  1. Run gedit from the terminal (sudo apt install gedit to install). Note that you cannot run more commands in terminal until gedit exits.
  2. Suspend the process with C-z. What happens to gedit?
  3. Use jobs to see the processes you have started in the shell.
  4. Resume gedit in the background. What happens to gedit?
  5. Bring gedit to the foreground.
  6. Quit gedit by typing the appropriate keys in the terminal.
  7. Launch gedit directly in the background.
  8. Close gedit with one command from the terminal.
  9. Run gedit in the background again, then exit the shell by closing the terminal window.
  10. Run gedit in the background again, then exit the shell by typing exit

The SIGHUP Signal

Background jobs behave differently, depending on how you exit bash.

  1. When you quit the terminal, the terminal emulator sends SIGHUP to bash.
  2. bash, in turn, sends SIGHUP to the jobs it was running, which usually causes them to terminate.
  3. If, however, you exit bash by using the exit built-in or by sending bash the EOF character C-d, SIGHUP is not sent, so backgrounded processes continue to run.
  4. You can prevent bash from sending SIGHUP upon terminal closure using disown.

Computer Maintenance

Package management

  1. The main package management tool on Ubuntu is apt.
  2. You will see references to apt-get everywhere: this command still works but is an older tool.
    • apt provides a mostly compatible but nicer interface to apt-get.
    • For example, by default apt includes status bars and other information that is useful.
    • apt-get still works if you want to use it and is preferred for scripts.
  3. Any apt command that will change your system must be run as root (i.e., use sudo).
  4. The --dry-run option lets you see what apt will do without actually doing it.
    • Be careful, as you can seriously damage your system with apt, especially if you start using exotic flags.
    • Linux distributions like Guix and NixOS are designed around package managers that let you deterministically roll-back changes and were developed partially as a reaction to problems you can run into with traditional package managers like apt.

Important Apt Commands

apt update
Downloads the latest package descriptions from servers configured in /etc/apt/sources.list and /etc/apt/sources.list.d. Use this command before running others.
apt upgrade
Upgrades all packages to the latest version, but won't delete any packages. sudo apt full-upgrade (or, equivalently, sudo apt dist-upgrade) will also remove packages when that is required to complete the upgrade.
apt install pkg-name
Install a new package, retrieved from servers specified in /etc/apt/sources.list and /etc/apt/sources.list.d/.
apt install $path_to_deb_file
Install a new package from a .deb file.
apt remove pkg-name
Remove a package but keep configuration files.
apt purge pkg-name
Remove a package and all configuration files.
apt search pkgname
Find packages that have the word pkgname in them.
apt autoremove
Remove unneeded dependencies.
apt autopurge
Remove unneeded dependencies and their configuration files.
apt list
View packages that are available. apt list --installed to see installed packages

Third-Party Packages

You may need a program or library that does not come with Ubuntu. Reasons include:

  1. You need a different version than Ubuntu includes.
  2. The package is not open-source (e.g., zoom).

Sometimes you may need to install a package from its source code:

  • Each package has different build methods.
  • Two common build methods are cmake and autotools.

Language-specific package managers

  • Many programming languages have their own package managers.
  • Python has pip (invoked as pip3 on Ubuntu).
  • I recommend using virtual environments for Python projects that do not use ROS 2.
  • If you want a Python program available on your system, my first preference is to install from apt or use pip in a virtual environment.
    • The apt packages for Python libraries are named python3-packagename.

On Upgrading

  1. You should primarily use sudo apt upgrade to upgrade your computer, rather than sudo apt dist-upgrade.
  2. The dist-upgrade command can remove packages from your computer and thus is more dangerous than upgrade.
  3. Only use dist-upgrade after carefully inspecting what it will do and understanding the consequences.
  4. In this case we need to use dist-upgrade so that new nvidia drivers and tools can be installed.
  5. Unlike on Windows, on Linux you essentially never need to reboot after an upgrade.
    • People pride themselves on how long their computers remain on without rebooting.
    • There are many situations where either rebooting or running a bunch of commands will fix a problem.
      • Oftentimes (especially on a laptop) it is easier to just reboot anyway.
    • I recommend rebooting after a kernel update.
      • Technically a kernel can be reloaded without rebooting, but this is usually not necessary and somewhat difficult to set up.

Initialization System

The init system controls what happens when you start your computer and what tasks run in the background.

  1. Systemd is responsible for running tasks automatically when your computer starts.
  2. It also does a lot more (such as logging with journald).
  3. Use systemctl to control daemons (background processes) running on your computer.
  4. systemctl start <service> starts a service.
  5. systemctl stop <service> stops a service.
  6. systemctl enable <service> makes the service run during the boot process.
  7. systemctl disable <service> prevents the service from running during the boot process.
  8. systemctl reboot [--firmware-setup] will reboot your computer (--firmware-setup enters the UEFI screen on reboot).
  9. systemctl poweroff will turn off your computer.
  10. systemctl isolate <target> switches to a different target, which is systemd's equivalent of a run level (e.g., single-user, multi-user, graphical).
  11. systemd also has timers, which can run a task at a given time (like cron).
  12. systemctl --user <command> lets you manage services as an ordinary user.

Other Init Systems

  • Most Linux systems currently use systemd.
  • Prior to systemd most Linux systems used either sysvinit or bsd init. These init systems were based on shell scripts and some Linux distributions still use them today.
  • Other Linux init systems include openrc and GNU Shepherd

System Logs

  1. Examining log files is one of the first steps to take when something goes wrong.
  2. If you are unsure of something and are asking someone for help, it often makes sense to show them log information.
  3. Logs are stored in /var/log.
  4. dmesg shows logs related to the kernel.
  5. Systemd has a built-in logging system called journald.
  6. journalctl -u <service> returns log entries for the given service.
  7. journalctl -x shows extra explanatory information which can be helpful.
  8. You can use journalctl to view other types of logs as well, see man journalctl.

Environment Variables

  • Environment variables store information that can be accessed by processes.
  • Environment variables can be local to the process or marked for export, which makes them inherited by the child process.
  • printenv displays all currently defined variables.
  • printenv VARNAME displays the value of VARNAME.
  • echo $var prints the value of var (echo generally prints data to the screen).
  • env with no arguments also displays the currently defined environment variables; it can additionally run a command in a modified environment.
  • To set environment variables in bash:
    • VARNAME=value sets the value within the shell but processes launched from the shell won't see it.
    • export VARNAME makes the variable visible to processes launched from the shell.
    • export VARNAME=value sets the variable and exports it.
    • VARNAME=val runprogram exports the variable to the program run on that line only; the shell's own value of VARNAME is not changed.
    • unset VARNAME clears the environment variable.
  • To access the value of an environment variable prefix it with a $.
    • By default, the value of the variable is substituted wherever you write $var.
    • You can also use the syntax ${var}, which can separate var from subsequent characters.
    • You can use \$ to get a literal dollar sign.
    • Single quotes ('') also suppress variable substitution.
  • Command substitution: $(command) will be replaced with the output from command. Therefore you can capture a command output in a variable by doing var=$(command), e.g., var=$(echo "hello"), var will hold "hello".
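
The rules above can be tried out with a short script. This is only a sketch: the variable names (demo_greeting, demo_mood, and so on) are made up for the example, and sh -c is used simply as a convenient child process.

```shell
#!/usr/bin/env bash
# A plain assignment is visible in this shell but not in child processes.
# ${name:-unset} expands to the text "unset" when the variable has no value.
demo_greeting="hello"
child_sees=$(sh -c 'echo "${demo_greeting:-unset}"')
echo "before export, child sees: $child_sees"

# After export, child processes inherit the variable.
export demo_greeting
child_sees_exported=$(sh -c 'echo "${demo_greeting:-unset}"')
echo "after export, child sees:  $child_sees_exported"

# VAR=value command sets the variable for that one command only.
one_shot=$(demo_mood=happy sh -c 'echo "$demo_mood"')
echo "one-shot value: $one_shot, afterwards in the shell: ${demo_mood:-unset}"

# Braces separate the name from the text that follows it;
# single quotes prevent substitution altogether.
with_braces="${demo_greeting}_world"
literal='$demo_greeting'
echo "$with_braces $literal"

# Command substitution captures the output of a command.
captured=$(echo "hello")
echo "captured: $captured"
```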

The PATH environment variable

When you type a command, bash must be able to find it.

  1. If the command is a builtin, bash executes the appropriate code.
  2. If not, bash searches in the directories specified by PATH.
  3. You can also directly specify the path to an executable with a relative or absolute path (e.g., /usr/bin/python).
  4. Adding directories to the path is sometimes necessary when installing certain programs.
  5. To append to the path use PATH=$PATH:newdir. (Exercise: Why does this work?)
  6. The current directory is NOT on the PATH (due to security issues).
    • To run an executable in the current directory you must specify the path to it.
    • For an executable in the current directory, use ./executable.
  7. To see what file bash will execute use which <executable>.
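
Here is a hedged sketch of the lookup in action. It creates a throwaway directory containing a tiny script (the names demo_dir and hello-demo are hypothetical), shows that the command is not found until the directory is on PATH, and uses the shell builtin command -v, which reports the file that will run in much the same way as which.

```shell
#!/usr/bin/env bash
# Create a scratch directory holding a tiny executable script.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from demo"\n' > "$demo_dir/hello-demo"
chmod 755 "$demo_dir/hello-demo"

# The directory is not on PATH yet, so the command cannot be found by name.
if command -v hello-demo > /dev/null; then
    found_before="yes"
else
    found_before="no"
fi

# Append the directory to PATH.
PATH=$PATH:$demo_dir

# Now the command is found; command -v prints the file that will be run.
resolved=$(command -v hello-demo)
output=$(hello-demo)
echo "found before: $found_before"
echo "resolves to:  $resolved"
echo "output:       $output"

# Remove the scratch directory.
rm -r "$demo_dir"
```

The change to PATH only lasts for the current shell session; to make it permanent it has to be set in a startup file.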

Dotfiles

Dotfiles refer to the hidden files that programs place in your home directory and (more recently) under ~/.config. These files contain user-specific configuration for most programs on Linux, whereas global configuration is stored in /etc. Using dotfiles, you can customize most aspects of your system.

Startup

There are several files that run bash commands whenever you log into the shell. The main file you use to control shell settings is called ~/.bashrc.

Use ~/.bashrc to:

  1. Add directories to your path, using export.
  2. Add command aliases: an alias is a shortcut that gets substituted with a larger command
    • alias will display all the current aliases
    • alias spdir="cd /home/user/my/dir/that/I/always/goto" will let you type spdir to execute the specified cd.
    • To use aliases with 'sudo', add alias sudo='sudo ' to the .bashrc.
  3. Set other environment variables, for example, PS1 controls what you see on the command prompt.
  4. Editing and customizing your .bashrc is a great way to make your computer fit you better and to learn a bit about shell scripting.

Ubuntu, by default, provides you with a .bashrc. This .bashrc loads aliases from a file called ~/.bash_aliases, if it exists. This behavior is not standard; it is something that Ubuntu adds.
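
As an illustration, a few lines like the following could be added to ~/.bashrc. This is a sketch rather than a recommended configuration: the ~/bin directory, the ll alias, and the prompt format are assumptions chosen for the example.

```shell
# Example lines for ~/.bashrc (directory and alias names are hypothetical).

# Add a personal script directory to the search path.
export PATH="$PATH:$HOME/bin"

# Shortcuts for commands you type often.
alias ll='ls -l'
alias spdir='cd /home/user/my/dir/that/I/always/goto'

# Let aliases work after sudo (the trailing space makes bash check
# the next word for an alias as well).
alias sudo='sudo '

# Prompt of the form user@host:current-directory$
PS1='\u@\h:\w\$ '
```

After editing the file, run source ~/.bashrc (or open a new terminal) for the changes to take effect.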

Managing Dotfiles

When you put a lot of work into your dotfiles, it can be helpful to track changes, deploy to multiple machines, and share with others. This Arch Linux wiki page provides a lot of useful information on tools and strategies for managing dotfiles.

Exercises

  1. Print the value of the HOME environment variable.
  2. cd uses the value of HOME to determine where to go when no arguments are provided.
    1. Using the HOME environment variable, and without passing any arguments to cd, use a single command to change your directory to /usr/bin
    2. cd back to your home directory.
  3. Display your path.
  4. Save PATH by running export opath=$PATH.
  5. Display your path.
  6. Copy /usr/bin/free to your home directory.
  7. Run free. Which free are you using? (The one installed on the system or the one in your home directory?)
  8. Append your home directory to PATH. Run free. Which free are you using now?
  9. Clear PATH so that it is empty.
    • Verify that it is empty using echo.
    • Try running free. What happens? Why?
    • Try running ls. What happens? Why?
    • Run the free in your home directory by specifying the relative path to it.
    • Run the free in /usr/bin by specifying the full path to it
    • Move ~/free to /tmp (files in /tmp are temporary, so this copy will typically be deleted on your next reboot).
    • Restore the PATH (export PATH=$opath) and verify that ls works
    • Bonus: Why did echo work and ls did not when you unset the PATH? Hint: use the type command to find out.
  10. Install a package with apt. Some programs you may wish to install:
    • Picture editing: gimp (Free Photoshop), inkscape (Free Adobe Illustrator), imagemagick (Command line image manipulation).
    • Media: mplayer (media player), vlc (media player), ffmpeg (do basically anything with video files, but from the command line).
    • ripgrep, a utility for a full-text search of files via the command-line.
  11. View what you have done using apt by examining the apt log file. less /var/log/apt/term.log.

Intermediate Bash

Bash is not just an interface to your computer, it is also a complete programming language that is good for automating and combining tasks.

Streams, pipes, and redirects

Streams, pipes, and redirects constitute the plumbing of bash. They enable you to easily change the input source and output destination of programs and to chain programs together. The Unix philosophy is to have many programs that do one thing well and allow the user to chain these programs together.

Streams

Every console program automatically opens three special files, called streams:

  • stdin - standard input stream: this is the file that sends keyboard input to a program.
  • stdout - standard output stream: this sends characters from a program to the terminal for displaying.
  • stderr - standard error stream: like stdout but used to separate error messages from regular messages.

These special files are used by the program to obtain keyboard input from and output text to the user.

These streams can be redirected to files on disk, allowing you to directly save the output of a program to disk or feed a program input from disk. This technique works for any program, even if it does not explicitly provide save/restore functionality.

  • To redirect stdout to a file: <cmd> > outfile, where <cmd> is the command. For example echo "hello" > outfile will cause outfile to contain the text "hello\n".
    • Redirecting to output files overwrites them so be careful.
    • Setting the noclobber option (set -o noclobber) will prevent this behavior and require the use of >| to overwrite files
    • To unset noclobber use set +o noclobber.
    • If you set and get used to this option, be careful as most systems do not have it enabled for historical reasons.
  • To redirect stdout and have it append to a file: <cmd> >> file.
  • To redirect stdin use <cmd> < file.
  • To redirect stderr use <cmd> 2> file (this works because stderr has file descriptor 2). There must be no space between the 2 and the >.
  • To redirect stderr to stdout use <cmd> 2>&1. The &1 indicates a redirection to file descriptor 1, which is stdout.
  • Redirect to /dev/null to suppress the output (/dev/null is a file that discards anything written to it).
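
The redirections above can be exercised with a short script. This is a minimal sketch: the report function and the file names are made up for the example, and everything is written to a scratch directory so that none of your own files are touched.

```shell
#!/usr/bin/env bash
# Scratch directory so the example does not touch your own files.
work_dir=$(mktemp -d)
out="$work_dir/outfile"

# A helper that writes one line to stdout and one line to stderr
# (>&2 sends the output of echo to file descriptor 2).
report() {
    echo "normal message"
    echo "error message" >&2
}

echo "hello" > "$out"            # redirect stdout, overwriting the file
echo "again" >> "$out"           # append instead of overwriting
line_count=$(wc -l < "$out")     # feed the file to wc on stdin

report > "$work_dir/stdout.txt" 2> "$work_dir/stderr.txt"   # separate files
report > "$work_dir/both.txt" 2>&1                          # stderr joins stdout
report > /dev/null 2>&1                                     # discard everything

# With noclobber set, > refuses to overwrite an existing file; >| forces it.
set -o noclobber
if { echo "clobbered" > "$out"; } 2> /dev/null; then
    overwrite="allowed"
else
    overwrite="refused"
fi
echo "forced" >| "$out"
set +o noclobber

stdout_text=$(cat "$work_dir/stdout.txt")
stderr_text=$(cat "$work_dir/stderr.txt")
both_lines=$(wc -l < "$work_dir/both.txt")
final_text=$(cat "$out")
echo "lines=$line_count overwrite=$overwrite final=$final_text"
rm -r "$work_dir"
```

Note the order in report > file 2>&1: stdout is pointed at the file first, and stderr is then pointed at wherever stdout currently goes.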

Pipes

A pipe connects the stdout of one program to the stdin of another program. Thus, the output of one program acts as the input to another program.

  • Create an anonymous pipe using the syntax <cmd1> | <cmd2>, where <cmd1> and <cmd2> are commands. The output of <cmd1> will be used as the input for <cmd2>.
  • Create a named pipe using mkfifo. Processes can open the file for writing or reading. The kernel forwards any data written to the pipe to the processes reading from the pipe.
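
A brief sketch of both kinds of pipe. The file name demo.fifo and the sample text are arbitrary choices for the example.

```shell
#!/usr/bin/env bash
# Anonymous pipe: the stdout of printf becomes the stdin of sort.
sorted=$(printf 'pear\napple\nfig\n' | sort)
echo "$sorted"

# Named pipe: mkfifo creates a special file that separate processes can share.
fifo_dir=$(mktemp -d)
mkfifo "$fifo_dir/demo.fifo"

# The writer runs in the background because opening a pipe for writing
# blocks until a reader opens the other end.
echo "sent through the named pipe" > "$fifo_dir/demo.fifo" &

# The reader receives whatever the writer sent.
received=$(cat "$fifo_dir/demo.fifo")
echo "$received"

# Wait for the background writer to finish, then clean up.
wait
rm -r "$fifo_dir"
```

Unlike an anonymous pipe, the named pipe exists in the filesystem, so the writer and reader could just as well be started from two different terminals.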

Useful Programs

These programs are especially useful when combined with pipes and redirection and most are already installed on your computer

  • grep This program searches its input for matching strings. It has powerful regular expression (regex) abilities but in its simplest form, grep word will read from stdin and output to stdout any line containing word.
  • rg Ripgrep (install with apt install ripgrep) is an enhanced grep replacement with default options that make it easy to search through source code.
  • tldr TLDR provides short help snippets for various commands.
  • find Used to find files. Powerful, but the most useful case is find . -name "*.cpp", which finds all the .cpp files in the current directory and, recursively, in its sub-directories. Quote the pattern so that the shell does not expand it before find sees it.
  • less Less is a pager, a program that takes long outputs and lets you scroll through them easily. Typically you will pipe the output of commands to less, but less file will let you see and scroll through a file easily.
  • sort Sorts its input alphabetically
  • diff Display the differences between two files
  • curl (Install with apt) Download a webpage and output to stdout
  • cat Concatenate multiple files and output them. Often used to output just one file
  • tar Often, groups of files are distributed as tarballs. You can extract them with tar xf file or use tar to make a tarball.
  • head Show the first few lines of a file
  • ping Contact a network address and see if it is alive
  • tail Show the last few lines of a file
  • watch Repeatedly run a command (every two seconds by default) and update the display as its output changes
  • wc Counts the number of words or lines in a file. Useful for determining how many lines of code you've written
  • tr Replace or delete characters in an input stream
  • fold Insert a newline after every N characters
  • sed A complete programming language, specialized for replacing text in files.
  • awk A complete programming language, specialized for parsing tables of text
  • xargs Useful for converting values on stdin into command line arguments used by another program
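
To give a feel for how these programs combine, here is a small sketch that chains several of them with pipes. The list of file names is made-up sample data.

```shell
#!/usr/bin/env bash
# Sample data: a small, made-up list of source files, one per line.
files=$(printf 'main.cpp\nREADME.md\nrobot.cpp\nutil.hpp\narm.cpp\n')

# grep keeps the matching lines, sort orders them, head takes the first one.
first_cpp=$(echo "$files" | grep '\.cpp$' | sort | head -n 1)

# wc -l counts lines, here the number of .cpp files.
cpp_count=$(echo "$files" | grep '\.cpp$' | wc -l)

# tr replaces characters: lower case becomes upper case.
shouted=$(echo "robot" | tr 'a-z' 'A-Z')

# fold inserts a newline after every 3 characters; tail shows the last line.
last_chunk=$(echo "abcdefgh" | fold -w 3 | tail -n 1)

# xargs turns lines on stdin into arguments for another command.
joined=$(echo "$files" | grep '\.cpp$' | sort | xargs echo)

echo "first: $first_cpp, count: $cpp_count"
echo "shouted: $shouted, last chunk: $last_chunk"
echo "joined: $joined"
```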

Bash scripts

Bash scripts contain sequences of commands and instructions for the bash shell to run. A bash script can be a simple list of commands executed in sequence or a complicated and messy program.

Overall, bash is useful for quickly automating tasks that require running commands in the bash shell. For more complicated tasks it is often easier to maintain a python script.

  1. Modern bash scripts usually start with #!/usr/bin/env bash (sometimes you will see #!/usr/bin/bash).
    • After creating a script you will usually chmod 755 your_script to make it executable.
    • If your script has the executable permission and this shebang line then you will be able to execute the script directly by either having it on your PATH or explicitly typing the path to the script at the prompt.
    • Comments in bash start with # so the shebang line is ignored by the interpreter.
  2. Alternatively, scripts can omit the shebang, and then only be executed by explicitly calling bash directly on the script.
  3. For an in-depth tutorial see the Advanced Bash-Scripting Guide from The Linux Documentation Project.
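
The two ways of running a script can be sketched as follows. The script name greet.sh and its contents are hypothetical; the cat << 'EOF' construct (a here-document) is used only to create the example file from within this snippet.

```shell
#!/usr/bin/env bash
# Write a small script into a scratch directory.
script_dir=$(mktemp -d)
cat > "$script_dir/greet.sh" << 'EOF'
#!/usr/bin/env bash
# Greet each name given on the command line.
for name in "$@"; do
    echo "Hello, $name!"
done
EOF

# Without the execute permission, run the script by handing it to bash.
via_bash=$(bash "$script_dir/greet.sh" Alice)

# With the execute permission and the shebang line, run it by its path.
chmod 755 "$script_dir/greet.sh"
direct=$("$script_dir/greet.sh" Bob)

echo "$via_bash"
echo "$direct"
rm -r "$script_dir"
```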

Exercises

Even a broken clock is right twice a day.

  • The file /dev/urandom is an infinite stream of random bytes.
  • tr lets you delete bytes matching certain criteria from an input stream.
  • fold cuts off an input stream after a certain width and inserts a newline.
  • date prints the current date and time: arguments provide the format.
  • grep searches for strings from stdin.

Using ONLY the above commands, stream redirection, pipes, and command substitution $() write a single line that will generate random times in the format HH:MM and print out the time if the randomly generated time matches the current time to the hour and minute (as generated by date when you press enter).

You may wish to experiment with various parts of the pipeline to put this all together.

Networking

Computer networks allow computers and robots to talk to each other. From a robotics perspective, having some basic knowledge of networking can help debug many problems when communicating with robots.

The Linux kernel is ultimately responsible for networking in Linux. There are also many user-space tools to control and inspect the network.

There are several tools for managing the network. We will focus on NetworkManager because

  1. It is the default system installed on Ubuntu Desktop
  2. It allows users to easily move between different WiFi networks

There are four major NetworkManager components

  1. The NetworkManager daemon (see systemctl status NetworkManager), which runs in the background and responds to network changes
  2. The NetworkManager applet (nm-applet), this is what you click on when you change network settings from the GUI
  3. The NetworkManager command line interface (nmcli). You can do everything from the command line, if you want
  4. The NetworkManager text-user-interface (nmtui). Useful for managing the network when only a text console is available.

Other Networking tools

  1. systemd (the init system) provides systemd-networkd, which allows basic network connectivity.
    • This bare-bones functionality seems best suited to systems that have fixed Ethernet connectivity and do not move.
  2. netplan is used by Ubuntu Server. It is specific to Ubuntu, and is a declarative front-end for either NetworkManager or systemd-networkd.
  3. ifplugd Old-school network management, with the ifup and ifdown commands and /etc/network/interfaces. Good for a simple static configuration.

A Layered Network Model

The network can be thought of in terms of different layers, according to the OSI model. What follows is a simplified conceptual description, useful for someone who is specializing in robotics.

Physical Layer

The physical layer describes the electrical specifications for how bits are physically sent over a wire (or the air in the case of WiFi).

Data Link Layer

Every network interface card (NIC) (e.g., WiFi or Ethernet) has a unique identifier called a MAC address. The MAC address is used for low-level communication between hardware that is directly connected with no intermediate devices.

  • The MAC address is 48 bits long, and is typically expressed like A1:B2:C3:D4:E5:F6
  • To see the MAC addresses of your network interface cards (NIC) (e.g., wifi and ethernet), you can use nmcli (the NetworkManager command line interface) or ip a (basic user-space Linux tool)
  • The MAC address is assigned to the network card at the factory; however, it can be changed (spoofed) in software.

Network Layer

  • This layer establishes connectivity between all the hosts in the network, allowing any two hosts to talk to each other.
  • There are three main components of this layer in the internet: IP (Internet Protocol), ARP (Address Resolution Protocol), and ICMP (Internet Control Message Protocol).
  • IP is the main protocol that we work with: each host has an IP address which is used to direct data to that particular host.
  • ARP connects the Data Link Layer and Network Layer by associating IP addresses with MAC addresses
  • ICMP is for diagnostics: typically, when you ping a host, you are sending an ICMP packet.

Transport Layer

  • The transport layer determines how data is divided into packets over the network
  • There are two major transport protocols on the internet: TCP (Transmission Control Protocol) and UDP (User Datagram Protocol)
  • TCP provides reliable end-to-end communication, meaning that when one host sends data to another host, the data either arrives intact or the sender learns that the connection failed. Data is received in the order in which it was sent. If data is lost, the loss is detected through acknowledgments and the data is sent again.
  • In UDP, a host sends packets without verifying when they arrive. Packets can also arrive in any order. UDP is faster and has less overhead than TCP but you lose the guarantees about how and whether the data arrives at the remote host.

Application Layer

  • These are applications and protocols built on top of the Transport Layer
  • For example Hyper Text Transfer Protocol (http), Secure Shell (ssh), IMAP4 (for email), etc.

Hubs and Switches

Hubs and switches are devices that enable multiple computers to communicate on a network.

  1. A hub is essentially a wire-splitter: when a device connected to a hub sends data, all the other devices on the hub receive data.
    • If the data is destined for a specific MAC address, the other computers on the hub ignore it (or can inspect it if they want).
  2. A switch operates using MAC addresses. If a computer sends data to a given MAC address, the switch intervenes and sends that data only to the intended recipient.
    • These days, we are mostly connected to switches rather than hubs.
  3. Generally, all computers connected to the same switch or hub can be thought of as being on the same Local Area Network (LAN).
  4. A router generally contains a switch, along with an extra port for a Wide Area Network (WAN).
    • The devices on the switch are part of a LAN
    • The router lets them communicate to an upstream computer via the WAN port.

IP Addresses

In a TCP/IP network (like the internet), every computer has an IP address. This address is used to route packets (containing data) to your computer. For IPv4 addresses this is a 32 bit number, often written as 4 decimal numbers separated by periods (e.g., 192.168.1.3).1

Subnet Mask

IP addresses are structured such that the first \(N\) bits correspond to the network address (which all computers on a LAN share) and the last \(32-N\) bits correspond to the host address (each computer has a unique host address on a given LAN).

  • A subnet mask is a 32 bit binary mask that determines how many bits of the IP address belong to the network and how many belong to the hosts.
  • The \(N\) most significant bits are all 1 and the \(32-N\) least significant bits are all 0.
  • A common subnet mask is 255.255.255.0 (this means the first three bytes are the network and the last byte is the host).2
    • In this setup the LAN can have 254 hosts (two of the 256 host addresses are reserved: .0 identifies the network itself and writing to .255 broadcasts the data to all hosts).
  • The subnet mask can also be represented by the number of bits that are 1. For example, 255.255.255.0 can be expressed as /24. This prefix length can conveniently be placed after the ip address (e.g., 192.168.0.1/24; likewise, 192.168.0.1/23 has a subnet mask of 255.255.254.0).
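
The relationship between a prefix length and its mask can be checked with shell arithmetic. This is a minimal sketch; the function names prefix_to_mask and usable_hosts are invented for the example.

```shell
#!/usr/bin/env bash
# Convert a prefix length N (as in /N) into a dotted-decimal subnet mask.
prefix_to_mask() {
    local prefix=$1
    # N ones followed by 32-N zeros, kept to 32 bits.
    local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    # Print the four bytes from most to least significant.
    echo "$(( (mask >> 24) & 255 )).$(( (mask >> 16) & 255 )).$(( (mask >> 8) & 255 )).$(( mask & 255 ))"
}

# Number of usable host addresses: every host-bit pattern except
# all zeros (the network itself) and all ones (broadcast).
usable_hosts() {
    local prefix=$1
    echo $(( (1 << (32 - prefix)) - 2 ))
}

mask_24=$(prefix_to_mask 24)
mask_23=$(prefix_to_mask 23)
hosts_24=$(usable_hosts 24)
hosts_23=$(usable_hosts 23)
echo "/24 -> $mask_24 ($hosts_24 usable hosts)"
echo "/23 -> $mask_23 ($hosts_23 usable hosts)"
```

Each bit removed from the prefix doubles the number of addresses in the network, which is why a /23 holds roughly twice as many hosts as a /24.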

Routing

Routing is the process of getting data to the appropriate recipient. A router typically contains a switch (to connect LAN hosts) and an upstream WAN port, which is connected to an upstream router. The upstream router in turn is connected to another upstream router and so on. This network of networks is essentially the internet.

  • Hosts on the same subnet (or LAN) are neighbors and can communicate via MAC addresses (since they are physically connected to the same switch or hub).
  • Hosts on different subnets must communicate through a router
  • LANs are often organized with a central router called a gateway.
    • Typically, all hosts are directly connected to the gateway.
    • When a packet is destined outside the LAN, the gateway sends it to the upstream router via the WAN port.
    • Not all LANs have a router. In robotics it is common to directly connect your computer to a robot's computer without any intervening router.
  • The address resolution protocol (ARP) is used to find MAC addresses on the LAN associated with a given ip address.
    • Assume that Alice (ip address 192.168.0.2/24) wants to send a message to Bob (ip address 192.168.0.4/24)
    • Logically ANDing the ip addresses and subnet masks reveals that both Alice and Bob are expected to be on the same LAN (192.168.0.0)
    • ARP is then used to find the MAC address associated with Bob (host address 0.0.0.4). Knowing this MAC address, Alice can send data over the LAN to Bob.
    • You can view the MAC addresses associated with IP addresses on your LAN using the ip neigh command.
  • If logically ANDing the ip addresses and subnet masks reveals that Alice and Bob are on different subnets:
    • It is assumed that Alice and Bob are not on the same LAN, and therefore cannot directly communicate via MAC addresses.
    • The router must forward Alice's packets over its WAN port to its upstream router.
    • The upstream router repeats the process. Eventually the data gets where it's going.
    • Use traceroute (install with apt) to see what routers your packet hops through on the way to, for example, www.google.com or www.northwestern.edu
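
The same-LAN test described above (ANDing each address with the subnet mask and comparing the results) can be sketched in bash. The helper names ip_to_int and same_subnet are invented for the example, and Carol's address is a hypothetical host on a neighboring network.

```shell
#!/usr/bin/env bash
# Convert a dotted-decimal IPv4 address into a single 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Usage: same_subnet IP1 IP2 PREFIX
# Succeeds when both addresses have the same network part.
same_subnet() {
    local mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    local net1=$(( $(ip_to_int "$1") & mask ))
    local net2=$(( $(ip_to_int "$2") & mask ))
    [ "$net1" -eq "$net2" ]
}

alice=192.168.0.2
bob=192.168.0.4
carol=192.168.1.7   # hypothetical host on a neighboring network

if same_subnet "$alice" "$bob" 24; then
    alice_bob="same LAN, deliver directly using ARP"
else
    alice_bob="different LAN, send to the router"
fi

if same_subnet "$alice" "$carol" 24; then
    alice_carol="same LAN, deliver directly using ARP"
else
    alice_carol="different LAN, send to the router"
fi

echo "Alice -> Bob:   $alice_bob"
echo "Alice -> Carol: $alice_carol"
```

With a /23 prefix instead of /24, Alice and Carol would fall within the same network, since 192.168.0.x and 192.168.1.x then share their first 23 bits.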

Obtaining an IP

There are many ways of obtaining an IP address. Usually, it is obtained via a dhcp server running on the gateway router (this is how Northwestern Wireless works). At home, your router probably obtains its WAN ip address from your internet service provider's (ISP) router (using a dhcp client), and assigns local addresses to your computers with its own dhcp server.

When connecting to robots, we often need to assign ip addresses statically. With static ip allocation, each host has a manually specified IP address and subnet mask. If all hosts are connected to the same switch/router/hub they can then communicate with each other.

Ports

In TCP/IP communication, data is sent over ports. A port is represented by a 16-bit integer. A server opens a socket on a given port and listens for incoming data. A client opens a socket that connects to a port at a given IP address.
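
The server/client relationship can be tried on a single machine. The sketch below is a toy example, not a production server: it runs both ends over the loopback address 127.0.0.1, and the function names serve_once and round_trip are made up for illustration. Binding to port 0 asks the kernel to pick any free port, so no root privileges are needed.

```python
import socket
import threading


def serve_once(server: socket.socket) -> None:
    """Accept one client, echo its message back in upper case, then close."""
    conn, _client_addr = server.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def round_trip(message: bytes) -> bytes:
    """Start a one-shot server on this machine and send it one message."""
    # Server side: open a socket, bind it to a port, and listen.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))   # port 0: let the kernel choose
        server.listen(1)
        port = server.getsockname()[1]  # the 16-bit port actually assigned

        worker = threading.Thread(target=serve_once, args=(server,))
        worker.start()

        # Client side: connect to that port at the server's IP address.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            reply = client.recv(1024)

        worker.join()
        return reply


if __name__ == "__main__":
    print(round_trip(b"hello robot"))
```

A real server would keep accepting connections in a loop and would usually listen on a fixed, well-known port so that clients know where to find it.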

Most services on the internet have a standard port. For example, a web server running the http protocol runs on port 80, and one running the https protocol runs on port 443. When you run a server you can specify the port it listens on.

On Linux, by default only root can listen on ports below 1024. You can see which ports are open and listening using lsof (list open files), which works because on Linux everything is a file. The exact command to use is lsof -i -P -n | grep LISTEN.

Domains

To allow humans to easily remember the location of websites, the internet implements domain names. Special DNS servers associate IP addresses with names. Your computer connects to them to convert a domain name into an IP address.

  • DNS servers are usually run by your internet service provider, but there are also servers run by companies like Google (8.8.8.8) and Cloudflare (1.1.1.1)
  • Firefox now (optionally but by default) uses DNS over HTTPS, which bypasses your system's DNS settings to connect to different DNS servers
  • In Ubuntu (by default) systemd-resolved runs a local DNS server on your computer. This enables it to manage and combine results from multiple DNS servers on the internet and your LAN.
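
A program normally does not talk to DNS servers itself; it asks the system resolver, which consults your configured DNS servers. A minimal sketch of that lookup in Python, assuming IPv4 only (the function name resolve and the file name resolve.py are made up for this example):

```python
import socket
import sys


def resolve(name: str) -> list[str]:
    """Return the IPv4 addresses the system resolver finds for a name."""
    results = socket.getaddrinfo(name, None, family=socket.AF_INET,
                                 type=socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) pair.
    return sorted({result[4][0] for result in results})


if __name__ == "__main__":
    # Names come from the command line. With no arguments, fall back to a
    # numeric address, which needs no lookup and comes back unchanged.
    for name in sys.argv[1:] or ["127.0.0.1"]:
        print(name, resolve(name))
```

For example, python3 resolve.py www.northwestern.edu prints whatever addresses your resolver returns at that moment; the answer depends on your network and can change over time.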

Testing Connectivity

  • The ping command allows you to test connectivity over the IP network (in most cases, unless it is specifically blocked by the target computer).
  • You can ping www.domain.com or ping <ip address>
  • traceroute (sudo apt install traceroute) is a tool used for tracing the path your packets take to reach a destination.

SSH

  1. ssh or Secure Shell lets you run a shell on a remote computer. All data is encrypted, hence the "secure".
    • Essentially this is remote desktop minus the desktop
    • You can also use ssh to run graphical applications over the network, either directly (X11 forwarding with ssh -X) or by using vnc or x2go
  2. You will mostly be using ssh clients, but if you do run your own ssh server, I recommend carefully considering the security consequences (since doing this lets remote users run commands on your computer).
    • Running ssh on a non-default port certainly provides some security by obscurity.
  3. I will provide everyone with shell accounts on our GPU machine (address provided separately) which you can use to practice your ssh skills.
  4. You can copy files using scp or rsync

SSH Keys

SSH keys allow you to log in over ssh without typing your password each time. They are mandatory for logging onto MSR servers. You can also use them for GitHub and other git websites to avoid entering your password.

Keys are divided into public and private components. The private key is secret, should be password protected, and should never be given out to anyone. The public key is freely distributed. An entity that has your public key can generate a cryptographic challenge that only someone with the private key can properly respond to. A successful challenge response allows you to gain access to a system without ever transmitting your password over the network.

  1. Create an ssh key using ssh-keygen -t ed25519.
    • You can use GNOME/Keyring to automatically unlock the key when you need it by using ssh-add
    • Install the seahorse package and use /usr/lib/seahorse/ssh-askpass <key> to unlock it at login

Terminal Multiplexers

  1. These programs let you save your work between ssh sessions, have multiple terminal windows open and switch between them during a single ssh session.
  2. Popular choices are tmux, screen, byobu.
    • screen is the most common program and may already be installed
    • tmux is newer than screen and easier to use/better in my opinion
    • byobu is a configuration that can use either tmux or screen
  3. If you would like to do pair programming using only the terminal, see this tutorial

Emergency Text Editing

When you log in to a server for the first time, you don't necessarily know what tools will be available, but you can usually count on vi.

  1. vi is a basic text editor available on most UNIX systems.
    • It starts in Normal mode, a mode where you can enter commands such as
      • h cursor left
      • j cursor down
      • k cursor up
      • l cursor right
      • x delete character under the cursor
      • i Enter Insert mode
      • :q! Exit without saving
      • :wq Exit with saving
    • In Insert mode you can mostly type as usual; however, you must press Esc to return to Normal mode before entering commands.
  2. Multi-line strings (here documents) let you easily enter text directly from the bash shell. Run cat << END > file and press Enter, then type your text. To finish, type END on a line by itself; the text is then written to file. END itself is not written, and you can use any word of your choice in its place.

Hostname

  1. The hostname of your computer is the name of the computer used by others on the network.
  2. In the terminal you will see a prompt of the form <user>@<hostname>.
    • <user> is your username.
    • <hostname> is the name of your computer.
  3. You can change your hostname with hostnamectl set-hostname <new-hostname>
    • Don't include the angle brackets (<>): e.g., hostnamectl set-hostname northwestern will set your hostname to northwestern.
    • The hostname must start with a letter, and contain only lowercase letters (a-z), digits (0-9), and the hyphen character (-).
    • Hostnames are technically case-insensitive, but conventionally use lowercase letters.
    • Keep the hostname to less than 8 characters or so.
    • An official guide to choosing a "good" name for your computer is available in RFC 1178.
  4. The file /etc/hosts enables you to create a static mapping between names and IP addresses.
    • Some Linux functionality requires <myhostname> to properly resolve to your local machine.
    • Because <myhostname> is generally not registered with the Domain Name System (DNS), by default it cannot be translated into an IP address.
    • The entries in /etc/hosts enable your system to resolve <myhostname> and <myhostname>.localdomain to 127.0.1.1, which always points to your computer.
      • Commonly 127.0.0.1 is used to point to your local machine, but technically any 127.x.x.x address also works.
      • Using 127.0.1.1 enables distinguishing between addresses resolved through the name localhost (which always points to the local machine) and those resolved through <myhostname> (which if you set it up can point to the ip address for your computer as used by others on the network).
      • For details see Stack Overflow and the Debian Manual.
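
The naming rules and the /etc/hosts mapping above can be sketched in Python. This is a simplified illustration, not the system's actual resolver: follows_hostname_rules encodes only the character rules listed in these notes, parse_hosts keeps the first address seen for each name (an assumption made for simplicity), and EXAMPLE_HOSTS is hypothetical file content for a machine named northwestern.

```python
import re

# Rule from the notes: start with a letter, then only lowercase letters,
# digits, and hyphens. (Keeping the name short is advice, not checked here.)
HOSTNAME_RULE = re.compile(r"[a-z][a-z0-9-]*")


def follows_hostname_rules(name: str) -> bool:
    """True if the whole name matches the character rules above."""
    return HOSTNAME_RULE.fullmatch(name) is not None


def parse_hosts(text: str) -> dict[str, str]:
    """Map each name in /etc/hosts-style text to its IP address."""
    mapping: dict[str, str] = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) < 2:
            continue  # blank line, comment, or an address with no names
        address, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name, address)  # assumption: first entry wins
    return mapping


# Hypothetical contents, in the format used by /etc/hosts.
EXAMPLE_HOSTS = """\
127.0.0.1   localhost
127.0.1.1   northwestern.localdomain northwestern  # this machine
"""

if __name__ == "__main__":
    print(follows_hostname_rules("northwestern"))
    print(follows_hostname_rules("My_Robot"))
    print(parse_hosts(EXAMPLE_HOSTS)["northwestern"])
```

To inspect your own machine, pass the contents of /etc/hosts to parse_hosts, for example parse_hosts(open("/etc/hosts").read()).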

Exercises

We will prepare to get you accounts on a machine that you can ssh into.

  1. Trace your route to a website of your choice.
    • Can you identify any of the hops (e.g., which one is your router)?
    • How many hops until the packets leave your LAN?
  2. I've created a student account that you can use
    • Download the private ssh key and save it as ~/.ssh/id_student: curl -L 'http://beast.mech.northwestern.edu/id_student' > ~/.ssh/id_student
    • Change permissions to owner read-write only: chmod 600 ~/.ssh/id_student
    • Connect to the server: ssh -i ~/.ssh/id_student student@beast.mech.northwestern.edu
    • When you've successfully connected, disconnect
  3. On your LOCAL computer, create your own ssh key: ssh-keygen -t ed25519
    • Set a password!
    • Creates private key ~/.ssh/id_ed25519 (don't share this with anybody ever) and public key ~/.ssh/id_ed25519.pub
  4. Use scp to copy your public key to the server: scp -i ~/.ssh/id_student ~/.ssh/id_ed25519.pub student@beast.mech.northwestern.edu:/home/student/<lastname_firstname>.pub
    • <lastname_firstname> should be your last name, underscore, your first name
  5. Once your key is uploaded, I will eventually add it to your username's authorized keys file ~/.ssh/authorized_keys on our GPU computers.

Using the GUI

Architecture of GUI in Linux

  1. X Windows - Client-server architecture for drawing GUIs and handling input.
    • Today Xorg is the display server.
  2. Wayland - The successor to X Windows, available on Ubuntu and the default in recent releases
  3. Display Manager - This is the graphical screen shown when you log in. GDM is the GNOME display manager.
  4. Desktop Environment - This is GNOME 3 by default, provides the user interface for interacting with the system
    • Alternatives include KDE and XFCE
  5. Window Manager - Draws windows on screen. Your window manager is integrated with GNOME 3.
  6. On Linux, you can choose your own Desktop Environment, Window Manager, and Display Manager. It is also not necessary to have any of them.

Copy and Paste

  1. Whatever text you highlight goes into a clipboard. Middle click pastes.
  2. You can also copy and paste with right click. This is separate from the middle click clipboard
  3. C-c and C-v work in many GUI programs but not the terminal. Adding Shift to these combinations works in some terminals
  4. You can also paste using Shift-Ins or readline commands (emacs style).
  5. Emacs and vim use their own clipboards but can access the X clipboard.

Other Resources

Footnotes:

1

We will not discuss IPv6 networks here. As the world runs out of IPv4 addresses there is a slow transition to IPv6, but I have not needed IPv6 in a robotics context yet.

2

255.255.255.0 is 11111111.11111111.11111111.00000000 in binary

Author: Matthew Elwin. Date: September 2023.