This tutorial will help those new to computing in the geosciences become familiar with working in a command line environment. Here you will learn about the basics of Unix file structures, how to navigate in a Unix environment, and you'll get to practice creating, storing and searching for files.
Unix is an operating system that can handle multiple users and processes at the same time. Dennis Ritchie and Ken Thompson of AT&T's Bell Labs developed Unix in the late 1960s and early 1970s. Unix is the basis or the model for Linux, Apple's macOS, Android, and much of the other technology that you know and love. Microsoft Windows is a notable exception: it is not derived from Unix, although it offers its own command-line environments.
Work in a Unix environment is accomplished by typing commands at a command line. If you have used the Terminal window in macOS or the Command Prompt in Windows, then you are already familiar with this way of computing. If this idea is new to you, fear not - you already execute the same types of commands whenever you save an image, open a program via an icon, and so on. The next section will explain a little more about how a Unix environment is set up and how it compares to a personal computer or smartphone environment.
Unix is made up of 3 main parts: the kernel, the shell, and user commands and applications.
The kernel and shell are the heart and soul of the operating system.
The kernel ingests user input via the shell and accesses the hardware to perform things like memory allocation and file storage.
The shell is an interface that interprets the command line input and calls the necessary programs to do the work. The commands that you enter are programs themselves, so once the work is done, the command line will return to a prompt and await further input.
There are several different shells, and syntax and shortcuts vary between them. For example, the "csh" shell listed in the image above is called "C shell" and has syntax similar to the C programming language. All shells support similar basic functions.
One example of how the shell and kernel work together is copying a file. If you want to copy a file named "file1" and name the copy "file2", you would enter "cp file1 file2" at the command line. The shell will search for the program "cp" and then tell the kernel to run that program on "file1" and name the output "file2". When the copying is finished, the shell returns you to the prompt and awaits more commands.
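The copy described above can be tried safely in a scratch directory. This is a minimal sketch, not part of the simulator exercises: the directory created by "mktemp -d" and the sample text are assumptions made for illustration, and the ">" redirection used to create "file1" is explained later in the tutorial.

```shell
# Work in a throwaway directory so none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a small file named file1 (hypothetical contents).
echo "sample line" > file1

# The shell finds the cp program and runs it with two arguments.
cp file1 file2

# Both names now exist, and the copy holds the same contents.
ls
copy_contents=$(cat file2)
echo "$copy_contents"
```

The original "file1" is left untouched; "cp" always produces a second, independent file.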
Let's take a look at another example. Suppose you have a folder called "docs" on your personal computer, phone or stored in a cloud somewhere. Let's say you have "personal" and "schoolwork" subfolders in there, and that inside your personal folder you have a subfolder with your photos from 2015, and that they're arranged into monthly subfolders. How do you get into March 2015's photo area? Easy - you keep clicking on or touching the appropriate folder, until it opens the next, and then the next folders until you can see March 2015 - then you click on it.
In Unix, you'd simply type the following at the command line to perform the same task: "cd /docs/personal/photos/2015/march".
# cd /docs/personal/photos/2015/march
Although you can't see your photos as icons here, the computer is performing exactly the same actions as you did by clicking on all those folders. Additionally, you can then list the photos in the directory, rename them one at a time or all at once, move them to other directories or even other computers, or much more - all by just typing a few characters.
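To see the folder example in action, the sketch below builds a stand-in for the "docs" tree inside a scratch directory and then jumps into it with one command. The scratch location, the "photo1.jpg" file name, and the "mkdir -p" and "touch" commands (which create nested directories and empty files) are assumptions for illustration and are not covered elsewhere in this tutorial.

```shell
# Build a stand-in for the docs folder tree in a throwaway location.
base=$(mktemp -d)
mkdir -p "$base/docs/personal/photos/2015/march"
touch "$base/docs/personal/photos/2015/march/photo1.jpg"

# One command replaces all of the clicking through folders.
cd "$base/docs/personal/photos/2015/march" || exit 1

# Confirm where we are and what is here.
here=$(pwd)
echo "$here"
ls
```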
The rest of the tutorial will introduce you to the file structures in Unix, how to navigate them, and how to use many common commands and programs to efficiently perform the work you need to accomplish.
To ensure that all users have a chance to practice using Unix regardless of their access to work or school computing resources, we will use an online simulated Linux environment created by Fabrice Bellard, at this location: http://bellard.org/jslinux/.
Please note that if the Linux simulator page is refreshed or closed, all modifications to files and directories will be lost.
Everything in a Unix environment spreads outward from a single "root" directory, much like a tree trunk and its branches.
The root directory is the top level, and is denoted by a slash (/). Other directories are created below the root directory - typically, you will find a "bin" directory, which contains binary files required for commands and processes like those we'll cover next, as well as a "tmp" directory for temporary files, and directories like "home" that contain information for individual users.
Within the Linux simulator we'll be using, the directory structure looks like this:
In the online simulator, we are not assigned a user account in the home directory. Instead, we will be working from the "root" subfolder. The other directories that we will explore in the simulator are read-only and un-editable.
Click through to the Linux simulator (which opens in a new window) if you haven't done so already. If you have already clicked through, refresh the page.
To see that we are indeed located within the "root" directory, which is a subdirectory of "/", type the letters "pwd" at the prompt in the simulator then press enter.
/root # pwd
/root
/root #
"pwd" is a command that stands for "print working directory". You can use this command to find out your location within the Unix file structure at any time. Here we can see that we are in "root" which is a subdirectory of the main root directory, "/".
Next, let's find out what's in our current working directory. List the contents of the current working directory using the list command, "ls", in the simulator.
What did you find? The simulator should have shown the two entries below:
/root # ls
dos hello.c
What do we know about these listed entries? Are they directories or files? Can you edit them, or are they read-only? The ls command by itself simply lists all the contents of a directory. You can add options to it to find out more information.
Try typing "ls -l" into the simulator and press enter. What kind of information are we shown now? The image below shows some examples of the standard information that is returned with the -l, long-format option.
The first character in the listing denotes the type of content. The most common types are "-" for a regular file, "d" for a directory, and "l" for a symbolic link.
The next 9 characters denote permissions for the user, group and others (also called public). "r" denotes read privileges, "w" denotes write privileges, and "x" denotes execute privileges.
Thus, the first entry in the listed contents above is a directory in which the owner, group and all others have read, write and execute privileges. The second entry is a regular file that can be read, written to, and executed by the owner but is read-only for the group and other users.
The remaining fields returned in the long format listing (from left to right) include the number of links, owner, group, size (in bytes), date modified and the file or directory name.
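The permission fields can be reproduced with a small experiment. The sketch below is an illustration under stated assumptions: the names "shared" and "script" are hypothetical, and "chmod" (which sets permissions) is not covered in this tutorial. It creates one directory and one file with the permissions described above, then keeps only the first 10 characters of each long listing.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# A directory that everyone can read, write, and enter.
mkdir shared
chmod 777 shared

# A regular file: full access for the owner, read-only for everyone else.
echo 'echo hi' > script
chmod 744 script

# Full long-format listing of the scratch directory.
ls -l

# Keep only the type character plus the 9 permission characters.
dir_mode=$(ls -ld shared | cut -c1-10)
file_mode=$(ls -l script | cut -c1-10)
echo "$dir_mode"    # drwxrwxrwx
echo "$file_mode"   # -rwxr--r--
```

Reading left to right: one type character, then three "rwx" groups for the owner, the group, and all others.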
Returning to our long listing in the current directory within the online Linux simulator, we can see that "hello.c" is a regular file (a C program, in fact), and "dos" is a symbolic link to another "dos" directory that is in the "root" directory.
There are a few other common ls options that you may find useful.
Command and Option | Description
---|---
ls -a | List all files, including hidden files
ls -lt | List files sorted by the time last modified
ls -R | List files recursively (descend through all directories and list files from those sub-directories as well)
ls --help | As with most commands, if you add a --help to it, it will return all of the possible options for that command. Note there are two dashes, and no space between them.
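Hidden files are simply files whose names begin with a ".". The following sketch, which uses hypothetical file names in a scratch directory, shows the difference between "ls" and "ls -a".

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# One ordinary file and one hidden file (its name starts with a dot).
touch visible.txt .hidden.txt

# Plain ls skips names that start with a dot.
plain=$(ls)
echo "$plain"

# ls -a shows everything, including "." and ".." themselves.
all=$(ls -a)
echo "$all"
```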
Now that we know what's going on in our current working directory, let's change to another directory to see what's there. We know that "dos" leads to a directory, so let's use that.
Type "cd" followed by a space and the name of a directory you want to change to (dos) into the simulator and press enter. Remember: all commands must be followed by a space before their target file/directory or process!
Now you should be in the "dos" directory. Notice that the prompt changed to show you the current directory. This is not always the case in Unix. Explore the contents of the directory with some of the listing commands we introduced earlier and then answer the question below.
Within the dos directory
You can tell that asm-1.9 is a directory by using the "-l" option and noting that that line begins with the letter "d". Additionally, it shows as a different color in the simulator. Finally, you could try to change directories into it - doing so will work for directories, but not for files.
Next let's change to the asm-1.9 directory. There are several files listed in that directory. In addition to easily listing all the contents in a directory, Unix allows users to quickly show the contents of individual files. One way to do this is the concatenate command, "cat". Try entering "cat" followed by one of the filenames in asm-1.9, and then press enter. The file readme.txt is an easy one to view (don't forget the space between the command and filename, and remember that capitalization matters!).
You should see the contents of readme.txt printed to your screen until the end of the file is reached, like so:
....
Format is:
Symbol-Name File-Name Line-No. Number-of-Refs Symbol-Type Value-Hex Value-Dec
To print cross references:
C:> lister -x asm.lst
....
PathSize asm.s 2 Equate 0040 64
asm.s 148
asm.s 153
2 references found
...
Format is:
Def: Symbol-Name File-Name Line-No. Number-of-Refs Symbol-Type Value-Hex Value-D
ec
Ref: File-Name Line-No.
REFERENCES
1. Tannenbaum A S, "Operating Systems : Design and
Implementation", Prentice Hall of India, New Delhi,
1989.
2. Rector R and Alexy G, "The 8086 Book", Osborne/
McGraw-Hill, California, 1980.
~/dos/asm-1.9 #
Contents of long files can be viewed stepwise by using the "more" command. This is similar to "cat", but it prints the file contents to screen and allows the user to step through them using the spacebar. To exit the "more" command, press "q" for quit.
Finally, let's move from the asm-1.9 directory back to the dos directory. Try getting there using the cd command.
What did you type? And what did it do?
Since you are in a sub-directory of the directory you're trying to access, the "cd" command must be used with an absolute path, or an appropriate relative one - we cannot simply type "cd directoryname" like we did before, because the directory we want to access is no longer below our location in the directory structure.
Here's the error message you would have received if you simply tried "cd dos":
~/dos/asm-1.9 # cd dos
sh: cd: can't cd to dos
~/dos/asm-1.9 #
To change back to the "dos" directory, we can use the absolute path "cd /root/dos" or the relative path "cd ..", where ".." indicates the directory above your current working directory. "." is always the current directory. To reach /root from the asm-1.9 directory, we could type "cd ../.." because /root is two directories up from our current location. The top-level directory "/" is one level further up, at "cd ../../..".
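Relative and absolute paths can be compared side by side. The sketch below recreates a "dos/asm-1.9" layout under a scratch directory (an assumption for illustration; in the simulator the same layout lives under /root) and records where each "cd" lands.

```shell
top=$(mktemp -d)
mkdir -p "$top/dos/asm-1.9"
cd "$top/dos/asm-1.9" || exit 1

# Relative path: one level up from asm-1.9 is dos.
cd ..
one_up=$(basename "$(pwd)")
echo "$one_up"

# Go back down, then climb two levels in a single command.
cd asm-1.9
cd ../..
two_up=$(pwd)
echo "$two_up"

# Absolute path: works no matter where we currently are.
cd "$top/dos" || exit 1
pwd
```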
For quick navigation and efficient command-line usage, here are a few commands to cut down on your typing.
Command | Description
---|---
<TAB> | Before completing a file or directory name in the command line, press TAB to autocomplete the name based on the list of files/directories within this directory.
~ | When navigating, this is a synonym for your home directory.
<UP ARROW> | Go chronologically backwards through the previous commands you have run from the command line.
<DOWN ARROW> | Go chronologically forwards through the previous commands you have run from the command line. (Only works after pressing <UP ARROW>)
Command | Description | Usage
---|---|---
pwd | print working directory | pwd
ls | list working directory contents | ls
ls -l | list working directory contents with a long-format listing | ls -l
cd | change directory | cd directory
Paths | Description | Example
---|---|---
/ | root directory when it is the first character of a path; otherwise the separator between directory names | "cd /" Changes directory to the root of the file system
. | current directory | "ls ." Lists the contents of the current directory (this is implied by typing "ls")
.. | directory one level up from current directory | "cd .." Changes directory to one level up from current directory
Now that we know how to navigate the file structures and find out what's in directories, let's make and modify some directories and files.
To make a directory, use the "mkdir" command. Let's start by making a directory called "test" in /root
/root # mkdir test
List the contents of our current directory to check that "test" was successfully created. Your results should look like this:
/root # ls
dos hello.c test
/root #
Next, let's put a file into our new directory. To do that, we can copy or move the hello.c file that is in /root. We'll try two ways.
Option one: we can use the "cp" command to copy hello.c into the test directory while naming it hello2, like shown below. Note that we have to use the relative address "test/" to ensure that hello2 is placed where we want it. If we did not specify this, it would be copied into the current directory.
/root # cp hello.c test/hello2
Option two: we can redirect the content of hello.c to a new file named hello3 using the "cat" command and the ">" redirection operator. The ">" operator sends the output of a command into a file instead of to the screen. You can think of the greater-than sign as a funnel that pushes contents from the cat command into the container/file on the other end. Try it using the code below:
/root # cat hello.c > test/hello3
Now, navigate to the test directory, and check that hello2 and hello3 have been created.
True or False: the contents of hello2 and hello3 are identical.
The correct answer is True.
You can use the "cat" or "more" commands on hello2 and hello3 to verify that they are exactly the same. We copied the file the first time, and then printed all its contents into a new filename the second time, so they should be identical.
There is an easy way to truly tell the difference between two files: diff. You can use diff followed by two file names to check the difference between them. The differences will be listed individually. In this case, the command would be:
/root/test # diff hello2 hello3
/root/test #
Because the command didn't return any output, the files are exactly the same.
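The two copy methods and the "diff" check can be combined into one short experiment. This is a sketch under stated assumptions: the hello.c created here is a hypothetical stand-in, not the simulator's file. It also relies on the fact that "diff" reports success (exit status 0) when the files match, which lets the shell's "if" statement act on the result.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# Stand-in for hello.c (hypothetical contents).
echo 'int main(void) { return 0; }' > hello.c
mkdir test

# Option one: copy the file under a new name.
cp hello.c test/hello2

# Option two: print the contents and redirect them into a new file.
cat hello.c > test/hello3

# diff prints nothing and succeeds when the two files are the same.
if diff test/hello2 test/hello3 > /dev/null; then
    result="identical"
else
    result="different"
fi
echo "$result"
```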
In addition to copying, we can rename or "move" the contents of one file to another filename. Type "mv hello3 hello4" into the simulator from the test directory you created previously. Then, list test's contents again.
How many files should now exist in the test directory?
The correct answer is two.
When you moved hello3 to hello4, it did not create a copy, it simply moved the file from one name to another in your directory. Thus, there should be two files: hello2 and hello4. This is the same thing that you might think of when renaming a file in another computing environment.
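A quick way to convince yourself that "mv" renames rather than copies is to count the files before and after. The sketch below uses empty stand-in files in a scratch directory (assumptions for illustration); "wc -l" counts lines of output and "tr -d" strips padding spaces, neither of which is covered in this tutorial.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# Two empty stand-in files.
touch hello2 hello3

# Rename hello3 to hello4; no copy is left behind.
mv hello3 hello4

# Count the entries that remain.
count=$(ls | wc -l | tr -d ' ')
echo "$count"
ls
```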
You should now be comfortable creating, moving and copying files. What about removing files?
To remove a file, we use the "rm" command. Use this command to remove hello4, and be sure to check that it has been completed.
/root/test # ls
hello2 hello4
/root/test # rm hello4
/root/test # ls
hello2
/root/test #
If you want to remove an entire directory, you can use the rmdir command. Navigate back to the parent directory and try it on the test directory by typing "rmdir test".
Why do you think this didn't work?
It didn't work because the contents had to be removed first. Unix will only allow you to remove empty directories using the rmdir command. It may seem like a hassle, but it does ensure that you really want to delete the contents of a directory, since you have to go through the effort of deleting all other materials first.
Removing directories is best done with rmdir for safety's sake. However, you can remove files and directories quickly by recursively removing a directory's contents with "rm -r directory_name". This will remove the directory itself and the files within it. Always be careful when removing like this, as there is no recovering from deleting files you meant to keep. If you want to play it safer, you can make "rm" interactive so it will check with you each time to make sure you want to remove that file before proceeding. To make "rm" interactive, use the -i option.
/root # rm -ri test
rm: descend into directory 'test'?
/root #
You can respond to the query by typing either "y" (yes) or "n" (no) and then pressing enter.
Now, try removing hello2, and then remove the entire directory.
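The difference between "rmdir" and "rm -r" can be sketched as follows. The directory and file names mirror the exercise but live in a scratch location (an assumption for illustration), and "2> /dev/null" simply hides the error message so the script can record the outcome itself.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir test
touch test/hello2

# rmdir refuses to remove a directory that still has contents.
if rmdir test 2> /dev/null; then
    first_try="removed"
else
    first_try="refused"
fi
echo "$first_try"

# The careful route: empty the directory, then remove it.
rm test/hello2
rmdir test

# The quick route: rm -r removes a directory and everything inside it.
mkdir test
touch test/hello2
rm -r test
```

Neither route has an undo, so the non-interactive "rm -r" is best kept for directories you are certain about.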
Command | Description | Usage
---|---|---
cp | Copy file1 from directory1 to directory2, optionally renaming the file in the process. | cp directory1/file1 directory2/file2
mv | Move (similar to cut) file1 from directory1 to directory2, optionally renaming the file in the process. | mv directory1/file1 directory2/file2
rm | Remove a file from a directory | rm directory1/file1
rmdir | Remove an empty directory | rmdir directory
Thus far we've explored only a handful of files in a couple directories. Now, let's move to looking at larger amounts of files and those from which you may need specific information.
Let's say you wanted to copy all of the ".txt" files from one directory to another, but there were several different file types present in that directory. In a GUI (graphical user interface) environment, you would probably sort the file list by type/extension and then just select the desired files, which are now listed in a block. In a Unix environment, you can find, list, sort and copy all the files of one type by using wildcards. These are special characters that can be used like wild cards in a card game - they can be anything you want them to be.
Navigate to the "asm-1.9" directory under /root/dos. To list all of the .txt files only, you would enter your command as:
~/dos/asm-1.9 # ls *.txt
The wildcard in this case is the special character "*", which matches any sequence of characters (letters, digits, punctuation, or whitespace), including none at all. The pattern "*.txt" therefore matches any name whose last 4 characters are exactly ".txt".
What files would be returned from this "ls *.txt" command, based on the following list of files and directories? Select all that apply.
~/dos/asm-1.9 # ls
Changelog display.s expr.s lister.s readme.txt symtab.i
asm.s dos.i input.s message.s support.s symtab.s
direct.s equ.s license.txt output.s symbols.s
The correct answers are license.txt and readme.txt.
Any filename that ends with .txt would be returned by this command, with one exception: names that begin with a ".", such as a file called ".txt" (without the quotes). Files whose names start with "." are hidden, and a wildcard at the start of a pattern does not match them.
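There are many different wildcards available in Unix. Some common wildcards are "*" (any number of characters), "?" (exactly one character), and "[...]" (exactly one character from the set or range listed inside the brackets).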
Let's do some examples using wildcards.
How many files would be returned by entering each of the "ls [option]" searches below?
~/dos/asm-1.9 # ls
Changelog display.s expr.s lister.s readme.txt symtab.i
asm.s dos.i input.s message.s support.s symtab.s
direct.s equ.s license.txt output.s symbols.s
The coded answers for each of the previous examples are below.
~/dos/asm-1.9 # ls [a-g].s
~/dos/asm-1.9 #
~/dos/asm-1.9 # ls [a-g]??.s
asm.s equ.s
~/dos/asm-1.9 #
~/dos/asm-1.9 # ls [a-g]*.s
asm.s direct.s display.s equ.s expr.s
~/dos/asm-1.9 #
~/dos/asm-1.9 # ls *.s
asm.s display.s expr.s lister.s output.s symbols.s
direct.s equ.s input.s message.s support.s symtab.s
~/dos/asm-1.9 #
~/dos/asm-1.9 # ls *
Changelog display.s expr.s lister.s readme.txt symtab.i
asm.s dos.i input.s message.s support.s symtab.s
direct.s equ.s license.txt output.s symbols.s
~/dos/asm-1.9 #
If you wanted to list all the files that started with an A through G (capitals matter!), you could do that within a bracketed list like so:
~/dos/asm-1.9 # ls [A-G]*
This will display a file listing that returns one filename: Changelog.
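If you wanted the "ls" command to return names starting with either capital or lowercase letters, you would include both ranges within the brackets, one right after the other. Do not separate them with a comma, because a comma inside the brackets is treated as one more character to match.
~/dos/asm-1.9 # ls [a-gA-G]*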
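It is the shell, not "ls", that expands wildcards into a list of matching names before the command runs. The sketch below makes that visible by using "echo" on empty stand-in files named after the asm-1.9 listing (created in a scratch directory, an assumption for illustration). Setting LC_ALL=C is also an assumption: it makes letter ranges strictly case-sensitive, which some other language settings do not guarantee.

```shell
# Use the plain "C" locale so [a-g] means lowercase a through g only.
export LC_ALL=C

demo=$(mktemp -d)
cd "$demo" || exit 1
touch Changelog asm.s direct.s display.s dos.i equ.s expr.s input.s \
      license.txt lister.s message.s output.s readme.txt support.s \
      symbols.s symtab.i symtab.s

# echo prints whatever names the shell substituted for each pattern.
txt=$(echo *.txt)          # names ending in .txt
three=$(echo [a-g]??.s)    # a-g, then exactly two characters, then .s
upper=$(echo [A-G]*)       # names starting with a capital A-G
both=$(echo [a-gA-G]*)     # either case: two ranges, no separator

echo "$txt"
echo "$three"
echo "$upper"
echo "$both"
```

Replacing "echo" with "ls" gives the same names for each pattern, because both commands receive the already-expanded list.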
Using wildcards is great if you know where certain types of files are located. But what about if you don't know where a file is, but remember part of its name? You can still use wildcards, but you will need more functionality than just "ls".
Using the "find" command, you can find those missing files using wildcards. And, searches with find will search the current directory and any sub-directories. Be careful when doing this if you have a lot of sub-directories containing many files, as it can take a very long time to search all of the content.
The syntax for the find command is as follows:
find -name search_string
where -name indicates that it will search for the name of the file.
Here's an example of a find command.
~/dos # find -name 'a*'
./asm.com
./asm-1.9
./asm-1.9/asm.s
~/dos #
Notice that the command returned both files and directories, including files in sub-directories below where the command was run. The pattern matches every name that starts with "a" followed by any number of other characters (a file named just "a" would match too). So we found the files we were looking for that started with "a".
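A similar search can be run on a small scratch tree. In this sketch the file names are stand-ins chosen to echo the example (an assumption for illustration). Two details are worth noting: the starting directory "." is written out because some versions of "find" require it, and the pattern is quoted so that the shell passes it to "find" unchanged instead of expanding it first.

```shell
export LC_ALL=C
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir asm-1.9
touch asm.com asm-1.9/asm.s asm-1.9/readme.txt

# Search from the current directory downward for names starting with "a".
found=$(find . -name 'a*' | sort)
echo "$found"

# Count how many names matched.
found_count=$(echo "$found" | wc -l | tr -d ' ')
echo "$found_count"
```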
What if you didn't know the name of the file, but remembered something within the file? To find something within a file, you can use the command "grep", which stands for Global Regular Expression Print. Grep follows the syntax below:
grep search_expression file_to_search
As an example, let's search through the file hello.c within your home directory to see if it contains the string "int".
/root # grep int hello.c
int main(int argc, char **argv)
printf("Hello World\n");
/root #
Grep returned two lines containing that string. Notice, though, that grep is not looking for the word "int"; it is looking for the string "int", which can be inside a word. This is the reason the "printf" line is returned in the example: it contains "int" inside the word.
If you wanted to only return instances where the pattern was a word, you can add the "-w" option after grep.
/root # grep -w int hello.c
int main(int argc, char **argv)
/root #
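The effect of "-w" can be checked by counting matching lines with the "-c" option. The hello.c written below is a hypothetical stand-in for the simulator's file, created with a "here document" (the lines between the two EOF markers), which is not covered in this tutorial.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# Stand-in for hello.c (assumed contents).
cat > hello.c <<'EOF'
#include <stdio.h>
int main(int argc, char **argv)
{
    printf("Hello World\n");
    return 0;
}
EOF

# Lines containing the string "int" anywhere, even inside "printf".
substring_hits=$(grep -c int hello.c)

# Lines containing "int" as a whole word only.
word_hits=$(grep -c -w int hello.c)

echo "$substring_hits"   # 2
echo "$word_hits"        # 1
```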
What if you didn't know the name of the file that had the string within it? You could still find all the files that contain that string with grep by using a wildcard in place of the "file_to_search".
Let's say we wanted to search for ALL files below your dos directory that contained the string "print". The following code would show you those files and print the lines which have the "print" pattern on them. The added code here is the "-r" option which digs recursively downward from your current directory giving the results below.
~/dos # grep -r print *
asm-1.9/readme.txt:The contents of the symbol table are printed out at the end o
f the
asm-1.9/readme.txt:Only one of -x or -z must be specified. The -x option prints
a
asm-1.9/readme.txt:complete xref dump (definitions + references) The -z option p
rints a
asm-1.9/readme.txt:To print labels not referenced
asm-1.9/readme.txt:To print all defined symbols:
asm-1.9/readme.txt:To print crossreferences:
asm-1.9/message.s:| BX points to the message to be printed
asm-1.9/message.s:|The procedure print 'asm :', the message, a carriage return a
nd a line feed
asm-1.9/lister.s:|Lister - print the symbol table of the assembler from the list
file.
asm-1.9/Changelog:1. Instead of printing the symbol table onto the screen it put
s
asm-1.9/Changelog:7. A separate program lister was added which prints out the sy
mbol
asm-1.9/Changelog:2. The print stats function was removed
asm-1.9/Changelog:7. Doesn't print the name of the file that it is assembling an
y longer.
asm-1.9/equ.s: call SprintRegister
asm-1.9/symtab.s: call SprintRegister
asm-1.9/symtab.s:SprintRegister:
asm-1.9/symtab.s:SprintRegisterMore:
asm-1.9/symtab.s: call SprintHexDigit
asm-1.9/symtab.s: jnz SprintRegisterMore
asm-1.9/symtab.s:SprintHexDigit:
~/dos #
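When a recursive search returns this much output, it can help to list only the names of the matching files. The sketch below adds the "-l" option, which is not covered in this tutorial, and runs on three small stand-in files (assumed contents) rather than the simulator's dos directory.

```shell
export LC_ALL=C
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir asm-1.9
echo 'To print all defined symbols:' > asm-1.9/readme.txt
echo 'call SprintRegister' > asm-1.9/equ.s
echo 'mov ax, bx' > asm-1.9/asm.s

# Every matching line, prefixed with the file it came from.
grep -r print .

# Only the names of files that contain a match.
files=$(grep -rl print . | sort)
echo "$files"
file_count=$(echo "$files" | wc -l | tr -d ' ')
```

As in the longer listing above, "SprintRegister" counts as a match because the search is for the string "print", not the word.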
Command | Description | Usage
---|---|---
find | Find files recursively by their file name and list them. | find -name string_or_wildcard
grep | Find files by their contents and display the line from each file that contains that search string. | grep search_string file
Wildcard | Description | Example
---|---|---
* | any number (including zero) of characters, digits, punctuation marks, or whitespaces | ls *.jpg; ls file*; ls *in*
? | any single character, digit, punctuation mark, or whitespace | ls photo?.jpg; ls ?ilename.txt; ls test?file.txt
[...] | exactly one character from a user-defined set or range of characters, digits, or punctuation | ls file[0-9].jpg; ls [a-z]ile.txt; ls file[_.-]name.txt
This module covered the origins of Unix, the architecture of Unix-based systems, and allowed users to perform the following in an online simulator:
The table below, also available as a printable take-away, lists all of the common information and commands that were covered in the module.
Command | Description | Usage or Example
---|---|---
Paths | |
/ | root directory when it is the first character of a path; otherwise the separator between directory names | cd / Changes directory to the filesystem root
. | current directory | ls . Lists the contents of the current directory
.. | directory one level up from current directory | cd .. Changes to one level above current directory
Navigation & Content Listing | |
pwd | print working directory | pwd
ls | list working directory contents | ls [option]
ls -l | list working directory contents with a long-format listing | ls -l
cd | change directory | cd directory
File Management | |
cp | Copy file1 from directory1 to directory2, optionally renaming the file in the process. | cp file1 file2 or cp -r directory1 directory2
mv | Move file1 from directory1 to directory2, optionally renaming the file in the process. | mv file1 file2 or mv directory1 directory2
rm | Remove a file from a directory | rm [option] filename or rm -r directoryname
rmdir | Remove an empty directory | rmdir directoryname
Search Commands | |
find | Find files recursively by their file name and list them. | find -name search_string
grep | Find files by their contents and display the line from each file that contains that search string. | grep search_string file
Wildcards | |
* | any number (including zero) of characters, digits, punctuation marks, or whitespaces | ls *.jpg; ls file*; ls *in*
? | any single character, digit, punctuation mark, or whitespace | ls photo?.jpg; find -name '???.c'
[...] | exactly one character from a user-defined set or range of characters, digits, or punctuation | ls file[0-9].jpg; ls [a-z]ile.txt; ls file[_.-]name.txt