Text is considered a "universal interface" for Unix systems. As you can already tell, Bash has a certain way of interpreting the text that we send it.
We can't, for instance, just type, "Create a new directory named 'Documents'", and expect Bash to know what's going on:
user@host:~$ Create a new directory named 'Documents'
Create: command not found
Bash expects text to come in a particular form. Some words, like
mkdir seem to refer to programs. And some symbols, such as
*, will be interpreted by Bash to mean something much more expansive than just single characters.
It just does. Bash has a syntax that defines how it interprets the text characters we send it, just as English has a syntax in which the following two phrases have the same words but different interpretations, based on the punctuation:
"That's what," he said.
That's what he said.
But just as it's a bad idea to teach children their first language by focusing on the rules of grammar, it's not productive to just learn Bash through memorizing its particular grammar and syntax – you should be writing programs and seeing what happens.
However, it's helpful to explain some of the initial concepts of how Bash interprets our commands and data, as a way to prepare you for the seemingly rudimentary way that Unix handles text. Most of these concepts will make more sense after you've read about pipes and redirection and variables.
In programming, a literal value can be thought of as: what you see is what you get.
In the sequence of commands below, I call the
mkdir command three times separately. However, it will not just create 3 directories:
user@host:~$ mkdir 42
user@host:~$ mkdir apples oranges
user@host:~$ mkdir "42 bottles of beer"
In the animated GIF below, I'm running these commands on OS X so you can see how it affects the filesystem, graphically:
So what were the characters, or strings of text, that were interpreted by the shell as literal values?
42
apples
oranges
42 bottles of beer (including the space characters)
And which text characters were not interpreted as literal values?
mkdir - this was interpreted as the command to make a new directory
The space between mkdir and the directory names passed to it
The space between apples and oranges, hence the creation of two separate directories
The quotation marks around 42 bottles of beer
If you're coming from a modern operating system, like Windows or OS X, you've probably seen that it's possible to make files or directories with space characters in the name, e.g. the
My Documents and Settings directory on your
So how does
mkdir know that I wanted to make two separate directories instead of one called
apples oranges? It didn't. We have to explicitly specify that particular directory name by enclosing it in quotes, either single or double:
user@host:~$ mkdir 'apples and oranges' "sunshine and lollipops"
Without the use of quotes, Bash will interpret each space-separated word as a separate "word", or token. So
mkdir dogs cats will be treated as three different tokens: the command
mkdir, and the two arguments, dogs and cats.
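To see the tokens for yourself, here is a minimal sketch. The function name count_args is made up for illustration; $# is the shell's own count of how many arguments a function or script received.

```shell
# count_args is a hypothetical helper: it prints how many arguments it was given
count_args() {
  echo "$#"
}

unquoted=$(count_args dogs cats)     # two space-separated words
quoted=$(count_args 'dogs cats')     # the same text, quoted as one string

echo "unquoted: $unquoted argument(s)"
echo "quoted: $quoted argument(s)"
```

The quoted version arrives as a single argument, which is exactly why mkdir makes one directory instead of two when the name is quoted.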
Both apostrophes (single quotes) and quotation marks (double quotes) can be used to denote a text string (whether it contains spaces or new lines) as a single literal value. Whichever one you start with, make sure to end with it:
user@host:~$ echo 'Jimmy says "Hello"'
Jimmy says "Hello"
user@host:~$ echo "Jimmy's friend does not respond"
Jimmy's friend does not respond
When using double quotes, however, certain special characters, such as the dollar signs that denote a variable, will be interpreted by the shell and expanded.
In the single-quote version, the entire text string passed into
echo is interpreted literally:
user@host:~$ some_number=42
user@host:~$ echo 'There are $some_number bottles of beer'
There are $some_number bottles of beer
In the double-quote version, the shell sees the
$ and replaces the variable
some_number with its actual value, 42:
user@host:~$ some_number=42
user@host:~$ echo "There are $some_number bottles of beer"
There are 42 bottles of beer
A technical aside: In the olden days of computing, it was easy to assume that filenames (and the names of programs and commands) would never have a space in them. Now, that's changed. Even so, most programs and commands designed for Unix-like systems still adhere to this "no fancy filenames" mindset – quite sensibly, in my opinion – while allowing users to use the aforementioned quotation marks to delineate fancy filenames.
For the most part, the exercises in this course will work on filenames and references that are safe and simple. But keep in mind that the real world is not so simple, and not knowing that can lead to a lot of problems. For example, watch me create four new directories on my OS X system via mkdir:
~ $ mkdir dogs cats
~ $ mkdir "This is the end, my friend"
~ $ mkdir "Don't ever
> ever
> ever name a directory like
> this."
~ $ ls
Dont ever?ever?ever name a directory like?this.
This is the end, my friend
cats
dogs
# Note: I've removed the apostrophe from the output here for
# formatting purposes
As animated GIF:
Suffice it to say, most programmers do not expect a filename to contain newlines, and that assumption is the source of many comical or critical (and sometimes both) system errors. This is why, later on in this course, we move to more sophisticated text-handling environments, e.g. Python.
One vital purpose of double quotes will be evident in later examples of variable usage. If a variable contains a space-separated value, such as
Documents and Settings, wrapping the variable in double quotes keeps the shell from splitting that value into separate words, which would otherwise lead to nasty, unexpected effects.
Again, this will make more sense when we look at how variables are used. But pretend that the variable
dir_name has been set to
"Documents and Settings". And compare the effects of the three
mkdir calls below:
user@host:~$ dir_name='Documents and Settings'
user@host:~$ echo $dir_name
Documents and Settings
user@host:~$ mkdir '$dir_name'
user@host:~$ mkdir "$dir_name"
user@host:~$ mkdir $dir_name
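The first call, with $dir_name inside single quotes, causes mkdir to create a directory with the literal name of $dir_name.
The second call, with $dir_name inside double-quotes, behaves as expected. The shell expands $dir_name to the string, Documents and Settings, and a single directory with that name is created.
The third call, passing $dir_name unquoted to mkdir, causes three directories to be made: Documents, and, and Settings.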
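If you want to try the three calls without cluttering a real directory, here is a sketch that runs them inside a throwaway directory made with mktemp. The variable dir_count is just a name I picked for the tally.

```shell
# Work in a throwaway directory so nothing real is touched
cd "$(mktemp -d)" || exit 1

dir_name='Documents and Settings'

mkdir '$dir_name'   # single quotes: one directory literally named $dir_name
mkdir "$dir_name"   # double quotes: one directory named Documents and Settings
mkdir $dir_name     # unquoted: the value is split into three separate names

ls -1               # list the results, one entry per line

# Count the entries (tr strips the padding that some versions of wc add)
dir_count=$(ls -1 | wc -l | tr -d ' ')
echo "directories created: $dir_count"
```

Three commands, five directories: only the double-quoted call did what was intended.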
Here's an animated GIF showing which directories are unexpectedly created as a result of a variable containing a value with spaces:
So with the interactive command-line, the shell typically expects to execute a command every time you press Enter (i.e. send a newline character):
sunet_id@corn30:~$ echo Hello
Hello
sunet_id@corn30:~$
There are a few exceptions, such as when quoted values include newline characters (i.e., what happens when you press Enter). And there are special characters we can use to change up the line-by-line interaction, though these are more or less for human-readability purposes.
For a single command that contains so many characters that it causes a line wrap, it's helpful – again, for human-readability, as the computer doesn't care either way – to split it over multiple lines.
Ending a line with a backslash will tell the shell that the command continues onto the next line (notice how the prompt changes into a right-angle-bracket):
sunet_id@corn30:~$ echo Hello \
> world
Hello world
Note: make sure that the backslash is the very last character of the line you wish to continue, i.e. hit Enter immediately after the backslash. Don't put a space or any other character after the backslash on the same line.
Using the backslash at the end of a line is how we explicitly tell Bash, "Hey, don't do anything yet, we're continuing this command on the next line". However, it's fairly easy for typos to make us accidentally carry-over commands. This happens most often with unclosed quote-marks or parentheses:
sunet_id@corn30:~$ echo "How are you world?
>
> ksdfljsadklfj
> "
How are you world?

ksdfljsadklfj

Tip: If you unintentionally run into this situation and can't figure out how to get out, hit Ctrl-C to break out of the limbo and return to the standard prompt.
When you have multiple commands that are so short that they don't seem to merit their own lines, you can use the semicolon to separate them, and Bash will still execute the commands as if you had put them on their own lines:
user@host:/tmp$ pwd; mkdir stuff; cd stuff; pwd
/tmp
/tmp/stuff
As a GIF:
The double ampersand also lets you join commands on a single line. However, && differs from ; in that if one command fails, the commands after it will not run:
user@host:/$ pwd && mkdir stuff && cd stuff && pwd
/
mkdir: cannot create directory 'stuff': Permission denied
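You don't need a permissions error to see the difference. Here is a small sketch that uses false, a standard command that does nothing except fail. The function name demo_separators is made up for this example, and each pair of commands runs inside $( ) so that its output can be dropped into the message.

```shell
demo_separators() {
  # With ';' the echo runs even though false failed
  echo "with ;  -> [$(false; echo 'still ran')]"
  # With '&&' the echo is skipped because false failed
  echo "with && -> [$(false && echo 'still ran')]"
}

demo_separators
```

The brackets come back empty in the second line because the echo after && never ran.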
As a GIF:
The use of double-ampersands is considered a good practice when doing something destructive right after a command that may not succeed. Consider these two commands (but do not run them on your own system):
# Dangerous:
user@host:/$ cd junk; rm -f *
# Safe:
user@host:/$ cd junk && rm -f *
What happens when the
junk directory exists? The
cd (change directory) command will be successful and then the
rm command will remove all files in it. But what happens when
junk doesn't exist? Which directory is the shell in when
cd fails? And where will
rm unexpectedly be doing its business?
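Here is a harmless way to answer those questions, sketched with a throwaway directory from mktemp and with pwd standing in for rm, so nothing is ever deleted. The names sandbox and show_where are my own inventions for the example.

```shell
# A throwaway directory that contains no "junk" subdirectory
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1

show_where() {
  # pwd stands in for "rm -f *": it only reports where the command would run
  echo "after ';'  the next command runs in: [$(cd junk 2>/dev/null; pwd)]"
  echo "after '&&' the next command runs in: [$(cd junk 2>/dev/null && pwd)]"
}

show_where
```

With the semicolon, the second command runs in whatever directory you were already in, which is the last place you want rm -f * to run. With &&, it doesn't run at all.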
This feature won't be particularly helpful to you until you start writing shell script files. But the pound sign can be used to tell Bash to ignore every character to the right of the pound sign. This can be used to annotate your code:
user@host:/tmp/x$ # I hope this works
user@host:/tmp/x$ mkdir new_dir
user@host:/tmp/x$ # hopefully that worked
The line-by-line nature of how Bash processes data makes it an inelegant system for processing data that spans more than one line.
For example, given the HTML snippet below:
<h1>This is a headline</h1>
It is trivial (though clunky) to extract the text,
This is a headline, between the
h1 tags using grep (with Perl-compatible regex):
echo '<h1>This is a headline</h1>' | grep -oP '(?<=<h1>)(.+?)(?=</h1>)'
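However, if the data looks like this:
<h1>This is a headline
</h1>
then the same grep command prints nothing, because grep examines one line at a time and the opening and closing tags no longer sit on the same line.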
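A sketch of how grep's line-at-a-time matching plays out on both shapes of data. To keep it portable, I've swapped the Perl-style pattern for a plain one and used grep -c, which prints the number of matching lines. The variable and function names are made up, and flattening newlines with tr is just one possible workaround.

```shell
one_line='<h1>This is a headline</h1>'
two_lines='<h1>This is a headline
</h1>'
pattern='<h1>.*</h1>'

report_matches() {
  # Both tags on one line: grep finds a matching line
  echo "single line: $(printf '%s\n' "$one_line" | grep -c "$pattern")"
  # Tags on different lines: no single line contains both
  echo "split lines: $(printf '%s\n' "$two_lines" | grep -c "$pattern")"
  # Turning newlines into spaces first makes it one line again
  echo "flattened: $(printf '%s\n' "$two_lines" | tr '\n' ' ' | grep -c "$pattern")"
}

report_matches
```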
While it's possible to use quotation marks to enclose multi-line strings:
echo "hey
you
what's going on?"
– this quickly becomes cumbersome when the strings themselves contain literal quotation marks, as in the case of HTML:
echo "
<p class=\"note\">
John told me, \"This site is the
<a href=\"http://example.com\" target=\"_blank\">
best\"
</a>
</p>
"
By using a "Heredoc string", we can specify that some other delimiter be used to denote the beginning and the end of a string (note that we use
cat now, instead of
echo). Heredocs are a great way to include multi-line text, such as data rows, right alongside our script file.
cat <<EOF
<p class="note">
John told me, "This site is the
<a href="http://example.com" target="_blank">
best"
</a>
</p>
EOF
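The "limit string", which in the above case is
EOF, is traditionally used to delimit the string, though any sequence of characters can be used, as long as these conditions are met:
The limit string at the end of the heredoc matches the one at the start
The closing limit string sits on a line by itself, with no whitespace before or after it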
So this is good:
cat <<THISISMYHEREDOC
hello there
THISISMYHEREDOC
The following examples are both wrong:
cat <<EOF hello there EOF
cat << EOF hello there EOF
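One common way to get this wrong, offered here as an assumption rather than as the exact mistake in the examples above, is to indent the closing limit string. Bash then never recognizes the end of the heredoc and keeps reading to the end of the input. The sketch below runs the broken heredoc in a separate bash so it stays contained. The names broken_script and broken_output are made up.

```shell
# The closing EOF is indented, so Bash does not treat it as the limit string
broken_script='cat <<EOF
hello there
  EOF'

# Run it in a child bash and discard the warning about the unterminated heredoc
broken_output=$(bash -c "$broken_script" 2>/dev/null)

printf '%s\n' "$broken_output"
```

Notice that the indented EOF shows up in the output as if it were ordinary text.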
The notation is a little weird, but think of it as
cat feeding into
stuff.html what it gets from the heredoc:
cat > stuff.html <<EOF
<html>
<h1>
<a href="http://example.com">An example</a>
</h1>
</html>
EOF
By default, a Heredoc that contains special symbols and sequences, such as
$ before a variable name, will have those sequences expanded, just as they would be in a normal double-quoted string. To prevent this, put the
EOF inside of single-quotes:
world="LADEEDAH"
# This gets interpreted
cat <<EOF
Hello $world
EOF
# Output:
# Hello LADEEDAH
# Prevent interpretation:
cat <<'EOF'
Hello $world
EOF
# Output:
# Hello $world
Previously, I said that the limit string, e.g.
EOF, has to be exactly the same at the start and the end of the heredoc. The exception is for certain special symbols, such as single-quotes. In other words, you can begin a heredoc with
'EOF' and end it with
EOF.
A heredoc can also be stored in a variable by using the
read command (read this elaboration on StackOverflow):
read -r -d '' some_variable <<'EOF'
<html>
<h1>
<a href="http://example.com">An example</a>
</h1>
</html>
EOF
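One wrinkle worth knowing about, sketched below with made-up variable names (page and read_status): because read is told to look for a delimiter that never appears in the text, it reads to the end of the heredoc and then reports a non-zero exit status, even though the variable has been filled in. That matters if your script stops on the first failed command.

```shell
read_status=0

# "|| read_status=$?" records read's exit status instead of ignoring it
read -r -d '' page <<'EOF' || read_status=$?
<h1>
  An example
</h1>
EOF

echo "read exited with a non-zero status: $read_status"
echo "$page"
```

The variable still holds the whole block, minus the trailing newline, which read trims off.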