Two roads diverged in a wood, and I –
I took the one less traveled by,
And that has made all the difference.
Robert Frost, "The Road Not Taken"
To create a branch in our program is to create an alternative sequence of commands that may be ignored, based on certain conditions at run-time. In English/pseudocode, the control flow might be described like this:
"If this is true, then do this thing. If not, then do something else."
For example, type out the following sequence:
if [[ $num -eq 42 ]]
then # if/then branch
  echo 'num is actually equal to 42'
else # else branch
  echo 'num is not 42'
fi
If you didn't set the variable num to 42 beforehand, then the condition in the if statement ($num is equal to 42) would evaluate to false. So only the command in the "else branch" is executed, with this result:
num is not 42
However, if you set num to 42 beforehand, then the condition in the if statement is met, as $num evaluates to 42, which is equal to 42. Thus, the code after if/then is executed, with this result:
num is actually equal to 42
Here's a GIF of the process:
Is this confusing the hell out of you? It probably should be, as if/then/else constructs, like for-loops, aren't meant to be typed out at the interactive prompt. Just like for-loops, in which the interpreter waits for you to finish typing in code between do and done, the if/then/else construct isn't executed until you've typed in fi (the closing statement for an if construct).
If/then/else conditional statements, like for loops, represent a fundamental change to the control flow of programs. No longer do commands get executed in sequence, one-at-a-time as you hit Enter. With a for-loop, some command sequences are executed numerous times before the program advances. And with conditional branching, some sequences may be completely ignored.
For the most part, we'll be using conditional branching in shell-scripts, particularly inside loops, to add an extra layer of complexity to our programs. Complexity is not necessarily better, though…As fundamental as conditional branching is, I've waited until we've practiced loops and shell-scripts, as if/else statements can add a whole layer of debugging confusion.
As always, take things one step at a time. Try not to use conditional branching until you've convinced yourself that you really need your single-minded program to handle alternative scenarios.
To test the examples in this section, type the code into a shell script, and then execute it from the command-line.
If you're unfamiliar with how arguments are passed into scripts, keep in mind that the variable $1, inside a script, is equal to the first argument passed into the script at execution time.
In other words, when this command is executed:
bash my-script.sh 90210
my-script.sh has access to the value 90210 by referring to $1 (and if a second argument was passed in, it'd be inside $2).
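To try positional arguments without creating a file first, here is a minimal sketch. The name show_args is hypothetical and merely stands in for a script like my-script.sh: inside a Bash function, $1 and $2 refer to the function's own arguments in the same way that they refer to a script's arguments inside a script.

```shell
# show_args is a hypothetical stand-in for a script such as my-script.sh
show_args() {
  echo "First argument: $1"
  echo "Second argument: $2"
}

# Equivalent in spirit to: bash my-script.sh 90210 94305
show_args 90210 94305
```

If only one argument is passed in, $2 is simply an empty string.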
If there is a command-sequence that should optionally run based on whether a conditional expression is true, then the if/then statement can look as simple as this:
if [[ some condition ]]; then
  do_something
fi
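Note some key things about the syntax:
[[ ]] are used to enclose the conditional expression
There must be whitespace between the brackets and the expression. This will not work: [[$x == $y]]
This will work: [[ $x == $y ]]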
Inside a script named just_an_if.sh, write the following code:
echo 'Hello'
if [[ $1 == 'awesome' ]]; then
  echo 'You are awesome'
fi
echo 'Bye'
Running that script will look like this:
dun@corn02:~$ bash just_an_if.sh stuff
Hello
Bye
## now with awesome
dun@corn02:~$ bash just_an_if.sh awesome
Hello
You are awesome
Bye
The branching logic looks like this:
'Hello' ________________________________________ 'Bye'
      \                                        /
       if [[ $1 == 'awesome' ]]               /
         then                                /
           \                                /
            \___'You are awesome'__________/
If $1 is not equal to 'awesome', then the program continues along to the final line. If it does equal 'awesome', then the program takes the then branch of code.
For situations that call for either this to happen or that to happen, we use the else syntax:
if [[ some condition ]]; then
  do_this
else
  do_that
fi
Inside a script named if_else.sh, write the following code:
echo 'Hello'
if [[ $1 == 'awesome' ]]; then
  echo 'You are awesome'
else
  echo 'You are...OK'
fi
echo 'Bye'
Running that script will look like this:
dun@corn02:~$ bash if_else.sh stuff
Hello
You are...OK
Bye
## now with awesome
dun@corn02:~$ bash if_else.sh awesome
Hello
You are awesome
Bye
Here's a diagram of that control flow:
'Hello' __                                  __________'Bye'
          \                                /         /
           if [[ $1 == 'awesome' ]]       /         /
           |  then                       /         /
           |    \                       /         /
           |     \___'You are awesome'_/         /
            \                                   /
             else                              /
               \____'You are...OK'____________/
Unlike the standalone if-statement, if the program fails to meet the if condition ($1 == 'awesome'), it does not simply continue to the final line, echo 'Bye'. Instead, it branches into its own command sequence, echo 'You are...OK'.
Many situations call for more than an "either/or". For that, we have elif, which allows us to make as many alternative branches as we'd like:
if [[ some condition ]]; then
  do_this
elif [[ another condition ]]; then
  do_that_a
elif [[ yet another condition ]]; then
  do_that_b
else
  do_that_default_thing
fi
In a script named if_elif_else.sh, write the following code:
echo 'Hello'
if [[ $1 == 'awesome' ]]; then
  echo 'You are awesome'
elif [[ $1 == 'bad' ]]; then
  echo 'Yuck'
else
  echo 'You are...OK'
fi
echo 'Bye'
dun@corn02:~$ bash if_elif_else.sh awesome
Hello
You are awesome
Bye
dun@corn02:~$ bash if_elif_else.sh bad
Hello
Yuck
Bye
dun@corn02:~$ bash if_elif_else.sh kinda_bad
Hello
You are...OK
Bye
The diagram of the control flow:
'Hello' __                                  ______________'Bye'
          \                                /             /
           if [[ $1 == 'awesome' ]]       /             /
           |  then                       /             /
           |    \                       /             /
           |     \___'You are awesome'_/             /
           |\                                       /
           | elif [[ $1 == 'bad' ]]                /
           |    then                              /
           |        \_______'Yuck'_______________/
            \                                   /
             else                              /
               \____'You are...OK'____________/
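One detail worth checking for yourself: the branches are tested from top to bottom, and only the first branch whose condition is true gets executed. The sketch below is an illustration under assumptions: the function name size_label and the thresholds 10 and 100 are made up, and -lt is the integer "less than" test.

```shell
# size_label is a hypothetical helper; $1 is expected to be an integer
size_label() {
  if [[ $1 -lt 10 ]]; then
    echo 'small'
  elif [[ $1 -lt 100 ]]; then
    # 5 is also less than 100, but this branch is never reached for 5
    echo 'medium'
  else
    echo 'large'
  fi
}

size_label 5
size_label 50
size_label 500
```

Because of this top-to-bottom behavior, the order in which you write the elif branches matters: putting the -lt 100 test first would label 5 as medium.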
-a filename - true if filename exists
-f filename - true if filename exists and is a regular file
-d filename - true if filename exists and is a directory
-s filename - true if filename exists and has a size > 0
-z $some_string - true if $some_string has 0 characters (i.e. is empty)
-n $some_string - true if $some_string has more than 0 characters
$string_a == $string_b - true if $string_a is equal to $string_b
$string_a != $string_b - true if $string_a is not equal to $string_b
$x -eq $y - true if integer $x is equal to integer $y
$x -lt $y - true if integer $x is less than integer $y
$x -gt $y - true if integer $x is greater than integer $y
See a full list of expressions in the Bash documentation
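As a rough way to get a feel for these expressions, the following sketch exercises a few of the file tests and integer tests against a throwaway directory. The helper names describe_path and compare_numbers, and the file names, are hypothetical and chosen only for this illustration.

```shell
# Work in a throwaway directory so that nothing important is touched
workdir="$(mktemp -d)"
touch "$workdir/empty.txt"                # exists, but has a size of 0
echo 'some text' > "$workdir/notes.txt"   # exists and has a size > 0

# describe_path is a hypothetical helper; $1 is the path to inspect
describe_path() {
  if [[ -d $1 ]]; then
    echo 'directory'
  elif [[ -s $1 ]]; then
    echo 'file with content'
  elif [[ -f $1 ]]; then
    echo 'empty file'
  else
    echo 'missing'
  fi
}

# compare_numbers is a hypothetical helper; $1 and $2 are integers
compare_numbers() {
  if [[ $1 -lt $2 ]]; then
    echo "$1 is less than $2"
  elif [[ $1 -gt $2 ]]; then
    echo "$1 is greater than $2"
  else
    echo "$1 is equal to $2"
  fi
}

describe_path "$workdir"
describe_path "$workdir/notes.txt"
describe_path "$workdir/empty.txt"
describe_path "$workdir/missing.txt"
compare_numbers 7 42
```

The -d test comes first on purpose, since a directory would otherwise also pass the -s test.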
A common problem in long-running web-scraping tasks, or anything involving the Internet, is that you have to worry about the target site, or the entire Internet going down. Preparing for this scenario is a huge part of professional systems engineering.
What we've done so far hasn't risen to that level of engineering. But we still have need, quaint as it is, for more robust operation. For example, it'd be nice if our web-scraper, when it has to quit and then restart, could continue from where it left off, as opposed to re-downloading the pages it already downloaded.
To implement that kind of unnecessary-download-prevention, we can use the test for file existence:
for url in http://www.example.com http://www.wikipedia.org http://www.cnn.com
do
  # remove all punctuation characters
  fname=$( echo $url | tr -d '[:punct:]')
  if [[ -a $fname ]]; then
    echo "Already exists: $fname"
  else
    echo "Downloading $url into $fname"
    # save the page, so that the file exists the next time the script runs
    curl -s "$url" > "$fname"
  fi
done
If you put that code into a shell script named nice-downloader.sh and run it twice (and assuming it isn't interrupted the first time):
user@host:~$ bash nice-downloader.sh
Downloading http://www.example.com into httpwwwexamplecom
Downloading http://www.wikipedia.org into httpwwwwikipediaorg
Downloading http://www.cnn.com into httpwwwcnncom
# second time:
user@host:~$ bash nice-downloader.sh
Already exists: httpwwwexamplecom
Already exists: httpwwwwikipediaorg
Already exists: httpwwwcnncom
The exclamation mark can be used within the conditional expression if we want a branch to execute when something is not true:
if [[ ! 1 -eq 0 ]]; then
  echo 'FYI, one is not equal to zero'
fi
The conditional expression in the above example reads as: if it is not true that 1 is equal to 0, then…
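A more practical use of negation is to act only when something is missing. The sketch below, which assumes a hypothetical helper named ensure_dir and a made-up downloads directory, creates a directory only if it is not already there:

```shell
# ensure_dir is a hypothetical helper; $1 is the directory we want to exist
ensure_dir() {
  if [[ ! -d $1 ]]; then
    echo "Creating $1"
    mkdir -p "$1"
  else
    echo "Already have $1"
  fi
}

scratch="$(mktemp -d)/downloads"   # hypothetical location for this demo
ensure_dir "$scratch"              # first call creates the directory
ensure_dir "$scratch"              # second call finds it already there
```

The condition reads as: if it is not true that $1 is an existing directory, then create it.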
We can test more than one conditional expression at once, using && to require that both conditions be true, or using || to require that either one (or both) of the conditions be true.
Use && to join two conditional expressions in a way that reads: condition A and condition B must both be true:
if [[ $a -gt 42 && $a -lt 100 ]]; then
  echo "The value $a is greater than 42 but less than 100"
else
  echo "The value $a is not between 42 and 100"
fi
In the above example, the if statement evaluates to true only if both of the conditional expressions are true:
$a is greater than 42 ($a -gt 42)
$a is less than 100 ($a -lt 100)
The following, much more convoluted, code achieves the same result. In other words, avoid nested if-blocks unless absolutely necessary:
if [[ $a -gt 42 ]]; then
  if [[ $a -lt 100 ]]; then
    echo "The value $a is greater than 42 but less than 100"
  else
    echo "The value $a is not between 42 and 100"
  fi
elif [[ $a -lt 100 ]]; then
  if [[ $a -gt 42 ]]; then
    echo "The value $a is greater than 42 but less than 100"
  else
    echo "The value $a is not between 42 and 100"
  fi
else
  echo "The value $a is not between 42 and 100"
fi
Sometimes, you need a conditional expression to read as: "if condition A OR condition B is true".
The if-then branch below will execute if either of these conditions is met:
$a is less than 42
$a is greater than 100
if [[ $a -lt 42 || $a -gt 100 ]]; then
  echo "The value $a is either: less than 42, or greater than 100"
else
  echo "The value $a is between 42 and 100"
fi
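The same operators work with the string and file tests, not just with integers. As a sketch under assumptions (the helper name should_download and its messages are hypothetical, and nothing is actually downloaded), here is one way || and && might guard a download step:

```shell
# should_download is a hypothetical helper: $1 is a URL, $2 is a filename
should_download() {
  if [[ -z $1 || -z $2 ]]; then
    # either argument being empty is enough to land here
    echo 'Need both a URL and a filename'
  elif [[ -f $2 && -s $2 ]]; then
    # the file must be a regular file AND have content
    echo "Skipping $1: $2 already has content"
  else
    echo "Would download $1 into $2"
  fi
}

saved="$(mktemp)"                  # an empty temporary file
should_download http://www.example.com
should_download http://www.example.com "$saved"
echo 'page content' > "$saved"     # pretend that a download happened
should_download http://www.example.com "$saved"
```

The three calls should take the three different branches in turn: the first is missing a filename, the second points at an empty file, and the third points at a file with content.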
Up to this point, we've been acquainted with the read-while loop, which executes commands for every line in an input file:
while read url
do
  curl "$url" >> everywebpage_combined.html
done < list_of_urls.txt
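Conditional branching fits naturally inside a read-while loop. The following sketch assumes a hypothetical helper named preview_urls and a made-up temporary list of URLs; it only prints what it would do, and uses -z to skip blank lines rather than handing them to curl:

```shell
# Build a small throwaway list that contains a blank line
list="$(mktemp)"
printf 'http://www.example.com\n\nhttp://www.wikipedia.org\n' > "$list"

# preview_urls is a hypothetical helper; $1 is the file of URLs to read
preview_urls() {
  # -r tells read to leave any backslashes in the line as they are
  while read -r url; do
    if [[ -z $url ]]; then
      echo 'Skipping a blank line'
    else
      echo "Would fetch $url"
    fi
  done < "$1"
}

preview_urls "$list"
```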
Another form of the while loop involves passing in a conditional statement that, when true, causes the loop to repeat itself.
The following example sets countdown to 5. For each iteration of the while loop, the condition [[ $countdown -ge 0 ]] is tested. If it is true, then the loop executes again. This loop keeps executing as long as the value of countdown is greater than or equal to 0. Once countdown has a value of -1, the condition [[ $countdown -ge 0 ]] will be false, and the loop will cease execution.
How or when does countdown reach -1? The code inside the loop subtracts 1 from countdown each time the loop runs:
user@host:~$ countdown=5
user@host:~$ while [[ $countdown -ge 0 ]]; do
  echo "Liftoff in...$countdown"
  countdown=$(( countdown - 1 ))
done
echo 'And we have liftoff'
#
Liftoff in...5
Liftoff in...4
Liftoff in...3
Liftoff in...2
Liftoff in...1
Liftoff in...0
And we have liftoff
Now what happens if the line countdown=$(( countdown - 1 )) weren't included? Then [[ $countdown -ge 0 ]] would always be true, and the loop wouldn't stop until the universe, or the computer, dies of heat death.
Try the amended loop, and prepare to hit Ctrl-C:
countdown=1
while [[ $countdown -ge 0 ]]; do
  echo "Liftoff in...$countdown"
done
Sometimes infinite loops are useful, for situations in which we want a program to be performing a task in the background for the indefinite future. In that case, you can simply use true for the conditional statement, which, well, is always true. The following program will remind me to be positive, every 12 hours (43,200 seconds), for as long as the computer stays on. Or until I kill it:
words="You're good enough, you're smart enough, and doggone it, people like you"
while true; do
  sleep 43200
  # options such as -s go before the recipient address
  echo "$words" | mail -s 'Important reminder' firstname.lastname@example.org
done
Note: for the purposes of this class and corn.stanford.edu, you should probably not use an infinite loop, but instead, have a finite bound so that if you forget which machine you were on when you launched a script and are thus unable to kill it, it will die at least sometime on its own:
for x in $(seq 1 1000); do
  echo "something for every 10 minutes"
  sleep 600
done
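A while loop can carry its own finite bound, too, by combining the real condition with a counter. This is only a sketch under assumptions: the helper name wait_for_file, its two parameters, and the one-second pause are all hypothetical choices for illustration.

```shell
# wait_for_file is a hypothetical helper:
#   $1 is the file to wait for, $2 is the maximum number of checks
wait_for_file() {
  local checks=0
  # keep looping only while the file is missing AND we have checks left
  while [[ ! -f $1 && $checks -lt $2 ]]; do
    checks=$(( checks + 1 ))
    echo "Check $checks: $1 is not there yet"
    sleep 1
  done

  if [[ -f $1 ]]; then
    echo "Found $1"
  else
    echo "Gave up after $checks checks"
  fi
}

# The file below never appears, so the loop stops after 2 checks
wait_for_file "$(mktemp -d)/report.txt" 2
```

Even if the thing being waited for never happens, the counter guarantees that the loop dies on its own.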