Creating a shell script

CSCI 132: Practical Unix and Programming Adjunct: Trami Dang Assignment 3 Fall 2018

Assignment 3 [1]

Summary

The purpose of this assignment is to give you some practice in bash scripting. When you write a bash script, you are really writing a program in the bash programming language. bash is not just a shell, but a programming language as well, and a bash script can just as well be called a bash program. It is mentioned now because very soon you will begin writing programs in another programming language, Perl; this is the first in a sequence of small steps in mastering Perl.

We have been calling your programs shell scripts. A script is a program, make no bones about it. Scripts are programs written in a scripting language, which is a special kind of programming language. All scripting languages are programming languages, but not vice versa. The distinction will be explained in a later lecture. In this case, bash is both a programming language and a scripting language.

This assignment will begin with a review of some of the things that have been covered in class, and then introduce a few things not covered in class.

Some Important bash Instructions

A bash instruction is also called a statement. For example, the if-instruction

    if test $# -ne 2
    then
        echo "usage: $0 arg1 arg2"
        exit
    fi
    echo "User input: $1 and $2"

is usually referred to as the if-statement. The test condition in this case is $# -ne 2. If the test is true, that is, if the number of command parameters ($#) is not equal to 2 (-ne 2), the statement(s) between then and fi execute. If the test is not true, bash will skip those statements and execute what comes after the if-statement, here the second echo statement.

[1] This is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit
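The if-statement described above can be tried out without writing a separate script file. The following is a minimal sketch that wraps the same test in a function; show_args is a hypothetical name chosen for illustration, and inside a function $#, $1, and $2 refer to the function's own arguments in the same way that they refer to a script's arguments at the top level.

```shell
#!/bin/bash
# show_args is a hypothetical helper used only for illustration.
# Inside a function, $# counts the function's arguments.
show_args() {
    if test $# -ne 2
    then
        echo "usage: show_args arg1 arg2"
        return 1          # a function uses return where a script would use exit
    fi
    echo "User input: $1 and $2"
}

# Two arguments: the test is false, so the statements after then are skipped.
show_args red blue

# One argument: the test is true, so the usage message is printed.
show_args red || echo "(show_args returned $?)"
```

Calling the function with any number of arguments other than two should take the usage branch.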


The bash programming language has several statements that are known as looping statements. A looping statement is one that makes it possible to repeat a sequence of statements one or more times. bash has a looping statement called the while-statement, whose form (syntax) is

    while <expression>
    do
        <list-of-statements>
    done

in which <expression> is a statement such as the test command, or any other statement that can be evaluated as being true or false, and <list-of-statements> is any sequence of statements (including looping statements). The following snippet (little piece) of a script shows one example of a while-statement:

    echo -n "Try to guess my favorite color:"
    read guess rest_of_line
    mycolor=`cat secretfile`   # read about backquoted commands like `cat file`
    while [ "$guess" != "$mycolor" ]
    do
        echo -n "Sorry, that is not my favorite color. Try again: "
        read guess rest_of_line
    done

In the above script, the <expression> part of the while-statement is

    [ "$guess" != "$mycolor" ]

and the <list-of-statements> is the list of two lines

    echo -n "Sorry, that is not my favorite color. Try again: "
    read guess rest_of_line

The double quotes around the variables keep the test well-formed even if the user enters an empty line.

The above script will test whether guess is the same string as mycolor, and if it is not, it will execute the echo and read statements and then re-evaluate the test command that compares guess and mycolor. It will keep doing this until the user enters a string that is identical to the string stored in mycolor. When they do match, the expression becomes false and the “while loop” is exited.

A while-statement is usually called a while loop because if we visualize the sequence of executed statements as being connected by an imaginary thread, then this thread loops around and around the lines of the script.
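To watch the loop go around without typing guesses by hand, the guesses can be fed in on standard input. The sketch below is an illustration only: count_guesses is a hypothetical helper, the favorite color is passed as an argument instead of being read from secretfile, and the arithmetic uses $(( )) where the handout would use let.

```shell
#!/bin/bash
# count_guesses is a hypothetical helper used only for illustration.
# It reads guesses from standard input until one equals its first argument,
# then reports how many guesses were read.
count_guesses() {
    mycolor=$1
    tries=1
    read guess rest_of_line || return 1
    while [ "$guess" != "$mycolor" ]
    do
        tries=$((tries + 1))                  # one more trip around the loop
        read guess rest_of_line || return 1   # give up if the input runs out
    done
    echo "Number of guesses: $tries"
}

# Three guesses are supplied; the third one matches and ends the loop.
printf 'red\ngreen\nblue\n' | count_guesses blue
```

If the first guess already matches, the test is false the first time it is evaluated and the loop body never runs.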


Bash also has a for-loop. (It has other loops too.) The for-loop is very different from the while-loop. It has two forms. One form (again the proper term is syntax) is

    for <variable> in <argument-list>
    do
        <list-of-statements>
    done

and the other is

    for <variable>
    do
        <list-of-statements>
    done

The <variable> can be any valid variable name (words starting with letters and containing letters, digits, and the underscore character). The <argument-list> can be any sequence of words, including words that look like numbers. Examples of this are

    for number in 1 2 3 4 5 6 7 8 9 10
    for name in John Jacob Judy Jocelyn
    for word in $*

As you can see, this can be very powerful.

As with the while-loop, the list of statements is any list of statements, but the intention is that the variable plays a role in this list. For example, the script

    let sum=0
    for number in 1 2 3 4 5 6 7 8 9 10
    do
        let square=$number*$number
        let sum=$sum+$number
        echo The square of $number is $square
    done
    echo The sum of the numbers is $sum.

displays ten lines showing the squares of the first ten positive integers and then displays their sum. Notice how the sum is calculated.

The second form of the for-loop does not need an argument list. It automatically assigns to the variable the successive words from the command line arguments of the script when it is run:


    for word
    do
        echo $word.
    done

For arguments that contain no white space, it is the same as

    for word in $*
    do
        echo $word.
    done

It just prints the words found on the command line one after the other on separate lines, each followed by a period.
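The list-free form of the for-loop can also be tried inside a function, where it walks through the function's arguments instead of the script's. This is a small sketch for experimentation; print_words is a hypothetical name.

```shell
#!/bin/bash
# print_words is a hypothetical helper used only for illustration.
# With no "in <argument-list>", the loop visits each argument in turn.
print_words() {
    for word
    do
        echo "$word."
    done
}

# Prints each name on its own line, followed by a period.
print_words John Jacob Judy
```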

Your Tasks

This assignment consists of two exercises in writing relatively simple shell scripts. The objectives when writing any script are

clarity: the script should be easy to understand by someone with a basic knowledge of UNIX;

efficiency: the script should use the least resources possible; and

simplicity: the script should be as simple as possible.

An example will demonstrate. Suppose we needed a script that would count the number of lines in a file named molecule containing the word 'ATOM' anywhere on the line. The following script would achieve this:

    #!/bin/bash
    grep ' ATOM ' molecule >| atomcount
    wc -l atomcount >| answer
    rm atomcount
    cat answer
    rm answer

but it is very inefficient (it needlessly creates files and then removes them), it is hard to understand because the reader spends more time reading it and may not be familiar with certain operators such as >|, and it is not as simple as it could be. A simple, well-documented, and efficient solution is

    #!/bin/bash
    # Displays how many lines in file molecule contain ATOM as
    # a complete word
    # Written by Stewart Weiss
    grep -c ' ATOM ' molecule   # The -c option to grep counts matching lines

It has comments to explain what it does and it achieves it with a single command that can be looked up easily.
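To see the single-command version at work without a real molecule file, the sketch below builds a small temporary file and counts its matching lines. The file contents are made up for illustration, and count_atoms is a hypothetical helper that takes the file name as an argument.

```shell
#!/bin/bash
# count_atoms is a hypothetical helper used only for illustration.
# It prints how many lines of the named file contain " ATOM ".
count_atoms() {
    grep -c ' ATOM ' "$1"
}

# Build a small sample file; its contents are invented for this example.
sample=$(mktemp)
cat > "$sample" <<'EOF'
HEADER sample
 ATOM 1 N
 ATOM 2 C
 HETATM 3 O
END
EOF

grep ' ATOM ' "$sample" | wc -l   # filter the lines, then count them
count_atoms "$sample"             # the same count from a single command

rm "$sample"
```

Both commands should report the same number, because only the two lines with ATOM surrounded by spaces match the pattern.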

Your job is to apply these ideas as you create solutions to the following exercises.

1. The last command lists information about who has logged into the computer on which it is run. In particular, it has a column with the username, the terminal on which the user was connected, the internet address (the IP address) from which they connected to the computer, and the date and time that they logged in and then logged out if they did. If they logged out it also displays the total time they were logged in. For example, this is an entry for Dr. Weiss on cslab12:

    sweiss pts/11 Thu Sep 14 13:05 - 14:27 (01:22)
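The columns of such a record can be picked apart with the read command, which assigns one word to each variable and the remainder of the line to the last one. The following sketch works on the sample entry above as a fixed string; show_login is a hypothetical helper, and it does not run the last command itself.

```shell
#!/bin/bash
# show_login is a hypothetical helper used only for illustration.
# It splits one record into username, terminal, and everything else.
show_login() {
    echo "$1" | {
        read user terminal rest_of_line
        echo "$user was connected on $terminal"
    }
}

# The sample entry from the text, used here as plain data.
show_login 'sweiss pts/11 Thu Sep 14 13:05 - 14:27 (01:22)'
```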

If the username is too long it is truncated, but there are options to display the full username. For this exercise you are to write a bash script named logincount that takes a list of usernames as its command line arguments and displays on the screen, for each user name, a message of the form

    Number of times that <username> logged into this machine is <N>

where <N> is to be replaced by the number of records in the output of the last command that match <username> exactly. For example, if I enter the command

    logincount sweiss

it should output something like

    Number of times that sweiss logged into this machine is 7

If a name given as an argument is not a username, nothing is printed for that name. On the other hand, if no names are given, it is an error and the command should display the error message, "Usage: logincount <list of usernames>".

2. A DNA string is a sequence of the letters a, c, g, and t in any order. For example, aacgtttgtaaccag is a DNA string of length 15. Each sequence of three consecutive letters is called a codon. For example, in the preceding string, the codons are aac, gtt, tgt, aac, and cag. If we ignored the first letter and started listing the codons starting at the second a, the codons would be acg, ttt, gta, and acc, and we would ignore the last ag. The letters are called bases.

A DNA string can be hundreds of thousands of codons long, even millions of codons long, which means that it is infeasible to count them by hand. It would be useful to have a simple script that could count the number of occurrences of a specific codon in such a string. For instance, for the example string above such a script would tell us that aac occurs twice and tgt occurs once. Generally, we want to be able to find occurrences of arbitrary sequences of bases in a given DNA string, such as how many times ttatg occurs, or how many times cgacgattag occurs.

Your job is to write a script named countmatches that expects at least two arguments on the command line. The first argument is the pathname of a file containing a valid DNA string with no newline characters or white space characters of any kind within it. (It will be terminated with a newline character.) This file contains nothing but a sequence of the letters a, c, g, and t. DNA text files that you can give to your script as the file argument are located at

    /data/biocs/b/student.accounts/cs132/data/dna_textfiles

The remaining arguments are strings containing only the bases a, c, g, and t in any order. For each valid argument string, it will search the DNA string in the file and count how many non-overlapping occurrences of that argument string are in the DNA string. To make sure you understand what non-overlapping means, the string ata occurs just once in the string atata, not twice, because the two occurrences overlap.
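One possible way to see the non-overlapping rule in action is grep's -o option, which prints each match on its own line and resumes searching after the end of the previous match. The sketch below is not the required countmatches script: count_nonoverlapping is a hypothetical helper that takes the DNA string itself instead of a file name.

```shell
#!/bin/bash
# count_nonoverlapping is a hypothetical helper used only for illustration.
# grep -o prints every non-overlapping match on a separate line,
# and wc -l counts those lines.
count_nonoverlapping() {
    echo "$1" | grep -o "$2" | wc -l
}

# Only one ata fits in atata once the first match has been consumed.
count_nonoverlapping atata ata

# The sample string from the text contains aac three times.
count_nonoverlapping aaccgtttgtaaccggaac aac
```

When the pattern does not occur at all, grep prints nothing and wc -l reports 0, which matches the required behavior for a string with no occurrences.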

If your script is called correctly, it will output for each argument a line containing the argument string followed by how many times it occurs in the string. If it finds no occurrences, it should output 0 as a count. For example, if the string aaccgtttgtaaccggaac is in a file named dnafile, then your script should work like this:

    $ countmatches dnafile ttt
    ttt 1
    $ countmatches dnafile aac ggg aaccg
    aac 3
    ggg 0
    aaccg 2


Warning: if it is given valid arguments, the script is not to output anything except the strings and their associated counts. No fancy messages, no words!

The script should check that the first argument is a file name and that there is at least one other argument after it. If the first argument is not a file name, or if it is missing anything after the filename, the script should print a how-to-use-me message and then exit. It is not required to check that the file is in the proper form, or that the string contains nothing but the letters a, c, g, and t.

Hint: You can solve this problem using grep and one other command that appears in this document. Although there are other filters, you do not need them to solve this problem. You have to read more about grep to know how to use it. The other command has appeared in the slides already.
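The argument checks described above might be sketched as follows. This is an illustration under stated assumptions: check_usage is a hypothetical helper, the wording of the usage message is invented, and a real script would use exit where the function uses return.

```shell
#!/bin/bash
# check_usage is a hypothetical helper used only for illustration.
# It succeeds silently when the first argument names an existing file
# and at least one more argument follows it.
check_usage() {
    if test $# -lt 2 || test ! -f "$1"
    then
        # The wording of this message is an assumption, not a requirement.
        echo "usage: countmatches <file> <string> [<string> ...]"
        return 1
    fi
}

sample=$(mktemp)   # a temporary file stands in for a DNA file

check_usage "$sample" aac && echo "arguments accepted"
check_usage "$sample" || echo "rejected: nothing after the file name"
check_usage /no/such/file aac || echo "rejected: first argument is not a file"

rm "$sample"
```

Note that the check prints nothing at all when the arguments are valid, in keeping with the warning above.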

Grading Rubric

This homework is graded on a 100-point scale. Each script is worth 50 points. Each script will be graded on its correctness foremost. This means that it does exactly what the assignment states it must do, in detail. Correctness is worth 70% of the grade. Then it is graded on its clarity, simplicity, and efficiency, as described above. These qualitative measures are worth 30% of the grade.

Submitting the Homework

Due Date: This assignment is due by the end of the day (i.e. 11:59PM, EST) on Wednesday, October 31st. I will let the class know when the Blackboard assignment submission for this assignment opens. If you complete the assignment before I announce that the Blackboard submission is open, you may email your assignment to me, only as a zip archive.

Submission details

Submit either a PDF showing the commands you entered, with screenshots of all output, or a zip file. For remote logins: ssh to with your valid username and password, and then ssh into any cslab host.

1. In your own home directory, create a directory named assignment3_username where username is your Linux Lab account username.

2. Put copies of the two scripts that you have written into this directory. Make sure they are named logincount and countmatches.

3. Run the commands:

    $ zip -r assignment3_username/
    $ chmod 755

This will create the file

For Linux Lab users: once you have made the zip file, navigate to its location in the file system and upload it to Blackboard.

For anyone working on the assignment remotely, use the scp command to securely copy it to your local computer, and then upload the file to Blackboard.

    $ scp <>:<path_of_zip_file> <desired_path>

There is no whitespace on either side of the colon. Your login, Your.Username is named before the colon. The <path_of_zip_file>, named after the colon, is the absolute path on the remote machine. Then type a space and specify the <desired_path> on your local file system where you would like to put your zip file. If you run the command properly it should bring up a password prompt from

The zip file will be placed in your specified location. Now you are ready to upload your zip file to Blackboard.
