Chapter 12 Solutions to Exercises
12.0.9 Writing BASH Scripts
Corresponding exercises: see Section 6.9.3
vim like.sh
# Contents of like.sh:
#!/bin/bash
echo "I like $1!"
# Leave vim by pressing Esc, then :wq
chmod u+x like.sh
# you can call your script three times like this:
./like.sh "biology"
./like.sh "computer science"
./like.sh "bioinformatics"
# or alternatively, you write a for-loop:
for x in "biology" "computer science" "bioinformatics"; do ./like.sh "$x"; done

vim reasons.sh
# Contents of reasons.sh:
#!/bin/bash
echo "$1" >> whyILikeBASH.txt
# Leave vim by pressing Esc, then :wq
chmod u+x reasons.sh
# you can call your script three times like this:
./reasons.sh "powerful"
./reasons.sh "flexible"
./reasons.sh "fast"
# or alternatively, you write a for-loop:
for x in "powerful" "flexible" "fast"; do ./reasons.sh "$x"; done
cat whyILikeBASH.txt
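As a variation on the for-loop above, the loop can also live inside the script, so that one call handles any number of arguments. The following is only a sketch: the function name like_all is made up for illustration and is not part of the exercise.

```shell
#!/bin/bash
# Hypothetical variant of like.sh: loop over all arguments inside the script.
like_all() {
    # "$@" expands to one word per argument, so "computer science"
    # stays a single argument despite the space.
    for subject in "$@"; do
        echo "I like $subject!"
    done
}

like_all "biology" "computer science" "bioinformatics"
```

Quoting "$@" (and "$x" in the loops above) matters: without the quotes, an argument containing a space would be split into two words.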
12.0.18 awk on the Banthracis proteome
Corresponding exercises: see Section 10.6.2
# If the 1st column is ID, store the name and length and empty the seq variable.
# If the 1st column is SQ, set addseq=1.
# If the 1st column is //, set addseq=0 (stop adding sequences) and print.
# If addseq=1, add the sequence to seq.
# The order of these conditions is crucial: e.g. if the last two conditions were switched, you would add "//" to seq.
awk 'BEGIN {addseq=0}
{
  if ($1 == "ID") {name = $2; len = $4; seq = ""}
  else if ($1 == "SQ") {addseq = 1}
  else if ($1 == "//") {addseq = 0; print name, len, seq}
  else if (addseq == 1) {seq = seq $1 $2 $3 $4 $5 $6}
}' BanthracisProteome.txt > seq.txt
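To check the logic before running it on the whole proteome, it can help to try the same awk program on a tiny hand-made file. The sketch below assumes the proteome file follows the usual UniProt flat-file layout (ID, DR, SQ and // lines). The two records, the file name mock.txt and the function name extract_seqs are invented for illustration and are not real data.

```shell
#!/bin/bash
# Made-up two-protein file in a UniProt-like flat format (not real data).
tmpdir=$(mktemp -d)
cat > "$tmpdir/mock.txt" <<'EOF'
ID   PROT1_TEST   Reviewed;   12 AA.
DR   GO; GO:0005886; C:plasma membrane; IEA:Test.
SQ   SEQUENCE   12 AA;  1300 MW;  ABCDEF CRC64;
     MKTAYIAKQR QI
//
ID   PROT2_TEST   Reviewed;   25 AA.
SQ   SEQUENCE   25 AA;  2700 MW;  ABCDEF CRC64;
     MSDNELLKRA GVPLHTLRNW AEKGL
//
EOF

# Same state machine as in the solution: addseq marks whether we are
# currently between an SQ line and the closing // line.
extract_seqs() {
    awk 'BEGIN {addseq=0}
    {
      if ($1 == "ID") {name = $2; len = $4; seq = ""}
      else if ($1 == "SQ") {addseq = 1}
      else if ($1 == "//") {addseq = 0; print name, len, seq}
      else if (addseq == 1) {seq = seq $1 $2 $3 $4 $5 $6}
    }' "$1"
}

extract_seqs "$tmpdir/mock.txt"
```

Each output line should contain the name, the length and the sequence with the blanks between the blocks removed, so the printed length can be compared with the number of letters in the sequence.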
12.0.19 R and bash
Corresponding exercises: see Section 11.1.1
#!/bin/bash
wget https://data.geo.admin.ch/ch.meteoschweiz.messwerte-lufttemperatur-10min/ch.meteoschweiz.messwerte-lufttemperatur-10min_en.csv
cut -d";" -f4,6 ch.meteoschweiz.messwerte-lufttemperatur-10min_en.csv | tail -n+2 > cols.csv
# tail removed the header line, so tell R not to treat the first row as a header
echo 'file <- read.csv("cols.csv", sep = ";", header = FALSE); pdf("altVsTemp.pdf"); plot(file[,2], file[,1], xlab = "altitude", ylab = "temperature", main = "altitude vs temperature"); dev.off()' | R --slave
grep Fribourg ch.meteoschweiz.messwerte-lufttemperatur-10min_en.csv | cut -d";" -f4
rm ch.meteoschweiz.messwerte-lufttemperatur-10min_en.csv
rm cols.csv

#!/bin/bash
for go in GO:0005886 GO:0005737 GO:0003677 GO:0005524 GO:0016021; do
  # if $1=ID, save the length and set "found" to 0;
  # if $1=DR, $2 contains GO and the line contains the current GO term, set "found" to 1;
  # if the end of the protein is reached ($1=//) and the GO term was found (found=1), print the length.
  awk -v GO="${go}" '$1=="ID"{len=$4; found=0}; $1=="DR" && $2~"GO" && $0~GO {found=1}; $1=="//" && found {print len}' BanthracisProteome.txt > "${go}_length.txt"
  echo "lengths <- read.table('${go}_length.txt'); pdf('${go}_length.pdf'); hist(lengths[,1], main = '${go}'); dev.off()" | R --slave
done
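The GO-term filter can be tried on a small made-up file before looping over the real proteome. The sketch below shows how a shell variable passed to awk with -v can be used inside the match, so that each GO term selects its own set of proteins. It assumes UniProt-style DR lines of the form "DR   GO; GO:...;". The records, the file name mock_go.txt and the function name lengths_for_go are invented for illustration.

```shell
#!/bin/bash
# Made-up three-protein file with GO cross-references (not real data).
tmpdir=$(mktemp -d)
cat > "$tmpdir/mock_go.txt" <<'EOF'
ID   PROT1_TEST   Reviewed;   12 AA.
DR   GO; GO:0005886; C:plasma membrane; IEA:Test.
//
ID   PROT2_TEST   Reviewed;   25 AA.
DR   GO; GO:0005737; C:cytoplasm; IEA:Test.
//
ID   PROT3_TEST   Reviewed;   40 AA.
DR   GO; GO:0005886; C:plasma membrane; IEA:Test.
DR   GO; GO:0005737; C:cytoplasm; IEA:Test.
//
EOF

# Print the length of every protein annotated with the given GO term.
# $1: GO term, $2: input file
lengths_for_go() {
    awk -v GO="$1" '
      $1 == "ID" {len = $4; found = 0}                 # new protein: reset the flag
      $1 == "DR" && $2 ~ "GO" && $0 ~ GO {found = 1}   # DR line mentions this GO term
      $1 == "//" && found {print len}                  # end of record: report if matched
    ' "$2"
}

for go in GO:0005886 GO:0005737 GO:0003677; do
    echo "$go: $(lengths_for_go "$go" "$tmpdir/mock_go.txt" | tr '\n' ' ')"
done
```

A GO term that occurs in no record produces no output at all, which is worth keeping in mind because R's read.table stops with an error on an empty file.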