Chapter 11 Using R with BASH
Sometimes you will want to calculate something in R without entering the R environment manually. This is possible, and very handy, by passing a script to R via STDIN (standard input) and, if desired, having the results passed to STDOUT (standard output).
But first you will have to specify the settings to be used. Some of them are compulsory, and your attempt to use R in bash will fail if you don’t specify them.
Now imagine the following R script “example.r”, which prints the correlation coefficient between x and y explicitly, and then evaluates the correlation coefficient between x^2 and y without a print call:
x <- 1:100
y <- x + rnorm(100)
print(cor(x,y))
cor(x^2, y)
To execute this script from the command-line, you run:
$ cat example.r | R --no-save
R version 4.3.3 (2024-02-29) -- "Angel Food Cake"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> x <- 1:100
> y <- x + rnorm(100)
> print(cor(x,y))
[1] 0.9993565
> cor(x^2, y)
[1] 0.9655807
>
The --no-save or --save argument is compulsory and defines whether the workspace is saved at the end or not (the same choice q() offers when you close an interactive R session).
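As a small check of this requirement, the following sketch pipes a one-line command to R once without and once with a save option. It assumes R is on the PATH; without --save, --no-save, or an option that implies one of them, a non-interactive R is expected to stop with a fatal error and a non-zero exit status.

```shell
# No save option: R should refuse to start in non-interactive mode.
if echo 'print(1 + 1)' | R > /dev/null 2>&1; then
    echo "without a save option: R ran"
else
    echo "without a save option: R refused to run"
fi

# With --no-save the same command goes through.
if echo 'print(1 + 1)' | R --no-save > /dev/null 2>&1; then
    echo "with --no-save: R ran"
else
    echo "with --no-save: R refused to run"
fi
```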
As you can see, this standard output is very bulky. It contains the copyright information as well as all the R code. To get rid of the startup message, add --silent to your command:
$ cat example.r | R --silent --no-save
> x <- 1:100
> y <- x + rnorm(100)
> print(cor(x,y))
[1] 0.9993322
> cor(x^2, y)
[1] 0.9672689
>
This is already much better, but it still contains the code. Instead of the --no-save argument you can also use --vanilla, which adds some more options to make sure the execution is as clean as possible, and includes the --no-save option by default:
$ cat example.r | R --silent --vanilla
> x <- 1:100
> y <- x + rnorm(100)
> print(cor(x,y))
[1] 0.9993597
> cor(x^2, y)
[1] 0.9675887
>
Now, to get only the bare output of your script, you can replace --silent and --no-save with the --slave option. Since R 4.0.0 this option is named --no-echo; the old name --slave is still accepted.
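As a minimal sketch of this quietest form, the block below recreates example.r from above and pipes it to R with --slave. Assuming R is on the PATH, only the two result lines (each starting with [1]) should appear; the values differ between runs because rnorm() draws random numbers.

```shell
# Recreate the example script from above (file name taken from the text).
cat > example.r <<'EOF'
x <- 1:100
y <- x + rnorm(100)
print(cor(x,y))
cor(x^2, y)
EOF

# --slave implies --silent and --no-save, so neither the startup
# message nor the echoed code is printed, only the two results.
cat example.r | R --slave
```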
Note: when calling R from the bash prompt, it makes no difference whether you print a result explicitly or evaluate the expression without the print command; both results appear in the output.
You can also pass the script to R directly, without cat, for example by redirecting it to R's standard input or by naming it with the --file= option.
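Two forms of passing a script directly are sketched below: shell input redirection and R's --file= option. The script name sum_example.r is a hypothetical stand-in, created here only so that the block runs on its own; both calls should print the same single result.

```shell
# Hypothetical stand-in script with a deterministic result.
cat > sum_example.r <<'EOF'
print(sum(1:10))
EOF

# 1) Redirect the script into R's standard input.
R --slave < sum_example.r

# 2) Name the script with the --file= option.
R --slave --file=sum_example.r
```

Both variants avoid starting an extra cat process.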
The easiest way, though, and the most commonly used, is the Rscript command. Rscript is a shortcut for R --slave --no-restore --file=, where --slave already implies --no-save and --no-restore keeps a previously saved workspace from being loaded at startup.
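The equivalence can be tried out with a short sketch. The file name mean_example.r is hypothetical and only serves to make the block self-contained; assuming R and Rscript are on the PATH, both calls should print the same line.

```shell
# Hypothetical stand-in script with a deterministic result.
cat > mean_example.r <<'EOF'
print(mean(c(2, 4, 6)))
EOF

# Short form.
Rscript mean_example.r

# Long form that Rscript abbreviates, as described in the text.
R --slave --no-restore --file=mean_example.r
```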
Inside an R script, you have access to command-line arguments via the function commandArgs. You can store all arguments that come after the name of your script in a character vector (here called “args”). You can then access every element of that vector in the rest of the script:
# save all arguments in the character vector "args"
args <- commandArgs(trailingOnly = TRUE)
# access the first element as args[1], the second as args[2], and so on;
# arguments arrive as strings, so convert them before doing arithmetic
x <- as.numeric(args[1]):as.numeric(args[2])
y <- x + rnorm(length(x))
paste('argument1:', args[1], '; argument2:', args[2])
print(cor(x,y))
cor(x^2, y)
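To see the argument passing end to end, this sketch saves a small script under the hypothetical name args_example.r and calls it with two arguments. The explicit as.integer() conversion is a choice made for this example, since command-line arguments always arrive as character strings.

```shell
# Write a small script that reads two command-line arguments.
cat > args_example.r <<'EOF'
args  <- commandArgs(trailingOnly = TRUE)
first <- as.integer(args[1])
last  <- as.integer(args[2])
cat("range:", first, "to", last, "\n")
cat("n:", length(first:last), "\n")
EOF

# Everything after the script name is handed to the script as an argument.
Rscript args_example.r 1 100
```

With the arguments 1 and 100, the script should report a range from 1 to 100 containing 100 elements.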
You can also pass an R command directly to R, and pass the output on to other shell commands:
$ echo "x <- 1:5 ; y <- x + rnorm(5); cat(y)" | R --slave | awk '{print $0}{print $1, $2+$3, $4*$5}'
0.4308924 2.421816 2.744308 3.33685 6.880522
0.4308924 5.16612 22.9593
Note: cat in R writes the values without the leading index [1] that print would add.
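Besides piping into another program, the output can also be captured in a shell variable with command substitution. The following is a minimal sketch, assuming R is on the PATH; the variable name total is chosen only for illustration.

```shell
# Capture what R writes to STDOUT in a shell variable.
# cat() is used so that no "[1]" index ends up in the value.
total=$(echo 'cat(sum(1:10))' | R --slave)

# The value is now an ordinary shell string and can be used in shell arithmetic.
echo "total: $total, doubled: $((total * 2))"
# prints: total: 55, doubled: 110
```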
All options can be found via R --help. Here, once again, are the ones discussed in this tutorial:
--save or --no-save:
- compulsory (unless implied by --slave or --vanilla)
- specify whether or not the workspace is saved when quitting
--silent:
- tells R not to print the copyright info at the start
--slave:
- makes R run as quietly as possible
- suppresses the printing of commands
- implies --no-save and --silent
--vanilla:
- enables --no-save and other options that make sure the execution is as clean as possible