Chapter 11 Using R with BASH

Sometimes you will want to calculate something in R without actually entering the R environment manually. This is possible, and very handy, by passing a script to R on STDIN (standard input) and, if desired, getting the results passed to STDOUT (standard output).

But first you will have to specify the settings to be used. Some of them are compulsory, and your attempt to use R in bash will fail if you don’t specify them.

Now imagine the following R script “example.r”, which prints the correlation coefficient between x and y with an explicit print call, and then computes the correlation coefficient between x^2 and y without one:

x <- 1:100
y <- x + rnorm(100)
print(cor(x,y))
cor(x^2, y)

To execute this script from the command-line, you run:

$ cat example.r | R --no-save 

R version 4.3.3 (2024-02-29) -- "Angel Food Cake"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> x <- 1:100
> y <- x + rnorm(100)
> print(cor(x,y))
[1] 0.9993565
> cor(x^2, y)
[1] 0.9655807
> 

The --no-save or --save argument is compulsory when R reads its input non-interactively, and defines whether the workspace is saved at the end or not (like the question asked by q() when closing R). As you can see, the standard output is very bulky. It contains the copyright information as well as all the R code. As a first step towards getting only the output of the script, add --silent to your command:

$ cat example.r | R --silent --no-save
> x <- 1:100
> y <- x + rnorm(100)
> print(cor(x,y))
[1] 0.9993322
> cor(x^2, y)
[1] 0.9672689
> 

This is already much better, but still contains the code. (The numbers differ slightly from run to run because rnorm draws new random values each time.) Instead of the --no-save argument you can also use --vanilla, which includes --no-save and adds some more options, such as --no-restore and --no-init-file, to make sure the execution is as clean as possible:

$ cat example.r | R --silent --vanilla 
> x <- 1:100
> y <- x + rnorm(100)
> print(cor(x,y))
[1] 0.9993597
> cor(x^2, y)
[1] 0.9675887
> 

Now, to get only the bare output of your script, you can replace --silent and --no-save by the --slave option (in R 4.0.0 and later the same option is also available under the name --no-echo):

$ cat example.r | R --slave
[1] 0.9995553
[1] 0.9685453

Note: when running a script through R from the bash prompt, it makes no difference whether you print a value explicitly or just call the expression without the print command. Top-level results are printed either way.

You can also pass the script file to R directly, in two ways:

$ R --slave < example.r
[1] 0.9993082
[1] 0.9690733
$ R --slave --file=example.r
[1] 0.9992919
[1] 0.9668773

The easiest way, though, and the most commonly used, is the Rscript command. Rscript is a shortcut for R --slave --no-restore --file=, where --slave implies --no-save:

$ Rscript example.r
[1] 0.9995145
[1] 0.9694067
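
Because Rscript takes the script file as its argument, it can also be named in a shebang line, so that the script can be started like any other executable. The following is a minimal sketch of that idea, not the only way to set it up. The file name "hello.r" and the temporary directory are assumptions made for this illustration.

```shell
# Sketch: make an R script executable through a shebang line.
# "hello.r" is a hypothetical file name used only for this illustration.
workdir=$(mktemp -d)
script="$workdir/hello.r"

# The first line asks the system to run the file with the Rscript found on PATH.
cat > "$script" <<'EOF'
#!/usr/bin/env Rscript
cat(sum(1:10), "\n", sep = "")
EOF

chmod +x "$script"

if command -v Rscript >/dev/null 2>&1; then
    # Calling the file directly is equivalent to: Rscript hello.r
    result=$("$script")
    echo "sum of 1..10: $result"
else
    echo "Rscript not found on PATH" >&2
fi
```

This relies on the temporary directory allowing execution of files. If it does not, calling Rscript with the file name works the same way.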

Inside an R script, you have access to command-line arguments via the function commandArgs. With trailingOnly=TRUE it returns only the arguments that come after your script name, as a character vector (here called “args”). You can then access every element of that vector in the rest of your script:

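# save all trailing arguments in the character vector "args"
args <- commandArgs(trailingOnly = TRUE)
# access the first element as args[1], the second as args[2], and so on;
# arguments arrive as strings, so convert them before using them as numbers
x <- as.integer(args[1]):as.integer(args[2])
y <- x + rnorm(length(x))
paste('argument1:', args[1], '; argument2:', args[2])
print(cor(x,y))
cor(x^2, y)

Saved as example2.r, the script is run with two arguments:

$ Rscript example2.r 50 70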
[1] "argument1: 50 ; argument2: 70"
[1] 0.99195
[1] 0.9918694
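
Since commandArgs hands the arguments over as strings, a script that is called from bash may want to check their number and convert them before use. The sketch below shows one possible way to do this. The script name "range_sum.r", the usage message, and the choice of summing a range are assumptions made for this illustration.

```shell
# Sketch: an R script that validates its arguments, called from bash.
# "range_sum.r" is a hypothetical file name used only for this illustration.
workdir=$(mktemp -d)
script="$workdir/range_sum.r"

cat > "$script" <<'EOF'
args <- commandArgs(trailingOnly = TRUE)
if (length(args) != 2) {
  stop("usage: Rscript range_sum.r <from> <to>")
}
# arguments are character strings, so convert them explicitly
from <- as.integer(args[1])
to   <- as.integer(args[2])
if (is.na(from) || is.na(to)) {
  stop("both arguments must be integers")
}
cat(sum(from:to), "\n", sep = "")
EOF

if command -v Rscript >/dev/null 2>&1; then
    total=$(Rscript "$script" 50 70)
    echo "sum of 50..70: $total"

    # With a wrong argument count, stop() aborts the script
    # and Rscript returns a non-zero exit status.
    if ! Rscript "$script" 50 2>/dev/null; then
        echo "script rejected the call with one argument"
    fi
fi
```

Checking the exit status in bash, as in the last step, lets a calling shell script react when the R script stops with an error.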

You can also pass an R command directly to R and pipe the output back to the shell:

$ echo "x <- 1:5 ; y <- x + rnorm(5); cat(y)" | R --slave | awk '{print $0}{print $1, $2+$3, $4*$5}'
0.4308924 2.421816 2.744308 3.33685 6.880522
0.4308924 5.16612 22.9593

Note: cat in R prints the values without the leading index [1].
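
For short commands like this, Rscript also accepts the expression itself through its -e option, which avoids the echo and the pipe into R. The following sketch shows how the result could be stored in a shell variable and handed on to awk. The variable names are chosen only for this illustration.

```shell
# Sketch: evaluate an R expression with Rscript -e and reuse the result in bash.
if command -v Rscript >/dev/null 2>&1; then
    # cat() prints the bare value, so the shell receives the number without "[1]".
    mean_value=$(Rscript -e 'cat(mean(1:10))')
    echo "mean of 1..10 is $mean_value"

    # The captured value can be passed on to other tools, here awk.
    doubled=$(echo "$mean_value" | awk '{print $1 * 2}')
    echo "doubled: $doubled"
fi
```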


All options can be found via R --help. Here, once again, are the ones discussed in this tutorial:

--save or --no-save:

  • compulsory when R is run non-interactively (unless --vanilla or --slave is given)
  • specify whether or not the workspace is saved when quitting.

--silent:

  • tells R not to print the copyright info at the start

--slave:

  • makes R run as quietly as possible
  • suppresses the printing of commands
  • implies --no-save and --silent

--vanilla:

  • enables --no-save and other options that make sure the execution is as clean as possible.
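
To see the effect of these options side by side, the same one-line script can be run under different option sets. This is only a small sketch. The R code in the variable "code" and the variable names are assumptions made for this illustration.

```shell
# Sketch: run the same one-line script under different option sets.
code='cat(1 + 1, "\n", sep = "")'

if command -v R >/dev/null 2>&1; then
    # --silent drops the banner, but prompts and echoed code remain.
    with_echo=$(printf '%s\n' "$code" | R --silent --no-save)

    # --slave also drops prompts and code, leaving only the result.
    bare=$(printf '%s\n' "$code" | R --slave)

    # Without --save, --no-save, --vanilla, or --slave, R refuses piped input.
    if ! printf '%s\n' "$code" | R --silent >/dev/null 2>&1; then
        echo "R exited with an error because no save option was given"
    fi

    echo "with --silent --no-save:"
    echo "$with_echo"
    echo "with --slave:"
    echo "$bare"
fi
```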