- Invoking and running R in batch mode
- Understand the command R CMD BATCH
- Execute some scripts
R provides a special kind of command called R CMD, which should be
executed from the command line. This command is an interface, or wrapper,
to various R tools, such as those for processing files in R documentation
format or manipulating add-on packages. These tools are useful in
conjunction with R, but they are not intended to be called “directly”.
The main reference for R CMD is the R manual An Introduction to R.
The general form is:
R CMD command options args
- command is the name of the tool
- options are the command line options of R
- args refers to the arguments passed on to the command
Some of the relevant options for this tutorial are:
- --save saves data sets at the end of the R session.
- --no-save does not save data sets at the end of the R session.
- --no-environ does not read any user file to set environment variables.
- --restore restores the .RData file in the directory where R was started.
- --vanilla combines --no-save, --no-environ, --no-site-file, --no-init-file and --no-restore.
- --quiet, --silent, and -q do not print out the initial copyright and welcome messages.
- --slave makes R run as quietly as possible (since R 4.0.0 this option is called --no-echo).
Among the several available R tools, the one that we are interested in
is the BATCH tool, which is designed to run R in batch mode, that is,
to execute a script file non-interactively.
When you run R CMD BATCH, the default options are --restore --save
--no-readline. To see more information, check the documentation by
typing ?BATCH (or help(BATCH)) from within an interactive R session, or
by typing R CMD BATCH --help from the command line.
The usage is as follows:
R CMD BATCH options infile outfile
- options are optional command line options of R
- infile is the required input file with the code to be executed
- outfile is the name of an optional output file. If no output file is provided, the name of infile is taken as default, with a possible .R extension stripped and the extension .Rout appended.
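A rough sketch of this naming rule, assuming that the wrapper on Unix-like systems builds the default name with basename (an assumption worth checking against R CMD BATCH --help on your own system). The function default_rout is a hypothetical helper written for this example; it is not part of R:

```shell
# Hypothetical helper that mimics the assumed naming rule: drop any
# leading directories and a trailing ".R", then append ".Rout".
default_rout() {
  printf '%s.Rout\n' "$(basename "$1" .R)"
}

default_rout myscript1.R            # prints myscript1.Rout
default_rout scripts/myscript1.R    # prints myscript1.Rout
```

If the assumption holds, the second call suggests that running a script from another folder leaves the .Rout file in the directory where the command was typed, not next to the script.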
Consider the file myscript1.R (inside the folder scripts/). If you
take a look at its contents, you will see that this script generates two
vectors x and y of length 20 and fits a regression line (regressing y
on x). It then produces two plots: 1) a scatterplot with the fitted
regression line, and 2) a residual plot. Both plots are saved as PNG images.
To run the code inside myscript1.R in batch mode, you have to type the
following command from the command line:
R CMD BATCH myscript1.R
The file myscript1.R is the input file.
R CMD BATCH generates an output file. By default, this file has the
same name as the input file, but with the extension .Rout. For this
particular example, the output file will be myscript1.Rout. However,
you can provide a different name if you want.
# specifying the name of the output file
R CMD BATCH myscript1.R myscript1-output.R
If you run the command R CMD BATCH myscript1.R, you should be able to
see a file called myscript1.Rout. If you open this file, you will see
the welcome message that appears every time you open a new session in R,
followed by the R commands (i.e. R code) that were executed, and finally
an additional command at the end of the file with information about the
execution time:
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> # Regression analysis of two random vectors
>
> # random data
> x <- rnorm(20)
> y <- x + rnorm(20)
>
> # regression line
> reg <- lm(y ~ x)
>
> # scatter diagram with fitted regression line
> png('scatterplot.png')
> plot(x, y, las = 1, pch = 19, col = "#555555")
> abline(reg, col = "#0000DD59", lwd = 2)
> dev.off()
null device
1
>
> # residuals plot
> png('residuals_plot.png')
> plot(x, reg$residuals, las = 1, pch = 19, col = "#606060")
> abline(h = 0)
> dev.off()
null device
1
>
> proc.time()
user system elapsed
0.179 0.036 0.210
Notice that the R code inside the output file is displayed with the
prompt character >.
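Because every echoed command starts with the prompt, a tool such as grep can separate the commands from what R printed. The snippet below is only a sketch: sample.Rout is a made-up, shortened output file created here so that the example is self-contained. Commands that span several lines are echoed with a + continuation prompt, which this simple pattern does not handle.

```shell
# sample.Rout is a hypothetical, shortened output file for illustration
cat > sample.Rout <<'EOF'
> x <- rnorm(20)
> dev.off()
null device
          1
> proc.time()
EOF

# Count the echoed commands (lines that begin with the prompt)
grep -c '^>' sample.Rout

# Show only the lines that R printed in response
grep -v '^>' sample.Rout
```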
R CMD BATCH has default options --restore --save --no-readline:
- --save saves all the objects in the workspace (i.e. all the objects created while the script was executed) into a hidden file .RData, which is a file in R’s binary format. Saving the available objects is known as saving the image (see help(save) for more information).
- --restore will load the saved image, that is, will load() the content of the file .RData in the directory where R was started.
- --no-readline turns off command-line editing via readline. This option is not that important.
If you don’t want R CMD BATCH to save the image, use the option
--no-save. Likewise, if you don’t want R CMD BATCH to load the
objects in .RData, use --no-restore.
You can further modify the contents of the output file with a couple of additional options:
- R CMD BATCH --silent won’t print out the initial copyright and welcome messages.
- R CMD BATCH --no-save won’t save the workspace image (i.e. no .RData will be saved).
- R CMD BATCH --vanilla combines --no-save, --no-environ, --no-site-file, --no-init-file and --no-restore.
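To see the effect of these options, the following sketch runs the same script twice and checks whether a workspace image was written. The file tiny.R and the two output names are hypothetical and exist only for this demonstration. The check for R is there so that the snippet does nothing harmful on a machine without R.

```shell
# Run this in an empty scratch directory: the default --save writes .RData
# tiny.R is a hypothetical one-line script used only for this demonstration
printf 'x <- rnorm(5)\n' > tiny.R

if command -v R >/dev/null 2>&1; then
  # No workspace image is loaded or saved, and the banner is suppressed
  R CMD BATCH --vanilla --quiet tiny.R tiny-vanilla.Rout
  if [ -e .RData ]; then
    echo ".RData is present"
  else
    echo "no .RData after the --vanilla run"
  fi

  # Default options (--restore --save --no-readline) save the image
  R CMD BATCH tiny.R tiny-default.Rout
  if [ -e .RData ]; then
    echo ".RData saved by the default run"
  fi
else
  echo "R is not installed; the commands were not run"
fi
```

Comparing tiny-vanilla.Rout with tiny-default.Rout should also show that only the second file begins with the welcome message.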
In myscript1.R, the number of random values used to create vectors x
and y was fixed. However, it would be nice if the user could specify a
value n to control the length of these vectors. In other words, it
would be nice if we could pass a value for an argument n to the script
file.
When running R scripts, often you will want to provide values for
certain arguments. Luckily, R CMD BATCH allows you to specify arguments
and pass them to the script.
How do you run an R script in batch mode and pass arguments to it? You
can pass parameters to scripts via additional arguments on the command
line. This is done by using the --args option and quoting it together
with the arguments:
R CMD BATCH "--args arg1 arg2" myscript.R
To see an example, take a look at the file myscript2.R, which is almost
identical to myscript1.R. The difference is that myscript2.R reads
in a parameter n, the number of values used to generate the x-y
coordinates for the scatterplot.
Let’s say we want to generate 50 values. Here’s how to pass this number:
R CMD BATCH "--args 50" myscript2.R
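On the R side, one common way to pick up such a value is commandArgs(trailingOnly = TRUE), which returns only the arguments that follow --args. The sketch below is not the actual myscript2.R, whose contents are not reproduced here. The file args_demo.R is a hypothetical stand-in, and the fallback value of 20 is an assumption made for the example.

```shell
# args_demo.R is a hypothetical stand-in for myscript2.R
cat > args_demo.R <<'EOF'
# Keep only the arguments that follow --args on the command line
args <- commandArgs(trailingOnly = TRUE)

# Arguments arrive as character strings; fall back to 20 if none is given
n <- if (length(args) >= 1) as.integer(args[1]) else 20L

x <- rnorm(n)
y <- x + rnorm(n)
cat("number of values:", length(x), "\n")
EOF

if command -v R >/dev/null 2>&1; then
  # --args must come last: R ignores everything after it as an option
  R CMD BATCH --no-save --no-restore "--args 50" args_demo.R
  grep '^number of values' args_demo.Rout
else
  echo "R is not installed; args_demo.R was written but not run"
fi
```

The conversion with as.integer() matters because the value typed on the command line reaches the script as text, not as a number.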