Professor Norm Matloff
Dept. of Computer Science
University of California at Davis
Davis, CA 95616
RPy is a simple, easy-to-use interface to R from Python. It enables one to enjoy the elegance of Python programming while having access to the rich graphical and statistical capabilities of R.
In its simplest form, shown here, one includes in one's Python code a statement
from rpy2.robjects import r
This starts an R session that communicates with the original Python program. The Python class instance r provides various functions for remote execution of R commands, including functions that ship data produced by the Python program over to R.
IMPORTANT NOTE: The material here concerns RPy2, not the original RPy.
Download RPy from the RPy home page. Unpack it, and in the top directory created by the package, open a shell/command window and run
python setup.py install
If you are on a multiuser system and do not have root privileges, you can specify a nondefault root directory. For example, on the UC Davis Computer Science Department's instructional machines, I typed
R RHOME
setenv RHOME /usr/lib/R
python setup.py install --root /home/matloff/Pub/rpy2
The first command ran R with a request to report where R was installed on the system, which turned out to be /usr/lib/R. The second command set the corresponding shell environment variable (C shell in my case). The third command specified a nondefault installation directory.
First, make sure the RPy module is in your Python path. In the above context, I typed
setenv PYTHONPATH /home/matloff/Pub/rpy2/usr/lib/python2.5/site-packages/
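As an alternative to setting PYTHONPATH in the shell, the directory can be added to Python's module search path from inside the program, before rpy2 is imported. The following is a minimal sketch, not part of RPy itself. The helper name add_to_module_path is made up for illustration, and the directory is the one from the example above, so it should be replaced with your own install location.

```python
import sys

# Install location taken from the example above; adjust to your own setup.
RPY2_SITE_PACKAGES = "/home/matloff/Pub/rpy2/usr/lib/python2.5/site-packages/"


def add_to_module_path(directory):
    """Put directory at the front of sys.path unless it is already listed.

    Returns True if the directory was added, False if it was already there.
    (Illustrative helper, not an rpy2 function.)
    """
    if directory in sys.path:
        return False
    sys.path.insert(0, directory)
    return True


# Must run before "from rpy2.robjects import r" for the import to find rpy2.
add_to_module_path(RPY2_SITE_PACKAGES)
```

The change lasts only for the current Python process, whereas PYTHONPATH applies to every Python program started from that shell.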
Now, let's generate vectors x and y in R, do a scatter plot, fit a least-squares line, etc.:
>>> from rpy2.robjects import r
>>> r('x <- rnorm(100)')  # generate x at R
>>> r('y <- x + rnorm(100,sd=0.5)')  # generate y at R
>>> r('plot(x,y)')  # have R plot them
>>> r('lmout <- lm(y~x)')  # run the regression
>>> r('print(lmout)')  # print from R
>>> loclmout = r('lmout')  # download lmout from R to Python
>>> print loclmout  # print locally
>>> print loclmout.r['coefficients']  # print one component
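As a sanity check on the coefficients component printed above, the intercept and slope of a simple least-squares line can also be computed in plain Python. This sketch uses the standard textbook formulas for one predictor and does not call R at all. The function name least_squares_line is made up for illustration, and it assumes the x and y values are available as ordinary Python lists.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares line for y on x.

    Illustrative helper using the textbook one-predictor formulas:
    slope = Sxy / Sxx, intercept = ybar - slope * xbar.
    """
    n = float(len(x))  # float() keeps the division exact under Python 2 too
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope


# Points lying exactly on y = 1 + 2x recover that line.
print(least_squares_line([0, 1, 2, 3], [1, 3, 5, 7]))  # (1.0, 2.0)
```

For the simulated data above, where y is x plus noise, the values reported by R should come out near an intercept of 0 and a slope of 1.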
Now let's apply some R operations to some Python variables:
>>> u = range(10)  # set up another scatter plot, this one local
>>> e = 5*[0.25,-0.25]
>>> v = u[:]
>>> for i in range(10): v[i] += e[i]
...
>>> r.plot(u,v)
>>> r.assign('remoteu',u)  # ship local u to R
>>> r.assign('remotev',v)  # ship local v to R
>>> r('plot(remoteu,remotev)')  # plot there
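The session above relies on Python 2, where range(10) returns a list. Under Python 3 it returns a lazy range object, so the in-place update of v would fail. Below is a minimal sketch of the same data preparation that runs under either version. The function name make_scatter_data and its parameters are made up for illustration.

```python
def make_scatter_data(n=10, offset=0.25):
    """Build the u and v lists from the session above as real lists.

    u is 0, 1, ..., n-1; v adds +offset and -offset alternately to u.
    (Illustrative helper, not part of rpy2.)
    """
    u = list(range(n))  # list() makes this work under Python 3 as well
    e = [offset, -offset] * ((n + 1) // 2)  # at least n alternating offsets
    v = [ui + ei for ui, ei in zip(u, e)]  # zip stops after n items
    return u, v


u, v = make_scatter_data()
print(v)
```

The resulting lists can then be handed to r.plot and r.assign as in the session. Depending on the rpy2 version, they may first need to be wrapped in a vector class such as rpy2.robjects.FloatVector.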
There are many more functions. See the RPy documentation for details.