A FIRST TUTORIAL ON R

Text version of a document by, very likely, Chris Raphael, with unauthorized changes by Don Byrd, Jan. 2008

To get R:

1. Download R (it's free) from the website http://cran.r-project.org . There are versions for Macintosh, Linux, and Windows.
2. From the same website, click on "Manuals" (at the left, near the bottom) to find documentation.

R can be used either to run pre-written programs or purely interactively, as a calculator. Try typing each of the following expressions -- except for the initial ">", R's command prompt -- to R (followed by the return key). Note: anything on a line following '#' is a comment and is ignored by R.

> 5+3
> 12*7
> log(81/80)
> exp(20)
> help("exp")          # Just what is "exp", anyway?
> exp(exp(exp(20)))    # R is only human! :-)

R has almost any mathematical function you can think of: sqrt(), sin() ... mostly with easily guessable names.

Expressions using the comparison operators == != < > give _Boolean_ (logical) values, namely TRUE or FALSE, which can be abbreviated T and F:

> 4>3           # this evaluates to TRUE
> sqrt(4)==2    # so does this
> sqrt(5)==2    # this evaluates to FALSE

It is possible to have variables even when you use R as a calculator. Most strings beginning with an alphabetic character will be treated as variables. Try typing some of the following lines in succession; there is no need to include the comments, of course. Or you can copy and paste them -- not a bad habit to get into.

> x <- 3        # set x to 3
> y <- x*x+x
> y             # print the value of y
> freq <- 440 * 2^((m-69)/12)    # What does this do? Not much unless you set m first
> m <- 60
> freq <- 440 * 2^((m-69)/12)    # It works better this time
> freq

Vectors

One of the nicest aspects of R is the way it handles _vectors_, though it can be tricky, e.g., when using vectors of different lengths together.
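The trickiness with vectors of different lengths comes from R's "recycling rule": the shorter vector is repeated until it matches the longer one. The sketch below is only an illustration; the variable names (short, long, odd, sum_even, sum_odd) are made up for this example, and it uses the c() function, which is introduced just below.

```r
# A small illustration of recycling; all names here are hypothetical.
short <- c(1, 2)              # a vector of length 2
long  <- c(10, 20, 30, 40)    # a vector of length 4 (a multiple of 2)

# short is repeated to give (1, 2, 1, 2) before the addition.
sum_even <- long + short
print(sum_even)               # [1] 11 22 31 42

# If the longer length is not a multiple of the shorter one, R still
# recycles, but it also issues a warning (suppressed here).
odd <- c(10, 20, 30)
sum_odd <- suppressWarnings(odd + short)    # (10, 20, 30) + (1, 2, 1)
print(sum_odd)                # [1] 11 22 31
```

Without suppressWarnings(), the last addition still returns a result but prints a warning that the longer length is not a multiple of the shorter one, which is usually a sign of a mistake.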
Here are several ways to create and use vectors:

> x <- 1:100                     # x is now the vector (1, 2, ..., 100)
> y <- seq(-pi,pi,length=100)    # y consists of 100 evenly spaced values from -pi to pi
> z <- c(1,4,8,20)               # z is the vector (1,4,8,20)
> a <- x+y                       # vectors of the same length can be added, multiplied, etc.
> b <- 4*x                       # this is interpreted correctly too: every element of x is multiplied by 4

Random Number Generation

Random numbers are useful for many things, especially in probability theory and statistics (R's original raison d'etre), as well as in many areas of music informatics. R has a number of functions for generating them. There are many different _distributions_ of random numbers; the most important for us are the _uniform_ (all possible values are equally likely) and the _normal_ (which produces the bell-shaped curve you've probably encountered before). R also has lots of built-in functions for doing things with random numbers. For instance:

> x <- runif(100)    # creates a vector of 100 (uniformly distributed) random numbers between 0 and 1
> punif(0.3)         # the probability that a Unif(0,1) random number is less than 0.3

There are similar functions for a variety of other distributions, including the normal(0,1) (rnorm, pnorm, qnorm), Exponential, Binomial, Poisson, Cauchy (rcauchy, pcauchy, qcauchy), and others.

Subsets

> x <- runif(100)       # creates a vector of 100 Unif(0,1) random numbers
> x[1]                  # the first element of x
> x[c(1,3,5)]           # a vector containing the 1st, 3rd and 5th elements of x
> y <- x > .5           # a 100-long vector of Boolean values; y[i] is TRUE if and only if x[i] > .5
> bigx <- x[x>.8]       # the elements of x that are greater than .8

Simple graphs.

Try the following.

> x <- seq(0,1,length=100)
> y <- x^2           # y = x squared
> plot(x,y)          # plot the points (x[1],y[1]) ... (x[100],y[100])
> plot(x,y,"l")      # plot has lots of options: say help("plot") to find out
> plot(y,x,"l")
> plot(y,x,"s")
> plot(y)            # same as plot(1:length(y),y)

Source Files.
You will want to write simple programs in R, and getting programs working almost always requires some trial, error and iteration. I recommend the following procedure. Create a "source" file containing your R commands in any text editor. You can use the Windows Notepad, BBEdit, or whatever you are comfortable with. Suppose you create the following file named "PriceCrash.r" in your editor:

    len <- 100
    x <- runif(len,-.5,.4)
    y <- cumsum(x)            # y[1] = x[1], y[2] = x[1]+x[2], etc.
    price <- 100*exp(y)
    plot(price, type="l", xlab="day", ylab="sale price ($1000)")
    title("Average Home Value")
    print("Price history: ")
    print(price)

To run it in R, do this:

> source("PriceCrash.r")    # run the program you created

This assumes that PriceCrash.r is in R's "working directory". This technique allows you to write a program in the usual incremental way: edit the file, run it again with source(), and repeat.

If you want to get a hard copy of the printout and the plot (for example, to submit as your homework), do the following:

> postscript("myplot.ps")    # write the plot to the PostScript file "myplot.ps"
> sink("myout.txt")          # write text output to "myout.txt"
> source("PriceCrash.r")     # run the program you created
> dev.off()                  # close the file and redirect plots to the screen. Don't forget this!
> sink()                     # redirect text output to the screen. Ditto.

A Fun Example.

Suppose two decks of cards are shuffled; then the cards are lined up side by side, and you count the number of places where the two decks have the same card. What is the probability that there are no matches? This is a hard calculation to do, but you could estimate the probability by running the experiment many times and observing the proportion of times that no match occurs.
> nTrials <- 1000    # number of trials
> nZeros <- 0        # to count the number of times the "no match" event happens
> for (i in 1:nTrials) {
>     x <- 1:52
>     deck1 <- sample(x, 52, replace=FALSE)    # a random permutation of the "cards"
>     deck2 <- sample(x, 52, replace=FALSE)    # another random permutation
>     nMatches <- sum(deck1==deck2)            # number of matching cards this time
>     if (nMatches==0) nZeros <- nZeros+1
> }
> print(c("estimated probability=", nZeros/nTrials))    # the result

Quitting and Help

> help("rnorm")    # gives information about the function rnorm. Of course this works
>                  # for other functions too, and even for operators.
> q()              # quitting the program. Hope you had fun.
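
A follow-up to the card-matching example above: the same estimate can be written more compactly with replicate(), and the result can be compared with exp(-1), about 0.368, which is the known approximate probability that a random shuffle leaves no card in a matching position. This is only a sketch; the function name countMatches, the number of trials, and the seed are choices made for this illustration, not part of the original program.

```r
set.seed(1)    # arbitrary seed, used only so that the run is repeatable

nTrials <- 10000

# One trial: shuffle two decks and count the positions where the cards agree.
# sample(52) returns a random permutation of 1:52.
countMatches <- function() sum(sample(52) == sample(52))

# Run the trial nTrials times; matches is a vector of nTrials match counts.
matches <- replicate(nTrials, countMatches())

estimate <- mean(matches == 0)    # proportion of trials with no match
cat("estimated probability =", estimate, "\n")
cat("exp(-1)               =", exp(-1), "\n")
```

With this many trials the two numbers should typically agree to about two decimal places; increasing nTrials narrows the gap at the cost of a longer run.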