DESCRIPTION

Students:  Please keep in mind the OMSI rules.  Save your files often,
make sure OMSI fills your entire screen at all times, etc.  Remember
that clicking CopyQtoA will copy the entire question box to the answer
box.

In questions involving code which will PARTIALLY be given to you in the
question specs, you may need to add new lines.  There may not be
information given as to where the lines should be inserted.

MAKE SURE TO RUN THE CODE IN PROBLEMS THAT INVOLVE CODE!

QUESTION

(Text answer, 25 points.)  What possible explanation was given in class
for the fact that human heights are approximately normal?


QUESTION -ext .R -run 'Rscript ./omsi_answer2.R'

(R code answer, 25 points.)  In the mlb data in regtools, 
print the correlation matrix of height, age and weight.

library(regtools)
data(mlb)

print(       )

QUESTION -ext .R -run 'Rscript ./omsi_answer3.R'

(R code answer, 25 points.)  Write a function (myLm) that finds the beta-hat
vector directly rather than via lm()/qeLin(), by using the formula
(5.14).  Your function should find beta-hat for predicting column number
yColNum of inputMatrix from all of its other columns.  Then write a
function (pred) that does prediction using beta-hat on a new data point,
newx.

myLm <- function(inputMatrix,yColNum) 
{


   return(betahat)
}

pred <- function(betahat,newx) 
{

}

library(regtools)
data(mlb)
mlb1 <- as.matrix(mlb[,4:6])
z <- myLm(mlb1,2)
print(z)
print(pred(z,c(70,21)))

QUESTION -ext .R -run 'Rscript ./omsi_answer4.R'

(R code answer, 25 points.)  Say that, in using qeLin() (or lm()), we
round the predicted value to the nearest integer in the ratings range.
For MovieLens, that would be one of 1,2,3,4,5.  We can then ask what our
chances are of being correct.  Find this value for MovieLens, predicting
solely from user and item ID, with code beginning as follows:

load('Hwk1.RData')
library(qeML)
ml100 <- ml100k[,-4]
# do our own holdout
set.seed(9999)  # so we are all on the same page
idxs <- sample(1:nrow(ml100),1000)
trn <- ml100[-idxs,]
tst <- ml100[idxs,]
mlout <- qeLin(trn,'V3',holdout=NULL)
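
To clarify what "nearest integer in the ratings range" means in the question text above, here is a minimal sketch on made-up numbers.  The vectors preds and actual and the helper name roundToRange are hypothetical, chosen only for illustration; they are not MovieLens data and not part of qeML.  A rounded value below 1 is raised to 1, and one above 5 is lowered to 5.

```r
# Hypothetical helper (not part of qeML or regtools): round each value,
# then clamp the result into the interval [lo,hi].
roundToRange <- function(x, lo = 1, hi = 5) {
   pmin(pmax(round(x), lo), hi)
}

# Made-up predicted ratings, NOT taken from MovieLens.
preds <- c(0.4, 1.7, 2.49, 3.2, 4.6, 5.8)
rounded <- roundToRange(preds)
print(rounded)

# Made-up "true" ratings, used only to show the proportion-correct idea.
actual <- c(1, 2, 3, 3, 5, 4)
propCorrect <- mean(rounded == actual)
print(propCorrect)
```

Note that R's round() sends exact halves such as 2.5 to the even neighbor, so the made-up values above deliberately avoid exact halves.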