1. The Julia language does this.

2. E.g.

       B <- 1:3; C = c(2,4,6)
       print(B)
       print(C)

       B <- 1:3; C = c(1,0,0)
       print(B)
       print(C)

3. There are various ways to do this. Here's one:

       propRated <- function(k)
       {
          load('Hwk1.RData')
          # Nuser is repeated on every row for a given user; keep one copy per user
          getFirst <- function(x) x[1]
          Ns <- tapply(ml100kpluscovs$Nuser,ml100kpluscovs$user,getFirst)
          # proportion of users with at least k ratings
          mean(Ns >= k)
       }

       print(propRated(88))

   Wrong way:

       propRated <- function(k)
       {
          load('Hwk1.RData')
          Ns <- ml100kpluscovs$Nuser
          mean(Ns >= k)
       }

       print(propRated(88))

   This wrongly puts extra weight on the users who rate a lot of movies, since each such user appears in many rows and is counted once per row.

4. (Text answer, n points.) The ratings of movies will often be correlated; people who like movie X may tend to like movie Y. Say we compute the correlations between all pairs of movies, i.e. call cor() on all pairs of columns of the ratings matrix A. (We first would need to find all users who have rated both X and Y. Don't worry about this point.) We could then do some analysis on the "winner," i.e. the pair of columns that are most highly correlated. What would be a concern about doing this, from our book?

   Answer: This may cause p-hacking. With so many pairs examined, the largest correlation may be large just by chance.

5. (R code answer, n points.) Write a function that will find the proportion of movies that are in all genres specified in the argument. For instance, say we are interested in G5 and G8. If a movie is in both of these genres, it counts, but not if it is in only one or none of them. The argument is a vector of column names.

       inAllGenres <- function(whichGenres)
       {
          load('Hwk1.RData')

       }

       print(inAllGenres(c('G2','G5','G18')))

   Again, various approaches could be taken.

       inAllGenres <- function(whichGenres)
       {
          load('Hwk1.RData')
          z <- ml100kpluscovs[,c('item',whichGenres)]
          # row index of the first appearance of each movie
          getFirst <- function(x) x[1]
          firstAppearance <- tapply(1:nrow(z),z$item,getFirst)
          # product of the genre columns is 1 only if the movie is in all of them
          zall <- apply(z[firstAppearance,-1,drop=FALSE],1,prod)
          mean(zall,na.rm=TRUE)  # some NAs in there
       }

       print(inAllGenres(c('G2','G5','G18')))
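As a small check of the weighting issue noted in Problem 3, here is a sketch on made-up data. The data frame toy and its values are hypothetical, not taken from Hwk1.RData; it only assumes the same layout, with one row per rating and Nuser repeated on each of a user's rows.

```r
# Hypothetical toy data (not from Hwk1.RData): one row per rating,
# Nuser = total number of ratings made by that row's user
toy <- data.frame(
   user  = c(1, 2, 2, 3, 3, 3, 3, 3),
   Nuser = c(1, 2, 2, 5, 5, 5, 5, 5)
)
k <- 3

# Per-user approach: take one Nuser value per user, then average
perUser <- tapply(toy$Nuser, toy$user, function(x) x[1])
rightWay <- mean(perUser >= k)

# Per-row approach: user 3 is counted five times, once per rating
wrongWay <- mean(toy$Nuser >= k)

print(rightWay)
print(wrongWay)
```

Only one of the three users has at least 3 ratings, so the per-user proportion is 1/3, while the per-row version gives 5/8 because the heavy rater contributes 5 of the 8 rows.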