ECS 172 Term Project

Important dates

Problem A

This is a mini (nano) research project, in which you will investigate the effects of three general parameters on accuracy and run time of collaborative filtering methods.

You will hold each pair of these variables fixed in turn while varying the third.

A suggestion: For some selected dataset, first find the estimated ratings matrix aEst. Then reduce n, m and d by thinning it out, replacing some values by NAs. You may find rectools::toUserItemRatings() handy here; given an A matrix, it produces the input data frame in (userID, itemID, rating) form. Running toUserItemRatings() on aEst will thus produce a fake data frame of this form, on which you can run the various rec sys models.

Note: To fix m and d, for instance, take only the first m columns (or randomly chosen m columns) from aEst. Then randomly sprinkle a proportion 1-d of NAs in what remains.
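The thinning step in the note above could be sketched roughly as follows. This is only an illustration under stated assumptions: n and m are taken to be the numbers of rows (users) and columns (items) of the matrix, d the proportion of non-NA entries, thinMatrix() is a hypothetical helper (it is not part of rectools), and the small random matrix merely stands in for a real aEst.

```r
# Hypothetical helper (not in rectools): keep n rows and m columns of
# aEst, then leave only a proportion d of the remaining cells non-NA.
thinMatrix <- function(aEst, n, m, d) {
  stopifnot(n <= nrow(aEst), m <= ncol(aEst), d > 0, d <= 1)
  # randomly chosen rows and columns; use 1:n and 1:m to take the first ones
  rows <- sample(nrow(aEst), n)
  cols <- sample(ncol(aEst), m)
  a <- aEst[rows, cols, drop = FALSE]
  # sprinkle a proportion 1-d of NAs over what remains
  nNA <- round((1 - d) * length(a))
  a[sample(length(a), nNA)] <- NA
  a
}

set.seed(9999)
# stand-in for a real estimated ratings matrix: 50 users, 20 items
aEst <- matrix(sample(1:5, 50 * 20, replace = TRUE), nrow = 50)
aThin <- thinMatrix(aEst, n = 30, m = 10, d = 0.25)
dim(aThin)            # rows and columns kept
mean(!is.na(aThin))   # achieved density
```

The thinned matrix would then go to rectools::toUserItemRatings(), as described above, to obtain the (userID, itemID, rating) data frame.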

Your report should have lots of graphs, e.g. MAPE accuracy plotted against n, for m and d fixed. You may wish to draw several curves in each plot. The code used to generate them must be in R, and supplied with your submission package. I suggest either ggplot2 or lattice. I personally find the colors more vivid in the latter, but the former has the possible advantage of many add-on packages.
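As a rough sketch of how such a graph might be produced, the code below records accuracy and run time for several values of n and draws one curve per method with lattice. Several things here are assumptions made only for illustration: MAPE is taken to mean mean absolute prediction error, n is taken to be the number of users, the ratings are simulated, and the two "methods" are trivial placeholder predictors, not collaborative filtering methods. In a real run you would substitute your chosen dataset and models.

```r
library(lattice)

set.seed(1)
# simulated (userID, itemID, rating) data, for illustration only
nUsers <- 400; nItems <- 30
full <- expand.grid(userID = 1:nUsers, itemID = 1:nItems)
full$rating <- sample(1:5, nrow(full), replace = TRUE)

# assumed definition: mean absolute prediction error
mape <- function(pred, actual) mean(abs(pred - actual))

# placeholder predictors (NOT collaborative filtering methods)
predictors <- list(
  globalMean = function(train, test) rep(mean(train$rating), nrow(test)),
  itemMean = function(train, test) {
    mu <- tapply(train$rating, train$itemID, mean)
    pred <- as.numeric(mu[as.character(test$itemID)])
    pred[is.na(pred)] <- mean(train$rating)  # item unseen in training
    pred
  }
)

# vary n with the number of items held fixed; one result row per (n, method)
res <- do.call(rbind, lapply(c(100, 200, 400), function(n) {
  sub <- full[full$userID <= n, ]
  testIdx <- sample(nrow(sub), round(0.2 * nrow(sub)))
  train <- sub[-testIdx, ]
  test <- sub[testIdx, ]
  do.call(rbind, lapply(names(predictors), function(nm) {
    secs <- system.time(pred <- predictors[[nm]](train, test))[["elapsed"]]
    data.frame(n = n, method = nm,
               mape = mape(pred, test$rating), seconds = secs)
  }))
}))

# one curve per method on the same axes
p <- xyplot(mape ~ n, groups = method, data = res, type = "b",
            auto.key = list(columns = 2),
            xlab = "n (number of users)", ylab = "MAPE")
print(p)
```

Replacing mape with seconds in the formula gives the corresponding run-time plot from the same results data frame.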

Your report must explain in detail what you did, why you did it, and the conclusions you drew from it.

Choice of datasets and collaborative filtering methods is up to you, but the latter should include at least one matrix-factorization (MF) method and one non-MF method.

Important General Rules

PLEASE FOLLOW THESE RULES 100%!

Grading

General comments:

Criteria: