ECS 172 Syllabus, Winter 2022
(Subject to change, due to Covid-19.)
Instructors:
Contents (somewhat long, but VITAL):
- Gain skill in the major methods of recommender systems, a branch
of machine learning (ML).
- Acquire or deepen a practical understanding of basic ML methods.
- Acquire or deepen a practical understanding of dealing with real
data, both in terms of modeling and in data wrangling.
- A calculus-based probability course, e.g. ECS 132, STA 131A or STA 130A;
linear algebra; coding skills as in ECS 36B or 32C.
- Prior knowledge of R is NOT required, but you are expected to learn
it on your own, e.g. from my
fasteR site.
- Prior knowledge of ML is NOT required. If you do have such
background, you should find that your insight is enhanced by this
course.
- Good mathematical intuition is the most important prerequisite.
Instructor's notes. Will be available
here.
Lecture periods, MW, 9:00-9:50, Hoagland 168, F 10-11 Wellman 212:
- Lecture, from the textbook. You must bring the book to class,
in e-format if you wish, but preferably hard copy so you can easily
annotate.
- During lecture, I will often make impromptu examples not in the
book, and impromptu explanatory remarks. Often these will be in
response to student questions. These examples and remarks will often
be the subject of exam questions.
- If you take the view that "the professor just lectures out of
the book, so one need not go to class," you will find that you are
incorrect and your exam performance will suffer.
- Please do not hesitate to ask questions in lecture!
Discussion sections, M 10-10:50, Kerr 212:
- Led by TA.
- There will be a quiz almost every week.
- Occasional mini-lectures by the TA.
NOTE: For now, and likely throughout the quarter, the TA will
swap time slots with me on Fridays. I will lecture in the 10 a.m. slot, and
the TA will conduct the discussion section at 9 a.m.
You will rely quite heavily on your homework groups for homework, the
term project and the group quiz. Get to know them well!
- Group size must be 3 or 4. If you wish to form your own group, let the
TA know during Week 1 of class. Otherwise the TA will assign you
to a group.
- Group work should be collaborative, not "You do Problem 1, I'll do
Problem 2 etc."
- Within a group, there may be differing levels of math or coding
skill, so not everyone might contribute equally. But everyone must do
their part.
In working with your teammates on the Homework, and later on the Term
Project, you ARE part of a team, and you are expected to do your best to
contribute to the team. You cannot say, for instance, "Oh, I just want a
P grade in this course, so I will do less." This is unethical and is
irresponsible behavior toward your teammates.
If, in interactive grading, it becomes clear that you contributed little
or nothing to the team effort, YOU WILL NOT GET CREDIT FOR THE
ASSIGNMENT. If this happens on multiple occasions, you will get
a D or F for the course, regardless of your quiz grades etc.
If a member of your group is not participating, notify me
IMMEDIATELY.
- As noted, there will be a quiz almost every week, during the
discussion section, plus a group quiz in the last lecture slot.
There will be no midterm or final exams.
- You will need a laptop to write these exams, on which you have
installed OMSI. It
need not be a fancy laptop. Hopefully we will have a dry run
beforehand, but it is your responsibility to be ready to use the
OMSI software correctly. You will also need R on your laptop.
Do not run OMSI on CSIF or any other machine.
- You must have printed or electronic copies of all course
materials.
- Most questions will involve writing R code. These
will serve mainly as a vehicle for measuring your understanding
of the concepts, not primarily of your knowledge of R and
the rectools R package. ("OMSI" stands for Online
Measurement of Student Insight.)
- However, you do have to get the R correct, as the OMSI software will
run your R code, both for you during the quiz and for me when I grade
it. Note: OMSI will invoke R directly, not via RStudio.
- Be ready to write mathematical answers, including things like
integrals, in R. See these tips.
- The questions will involve the reading, homework (though
probably rarely), and class discussions as noted above.
The quizzes and solutions from Winter 2020, when this class was
taught as 189G, are available
here.
- On the last day of class, March 11, there will be a special Group
Quiz, held during the lecture time slot, in which you will work
in your homework groups.
- Ethical behavior: In preparation for exams, you will
make sure you've downloaded all materials to your laptop. During
the exams, you will not communicate with anyone else in any form,
and will not access (read or write) the Internet or any electronic
device for any reason, except to write your exam via OMSI (and except
for the Group Quiz). You must be physically present in the
classroom, i.e. not logged in from elsewhere.
- 3-4 assignments through the quarter.
- Turn in just ONE report per group. Must be written in LaTeX (which
must compile on CSIF). See my
intro;
many others on the Web.
- Read the submission details CAREFULLY!
- The term due date means 11:59 p.m. of the stated date.
Each group will be allowed a total of 2 late days over the quarter,
which are not penalized.
Don't squander your grace period days on the first assignment!
- Will be graded interactively. Group signs up for a time slot with
TA. In addition to correctness of the work, TA will also grade on
questions that the TA asks each group member, concerning the homework
(e.g. "If the question had asked XYZ, what would change in your
solution?") and about the course in general. Note that each
member of the group may get a different grade.
- This is the most important part of the course. I view everything
else as preparation leading up to the project.
- Again, it is done with your group.
- Assigned near the end of the quarter, due at 11:59 pm of the date
of our scheduled final exam.
- Submit following the same rules as for homework. Again, the name
of the .tar file is especially crucial.
- However, submit to my handin account on CSIF, not the TA's.
It is required that you read the
course blog at least once per day. All course announcements will be
made there. Material in blog posts may also be subject to exam
questions.
- I will grade the quizzes and the project, and the TA
will grade the homework.
- The weighting for the course grade will normally be 60% for the
quizzes, 20% for the homework, and 20% for the project.
- Your lowest 2 quiz grades (letter) will be thrown out. (In some
classes, I increase that to 3.)
- However, the reason for the qualifier "normally" is that your
grade may be higher or lower than what the above formula would give:
- You may get a bonus, usually for an unusually good term
project.
- If we have time for my "job interview" exercise, this too
could increase your course grade (and couldn't reduce it).
- See this case study of how a student's
course grade rose sharply due to bonuses.
- On the other hand, extreme evidence of your not acting
responsibly in the course could result in your getting a D or F
for the course, regardless of your quiz grades etc. As noted
earlier, one way in which this could happen is that you don't
participate much in your group work. Another way would be if you
miss a lot of quizzes.
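As a rough illustration of the normal 60/20/20 weighting only, here is a small R sketch. The component scores and the 0-4 grade-point scale are made-up assumptions, not the instructor's actual scale, and the sketch ignores the bonuses and penalties described above.

```r
# Hypothetical component scores on an assumed 0-4 grade-point scale
scores  <- c(quizzes = 3.3, homework = 3.7, project = 4.0)

# Weights stated in the syllabus: 60% quizzes, 20% homework, 20% project
weights <- c(quizzes = 0.60, homework = 0.20, project = 0.20)

# Weighted sum; the weights already add up to 1
course_score <- sum(weights * scores)
course_score
```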
It's vital to get these steps 100% correct. Remember, a script will be
preprocessing your submission.
- You place your files--.tex, .R and any figures, with
NO subdirectories--in a Unix .tar file.
- Make SURE not to make subdirectories. When the grading script
unpacks your .tar file, it will expect to see all your work files
in the same directory from which the script invokes the .tar
command.
- The file name will be email1.email2....tar where the
'email' fields are the official UCD e-mail addresses of the members of your
Group, e.g. jsmith.agutierrez.streddy.tjwong.tar. Be sure to get
those addresses exactly correct, to avoid a situation in which your team
member doesn't get credit. Be sure to use the proper e-mail
address, NOT a different one based on your UCD login. Your official
address is the one used by the TA and me in mailing you; check your
records.
- You then submit your .tar file to the TA for homework
or to me for the project, using handin on CSIF.
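The naming and no-subdirectories rules above can be checked before you submit. The following is a minimal sketch using R's tar() and untar() from the utils package in place of the Unix tar command; the e-mail names come from the example above, and the file names hw1.tex and hw1.R are hypothetical placeholders. It is not the grading script itself.

```r
# Group members' e-mail names, taken from the example file name above
emails  <- c("jsmith", "agutierrez", "streddy", "tjwong")
tarname <- paste0(paste(emails, collapse = "."), ".tar")

# Work in a scratch directory with two placeholder files (hypothetical names)
workdir <- tempfile("hw")
dir.create(workdir)
oldwd <- setwd(workdir)
writeLines("\\documentclass{article}", "hw1.tex")
writeLines("x <- 1", "hw1.R")

# Build the archive from the files themselves, not from a directory
tar(tarname, files = c("hw1.tex", "hw1.R"), tar = "internal")

# List the archive contents; any "/" would indicate a subdirectory
contents <- untar(tarname, list = TRUE, tar = "internal")
flat <- !any(grepl("/", contents))

setwd(oldwd)
```

If flat is FALSE, the archive contains a path with a directory component and should be rebuilt.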
Prologue: Overview of the Recommender Systems Field
Collaborative filtering; content-based methods; the "Hello, World"
dataset, MovieLens.
Part I: Infrastructure
- Review of linear algebra, especially diagonalization.
R functions for linear algebra.
- Review of probability tools, especially density functions;
extension to the multivariate case, e.g. from variance to covariance
matrices.
- Mixture distributions and clustering methods.
Part II: Collaborative Filtering
- Additive/MLE methods.
- Parametric and ML approaches.
- Matrix factorization methods (SVD, NMF).
- Clustering methods.
Part III: Content-Based Methods
- Brief look at NLP.
- R packages.
Part IV: Misc.
- Google Pagerank
- More on clustering.
- Remember, when I grade your quiz, OMSI will run your R code. So you
need to be able to express math answers in R. E.g. if your answer is
$(3^2 + 4^2)^{0.5}$
you need to know to write the R code
(3^2+4^2)^0.5
- Know how to use the c() function to create a vector, the
matrix() function to create a matrix, etc.
- These math functions may come in handy:
length(), choose(), combn(), sum(), min(), max(), exp(), log(),
integrate().
- Here are some examples of integrate():
> integrate(function(t) 2*t/15,2.5,4)$value
[1] 0.65
> integrate(function(t) 2*t^2/15,1,4)$value
[1] 2.8
> integrate(function(t) sqrt(abs(t)) *
dnorm(t,mean=10,sd=2.1),-Inf,Inf)$value
[1] 3.144017
Make sure you know why '$value' is needed.
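The tips above can be tried out with a short sketch like the following. The variable names are arbitrary, and the values are chosen only for illustration.

```r
# Vectors and matrices
v <- c(3, 4)                  # numeric vector of length 2
m <- matrix(1:6, nrow = 2)    # 2 x 3 matrix, filled column by column
norm_v <- sum(v^2)^0.5        # (3^2 + 4^2)^0.5

# Counting helpers
n_pairs <- choose(4, 2)       # number of ways to choose 2 items from 4
pairs   <- combn(4, 2)        # matrix with one combination per column

# integrate() returns a list-like object of class "integrate",
# not a bare number; the numeric answer is its 'value' component
res <- integrate(function(t) 2*t/15, 2.5, 4)
is.list(res)
res$value
```

Because res is a list rather than a number, it cannot be used directly in further arithmetic; extracting res$value gives the number itself.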
This was a student in ECS 132.
- Original quiz record:
F A+ F C+ B+ D B
- Add Quiz 0, practice using the OMSI system (most students get A+):
F A+ F C+ B+ D B A+
- Drop two lowest quizzes:
A+ C+ B+ D B A+
- Replace next-lowest quiz by "job interview" grade:
A+ C+ B+ B+ B A+
- Course grade according to formula (after the above changes):
B+
- Bumped-up course grade after two-notch bonus for A- project:
A
- Again, look at that original quiz record!
F A+ F C+ B+ D B
Yet, A for course grade!
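The quiz-record steps in this case study can be retraced with a small R sketch. The ordering of letter grades in grade_scale is an assumption made for illustration; the actual conversion used in grading is not stated here.

```r
# Assumed ordering of letter grades, lowest to highest
grade_scale <- c("F", "D-", "D", "D+", "C-", "C", "C+",
                 "B-", "B", "B+", "A-", "A", "A+")

# Quiz record including Quiz 0 (last entry)
quizzes <- c("F", "A+", "F", "C+", "B+", "D", "B", "A+")

# Drop the two lowest quizzes
ranks <- match(quizzes, grade_scale)
kept  <- quizzes[-order(ranks)[1:2]]

# Replace the next-lowest quiz by the "job interview" grade, B+
adjusted <- kept
adjusted[which.min(match(kept, grade_scale))] <- "B+"

kept
adjusted
```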