Blog, ECS 132, Winter 2012

In your analyses involving confidence intervals and significance tests, it's centrally important to explain whether you believe forming the CI added anything useful to the analysis, compared to the test.

In your regression analyses, it's

March 20, 12:47 p.m.

Please note that I will be away on Thursday and Friday, presenting a paper at a research conference. I will have little or no e-mail access during that time (except for late Thursday night). So, try to get me any questions you have regarding the Project before then.

March 20, 11:17 a.m.

Please do not send me your data for Problem II. Just give me a URL. Thanks.

March 18, 6:03 p.m.

The fact that I have some plots in the regression chapter of the book is perhaps misleading. This would be very difficult to do with 2 predictors, and basically impossible with more than 2.

There are alternative graphical methods, and if you wish to find some on the Web and use them, that would be great. However, our book does go into nongraphical methods for predictor variable selection and assessing goodness of fit.
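As a rough illustration of a nongraphical comparison, here is a minimal sketch that fits two nested models and compares R-squared and adjusted R-squared. It assumes R's built-in mtcars data as a stand-in for your own data, and the choice of adjusted R-squared is only one possible measure; the book may emphasize others.

```r
# mtcars ships with R; it is only a stand-in for your own Project data.
fit_small <- lm(mpg ~ wt, data = mtcars)        # one predictor
fit_full  <- lm(mpg ~ wt + hp, data = mtcars)   # two predictors

# Ordinary R-squared never decreases when a predictor is added,
# so adjusted R-squared is often consulted as well.
r2_small  <- summary(fit_small)$r.squared
r2_full   <- summary(fit_full)$r.squared
adj_small <- summary(fit_small)$adj.r.squared
adj_full  <- summary(fit_full)$adj.r.squared

print(c(r2_small = r2_small, r2_full = r2_full))
print(c(adj_small = adj_small, adj_full = adj_full))

# Estimated coefficients with their standard errors.
print(summary(fit_full)$coefficients)
```

With more than 2 predictors the same calls work unchanged, which is where plots stop being practical.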

March 16, 11:17 p.m.

I said yesterday that the Quiz was quite difficult, and I'd be happy if you get even one of the two problems right. Well, one team made me extremely happy--they got Problem 1 fully correct and Problem 2 mostly correct. Unfortunately, none of the other three teams got any points, though one team got the 10-point bonus for complying with the specs for submission. In view of the extremely difficult nature of the Quiz, I gave C- grades for simply showing up, C+ with the bonus. The "Dream Team" got an A+, good for them.

For those who didn't get the A+, don't be discouraged. Again, this one was really hard. Nevertheless, please make sure you understand the solution to Problem 2, as it is relevant to our Project.

I'm looking forward to strong Projects from ALL groups. Last quarter in 132, half the groups got A+ grades on the Projects, and the rest got A grades.

March 12, 8:47 p.m.

As noted, at the end of our group Quiz (not later), each group will turn in their solution files to my handin directory, 132quiz8. Submit your files under the UCD e-mail address that is alphabetically earliest in your group. Make sure that you state in the comments in your code the names of your group members.

I will be applying my semiautomatic grading script, and it will expect that your file names are EXACTLY as is requested on the Quiz sheet.

March 12, 7:17 p.m.

Whenever we compare two means, we do so by forming a confidence interval for their (population) difference, or if doing a significance test, we test the hypothesis that the (population) difference is 0.

For the confidence interval case, use the material in Section 9.7. For significance, use Section 10.2 (example in Sec. 10.3).
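For concreteness, here is a minimal sketch of both approaches in R. It assumes the large-sample normal approximation and uses simulated data invented purely for illustration; it is not copied from Sections 9.7 or 10.2, so check the details against the book.

```r
# Simulated data, purely for illustration (not from the course).
set.seed(132)
x <- rnorm(100, mean = 10, sd = 2)
y <- rnorm(120, mean = 9, sd = 3)

# Point estimate of the population difference and its standard error.
diff_hat <- mean(x) - mean(y)
se <- sqrt(var(x) / length(x) + var(y) / length(y))

# Approximate 95% confidence interval for the population difference.
ci <- diff_hat + c(-1, 1) * qnorm(0.975) * se

# Significance test of H0: the population difference is 0.
z <- diff_hat / se
p_value <- 2 * (1 - pnorm(abs(z)))

print(ci)
print(p_value)
```

Note that the interval excludes 0 exactly when the two-sided test rejects at the 5% level, but the interval also shows how large the difference might be, which the test alone does not.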

March 12, 7:09 p.m.

IMPORTANT MESSAGE REGARDING THURSDAY'S GROUP QUIZ:

March 12, 11:09 a.m.

There were 3 students who did not submit their Midterm files to handin. I'll be grading those today. If you have any previous Quizzes that you didn't turn in to handin which I have NOT graded yet, please let me know, to ensure that nothing is lost.

March 12, 10:59 a.m.

I'm going to take another look at Problems 2(c) and (e) on the Midterm, with a possible regrade. Don't worry--your grade can only go up, not down. :-)

March 11, 11:36 p.m.

I've added a Section 14.9, A Preview of Linear Regression Analysis with R, to the revised version of our textbook. Here I walk the reader through a full R regression analysis, on real data. THIS IS MUST READING FOR YOUR PROJECT; it should be very helpful to you in seeing "the big picture." Please let me know if you have any questions about it.

March 11, 11:35 p.m.

In my 4:45 p.m. message, "the last 2 Tests" means Quiz 7 and the Midterm.

March 11, 4:45 p.m.

I just e-mailed the midterm grades. The results were fairly good.

Please note that for the group Quiz, I have been thinking over the last few days of having two problems, one dealing with Chapter 8 and the other with Chapters 9/10. In grading the two Tests today, I saw that many students have weaknesses in those chapters, so that plan is probably a good one. Therefore:

If a group does well on both problems on the group Quiz, I will add 10 more points to each of the last 2 Tests, for each member of the group.

Please remember that each group must have at least one laptop computer in the group Quiz. One of the problems will involve programming (R), and you will submit .R and .tex files via WiFi at the end of the Quiz (not later).

March 11, 4:30 p.m.

Please do not use integrate() on quizzes. Almost everyone gets the syntax wrong, resulting in an R parse error message when the grading script is run.

March 11, 2:50 p.m.

I just e-mailed the grades for Quiz 7.

It was a difficult Quiz, so I set the grade cutoffs pretty low. For instance, an A was 60, and it was 25 for a C. The latter was set on the grounds that everyone should get Problem 2(b) correct.

March 10, 11:10 p.m.

I've added another exemplar project to the Web site. See Project specs for the link.

March 9, 1:10 p.m.

Good news! The specs for your Project are now on the Web. :-) See here.

March 8, 10:30 p.m.

I fixed a typo (lower bound) in Problem III.

March 8, 9:55 p.m.

I've added some hints to Problem IV, and fixed a couple of typos. Also, I've added a formal statement on the MV CLT in our revised textbook, but the new hints should be enough.

March 8, 8:35 p.m.

I'm not happy with the statement of the Multivariate Central Limit Theorem in our book. Technically the book does have enough for you to do Problem IV, but it's too vague.

I'm going to fix this tonight, but meanwhile I'll postpone the due date to Monday.

March 8, 7:45 p.m.

I had the dates wrong in my Feb. 27, 9:05 p.m. posting. We will have no Quiz on March 14 (though possibly a lecture), and our group Quiz is on March 15. Remember, by the way, that each group will need at least one laptop computer on Mar. 15.

February 29, 8:05 p.m.

I just e-mailed the grades for Quiz 6 (not today's Quiz).

February 28, 9:05 p.m.

There will be no Quizzes on March 7 and March 15. I may need them to present lecture material, though. Watch this space for news. Of course, remember we have our group Quiz on March 16.

February 27, 3:49 p.m.

Please note that the ONLY section in Chapter 5 that you are responsible for is Section 5.1, Memoryless Property.

February 24, 4:49 p.m.

Our final Homework assignment for the quarter is now on the Web.

February 21, 10:22 p.m.

Nice New York Times article. It touches on a number of things I've mentioned in class, including Prof. King.

February 19, 9:17 p.m.

I will be in my office hours as usual tomorrow. Not sure any building doors will be open, but if you can't get in, then phone or e-mail me and I'll come down to meet you.

February 19, 9:16 p.m.

Just fixed the link in Problem III to my R tutorial.

February 19, 2:06 p.m.

Remember, in almost all of our Homework problems in this course, you can check your answers via simulation. That includes all problems in the current assignment, especially Problem I, in which I've written the main simulation code for you.
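As a generic illustration of checking an answer by simulation, here is a minimal sketch. The dice example is an assumption chosen for simplicity and is not one of the Homework problems.

```r
# Claimed exact answer: P(sum of two fair dice is 7) = 1/6.
exact_prob <- 1/6

# Simulate many rolls of two dice and record how often the sum is 7.
set.seed(1)
nreps <- 100000
die1 <- sample(1:6, nreps, replace = TRUE)
die2 <- sample(1:6, nreps, replace = TRUE)
sim_prob <- mean(die1 + die2 == 7)

# The two values should agree to about two decimal places.
print(c(simulated = sim_prob, exact = exact_prob))
```

If the simulated and derived values disagree by much more than sampling error, one of the two is wrong.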

February 17, 11:16 p.m.

I added the necessary values for the pi in Problem I.

February 15, 12:59 p.m.

I had an error in my original solution for Problem 4 in Quiz 4, which I've now fixed. Though not directly related, I believe that one or two of you may have been graded incorrectly on that problem. Please check, and if that is the case, let me know so I can change your grade.

February 13, 11:29 p.m.

I had hoped to grade your Quiz 4 files this evening, but got bogged down because of a large number of manual fixes of format errors. I'll try again tomorrow.

Sorry for the delay, but you can help me a lot on future Quizzes (and get the full 10-point bonus, rather than 5 or 0). Please keep the following in mind:

Concerning that last point, please look at the solutions for Quiz 4 on our Web site. You'll see that you didn't need e anyway, because the best way to answer would have been using the d*, p* and r* series of functions for distributions.
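To illustrate the d*, p* and r* families, here is a minimal sketch. The distributions and parameter values are assumptions made up for this example, not taken from Quiz 4.

```r
# Binomial with n = 10 trials and success probability 0.3.
p_eq_2 <- dbinom(2, size = 10, prob = 0.3)   # P(X = 2)
p_le_2 <- pbinom(2, size = 10, prob = 0.3)   # P(X <= 2)

# Exponential with rate 2: the cdf, with no need to type exp() yourself.
p_t_le_1 <- pexp(1, rate = 2)                # P(T <= 1)

# The r* functions generate random values, handy for simulation checks.
set.seed(1)
draws <- rbinom(5, size = 10, prob = 0.3)

print(c(p_eq_2, p_le_2, p_t_le_1))
print(draws)
```

Here pexp(1, rate = 2) gives the same value as 1 - exp(-2), which is why the constant e is not needed.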

While I'm here writing this blog entry regarding Quizzes, here is an important tip for Quiz 5, coming this Wednesday: Keep in mind that the course material is highly cumulative; later material draws upon earlier material. In preparing for Quiz 5, you may wish to lightly review the major points we've covered so far. To do this, first read the table of contents, to remind yourself what things we've covered. Second, make sure you thoroughly understand the solutions of the previous Quizzes; if you have any question at all about them or about the course material, contact me about it, either by asking after class or by sending me e-mail.

February 13, 9:49 p.m.

Correction to the sample P matrix for Problem I:

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]  0.2 0.45  0.0 0.00 0.00 0.00 0.35 0.00 0.00  0.00
 [2,]  0.0 0.45  0.0 0.00 0.00 0.00 0.00 0.35 0.20  0.00
 [3,]  0.0 0.45  0.2 0.00 0.00 0.00 0.00 0.00 0.35  0.00
 [4,]  0.0 0.00  0.0 0.45 0.00 0.00 0.00 0.00 0.20  0.35
 [5,]  0.0 0.00  0.0 0.45 0.55 0.00 0.00 0.00 0.00  0.00
 [6,]  0.0 0.00  0.0 0.00 0.00 0.80 0.00 0.00 0.20  0.00
 [7,]  0.0 0.00  0.0 0.00 0.00 0.45 0.55 0.00 0.00  0.00
 [8,]  0.0 0.00  0.0 0.00 0.00 0.00 0.00 0.80 0.20  0.00
 [9,]  0.0 0.00  0.0 0.00 0.00 0.00 0.00 0.45 0.55  0.00
[10,]  0.0 0.00  0.0 0.00 0.00 0.00 0.00 0.00 0.20  0.80
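
If you type the matrix in by hand, a quick sanity check helps catch typos. The sketch below enters the matrix above row by row and verifies that it is a valid transition matrix; the variable name P is just a convenient choice.

```r
# The sample matrix above, entered row by row.
P <- matrix(c(
  0.2, 0.45, 0.0, 0.00, 0.00, 0.00, 0.35, 0.00, 0.00, 0.00,
  0.0, 0.45, 0.0, 0.00, 0.00, 0.00, 0.00, 0.35, 0.20, 0.00,
  0.0, 0.45, 0.2, 0.00, 0.00, 0.00, 0.00, 0.00, 0.35, 0.00,
  0.0, 0.00, 0.0, 0.45, 0.00, 0.00, 0.00, 0.00, 0.20, 0.35,
  0.0, 0.00, 0.0, 0.45, 0.55, 0.00, 0.00, 0.00, 0.00, 0.00,
  0.0, 0.00, 0.0, 0.00, 0.00, 0.80, 0.00, 0.00, 0.20, 0.00,
  0.0, 0.00, 0.0, 0.00, 0.00, 0.45, 0.55, 0.00, 0.00, 0.00,
  0.0, 0.00, 0.0, 0.00, 0.00, 0.00, 0.00, 0.80, 0.20, 0.00,
  0.0, 0.00, 0.0, 0.00, 0.00, 0.00, 0.00, 0.45, 0.55, 0.00,
  0.0, 0.00, 0.0, 0.00, 0.00, 0.00, 0.00, 0.00, 0.20, 0.80
), nrow = 10, byrow = TRUE)

# Every entry must be a probability and every row must sum to 1.
# A small tolerance allows for floating-point roundoff.
entries_ok <- all(P >= 0 & P <= 1)
rows_ok <- all(abs(rowSums(P) - 1) < 1e-8)
print(c(entries_ok = entries_ok, rows_ok = rows_ok))
```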

February 13, 1:54 p.m.

Please note that you are responsible for the material on pp.117-118 on tests.

February 12, 7:54 p.m.

Here are some sample values and some tips regarding Problem I of the Homework.

For the case q = 4, d = 1 and p = (0.35,0.45,0.20):

Tips:

February 9, 10:54 p.m.

All three problems for Homework 3 are now up on the Web. Note that I've extended the due date considerably. I'll be adding some test cases etc. in the next day or so.

Each of the three problems requires "stamina," as described in class. They are not very hard, but they require persistence.

February 8, 7:44 p.m.

Problem I of Homework 3 is now on our Web site. There will be two more problems.

February 8, 1:24 p.m.

Please keep in mind that the grading script expects your answer to consist of just one line per problem. Problems are designed with that in mind, but if you ever need more than one line of R code, you can use semicolons, which R treats in the same way C does.
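For instance, here is a small sketch of a multi-statement answer kept on a single line. The variable names and numbers are hypothetical.

```r
# Three statements on one line, separated by semicolons; the last one is the answer.
x <- c(2, 4, 9); m <- mean(x); m
```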

February 7, 10:14 p.m.

I just mailed out the grades for Quiz 3, for all who had valid-format answers files.

Overall the results were quite good, very impressive.

Your Quiz sheets will be given back soon.

Hwk 3 posted tomorrow.

February 2, 5:50 p.m.

I fixed a typo in the statement of Problem III (missing argument).

February 1, 1:50 p.m.

Don't forget to submit an electronic copy of your answers to today's quiz.

January 26, 8:50 p.m.

Problem I.f should read, "Explain intuitively why E(L2) should be larger than E(L1)."

January 26, 11:00 a.m.

My semi-automatic grading program is working very smoothly. To ensure that you receive a full bonus, make sure your answers file meets the format requirements, such as:

I'm still tweaking the script, though. Here is a new requirement:

January 25, 9:37 p.m.

Homework 2 is now ready on the Web.

January 25, 9:27 p.m.

If you would like a chance to think further about the material, please note that all the old exams are in the (of course) OldExams/ directory on our course Web page. Naturally, you will get the most benefit if you try the problems yourself before looking at the solutions. Note that some exam questions were given later in the quarter than where we are now, and thus may use unfamiliar terms; leave these for later.

In addition, there are many exercises at the end of each chapter in the book. There are no solutions in the book, but many are actually old exam questions, so see above. Also, if you'd like to know how to do one of the other exercises, you are welcome to discuss it with me.

January 25, 12:45 p.m.

The handin directory 132quiz1 was blocked, but is fixed now. Remember to turn in your answers file.

January 25, 9:25 a.m.

Remember to keep a separate copy of your Quiz answers, and submit them on handin later today.

January 23, 6:05 p.m.

The .tar file you submit for your homework must include all your R source files (none in Hwk 1), and your Answers.txt file, your .tex and .pdf files, as well as any files for figures.

January 21, 4:15 p.m.

After tweaking my semiautomatic grading program the last few days, I finally got it refined to what I wanted, and just now ran it on your Quizzes.

The program automatically e-mails your results to you. The information sent to you consists of your scores on the individual problems (or subproblems), your bonus score, your total score, and your letter grade.

Half of my grading program consists of error checking. If it finds a format error, say unparsable R code, it invokes the vim text editor on your answers file, in a new window. There I can try to repair it. If you got only a 5-point bonus rather than 10, it was because I needed to do a repair.

The grading program is only semiautomatic, as I get to visually inspect every answer. In one case, for instance, a student wrote 0.144 for an answer when it should have been 0.0144. I assumed this to be an arithmetic error, and gave full credit. Nevertheless, this shows that you shouldn't do your own arithmetic; just write the R code.

The Quiz and its solutions are now on our Web page.

January 18, 7:35 p.m.

I'm teaching 2 classes this quarter. When you use handin, make sure not to submit to the wrong class. :-(

January 18, 7:31 p.m.

In submitting your quiz answers file, please note that it will be read by my automatic grading program. For problems with numeric answers, you must give R programming language expressions.

For example, if a problem asks you to find P(X = 2) and you find that it is equal to 2 divided by the product of 3 and 5, then your answer should be

2/(3*5)

NOT

P(X = 2) = 2/(3*5)

My program is not that smart. :-)
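To see why the second form causes trouble, here is a minimal sketch of how a script might evaluate an answer line. The helper eval_answer() is hypothetical and written only for this illustration; it is not the actual grading program.

```r
# Hypothetical helper, NOT the actual grading script:
# evaluate one answer line as R code, returning NA if that fails.
eval_answer <- function(line) {
  tryCatch(eval(parse(text = line), envir = new.env()),
           error = function(e) NA)
}

good <- eval_answer("2/(3*5)")              # a plain R expression
bad  <- eval_answer("P(X = 2) = 2/(3*5)")   # not usable as an expression

print(good)
print(bad)
```

The first line evaluates to a number, while the second produces an error and so comes back as NA.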

January 18, 4:15 p.m.

Don't forget to submit a copy of your Quiz answers to the CSIF handin facility.

January 17, 8:04 p.m.

Tomorrow, when you use the CSIF handin command to submit your electronic copy of your quiz answers, use the subdirectory 132quiz1.

So for instance, if your answers file is, say, jsmith.txt (must be named after your UCD e-mail address; see syllabus), then on CSIF you would type

% handin matloff 132quiz1 jsmith.txt

If you wish to check which directories are there, type

% handin matloff

Please review the syllabus regarding the format of your quiz answers file, but note that I broadened it in class today: If you have a numerical answer, don't compute it. (No calculators allowed.) Simply type an R expression for it, such as

5/6 + (1/6) * (4/6)

in the case of Equation (2.35), page 17. Note that the R exponentiation operator is ^, e.g. 2^3 = 8.

January 16, 10:00 p.m.

Note that our first Quiz will be in this Wednesday's discussion section.

As mentioned earlier, it will cover all of the book through what we cover tomorrow (Tues.). Pay special attention to the ALOHA example (Sec. 2.5) and the board game example (Sec. 2.10). Think of your own variations of the probabilities found there. As usual, feel free to ask me about anything in those examples (or even about the variations you devise).

Remember, you can get a 10-point bonus on each Quiz with very little effort. Details are in the syllabus, but I will review them (with a small update to make things easier) in an upcoming blog posting.

January 16, 11:04 a.m.

Thanks to our TA for pointing out a couple of errors.

The sample answer in 2(a) should be 0.37.

The sample answer in 3 should be 1 - 0.3571 = 0.6429. (I originally forgot to subtract from 1.)

January 13, 1:24 p.m.

The TA's office hours are in Kemper 53.

January 13, 12:57 p.m.

The specs on our Web page for Homework 1 are now complete.

January 12, 6:27 p.m.

Please keep in mind how the textbook coverage works. Since I covered Section 2.5 today, you are required to read on your own any material before that section that was not covered in class.

Next Tuesday, I will probably begin with Section 2.11, then spend quite a bit of time on Section 2.12. If that is as far as I get, you will be responsible for all pages through Section 2.12 during Wednesday's Quiz.

January 12, 6:26 p.m.

Due to the MLK holiday, I won't be holding office hours that day, but remember that I'm available on e-mail every day, including that one.

January 12, 6:24 p.m.

The first two problems in Homework 1 are now on our Web page. (There will be one more.) Start early! Tip: Read the board game and bus ridership examples, Sections 2.10 and 2.11, before starting the homework.

January 12, 6:23 p.m.

Jui-Chung's office hours will be Tuesdays 2-4, Thursdays 2-3.

January 11, 12:34 p.m.

Please use the following as the contact e-mail address for Jui-Chung, our TA: juiwu@ucdavis.edu

January 11, 10:49 a.m.

You may find it fun (and profitable!) to browse through kaggle.com. This company is in the business of managing data mining contests. The methods used are from probability and statistics, some from our course (regression analysis) and some that are advanced versions of what's in our course.

January 10, 10:49 p.m.

Remember, groups will be set in discussion section tomorrow (Wed.). For a class our size, group size is 2-4.

January 10, 6:02 p.m.

I mentioned in the syllabus and in class that many Google jobs require statistics, and especially knowledge of the R programming language. Here is a random example I found on the Web. Notice that it is a CS job ("building a highly scalable computing infrastructure, novel storage systems, innovative user experiences...") but with a Statistics title. They do prefer a grad degree, which is common these days.