Blog, ECS 145 Winter 2019

Tuesday, March 19, 4:30 pm

Again: Submit only the new files and the ones you changed. But in your report, you must show that the modified package builds, installs and executes properly. Show the actual commands, executed at the R interactive prompt.

Monday, March 18, 9:00 pm

On the group quiz, three groups did not follow directions in submitting a .tar file. This causes problems with the grading script, resulting in some groups being graded twice, with different scores. Please check the solutions to see if this may have occurred in your case.

When you submit your Term Project, please make SURE your file is submitted properly.

Saturday, March 16, 11:50 pm

I've placed the solutions for Quizzes 6 and 7 (group quiz) on our Web page. I hope to get Quiz 7 graded tomorrow.

Saturday, March 16, 9:00 am

When submitting your Term Project, do not submit the package. Just submit your C/C++ code and any related files, and in your report, show that INSTALL really does work as promised.

Thursday, March 14, 9:50 pm

Correction: There are two problems in the quiz tomorrow. Submit them in Problem1.R and Problem2.R.

Thursday, March 14, 8:40 pm

As mentioned in class yesterday, there will be a single problem in tomorrow's quiz. Put your code in a file named Problem1.R.

Even though you will be submitting just one file, package it in a .tar file, using the same naming scheme you've been using for submitting homework. Do NOT have any subdirectories in the .tar file; for full credit, when the grading script runs tar xf, it should NOT produce any subdirectories. Submit via handin, to my directory ('matloff'), not the TA's.

Please note that the same requirement holds for the Term Project.
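One way to check an archive before running handin is to list its members and confirm that none of them sits inside a directory. The sketch below is only an illustration, not the grading script; the function name has_no_subdirectories() is made up for this example.

```python
import os
import tarfile

def has_no_subdirectories(tar_path):
    """Return True if every archive member is a plain top-level file."""
    with tarfile.open(tar_path) as tf:
        for member in tf.getmembers():
            # normpath turns './Problem1.R' into 'Problem1.R'
            name = os.path.normpath(member.name)
            if not member.isfile() or os.path.dirname(name) != "":
                return False
    return True
```

Calling has_no_subdirectories() on your .tar file should return True if tar xf would unpack everything into the current directory.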

Wednesday, March 13, 8:40 pm

As I mentioned in class (and as is implied in the project specs), the package you choose for your project must be pure R, with no existing C/C++ code.

Tuesday, March 12, 10:35 pm

For Friday's group quiz, make sure you are familiar with R's character string functions, grep(), nchar(), paste(), paste0(), substr(), strsplit() and sprintf().

Monday, March 11, 10:55 pm

Note: The modifications you make should be transparent to users. Say you modify package X, one that I've been using for a long time. The R scripts I've been running, calling functions in X, should continue to work without change. (I would have to reinstall X, of course.)

Monday, March 11, 8:10 pm

News regarding the group quiz this Friday:

Concerning our Term Project:

Friday, March 8, 11:35 pm

Your Term Project is now on the Web!

There's a lot to do here: choose an R package to modify; learn how to interface R to C/C++; learn how package structures work; write a good, professional-quality report. Obviously, the key is to START EARLY. I consider this to be the most important part of the course, and give a substantial bonus to your course grade if you do an excellent job on this.

Monday, March 4, 11:05 pm

In lecture today, I forgot to mention the counterpart of assign(), which is get(). E.g.

> x <- 3
> get('x',envir=.GlobalEnv)
[1] 3

Monday, March 4, 8:05 am

On the first day of class, in discussing my policy of dropping your lowest two quiz grades, I may have mentioned that occasionally I increase that number to three. I've decided to go ahead and do this for our ECS 145 class this quarter.

Sunday, March 3, 10:35 pm

I've received several requests for regrading last Friday's quiz. I'm always open to regrade requests -- I still remember what it's like to be a student, even though that was eons ago -- but please keep the following in mind:

Saturday, March 2, 11:55 pm

I sent out your cumulative quiz records. Please note that for many of you, your eventual course grade will be higher than your current quiz record might suggest, because:

The course grades when I taught this class in Winter 2018 were:

 A A- A+  B B- B+  C C+  F                                                      
22 11 17  5  1  5  1  2  1

We have one more regular quiz, March 8, and the group quiz, March 15.

I've placed the solutions to yesterday's quiz on our Web page.

Saturday, March 2, 9:20 pm

> print <- function(x) x^2
> print(5)
[1] 25

A number of people did the equivalent of the above in Problem 1 of yesterday's quiz, clobbering the print() function.

Saturday, March 2, 9:20 am

Yesterday after class, I met with the students who had had trouble with OMSI in the quiz. As I had suspected, the problem was that they had not followed the OMSI directions, which require that R be executable from a terminal window. (Of course, this is a path issue, but that is the easiest way to check.)

Nevertheless, OMSI still may have some occasional glitches. If you encounter any, I'd greatly appreciate your letting me know.

(Note that this is separate from the problem with OMSI 1.4.0 that arose yesterday in ECS 132, which was due to lack of thorough testing.)

Wednesday, February 27, 8:30 pm

A student asked:

We have a couple of questions. First, for "get", if the requested object does not exist, do we throw an error and close connection? Second, for "ol", does the client want all objects on the server, even the ones that were not put into the server by the client or just the objects that were put into the server by the client? Third, for establishing multiple client connections, do we block all other clients until the current client is done?

As I've often mentioned, you are not required to do error checking in any assignment unless specifically asked to do so.

The server will have various objects available for access by the clients. It will also have other objects, of course, but these are "private." Let's assume that initially the server has no objects available. Clients put them there, and other clients (or even the same ones) can download them.

Concerning that last query ("Third..."), I discussed this in class. The connections are simultaneous, as implied in the wording of the assignment and as seen in our Python network examples. I stated in class that R does not have a threads feature, and stated the possible ways that you can handle this lack: (a) Use nonblocking I/O. (b) Use the Rdsm package. (c) Use socketSelect(), which is an R wrapper for the C select().
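The C select() that socketSelect() wraps is also available in Python as select.select(), so option (c) can be sketched in the style of our Python network examples. This is only an illustration of the idea, not the otp protocol from the assignment; the function name and the upper-casing "handler" are made up.

```python
import select
import socket

def serve_ready_clients(client_socks, timeout=1.0):
    """Handle one request from each client that has data waiting.

    Returns the number of clients served on this pass.
    """
    # select() waits until at least one socket is readable, or the timeout expires
    readable, _, _ = select.select(client_socks, [], [], timeout)
    for sock in readable:
        request = sock.recv(1024)
        if request:
            # stand-in for real request handling: echo back in upper case
            sock.sendall(request.upper())
    return len(readable)
```

The server calls this in a loop. No client is blocked while another is idle, because only the sockets that actually have data waiting are read.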

As usual, there are lots of examples on the Web, e.g. this one. I've also suggested to some people individually that they use the CRAN package snow as a model.

Monday, February 25, 10:45 pm

A student asked:

We are a little confused about how the prompt is meant to work for homework 3. Our initial assumption was that in the client, we should constantly be reading text input from the user, but that doesn't work if we're supposed to be able to parse R code as well (ex: "x <- 5", then "put x").

In talking to a student group today, I found that none of them had ever used FTP. If any of you are in that situation, the very first thing you must do is familiarize yourself with FTP. Use it to transfer a couple of files to/from your laptop and CSIF. The command is sftp.

You of course launch this app from a shell command line, e.g.

% sftp pc22

Once sftp is running, it accepts commands from you, repeatedly until you decide to exit. During that time, you can NOT execute other shell commands.

The same is true for our programming assignment. Once you execute otpClient(), that function runs until you decide to exit. During that time, you can NOT execute other R commands.

Monday, February 25, 1:55 pm

The commands 'ol' etc. are typed by the user in response to a user prompt, which we will take to be '#otp '.

Monday, February 25, 9:20 am

I learned yesterday that some students have unknowingly been engaging in a serious violation of our rules for using OMSI: They are logging in to CSIF and running OMSI there, a major security hole. You are not allowed any Internet access during quizzes (except for the group quiz, last day of lecture). It's not really their fault, as the TA suggested it to them during Quiz 0, so I would be reluctant to cancel the A+ grades those students received on that quiz, but MAKE SURE you do not do this in our one remaining non-group quiz.

Sunday, February 24, 11:05 pm

A handy trick in R is closures. For instance:

> f <- edit()
> f
function(x) {

   function(y) x * y

}
> g <- f(3)
> g(12)
[1] 36

What happened here? First, note that R functions automatically return the last value computed (if there is no explicit return() call). E.g.

h <- function(a,b) a^b

is the same as

h <- function(a,b) return(a^b)

So, the inner function above returns x*y. But that same line builds a function. ("The function of the function named 'function' is to build functions!") And since that building action is the last action in f(), that means that the return value of f() is that inner function.

Furthermore, x is part of the enclosing environment of that inner function, i.e. the environment in which the inner function was created. So, in the example run above, the creation of the function g() is equivalent to

> g <- function(y) 3*y

Sunday, February 24, 11:00 pm

In Hwk III, 'connect' should not really be listed among the client requests, as it is done via the otpClient() call.

Friday, February 22, 11:05 pm

Hwk III is now ready!

Thursday, February 21, 4:00 pm

This will be my first blog post on R! We'll start on material on R tomorrow, so please make sure to bring your books.

There is an R function, on.exit(), that I wanted to mention to you now lest I forget later.

The function is meant to be called inside another, which I'll refer to as f() here. What on.exit() does is tell the R interpreter to take a certain action when we leave f() -- including the all-important case where there is an execution error while f() is running.

Here is an example:

on.exit(setwd(subsdir))   

Below that line, I had a call to setwd() to change the working directory, with subsdir being my current directory. If there were no concern about execution errors, I would simply have setwd(subsdir) at the end of f() to restore the original directory. But by using on.exit() I both achieve that goal AND am assured that if there is an error, I still will return to the proper directory.
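For comparison with the Python half of the course: Python has no on.exit(), but a try/finally block gives the same guarantee. Here is a minimal sketch; the function name list_dir_from_inside() is made up for this example.

```python
import os

def list_dir_from_inside(target_dir):
    """List target_dir after changing into it, always restoring the old directory."""
    original_dir = os.getcwd()
    try:
        os.chdir(target_dir)          # raises OSError if target_dir does not exist
        return sorted(os.listdir("."))
    finally:
        # runs on a normal return AND when an error propagates out
        os.chdir(original_dir)
```

As with on.exit(), the restore step is stated once and runs no matter how the function is exited.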

Thursday, February 21, 3:50 pm

Someone asked me about the "takeoff and landing" metaphor.

I actually have discussed this many times in class. Remember, if I say something in class that you don't understand, or simply didn't hear, please interrupt me and ask me to explain. I will gladly comply.

The metaphor alludes to learning to fly an airplane, where the hardest things to learn are takeoff and landing.

In threaded programming, it's often difficult to get the code quite right for behavior at the start of execution. This is because one cannot know in advance which thread will start first. So the "takeoff" can be difficult.

Regarding the "landing," it can be quite difficult getting a threaded program to END properly. Indeed, it's very easy to write code that inadvertently results in one thread never finishing.
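Here is a small sketch of one common way to handle both ends, using Python's threading and queue modules. It is not the homework solution; the names worker() and square_all() are made up. For the "takeoff," the workers make no assumption about who starts first, since each simply waits on the queue. For the "landing," the main thread sends one sentinel per worker and then joins them all.

```python
import queue
import threading

def worker(tasks, results, lock):
    while True:
        item = tasks.get()        # blocks until work (or a sentinel) arrives
        if item is None:          # sentinel: this thread should finish
            break
        with lock:                # protect the shared results list
            results.append(item * item)

def square_all(n_threads, items):
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()
    threads = [threading.Thread(target=worker, args=(tasks, results, lock))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)           # exactly one sentinel per thread
    for t in threads:
        t.join()                  # wait until every thread has ended
    return sorted(results)
```

If the sentinels were omitted, every worker would block forever in tasks.get() and the join() calls would never return, which is exactly the kind of bad landing described above.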

Thursday, February 21, 3:10 pm

Notes on the homework:

The specs say, "Set up a class walkThread...[that] will contain as class variables the entity that ultimately will contain the result desired by the user..." Please name this walkThread.result, so it will be easier for the TA to grade with his test code.

Wednesday, February 20, 9:45 pm

Several people have asked whether they should assume that the user will do the locking and releasing. This is probably not a very good design; we should not place this burden on the user.

Wednesday, February 20, 6:05 pm

Several people have asked me whether they can use walk() in twalk(). When someone asked me today, I asked why. She replied that her group wants twalk() to first use walk() to generate a list of all the files in the directory tree, then have twalk() go through the list one by one.

Please do not even think of doing this. It's extremely wasteful, and violates the spirit of the material we've been covering, where we've had a lot of emphasis on having only one object in memory at a time. The TA would not give full credit for this, and it would look quite bad in a job interview. I understand why this approach is attractive, but it's not the right path.
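To make the memory point concrete, here is a single-threaded sketch using a generator, so that only one file path exists at a time instead of a full list. It is not twalk() and does not follow the homework specs; iter_files() and count_files() are made-up names, and os.walk() is used here simply as a standard-library tree walker.

```python
import os

def iter_files(root):
    """Yield file paths under root one at a time; no list is ever built."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)

def count_files(root):
    # consumes the generator lazily, one path at a time
    return sum(1 for _ in iter_files(root))
```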

Tuesday, February 19, 8:15 pm

Note carefully: If you miscode the assignment, you may have a situation in which only one thread ends up doing all the work. If so, even if you get the "right" answers, your submission will be considered seriously flawed.

Tuesday, February 19, 7:55 pm

The homework specs say, "And note that the 'hidden' files, .*, will be picked up." Keep this in mind when you are testing your code. It is up to the user as to whether to count the hidden files or not. Of course, if this were to be used in general, you'd probably want to add an argument like useHiddenFiles.

Wednesday, February 13, 11:00 pm

Hwk II is ready!

Tuesday, February 12, 5:35 pm

I've mentioned that you can try the code in our book without having to type it yourself, since you can download the raw files. In the case of the threads chapter, the raw file is elsewhere. If you go to

http://heather.cs.ucdavis.edu/~matloff/Python/PLN/FastLanePython.tex

you'll see this line:

\input ../../158/PLN/PyThreads 

That shows where the file actually resides.

Monday, February 11, 11:00 pm

This Friday, February 15, Ismail and I will switch times. I will lecture during the discussion section, and he will lecture during the lecture time. There will be no quiz this week.

Monday, February 4, 7:55 pm

Homework I news:

Sunday, February 3, 12:00 pm

In the homework specs, it was pointed out that inclusion of powers >= 2 of an indicator variable column would produce structural duplication, and thus such powers must be excluded. That requirement includes excluding monomials like, e.g., x^2 y for an indicator variable x.

Wednesday, January 30, 9:50 pm

As noted the other day, it's important that you understand row- and column-major storage. C uses the former. The code below is an illustration.

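#include <stdio.h>

int x[3][8];  // 3 rows, 8 columns, stored row by row

int main(void) {
   x[2][1] = 12;
   printf("%d\n",x[2][1]);  // prints 12
   int *px = (int *) x;  // view the same memory as a 1-D array
   printf("%d\n",*(px+17));  // prints 12; offset is 2*8 + 1 = 17
   return 0;
}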

The array x here has 3 rows and 8 columns. (Keep in mind that indices start at 0.) We visualize it as a 3x8 grid, but the array is actually stored in 24 consecutive words of memory: first row 0, then row 1 and so on.

The element x[2][1] will thus be the 8+8+2 = 18th word in the array. That in turn is 17 words past the beginning of x.
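The same offset arithmetic can be sketched in Python, with a flat list standing in for the 24 words of memory. The names here are made up for illustration.

```python
NROWS, NCOLS = 3, 8

def row_major_offset(row, col, ncols=NCOLS):
    """Offset of element (row, col) from the start of a row-major array."""
    # skip 'row' complete rows of ncols elements, then move 'col' further
    return row * ncols + col

flat = [0] * (NROWS * NCOLS)          # 24 consecutive "words"
flat[row_major_offset(2, 1)] = 12     # same cell as x[2][1] in the C code
```

Here row_major_offset(2, 1) is 2*8 + 1 = 17, matching the px+17 in the C code.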

In general C code, it's convenient to write two-dimensional arrays as one-dimensional to begin with, e.g.

int x[24];

Monday, January 28, 9:30 pm

I've just added a full example to the homework specs, and folded in my blog posting on the lex order.

Saturday, January 26, 10:10 am

Here are two things you may need to know for future quizzes:

Please see me if you have any questions.

Thursday, January 24, 6:15 pm

The class syllabus is ready.

Thursday, January 24, 2:40 pm

Concerning the homework specs:

Note that you just store the monomials, e.g. xz, not the polynomial itself. The coefficients, e.g. b, c, d etc. in the example, would be supplied by the user, who would form polynomials by taking linear combinations of your output matrix; you are simply setting things up for a potential user.

Suppose the matrix x is data on people, with columns for height, weight and gender. Suppose also that the first person in the data has height 70, weight 160 and is male. Then the first row of x will be (70,160,1), and the first row of the output will be (in the order shown in the polynomial in the specs; see below) (70,160,1,4900,25600,11200,70,160). Note that there are only 8 numbers here instead of 9, as the square of the gender column would be the same vector again, and thus would be a duplicate. Say the second row of x is (68,145,0); then the second row of the output would be (68,145,0,4624,21025,9860,0,0), etc.

You can tell that an input column is an indicator variable if it consists only of 0s and 1s.
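A minimal sketch of that test, with a made-up function name, which also shows why the square of an indicator column is a duplicate:

```python
def is_indicator(column):
    """True if the column consists only of 0s and 1s."""
    return all(v in (0, 1) for v in column)

gender = [1, 0, 1, 1]
height = [70, 68, 72, 65]

# squaring an indicator column reproduces the column itself
gender_squared = [v * v for v in gender]
```

Since 0*0 = 0 and 1*1 = 1, gender_squared equals gender element by element, so adding it to the output would create a duplicate column.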

One thing the specs didn't make clear, though, was the ordering of the columns. Use lexicographic ordering, as follows. Name the input columns u1, u2, u3, ..., uk. Code each factor in a monomial by its exponent. For the case of three input columns and degree 2, for instance, (1,0,1) means u1 u3. Then consider one monomial "less than" another if its exponent code is lexicographically less. So for example (1,0,1) is less than (1,1,0), so in the output the column for u1 u3 would come before the column for u1 u2.
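Python compares tuples lexicographically, so the ordering rule can be sketched as below. This only illustrates the ordering; it ignores indicator variables, and the function name exponent_codes() is made up.

```python
from itertools import product

def exponent_codes(k, degree):
    """Exponent tuples for k columns with total degree 1..degree, in lex order."""
    codes = [c for c in product(range(degree + 1), repeat=k)
             if 1 <= sum(c) <= degree]
    return sorted(codes)   # tuple comparison in Python is lexicographic

codes = exponent_codes(3, 2)
```

For three columns and degree 2 this gives nine codes, and (1,0,1), i.e. u1 u3, comes before (1,1,0), i.e. u1 u2.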

I would suggest that you first solve the problem without worrying about indicator variables.

Wednesday, January 23, 10:30 pm

Homework I is ready!

There will be no quiz this week. Instead, Ismail will give a mini-lecture. (Note that mini-lectures are part of the official course material, eligible for quiz coverage.)

Wednesday, January 23, 9:45 pm

Just finished grading Quiz 1. Your scores are automatically e-mailed to you.

If you do not receive your score, let me know.

Overall, the results on Quiz 1 were quite good. True, the quiz was not very challenging, but I was pleased.

Please note: Make SURE you read the solutions, EVEN IF YOU GOT FULL POINTS ON A PROBLEM. Some students gave highly roundabout answers that, though correct, did not take advantage of Python's features.

Wednesday, January 23, 9:00 pm

When you call sqrt() in whatever language, does it print the result? No. Why not? Some students in the quiz had print statements for the results of their code, rather than return. Please think about why the latter is appropriate and the former is inappropriate.
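A small sketch of the difference, with made-up function names. The version that prints leaves its caller with nothing to work with:

```python
def mean_printing(xs):
    print(sum(xs) / len(xs))     # shows the value, but returns None

def mean_returning(xs):
    return sum(xs) / len(xs)     # the caller decides what to do with it

# the returned value can be used in further computation
doubled = 2 * mean_returning([1, 2, 3])
```

By contrast, 2 * mean_printing([1, 2, 3]) would fail with a TypeError, since the function's value is None.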

Tuesday, January 22, 10:55 pm

The book mentions that class instances are implemented as dictionaries. Here is a little example:

>>> class z:
...    def __init__(self):
...       self.x = 8
...    def printx(self):
...       print self.x
... 
>>> a = z()
>>> a.x
8
>>> a.__dict__
{'x': 8}
>>> a.__dict__['y'] = 12
>>> a.__dict__['x'] = 88
>>> a.x
88
>>> a.y
12

Tuesday, January 22, 10:55 pm

Sorry for the delay in grading Quiz 1. I got 1/3 of the way through when I ran into server problems.

Should get that grading, plus Homework I, ready by tomorrow night, Thursday at the latest.

Friday, January 18, 4:20 pm

Apparently some students used their tablets to view their PDF of the book in today's quiz. I never authorized this. On the contrary, NO electronic devices other than your laptop are allowed in quizzes. I did say use of tablets is fine in lecture. OMSI has a feature in which you can view your PDF of the book.

Thursday, January 17, 10:40 am

There has been NO homework assignment yet. (There was one left over from last year, now removed.) As stated a few times in class, homework will be announced here on the blog, which will be soon.

Monday, January 14, 1:20 pm

I mentioned in class that the split() method can optionally take any character as a delimiter, as opposed to the default, which is whitespace. I thought that was in the book somewhere, but I don't see it, so here is an example:

>>> z = 'abc de f'
>>> z.split()
['abc', 'de', 'f']
>>> y = 'abc.de.f'
>>> y.split()
['abc.de.f']
>>> y.split('.')
['abc', 'de', 'f']

Note that, per the material on debugging at the end of the book, there are online "man pages," which you could use to see the details, e.g. help(str.split) or help(z.split).

Tuesday, January 8, 4:25 pm

Ground rules for use of programming language constructs: On quizzes, you are allowed to use only the constructs we've covered in the course so far. In homework, you are allowed to use any constructs from the official language. If you wish to make use of libraries, you need special permission, and any usage of ideas from the Web etc. must be credited.

Tuesday, January 8, 4:15 pm

We need to form Groups. Recall that these are used in the homework, the term project and the group quiz(zes). Group size is 3 or 4. (Those on the waiting list cannot yet join a group.)

If you wish to form your own group, send the group information to Ismail (he goes by his surname) by MONDAY, January 14. Otherwise, he will assign you to a group.

Monday, January 7, 9:00 pm

One slight change to my office hours: I will be there for sure M 4:30-6:00. Will stay later if students are still around at 6.

Our TA's OHs are Th 8:30-10 a.m., F 8-9:30 am, Kemper 55.

Tuesday, January 8, 12:05 am

Our TA is S. Ismail, sismail@ucdavis.edu.