1.  Say within each department all students have the same weight.  Then
Var(weight | dept) = 0, so the first term, E[Var(weight | dept)], would
be 0.  Yet Var(weight) would be nonzero, since weights would vary from
one department to another.
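
A quick numerical check of this argument, as a sketch on made-up data (the department labels, group sizes and weights below are hypothetical, not from the problem):

```r
# hypothetical data: 3 departments, all students in a department weigh the same
dept   <- rep(c("A", "B", "C"), times = c(4, 3, 5))
weight <- rep(c(150, 165, 180), times = c(4, 3, 5))

# population (divide-by-n) variance, the version the Law of Total Variance uses
popvar <- function(x) mean((x - mean(x))^2)

withinVars <- as.vector(tapply(weight, dept, popvar))  # all 0
deptMeans  <- as.vector(tapply(weight, dept, mean))
deptProps  <- as.vector(table(dept)) / length(dept)

# E[Var(weight | dept)]
firstTerm  <- sum(deptProps * withinVars)
# Var[E(weight | dept)]
secondTerm <- sum(deptProps * deptMeans^2) - sum(deptProps * deptMeans)^2

print(c(firstTerm, secondTerm, popvar(weight)))
```

The first term is 0, and the second term alone accounts for all of Var(weight).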

2.  

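probs <- dbinom(0:2,2,0.2)  # binomial(2, 0.2) probabilities of 0, 1, 2

probs01 <- probs[1:2] / sum(probs[1:2])  # condition on the value being 0 or 1

sum(probs01 * 0:1)  # conditional expected value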
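
The same quantity can be checked by simulation. This is only a sketch: it assumes the problem asks for the expected value of a binomial(2, 0.2) variable given that it is at most 1, which is what the code above computes. The seed and the number of draws are arbitrary.

```r
set.seed(1)                    # arbitrary seed, for reproducibility only
x <- rbinom(100000, 2, 0.2)    # simulated binomial(2, 0.2) values

simMean <- mean(x[x <= 1])     # average over the draws that came out 0 or 1

# exact conditional mean, computed the same way as in the solution
exact <- sum(dbinom(0:1, 2, 0.2) * 0:1) / sum(dbinom(0:1, 2, 0.2))

print(c(simMean, exact))
```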

3.  Let C = 1 or 2 denote which coin is chosen, 1 for the heads-weighted
coin and 2 for the other.  Given C, N is binomially distributed with
n = 2 trials and p = 0.9 or 0.1, depending on C.

headsProbs <- c(0.9,0.1)

n <- 2

expectNgivenC <- n * headsProbs  # the 2 conditional E()s

varNgivenC <- n * headsProbs * (1-headsProbs)  # the 2 conditional Var()s

# apply Law of Tot. Expect.

eN <- mean(expectNgivenC)  # plain mean() call, since C = 1,2 with equal probs

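# apply Law of Tot. Var.; var() divides by k-1 instead of k, so need to
# tack on a (k-1)/k factor (the (n-1)/n factor explained in blog, 9/27, 1505);
# k is the number of coins, which only by coincidence equals n here

k <- length(expectNgivenC)

varN <- mean(varNgivenC) + var(expectNgivenC) * ((k-1)/k)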

print(c(eN,varN))
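
As a sanity check on the two Laws, one can simulate the experiment directly. This sketch assumes each coin is chosen with probability 0.5 and then tossed twice; nreps and the seed are arbitrary choices, and the names coin and heads are just for illustration.

```r
set.seed(1)                                   # arbitrary seed
nreps <- 100000                               # number of simulated experiments

coin  <- sample(1:2, nreps, replace = TRUE)   # C: which coin is chosen
heads <- rbinom(nreps, 2, c(0.9, 0.1)[coin])  # N: heads in 2 tosses of that coin

simMean <- mean(heads)
simVar  <- mean((heads - simMean)^2)          # divide-by-n variance

print(c(simMean, simVar))
```

The simulated mean and variance should come out near 1 and 0.82, the exact values of E(N) and Var(N).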

4.  The Law of Total Variance says that

Var(wageinc) = E[Var(wageinc | occ)] + Var[E(wageinc | occ)]

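# E(wageinc | occ) for the 6 occupations
means <- c(50396.47,51373.53,68797.72,53639.86,67019.26,69494.44)

# Var(wageinc | occ) for the 6 occupations
vars <- c(2314077046,1822538680,2357274094,1576480779,3312502360,2732145307)

# P(occ), the proportion of workers in each occupation
props <- c(0.22857143,0.22389248,0.33947237,0.02493778,0.03992036,0.14320557)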

wtdmean <- function(p,x) sum(p * x)

expectWginc <- wtdmean(props,means)

wtdvar <- function(p,x) wtdmean(p,x^2) - (wtdmean(p,x))^2

# E[Var(wageinc | occ)] + Var[E(wageinc | occ)]
varWginc <- wtdmean(props,vars) + wtdvar(props,means)

print(varWginc)
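
The wtdvar() helper uses the shortcut Var(X) = E(X^2) - (EX)^2. A small sketch on made-up numbers (the p and x below are hypothetical, not the wage data) shows that it agrees with the definition E[(X - EX)^2]:

```r
# hypothetical probabilities and values, for illustration only
p <- c(0.2, 0.5, 0.3)
x <- c(10, 20, 40)

wtdmean <- function(p,x) sum(p * x)
wtdvar <- function(p,x) wtdmean(p,x^2) - (wtdmean(p,x))^2

# variance straight from the definition
direct <- sum(p * (x - wtdmean(p,x))^2)

print(c(wtdvar(p,x), direct))
```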