Description
1. [2 points] Adapted from Exercise 2.3 of FCMA p.90:

Let Y be a random variable that can take any non-negative integer value. The likelihood of these outcomes is given by the Poisson pmf (probability mass function):

$$p(y) = \frac{\lambda^{y}}{y!} e^{-\lambda} \qquad (1)$$

By using the fact that for a discrete random variable the pmf gives the probabilities of the individual events occurring and the probabilities are additive…

(a) Compute the probability that Y ≤ 6 for λ = 8, i.e., P(Y ≤ 6). Write a (very!) short python script to compute this value, and include a listing of the code in your solution.

(b) Using the result of (a) and the fact that one outcome has to happen, compute the probability that Y > 6.

Solution.

(a) Code Listing 1: poisson.py script

    #!/usr/bin/python
    import math

    def poisson_probability(y, lamb):
        # Cumulative probability P(Y <= y): sum the Poisson pmf over i = 0..y.
        total = 0.0
        for i in range(y + 1):
            total += (math.pow(lamb, i) * math.exp(-lamb)) / math.factorial(i)
        return total

    y = 6
    lamb = 8
    prob = poisson_probability(y, lamb)
    # Python 2 print statement, matching the run shown below.
    print "The Poisson probability is", prob

    [emanuel@localhost submit]$ python poisson.py
    The Poisson probability is 0.313374277536

From the output above, P(Y ≤ 6) ≈ 0.3133743.

(b) P(Y > 6) = 1 − P(Y ≤ 6) = 1 − 0.3133743 = 0.6866257

2. [3 points] Adapted from Exercise 2.4 of FCMA p.90:

Let X be a random variable with uniform density, $p(x) = \mathcal{U}(a, b)$. Derive $\mathbf{E}_{p(x)}\{1 + 0.1x + 0.5x^{2} + 0.05x^{3}\}$. Work out analytically $\mathbf{E}_{p(x)}\{1 + 0.1x + 0.5x^{2} + 0.05x^{3}\}$ for a = −10, b = 5 (show the steps).

The script approx_expected_value.py demonstrates how you use random samples to approximate an expectation, as described in Section 2.5.1 of the book. The script estimates the expectation of the function $y^{2}$ when $Y \sim \mathcal{U}(0, 1)$ (that is, y is uniformly distributed between 0 and 1). The script shows a plot of how the estimate improves as larger samples are considered, up to 100 samples.
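The sample-based approximation described here can be sketched as follows. This is a minimal illustration and not the provided script: the helper name `running_mean_estimates`, the fixed seed, and the use of the standard-library `random` module are assumptions made for the example, and plotting is left out.

```python
import random


def running_mean_estimates(f, sampler, n_samples, seed=0):
    """Return the running sample-mean estimate of E[f(Y)] after 1..n_samples draws.

    f        -- function whose expectation is approximated
    sampler  -- callable taking a random.Random instance and returning one draw
    seed     -- fixed seed (an assumption here) so the run is repeatable
    """
    rng = random.Random(seed)
    total = 0.0
    estimates = []
    for n in range(1, n_samples + 1):
        total += f(sampler(rng))
        estimates.append(total / n)  # estimate using the first n samples
    return estimates


# Approximate E[y^2] for Y ~ U(0, 1) using up to 100 samples.
estimates = running_mean_estimates(
    f=lambda y: y ** 2,
    sampler=lambda rng: rng.random(),  # one draw from U(0, 1)
    n_samples=100,
)
print("Estimate after 100 samples:", estimates[-1])
```

Plotting the `estimates` list against the sample count, with a horizontal line at the true value 1/3, gives the kind of convergence plot the script description refers to.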
Modify the script approx_expected_value.py to compute a sample-based approximation to the expectation of the function $1 + 0.1x + 0.5x^{2} + 0.05x^{3}$ when $X \sim \mathcal{U}(-10, 5)$ and observe how the approximation improves with the number of samples drawn. Include a plot showing the evolution of the approximation, relative to the true value, over 3,000 samples.

Solution.

3. [3 points] Adapted from Exercise 2.5 of FCMA p.91:

Assume that $p(\mathbf{w})$ is the Gaussian pdf for a D-dimensional vector $\mathbf{w}$ given by

$$p(\mathbf{w}) = \frac{1}{(2\pi)^{D/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{w} - \boldsymbol{\mu})^{\top} \Sigma^{-1} (\mathbf{w} - \boldsymbol{\mu}) \right\}.$$

By expanding the vector notation and re-arranging, show that using $\Sigma = \sigma^{2} \mathbf{I}$ as the covariance matrix assumes independence of the D elements of $\mathbf{w}$. You will need to be aware that the determinant of a matrix that only has entries on the diagonal ($|\sigma^{2} \mathbf{I}|$) is the product of the diagonal values and that the inverse of the same matrix is constructed by simply inverting each element on the diagonal. (Hint: a product of exponentials can be expressed as an exponential of a sum. Also, just a reminder that $\exp\{x\}$ is $e^{x}$.)

Solution.

4. [2 points; Required only for Graduates] Adapted from Exercise 2.6 of FCMA p.91:

Using the same setup as in Problem 3, see what happens if we use a diagonal covariance matrix with different elements on the diagonal, i.e.,

$$\Sigma = \begin{bmatrix} \sigma_{1}^{2} & 0 & \cdots & 0 \\ 0 & \sigma_{2}^{2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{D}^{2} \end{bmatrix}$$

Solution.

5. [4 points] Adapted from Exercise 2.9 of FCMA p.91:

Assume that a dataset of N binary values, $x_{1}, \ldots, x_{N}$, was sampled from a Bernoulli distribution, and each sample $x_{i}$ is independent of any other sample. Explain why this is not a Binomial distribution. Derive the maximum likelihood estimate for the Bernoulli parameter.

Solution.

6. [3 points] Adapted from Exercise 2.12 of FCMA p.91:

Familiarize yourself with the provided script predictive_variance_example.py. When you run it, it will generate a dataset and then remove all values for which −2 ≤ x ≤ 2.
Observe the effect this has on the predictive variance in this range. Plot (a) the data, (b) the error bar plots for model orders 1, 3, 5 and 9, and (c) the sampled functions for model orders 1, 3, 5 and 9. You will plot a total of 9 figures. Include a caption for each figure that qualitatively describes what the figure shows. Also, clearly explain what removing the points has done in contrast to when they are left in.

Solution.

7. [5 points] In this exercise, you will create a simple demonstration of how model bias impacts variance, similar to the demonstration in class.

Using the same true model as in the script predictive_variance_example.py, that is $t = 1 + 0.1x + 0.5x^{2} + 0.05x^{3}$, generate 20 data sets, each consisting of 25 samples from the true function (using the same range of x ∈ [−12.0, 5.0]). Then, create a separate plot for each of the model polynomial orders 1, 3, 5 and 9, in which you plot the true function in red along with the best-fit function of that model order for each of the 20 data sets. You will therefore produce four plots. The first will be for model order 1 and will include the true model plotted in red and 20 curves, one order-1 best-fit model for each of the 20 data sets. The second plot will repeat this for model order 3, and so on. You can use any of the code in the script predictive_variance_example.py as a guide. Describe what happens to the variance in the functions as the model order is changed. (Tips: plot the true function curve last, so it is plotted on top of the others; also, use linewidth=3 in the plot function to increase the line width and make the curve stand out more.)

Solution.

8. [3 points; Required only for Graduates] Adapted from Exercise 2.13 of FCMA p.92:

Compute the Fisher Information Matrix for the parameter of a Bernoulli distribution.

Solution.
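For Problem 8, a derived closed form can be sanity-checked numerically. The sketch below uses the standard definition of the Fisher information as the expected squared score, $\mathbf{E}\{(\partial \log p(x \mid \theta) / \partial \theta)^{2}\}$, with the derivative approximated by a central difference. It does not carry out the derivation. The function names and the step size `eps` are assumptions made for this example.

```python
import math


def bernoulli_log_pmf(x, theta):
    """Log-probability of a single binary outcome x under Bernoulli(theta)."""
    return x * math.log(theta) + (1 - x) * math.log(1.0 - theta)


def numerical_score(x, theta, eps=1e-6):
    """Central-difference approximation to d/dtheta of log p(x | theta)."""
    upper = bernoulli_log_pmf(x, theta + eps)
    lower = bernoulli_log_pmf(x, theta - eps)
    return (upper - lower) / (2.0 * eps)


def fisher_information(theta, eps=1e-6):
    """Expected squared score, summed exactly over the two outcomes x in {0, 1}."""
    total = 0.0
    for x in (0, 1):
        prob = math.exp(bernoulli_log_pmf(x, theta))  # p(x | theta)
        total += prob * numerical_score(x, theta, eps) ** 2
    return total


# Evaluate at a few parameter values to compare against a hand-derived formula.
for theta in (0.2, 0.5, 0.8):
    print(theta, fisher_information(theta))
```

Because a Bernoulli distribution has a single parameter, the Fisher Information Matrix here is a 1 × 1 quantity, so a scalar check of this kind is sufficient.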