Description
Task 1
15 points.
In a certain probability problem, we have 11 variables: A, B1, B2, …, B10.
Variable A has 8 possible values.
Each of the variables B1, …, B10 has 5 possible values. Each Bi is conditionally independent of the other 9 variables Bj (with j != i) given A.
Based on these facts:
Part a: How many numbers do you need to store in the joint distribution table of these 11 variables?
Part b: What is the most space-efficient representation (in terms of how many numbers you need to store) for the joint probability distribution of these 11 variables? How
many numbers do you need to store in your solution? Your answer should work for any variables satisfying the assumptions stated above.
Part c: Does this scenario follow the Naive Bayes model?
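As a sanity check on the counting in parts a and b, the two representations can be compared directly. The sketch below is not part of the required submission; it simply tallies free parameters under the stated assumptions (8 values for A, 5 values for each Bi, conditional independence of the Bi given A).

```python
# Counting stored numbers for the two representations in Task 1.
a_values = 8     # number of values of A
b_values = 5     # number of values of each Bi
n_b = 10         # number of B variables

# Full joint table: one entry per value combination; since all entries
# sum to 1, the last entry can be derived from the rest.
joint_entries = a_values * b_values ** n_b - 1

# Factored form P(A) * prod_i P(Bi | A): P(A) needs a_values - 1 free
# numbers, and each conditional table P(Bi | A) needs
# a_values * (b_values - 1) free numbers.
factored_entries = (a_values - 1) + n_b * a_values * (b_values - 1)

print(joint_entries, factored_entries)
```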
Task 2
30 points.
As in the slides we saw in class (Prior and Posterior Probabilities), there are five types of bags of candies. Each bag contains an infinite number of candies. We have one of those
bags, and we are picking candies out of it. We don't know what type of bag we have, so we want to figure out the probability of each type based on the candies that we have
picked.
The five possible hypotheses for our bag are:
h1 (prior: 10%): This type of bag contains 100% cherry candies.
h2 (prior: 20%): This type of bag contains 75% cherry candies and 25% lime candies.
h3 (prior: 40%): This type of bag contains 50% cherry candies and 50% lime candies.
h4 (prior: 20%): This type of bag contains 25% cherry candies and 75% lime candies.
h5 (prior: 10%): This type of bag contains 100% lime candies.
Given the following sequences of observations, show how the posterior probabilities change.
a. CCCCCL
b. CLCLCL
c. CCCLLL
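The update used in the slides is a standard Bayes update: after each candy, multiply each hypothesis's current probability by the likelihood of the observed candy under that hypothesis, then renormalize. A minimal sketch (not part of the required hand-in) of that loop:

```python
# Bayes update over the five candy-bag hypotheses h1..h5.
priors = [0.10, 0.20, 0.40, 0.20, 0.10]    # P(h1), ..., P(h5)
p_cherry = [1.00, 0.75, 0.50, 0.25, 0.00]  # P(cherry | hi)

def posteriors(observations):
    """observations: string of 'C' (cherry) and 'L' (lime).
    Returns the list of posterior distributions after each candy."""
    probs = list(priors)
    history = []
    for candy in observations:
        # Likelihood of this candy under each hypothesis.
        likelihoods = [p if candy == 'C' else 1 - p for p in p_cherry]
        probs = [pr * lk for pr, lk in zip(probs, likelihoods)]
        total = sum(probs)
        probs = [p / total for p in probs]  # renormalize
        history.append(probs)
    return history
```

For example, `posteriors("CCCCCL")` returns six distributions, one after each observed candy.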
Task 3
10 points.
George doesn’t watch much TV in the evening, unless there is a baseball game on. When there is baseball on TV, George is very likely to watch. George has a cat that he feeds
most evenings, although he forgets every now and then. He’s much more likely to forget when he’s watching TV. He’s also very unlikely to feed the cat if he has run out of cat
food (although sometimes he gives the cat some of his own food). Design a Bayesian network for modeling the relations between these four events:
baseball_game_on_TV
George_watches_TV
out_of_cat_food
George_feeds_cat
Your task is to connect these nodes with arrows pointing from causes to effects. No programming is needed for this part, just include an electronic document (PDF, Word file, or
OpenOffice document) showing your Bayesian network design.
Task 4
10 points.
For the Bayesian network of Task 3, the text file at this link contains training data from every evening of an entire year. Every line in this text file corresponds to an evening, and
contains four numbers. Each number is a 0 or a 1. In more detail:
The first number is 0 if there is no baseball game on TV, and 1 if there is a baseball game on TV.
The second number is 0 if George does not watch TV, and 1 if George watches TV.
The third number is 0 if George is not out of cat food, and 1 if George is out of cat food.
The fourth number is 0 if George does not feed the cat, and 1 if George feeds the cat.
Based on the data in this file, determine the probability table for each node in the Bayesian network you have designed for Task 3. You need to include these four tables in the
drawing that you produce for Task 3. You also need to submit the code/script that computes these probabilities.
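Each table entry is just a relative frequency over the matching lines of the data file. A hypothetical sketch for one node, P(George_watches_TV | baseball_game_on_TV), is shown below; the file name "training_data.txt" and the choice of parent are assumptions, so adapt both to your own Task 3 design.

```python
from collections import Counter

def cpt_watch_given_game(rows):
    """rows: list of (game, watch, out_of_food, feeds) tuples of 0/1.
    Returns P(watch = 1 | game = g) for g in {0, 1} by relative frequency."""
    counts = Counter()
    for game, watch, _, _ in rows:
        counts[(game, watch)] += 1
    return {g: counts[(g, 1)] / (counts[(g, 0)] + counts[(g, 1)])
            for g in (0, 1)}

# Reading the data file might look like (file name is a placeholder):
# rows = [tuple(map(int, line.split())) for line in open("training_data.txt")]
```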
Task 5
8 points.
Given the network in Task 4, calculate P(not(George_feeds_cat) | baseball_game_on_TV) using Inference by Enumeration.
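Inference by enumeration sums the joint probability over all assignments of the hidden variables consistent with the evidence. The sketch below uses placeholder CPT numbers, not the values from the Task 4 data file, and assumes the network structure game -> watch, (watch, food) -> feeds; substitute your own tables and structure.

```python
import itertools

# Placeholder probabilities -- NOT the values learned in Task 4.
p_food = 0.1                    # P(out_of_cat_food = 1)
p_watch = {0: 0.2, 1: 0.9}      # P(watch = 1 | game)
p_feeds = {(0, 0): 0.9, (0, 1): 0.3,
           (1, 0): 0.6, (1, 1): 0.1}  # P(feeds = 1 | watch, food)

def p_not_feeds_given_game(game=1):
    """P(feeds = 0 | game) by summing over hidden variables watch, food."""
    num = den = 0.0
    for watch, food in itertools.product((0, 1), repeat=2):
        pw = p_watch[game] if watch else 1 - p_watch[game]
        pf = p_food if food else 1 - p_food
        joint = pw * pf
        den += joint
        num += joint * (1 - p_feeds[(watch, food)])
    return num / den
```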
Task 6
12 points.
Figure 1: Yet another Bayesian Network.
Part a: On the network shown in Figure 1, what is the Markov blanket of node L?
Part b: On the network shown in Figure 1, what is P(C, H)? How is it derived?
Part c: On the network shown in Figure 1, what is P(O | not(J), E)? How is it derived?

