Description
Logic-based and Bayesian Inference
PART A. Due April 17.
Question 1: [10 points] Consider the following Bayesian network, where variables A through E
are all Boolean valued:
a) What is the probability that all five of these Boolean variables are simultaneously true?
[Hint: You have to compute the joint probability distribution (JPD). The structure of the
Bayesian network suggests how the JPD is decomposed to the conditional probabilities available.]
b) What is the probability that all five of these Boolean variables are simultaneously false?
[Hint: Answer similarly to above.]
c) What is the probability that A is false given that the four other variables are all known to be
true?
[Hint: Try to use the definition of the conditional probability and the structure of the network.
For probabilities that cannot be computed directly from the network, remember the following
normalization trick: if P(x) = α · f(x) and P(¬x) = α · f(¬x), then you can compute the
normalization factor as α = 1.0 / (f(x) + f(¬x)), since P(x) + P(¬x) = 1.0.]
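As a quick numeric illustration of the hint's normalization trick (the values below are placeholders, not this network's actual entries):

```python
# Normalization trick from the hint. The unnormalized values f(x) and
# f(not x) are placeholders; in the assignment they come from summing
# joint-probability entries of the network.
f_x, f_not_x = 0.03, 0.01

alpha = 1.0 / (f_x + f_not_x)                  # normalization factor
p_x, p_not_x = alpha * f_x, alpha * f_not_x    # now p_x + p_not_x == 1
```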
Question 2: [10 points] For this problem, check the Variable Elimination algorithm in your book.
Also consider the Bayesian network from the “burglary” example.
a) Apply variable elimination to the query:
P(Burglary | JohnCalls = true, MaryCalls = true)
and show in detail the calculations that take place. Use your book to confirm that your answer
is correct.
b) Count the number of arithmetic operations performed (additions, multiplications, divisions),
and compare it against the number of operations performed by the tree enumeration algorithm.
c) Suppose a Bayesian network has the form of a chain: a sequence of Boolean variables
X1, . . . , Xn where Parents(Xi) = {Xi−1} for i = 2, . . . , n. What is the complexity of computing P(X1 | Xn = true) using enumeration? What is the complexity with variable elimination?
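As a sanity check for part a), the query can also be evaluated by plain enumeration over the hidden variables, using what I believe are the standard textbook CPT values for the burglary network (verify them against your book's figure before relying on the number):

```python
from itertools import product

# Standard textbook CPTs for the burglary network (check against your book).
P_B, P_E = 0.001, 0.002                       # P(Burglary), P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(Alarm | B, E)
P_J = {True: 0.90, False: 0.05}               # P(JohnCalls | Alarm)
P_M = {True: 0.70, False: 0.01}               # P(MaryCalls | Alarm)

def joint(b, e, a, j, m):
    """Chain-rule factorization of the full joint distribution."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

# Sum out the hidden variables E and A, with the evidence j = m = true.
f = {b: sum(joint(b, e, a, True, True)
            for e, a in product([True, False], repeat=2))
     for b in (True, False)}
p_burglary = f[True] / (f[True] + f[False])   # P(Burglary | j, m)
```

This is the enumeration baseline, not variable elimination itself; counting its multiplications and additions gives the comparison figure asked for in part b).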
Question 3: [15 points]
In this problem, you will implement two kinds of sampling techniques (rejection sampling and
likelihood weighting) to perform approximate inference on the given Bayesian network.
a) Compute the probabilities P(d | c), P(b | c), and P(d | ¬a, b) by enumeration. Then, employ
rejection sampling and likelihood weighting to approximate them. Use 1000 samples, and
document your results. How do the approximations and the actual values compare?
b) Next, we focus on P(d|c). We know that the accuracy of the approximation depends on the
number of samples used. For each of the sampling methods, plot the probability as a function
of the number of samples used. Do you notice large divergences in the convergence rates
among the methods?
c) Construct a query on this Bayes Net such that the convergence and effectiveness of rejection
sampling is noticeably worse than that of likelihood weighting. List the query you are using,
and provide the probability plot as a function of samples used. Why is it the case that rejection
sampling is noticeably worse for this query?
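Since the assignment's network and CPTs are given only in its figure, here is a minimal sketch of the two samplers on a hypothetical two-node network C → D with placeholder probabilities; the skeletons carry over directly:

```python
import random

# Hypothetical two-node network C -> D with placeholder CPTs; substitute
# the assignment's actual network and tables from its figure.
P_C = 0.5
P_D = {True: 0.9, False: 0.2}           # P(d | c) and P(d | not c)

def rejection_sampling(n, rng):
    """Estimate P(d | c): discard every sample whose evidence c is false."""
    kept = hits = 0
    for _ in range(n):
        c = rng.random() < P_C          # sample C from its prior
        if not c:
            continue                    # rejected: evidence mismatch
        d = rng.random() < P_D[c]       # sample D given the sampled C
        kept += 1
        hits += d
    return hits / kept

def likelihood_weighting(n, rng):
    """Estimate P(d | c): clamp c = true and weight each sample by P(c)."""
    num = den = 0.0
    for _ in range(n):
        w = P_C                         # weight contributed by the evidence
        d = rng.random() < P_D[True]    # sample only the non-evidence node
        num += w * d
        den += w
    return num / den

est_r = rejection_sampling(1000, random.Random(0))
est_w = likelihood_weighting(1000, random.Random(1))
```

Note that rejection sampling discards roughly half of its samples here (whenever the evidence c comes out false); part c) asks you to find a query for which this waste is far more severe.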
Question 4: [15 points]
You are an interplanetary search and rescue expert who has just received an urgent message:
a rover on Mercury has fallen and become trapped in Death Ravine, a deep, narrow gorge on the
borders of enemy territory. You zoom over to Mercury to investigate the situation. Death Ravine
is a narrow gorge 6 miles long, as shown below. There are volcanic vents at locations A and D,
indicated by the triangular symbols at those locations.
[Figure: Death Ravine as a row of six cells, A through F, with triangular vent symbols at A and D.]
The rover was heavily damaged in the fall, and as a result, most of its sensors are broken.
The only ones still functioning are its thermometers, which register only two levels: hot and cold.
The rover sends back evidence E = hot when it is at a volcanic vent (A and D), and E = cold
otherwise. There is no chance of a mistaken reading. The rover fell into the gorge at position A
on day 1, so X1 = A. Let the rover’s position on day t be Xt ∈ {A, B, C, D, E, F}. The rover is still
executing its original programming, trying to move 1 mile east (i.e. right, towards F) every day.
However, because of the damage, it only moves east with probability 0.80, and it stays in place
with probability 0.20. Your job is to figure out where the rover is, so that you can dispatch your
rescue-bot.
a) Filtering: Three days have passed since the rover fell into the ravine. The observations were
(E1 = hot, E2 = cold, E3 = cold). What is P(X3 | hot1, cold2, cold3), the probability distribution
over the rover’s position on day 3, given the observations? (This is a probability distribution
over the six possible positions).
b) Smoothing: What is P(X2 | hot1, cold2, cold3), the probability distribution over the rover’s
position on day 2, given the observations? (This is a probability distribution over the six
possible positions).
c) Most Likely Explanation: What is the most likely sequence of the rover’s positions in the three
days given the observations (E1 = hot, E2 = cold, E3 = cold)?
d) Prediction: What is P(hot4, hot5, cold6 | hot1, cold2, cold3), the probability of observing hot4,
hot5, and cold6 on days 4, 5, and 6 respectively, given the previous observations on days 1, 2,
and 3? (This is a single value, not a distribution).
e) Prediction: You decide to attempt to rescue the rover on day 4. However, the transmission
of E4 seems to have been corrupted, and so it is not observed. What is the rover’s position
distribution for day 4 given the same evidence, P(X4 | hot1, cold2, cold3)?
The same thing happens again on day 5. What is the rover’s position distribution for day 5, P(X5 | hot1, cold2, cold3)?
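The filtering computation in part a) is mechanical enough to sketch in code. The model below follows the text (0.8 east / 0.2 stay, deterministic thermometer), with one assumption of my own: at the easternmost cell F the rover simply stays put.

```python
# Forward (filtering) algorithm for the rover HMM.
STATES = list("ABCDEF")
VENTS = {"A", "D"}                      # E = hot exactly at the vents

def predict(belief, p_east=0.8):
    """Transition update: move one cell east w.p. 0.8, stay w.p. 0.2.
    Assumption: at F there is nowhere further east, so the rover stays."""
    out = [0.0] * len(belief)
    for i, p in enumerate(belief):
        if i + 1 < len(belief):
            out[i + 1] += p_east * p
            out[i] += (1 - p_east) * p
        else:
            out[i] += p
    return out

def update(belief, reading):
    """Observation update with the deterministic sensor, then normalize."""
    w = [p if (STATES[i] in VENTS) == (reading == "hot") else 0.0
         for i, p in enumerate(belief)]
    z = sum(w)
    return [x / z for x in w]

def filter_rover(readings):
    belief = [1.0, 0, 0, 0, 0, 0]       # X1 = A is given
    belief = update(belief, readings[0])
    for r in readings[1:]:
        belief = update(predict(belief), r)
    return belief
```

The prediction in part e) is the same `predict` step applied without a subsequent `update`, since E4 is not observed.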
Question 5: [30 points]
H H T
N N N
N B H
Consider that you are placed in an unknown cell of the above 3 × 3 map, i.e., initially there is a
probability P(x0) = 1/8 of being located in any of the eight non-blocked cells. We denote as (1, 2)
the coordinates of the top middle cell, and as (2, 3) the coordinates of the rightmost middle cell, where:
• N is a normal cell;
• H is a highway cell;
• T is a hard to traverse cell;
• B is a blocked cell.
You are able to move inside this world by executing actions a ∈ {Up, Left, Down, Right}. You are
also equipped with a sensor that reports the terrain type of the cell you occupy after each attempted
move. Your objective is to figure out your location inside this world, given that you initially had
no idea where you were located.
Let us denote as P(xi | xi−1, a) the transition model for moving from cell xi−1 at step (i − 1) to cell
xi at step i, given an action a. If, given your location xi−1 and action a, you would hit the boundary
of this grid world or a blocked cell, then you stay in the same cell, i.e., xi = xi−1. The model is
probabilistic because our motions are not executed perfectly inside this grid world. In particular,
90% of the time the action is executed correctly, but 10% of the time the action fails and the agent
stays in the same cell. So, for instance, if you execute action Up from cell (2, 2), with 90% probability
you move to cell (1, 2) and with 10% probability you stay at cell (2, 2). If you are at cell (3, 1) and
move Right, then with 100% probability you stay at the same cell (because cell (3, 2) is a blocked
cell).
Furthermore, denote as P(ei | xi) the observation model for detecting terrain type, where ei
is the observed terrain type for cell xi. The terrain sensor is correct 90% of the time, but with
probability 5% it can return either of the other two terrain types (the sensor never returns “blocked
cell”, as you never occupy one). For instance, if you are located at cell (2, 2), your terrain sensor
returns the value N for “normal cell” with probability 90%, the value H for “highway cell” with
probability 5%, and the value T for “hard-to-traverse cell” with probability 5%.
Step A: You are asked to compute the probability of where you are inside this grid world after
you execute the actions {Right, Right, Down, Down}, where the corresponding sensor readings are
{N, N, H, H} (you sense after you move). You are encouraged to do this both by hand and through
a program, so as to be able to debug your solution. Submit in your report the
probabilities of where you are inside the world after each action/sensor-reading pair, assuming
that in the beginning you have no knowledge of where you are located. This means that you need
to provide four 3 × 3 maps, each with a set of eight probabilities indicating where you are inside
this grid world after each action/sensing pair.
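A sketch of one filtering update for this grid world, directly encoding the transition and observation models above (the dictionary representation and helper names are my own):

```python
# One filtering pass for the 3x3 grid world of Step A.
TERRAIN = {(1, 1): "H", (1, 2): "H", (1, 3): "T",
           (2, 1): "N", (2, 2): "N", (2, 3): "N",
           (3, 1): "N", (3, 3): "H"}            # (3, 2) is blocked
MOVES = {"Up": (-1, 0), "Down": (1, 0), "Left": (0, -1), "Right": (0, 1)}

def move(cell, action):
    """Intended successor; stay put on the boundary or a blocked cell."""
    dr, dc = MOVES[action]
    nxt = (cell[0] + dr, cell[1] + dc)
    return nxt if nxt in TERRAIN else cell

def step(belief, action, reading):
    # Transition update: 90% the action is executed, 10% the agent stays.
    pred = {cell: 0.0 for cell in TERRAIN}
    for cell, p in belief.items():
        pred[move(cell, action)] += 0.9 * p
        pred[cell] += 0.1 * p
    # Observation update: sensor right 90% of the time, 5% per wrong type.
    post = {cell: p * (0.9 if TERRAIN[cell] == reading else 0.05)
            for cell, p in pred.items()}
    z = sum(post.values())
    return {cell: p / z for cell, p in post.items()}

belief = {cell: 1 / 8 for cell in TERRAIN}      # uniform over the 8 free cells
for a, e in zip(["Right", "Right", "Down", "Down"], ["N", "N", "H", "H"]):
    belief = step(belief, a, e)
```

Each pass through the loop yields one of the four 3 × 3 probability maps requested.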
Step B: The next objective is to scale up the size of the filtering problems that you can solve.
For this question, you will use large maps (e.g., 100 by 50), where you randomly assign a
terrain type to each cell out of the four possible types (50% normal cells, 20% highway
cells, 20% hard-to-traverse cells, and 10% blocked cells). First, generate “ground truth” data, i.e.,
generate sequences of actions and sensor readings to test your algorithm.
For this purpose, first randomly select a non-blocked cell as the initial location x0 of your agent in
the world. Then, randomly generate a sequence of 100 actions, i.e., randomly select actions
from the set {Up, Left, Down, Right} and generate a string of length 100. For each action,
starting from the initial location x0, apply:
• the transition model (i.e., 90% of the time follow the action, staying in place when you collide;
10% of the time stay in place), in order to get the next ground-truth state xi+1;
• the observation model (i.e., 90% of the time the correct terrain type and 5% for each of the
other two types), in order to get the “ground truth” sensor reading ei+1.
Once you have generated the 100 actions, the 100 ground truth states and the 100 observations,
store them in a file. Generate 10 such files per map and for 10 different maps of the world (i.e.,
100 total ground truth files), as different experiments inside the large maps. The format for the
file can be as follows:
x0 y0 : coordinates of the initial point
xi yi : 100 coordinates of the consecutive points that the agent actually goes through, separated
by new lines
ai : 100 characters indicating the type of action executed {U, L, D, R}, separated by new lines
ei : 100 characters indicating the sensor readings {N, H, T} collected by the agent as it moves
In your report, provide examples of ground truth paths and the corresponding action and
sensor readings generated.
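A possible sketch of the ground-truth generator (the map representation, helper names, and the omission of file I/O are my own choices; the proportions and models follow the text):

```python
import random

# Sketch of the ground-truth generator for Step B.
def make_map(rows, cols, rng):
    # 50% N, 20% H, 20% T, 10% B, drawn independently per cell.
    types = ["N"] * 5 + ["H"] * 2 + ["T"] * 2 + ["B"]
    return {(r, c): rng.choice(types)
            for r in range(1, rows + 1) for c in range(1, cols + 1)}

MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}

def generate(grid, n, rng):
    free = [c for c, t in grid.items() if t != "B"]
    x = rng.choice(free)                      # random non-blocked start x0
    path, actions, readings = [x], [], []
    for _ in range(n):
        a = rng.choice("UDLR")
        dr, dc = MOVES[a]
        nxt = (x[0] + dr, x[1] + dc)
        # 90%: follow the action (collisions with walls/blocked cells stay).
        if rng.random() < 0.9 and grid.get(nxt, "B") != "B":
            x = nxt
        truth = grid[x]
        # 90%: correct reading; 5% for each of the two other terrain types.
        e = truth if rng.random() < 0.9 else rng.choice(
            [t for t in "NHT" if t != truth])
        path.append(x)
        actions.append(a)
        readings.append(e)
    return path, actions, readings

grid = make_map(100, 50, random.Random(0))
path, actions, readings = generate(grid, 100, random.Random(1))
```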
PART B. Due May 1.
Step C of Question 5: Demonstrate the capability of estimating the position of the agent inside
the world given only the actions and observations indicated in these files. In particular, your
program should be able to load a “ground truth” file and a map. Then, after each action/sensor
reading pair, it should be able to compute the probability that the agent is in each cell of the map
by applying the filtering algorithm. Visualize the different probabilities on your map (e.g., think of
a heatmap) or provide a capability to indicate the probability on any cell of the map. Visualize the
ground truth location of the agent inside the world at the corresponding step. Your visualization
should be updated with each new action/sensor reading pair until you consume all 100 readings.
Attach example heat maps in your report after 10, 50 and 100 iterations. Indicate the ground
truth path up to this point in each case.
For each of the 100 experiments, compute the error (as distance inside the grid world) between
the true location of the agent and the maximum likelihood estimation (i.e., the cell with the highest
probability according to the filtering algorithm) as the number of readings increases. For the
computation of the maximum likelihood estimation, ties can be broken randomly. Generate a plot
of the average error over all 100 experiments as the number of readings increases. For this plot,
you can ignore the first 5 iterations as many cells will have the same probability in the beginning.
For each of the 100 experiments, keep track of the probability that the cell where the agent is
actually located (which changes over time) is assigned by the filtering algorithm. Generate a plot
of the average probability of the ground truth cell over 100 experiments as the number of readings
increases. Here you can start with the uniform probability assigned to the cell in the beginning of
this process.
You may face computational challenges in the straightforward implementation of the above
algorithms given the sizes of the map. If you find this to be the case, you are allowed to provide
results for smaller maps (e.g., try first 50×50 maps, then 20×20). You will not be awarded all of
the points but you can still collect the majority of the available points for a correct implementation.
Question 6: [10 points] Assume you are interested in buying a used vehicle C1. You are also
considering taking it to a qualified mechanic and then deciding whether to buy it or not. The cost
of taking it to the mechanic is $100. C1 can be in good shape (quality q+) or bad shape (quality q−).
The mechanic might help to indicate what shape the vehicle is in. C1 costs $3,000 to buy and its
market value is $4,000 if in good shape; if not, $1,400 in repairs will be needed to make it in good
shape. Your estimate is that C1 has a 70% chance of being in good shape. Assume that the utility
function depends linearly on the vehicle’s monetary value.
a. Calculate the expected net gain from buying C1, given no test.
b. We also have the following information about whether the vehicle will pass the mechanic’s
test:
P(pass(c1) | q+(c1)) = 0.8
P(pass(c1) | q−(c1)) = 0.35
Use Bayes’ theorem to calculate the probability that the car will pass/fail the test, and hence
the probability that it is in good/bad shape given what the mechanic will tell you.
[Hint: Compute the four probabilities P(q+ | Pass), P(q− | Pass), P(q+ | ¬Pass), P(q− | ¬Pass).]
c. What is the best decision given either a pass or a fail? What is the expected utility in each
case?
[Hint: Use the probabilities from the previous question.]
d. What is the value of information of the mechanic’s test? Will you take C1 to the
mechanic or not?
[Hint: You can easily answer this based on the answers from questions a) and c).]
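The four posteriors in the hint for part b) follow mechanically from Bayes’ theorem; a sketch using the figures given above:

```python
# Bayes inversion for the mechanic's test (part b), with the given figures.
p_good = 0.70                              # prior P(q+)
p_pass_good, p_pass_bad = 0.80, 0.35       # P(pass | q+), P(pass | q-)

# Total probability that the car passes the test.
p_pass = p_pass_good * p_good + p_pass_bad * (1 - p_good)

# The four posteriors from the hint.
p_good_pass = p_pass_good * p_good / p_pass
p_bad_pass = 1 - p_good_pass
p_good_fail = (1 - p_pass_good) * p_good / (1 - p_pass)
p_bad_fail = 1 - p_good_fail
```

These posteriors feed directly into the expected-utility comparison of part c).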
Question 7: [15 points] Consider the specification of a Markov Decision Process according to the
following figure. Code your own implementation of Value Iteration and compute the optimal policy
as well as the optimal utilities for this problem.
Indicate the initial utilities you used to start the process. Provide at least 5 intermediate
results (in terms of utilities and policies), depending on the number of iterations needed for
convergence, as well as the final results. Describe your implementation and your convergence
criterion. Report the computation time and the number of iterations.
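The assignment's MDP is specified only in its figure, so the sketch below runs value iteration on a hypothetical two-state MDP purely to illustrate the loop structure and a convergence criterion; substitute the figure's states, transition model, rewards, and discount:

```python
import itertools

# Hypothetical 2-state, 2-action MDP with placeholder numbers; replace
# STATES, ACTIONS, T, R, GAMMA with the specification from the figure.
STATES = [0, 1]
ACTIONS = ["stay", "go"]
GAMMA = 0.9
T = {0: {"stay": [(1.0, 0)], "go": [(0.8, 1), (0.2, 0)]},
     1: {"stay": [(1.0, 1)], "go": [(1.0, 0)]}}   # (prob, next_state) lists
R = {0: 0.0, 1: 1.0}                              # state rewards

def q_value(U, s, a):
    """Expected utility of the successor state under action a."""
    return sum(p * U[s2] for p, s2 in T[s][a])

def value_iteration(eps=1e-6):
    """Bellman updates until the largest utility change drops below eps."""
    U = {s: 0.0 for s in STATES}                  # initial utilities
    for i in itertools.count(1):
        new = {s: R[s] + GAMMA * max(q_value(U, s, a) for a in ACTIONS)
               for s in STATES}
        if max(abs(new[s] - U[s]) for s in STATES) < eps:
            return new, i
        U = new

U, iterations = value_iteration()
policy = {s: max(ACTIONS, key=lambda a: q_value(U, s, a)) for s in STATES}
```

Logging `U` every few iterations gives the requested intermediate results; the `eps` threshold on successive utility changes is one reasonable convergence criterion to report.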