Assignment 1. Foundations CSED342 – Artificial Intelligence


Problems
Problem 0. Optimization and probability
In this class, we will cast AI problems as optimization problems, that is, finding the best
solution in a rigorous mathematical sense. At the same time, we must be adroit at coping
with uncertainty in the world, and for that, we appeal to tools from probability. The three
problems below will not be scored, but please solve them because they are basic concepts.
Problem 0a [0 points]
Suppose you repeatedly roll a fair six-sided die until you roll a 1 or a 2 (and then you stop).
Every time you roll a 3, you lose 1 point, and every time you roll a 4, you win 2 points.
You do not win or lose any points if you roll a 5 or a 6. What is the expected number of
points you will have when you stop?
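If you want to sanity-check your answer, a quick Monte Carlo simulation works. This is only an illustrative sketch (the function names here are made up, not part of the assignment):

```python
import random

def simulate_points(rng: random.Random) -> int:
    """Play one game: roll a fair die until a 1 or 2 comes up, tallying points."""
    points = 0
    while True:
        roll = rng.randint(1, 6)
        if roll in (1, 2):       # stopping condition
            return points
        elif roll == 3:
            points -= 1          # lose 1 point
        elif roll == 4:
            points += 2          # win 2 points
        # 5 or 6: no change, keep rolling

def estimate_expected_points(trials: int = 200_000, seed: int = 0) -> float:
    """Average the final score over many simulated games."""
    rng = random.Random(seed)
    return sum(simulate_points(rng) for _ in range(trials)) / trials
```

The simulated average should be close to the exact expectation you derive by hand.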
Problem 0b [0 points]
Suppose the probability of a coin turning up heads is 0 < p < 1, and that we flip it 5 times
and get {H, T, H, T, T}. We know the probability (likelihood) of obtaining this sequence is
L(p) = p(1 − p)p(1 − p)(1 − p) = p²(1 − p)³. Now let’s go backwards and ask the question:
what is the value of p that maximizes L(p)?
Hint: Consider taking the derivative of log L(p). Taking the derivative of L(p) works too,
but it is cleaner and more natural to differentiate log L(p). You can verify for yourself that
the value of p which maximizes log L(p) must also maximize L(p).
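Once you have a candidate answer, you can sanity-check it numerically by scanning L(p) over a fine grid and seeing where the maximum lands. This is just an illustrative sketch, not part of submission.py; the function names are made up:

```python
def likelihood(p: float) -> float:
    # L(p) = p^2 (1 - p)^3 for the observed sequence {H, T, H, T, T}
    return p**2 * (1 - p)**3

def argmax_on_grid(step: float = 0.001) -> float:
    """Scan p over a grid inside (0, 1) and return the maximizing grid point."""
    grid = [i * step for i in range(1, int(1 / step))]
    return max(grid, key=likelihood)
```

The grid maximizer should agree with the value you obtain by setting the derivative of log L(p) to zero.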
Problem 0c [0 points]
Let’s practice taking gradients, which is a key operation for being able to optimize continuous
functions. For w ∈ R^d and constants a_i, b_j ∈ R^d and λ ∈ R, define the scalar-valued function

    f(w) = Σ_{i=1}^{m} Σ_{j=1}^{n} (a_i⊤ w − b_j⊤ w)² + λ ∥w∥₂²,

where w, a_i and b_j are column vectors (e.g. w = (w_1, …, w_d)⊤) and ∥w∥₂ = sqrt(Σ_{j=1}^{d} w_j²)
is known as the L2 norm. Compute the gradient ∇_w f(w).
Recall: the gradient is a d-dimensional vector of the partial derivatives with respect to
each w_i:

    ∇_w f(w) = (∂f(w)/∂w_1, …, ∂f(w)/∂w_d)⊤.
If you’re not comfortable with vector calculus, first warm up by working out this problem
using scalars in place of vectors and derivatives in place of gradients. Not everything for
scalars goes through for vectors, but the two should at least be consistent with each other
(when d = 1). Do not write out summation over dimensions, because that gets tedious.
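Whatever closed form you derive, you can check it numerically with central finite differences. The sketch below (not part of submission.py) implements f with the a_i and b_j stored as rows of matrices A and B, which is just one convenient encoding:

```python
import numpy as np

def f(w, A, B, lam):
    """f(w) = sum_{i,j} (a_i^T w - b_j^T w)^2 + lam * ||w||_2^2,
    where the a_i are the rows of A and the b_j are the rows of B."""
    total = 0.0
    for a in A:            # iterate over rows a_i
        for b in B:        # iterate over rows b_j
            total += (a @ w - b @ w) ** 2
    return total + lam * np.dot(w, w)

def numerical_gradient(func, w, eps=1e-6):
    """Central finite differences: (f(w + eps*e_k) - f(w - eps*e_k)) / (2*eps)."""
    grad = np.zeros_like(w)
    for k in range(len(w)):
        e = np.zeros_like(w)
        e[k] = eps
        grad[k] = (func(w + e) - func(w - e)) / (2 * eps)
    return grad
```

Comparing `numerical_gradient` against your hand-derived expression on a few random inputs is a cheap way to catch sign and factor-of-2 mistakes.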
Problem 1. Programming 1
In this problem, you will implement several short functions related to vector representations.
The main purpose of this exercise is to familiarize yourself with Python, and to understand
vector representations in programming.
If you’re new to Python, the following provide pointers to various tutorials and examples
for the language:
• Python for Programmers:
https://wiki.python.org/moin/BeginnersGuide/Programmers
• Example programs of increasing complexity:
https://wiki.python.org/moin/SimplePrograms
Problem 1a [2 points]
Implement denseVectorDotProduct in submission.py.
Problem 1b [2 points]
Implement incrementDenseVector in submission.py.
Problem 1c [2 points]
Implement dense2sparseVector in submission.py.
Problem 1d [2 points]
Implement sparseVectorDotProduct in submission.py.
Problem 1e [2 points]
Implement incrementSparseVector in submission.py.
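The exact signatures and representations are fixed by the starter code in submission.py, but a common convention (assumed here) is a dense vector as a Python list and a sparse vector as a dict mapping index to nonzero value. Under that assumption, the sparse operations might be sketched like this (illustrative names, not the required ones):

```python
from collections import defaultdict

def dense2sparse(v):
    """Keep only nonzero entries, keyed by index (assumed representation)."""
    return defaultdict(float, {i: x for i, x in enumerate(v) if x != 0})

def sparse_dot(v1, v2):
    """Iterate over the smaller dict so the cost is O(min(nnz1, nnz2))."""
    if len(v1) > len(v2):
        v1, v2 = v2, v1
    return sum(x * v2.get(i, 0) for i, x in v1.items())

def increment_sparse(v1, scale, v2):
    """v1 += scale * v2, modifying v1 in place."""
    for i, x in v2.items():
        v1[i] = v1.get(i, 0) + scale * x
```

Only the nonzero entries are touched, which is the whole point of the sparse representation when vectors are mostly zeros.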
Problem 2. Programming 2
In this problem, you will implement a few more short functions; again, the main purpose of this
exercise is to familiarize yourself with Python.
Problem 2a [2 points]
Implement minkowskiDistance in submission.py.
Problem 2b [2 points]
Implement getLongestWord in submission.py.
Problem 2c [2 points]
Implement getFrequentWords in submission.py.
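As a rough illustration of the ideas behind these functions (assumed semantics; the docstrings in the starter code are authoritative, and the required names and tie-breaking rules may differ):

```python
from collections import Counter

def minkowski_distance(x, y, p):
    """(sum_i |x_i - y_i|^p)^(1/p); p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

def longest_word(text):
    """Longest word in a whitespace-separated string.
    Ties are broken by first occurrence here; the assignment may specify otherwise."""
    return max(text.split(), key=len)

def frequent_words(text, freq):
    """The set of words occurring at least `freq` times."""
    counts = Counter(text.split())
    return {w for w, c in counts.items() if c >= freq}
```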