CS 5350/6350: Machine Learining  Homework 2

$35.00

Download Details:

  • Name: Assignment-2-9p63xc.zip
  • Type: zip
  • Size: 1.05 MB

Category:

Description

5/5 - (1 vote)

1 Boolean Functions In this problem, you will be asked to write Boolean functions and linear threshold functions based on given labeled data. 1. [3 points] Table 1 shows several data points (the x’s) along with corresponding labels (y). (That is, each row is an example with a label.) Write down three different Boolean functions all of which can produce the label y when given the inputs x. y x1 x2 x3 x4 0 1 0 0 0 0 1 1 0 0 1 1 0 1 1 Table 1: Original Table 2. [5 points] Next, we expand Table 1 to Table 2 by adding more data points. How many errors will each of your functions from the previous questions make on the full data set. 3. [7 points] Write down the linear threshold function for the data in Table 2. 1 y x1 x2 x3 x4 0 1 0 0 0 0 1 1 0 0 1 1 0 1 1 0 0 1 0 0 0 0 1 1 0 0 1 1 1 0 1 0 1 1 1 Table 2: Expanded Table 2 Mistake Bound Model of Learning Consider an instance space consisting of integer points on the two dimensional plane (x1, x2) with −128 ≤ x1, x2 ≤ 128. Let C be a concept class defined on this instance space. Each function fr in C is defined by an integer radius r (with 1 ≤ r ≤ 128) as follows: fr(x1, x2) = +1 x 2 1 + x 2 2 ≤ r 2 ; −1 otherwise (1) Our goal is to come up with a error-driven algorithm that will learn the correct function f ∈ C that correctly classifies a dataset. Side notes 1. Recall that a concept class is the set of functions from which the true target function is drawn and the hypothesis space is the set of functions that the learning algorithm searches over. In this question, both these are the same set. 2. Assume that there is no noise. That is, assume that the data is separable using the hypothesis class. Questions 1. [5 points] Determine |C|, the size of concept class. 2. [5 points] To design an error driven learning algorithm, we should be able to first write down what it means to make a mistake. Suppose our current guess for the function is fr defined as in Equation 1 above. Say we get an input point (x t 1 , xt 2 ) along with its label y t . Write down an expression (an equality or an inequality) in terms of x t 1 , x t 2 , y t and r that checks whether the current hypothesis fr has made a mistake. 3. [10 points] Next, we need to specify how we will update a hypothesis if there is an error. Since fr is completely defined in terms of r, we only need to update r. How will you update r if there is an error? Consider errors for both positive and negative examples. 2 4. [20 points] Use the answers from the previous two steps to write a mistake-driven learning algorithm to learn the function. Please write the algorithm concisely in the form of pseudocode. What is the maximum number of mistakes that this algorithm can make on any dataset? 5. (For 6350 students)[15 points total] We have seen the Halving algorithm in class. The Halving algorithm will maintain a set of hypotheses consistent with all the examples seen so far and predict using the most frequent label among this set. Upon making a mistake, the algorithm prune at least half of this set. In this question, you will design and analyze a Halving algorithm for this particular concept space. a. [5 points] The set of hypotheses consistent with all examples seen so far can be defined storing only two integers. How would you do this? b. [5 points] How would you check if there is an error for an example (x t 1 , xt 2 ) that has the label y t ? c. [5 points] Write the full Halving algorithm for this specific concept space. (Do not write the same Halving algorithm we saw in class. You need to tailor it to this problem.) What is its mistake bound? 3 The Perceptron Algorithm and Its Variants 3.1 The Task and Data Imagine you have access to information about people such as age, gender and level of education. Now, you want to predict whether a person makes over $50K a year or not using these features. We will use Adult data set from the UCI Machine Learning repository1 . The original Adult data set has 14 features, among which 6 are continuous and 8 are categorical. In order to make it easier to use, we will use a pre-processed version (and subset) of the original Adult data set, created by the makers of the popular LIBSVM tool. From the LIBSVM website: “In this data set, the continuous features are discretized into quantiles, and each quantile is represented by a binary feature. Also, a categorical feature with m categories is converted to m binary features.” Use the training/test files called ‘a1a.train’ and ‘a1a.test’, available on the assignments page of the class website.2 This data is in the LIBSVM format, where each row is a single training example. The format of the each row in the data is