Neural Networks (2010/11) Example Exam, December 2010
b) Assume you have found two different solutions w^{(1)} and w^{(2)} of the
perceptron storage problem for the data set ID. Assume furthermore that w^{(1)}
can be written as a linear combination

    w^{(1)} = \sum_{\mu=1}^{P} x_\mu \, \xi^{\mu} S^{\mu}    with    x_\mu \in \mathbb{R},

whereas the difference vector w^{(2)} - w^{(1)} is orthogonal to all the vectors
\xi^{\mu} \in ID.
Consider the stabilities of the competing solutions and prove (give precise
mathematical arguments) that \kappa(w^{(1)}) \geq \kappa(w^{(2)}) holds true. What
does this result imply for the perceptron of optimal stability and potential
training algorithms?
3) Learning a linearly separable rule
Here we consider linearly separable data ID = \{ \xi^{\mu}, S_R^{\mu} \}_{\mu=1}^{P},
where noise-free labels S_R^{\mu} = \mathrm{sign}[ w^{*} \cdot \xi^{\mu} ] are provided
by a teacher vector w^{*} \in \mathbb{R}^{N} with |w^{*}| = 1.
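As a rough illustration (the names N, P, w_star, xi, S_R are chosen here for
convenience only), such a data set might be generated along the following lines:

import numpy as np

rng = np.random.default_rng(0)
N, P = 20, 100                        # input dimension and number of examples
w_star = rng.normal(size=N)
w_star /= np.linalg.norm(w_star)      # teacher vector with |w*| = 1
xi = rng.normal(size=(P, N))          # inputs xi^mu, drawn i.i.d. standard normal
S_R = np.sign(xi @ w_star)            # noise-free teacher labels S_R^mu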
a) Define and explain the term version space precisely in this context, provide
a mathematical definition as a set of vectors and also a simplifying graphical
illustration. Give a brief argument why one can expect the perceptron of
maximum stability to display good generalization behavior.
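One possible way of writing such a set, as a sketch in the notation introduced
above (conventions, e.g. whether w is normalized, may differ in the lecture notes):

    V(ID) = \{\, w \in \mathbb{R}^{N} : \mathrm{sign}[\, w \cdot \xi^{\mu} \,] = S_R^{\mu} \ \text{for all } \mu = 1, \dots, P \,\}.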
b) Define and explain the (Rosenblatt) Perceptron algorithm for a given set
of examples ID. Be precise, for instance by writing it in a few lines of
pseudocode. Also include a stopping criterion.
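A minimal sketch of one possible implementation, assuming the arrays xi and S_R
from the snippet above, a zero weight vector as initialization, and a maximum
number of sweeps t_max as a safeguard; it is an illustration, not the required
pseudocode:

import numpy as np

def rosenblatt(xi, S, eta=1.0, t_max=1000):
    """Rosenblatt perceptron training by sequential presentation of examples."""
    P, N = xi.shape
    w = np.zeros(N)                          # tabula rasa initialization
    for t in range(t_max):                   # sweeps through the data set
        updated = False
        for mu in range(P):
            if S[mu] * (w @ xi[mu]) <= 0:    # example mu not yet correctly classified
                w = w + eta * S[mu] * xi[mu] # Hebbian update
                updated = True
        if not updated:                      # stopping criterion: all P examples correct
            break
    return w, t + 1                          # weight vector and number of sweeps used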
c) While experimenting with the Rosenblatt perceptron in the practicals, your
partner has a brilliant idea: the use of a larger learning rate. His/her
argument: updating w by Hebbian terms of the form \eta \, \xi^{\mu} S^{\mu} with a
large learning rate \eta > 1 should give (I) faster convergence and (II) a better
perceptron vector. Are you convinced? Give precise arguments for your answer!
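A small experiment along these lines, reusing rosenblatt, xi and S_R from the
sketches above (eta = 10 is an arbitrary stand-in for "a large learning rate"),
lets you compare the two claims empirically before arguing formally:

w1, sweeps1 = rosenblatt(xi, S_R, eta=1.0)
w10, sweeps10 = rosenblatt(xi, S_R, eta=10.0)
print("sweeps needed:", sweeps1, "vs.", sweeps10)                  # claim (I): speed
cos_angle = (w1 @ w10) / (np.linalg.norm(w1) * np.linalg.norm(w10))
print("cosine of angle between the two solutions:", cos_angle)     # claim (II): direction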
Note: The following will be treated in January, but some of you might be able
to do (a) and (b) already, using the information on gradients etc. 3c) should get
clearer in January, but I think you might solve it already, with a little thinking :-)