Homework 3: Problem 1
Qiu G
3/31/2020
Problem 1
Part 1)
Let $F = \mathbb{R}^d$ and let $k^*(x, x') = x^T x'$. $\forall x \in \mathbb{R}^d$, we define $\phi : \mathbb{R}^d \to F$ such that $\phi(x) = x$. We show that $k^*$ is a valid kernel:

$$k^*(x, x') = x^T x' = \phi(x)^T \phi(x'),$$

so $k^*$ is the inner product induced by the feature map $\phi$ and is therefore a valid kernel.
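As a supplementary numerical check (not part of the proof), the Gram matrix of this kernel on any finite sample is positive semidefinite. A minimal sketch in R, using simulated data:

#Gram matrix of the linear kernel k*(x, x') = x^T x' on random points
set.seed(1)
X <- matrix(rnorm(20), nrow = 5) #5 points in R^4
K <- X %*% t(X) #K[i, j] = <x_i, x_j>
eigen(K)$values #all eigenvalues are (numerically) nonnegative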
Part 2)
We know that $k_1(x, x')$ is a valid kernel. Using the basic definitions and properties of kernels, we can deduce the following: every Gram matrix $K_1$ of $k_1$ is positive semidefinite. For real $a > 0$, $a \cdot k_1(x, x')$ is a valid kernel. Hence:

$$c^T (a K_1)\, c = a \left(c^T K_1 c\right) \ge 0 \quad \text{for every } c \in \mathbb{R}^n,$$

so every Gram matrix of $a \cdot k_1$ is positive semidefinite as well, and $a \cdot k_1$ is a valid kernel.
Part 3)
We know that $k_1(x, x')$ is a valid kernel. We use Mercer's Theorem and the following decomposition:

$$k^*(x, x') = g(x)\left[\sum_{j=1}^{\infty} \lambda_j \phi_j(x)\,\phi_j(x')\right] g(x') \quad (4)$$

$$= \sum_{j=1}^{\infty} \lambda_j \left[g(x)\phi_j(x)\right]\left[g(x')\phi_j(x')\right] \quad (5)$$

This is again a Mercer expansion with nonnegative weights $\lambda_j$ and features $\psi_j(x) = g(x)\phi_j(x)$, so $k^*(x, x') = g(x)\,k_1(x, x')\,g(x')$ is a valid kernel.
Part 4)
Both $k_1(x, x')$ and $k_2(x, x')$ are valid kernels. Suppose $\phi_1(x)$ and $\phi_2(x)$ correspond to the kernels just mentioned (respectively). We define $\phi^* : \mathbb{R}^d \to F$ such that $\phi^*_{i,j}(x) = \phi_{1,i}(x)\,\phi_{2,j}(x)$. We can easily see that:

$$k_1(x, x')\,k_2(x, x') = \left[\sum_i \phi_{1,i}(x)\phi_{1,i}(x')\right]\left[\sum_j \phi_{2,j}(x)\phi_{2,j}(x')\right] = \sum_{i,j} \phi^*_{i,j}(x)\,\phi^*_{i,j}(x'),$$

so the product $k_1 k_2$ is the inner product induced by the feature map $\phi^*$ and is therefore a valid kernel.
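The same fact can be illustrated numerically: the elementwise (Schur) product of two kernel Gram matrices is again positive semidefinite. A minimal sketch with simulated data, taking a linear and a Gaussian kernel as stand-ins for $k_1$ and $k_2$:

#Elementwise product of two Gram matrices remains positive semidefinite
set.seed(2)
X <- matrix(rnorm(20), nrow = 5)
K1 <- X %*% t(X) #Gram matrix of a linear kernel
K2 <- exp(-as.matrix(dist(X))^2) #Gram matrix of a Gaussian (RBF) kernel
eigen(K1 * K2)$values #all eigenvalues are (numerically) nonnegative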
Part 5)
library(e1071)
#Train an SVM with default settings (data1 is assumed to be loaded in the environment)
svm1 <- svm(class ~ ., data = data1)
svm1
##
## Call:
## svm(formula = class ~ ., data = data1)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 1
##
## Number of Support Vectors: 347
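As a quick supplementary check (with data1 and its class column as loaded above), the fitted model's training accuracy can be computed with predict():

#Training accuracy of the fitted SVM (supplementary check)
pred <- predict(svm1, newdata = data1)
mean(pred == data1$class)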
#Generate plot
plot(svm1, data = data1)
[Figure: classification plot of svm1 on data1 — x2 on the horizontal axis, x1 on the vertical axis, decision regions labeled A and B, support vectors drawn as "x" and remaining points as "o".]
Problem 2
Part 1)
#Load the tidyverse (tibble, %>%, and ggplot2 are used below)
library(tidyverse)
#Range of z values
z <- seq(-1, 3, by = 0.001)
#Hinge Loss
g <- function(z){
result <- ifelse(1-z>=0,1-z,0)
return(result)
}
#Create tibble
t1 <- tibble(z=z)%>%mutate(loss=g(z))
#Create plot
g1 <- ggplot(data = t1, mapping = aes(x=z,y=loss))+
geom_line()+
labs(title = "Hinge Loss g(z)")
g1
[Figure: "Hinge Loss g(z)" — line plot of the loss versus z for z ∈ [−1, 3]; the loss decreases linearly to 0 at z = 1 and is 0 thereafter.]
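As an aside, the piecewise definition of g above is equivalent to the vectorized form pmax(1 - z, 0); the helper g2.check below is introduced here only for comparison:

#Equivalent one-line form of the hinge loss
g2.check <- function(z) pmax(1 - z, 0)
all.equal(g(z), g2.check(z)) #TRUE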
Part 2)
When $z_i < 1$, we have $Q_i(v_H, c) = 1 - y_i(w_1 x_{i,1} + w_2 x_{i,2} + c) + \frac{\lambda}{2}(w_1^2 + w_2^2)$. Therefore:

$$\nabla Q_i(v_H, c) = \begin{pmatrix} -y_i x_{i,1} + \lambda w_1 \\ -y_i x_{i,2} + \lambda w_2 \\ -y_i \end{pmatrix} \quad (11)$$

When $z_i \geq 1$, the hinge term vanishes and $Q_i(v_H, c) = \frac{\lambda}{2}(w_1^2 + w_2^2)$, so:

$$\nabla Q_i(v_H, c) = \begin{pmatrix} \lambda w_1 \\ \lambda w_2 \\ 0 \end{pmatrix} \quad (12)$$
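These two cases translate directly into code. A minimal sketch of the per-sample gradient, with the coordinates reordered as beta = (c, w1, w2) to match the implementation in the next part; grad.Qi is a helper introduced here for illustration:

#Gradient of Q_i following equations (11) and (12), in the order (c, w1, w2)
grad.Qi <- function(beta, x1, x2, y, lambda){
  z <- y * (beta[1] + beta[2] * x1 + beta[3] * x2)
  if (z < 1) {
    c(-y, -y * x1 + lambda * beta[2], -y * x2 + lambda * beta[3])
  } else {
    c(0, lambda * beta[2], lambda * beta[3])
  }
}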
Part 3)
#Load data
data2 <- read_csv("~/Downloads/HW3Problem2.csv")
#Create algorithm: batch (sub)gradient descent on the average of the Q_i
gradient.descent <- function(data, lambda = 0.25, iter = 10000){
  n <- nrow(data)
  x1 <- data[[1]]
  x2 <- data[[2]]
  y <- data[[3]]
  beta <- runif(3) #beta = (c, w1, w2), random initialization
  for (i in 1:iter) {
    z <- y * (beta[1] + beta[2] * x1 + beta[3] * x2)
    margin <- z < 1 #points that violate the margin contribute the hinge term
    #Average gradient: hinge part from margin violators, regularization from all points
    grad <- c(-sum(y[margin]),
              -sum(y[margin] * x1[margin]) + n * lambda * beta[2],
              -sum(y[margin] * x2[margin]) + n * lambda * beta[3]) / n
    eta <- 1 / (i * lambda) #decreasing step size
    beta <- beta - eta * grad
  }
  return(beta)
}
#Estimate beta
beta.hat <- gradient.descent(data2)
#Display
beta.hat
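As a quick sanity check (assuming data2 has columns x1, x2, y, as used in the plot below), most labels should match the sign of the fitted linear score:

#Training accuracy of the learned hyperplane (supplementary check)
score <- beta.hat[1] + beta.hat[2]*data2$x1 + beta.hat[3]*data2$x2
mean(sign(score) == data2$y)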
Part 4)
We plot the points with the decision boundary from the previous part.
data3 <- data2 %>% mutate(y = factor(y))
#Decision boundary: c + w1*x1 + w2*x2 = 0, i.e. x2 = -c/w2 - (w1/w2)*x1
g2 <- ggplot(data = data3, mapping = aes(x = x1, y = x2, colour = y))+
  geom_point()+
  geom_abline(slope = -beta.hat[2]/beta.hat[3], intercept = -beta.hat[1]/beta.hat[3])+
  labs(title = "SVM")
g2
[Figure: "SVM" — scatter plot of the data with x1 on the horizontal axis and x2 on the vertical axis, points colored by y ∈ {−1, 1}, with the fitted decision boundary overlaid.]