
Homework 3

Qiu G

3/31/2020

Problem 1

Part 1)

Let F = Rd with the standard inner product. ∀x ∈ Rd, we define φ : Rd → F such that φ(x) = x.
We show that k∗ is a valid kernel.

k∗(x, x′) = ⟨φ(x), φ(x′)⟩F (1)
= ⟨x, x′⟩ (2)
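As an informal numerical sanity check (not part of the proof), we can verify on arbitrary random points that the Gram matrix of this kernel has no negative eigenvalues; the data, size, and tolerance below are illustrative choices only.

#Sanity check (illustrative): K[i,j] = <xi, xj> should be positive semidefinite
set.seed(1)
X <- matrix(rnorm(20), nrow = 5)  #5 arbitrary points in R^4
K <- X %*% t(X)                   #linear-kernel Gram matrix
all(eigen(K, symmetric = TRUE)$values >= -1e-8)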

Part 2)

We know that k1(x, x′) is a valid kernel, so its Gram matrix on any finite set of points is positive semidefinite. For real a > 0, multiplying by a preserves this: for any coefficients ci, Σi Σj ci cj a k1(xi, xj) = a Σi Σj ci cj k1(xi, xj) ≥ 0. Hence a·k1(x, x′) is a valid kernel.
Hence:

k∗(x, x′) = a k1(x, x′) (3)


is a valid kernel.

Part 3)

We know that k1(x, x′) is a valid kernel. We use Mercer’s Theorem and the following decomposition:


k∗(x, x′) = g(x)[Σj λj φj(x) φj(x′)]g(x′) (4)
= Σj λj g(x)φj(x) g(x′)φj(x′) (5)
= ⟨g(x)φ(x), g(x′)φ(x′)⟩F (6)
= ⟨φ∗(x), φ∗(x′)⟩ (7)

where φ∗(x) = g(x)φ(x), so k∗ is a valid kernel.
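Numerically, this says the Gram matrix of k∗ is D K D with D = diag(g(x1), …, g(xn)), which stays positive semidefinite. A small illustrative check, reusing X and K from Part 1 and an arbitrary choice of g:

#Illustrative check: D K D remains positive semidefinite for any g
g <- function(x) sum(x^2)  #arbitrary real-valued function g
D <- diag(apply(X, 1, g))  #D = diag(g(x1), ..., g(xn))
K.star <- D %*% K %*% D    #Gram matrix of g(x) k1(x, x') g(x')
all(eigen(K.star, symmetric = TRUE)$values >= -1e-8)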

Part 4)

Both k1(x, x′) and k2(x, x′) are valid kernels. Suppose φ1(x) and φ2(x) correspond to these kernels, respectively. We define φ∗ : Rd → F such that φ∗i,j(x) = φ1,i(x)φ2,j(x). We can then see that:

k∗(x, x′) = ⟨φ∗(x), φ∗(x′)⟩F (8)
= ⟨φ1,i(x)φ2,j(x), φ1,i(x′)φ2,j(x′)⟩ (9)
= k1(x, x′) k2(x, x′) (10)
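This is the Schur product theorem: the elementwise product of two positive semidefinite Gram matrices is again positive semidefinite. An illustrative check with two arbitrary kernels, reusing X from Part 1:

#Illustrative check: the elementwise (Hadamard) product of two Gram matrices
K1 <- X %*% t(X)                  #linear kernel
K2 <- exp(-as.matrix(dist(X))^2)  #Gaussian kernel with unit bandwidth
all(eigen(K1 * K2, symmetric = TRUE)$values >= -1e-8)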

Part 5)

We import the data.

#Import data, and load svm library


library(tidyverse)

## -- Attaching packages -------------------------------------------------------- tidyverse 1.3.0 --

## v ggplot2 3.3.0 v purrr 0.3.4


## v tibble 3.0.1 v dplyr 0.8.5
## v tidyr 1.0.3 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0

## -- Conflicts ----------------------------------------------------------- tidyverse_conflicts() --


## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()

library(e1071)

data1 <- read_csv("~/Downloads/HW3Problem1.csv") %>% mutate(class = factor(class))

## Parsed with column specification:


## cols(
## x1 = col_double(),
## x2 = col_double(),
## class = col_character()
## )

#Train svm
svm1 <- svm(class~.,data = data1)
svm1

##
## Call:
## svm(formula = class ~ ., data = data1)
##
##
## Parameters:
## SVM-Type: C-classification

## SVM-Kernel: radial
## cost: 1
##
## Number of Support Vectors: 347

#Generate plot
plot(svm1, data = data1)

[Figure: SVM classification plot — decision regions for classes A and B from the radial-kernel fit, with x2 on the horizontal axis and x1 on the vertical axis; support vectors are drawn as ‘x’ and remaining points as ‘o’.]
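The fit above uses e1071’s defaults (radial kernel, cost = 1). Not required for this problem, but if we wanted to choose these hyperparameters by cross-validation, e1071’s tune() supports it; the grid below is an arbitrary sketch.

#Optional: 10-fold CV over an arbitrary small grid (sketch only)
tuned <- tune(svm, class ~ ., data = data1,
              ranges = list(cost = c(0.1, 1, 10), gamma = c(0.1, 0.5, 1)))
summary(tuned)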

Problem 2

Part 1)

We construct the hinge loss function g(z).

#Range
z <- seq(-1, 3, by = 0.001)

#Hinge Loss
g <- function(z){
  result <- ifelse(1 - z >= 0, 1 - z, 0)
  return(result)
}

#Create tibble
t1 <- tibble(z = z) %>% mutate(loss = g(z))

#Create plot
g1 <- ggplot(data = t1, mapping = aes(x = z, y = loss)) +
  geom_line() +
  labs(title = "Hinge Loss g(z)")
g1

[Figure: Hinge Loss g(z) — loss on the vertical axis (0 to 2) against z on the horizontal axis (−1 to 3); the loss falls linearly from 2 at z = −1 to 0 at z = 1 and is 0 thereafter.]
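An equivalent, slightly more idiomatic way to write the hinge loss uses pmax(); g.alt is just an illustrative name.

#Equivalent vectorized form of g(z)
g.alt <- function(z) pmax(1 - z, 0)
all.equal(g(z), g.alt(z))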

Part 2)

When zi < 1, we have Qi(vH, c) = 1 − yi(w1 xi,1 + w2 xi,2 + c) + (λ/2)(w1² + w2²). Therefore, differentiating with respect to (w1, w2, c):

∇Qi(vH, c) = (−yi xi,1 + λw1, −yi xi,2 + λw2, −yi)ᵀ (11)

When zi > 1, we have Qi(vH, c) = (λ/2)(w1² + w2²). Therefore:

∇Qi(vH, c) = (λw1, λw2, 0)ᵀ (12)
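As a quick finite-difference check of formula (11) (illustrative only; the sample, label, and parameter values are arbitrary and chosen so that zi < 1):

#Finite-difference check of the gradient in (11)
lambda <- 0.25
xi <- c(0.5, -1.2); yi <- 1  #arbitrary sample with label +1
Qi <- function(p) {          #p = (w1, w2, c); z = 0.73 < 1 at the point below
  max(0, 1 - yi * (p[1] * xi[1] + p[2] * xi[2] + p[3])) +
    (lambda / 2) * (p[1]^2 + p[2]^2)
}
p <- c(0.3, -0.4, 0.1)
num.grad <- sapply(1:3, function(k) {
  h <- 1e-6
  e <- replace(numeric(3), k, h)
  (Qi(p + e) - Qi(p - e)) / (2 * h)  #central difference
})
ana.grad <- c(-yi * xi[1] + lambda * p[1],
              -yi * xi[2] + lambda * p[2],
              -yi)
all.equal(num.grad, ana.grad, tolerance = 1e-4)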

Part 3)

Here β is the parameter vector, with β1 the intercept c and (β2, β3) the weights (w1, w2).

#Load data
data2 <- read_csv("~/Downloads/HW3Problem2.csv")

## Parsed with column specification:


## cols(
## x1 = col_double(),
## x2 = col_double(),
## y = col_double()
## )

#Create Algorithm
gradient.descent <- function(data, lambda = 0.25, iter = 10000){
  n <- nrow(data)
  x1 <- data[[1]]
  x2 <- data[[2]]
  y <- data[[3]]
  beta <- runif(3)  #random initialization
  for (i in 1:iter) {
    z <- y * (beta[1] + beta[2] * x1 + beta[3] * x2)
    #Average the per-sample (sub)gradients from Part 2
    sum <- c(0, 0, 0)
    for (j in 1:n) {
      if (z[j] < 1) {
        sum <- sum + c(-y[j],
                       -y[j] * x1[j] + lambda * beta[2],
                       -y[j] * x2[j] + lambda * beta[3])
      }
      else {
        sum <- sum + c(0, lambda * beta[2], lambda * beta[3])
      }
    }
    sum <- sum / n
    eta <- 1 / (i * lambda)  #decreasing step size
    beta <- beta - eta * sum
  }
  return(beta)
}

#Evaluate beta
beta.hat <- gradient.descent(data2)

#Display
beta.hat

## [1] -2.8072328 -0.8211780 0.8368775
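As an informal cross-check (not part of the assignment), a linear-kernel SVM from e1071 fit to the same data should give a hyperplane pointing in a similar direction; since the two objectives are parameterized differently (cost C versus λ), the coefficients agree only up to sign and scale.

#Optional cross-check: hyperplane from a linear-kernel SVM
d <- data2 %>% mutate(y = factor(y))
svm.lin <- svm(y ~ ., data = d, kernel = "linear", scale = FALSE)
w <- t(svm.lin$coefs) %*% svm.lin$SV  #weight vector (w1, w2)
b <- -svm.lin$rho                     #intercept
c(b, w)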

Part 4)

We plot the points with the decision boundary from the previous part. The boundary is the set where β1 + β2 x1 + β3 x2 = 0, i.e. the line x2 = −β1/β3 − (β2/β3) x1.

data3 <- data2 %>% mutate(y = factor(y))
g2 <- ggplot(data = data3, mapping = aes(x = x1, y = x2, colour = y)) +
  geom_point() +
  geom_abline(slope = -beta.hat[2] / beta.hat[3],
              intercept = -beta.hat[1] / beta.hat[3]) +  #intercept of the line x2 = -(b1 + b2*x1)/b3
  labs(title = "SVM")
g2

[Figure: SVM — scatter of the data coloured by class y (−1 or 1) with the fitted decision boundary x2 = −β1/β3 − (β2/β3) x1 overlaid.]
