0% found this document useful (0 votes)
37 views

Machine Learning Notes2

Machine Learning Notes2

Uploaded by

jjwfish
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
37 views

Machine Learning Notes2

Machine Learning Notes2

Uploaded by

jjwfish
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 34

What is it?

 A process of selecting the best element (in terms of some


criterion) from some set of available alternatives
 Simplest case: minimizing or maximizing a real function
 Selecting the student with the highest GPA at MUN
 Find the shortest route from MUN to village mall
 Define the most efficient sorting algorithm
…

2
Domain and Range
 A function is a mapping from one set of values to another
 The from-set is called Domain
 The To-set is called range
 Domain is the set from which the variable(s) take
values
 Every value must be mapped from
 Range is the set from which the function takes values
 Not every value may be mapped to
 Another way to say it: a function is defined on all the
values in the domain, but doesn’t have to take all the
values in the range

3
Domain and Range
 Examples: consider function f(x) = x2 + 1
 Domain: set of all real numbers
 Range: set of all real number
 Ok

 Domain: set of all real numbers


 Range: set of all real numbers ≥ 1
 Ok

 Domain: set of all real numbers


 Range: set of all integers
 Not ok!
4
Domain and Range
 Notation:
 R for the set of all real numbers
 R+ for the set of all positive real numbers
 R2 for the set of all pairs of real numbers
…
 Rn for the set of all n-tuples of real numbers
 Rn is called nD space

5
Domain and Range
 f(x) = GPA of each MUN student x
 Domain: set of all MUN students
 Range: [0, 5]
 f(x, y) = y/x
 Domain: {(x, y): (x, y)  R2, x  0}
 Range: R
 In reality, the following convention is used:
 when defining a function, only its domain is given
 Its range defaults to the set of values it can take
based on its domain

6
Domain and Range
 In an optimization problem, we are interested in finding
the value at which the function reaches the optimum
(maximum or minimum), not just the optimum value itself
 The function to be optimized is called objective function
 Problem 1
 Objective function: f(x) = x2, where x  R
 Find the value for x that minimizes/maximizes the function
value
 Solution: when x = 0, the function takes minimum value of 0
 The function doesn’t have a maximum value

7
Domain and Range
 Problem 2
 Objective function: f(x) = 𝑥 2 − 𝑥 − 2 where x  R
 Find the value for x that minimizes/maximizes the
function value
 Answer
1 2 9
 Write f(x) = 𝑥 − −
2 4
1
 By observation, when x = , the function takes minimum
2
9
value of −
4
 The function doesn’t have a maximum value

8
Domain and Range
 Problem 3:
 Objective function f(x) = the GPA of x where x is a student
at MUN
 Find the student that maximizes the function value
 Solution: x = the student who receives highest GPA at MUN
 The maximum value of f(x) is the highest GPA at MUN
 Problem 4:
 Objective function: f(x, y) = - (x2 + y2) + 4 where (x, y)  R2
 Find the value for x and y that maximizes/minimizes the
function
 Solution: x = 0 and y = 0 maximizes the function
 The function doesn’t have minimum value
9
Plotting
 For numerical functions, maximum/minimum points can
be plotted
1 2
Plot for objective function y = f(x) = 𝑥 − 2
− 94

1
𝑥 = minimizes
2
the objective
function
1
2
The minimum
function value is
9 − 9Τ4

4

10
Plotting
 For numerical functions, maximum/minimum points can
be plotted
Plot for objective function z = f(x,y) = –(x2 + y2) + 4

X=0, y=0 maximizes


the function

The maximum
function value is 4

11
How to solve: Single variable case
 Problem:
 Objective function f(x) with domain U  R and
range S
 Find the value for x in interval T  U that
maximizes (or minimizes) f(x)
 By observation: not always possible
1 1
 How to minimize x + sin πx in interval (0, 3)?
2 π

12
How to solve: Single variable case
 By calculus
𝑑𝑓
1. Take the 1st derivative of f w.r.t x, and set to zero: =0
𝑑𝑥
2. Solve the equation, the roots are called turning points (TP)
(possibly not unique)
3. Collect the TPs belonging to T: {x1, …,xn}  T
𝑑2 𝑓
4. Take the 2nd derivative of f w.r.t x: , and get its value at each
𝑑𝑥 2
TP xi
𝑑 2 𝑓(𝑥𝑖 )
 If > 0, then xi is a local minima (i.e., minimizes the function
𝑑𝑥𝑖2
in its neighborhood)
𝑑 2 𝑓(𝑥𝑖 )
 If < 0, then xi is a local maxima ( i.e., maximizes the
𝑑𝑥𝑖2
function in its neighborhood)
13
How to solve: Single variable case
 By calculus (continued)
𝑑 2 𝑓(𝑥𝑖 )
 If = 0, further steps required (omitted for this course)
𝑑𝑥𝑖2

5. Denote by P the set of all the local minima, and the borders in
T (if T has one)
6. Among all the points in P, find the one, say u, with the smallest
objective function value
7. If u  T, u minimizes the objective function in T
8. If u T, there does not exist a value that minimizes the
objective function in T

14
How to solve: Single variable case
 By calculus (continued)
9. Denote by Q the set of all the local maxima and the borders in
T (if T has one)
10. Among all the points in Q, find one, say v, with the largest
objective function value
11. If v  T, v maximizes the objective function in T
12. If v T, there does not exist a value that maximizes the
objective function in T

15
Single Variable Case – Examples
 Example 1:
2 border
 Objective function: f(x) = x – x – 2
 Find x that minimizes/maximizes f(x) in interval (0, 2]
 Answer:
df
1. Set 1st derivative to zero: = 2𝑥 − 1 = 0
dx
2. The root is 𝑥 = 12 , which is the (only) TP in (0, 2)
1
d2 f d2 (2)
3. Take the 2nd derivative, = 2 > 0, so =2>0
dx2 dx2
4. Let P = {½, 0, 2}, f(½) = -9/4, f(0) = -2, f(2) = 0
5. Since ½  (0,2], x = ½ minimizes the function in (0, 2]
6. Let Q = {0, 2} , f(0) = -2, f(2) = 0
7. Since 2  (0,2], x = 2 maximizes the function in (0,2]

16
𝑑(12) 𝑑2 (12)
𝑇 = 0,2 , = 0, 2
>0
𝑑𝑥 𝑑𝑥

1
2

17
Single Variable Case – Examples
 Example 2:
1 1
 Objective function: f x = x + sin πx
2 π
 Find x that minimizes/maximizes f(x) in interval (0, 3)
 Answer:
df 1
1. Set 1st derivative to zero: = + cos πx = 0
dx 2
2 4
2. The roots are 2𝑛 + , 2𝑛 + for all integer n, but only
3 3
2 4 8
, , are in (0, 3), so they are the turning points in (0, 3)
3 3 3
d2 f
3. Take the 2nd derivative, = −π sin πx, get its signs at
dx2
the three turning points:
2 4 8
d2 f( ) 3π d2 f( ) 3π d2 f( ) 3π
3 3 3
 2 =− < 0, 2 = > 0, 2 =− <0
dx 2 dx 2 dx 2

18
Single Variable Case – Examples
1 1 2 4 8
 𝑓 𝑥 = 𝑥 + sin 𝑥, turning points: 𝑥 = , ,
2 𝜋 3 3 3
2 4 8
d2 f(3) 3π d2 f(3) 3π d2 f(3) 3π
 =− < 0, = > 0, =− <0
dx2 2 dx2 2 dx2 2

4. Let P = 43, 0, 3 , 𝑓 4
3
= 23 − 2𝜋3, 𝑓 0 = 0, 𝑓 3 = 32
5. We have 𝑓 3 > 𝑓 43 > 𝑓(0)
6. Since 0 (0,3), there does not exist a value that minimizes
the objective function in (0,3)
7. Let Q = 23, 83, 0,3 , 𝑓 2
3
= 13 + 2𝜋3, 𝑓 8
3
= 43 + 2𝜋3, 𝑓 0 =
0, 𝑓 3 = 32
8 2
8. We have 𝑓 3
>𝑓 3 >𝑓 3
> 𝑓(0)
8
9. Since 83 ∈ (0, 3), x = maximizes the objective function in
3
(0,3)
19
𝑑𝑓 23 𝑑𝑓 43 𝑑𝑓 83 𝑑 2 𝑓(23) 𝑑 2 𝑓(83) 𝑑 2 𝑓(43)
= = =0 2 <0 2 <0 2 >0
𝑑𝑥 𝑑𝑥 𝑑𝑥 𝑑𝑥 𝑑𝑥 𝑑𝑥

1 1
y= 2
𝑥+ sin 𝜋𝑥
𝜋

2 4 8
3 3 3

𝑇 = 0,3
20
Single variable case
 Observation:
 If we minimize the objective function that’s twice
differentiable in an open interval, then
 Either the objective function takes its minimum value at a turning
point, or
 There does not exist a value at which the objective function can
reach the minimum in the interval
 Similarly for maximizing

21
How to solve: Multiple Variable Case
 Problem:
 Objective function f(x1, …,xn) with domain U  Rn and
range S  R, where Rn is the nD space
 Find the value for x1,…,xn in T  U that maximizes (or
minimizes) f(x1,…,xn)
 Notations
 Each set of values for the n variables is a point in the
nD space
 A bold face lower case letter denotes a vector of
variables x = (x1,…,xn)
 Thus f(x1, …,xn) can be written as f(x)
 Method follows similar idea to single variable case,
but more complex
22
How to solve: Multiple Variable Case
𝜕f
 Take the 1st partial derivative of f w.r.t each xi,:
𝜕xi
for all i = 1,…, n, and then set to zero
 Solve the system of n equations
 Each set of roots is a turning point
 For each turning point, determine whether it is a
local minima or maxima
 Let P (Q) be the set of all the local minima
(maxima) and the border points
 Find points in P (Q) with the smallest (largest)
value for the objective function
23
How solve: Multiple Variable Case
 If they are in U, then they minimizes (maximizes)
the objective function
 If none of these points are in the domain, there
does not exist a point that can minimize (maximize)
the objective function

24
How solve: Multiple Variable Case
 Issue:
 For each turning point, how to determine if it is
a local minima or maxima
 The general solution is beyond the scope of this
course
 We consider only a special class of function:
convex or concave functions
 Also, we do the optimization only in an open
subspace

25
Multiple Variable Case: convex and
concave functions
 For a convex function in an open space, the following
holds true
 There is at most one turning point
 If a turning point exists, it minimizes the function
 The function can not be maximized
 For a concave function in an open space, the following
holds true
 There is at most one turning point
 If a turning point exists, it maximizes the function
 The function can not be minimized in the domain

26
Convex Functions - Examples
f(x) = x2-x-2 xR f(x) = 1/x x  R+

27
f(x,y) = (x-3)2 + (y-1)2 + 1

2
(3, 1, 1)
1
1

global minima: (3, 1)


1
2
3 4

28
Concave functions - Examples
f(x) = -x2+3x-2 x  R f(x) = logx x  R+

29
Concave functions - Examples
A concave function with two variables:
f(x) = -(x2 + y2) + 4 (x, y)  R+

30
Convex/concave Functions – Examples
 Many commonly used functions are in these
classes
 Linear: 𝑓(𝑥, 𝑦, 𝑧) = 𝑎𝑥 + 𝑏𝑦 + 𝑐𝑧 + 𝑑
 Both convex and concave
 Quadratic: 𝑓 𝑥 = 𝑎𝑥 2 + 𝑏𝑥 + 𝑐
 Convex if 𝑎 > 0, concave if 𝑎 < 0
 Quadratic on multiple variables:
 𝑓 𝑥, 𝑦 = 𝑎𝑥 2 + 𝑏𝑦 2 + 𝑐𝑥𝑦 + 𝑑𝑥 + 𝑒𝑦 + 𝑓
 Convex if a ≥ 0 and b ≥ 0
 Concave if a ≤ 0 and b ≤ 0
 Neither convex nor concave otherwise
 Exponential: 𝑒 𝑎𝑥+𝑏𝑦+𝑐 convex
 Logarithm: log 𝑥 concave
31
Convex/concave-How to Determine
 This issue will not be discussed in this course
 For specific problems, you will be told!

32
Multiple Variable Case: Examples
 Minimize/maximize
𝑓 𝑥, 𝑦 = (𝑥 − 3)2 +(𝑦 − 1)2 +1
in R2
 Rewrite it in standard form:
𝑓 𝑥, 𝑦 = 𝑥 2 + 𝑦 2 − 6𝑥 − 2𝑦 + 11
 Take partial derivatives and set to zero:
𝜕𝑓
= 2𝑥 − 6 = 0
𝜕𝑥
𝜕𝑓
= 2𝑦 − 2 = 0
𝜕𝑦
 Solve for x and y: x = 3, y = 1
 The turning point is (x, y) = (3, 1)
 Since f(x,y) is a convex function, and R2 is an open space,
(1, 3) is the global minima, and cannot be maximized

33
f(x,y) = x2 + y2 – 6x – 2y + 11

2
(3, 1, 1)
1
1

global minima: (3, 1)


1
2
3 4

34

You might also like