Unit2_4

Concept Learning

The concept can be viewed as describing some subset of objects or events defined over a larger
set (e.g., the subset of animals that constitute birds). Alternatively, each concept can be thought of
as a boolean-valued function defined over this larger set (e.g., a function defined over all animals,
whose value is true for birds and false for other animals).

Concept learning - the task of inferring a boolean-valued function from training examples of its input and output.

A CONCEPT LEARNING TASK


Let us consider an example task of learning the target concept "days on which my friend Aldo
enjoys his favorite water sport." The attribute EnjoySport indicates whether or not Aldo enjoys
his favorite water sport on this day. The task is to learn to predict the value of EnjoySport for an
arbitrary day, based on the values of its other attributes.

What hypothesis representation shall we provide to the learner in this case? Let us consider a
simple representation in which each hypothesis consists of a conjunction of constraints on the
instance attributes. In particular, let each hypothesis be a vector of six constraints, specifying the
values of the six attributes Sky, AirTemp, Humidity, Wind, Water, and Forecast. For each attribute,
the hypothesis will either:
 indicate by a "?" that any value is acceptable for this attribute,
 specify a single required value (e.g., Warm) for the attribute, or
 indicate by a "Ø" that no value is acceptable.

If some instance x satisfies all the constraints of hypothesis h, then h classifies x as a positive
example (h(x) = 1). To illustrate, the hypothesis that Aldo enjoys his favorite sport only on cold
days with high humidity (independent of the values of the other attributes) is represented by
the expression -
(?, Cold, High, ?, ?, ?)

 The most general hypothesis, that every day is a positive example, is represented by
(?, ?, ?, ?, ?, ?)
 The most specific possible hypothesis, that no day is a positive example, is represented by
(Ø, Ø, Ø, Ø, Ø, Ø)

To summarize, the EnjoySport concept learning task requires learning the set of days for which EnjoySport = yes, describing this set by a conjunction of constraints over the instance attributes.
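The representation above can be sketched in Python, encoding a hypothesis as a tuple of constraints (with None standing in for the "Ø" symbol) and a small helper to test whether an instance satisfies all constraints. The names here (`satisfies`, `ATTRIBUTES`) are illustrative, not from the text.

```python
# A hypothesis is a tuple of six constraints, one per attribute.
# "?" accepts any value; None (standing in for Ø) accepts no value.
ATTRIBUTES = ["Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast"]

def satisfies(x, h):
    """Return True if instance x (a tuple of attribute values) meets
    every constraint of hypothesis h, i.e. h(x) = 1."""
    return all(c == "?" or c == v for c, v in zip(h, x))

# "Aldo enjoys his favorite sport only on cold days with high humidity":
h = ("?", "Cold", "High", "?", "?", "?")
print(satisfies(("Rainy", "Cold", "High", "Strong", "Warm", "Change"), h))  # True
print(satisfies(("Sunny", "Warm", "High", "Strong", "Warm", "Same"), h))    # False
```

Note that any hypothesis containing a None constraint classifies every instance as negative, matching the role of "Ø" above.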

Notation used in concept learning task-


Example: The EnjoySport concept learning task.
Given:
Instances X: Possible days, each described by the attributes –
-Sky (with possible values Sunny, Cloudy, and Rainy),
-Air Temp (with values Warm and Cold),
-Humidity (with values Normal and High),
-Wind (with values Strong and Weak),
-Water (with values Warm and Cool), and
-Forecast (with values Same and Change).
Hypotheses H: Each hypothesis is described by a conjunction of constraints on the attributes- Sky,
AirTemp, Humidity, Wind, Water, and Forecast.
The constraints may be "?" (any value is acceptable), "Ø" (no value is acceptable), or a specific value.
Target concept c: EnjoySport : X -> {0, 1}
Training examples D: Positive and negative examples of the target function.
Determine: A hypothesis h in H such that h(x) = c(x) for all x in X.

Inductive Learning Hypothesis – The concept learning task relies on the inductive learning hypothesis.
Definition: Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.
Points to consider:
•What are the implications?
•Is this reasonable?
•What (if any) are alternatives?
•What about concept drift (what if our views/tastes change over time)?

Concept Learning as search - Concept learning can be viewed as the task of searching through a large space of hypotheses with the goal of finding the hypothesis that best fits the training examples. For the given example, the instance space X contains exactly 3 X 2 X 2 X 2 X 2 X 2 = 96 distinct instances. A similar calculation shows that there are 5 X 4 X 4 X 4 X 4 X 4 = 5120 syntactically distinct hypotheses within H (each attribute admits its possible values plus "?" and "Ø"). Notice, however, that every hypothesis containing one or more "Ø" symbols represents the empty set of instances; that is, it classifies every instance as negative. Therefore, the number of semantically distinct hypotheses is only 1 + (4 X 3 X 3 X 3 X 3 X 3) = 973.
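The three counts above can be verified with a few lines of Python; the attribute value counts are taken from the task definition:

```python
# Number of values per attribute: Sky has 3, the other five have 2.
values_per_attribute = [3, 2, 2, 2, 2, 2]

instances = 1
for n in values_per_attribute:
    instances *= n          # 3*2*2*2*2*2 = 96 distinct instances

syntactic = 1
for n in values_per_attribute:
    syntactic *= n + 2      # each attribute also admits "?" and Ø: 5*4*4*4*4*4

semantic = 1                # the single all-Ø hypothesis (empty set of instances)
prod = 1
for n in values_per_attribute:
    prod *= n + 1           # "?" plus the real values: 4*3*3*3*3*3
semantic += prod            # 1 + 972 = 973

print(instances, syntactic, semantic)  # 96 5120 973
```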

General-to-specific ordering of Hypothesis-


This structure over the hypothesis space can be used to design learning algorithms that
exhaustively search even infinite hypothesis spaces without explicitly enumerating every
hypothesis. First, for any instance x in X and hypothesis h in H, we say that x satisfies h if and
only if h(x) = 1.
The more-general-than-or-equal-to relation can be defined in terms of the sets of instances that satisfy the two hypotheses: given hypotheses hj and hk, hj is more-general-than-or-equal-to hk if and only if any instance that satisfies hk also satisfies hj.

To illustrate the general-to-specific ordering, consider the two hypotheses


h1 = (Sunny, ?, ?, Strong, ?, ?)
h2 = (Sunny, ?, ?, ?, ?, ?)
h3 = (Sunny, ?, ?, ?, Cool, ?)
Here, h2 is more general than h1.

Here, hj is (strictly) more-general-than hk (written hj >g hk) if and only if (hj ≥g hk) and not (hk ≥g hj). Finally, we will sometimes find the inverse useful and will say that hj is more-specific-than hk when hk is more-general-than hj.
The hypothesis h2 is more general than h1 because every instance that satisfies h1 also satisfies h2. Similarly, h2 is more general than h3. Note that neither h1 nor h3 is more general than the other. The >g and ≥g relations are defined independently of the target concept; ≥g defines a partial order over the hypothesis space H (the relation is reflexive, antisymmetric, and transitive).
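The ≥g relation can be sketched directly from its definition, by enumerating the 96 instances and checking that every instance satisfying hk also satisfies hj. The helper names are illustrative, and h3 = (Sunny, ?, ?, ?, Cool, ?) is assumed here as the third hypothesis referenced above:

```python
from itertools import product

# Possible values per attribute, in the fixed order Sky .. Forecast.
VALUES = {
    "Sky": ["Sunny", "Cloudy", "Rainy"],
    "AirTemp": ["Warm", "Cold"],
    "Humidity": ["Normal", "High"],
    "Wind": ["Strong", "Weak"],
    "Water": ["Warm", "Cool"],
    "Forecast": ["Same", "Change"],
}

def satisfies(x, h):
    """h(x) = 1: instance x meets every constraint of hypothesis h."""
    return all(c == "?" or c == v for c, v in zip(h, x))

def more_general_or_equal(hj, hk):
    """hj >=g hk iff every instance that satisfies hk also satisfies hj."""
    return all(satisfies(x, hj)
               for x in product(*VALUES.values())
               if satisfies(x, hk))

h1 = ("Sunny", "?", "?", "Strong", "?", "?")
h2 = ("Sunny", "?", "?", "?", "?", "?")
h3 = ("Sunny", "?", "?", "?", "Cool", "?")

print(more_general_or_equal(h2, h1))  # True: h2 is more general than h1
print(more_general_or_equal(h1, h2))  # False
# h1 and h3 are incomparable: neither is more general than the other.
print(more_general_or_equal(h1, h3), more_general_or_equal(h3, h1))  # False False
```

Enumerating instances is feasible only for tiny spaces like this one; an attribute-by-attribute syntactic check is the usual implementation, but the semantic version matches the definition in the text exactly.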

FIND-S Algorithm-
FIND-S is a concept learning algorithm that takes advantage of this partial order to efficiently organize the search for a hypothesis that fits the training data.

1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
   For each attribute constraint ai in h
      If the constraint ai is satisfied by x
      Then do nothing
      Else replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h

How can we use the more-general-than partial ordering to organize the search for a hypothesis
consistent with the observed training examples?
One way is to begin with the most specific possible hypothesis in H, then generalize this hypothesis
each time it fails to cover an observed positive training example.
Ex. For the EnjoySport task, the first step of FIND-S is to initialize h to the most specific hypothesis in H:
h ← (Ø, Ø, Ø, Ø, Ø, Ø)
Upon observing the first (positive) training example, none of the "Ø" constraints in h are satisfied, so each is replaced by the next more general constraint that fits the example; namely, the attribute values of this training example:
h ← (Sunny, Warm, Normal, Strong, Warm, Same)

This h is still overly specific. The second (positive) training example forces the algorithm to generalize h further, replacing the Humidity constraint by "?":
h ← (Sunny, Warm, ?, Strong, Warm, Same)

Upon encountering the third training example, in this case a negative example, the algorithm makes no change to h: h already correctly classifies this example as negative, and hence no revision is needed.

To complete our trace of FIND-S, the fourth (positive) example leads to a further generalization of h:
h ← (Sunny, Warm, ?, Strong, ?, ?)

The search moves from hypothesis to hypothesis, from the most specific to progressively more general hypotheses along one chain of the partial ordering. At each stage, the hypothesis is the most specific hypothesis consistent with the training examples observed up to this point, hence the name FIND-S.
FIND-S is guaranteed to output the most specific hypothesis within H that is consistent with the
positive training examples. Its final hypothesis will also be consistent with the negative examples
provided the correct target concept is contained in H, and provided the training examples are
correct.
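The trace above can be reproduced with a short Python sketch of FIND-S. The four training examples are an assumption here (the standard EnjoySport data consistent with the hypotheses shown in the trace; the text itself only lists the resulting hypotheses), and the function name is illustrative:

```python
# Assumed EnjoySport training set: (instance, is_positive) pairs.
TRAINING = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]

def find_s(examples, n_attributes=6):
    h = [None] * n_attributes        # most specific hypothesis (all Ø)
    for x, positive in examples:
        if not positive:
            continue                 # FIND-S ignores negative examples
        for i, (constraint, value) in enumerate(zip(h, x)):
            if constraint is None:   # Ø: generalize to this specific value
                h[i] = value
            elif constraint != value:
                h[i] = "?"           # conflicting values: generalize to "?"
    return tuple(h)

print(find_s(TRAINING))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

Because the conjunctive hypothesis language has exactly one "next more general constraint" per step (a specific value above Ø, then "?"), the two branches inside the loop implement step 2 of the pseudocode directly.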

Drawbacks of FIND-S-

1. Has the learner converged to the correct target concept? Although FIND-S will find a
hypothesis consistent with the training data, it has no way to determine whether it has found
the only hypothesis in H consistent with the data (i.e., the correct target concept), or
whether there are many other consistent hypotheses as well. We would prefer a learning
algorithm that could determine whether it had converged and, if not, at least characterize
its uncertainty regarding the true identity of the target concept.
2. Why prefer the most specific hypothesis? It is unclear whether we should prefer this
hypothesis over, say, the most general, or some other hypothesis of intermediate generality.
3. Are the training examples consistent? We would prefer an algorithm that could at least
detect when the training data is inconsistent and, preferably, accommodate such errors.
4. What if there are several maximally specific consistent hypotheses? In the hypothesis
language H for the EnjoySport task, there is always a unique, most specific hypothesis
consistent with any set of positive examples. FIND-S must be extended to allow it to
backtrack on its choices of how to generalize the hypothesis, to accommodate the
possibility that the target concept lies along a different branch of the partial ordering than
the branch it has selected.
Furthermore, we can define hypothesis spaces for which there is no maximally specific
consistent hypothesis, although this is more of a theoretical issue than a practical one.
