Unit 4
For example, if customers buy milk, how likely are they to also
buy bread (and which kind of bread) on the same trip to the
supermarket? This information can increase sales by helping
retailers do selective marketing based on predictions, cross-sell,
and plan their shelf space for optimal product placement.
Now, think of the universe as the set of items available at the
store; each item then has a Boolean variable that represents the
presence or absence of that item. Each basket can then be
represented by a Boolean vector of values assigned to these
variables. The Boolean vectors can be analyzed for purchase
patterns that reflect items that are frequently associated or bought
together. Such patterns can be represented in the form of
association rules.
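The Boolean-vector view can be sketched in a few lines of Python. The item list and baskets below are made up for illustration, and to_boolean_vector is a hypothetical helper name, not something defined in these notes.

```python
from itertools import combinations

# Hypothetical item universe and baskets, invented for this sketch.
items = ["milk", "bread", "butter", "eggs"]
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "eggs"},
    {"milk", "eggs"},
]

def to_boolean_vector(basket, items):
    """Return one Boolean per item: True if the item is in the basket."""
    return [item in basket for item in items]

vectors = [to_boolean_vector(b, items) for b in baskets]

# Count how often each pair of items is present in the same vector.
pair_counts = {}
for vec in vectors:
    present = [items[i] for i, flag in enumerate(vec) if flag]
    for pair in combinations(present, 2):
        pair_counts[pair] = pair_counts.get(pair, 0) + 1
```

Pairs with a high count, such as ("milk", "bread") in this toy data, are the kind of pattern an association rule would describe.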
support(A ⇒ B) = P(A ∪ B)
confidence(A ⇒ B) = P(B | A)
                  = support(A ∪ B) / support(A)
                  = support_count(A ∪ B) / support_count(A)
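As a worked illustration of these formulas, the sketch below computes support and confidence for the rule milk ⇒ bread. Here P(A ∪ B) is read as the fraction of transactions that contain every item of both A and B. The transactions and the function names are assumptions made for the example.

```python
# Hypothetical transactions, invented for this sketch.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "eggs"},
    {"milk", "eggs"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return support_count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support_count(A ∪ B) / support_count(A); A must occur at least once."""
    both = support_count(antecedent | consequent, transactions)
    return both / support_count(antecedent, transactions)

rule_support = support({"milk", "bread"}, transactions)
rule_confidence = confidence({"milk"}, {"bread"}, transactions)
```

In this toy data milk and bread appear together in 2 of 4 transactions and milk appears in 3, so the support is 2/4 and the confidence is 2/3.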
Apriori Algorithm
AIS
SETM Algorithm
FP Growth
1. Apriori Algorithm
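A minimal sketch of the level-wise idea usually associated with Apriori: build candidate k-itemsets from the frequent (k-1)-itemsets, discard any candidate that has an infrequent subset, then count support. The transactions, the min_count threshold, and the function name are assumptions for illustration; this is not an optimized implementation.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {frozenset: support count} for itemsets meeting min_count."""
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: scan the transactions for each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result

# Hypothetical transactions, invented for this sketch.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "eggs"},
    {"milk", "eggs"},
]
frequent_itemsets = apriori(transactions, min_count=2)
```

With min_count=2 on this toy data, the only frequent pair is {milk, bread}; butter is dropped at the first level, so no candidate containing it is ever counted.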
2. AIS Algorithm
Advantage: The AIS algorithm can be used to find whether or not
there is an association between items.
Disadvantage: The main disadvantage of the AIS algorithm is that
it generates too many candidate itemsets that turn out to be small
(that is, infrequent). It also requires additional data structures
to be maintained.
3. SETM Algorithm
4. FP Growth
For example: [figure not reproduced; source: i.imgur]