PrefixSpan Final
PrefixSpan Final
𝑆1 = ⟨ 𝑎, 𝑏 ,𝑐 , 𝑎 ⟩
Sequence database
𝑆2 = ⟨ 𝑎, 𝑏 ,𝑏 , 𝑐 ⟩
𝑆3 = ⟨ 𝑏 , 𝑐 , {𝑑}⟩
𝑆4 = ⟨ 𝑏 , 𝑎, 𝑏 ,
{𝑐}⟩
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
Step 1
PrefixSpan eliminates infrequent items:
𝑆1 = ⟨ 𝑎, 𝑏 ,𝑐 , 𝑎 ⟩
Sequence database
⟨ 𝑎 ⟩ support :
Result:
𝑆2 = ⟨ 𝑎, 𝑏 ,𝑏 , 𝑐 ⟩
𝑆3 = ⟨ 𝑏 , 𝑐 , {𝑑}⟩
3
⟨ 𝑏 ⟩ support :
𝑆4 = ⟨ 𝑏 , 𝑎, 𝑏 ,
{𝑐}⟩
4
⟨ 𝑐 ⟩ support :
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
4
⟨ 𝑑 ⟩ support :
1
Step 2 – Find patterns starting with ⟨ 𝑎
⟩
PrefixSpan does a database projection with ⟨ 𝑎
⟩: Sequence database
𝑆1 = ⟨ 𝑎, 𝑏 ,𝑐 , 𝑎 ⟩ ⟨ 𝑎 ⟩𝑆1 = ⟨ 𝑎, 𝑏 ,𝑐 , 𝑎 ⟩
Projected database of
𝑆2 = ⟨ 𝑎, 𝑏 ,𝑏 , 𝑐 ⟩ 𝑆2 = ⟨ 𝑎, 𝑏 ,𝑏 , 𝑐 ⟩
𝑆3 = ⟨ 𝑏 , 𝑐 , {𝑑}⟩ 𝑆3 = ⟨ 𝑏 , 𝑐 , {𝑑}⟩
𝑆4 = ⟨ 𝑏 , 𝑎, 𝑏 , 𝑆4 = ⟨ 𝑏 , 𝑎, 𝑏 ,
{𝑐}⟩ {𝑐}⟩
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
Step 3 – Find patterns starting with ⟨ 𝑎
⟩
PrefixSpan does a database projection with ⟨ 𝑎
⟩: Sequence database
𝑆1 = ⟨ 𝑎, 𝑏 ,𝑐 , 𝑎 ⟩ ⟨ 𝑎 ⟩𝑆1 = ⟨ _𝑏 , 𝑐 , 𝑎 ⟩
Projected database of
𝑆2 = ⟨ 𝑎, 𝑏 ,𝑏 , 𝑐 ⟩ 𝑆2 = ⟨ _𝑏 , 𝑏 , 𝑐 ⟩
𝑆3 = ⟨ 𝑏 , 𝑐 , {𝑑}⟩ 𝑆4 = ⟨ _𝑏 , {𝑐}⟩
𝑆4 = ⟨ 𝑏 , 𝑎, 𝑏 ,
{𝑐}⟩
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
Step 3 – Find patterns starting with ⟨ 𝑎
⟩
Projected database of ⟨ 𝑎 ⟩
𝑆1 = ⟨ _𝑏 , 𝑐 , 𝑎 ⟩
𝑆2 = ⟨ _𝑏 , 𝑏 , 𝑐 ⟩
𝑆4 = ⟨ _𝑏 , {𝑐}⟩
⟨ 𝑎 , {𝑎}⟩ support :
Result:
𝑚𝑖𝑛𝑠𝑢𝑝 =
1
3 ⟨ 𝑎 , {𝑏}⟩ support :
𝑎,
𝑏
support :
1
⟨ 𝑎 , {𝑐}⟩ support:
3
3
Step 4 – Find patterns starting with ⟨ 𝑎 ,
{𝑐}⟩
PrefixSpan does a database projection with ⟨ 𝑎 , 𝑐
⟩:
⟨ 𝑎 ⟩𝑆1 = Projected database of ⟨ 𝑎 ,
⟨ _𝑏 , 𝑐 , 𝑎 ⟩
Projected database of
{𝑐}⟩ 𝑆1 = ⟨ _𝑏 , 𝑐 ,
𝑆2 = ⟨ _𝑏 , 𝑏 , 𝑐 ⟩ 𝑎 ⟩
𝑆4 = ⟨ _𝑏 , {𝑐}⟩ 𝑆2 = ⟨ _𝑏 , 𝑏 ,
𝑐 ⟩
𝑆4 = ⟨ _𝑏 , {𝑐}⟩
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
Step 4 – Find patterns starting with ⟨ 𝑎 ,
{𝑐}⟩
PrefixSpan does a database projection with ⟨ 𝑎 , 𝑐
⟩:
⟨ 𝑎 ⟩𝑆1 = Projected database of ⟨ 𝑎 ,
⟨ _𝑏 , 𝑐 , 𝑎 ⟩
Projected database of
{𝑐}⟩ 𝑆 = ⟨ 𝑎 ⟩
𝑆2 = ⟨ _𝑏 , 𝑏 , 𝑐 ⟩ 1
𝑆4 = ⟨ _𝑏 , {𝑐}⟩
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
Step 4 – Find patterns starting with ⟨ 𝑎 ,
{𝑐}⟩
, 𝑐 ⟩ that has one more item:
Then, PrefixSpan counts the support of each sequential pattern
𝑎
starting with
Projected database of ⟨ 𝑎 ,
{𝑐}⟩ 𝑆 = ⟨ 𝑎 ⟩
1
⟨ 𝑎 , 𝑐 , {𝑎}⟩ support :
Result:
𝑚𝑖𝑛𝑠𝑢𝑝 =
1
3
Step 5 – Find patterns starting with ⟨ 𝑎, 𝑏
⟩
PrefixSpan does a database projection with ⟨{𝑎,
𝑏}⟩:
⟨ _𝑏 , of
𝑆1 = database 𝑐 ,⟨𝑎𝑎 ⟩⟩
𝑆2 = ⟨ _𝑏 , 𝑏 , 𝑐 ⟩
Projected
𝑆4 = ⟨ _𝑏 , {𝑐}⟩
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
Step 5 – Find patterns starting with ⟨ 𝑎, 𝑏
⟩
PrefixSpan does a database projection with ⟨{𝑎,
𝑏}⟩: database of
⟨ 𝑎 ⟩𝑆1 = Projected database of ⟨{𝑎,
⟨ _𝑏 , 𝑐 , 𝑎 ⟩
Projected
𝑏}⟩ S1 = ⟨ _ , 𝑐 , 𝑎 ⟩
𝑆2 = ⟨ _𝑏 , 𝑏 , 𝑐 ⟩ 𝑏
𝑆4 = ⟨ _𝑏 , {𝑐}⟩
𝑆2 = ⟨ _𝑏 , 𝑏 , 𝑐 ⟩
𝑆4 = ⟨ _𝑏 , {𝑐}⟩
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
Step 5 – Find patterns starting with ⟨ 𝑎, 𝑏
⟩
PrefixSpan does a database projection with ⟨{𝑎,
𝑏}⟩: database of
⟨ 𝑎 ⟩𝑆1 = Projected database of ⟨{𝑎,
⟨ _𝑏 , 𝑐 , 𝑎 ⟩
Projected
𝑏}⟩ 𝑆1 = ⟨ 𝑐, 𝑎 ⟩
𝑆2 = ⟨ _𝑏 , 𝑏 , 𝑐 ⟩
𝑆2 = ⟨ 𝑏 ,𝑐⟩
𝑆4 = ⟨ _𝑏 , {𝑐}⟩
𝑆4 = ⟨ {𝑐}⟩
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
Step 5 – Find patterns starting with ⟨ 𝑎, 𝑏
⟩
Then, PrefixSpan removes infrequent patterns:
⟨ 𝑎, 𝑏 , {𝑐}⟩ support
Result:
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
:3
⟨ 𝑎, 𝑏 , {𝑏}⟩
{𝑎}⟩ support
:1
𝑎, 𝑏 ,
{𝑐}
Step 6 – Find patterns starting with
𝑆4 = ⟨ {𝑐}⟩
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
𝑎, 𝑏 ,
{𝑐}
Step 6 – Find patterns starting with
Projected database of ⟨ 𝑎, 𝑏 ,
{𝑐}⟩ 𝑆 = ⟨ 𝑎 ⟩
1
⟨ 𝑎, 𝑏 , 𝑐 , {𝑎}⟩
Result:
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
support : 1
This pattern is
infrequent!
{𝑏}
Step 7 – Find patterns starting with
𝑆1 = ⟨ 𝑎, 𝑏 ,𝑐 , 𝑎 ⟩ ⟨{𝑏}⟩ 𝑆1 = ⟨ 𝑎, 𝑏 , 𝑐 ,
Sequence database Projected database of
𝑆2 = ⟨ 𝑎, 𝑏 ,𝑏 , 𝑐 ⟩ 𝑎 ⟩
𝑆2 = ⟨ 𝑎, 𝑏 , 𝑏 ,
𝑆3 = ⟨ 𝑏 , 𝑐 , {𝑑}⟩ 𝑐 ⟩
𝑆4 = ⟨ 𝑏 , 𝑎, 𝑏 , ⟨ 𝑏 , 𝑐,
{𝑐}⟩ 𝑆3 =
{𝑑}⟩
𝑚𝑖𝑛𝑠𝑢𝑝 = 𝑆4 = ⟨ 𝑏 , 𝑎, 𝑏 ,
3 {𝑐}⟩
{𝑏}
Step 7 – Find patterns starting with
𝑆1 = ⟨ 𝑎, 𝑏 ,𝑐 , 𝑎 ⟩ ⟨{𝑏}⟩ 𝑆1 = ⟨ 𝑐 , 𝑎 ⟩
Sequence database Projected database of
𝑆2 = ⟨ 𝑎, 𝑏 ,𝑏 , 𝑐 ⟩ 𝑆2 = ⟨ 𝑏 , 𝑐⟩
𝑆3 = ⟨ 𝑏 , 𝑐 , {𝑑}⟩ 𝑆3 = ⟨ 𝑐 , {𝑑}⟩
𝑆4 = ⟨ 𝑏 , 𝑎, 𝑏 , 𝑆4 = ⟨ 𝑎, 𝑏 , {𝑐}⟩
{𝑐}⟩
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
{𝑏}
Step 7 – Find patterns starting with
⟨{𝑏}⟩ 𝑆1 = ⟨ 𝑐 , 𝑎 ⟩
Projected database of
𝑆2 = ⟨ 𝑏 , 𝑐⟩
𝑆3 = ⟨ 𝑐 , {𝑑}⟩
𝑆4 = ⟨ 𝑎, 𝑏 , {𝑐}⟩
71
Step 8 – Find patterns starting with ⟨ 𝑏}, {𝑐
⟩
PrefixSpan does a database projection for⟨ 𝑏}, {𝑐
⟩:
Projected database of ⟨ 𝑏 ,
⟨{𝑏}⟩ 𝑆1 = ⟨ 𝑐 , 𝑎 ⟩ {𝑐}⟩ 𝑆1 = ⟨ 𝑎 ⟩
Projected database of
𝑆2 = ⟨ 𝑏 , 𝑐⟩ 𝑆3 = ⟨{𝑑}⟩
𝑆3 = ⟨ 𝑐 , {𝑑}⟩
𝑆4 = ⟨ 𝑎, 𝑏 , {𝑐}⟩
𝑚𝑖𝑛𝑠𝑢𝑝 =
3
Step 8 – Find patterns starting with ⟨ 𝑏}, {𝑐
⟩
starting with⟨ 𝑏}, {𝑐 ⟩ that has one more item:
Then, PrefixSpan counts the support of each sequential pattern
Projected database of ⟨ 𝑏 ,
{𝑐}⟩ 𝑆1 = ⟨ 𝑎 ⟩
𝑆3 = ⟨{𝑑}⟩
⟨ 𝑏 , 𝑐 , {𝑎}⟩ support :
Result:
1𝑏 , 𝑐 ,
𝑚𝑖𝑛𝑠𝑢𝑝 = 3 {𝑑}
support :
1
Final result:
Those are the frequent sequential
patterns:
⟨ 𝑎 ⟩ support : 3
• ⟨ 𝑏 ⟩ support : 4
•
• ⟨ 𝑐 ⟩ support : 4
• ⟨ 𝑎 , {𝑐}⟩ support: 3
• 𝑎, 𝑏
• ⟨ 𝑎, 𝑏 , {𝑐}⟩
support : 3
support : 3
• ⟨ 𝑏 , {𝑐}⟩
support : 3