Apriori Algorithm
Binu Jasim
Data Mining (Monsoon Sem-2014)
NIT Calicut
Association Analysis
To find associations between items/objects
Transaction Data
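Items (I): the set of all items
$I = \{ i_1, i_2, \ldots, i_d \}$
Transactions (T): the set of all transactions
$T = \{ tr_1, tr_2, \ldots, tr_N \}$
Each transaction is a subset of the items:
$tr_j \subseteq I$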
[Table: transactions tr1 to tr5 against items i1 to i5; cell values not preserved]
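A small sketch of how such transaction data might be held in code. The contents of the five transactions below are made up for illustration; only the names i1 to i5 and tr1 to tr5 come from the slide. Each transaction is stored as a set, and the same data is then viewed as a binary matrix with one row per transaction and one column per item.

```python
# Hypothetical data: the basket contents are illustrative only.
items = ["i1", "i2", "i3", "i4", "i5"]   # I, with d = 5
transactions = {                          # T, with N = 5
    "tr1": {"i1", "i2"},
    "tr2": {"i2", "i3", "i4"},
    "tr3": {"i1", "i3", "i4", "i5"},
    "tr4": {"i2", "i4"},
    "tr5": {"i1", "i2", "i3"},
}

# Every transaction must be a subset of the item set I.
assert all(tr <= set(items) for tr in transactions.values())

# Binary view: matrix[r][c] is 1 if transaction r contains item c, else 0.
matrix = [
    [1 if item in tr else 0 for item in items]
    for tr in transactions.values()
]

for name, row in zip(transactions, matrix):
    print(name, row)
```

The set form is convenient for subset tests, while the binary matrix matches the row/column layout of the table.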
Item Sets
An itemset is a collection of zero or more items
[Table: words (Word2, ..., Wordr) and documents (D1, D2, ...); cell values not preserved]
So A may be represented as 0111101010100111 (for a 4x4 image resolution)
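As a rough illustration, the 16-bit string can be unpacked back into a 4x4 grid. The slide does not say in which order the pixels were flattened; the sketch below assumes column-by-column order, under which the grid happens to resemble the letter A.

```python
bits = "0111101010100111"
side = 4  # 4x4 image, so 16 pixels in total

# Assumption: each group of 4 bits is one column, read top to bottom.
columns = [bits[i:i + side] for i in range(0, len(bits), side)]

# Rebuild the rows by taking the r-th pixel of every column.
rows = ["".join(col[r] for col in columns) for r in range(side)]

for row in rows:
    print(row.replace("0", ".").replace("1", "#"))
```

Each pixel position can then be treated like an item and the image like a transaction that contains the positions set to 1.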
Associations
{A} -> {B} indicates that item B is also bought if item A is bought
{A,B} -> {C} indicates that C is bought if items A and B are bought together
Total # Associations
If we have d items,
then total # Associations = $3^d - 2^{d+1} + 1$
Total # Associations
I = {A, B, C}
d=3
Total # Associations = $3^3 - 2^4 + 1 = 12$
{A} -> {B, C}
{A} -> {B}
{B} -> {A}
{A} -> {C}
{C} -> {A}
{B} -> {C}
{C} -> {B}
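The count can be checked by brute force. The sketch below enumerates every rule X -> Y with X and Y non-empty and disjoint, and compares the number found with $3^d - 2^{d+1} + 1$. The helper names `nonempty_subsets` and `all_rules` are made up for this example.

```python
from itertools import combinations

def nonempty_subsets(items):
    """Yield every non-empty subset of items as a frozenset."""
    ordered = sorted(items)
    for size in range(1, len(ordered) + 1):
        for combo in combinations(ordered, size):
            yield frozenset(combo)

def all_rules(items):
    """List every rule (X, Y) with X and Y non-empty and disjoint."""
    rules = []
    for antecedent in nonempty_subsets(items):
        remaining = set(items) - antecedent
        for consequent in nonempty_subsets(remaining):
            rules.append((antecedent, consequent))
    return rules

rules = all_rules({"A", "B", "C"})
for x, y in rules:
    print(sorted(x), "->", sorted(y))

d = 3
print(len(rules), 3**d - 2**(d + 1) + 1)
```

For I = {A, B, C} the enumeration includes the rules with two items on the left, such as {A, B} -> {C}, alongside the ones listed above.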
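Proof
Each item can go into one of 3 boxes: the left side X, the right side Y, or neither
This gives $3^d$ assignments
Of these, $2^d$ have an empty X and $2^d$ have an empty Y; the one assignment with both empty is counted twice
So total # Associations = $3^d - 2 \cdot 2^d + 1 = 3^d - 2^{d+1} + 1$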
Support Count ($\sigma$)
The support count of an itemset X is given as
$\sigma(X) = |\{ tr_j : X \subseteq tr_j \}|$
Eg: for transactions tr1 and tr2 [example itemsets not preserved]: $\sigma(\cdot) = 2$, $\sigma(\cdot) = 0$
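The support count translates directly into a subset test over the transactions. In the sketch below the two transactions and the item names are hypothetical, chosen only so that one itemset has support count 2 and another has 0; `support_count` is a name made up for this example.

```python
def support_count(itemset, transactions):
    """Count the transactions that contain every item of itemset."""
    itemset = set(itemset)
    return sum(1 for tr in transactions if itemset <= tr)

# Hypothetical transactions, for illustration only.
transactions = [
    {"Milk", "Bread"},            # tr1
    {"Milk", "Bread", "Butter"},  # tr2
]

print(support_count({"Milk", "Bread"}, transactions))  # contained in both
print(support_count({"Eggs"}, transactions))           # contained in none
```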
Support(X->Y) = $\sigma(X \cup Y) / N$
Confidence(X->Y) = $\sigma(X \cup Y) / \sigma(X)$
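A minimal sketch of both measures, assuming the usual definitions: support is $\sigma(X \cup Y) / N$ and confidence is $\sigma(X \cup Y) / \sigma(X)$. The five transactions and the function names are hypothetical.

```python
def support_count(itemset, transactions):
    """Count the transactions that contain every item of itemset."""
    itemset = set(itemset)
    return sum(1 for tr in transactions if itemset <= tr)

def rule_support(x, y, transactions):
    """Fraction of all transactions that contain both X and Y."""
    return support_count(set(x) | set(y), transactions) / len(transactions)

def rule_confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    count_x = support_count(x, transactions)
    if count_x == 0:
        return 0.0  # convention chosen here: X never occurs, so report 0
    return support_count(set(x) | set(y), transactions) / count_x

# Hypothetical transactions, for illustration only.
transactions = [
    {"Milk", "Bread"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter"},
    {"Milk", "Butter"},
    {"Milk", "Bread", "Butter"},
]

x, y = {"Milk", "Bread"}, {"Butter"}
print(rule_support(x, y, transactions))     # 2 of 5 transactions
print(rule_confidence(x, y, transactions))  # 2 of the 3 that contain X
```

Support measures how often the rule applies across all transactions, while confidence measures how often Y appears among the transactions that contain X.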
The Idea
If {Milk, Bread, Butter} is frequent,
then all of its subsets are also frequent
i.e. if support =
$\sigma(\{\text{Milk}, \text{Bread}, \text{Butter}\}) / N > \text{minsup}$
then $\sigma(\{\text{Milk}, \text{Bread}\}) / N > \text{minsup}$
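The property can be checked on a small example. In this sketch the transactions and the minsup value are hypothetical; it computes the support of {Milk, Bread, Butter} and of every proper non-empty subset, and confirms that no subset has lower support than the full itemset.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    itemset = set(itemset)
    return sum(1 for tr in transactions if itemset <= tr) / len(transactions)

# Hypothetical transactions and threshold, for illustration only.
transactions = [
    {"Milk", "Bread"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter"},
    {"Milk", "Butter"},
    {"Milk", "Bread", "Butter"},
]
minsup = 0.3

full = {"Milk", "Bread", "Butter"}
s_full = support(full, transactions)
print(sorted(full), s_full, s_full > minsup)

# Any transaction containing the full itemset also contains each subset,
# so a subset's support can never be smaller than s_full.
subset_supports = {}
for size in range(1, len(full)):
    for subset in combinations(sorted(full), size):
        subset_supports[subset] = support(subset, transactions)
        print(subset, subset_supports[subset], subset_supports[subset] > minsup)
```

Read the other way round, an itemset with an infrequent subset cannot itself be frequent, which is what lets Apriori discard candidates without counting them.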