
Chapter 5

Association Rule Generation

This chapter proposes algorithms for generating association rules from association
patterns (frequent itemsets). The notion of a Valid Rule Pattern is introduced in Section
5.2. Section 5.3 discusses the issue of pruning misleading association rules. In
Section 5.4 an improved algorithm for generating strong and useful association rules
is proposed. Section 5.5 gives algorithms for constraint-based mining, while
Section 5.6 proposes an algorithm for generating representative rules.

5.1 Introduction

Generation of strong association rules from association patterns is easy and
straightforward compared to the mining of association patterns. As stated in Chapter 1
(Section 1.4), strong association rules are those which satisfy user-defined thresholds of
minimum support and minimum confidence. In this chapter, the association rule
algorithm given in [4] has been modified and a new algorithm for generating
representative rules has been developed to incorporate the following:

• It is hypothesized (explained in Section 5.2) that there would be certain rules
generated from the association patterns, particularly in dense databases,
which will be valid without requiring further testing, in the process reducing
the time taken in rule generation.

• Some of the association rules generated by the algorithms described in [1,4]
may be misleading and thus need to be filtered out. Correlation analysis
enables identification of such rules.

• It has been observed by various researchers that most of the generated rules
are not useful, as they can be derived from other rules. Algorithms for
generating representative rules (RR) have been proposed in the literature [79].
However, these algorithms seem to be inefficient in the case of dense data.
To handle such situations, an algorithm for generating representative rules
has been proposed in this work.

5.2 Valid Rule Patterns

Existing algorithms of association rule generation generate and test all possible rules
for each l ∈ L. In this section a property (Property 5.1) is proposed and proved
for identifying frequent itemsets from which association rules can be generated directly,
without further testing the validity of such rules. Such patterns have been christened
Valid Rule Patterns (VRP) in this work.

Property 5.1:

If the support of a frequent itemset is equal to or greater than the minimum
confidence required for the rule, then all association rules generated from such an
itemset or from its subsets will have confidence at least equal to the minimum
required confidence.

Proof:

Let ‘ABC’ be a frequent itemset with a support of 70% (say the minimum support and
minimum confidence thresholds are 30% and 60% respectively).

Let the generated rule be,

(subset of ABC) ⇒ (ABC – subset of ABC). ………(R5.1)

Then the confidence of the above generated rule is given by:

Confidence (C) = ‖ABC‖ ⁄ ‖subset of ABC‖

where ‖X‖ represents the support of X (in %) in the given data.

But,

‖ABC‖ ≤ ‖subset of ABC‖ ≤ 100

Thus, the value of ‖subset of ABC‖ will always be equal to or greater than the value of
‖ABC‖ and equal to or less than 100. For the two possible extreme values of ‖subset of
ABC‖ (Case I and Case II), the confidence of the above rule will be as follows:

• Case I

Let the value of ‖subset of ABC‖ be equal to the value of ‖ABC‖.

In this case the confidence of Rule (R5.1) will be 100%, which is greater than the
minimum required confidence, that is 60%.

• Case II

Let the value of subset of ABCbe equal to 100%.

In this case confidence of Rule (R5.1) will be equal to the value of ABCthat
is, 60%, which is equal to the minimum required confidence.

Now consider the case when the value of ‖subset of ABC‖ lies between ‖ABC‖
and 100%.

• Case III

Let the value of ‖subset of ABC‖ be between the value of ‖ABC‖ and 100%.

In this situation the confidence of Rule (R5.1) will lie between the two extremes
discussed above in Case I and Case II, and hence will be greater than
the minimum required confidence of the rule.

Further, note that each subset of ‘ABC’ will also be a VRP and will always generate
strong association rules. The above discussion establishes Property 5.1, which
implies that there is no need to test the validity of any rule generated from a Valid
Rule Pattern or its subsets, as such rules are always strong association rules.
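Property 5.1 can be checked mechanically. The following Python sketch uses hypothetical support values consistent with the ‘ABC’ example above; the helper rule_confidences() is illustrative, not part of the proposed algorithm. It confirms that every rule generated from a VRP meets the confidence threshold:

```python
from itertools import combinations

# Hypothetical supports (in %) consistent with the 'ABC' example above.
support = {
    frozenset("ABC"): 70,
    frozenset("AB"): 80, frozenset("AC"): 75, frozenset("BC"): 72,
    frozenset("A"): 90, frozenset("B"): 85, frozenset("C"): 80,
}
min_conf = 60  # minimum confidence threshold (%)

def rule_confidences(itemset):
    """Confidence (%) of every rule (antecedent => rest) from itemset."""
    items = frozenset(itemset)
    confs = []
    for r in range(1, len(items)):
        for ante in combinations(sorted(items), r):
            confs.append(100 * support[items] / support[frozenset(ante)])
    return confs

# 'ABC' is a VRP: its support (70%) >= min_conf (60%), so every rule is strong.
assert support[frozenset("ABC")] >= min_conf
assert all(c >= min_conf for c in rule_confidences("ABC"))
```

Since the support of any antecedent lies between ‖ABC‖ and 100, every computed confidence is bounded below by the itemset's own support, exactly as the proof argues.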

Usefulness of Property 5.1

Consider the transaction data shown below in Table 5.1. Assume min_supp = 50%
and min_conf = 50%.

Table 5.1: A transaction data

Tid   Items
100   A, C, E
200   B, D
300   A, B, C, D
400   B, C, D

Table 5.2 shows all the frequent itemsets (along with their actual support) present in
the given data at the specified value of minimum support.

Table 5.2: Frequent itemsets, L

Frequent Itemset (l)   Support (S)
{A}          50%
{B}          75%
{C}          75%
{D}          75%
{A, C}       50%
{B, C}       50%
{B, D}       75%
{C, D}       50%
{B, C, D}    50%

Note that in the above table, the support of each frequent itemset (l) is equal to or greater
than the value of min_conf. Hence, all these frequent itemsets are Valid Rule Patterns.
Thus, according to Property 5.1, association rules can be obtained directly (without
testing the validity of the rules) from these frequent itemsets. Table 5.3 shows all strong
association rules generated from the Valid Rule Patterns of size two or more, along with
their actual support (S) and confidence (C). Note that all generated rules are strong
association rules, as suggested by Property 5.1.

Table 5.3: Generated association rules from the Valid Rule Patterns

Rule No   Rule        Support (S)   Confidence (C)
Rule 1    B, C ⇒ D    50%           100%
Rule 2    B, D ⇒ C    50%           66.7%
Rule 3    C, D ⇒ B    50%           100%
Rule 4    B ⇒ C, D    50%           66.7%
Rule 5    C ⇒ B, D    50%           66.7%
Rule 6    D ⇒ B, C    50%           66.7%
Rule 7    A ⇒ C       50%           100%
Rule 8    C ⇒ A       50%           66.7%
Rule 9    B ⇒ C       50%           66.7%
Rule 10   C ⇒ B       50%           66.7%
Rule 11   B ⇒ D       75%           100%
Rule 12   D ⇒ B       75%           100%
Rule 13   C ⇒ D       50%           66.7%
Rule 14   D ⇒ C       50%           66.7%
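As a check on the rules above, those obtainable from ‘BCD’ and its 2-item subsets can be enumerated directly from the transactions of Table 5.1. The sketch below (with illustrative helper names) verifies that every such rule clears min_conf, as Property 5.1 predicts:

```python
from itertools import combinations

# Transactions from Table 5.1.
transactions = [
    {"A", "C", "E"},
    {"B", "D"},
    {"A", "B", "C", "D"},
    {"B", "C", "D"},
]
n = len(transactions)

def supp(itemset):
    """Support of itemset: fraction of transactions containing it."""
    return sum(itemset <= t for t in transactions) / n

min_conf = 0.5

# Enumerate every rule from the VRP 'BCD' and its 2-item subsets.
rules = []
for size in (3, 2):
    for sub in combinations("BCD", size):
        items = frozenset(sub)
        for r in range(1, len(items)):
            for ante in map(frozenset, combinations(sorted(items), r)):
                conf = supp(items) / supp(ante)
                rules.append((ante, items - ante, conf))

# Property 5.1: every rule from a VRP is strong; no confidence test was needed.
assert all(conf >= min_conf for _, _, conf in rules)
```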

5.3 Strong Association Rules are not Always Useful

In some cases the support-confidence framework, suggested by Agrawal et al. [1], is not
sufficient to find interesting rules. This can be explained by the following example:

Consider the following rule, derived in the previous example (Rule 10 of Table 5.3).

C ⇒ B    S = 50%    C = 66.7% ………(R5.2)

Based on the support-confidence framework and the definition of a strong association
rule, the above rule is a strong association rule. However, this rule is misleading, since
the overall support of ‘B’ is 75% (Table 5.2), greater than the rule’s confidence of
66.7%. In other words, a customer who buys item ‘C’ is less likely to buy item ‘B’
than a customer about whom we have no information. In reality there is a negative
dependence between buying ‘C’ and buying ‘B’. Hence, such rules are not useful.

Thus, association rules should not be used directly for prediction without further
analysis. They do not necessarily indicate causation. They are, however, a starting
point for further exploration, making them a popular tool for understanding data.

5.3.1 Correlation Analysis

To filter out misleading strong association rules (like (R5.2)), an objective
interestingness criterion (based on the statistics behind the data) can be used as one
step towards weeding out uninteresting rules before presentation to the user.
There is thus a need to study how the two events x and y are correlated.

Definition 5.1:

Interestingness of events x, y can be defined as,

I(x, y) = p(x, y) / (p(x) · p(y))

where p(x) is the probability of event x.

The fact that the Interestingness of events x and y is less than one indicates negative
correlation. In the above example the Interestingness of C ⇒ B is:

I(C, B) = p(B ∪ C) / (p(B) · p(C))
        = supp(B ∪ C) / (supp(B) · supp(C))
        = (2/4) / (3/4 × 3/4)
        = 0.89 < 1

This indicates buying ‘C’ is negatively associated with buying ‘B’. Hence, this rule is
not interesting enough to be reported to the user.
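Definition 5.1 is straightforward to compute; the following sketch reproduces the calculation for C ⇒ B (the function name interestingness() is illustrative):

```python
def interestingness(supp_xy, supp_x, supp_y):
    """I(x, y) = p(x, y) / (p(x) * p(y)); a value below 1 indicates
    negative correlation between x and y."""
    return supp_xy / (supp_x * supp_y)

# Rule C => B from Table 5.3: supp(B u C) = 2/4, supp(B) = supp(C) = 3/4.
i_cb = interestingness(2 / 4, 3 / 4, 3 / 4)
assert round(i_cb, 2) == 0.89
assert i_cb < 1  # negatively correlated, so the rule is filtered out
```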

5.4 Improved Algorithm for Association Rule Generation

In light of the above, the algorithm proposed in [4] can be improved. The improved
algorithm is given below.

Algorithm: Improved algorithm for association rule generation

Input:
Frequent itemsets, L
Minimum confidence threshold, min_conf

Output:
All interesting and strong association rules, ℜI

Process:
begin

1. Divide all given frequent itemsets into two categories: VRP and non-VRP.

2. Generate strong association rules from the frequent itemsets of the non-VRP
category by using the algorithm proposed in [4], and generate strong association
rules from the frequent itemsets of the VRP category directly, as described
above. Let the set of generated strong association rules be ℜ.

3. ℜI = φ;

4. forall r : (X ⇒ Y) ∈ ℜ do
       I(r) = supp(X ∪ Y) / (supp(X) * supp(Y));
       if I(r) ≥ 1 then
           ℜI = ℜI ∪ {r};
       endif;
   endfor;
end;
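A possible realization of the improved algorithm, assuming frequent itemsets are given as frozensets with a support table (all names are illustrative, not the author's implementation):

```python
from itertools import combinations

def improved_rule_gen(L, supp, min_conf):
    """Sketch of the improved algorithm: rules from VRPs are emitted
    without a confidence test (Property 5.1); every surviving rule is
    then screened by the interestingness measure I(r) >= 1.

    L        : iterable of frequent itemsets (frozensets)
    supp     : dict mapping frozenset -> support (fraction)
    min_conf : minimum confidence threshold"""
    strong = []
    for items in L:
        if len(items) < 2:
            continue
        is_vrp = supp[items] >= min_conf  # Property 5.1
        for r in range(1, len(items)):
            for ante in map(frozenset, combinations(sorted(items), r)):
                conf = supp[items] / supp[ante]
                if is_vrp or conf >= min_conf:  # VRP rules need no test
                    strong.append((ante, items - ante, conf))
    # Correlation filter: keep only rules with I(r) >= 1.
    return [(x, y, c) for x, y, c in strong
            if supp[x | y] / (supp[x] * supp[y]) >= 1]
```

With the Table 5.1 supports, this keeps B ⇒ D and D ⇒ B (I ≈ 1.33) while discarding B ⇒ C and C ⇒ B (I ≈ 0.89).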

5.5 Algorithms for Constraint-based Rule Mining

In this section, algorithms are proposed for constraint-based rule mining to
discover more interesting rules. As stated earlier, the user can impose constraints either
before mining or during post-mining, depending upon his/her domain knowledge.
Accordingly, two separate algorithms are given below.

Algorithm: Algorithm for constraint-based association rule mining

Input:
Database, D
Minimum support, min_supp
Minimum confidence, min_conf
A set of constraints on rules, Ĉ

Output:
The association rules, ℜĈ satisfying thresholds and constraints

Process:
begin

1. Generate the frequent itemsets, L, satisfying the min_supp threshold by using any
algorithm proposed in Chapter 4.

2. Generate association rules, ℜI satisfying the min_conf threshold by using
algorithm proposed in Section 5.4.

3. Obtain association rules ℜĈ from ℜI satisfying constraints Ĉ.

   ℜĈ = prun_rules(ℜI, Ĉ);
end;

function prun_rules(ℜI , Ĉ)

//This function returns those rules in ℜI satisfying constraints, Ĉ.

ℜĈ = ℜI;
forall r ∈ ℜĈ do
    if r does not satisfy Ĉ then
        ℜĈ = ℜĈ \ {r};
    endif;
endfor;
return ℜĈ;

The above algorithm is simple but inefficient. Note that constraint checking is not
performed until all the rules satisfying the thresholds have been generated. This
obviously involves a lot of unnecessary processing.
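The post-mining step amounts to a filter over the generated rules. A minimal sketch, with a hypothetical constraint on the consequent:

```python
def prune_rules(rules, constraint):
    """Post-mining filter: keep only rules satisfying the constraint,
    a predicate on (antecedent, consequent) pairs. The example
    constraint below ("consequent must mention item D") is hypothetical."""
    return [r for r in rules if constraint(r)]

rules = [
    (frozenset("BC"), frozenset("D")),
    (frozenset("BD"), frozenset("C")),
    (frozenset("B"), frozenset("CD")),
]
kept = prune_rules(rules, lambda r: "D" in r[1])
assert kept == [rules[0], rules[2]]
```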

5.5.1 Anti-monotone Property of Constraints

This property states that if a set S violates a constraint, then any superset of S violates
the constraint as well. The formal definition is as follows:

Definition 5.2:

A constraint Ĉ is anti-monotone iff for any set S,

S does not satisfy Ĉ ⇒ ∀ S’ ⊇ S, S’ does not satisfy Ĉ

Based on this property, mining can be improved by integrating anti-monotone
constraints deeply into frequent itemset generation. An algorithm incorporating
this idea is given below.

Algorithm: Improving mining process with anti-monotone constraints.

Input:
Database, D
Minimum support, min_supp
Minimum confidence, min_conf
A set of anti-monotone constraints, Ĉam

Output:
Association rules, ℜĈam satisfying thresholds and constraints

Process:
begin
1. k = 1; L = φ;
2. Find the 1-candidate itemset, C1;
3. Prune C1 based on given constraints Ĉam,
impose_constraint(C1, Ĉam);

4. L1 = identify_frequent(C1, min_supp);
// This function returns 1-frequent itemset, L1

5. Repeat
       k = k + 1;
       Generate the k-candidate itemset Ck from Lk-1;
       impose_constraint(Ck, Ĉam); //Prune Ck based on constraints Ĉam
       Lk = identify_frequent(Ck, min_supp);
       L = L ∪ Lk;
   Until Lk is empty;

6. Generate association rules, ℜĈam, from the frequent itemsets, L, by using the
algorithm proposed in Section 5.4.

function impose_constraint(Ck, Ĉam)

//This function filters out all those itemsets in Ck which do not satisfy constraints Ĉam.

for each itemset c ∈ Ck do
    if c does not satisfy Ĉam then
        Ck = Ck \ {c}; //Filter out c from Ck
    endif;
endfor;

function identify_frequent(Ck, min_supp)

//This function generates k-frequent itemset, Lk from k-candidate itemset, Ck.

Lk = φ;
for each itemset c ∈ Ck do
    S(c) = supp(c); //computes the support for c by using Property 4.1
    if S(c) ≥ min_supp then
        Lk = Lk ∪ {c};
    endif;
endfor;
return Lk;

Note that the above algorithm performs the constraint check on candidate itemsets
before the support computation. The anti-monotone property guarantees that any
k-itemset satisfying the constraint must be derived from (k-1)-itemsets that also
satisfy it; candidates failing the check can therefore never contribute to the result.
Thus, it can prune away a significant number of candidates that would otherwise
require support counting.
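The idea of pushing an anti-monotone constraint into candidate generation can be sketched as follows (a simplified Apriori loop; helper names and the max-length constraint are illustrative assumptions, not the thesis pseudocode):

```python
def constrained_apriori(transactions, min_supp, satisfies):
    """Sketch of Apriori with an anti-monotone constraint pushed into
    candidate generation: candidates failing the constraint are pruned
    before their support is counted. `satisfies` is an anti-monotone
    predicate on itemsets, e.g. a maximum-length constraint."""
    n = len(transactions)

    def frequent(cands):
        return [c for c in cands
                if sum(c <= t for t in transactions) / n >= min_supp]

    items = sorted({i for t in transactions for i in t})
    Ck = [frozenset([i]) for i in items if satisfies(frozenset([i]))]
    L, Lk = [], frequent(Ck)
    while Lk:
        L.extend(Lk)
        prev = set(Lk)
        # Join step: combine (k-1)-frequent itemsets differing in one item.
        Ck = {a | b for a in prev for b in prev if len(a | b) == len(a) + 1}
        Ck = [c for c in Ck if satisfies(c)]  # constraint check first
        Lk = frequent(Ck)
    return L

# Table 5.1 data with a hypothetical max-length-2 constraint:
transactions = [{"A", "C", "E"}, {"B", "D"}, {"A", "B", "C", "D"}, {"B", "C", "D"}]
L = constrained_apriori(transactions, 0.5, lambda s: len(s) <= 2)
assert frozenset({"B", "D"}) in L and frozenset("BCD") not in L
```

Because the constraint is anti-monotone, discarding a candidate here can never lose an itemset that would later satisfy it.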

5.6 Representative Association Rules

5.6.1 Generation of Representative Rules from Maxpatterns

The function genallrepresentative_max() is proposed for generating representative
rules from maxpatterns. In contrast to the approach of [79], the proposed algorithm
starts its search from the rule with the minimum condition and maximum consequent,
and extends the conditions when no representative rule is found for the existing
conditions. The following properties are used in the algorithm:

Property 5.2:

If an association rule r is longer than an association rule r ’ then r ’ can’t be derived

from r.

Property 5.3:

If an association rule r : (X ⇒ Y) is shorter than an association rule r’ : (X’ ⇒ Y’),
then r can be derived from r’ iff X ∪ Y ⊂ X’ ∪ Y’ and X ⊇ X’.

Property 5.4:

If r : (X ⇒ Y) and r’ : (X’ ⇒ Y’) are different association rules of the same size, then r
can be derived from r’ iff X ∪ Y = X’ ∪ Y’ and X ⊃ X’.
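Properties 5.2 to 5.4 can be combined into a single derivability test. The sketch below represents a rule as an (antecedent, consequent) pair of frozensets; the function derivable() is a hypothetical encoding, not the thesis pseudocode:

```python
def derivable(r, r_prime):
    """Can rule r : X => Y be derived from r' : X' => Y'?
    Encodes Properties 5.2-5.4; each rule is an (antecedent, consequent)
    pair of frozensets."""
    (x, y), (xp, yp) = r, r_prime
    if len(x | y) > len(xp | yp):
        return False              # Property 5.2: not from a shorter rule
    if len(x | y) < len(xp | yp):
        # Property 5.3: base strictly contained, antecedent covers X'
        return (x | y) < (xp | yp) and x >= xp
    # Property 5.4: same size, same base, strictly larger antecedent
    return (x | y) == (xp | yp) and x > xp

# B, C => D is derivable from B => C, D (same base, larger antecedent).
assert derivable((frozenset("BC"), frozenset("D")),
                 (frozenset("B"), frozenset("CD")))
```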

function genallrepresentative_max(L, Lmax)

RR = φ;
MaxK = the cardinality of a set in L with maximal number of items;

/*Loop1*/
for k = MaxK to 2 do
if Lmaxk ≠ φ then
RR = RR ∪ genk_representativepattern(RR, Lmaxk , k);
endif;
endfor;
return RR;

The genallrepresentative_max() function controls the generation of the representative
association rules, RR, from the maxpatterns (see Loop 1). Firstly, the longest
representative association rules (starting from size MaxK) are found and added to RR
by genk_representativepattern(). Then representative association rules shorter by one
are found and added to RR, and so on. Finally, representative association 2-rules are
generated and added to RR. The computed set RR is the set of representative
association rules.

function genk_representativepattern(RR, Lmaxk, k)

Rk = φ;
TIRk = φ;

/*Loop 2 */
forall l ∈ Lmaxk do

RR’ = {r ∈ RR| l ⊂ r.antecedent ∪ r.consequent};

RR” = RR’;
A1 = {{l[1]}, {l[2]}, …, {l[k]}};
IRk = φ;
/* Loop 3*/
forall a ∈ A1 do
    if (| l | / |a| ≥ min_conf AND NOT(isderivative(RR’, a ⇒ (l \ a)))) then
        Rk = Rk ∪ {a ⇒ (l \ a)};
        RR” = RR” ∪ {a ⇒ (l \ a)};
    else
        IRk = IRk ∪ {a ⇒ (l \ a)};
    endif;
endfor;
Rk = Rk ∪ gen_fixedsizerule(IRk, RR”);
TIRk = TIRk ∪ IRk;
endfor;
Lmaxk-1 = Lmaxk-1 ∪ genpatternk-1(TIRk);
return Rk;

The input parameters of genk_representativepattern() are RR, Lmaxk, and k, while it
returns Rk. Firstly, Rk and TIRk are initialized to null. Let l be a considered itemset in
Lmaxk. The representative association k-rules are generated from this itemset as follows
(Loop 2):

The sets RR’ and A1 are created, and IRk is initialized to null. RR’ is the set of
representative association rules longer than k which contain all items occurring in l. A1
is assigned the family of single-item sets created from the items of l. Loop 3 then starts.
Each itemset a ∈ A1 is treated as the antecedent of a candidate k-rule a ⇒ l \ a. The
confidence of each candidate rule is computed in order to check whether it is a strong
association rule; the support of the rule equals the support of l, while the support of the
antecedent a is found as the support of the respective itemset in L. If the rule’s
confidence is equal to or greater than min_conf, the representativeness of the rule is
tested by the function isderivative(), which is based on Properties 5.2 to 5.4; otherwise
the invalid rule is placed in IRk as well as in TIRk. If the tested rule is representative, it
is added to Rk. Loop 3 is repeated for each element of A1. Next, all representative rules
of size k with longer antecedents are generated from IRk using gen_fixedsizerule().
Loop 2 is repeated for each itemset. Then, using genpatternk-1(), all subsets of size k-1
are generated from each member of TIRk and added to Lmaxk-1. At last, Rk is returned.

function gen_fixedsizerule(IRk, RR”)

n = |IRk|; //number of invalid rules of size k

if n > 1 AND no_item(IRk[1].consequent) > 1 then
    R’k = φ; IR’k = φ;
    for i = 1 to n-1 do
        r = IRk[i];
        for j = i + 1 to n do
            r’ = IRk[j];
            if (left(r.antecedent, no_item(r.antecedent)-1) =
                left(r’.antecedent, no_item(r’.antecedent)-1)) then
                c = (r.antecedent ∪ r’.antecedent) ⇒ (r.consequent ∩ r’.consequent);
                if (|c| / |c.antecedent| ≥ min_conf) then
                    if NOT(isderivative(RR”, c)) then
                        R’k = R’k ∪ {c};
                    endif;
                else
                    if no_item(c.consequent) > 1 then
                        IR’k = IR’k ∪ {c};
                    endif;
                endif;
            endif;
        endfor;
    endfor;
else
    return φ;
endif;
R’k = R’k ∪ gen_fixedsizerule(IR’k, RR”);
return R’k;

The brief description of gen_fixedsizerule() is given below.

The input parameters to this function are IRk and RR”, while it returns R’k
(representative rules of fixed size k having antecedents of size between 2 and k-1).
Firstly, it finds the number of invalid rules in IRk and assigns the value to n. If n and
the number of items in IRk[1].consequent are both greater than 1, the antecedents of
pairs of invalid rules sharing a common prefix are combined to obtain candidate rules
with antecedents one item larger; the consequent of such a candidate is the intersection
of the consequents of the corresponding invalid rules. Each generated candidate rule is
tested for validity. If it is not valid, it is placed in IR’k; otherwise it is tested for
representativeness, and if it passes this test it is added to R’k. The function then calls
itself recursively on IR’k. At last, it returns R’k.
function genpatternk-1(TIRk)

Pk -1 = φ;
n = |TIRk|;

/* Loop 4 */
for i = 1 to n do
    cons = TIRk[i].consequent;
    sizecons = no_item(cons);
    S = subset(cons, sizecons - 1);

    /* Loop 5 */
    forall s ∈ S with (TIRk[i].antecedent ∪ s) ∉ Pk-1 do
        Pk-1 = Pk-1 ∪ {TIRk[i].antecedent ∪ s};
    endfor;
endfor;
return Pk-1;

The brief description of genpatternk-1() is given below.

TIRk is the input to the function, while it returns Pk-1. Firstly, it initializes Pk-1 to null
and assigns the number of invalid rules in TIRk to n. Patterns of size lower by one, i.e.,
of size k-1, are then generated from all elements of TIRk (Loop 4). Each generated
subset is tested for presence in Pk-1 (Loop 5); if not already present, it is appended to
Pk-1. At last, it returns Pk-1.
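The (k-1)-pattern generation performed by genpattern can be sketched as follows (rules as (antecedent, consequent) frozenset pairs; the names are illustrative):

```python
from itertools import combinations

def gen_pattern_k_minus_1(invalid_rules):
    """Sketch of genpattern: from each invalid k-rule (antecedent,
    consequent), build the (k-1)-patterns obtained by dropping one item
    from the consequent, skipping duplicates."""
    patterns = set()
    for ante, cons in invalid_rules:
        for sub in combinations(sorted(cons), len(cons) - 1):
            patterns.add(ante | frozenset(sub))
    return patterns

# One invalid 3-rule B => C, D yields the 2-patterns {B, C} and {B, D}.
tir = [(frozenset("B"), frozenset("CD"))]
assert gen_pattern_k_minus_1(tir) == {frozenset("BC"), frozenset("BD")}
```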

function isderivative(RR’, c)

RR’ = RR’ ordered by no_item(antecedent) + no_item(consequent) desc,
      no_item(consequent) asc;

forall r’ ∈ RR’ do
    length = |r’|;
    if length ≥ |c| then
        if r’.antecedent = (r’.antecedent ∩ c.antecedent) AND
           c = (c ∩ r’) then
            isderivative = true;
            exitfunction;
        endif;
    endif;
endfor;
isderivative = false;
return isderivative;

The brief description of isderivative() is given below.

The input parameters of this function are RR’ and c, while it returns true or false. The
function tests whether an association rule can be derived from the existing
representative rules by using Properties 5.2 to 5.4. If so, it returns true; otherwise it
returns false.

5.6.2 Measure of Redundancy

Adopting the benchmark of [55], the author conducted experiments on the data outlined
in Chapter 4 to determine the level of redundancy in the generated rules at user-specified
levels of the objective measures. Figures 5.1 and 5.2 illustrate the results. It is
seen from these figures that, in most cases, the number of redundant rules is
significantly larger than the number of representative rules. It is also noticed
that the level of redundancy in the generated rules is quite high and depends upon the
minimum support value: the lower the value of minimum support, the higher the
redundancy level. The redundancy level is also influenced by the characteristics of the
data; in the case of dense data, it is much higher.

