I would like to explain the algorithm that computes optimized
support rules. We denote the confidence of union of the s-th bucket through the t-th bucket, ``confidence(s, t)'', and let theta be the minimum confidence threshold.
We call a pair s, t that satisfies confidence of s, t is not less than theta and maximizes support of the range, an optimal pair.
Our goal is to find all optimal pairs in O(M) time.
Let's call an index s effective when confidence of j, s-1 is less than theta for all j greater than s.
Then, we can prove this lemma, ``if (s, t) is an optimal pair, s is effective''.
If s is not effective, there exists some j such that confidence of j, s-1 is not less than theta.
Because s and j is an optimal pair, confidence of s, t is greater than or equal to theta.
Therefore, confidence of j, t is not less than theta. Obviously, support of j, t is greater than that of s, t.
This contradicts the optimality of the pair s, t.
So, we find all effective indices and choose optimal pairs.
The conclusion of this foil is ``all effective indices can be computed
in O(M) time''.Let's define w as this expression.
Then ``s is effective'' is equivalent to w < 0.
By scanning buckets forwards, this algorithm computes w for all indices in O(M) time.
If s and t is an optimal pair, s is effective and t = top(s).
Therefore, all we need is to find top(s) for every effective index s, and choose s maximizing support of s, top(s).
s is effective is defined as confidence(j, s-1) less than theta for
all j less than s. This is equivalent that maximum value of confidence(j, s-1) for all j is less than theta.
It can be transformed into this expression, w is less than 0.
If both s and s' are effective, top(s) is less than or
equal to top s'. This proposition is true, because s' is effective, confidence(s, s'-1) is less than theta.
From the definition of top(s), confidence(s, top(s)) is greater than or equal to theta.
Thus, confidence(s', top(s)) is greater than or equal to theta.
This implies that top(s) is not less than top(s').
Thanks to this property, we only need to scan backwards through the list of effective indices s(1) through s(q), and the list of all indices 1 through M alternately to find top(s(i)).
Scan indices backwards from M, and find top(s(i)).
To find top(s(i+1)), we can start scanning from here. We don't need to think about this area.
This algorithm computes top(s) for all effective indices.
We implemented the algorithm and evaluated it comparing the results
with a quadratic naive method. Experiments shows that our algorithm is practically fast.
The execution time of our algorithm grows up linearly in the number of bucket in wide range.