Togaware DATA MINING
Desktop Survival Guide
by Graham Williams

Algorithm

The Apriori algorithm is a breadth-first, generate-and-test search algorithm. Only after exploring all associations containing $k$ items does it consider those containing $k+1$ items. For each $k$, every candidate is tested to determine whether it has enough support.

The algorithm uses a simple two-step generate-and-merge process: generate the frequent itemsets of size $k$, then combine them to generate candidate frequent itemsets of size $k+1$.
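The generate-and-merge process can be sketched in Python. This is an illustrative sketch rather than the book's pseudocode: the function names `generate_candidates` and `frequent_itemsets`, the toy baskets, and the choice to express the minimum support as a fraction rather than a percentage are all assumptions made for the example.

```python
from itertools import combinations


def generate_candidates(frequent_k):
    """Merge frequent k-itemsets into candidate (k+1)-itemsets."""
    frequent_k = set(frequent_k)
    candidates = set()
    for a in frequent_k:
        for b in frequent_k:
            union = a | b
            # Two k-itemsets merge only if they share k-1 items.
            if len(union) == len(a) + 1:
                # Prune: every k-subset of a frequent itemset must be frequent.
                if all(frozenset(s) in frequent_k
                       for s in combinations(union, len(a))):
                    candidates.add(union)
    return candidates


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support} for itemsets with support >= minsup (a fraction)."""
    n = len(transactions)
    # Level 1: every single item is a candidate.
    candidates = {frozenset([item]) for basket in transactions for item in basket}
    frequent = {}
    while candidates:
        # Test step: count the baskets that contain each candidate.
        level = {}
        for c in candidates:
            support = sum(1 for basket in transactions if c <= basket) / n
            if support >= minsup:
                level[c] = support
        frequent.update(level)
        # Generate step: build the next level only from this level's survivors.
        candidates = generate_candidates(level)
    return frequent


# Hypothetical toy data: four baskets over three items.
baskets = [frozenset(b) for b in (
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
)]
result = frequent_itemsets(baskets, minsup=0.5)
```

With these baskets the pair of milk and butter falls below the threshold at level 2, so the only possible three-item candidate is pruned without counting it against the data.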

The algorithm is generally simple to implement and is reasonably efficient in the typical setting, where the number of possible items is large but each basket is small.

The input data to the algorithm consists of entities or transactions, each transaction representing a basket of items.

The two primary tuning parameters are minsup (minimum support, expressed as a percentage of the total number of transactions in the data) and mincon (minimum confidence, expressed as a percentage of the transactions containing a rule's left hand side that also contain its right hand side). Typically the minimum support is quite small because of the size of the databases we are dealing with. Thus a support of 0.1% or smaller is not unusual.
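As a small worked example, the two measures can be computed directly from a list of baskets. The sketch below uses the usual definitions: support is the fraction of all transactions containing an itemset, and confidence is the support of the whole rule divided by the support of its left hand side. The helper names and the toy data are assumptions for illustration, and the values are fractions rather than percentages.

```python
def support(itemset, transactions):
    """Fraction of all transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for basket in transactions if itemset <= basket) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Fraction of the transactions containing lhs that also contain rhs."""
    both = frozenset(lhs) | frozenset(rhs)
    return support(both, transactions) / support(lhs, transactions)


# Hypothetical toy data: four baskets over three items.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]

# Rule: bread -> butter
rule_support = support({"bread", "butter"}, baskets)          # 2 of the 4 baskets
rule_confidence = confidence({"bread"}, {"butter"}, baskets)  # 2 of the 3 bread baskets
```

The rule would pass a minsup of 50% and would pass a mincon of 60%, but would be rejected at a mincon of 70%.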

Procedure $\proc{Apriori}$ returns a set of association rules, each consisting of a left hand side, a right hand side, and a (support, confidence) tuple.

\begin{codebox}
\Procname{$\proc{Apriori}(\ldots)$}
\li \ldots
\li Return $\proc{BuildAssociations}(f, \id{mincon})$
\end{codebox}

\begin{codebox}
\Procname{$\proc{GenerateCa\ldots}$}
\li \ldots
\li Return $\id{candidates}$
\end{codebox}

\begin{codebox}
\Procname{$\proc{BuildAssoc\ldots}$}
\li \ldots
\li Return $\id{rules}$
\end{codebox}
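
The rule-building step can be illustrated with a short Python sketch. It is not the book's listing: the function name `build_associations`, the shape of the `supports` table, and the example values are assumptions. It takes a table of frequent itemsets with their supports, splits each itemset into every possible left and right hand side, and keeps the rules whose confidence reaches the minimum confidence (given here as a fraction).

```python
from itertools import combinations


def build_associations(supports, mincon):
    """Return rules as (lhs, rhs, (support, confidence)) tuples.

    supports maps each frequent itemset (a frozenset) to its support.
    Every subset of a frequent itemset is itself frequent, so the
    lookup supports[lhs] is assumed to succeed.
    """
    rules = []
    for itemset, sup in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs at least one item on each side
        for size in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), size)):
                conf = sup / supports[lhs]
                if conf >= mincon:
                    rules.append((lhs, itemset - lhs, (sup, conf)))
    return rules


# Hypothetical support table for the frequent itemsets of a toy data set.
supports = {
    frozenset({"bread"}): 0.75,
    frozenset({"milk"}): 0.75,
    frozenset({"butter"}): 0.5,
    frozenset({"bread", "milk"}): 0.5,
    frozenset({"bread", "butter"}): 0.5,
}
rules = build_associations(supports, mincon=0.9)
```

Only the rule from butter to bread survives here, because every basket containing butter also contains bread, while the other three candidate rules have a confidence of about 0.67.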

Copyright © 2004-2010 Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010