What is the Apriori algorithm? Can someone explain it in simple terms?

#1

Hello,

I recently came across the Apriori algorithm used in mining frequent itemsets for Market Basket analysis.

I am trying to understand it, but the material I am finding is too complex to follow. Can somebody please explain it in simple terms?

#2

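The Apriori algorithm is used to find frequent itemsets, i.e. sets of features/items that often occur together in the same transaction. These itemsets are then used to build association rules.
An association rule X -> Y is a pattern stating that when X occurs, Y also occurs with a certain probability.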

This process is done iteratively, i.e. frequent itemsets with 1 item are found first, then with 2 items, then 3, and so on.
Before we move on to the algorithm, it is important to understand two key terms:

1. Support: The rule X -> Y holds with support sup in T (the transaction data set) if sup% of transactions contain X U Y, i.e. every item of X together with every item of Y.
sup = Pr(X U Y) = count(X U Y) / total transaction count

2. Confidence: The rule holds in T with confidence conf if conf% of transactions that contain X also contain Y.
conf = Pr(Y | X) = count( X U Y) / count(X)
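
To make the two formulas concrete, here is a minimal Python sketch that computes them on a made-up set of four transactions. The data and the helper names (count, support, confidence) are my own assumptions for illustration; they do not come from any library.

```python
# Toy transaction data set T (hypothetical, for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(x, y, transactions):
    """sup = count(X U Y) / total transaction count."""
    return count(x | y, transactions) / len(transactions)

def confidence(x, y, transactions):
    """conf = count(X U Y) / count(X)."""
    return count(x | y, transactions) / count(x, transactions)

X, Y = {"bread"}, {"milk"}
sup = support(X, Y, transactions)      # 2 of the 4 transactions contain bread and milk
conf = confidence(X, Y, transactions)  # 2 of the 3 bread transactions also contain milk
print(f"support = {sup:.2f}, confidence = {conf:.2f}")
```

So for the rule bread -> milk in this toy data, the support is 2/4 and the confidence is 2/3.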

Algorithm:

1. First, we find the single items that meet the required minimum count/support.
2. Then we combine these frequent single items with each other to form candidate 2-item sets, and shortlist the ones that satisfy the required support. Items that were not frequent on their own can be skipped, because any set containing an infrequent item is itself infrequent.
3. Then we generate all the possible rules contained in these frequent 2-item sets and keep the rules that satisfy the minimum confidence.
4. Then we move on to the frequent 3-item sets, built from the frequent 2-item sets, and repeat steps 2 and 3 for them, and so on, until no new frequent itemsets are found.
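
The steps above can be sketched roughly as follows in Python. This is only an illustrative sketch under my own assumptions: the function names (apriori, rules), the toy data and the thresholds are all made up, and it counts support by rescanning every transaction, which a real implementation would optimise. One deliberate difference from the steps as listed: it collects all frequent itemsets first and generates the rules at the end, which gives the same rules but keeps the code shorter.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (sketch)."""
    n = len(transactions)

    def sup(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    def keep_frequent(candidates):
        scored = {c: sup(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    # Step 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = keep_frequent({frozenset([i]) for i in items})
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Steps 2 and 4: join frequent (k-1)-item sets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Drop candidates that have an infrequent (k-1)-item subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        current = keep_frequent(candidates)
        frequent.update(current)
        k += 1
    return frequent

def rules(frequent, min_confidence):
    """Step 3: list (X, Y, confidence) for every rule X -> Y that qualifies."""
    found = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for x in combinations(itemset, r):
                x = frozenset(x)
                conf = s / frequent[x]  # sup(X U Y) / sup(X)
                if conf >= min_confidence:
                    found.append((x, itemset - x, conf))
    return found

# Hypothetical toy data and thresholds.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(transactions, min_support=0.5)
found = rules(frequent, min_confidence=0.8)
for x, y, conf in found:
    print(set(x), "->", set(y), conf)
```

With this toy data, {milk, butter} appears in only 1 of 4 transactions, so it is dropped at the 2-item stage, and the 3-item candidate {bread, milk, butter} is then pruned without ever being counted. The only rule that reaches 80% confidence is butter -> bread.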

The algorithm will become clear once you work through the following example yourself:

http://www2.cs.uregina.ca/~dbd/cs831/notes/itemsets/itemset_apriori.html