Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
FP-growth Algorithm.FP-tree | Mining Frequent Itemsets
Association Rule Mining
course content

Contenido del Curso

Association Rule Mining

FP-growth Algorithm.FP-tree

An FP-tree, or frequent pattern tree, is a data structure that represents a compact and efficient way to store transactional datasets, where each transaction consists of a set of items.
The FP-tree organizes transactions by their common items, linking them in a tree structure based on their frequency of occurrence in a particular item combination.

How to construct FP-tree

  1. Scan the Database: Traverse the transaction database and count the frequency of each item among all dataset. Sort items in descending order of frequency;
  2. Create the Root of the FP-Tree: Initialize an empty root node;
  3. Scan the Database Again: For each transaction in the database, filter and sort the items according to their total frequency calculated on the first point of the algorithm;
  4. Update the FP-Tree: For each filtered transaction:
    • Start at the root node of the tree;
    • If an item in the transaction is already a child node of the current node, increment its count;
    • Otherwise, create a new child node with the item and set its count to 1;
    • Move to the next item in the transaction and repeat the process until all items are processed.
  5. Link Similar Items: After processing all transactions, link nodes representing the same item together by their link pointers;
  6. Return the FP-Tree: The constructed FP-tree is ready for further mining.

FP-tree example

Let's consider the following dataset:

Let's arrange our transaction data in descending order of total frequency across the entire dataset, prioritizing items with higher frequency over those with lower frequency.
Total frequency table looks like this:

ItemTotal Frequency
bread6
eggs6
butter4
jam4
milk4
cheese3

As a result we can create sorted transaction table:

The FP-tree for this dataset will look like this:

As a result we have a FP-tree that represents out transaction data. Red arrows are needed to connect similar products in different tree branches - this will be used in calculating support for frequent itemsets mining.

What does the count variable represent at each node of the FP-tree?

Selecciona la respuesta correcta

¿Todo estuvo claro?

Sección 2. Capítulo 4
We're sorry to hear that something went wrong. What happened?
some-alt