An **FP-tree**, or **frequent pattern tree**, is a data structure that stores a transactional dataset, in which each transaction is a set of items, in a **compact and efficient** form.
The FP-tree merges transactions that begin with the same items into a shared path, and each node keeps a count of how many transactions pass through it, so the tree records the **frequency of occurrence** of each item combination.

## How to construct an FP-tree

1. **Scan the Database:** Traverse the transaction database and count the **frequency of each item** across the whole dataset. Sort the items in descending order of frequency;

2. **Create the Root of the FP-Tree:** Initialize an empty root node;

3. **Scan the Database Again:** For each transaction in the database, filter out the items that are not frequent enough and **sort** the remaining items by the total frequency computed in step 1;

4. **Update the FP-Tree:** For **each filtered transaction**:
    - Start at the root node of the tree;
    - If the next item in the transaction is already a child of the current node, **increment its count**;
    - Otherwise, create a **new child** node for the item and set its count to 1;
    - Move to that child node, take the next item in the transaction, and repeat until all items are processed.

5. **Link Similar Items:** After processing all transactions, link nodes representing the same item together by their link pointers;

6. **Return the FP-Tree:** The constructed FP-tree is ready for further mining.
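
Steps 1 to 4 above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the names `FPNode` and `build_fp_tree`, the `min_support` threshold, and the alphabetical tie-break between equally frequent items are all assumptions made for the example, and the node links from step 5 are left out to keep it short.

```python
from collections import Counter


class FPNode:
    """One tree node: an item, its count, and links to parent and children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> FPNode


def build_fp_tree(transactions, min_support=1):
    # Step 1: first scan, count how many transactions contain each item.
    freq = Counter(item for t in transactions for item in set(t))

    # Step 2: start from an empty root node.
    root = FPNode(None)

    for t in transactions:
        # Step 3: drop infrequent items and sort the rest by descending
        # frequency (ties broken alphabetically: an assumption of this sketch).
        items = sorted(
            (i for i in set(t) if freq[i] >= min_support),
            key=lambda i: (-freq[i], i),
        )
        # Step 4: walk down from the root, reusing or creating child nodes.
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, freq


# Hypothetical toy data, not the dataset used in this chapter.
toy = [["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]]
root, freq = build_fp_tree(toy, min_support=2)
print(sorted(freq.items()))       # [('a', 2), ('b', 4), ('c', 2)]
print(root.children["b"].count)   # 4, because every transaction starts with "b"
```

Because all four toy transactions share the prefix `b`, the tree has a single branch under the root, which is where the compression comes from.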


## FP-tree example
Let's consider the following dataset:

Let's arrange the items of each transaction in **descending order** of their total frequency across the entire dataset, so that more frequent items come before less frequent ones.    
The total frequency table looks like this:    
| **Item**   | **Total Frequency** |
|--------|-----------------|
| bread  | 6               |
| eggs   | 6               |
| butter | 4               |
| jam    | 4               |
| milk   | 4               |
| cheese | 3           |


As a result, we can build the **sorted transaction table**:

<!DOCTYPE html>
<html>
<head>
<style>
  table {
    width: 80%;
    margin: 20px auto;
    border-collapse: collapse;
    border: 1px solid #ddd;
  }
  th,
  td {
    padding: 10px;
    text-align: left;
  }
  th {
    background-color: #f2f2f2;
  }
  tr:nth-child(even) {
    background-color: #f2f2f2;
  }
  tr:hover {
    background-color: #ddd;
  }
  th strong {
    font-weight: bold;
  }
  td strong {
    font-weight: bold;
  }
</style>
</head>
<body>

<table>
  <tr>
    <th><strong>Transaction ID</strong></th>
    <th><strong>Sorted Items</strong></th>
  </tr>

    <tr>
        <td>1</td>
        <td>bread, eggs, milk, cheese</td>
    </tr>
    
    <tr>
        <td>2</td>
        <td>bread, eggs, butter, jam</td>
    </tr>
    
    <tr>
        <td>3</td>
        <td>bread, milk, butter, cheese</td>
    </tr>
    
    <tr>
        <td>4</td>
        <td>eggs, milk, jam</td>
    </tr>
    
    <tr>
        <td>5</td>
        <td>bread, eggs, butter, jam</td>
    </tr>
    
    <tr>
        <td>6</td>
        <td>bread, eggs</td>
    </tr>
    
    <tr>
        <td>7</td>
        <td>bread, eggs, milk, butter</td>
    </tr>
    
</table>

</body>
</html>

The FP-tree for this dataset will look like this:
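
As a plain-text view of that tree, the sketch below inserts the seven rows of the sorted transaction table, in the order listed, and prints every node as `item:count` with indentation showing depth. The `FPNode` class and the `render` helper are hypothetical names chosen for this illustration.

```python
class FPNode:
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> FPNode, kept in insertion order


# The rows of the sorted transaction table, already ordered by frequency.
sorted_transactions = [
    ["bread", "eggs", "milk", "cheese"],
    ["bread", "eggs", "butter", "jam"],
    ["bread", "milk", "butter", "cheese"],
    ["eggs", "milk", "jam"],
    ["bread", "eggs", "butter", "jam"],
    ["bread", "eggs"],
    ["bread", "eggs", "milk", "butter"],
]

root = FPNode(None)
for transaction in sorted_transactions:
    node = root
    for item in transaction:
        child = node.children.get(item)
        if child is None:
            child = FPNode(item, parent=node)
            node.children[item] = child
        child.count += 1  # one more transaction passes through this node
        node = child


def render(node, depth=0):
    """Return the subtree below `node` as indented 'item:count' lines."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}:{child.count}")
        lines.extend(render(child, depth + 1))
    return lines


lines = render(root)
print("\n".join(lines))
# bread:6
#   eggs:5
#     milk:2
#       cheese:1
#       butter:1
#     butter:2
#       jam:2
#   milk:1
#     butter:1
#       cheese:1
# eggs:1
#   milk:1
#     jam:1
```

The count at each node is the number of transactions whose sorted item list passes through that node, so `eggs:5` under `bread:6` means five of the six transactions that start with bread continue with eggs.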

As a result, we have an FP-tree that represents our transaction data. The red arrows **connect nodes of the same product** in different tree branches; these links are used to calculate support when mining frequent itemsets.
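
To illustrate how such links give the support of an item, here is a small sketch built on the sorted transactions from the table above. It makes one simplifying assumption: instead of a chain of pointers from node to node, the links for each item are kept as a list in a dictionary called `header`. The names `FPNode`, `header`, and `support` are hypothetical.

```python
from collections import defaultdict


class FPNode:
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


sorted_transactions = [
    ["bread", "eggs", "milk", "cheese"],
    ["bread", "eggs", "butter", "jam"],
    ["bread", "milk", "butter", "cheese"],
    ["eggs", "milk", "jam"],
    ["bread", "eggs", "butter", "jam"],
    ["bread", "eggs"],
    ["bread", "eggs", "milk", "butter"],
]

root = FPNode(None)
header = defaultdict(list)  # item -> every node in the tree that holds it

for transaction in sorted_transactions:
    node = root
    for item in transaction:
        child = node.children.get(item)
        if child is None:
            child = FPNode(item, parent=node)
            node.children[item] = child
            header[item].append(child)  # register the new node as a link
        child.count += 1
        node = child


def support(item):
    """Sum the counts of all linked nodes that hold `item`."""
    return sum(node.count for node in header[item])


print(len(header["milk"]))  # 3 separate milk nodes in different branches
print(support("milk"))      # 2 + 1 + 1 = 4
```

Following the links avoids a full tree traversal: only the nodes that hold the item are visited, and their counts add up to the number of transactions containing it.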

