Summary  
This chapter covers representing user-item interactions as a sparse matrix and using collaborative filtering via distributed ALS matrix factorization to predict missing preferences.

General domain of usage  
E-commerce product recommendations

A **recommendation system** predicts which items a user is likely to prefer based on past interactions. Such systems power product recommendations on e-commerce sites, movie suggestions on streaming platforms, and playlist generation in music apps.

## Two Main Approaches

**Content-based filtering** recommends items similar to what a user has liked before, based on item attributes. If you liked a thriller set in New York, the system recommends other thrillers set in New York.
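As a rough illustration of the idea, the sketch below ranks items by how many attributes they share with one liked item. The catalog, the tag names, and the choice of Jaccard similarity are all assumptions made for this example; real systems often use richer features and other similarity measures.

```python
# Hypothetical catalog: each item is described by a set of attribute tags.
ITEM_ATTRIBUTES = {
    "Night Pursuit": {"thriller", "new-york"},
    "Harbor Lights": {"thriller", "new-york", "crime"},
    "Prairie Song": {"romance", "kansas"},
    "Cold Trail": {"thriller", "alaska"},
}


def jaccard(a, b):
    """Share of attributes two items have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_similar(liked_item, k=2):
    """Rank the other items by attribute overlap with one liked item."""
    liked = ITEM_ATTRIBUTES[liked_item]
    scored = [
        (jaccard(liked, attrs), title)
        for title, attrs in ITEM_ATTRIBUTES.items()
        if title != liked_item
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in scored[:k] if score > 0]


recommendations = recommend_similar("Night Pursuit")
```

Here the other New York thriller ranks first, followed by a thriller set elsewhere. No information about other users is involved.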

**Collaborative filtering** ignores item attributes entirely. It finds users with similar interaction patterns and recommends items those users liked. If you and another user both rated the same ten movies highly, the system recommends movies that user liked but you have not seen yet.
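The user-to-user idea described above can be sketched in a few lines of plain Python. The ratings, user names, and the use of cosine similarity over co-rated items are assumptions chosen for illustration; this is a minimal neighbor-based sketch, not the matrix factorization approach used later in the chapter.

```python
from math import sqrt

# Hypothetical explicit ratings (1-5): user -> {movie: rating}.
RATINGS = {
    "you":   {"Alien": 5, "Heat": 4, "Up": 1},
    "alex":  {"Alien": 5, "Heat": 5, "Up": 1, "Ronin": 5},
    "blake": {"Alien": 1, "Heat": 2, "Up": 5, "Frozen": 5},
}


def cosine(u, v):
    """Cosine similarity computed over the items both users rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def recommend(user, min_rating=4):
    """Suggest unseen items that the most similar other user rated highly."""
    mine = RATINGS[user]
    neighbor = max(
        (name for name in RATINGS if name != user),
        key=lambda name: cosine(mine, RATINGS[name]),
    )
    unseen = {
        item: rating
        for item, rating in RATINGS[neighbor].items()
        if item not in mine and rating >= min_rating
    }
    return neighbor, sorted(unseen, key=lambda item: -unseen[item])


neighbor, suggestions = recommend("you")
```

With this toy data, `alex` is the closest match to `you`, so the only suggestion is the film `alex` rated highly that `you` has not rated. Note that the code never looks at genres or any other item attribute.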

Collaborative filtering does not require item metadata, so it works for any catalog that has interaction data. That is one reason it is a common choice for large-scale systems.

## The Ratings Matrix

Both approaches rely on a **user-item interaction matrix** – a table where rows are users, columns are items, and values are ratings (explicit) or interaction signals like clicks or purchases (implicit).

Most cells are empty – users interact with only a small fraction of available items. The goal is to fill in the missing values.
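Because most cells are empty, the matrix is usually stored sparsely: only the observed interactions are kept. The sketch below uses made-up integer IDs and ratings to show one simple layout (a dictionary keyed by user and item) and how to measure how empty the matrix is.

```python
# Hypothetical interaction log: (user_id, item_id, rating) triples.
# Only observed interactions are stored; empty cells are simply absent.
INTERACTIONS = [
    (0, 0, 5.0),
    (0, 2, 3.0),
    (1, 1, 4.0),
    (2, 0, 1.0),
    (2, 3, 5.0),
]

n_users = 1 + max(u for u, _, _ in INTERACTIONS)
n_items = 1 + max(i for _, i, _ in INTERACTIONS)

# Dictionary-of-keys layout: (row, column) -> value.
matrix = {(u, i): r for u, i, r in INTERACTIONS}

# Fraction of cells that actually hold a value.
density = len(matrix) / (n_users * n_items)

# The cells a recommender would try to predict.
missing = [
    (u, i)
    for u in range(n_users)
    for i in range(n_items)
    if (u, i) not in matrix
]
```

In this toy case 5 of the 12 cells are filled. Real datasets are typically far sparser, which is why the long list-of-triples form is preferred over a dense table.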

## Why PySpark for Recommendations

Movie or product datasets can have millions of users and items, making the full ratings matrix too large to hold in memory on a single machine. PySpark's `ALS` (Alternating Least Squares) estimator is designed for this scale – it factorizes the sparse matrix into low-rank user and item factors, distributing the work across a cluster.
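A minimal sketch of what fitting such a model might look like with `pyspark.ml.recommendation.ALS` follows. The column names, sample rows, helper name `fit_als`, and hyperparameter values are assumptions for illustration, not tuned or recommended settings.

```python
# Hypothetical ratings in long format: (userId, movieId, rating).
SAMPLE_RATINGS = [
    (0, 10, 4.0), (0, 11, 2.0), (0, 12, 5.0),
    (1, 10, 5.0), (1, 12, 4.0),
    (2, 11, 1.0), (2, 12, 5.0), (2, 13, 3.0),
]


def fit_als(spark, rows, rank=5):
    """Fit an ALS model on (userId, movieId, rating) rows.

    `spark` is an existing SparkSession. PySpark is imported inside the
    function so the sample data can be inspected without a Spark install.
    """
    from pyspark.ml.recommendation import ALS

    ratings = spark.createDataFrame(rows, ["userId", "movieId", "rating"])
    als = ALS(
        userCol="userId",
        itemCol="movieId",
        ratingCol="rating",
        rank=rank,                 # number of latent factors per user/item
        maxIter=10,                # alternating optimization passes
        regParam=0.1,              # regularization strength
        coldStartStrategy="drop",  # drop NaN predictions for unseen users/items
        seed=42,
    )
    return als.fit(ratings)
```

Given a session from `SparkSession.builder.getOrCreate()`, calling `fit_als(spark, SAMPLE_RATINGS).recommendForAllUsers(2)` would return a DataFrame with the two highest-scoring items for each user. On a dataset this small the predictions are not meaningful; the point is only the shape of the API.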

