Collaborative filtering is a technique used in recommender systems to predict a user’s preferences by leveraging the preference patterns of many users. The core idea is often summarized as “people who are similar to you liked X, so you might also like X” (user-based perspective) or “items similar to the ones you liked were liked by the same people, so you might also like them” (item-based perspective). Collaborative filtering operates on a user-item interaction matrix (e.g. users vs. movies, with ratings as entries). It doesn’t require any information about the items themselves (such as genre or description); instead, it relies purely on the feedback (ratings, clicks, purchases) that users give to items. By “exploiting the wisdom of the crowd”, the system can make surprisingly accurate recommendations: for instance, even if a user has never watched a particular movie, if that user’s rating pattern is similar to that of a group of other users, and those users loved the movie, the system can recommend it.

There are two primary approaches to collaborative filtering, user-based and item-based:

User-based collaborative filtering: Find users who have historically exhibited similar taste to the target user, and recommend items that those similar users (often called the “neighbors”) liked. For example, if Alice and Bob have rated many movies similarly, and Bob has highly rated a movie that Alice hasn’t seen, that movie would be recommended to Alice. The similarity between users can be computed by comparing their rating vectors (using cosine similarity, Pearson correlation, etc.). This approach essentially assumes that like-minded people will continue to agree on new items.

Item-based collaborative filtering: Instead of comparing users, compare items based on the users who interact with them. In this approach, the system looks at the target user’s liked items and finds other items that are similar to them (where similarity between two items is determined by how the entire user base rated them).
For instance, if many users who watched “The Lord of the Rings” also highly rated “Harry Potter”, then “Harry Potter” might be recommended to someone who enjoyed “The Lord of the Rings.” This method assumes that items form clusters or associations (the “item affinity” perspective), and it often works well when there are many more users than items, because item-item similarities tend to be more stable and cheaper to compute in large systems.

Collaborative filtering can be implemented with memory-based methods (the neighborhood approaches above) or model-based methods. Model-based collaborative filtering typically uses matrix factorization or latent factor models (e.g., singular value decomposition, or more recent variants such as alternating least squares (ALS) for implicit feedback) to decompose the user-item interaction matrix into latent features. These latent factors capture user tastes and item characteristics without manual labeling (for example, in a movie context, one latent dimension might correspond to a preference for “action” vs. “romance”).

One challenge with collaborative filtering is the cold start problem: it requires sufficient user-item interactions to make reliable recommendations. New items (with no ratings) are hard to recommend, and new users (who haven’t rated anything) are hard to recommend for, with pure collaborative filtering. In practice, systems mitigate this by using content-based information or by prompting for initial ratings. Despite such challenges, collaborative filtering remains a dominant approach in recommender systems (powering recommendations on e-commerce sites, streaming services, etc.) because it personalizes to a user’s taste without needing explicit content analysis, simply by learning from the collective behavior of users.
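
The user-based and item-based neighborhood approaches described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the toy `ratings` matrix, the user names in the comments, and the helper functions `cosine_similarity_matrix`, `predict_user_based`, and `predict_item_based` are all hypothetical. It also assumes that 0 means "not rated" and, as a simplification, treats those zeros as ordinary values when computing cosine similarity.

```python
import numpy as np

# Toy user-item rating matrix (rows: users, columns: items); 0 means "not rated".
# All names and values are illustrative assumptions.
ratings = np.array([
    [5, 4, 0, 1],   # Alice (has not rated item 2)
    [5, 5, 4, 1],   # Bob
    [1, 1, 2, 5],   # Carol
    [4, 0, 5, 2],   # Dave
], dtype=float)


def cosine_similarity_matrix(m):
    """Pairwise cosine similarity between the rows of m."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
    unit = m / norms
    return unit @ unit.T


def predict_user_based(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    sims = cosine_similarity_matrix(ratings)[user]
    rated = ratings[:, item] > 0      # neighbors who rated the item
    rated[user] = False               # exclude the target user
    if not rated.any() or sims[rated].sum() == 0:
        return 0.0                    # no usable neighbors (cold start)
    return float(sims[rated] @ ratings[rated, item] / sims[rated].sum())


def predict_item_based(ratings, user, item):
    """Similarity-weighted average of the user's own ratings on other items."""
    sims = cosine_similarity_matrix(ratings.T)[item]
    rated = ratings[user] > 0         # items the target user has rated
    rated[item] = False               # exclude the target item
    if not rated.any() or sims[rated].sum() == 0:
        return 0.0                    # no usable items (cold start)
    return float(sims[rated] @ ratings[user, rated] / sims[rated].sum())


# Predict Alice's rating for the item she has not rated (item index 2).
user_pred = predict_user_based(ratings, user=0, item=2)
item_pred = predict_item_based(ratings, user=0, item=2)
print(f"user-based: {user_pred:.2f}, item-based: {item_pred:.2f}")
```

Both functions return 0.0 when no neighbor has rated the item or the user has rated nothing, which is the cold start situation described above. A real system would more likely use mean-centered ratings or Pearson correlation and restrict the average to the top-k most similar neighbors.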