Building a Basic Collaborative Filtering Recommender with Python

Tech May 18 14

import numpy as np


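def compute_pair_similarity(vec_a, vec_b):
    """Cosine similarity over the positions rated (> 0) in both vectors."""
    shared_mask = (vec_a > 0) & (vec_b > 0)
    if shared_mask.sum() == 0:
        return 0.0  # no commonly rated positions
    # Zero out every position that is not rated in both vectors.
    a = vec_a * shared_mask
    b = vec_b * shared_mask
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))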


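def generate_suggestions(target_user, score_matrix, top_k=2):
    # Rows are items and columns are users, so shape is (item_count, user_count).
    item_count, user_count = score_matrix.shape
    sim_scores = np.zeros(user_count)

    # Compare the target user's column with every user's column.
    user_profile = score_matrix[:, target_user]
    for idx in range(user_count):
        sim_scores[idx] = compute_pair_similarity(user_profile, score_matrix[:, idx])

    # Rank the target user last so they are never picked as their own neighbor.
    sim_scores[target_user] = -1.0
    neighbor_count = min(top_k, user_count - 1)
    nearest_indices = np.argsort(sim_scores)[::-1][:neighbor_count]

    # Score each item by summing the ratings the nearest users gave it.
    rated_sum = np.zeros(item_count)
    for idx in nearest_indices:
        rated_sum += score_matrix[:, idx]

    # Item indices, highest aggregated score first.
    return np.argsort(rated_sum)[::-1]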


def run_demo():
    # Rows are items, columns are users; 0 means "not rated".
    ratings = np.array([
        [5, 3, 0, 1],
        [4, 0, 4, 4],
        [1, 1, 3, 2],
        [0, 0, 4, 5],
        [2, 2, 0, 0]
    ])

    user_id = 0
    ordered_items = generate_suggestions(user_id, ratings, top_k=2)
    print(f"Item recommendation order for user {user_id}: {ordered_items}")


if __name__ == "__main__":
    run_demo()

How the Algorithm Works

The code implements a straightforward neighborhood-based collaborative filtering suggestion engine. Because it compares columns of the rating matrix, and columns are users, the similarity it computes is user-to-user. It assumes that items rated highly by users with similar tastes are relevant to a target user. The scoring relies on a rating matrix where rows represent items, columns represent users, and each cell holds a rating value (zero indicates no rating).
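To make the matrix layout concrete, here is a minimal sketch on a made-up 3×2 matrix. The values and variable names are illustrative assumptions and do not come from the demo above.

```python
import numpy as np

# Hypothetical ratings: 3 items (rows) x 2 users (columns); 0 means "not rated".
ratings = np.array([
    [5, 0],
    [3, 4],
    [0, 2],
])

item_count, user_count = ratings.shape

# A column holds one user's ratings across all items.
user_1_profile = ratings[:, 1]

# A row holds the ratings one item received from all users.
item_0_ratings = ratings[0, :]

# Items user 1 has not rated are the zero entries of that user's column.
unrated_by_user_1 = np.flatnonzero(user_1_profile == 0)

print(item_count, user_count)
print(user_1_profile)
print(unrated_by_user_1)
```

Slicing with `[:, u]` therefore always yields a user profile, which is what generate_suggestions passes to the similarity function.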

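A core component is a similarity function that measures the affinity between two rating vectors. Cosine similarity is computed only over positions where both vectors hold a rating, so a zero that merely means "not rated" is never treated as a low rating. If no common ratings exist, the similarity defaults to zero. One side effect of the masking is that two vectors overlapping on a single rating always score 1.0, whatever the two values are.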
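The effect of the mask is easiest to see next to an unmasked cosine. The sketch below is an illustration only: `masked_cosine` and `plain_cosine` are hypothetical helper names, and the vectors are made up.

```python
import numpy as np


def masked_cosine(vec_a, vec_b):
    """Cosine similarity over positions where both vectors are rated (> 0)."""
    shared = (vec_a > 0) & (vec_b > 0)
    if not shared.any():
        return 0.0
    a, b = vec_a[shared], vec_b[shared]
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def plain_cosine(vec_a, vec_b):
    """Ordinary cosine similarity; zeros are treated as real ratings."""
    return float(np.dot(vec_a, vec_b) / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b)))


u = np.array([5, 4, 0, 0])  # rated only the first two positions
v = np.array([5, 4, 3, 3])  # agrees with u wherever both rated

agree_masked = masked_cosine(u, v)  # close to 1.0: identical on shared positions
agree_plain = plain_cosine(u, v)    # lower: u's zeros are counted as ratings

single_overlap = masked_cosine(u, np.array([1, 0, 0, 0]))  # one shared rating
no_overlap = masked_cosine(u, np.array([0, 0, 2, 5]))      # nothing shared

print(agree_masked, agree_plain, single_overlap, no_overlap)
```

The single-overlap case returns 1.0 even though the ratings are 5 and 1, which is why some implementations require a minimum number of shared ratings before trusting the score.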

The suggestion pipeline is executed by generate_suggestions. It takes a target user identifier, the rating matrix, and an optional neighbor count top_k. The process follows these steps:

1. For every user column, compute its similarity with the target user's rating profile using the masked cosine function.
2. Identify the k users most similar to the target user, leaving out the target user themselves.
3. Use those neighbors to estimate item scores by summing the ratings the neighbors gave to each item.
4. Return the item indices sorted in descending order of score, forming a ranked recommendation list.
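One way to read the steps above, treating each matrix column as a user's ratings, is the following step-by-step sketch on the demo matrix. It is an interpretation under that assumption rather than a definitive implementation; `masked_cosine` is a hypothetical stand-in for the similarity function.

```python
import numpy as np


def masked_cosine(vec_a, vec_b):
    """Cosine similarity over positions where both vectors are rated (> 0)."""
    shared = (vec_a > 0) & (vec_b > 0)
    if not shared.any():
        return 0.0
    a, b = vec_a[shared], vec_b[shared]
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Same layout as the demo: rows are items, columns are users.
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 4, 4],
    [1, 1, 3, 2],
    [0, 0, 4, 5],
    [2, 2, 0, 0],
])
target_user, top_k = 0, 2

# Step 1: similarity between the target user's column and every user's column.
profile = ratings[:, target_user]
sims = np.array([masked_cosine(profile, ratings[:, u])
                 for u in range(ratings.shape[1])])
sims[target_user] = -1.0  # the target user must not be their own neighbor

# Step 2: indices of the top_k most similar users, most similar first.
neighbors = np.argsort(sims)[::-1][:top_k]

# Step 3: per-item score is the sum of the neighbors' ratings for that item.
item_scores = ratings[:, neighbors].sum(axis=1)

# Step 4: item indices ordered from highest to lowest score.
ranking = np.argsort(item_scores)[::-1]

print("similarities:", np.round(sims, 3))
print("neighbors:", neighbors)
print("item scores:", item_scores)
print("ranking:", ranking)
```

Several items tie on score here, so their relative order in the ranking depends on how argsort breaks ties and should not be read as meaningful.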

This compact approach demonstrates the essential mechanism behind more sophisticated recommender frameworks while remaining easy to modify or extend.
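As one example of such an extension, the sketch below weights each neighbor's ratings by similarity and recommends only items the target user has not rated yet. Both function names, the neighbor choice, and the similarity weights are made-up assumptions for illustration.

```python
import numpy as np


def weighted_item_scores(ratings, neighbor_ids, neighbor_sims):
    """Per-item sum of neighbor ratings, each weighted by that neighbor's similarity."""
    return ratings[:, neighbor_ids] @ neighbor_sims


def rank_unrated_items(item_scores, user_profile):
    """Indices of items the user has not rated (0), highest score first."""
    unrated = np.flatnonzero(user_profile == 0)
    order = np.argsort(item_scores[unrated])[::-1]
    return unrated[order]


# Rows are items, columns are users; 0 means "not rated".
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 4, 4],
    [1, 1, 3, 2],
    [0, 0, 4, 5],
    [2, 2, 0, 0],
])

target_user = 1
neighbor_ids = np.array([0, 3])       # assumed neighbors, not computed here
neighbor_sims = np.array([0.9, 0.5])  # made-up similarity weights

scores = weighted_item_scores(ratings, neighbor_ids, neighbor_sims)
recommended = rank_unrated_items(scores, ratings[:, target_user])

print(scores)
print(recommended)
```

Filtering by the user's own zeros keeps already-rated items out of the list, and the weights let a close neighbor count for more than a distant one.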

Copyright © fadingcoder.top
