How to build a simple recommendation system in Python

王林
Release: 2023-10-20 17:19:43

Recommendation systems are designed to help people discover and select items that may be of interest to them. Python provides a wealth of libraries and tools that can help us build a simple but effective recommendation system. This article will introduce how to use Python to build a user-based collaborative filtering recommendation system and provide specific code examples.

Collaborative filtering is a common algorithm for recommendation systems. It infers similarities between users based on users' behavioral history data, and then uses these similarities to predict and recommend items. We will use the MovieLens dataset, which contains a set of user ratings of movies. First, we need to install the required libraries:

pip install pandas scikit-learn

Next, we will import the required libraries and load the MovieLens dataset:

import pandas as pd
from sklearn.model_selection import train_test_split

# Load the dataset
data = pd.read_csv('ratings.csv')

The dataset contains three columns, userId, movieId and rating, which represent the user ID, movie ID and rating respectively. Next, we split the dataset into a training set and a test set:

train_data, test_data = train_test_split(data, test_size=0.2, random_state=42)
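
To try the steps without downloading MovieLens, a small hand-made table with the same three columns can stand in for ratings.csv. The sketch below is an illustration only: the ratings are made up, and `toy_data`, `user_item` and `ratings_per_user` are hypothetical names not used elsewhere in this article. Pivoting the table shows the user-by-movie layout that collaborative filtering works on, with NaN marking movies a user has not rated.

```python
import pandas as pd

# Made-up ratings in the same layout as ratings.csv (illustrative values only)
toy_data = pd.DataFrame({
    'userId':  [1, 1, 1, 2, 2, 3, 3, 3],
    'movieId': [10, 20, 30, 10, 20, 10, 30, 40],
    'rating':  [4.0, 3.0, 5.0, 5.0, 3.0, 2.0, 4.0, 4.5],
})

# One row per user, one column per movie; NaN means "not rated"
user_item = toy_data.pivot(index='userId', columns='movieId', values='rating')
print(user_item)

# Number of movies each user has rated
ratings_per_user = toy_data.groupby('userId')['rating'].count()
print(ratings_per_user)
```

With real data the same pivot is mostly empty, because each user rates only a small fraction of all movies.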

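Now we can build the recommendation system. As the similarity measure between two users, the code below uses 1 / (1 + d), where d is the root-mean-square difference between their ratings of the movies both have rated; this is a distance-based score rather than cosine similarity. Two dictionaries store the similarity scores of users and movies: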

# Initialize an empty similarity table for each user
def calculate_similarity(train_data):
    similarity = dict()
    for user in train_data['userId'].unique():
        # Keyed by the other user's ID; filled in by calculate_similarity_score.
        # Movie IDs must not be stored here, since they can collide with user IDs.
        similarity[user] = dict()

    return similarity

# Compute the similarity score between every pair of users
def calculate_similarity_score(train_data, similarity):
    for user1 in similarity.keys():
        for user2 in similarity.keys():
            if user1 != user2:
                user1_ratings = train_data[train_data['userId'] == user1]
                user2_ratings = train_data[train_data['userId'] == user2]
                user2_movies = set(user2_ratings['movieId'])
                num_ratings = 0
                sum_of_squares = 0
                for movie in user1_ratings['movieId'].unique():
                    if movie in user2_movies:
                        num_ratings += 1
                        rating1 = user1_ratings[user1_ratings['movieId'] == movie]['rating'].values[0]
                        rating2 = user2_ratings[user2_ratings['movieId'] == movie]['rating'].values[0]
                        sum_of_squares += (rating1 - rating2) ** 2
                # Skip pairs with no movies in common to avoid dividing by zero
                if num_ratings > 0:
                    similarity[user1][user2] = 1 / (1 + (sum_of_squares / num_ratings) ** 0.5)

    return similarity

# Compute a similarity score between movies
def calculate_movie_similarity_score(train_data, similarity):
    # Assumption: the score is the number of users who rated both movies
    # (a simple co-occurrence count). The result is not used by
    # build_recommendation_system.
    movie_similarity = dict()
    for user in similarity.keys():
        user_movies = train_data[train_data['userId'] == user]['movieId'].unique()
        for movie in user_movies:
            if movie not in movie_similarity:
                movie_similarity[movie] = dict()

            for other_movie in user_movies:
                if movie != other_movie:
                    movie_similarity[movie][other_movie] = movie_similarity[movie].get(other_movie, 0) + 1

    return movie_similarity

# Build the recommendation system
def build_recommendation_system(train_data, similarity, movie_similarity):
    recommendations = dict()
    for user in train_data['userId'].unique():
        user_ratings = train_data[train_data['userId'] == user]
        recommendations[user] = dict()
        for movie in train_data['movieId'].unique():
            # Only predict ratings for movies the user has not rated yet
            if movie not in user_ratings['movieId'].unique():
                weighted_sum = 0
                weight_total = 0
                for other_user in similarity[user].keys():
                    if movie in train_data[train_data['userId'] == other_user]['movieId'].unique():
                        weight = similarity[user][other_user]
                        weighted_sum += weight * train_data[(train_data['userId'] == other_user) & (train_data['movieId'] == movie)]['rating'].values[0]
                        weight_total += weight
                # Normalize by the total weight so the prediction stays on the rating scale
                if weight_total > 0:
                    recommendations[user][movie] = weighted_sum / weight_total

    return recommendations

# Compute the evaluation metric (RMSE)
def calculate_metrics(recommendations, test_data):
    num_predictions = 0
    sum_of_squared_error = 0
    for user in recommendations.keys():
        if user in test_data['userId'].unique():
            for movie in recommendations[user].keys():
                if movie in test_data[test_data['userId'] == user]['movieId'].unique():
                    predicted_rating = recommendations[user][movie]
                    actual_rating = test_data[(test_data['userId'] == user) & (test_data['movieId'] == movie)]['rating'].values[0]
                    sum_of_squared_error += (predicted_rating - actual_rating) ** 2
                    num_predictions += 1
    # Average over the number of compared ratings, not the number of users
    if num_predictions == 0:
        return float('nan')
    rmse = (sum_of_squared_error / num_predictions) ** 0.5

    return rmse

# Initialize the similarity table
similarity = calculate_similarity(train_data)

# Compute the similarity scores between users
similarity = calculate_similarity_score(train_data, similarity)

# Compute the similarity scores between movies
movie_similarity = calculate_movie_similarity_score(train_data, similarity)

# Build the recommendation system
recommendations = build_recommendation_system(train_data, similarity, movie_similarity)

# Compute the evaluation metric
rmse = calculate_metrics(recommendations, test_data)
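
The arithmetic behind a distance-based similarity score and a similarity-weighted prediction can be checked by hand on a few made-up ratings. The sketch below is a minimal illustration rather than the article's exact pipeline: `ratings`, `user_similarity` and `predict_rating` are hypothetical names, and the prediction is normalized by the sum of the similarity weights, which is one common choice.

```python
# Made-up ratings: user -> {movie: rating} (illustrative values only)
ratings = {
    'alice': {'A': 5.0, 'B': 3.0},
    'bob':   {'A': 4.0, 'B': 3.0, 'C': 4.0},
    'carol': {'A': 1.0, 'B': 5.0, 'C': 2.0},
}

def user_similarity(ratings, user1, user2):
    """Return 1 / (1 + RMS rating difference) over co-rated movies, or None."""
    common = set(ratings[user1]) & set(ratings[user2])
    if not common:
        return None  # no movies in common, so no score can be computed
    mean_sq = sum((ratings[user1][m] - ratings[user2][m]) ** 2 for m in common) / len(common)
    return 1 / (1 + mean_sq ** 0.5)

def predict_rating(ratings, user, movie):
    """Similarity-weighted average of other users' ratings for one movie."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        weight = user_similarity(ratings, user, other)
        if weight is None:
            continue
        weighted_sum += weight * other_ratings[movie]
        weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else None

sim_bob = user_similarity(ratings, 'alice', 'bob')
sim_carol = user_similarity(ratings, 'alice', 'carol')
prediction = predict_rating(ratings, 'alice', 'C')
print(sim_bob, sim_carol, prediction)
```

Because alice's ratings are closer to bob's than to carol's, bob's rating of movie C carries more weight, and the prediction lands nearer to his 4.0 than to carol's 2.0.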

Finally, we can output the results and evaluation metrics of the recommendation system:

print(recommendations)
print('RMSE:', rmse)
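
Printing the whole dictionary is hard to read once there are many users. A minimal sketch for listing only the highest predicted ratings for one user follows; it assumes the nested `{user: {movie: predicted_rating}}` layout used above, and `top_n` and the sample values are hypothetical.

```python
import heapq

def top_n(recommendations, user, n=3):
    """Return up to n (movie, predicted_rating) pairs, highest rating first."""
    user_recs = recommendations.get(user, {})
    return heapq.nlargest(n, user_recs.items(), key=lambda item: item[1])

# Made-up predictions in the same nested-dict layout (illustrative values only)
sample_recommendations = {
    1: {10: 3.2, 20: 4.7, 30: 4.1, 40: 2.5},
    2: {10: 3.9},
}

print(top_n(sample_recommendations, 1, n=2))
print(top_n(sample_recommendations, 99))  # unknown user gives an empty list
```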

With the above code example, we have built a user-based collaborative filtering recommendation system in Python and calculated its evaluation metric. Of course, this is just a simple example; practical recommendation systems need more sophisticated algorithms and larger datasets to produce more accurate recommendations.
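
For reference, RMSE is the square root of the mean squared error, taken over the predictions that have a matching actual rating. The sketch below is a hand-checkable illustration with made-up numbers; the `rmse` helper and the sample pairs are hypothetical and independent of the code above.

```python
import math

def rmse(pairs):
    """Root-mean-square error over (predicted, actual) rating pairs."""
    if not pairs:
        raise ValueError('need at least one (predicted, actual) pair')
    squared_errors = [(predicted - actual) ** 2 for predicted, actual in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up (predicted, actual) pairs: the errors are 1, -1, 2 and 0
sample_pairs = [(4.0, 3.0), (2.0, 3.0), (5.0, 3.0), (3.5, 3.5)]
error = rmse(sample_pairs)
print(error)
```

The squared errors sum to 6 over 4 pairs, so the result is the square root of 1.5. A lower value means the predicted ratings are closer to the actual ones.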

To summarize, Python provides powerful libraries and tools to build recommendation systems. We can use collaborative filtering algorithms to infer similarities between users and make recommendations based on these similarities. I hope this article can help readers understand how to build a simple but effective recommendation system in Python, and provide some ideas for further exploring the field of recommendation systems.

source:php.cn