A recommendation system is a tool that recommends items or content to users based on their preferences and past behavior; this article shows how to build one in Python. It uses algorithms to predict users' future preferences so that it can serve them the most relevant content.
Recommendation systems have a very broad scope and are widely used in industries such as e-commerce, streaming services, and social media. Products, movies, music, books, and more can all be recommended through these systems. Personalized recommendations not only help increase customer engagement and loyalty, but can also boost sales.
The idea behind a content-based recommender system is that users get recommendations comparable to items they have been exposed to before. The system uses algorithms to pinpoint items that closely match a user's preferences, with the goal of creating a list of suitable suggestions. In this setting, an algorithm analyzes data related to an item, such as its attributes and user ratings, to determine which recommendations to make.
Step 1 - Import the necessary libraries.
Step 2 - Load the dataset.
Step 3 - Preprocess the data.
Step 4 - Calculate the similarity matrix.
Step 5 - For each user:
    a. Select the items they have interacted with.
    b. For each item selected in step 5a:
        - Retrieve its similarity scores to all other items.
        - Use the user's rating as the weight to calculate the weighted average of the similarity scores.
    c. Sort the items in descending order of weighted similarity score.
    d. Recommend the top N items to the user.
Step 6 - Return the recommendations for all users.
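Steps 5 and 6 can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions: the function name recommend_for_user, the item labels, the similarity scores, and the user ratings are all hypothetical values made up for the example, and the weighted average is taken as the sum of rating times similarity divided by the sum of the user's ratings.

```python
def recommend_for_user(user_ratings, item_similarity, top_n=2):
    """Rank unseen items by the rating-weighted average of their similarity
    to the items the user has already rated (hypothetical helper)."""
    total_weight = sum(user_ratings.values())
    if total_weight == 0:
        return []  # nothing to base a recommendation on
    scores = {}
    for candidate in item_similarity:
        if candidate in user_ratings:
            continue  # do not recommend items the user already rated
        weighted_sum = sum(
            rating * item_similarity[rated_item][candidate]
            for rated_item, rating in user_ratings.items()
        )
        scores[candidate] = weighted_sum / total_weight
    # Sort in descending order of weighted similarity and keep the top N
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


# Hypothetical, hand-written item-item similarity scores (symmetric)
item_similarity = {
    "A": {"A": 1.0, "B": 0.9, "C": 0.1, "D": 0.4},
    "B": {"A": 0.9, "B": 1.0, "C": 0.2, "D": 0.5},
    "C": {"A": 0.1, "B": 0.2, "C": 1.0, "D": 0.8},
    "D": {"A": 0.4, "B": 0.5, "C": 0.8, "D": 1.0},
}

# Hypothetical ratings per user; Step 5 runs once per user, Step 6 collects the results
ratings = {"alice": {"A": 5, "C": 1}, "bob": {"C": 4}}
all_recommendations = {
    user: recommend_for_user(user_ratings, item_similarity)
    for user, user_ratings in ratings.items()
}
print(all_recommendations)
```

Here alice rated item A highly, so item B, which is most similar to A, ranks first for her, while bob only rated item C and therefore gets D first.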
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Load data (expects 'title' and 'description' columns)
data = pd.read_csv('movies.csv')

# Compute TF-IDF vectors for each movie;
# missing descriptions are replaced with empty strings so the vectorizer does not fail
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(data['description'].fillna(''))

# Compute cosine similarity between all movies
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)

# Function to get top 10 similar movies based on input movie
def get_recommendations(title):
    # Row index of the first movie with this title (assumes the default integer index)
    idx = data[data['title'] == title].index[0]
    sim_scores = list(enumerate(cosine_sim[idx]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    # Skip the first entry, which is normally the input movie itself
    sim_scores = sim_scores[1:11]
    movie_indices = [i[0] for i in sim_scores]
    return data.iloc[movie_indices]

# Example usage: get top 10 movies similar to 'The Godfather'
print(get_recommendations('The Godfather'))
We load movie data from a local CSV file into a data frame. We convert the movie descriptions into a TF-IDF matrix with the fit_transform() function and then calculate the cosine similarity matrix between all pairs of movies.
Then we define a function that takes a movie title as a parameter and looks up the row index of the first movie with that title in the data frame.
We then create a list of tuples containing the similarity scores between the passed movie and all other movies. Each tuple consists of an index and a similarity score. We sort this list in descending order of score, skip the first entry (normally the movie itself, which has the highest score), and keep the next ten. Finally, we display the matching movies by indexing the data frame with those indices.
      title \
783   The Godfather
1512  The Godfather: Part II
1103  Casino
3509  Things to Do in Denver When
1246  Snatch
3094  Road to Perdition
2494  Scarface
1244  Following
2164  Dancer
2445  The Day of the Jackal
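To make the TF-IDF and cosine-similarity step of the content-based example more concrete, the following sketch computes both by hand using only the standard library. It is an illustration, not a reimplementation of scikit-learn: the helper names tfidf_vectors and cosine, the whitespace tokenizer, the particular smoothed IDF variant, and the three one-line descriptions are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Term count times a smoothed inverse document frequency (one common variant)
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors


def cosine(u, v):
    """Cosine similarity of two L2-normalized sparse vectors (a plain dot product)."""
    return sum(weight * v.get(term, 0.0) for term, weight in u.items())


# Hypothetical one-line descriptions, invented for illustration
descriptions = [
    "mafia family crime drama",
    "crime family saga mafia",
    "space adventure with robots",
]
vectors = tfidf_vectors(descriptions)
for i in range(len(descriptions)):
    print([round(cosine(vectors[i], vectors[j]), 3) for j in range(len(descriptions))])
```

The first two descriptions share most of their words and therefore score close to each other, while the third shares no words with them and gets a similarity of zero. Each description has a similarity of 1 with itself, which is why the article's function skips the first entry of the sorted list.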
Collaborative filtering systems, by contrast, rely on data from other users to generate recommendations. Such a system compares the preferences and behaviors of various users and then suggests items that other users with similar tastes liked. Collaborative filtering can be more accurate than content-based systems when enough ratings are available, because it takes into account the opinions of many users when generating recommendations.
Step 1 - Import the necessary libraries.
Step 2 - Load the "ratings.csv" file that provides user ratings.
Step 3 - Create "user_item_matrix" to convert the user rating data into a matrix.
Step 4 - Calculate the similarity between users' ratings using cosine similarity.
Step 5 - Identify the most similar users.
Step 6 - Calculate the average rating the similar users gave each movie.
Step 7 - Select the target user ID.
Step 8 - Print the recommended movie IDs and their average ratings.
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Load data (expects 'userId', 'movieId' and 'rating' columns)
ratings_data = pd.read_csv('ratings.csv')

# Create user-item matrix (rows: users, columns: movies, NaN where unrated)
user_item_matrix = pd.pivot_table(ratings_data, values='rating', index='userId', columns='movieId')

# Calculate cosine similarity between users.
# cosine_similarity rejects NaN values, so unrated movies are treated as 0 here
user_similarity = cosine_similarity(user_item_matrix.fillna(0))

# Get top n similar users for the user at a given row position
def get_top_similar_users(similarity_matrix, user_index, n=10):
    similar_users = similarity_matrix[user_index].argsort()[::-1]
    # Drop the user themselves before taking the top n
    similar_users = similar_users[similar_users != user_index]
    return similar_users[:n]

# Get recommended items for a user based on similar users
def get_recommendations(user_id, user_similarity, user_item_matrix, n=10):
    # Translate the userId label into its row position in the matrix
    user_index = user_item_matrix.index.get_loc(user_id)
    similar_users = get_top_similar_users(user_similarity, user_index, n)
    # Average the similar users' ratings per movie (unrated entries are ignored)
    recommendations = user_item_matrix.iloc[similar_users].mean(axis=0).sort_values(ascending=False).head(n)
    return recommendations

# Example usage
user_id = 1
recommendations = get_recommendations(user_id, user_similarity, user_item_matrix)
print("Top 10 recommended movies for user", user_id)
print(recommendations)
Top 10 recommended movies for user 1
movieId
1196    5.000000
50      5.000000
1210    5.000000
260     5.000000
1198    5.000000
2571    5.000000
527     5.000000
1197    5.000000
2762    5.000000
858     4.961538
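The user-based approach can also be traced by hand on a tiny dataset. The sketch below uses only the standard library and is meant as an illustration under stated assumptions: the helper names cosine and recommend, the parameters k and top_n, and the ratings themselves are hypothetical. Unlike the example above, it skips movies the target user has already rated, which is one possible design choice rather than part of the original code.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse rating dicts; unrated movies count as 0."""
    dot = sum(r * v[m] for m, r in u.items() if m in v)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(target, ratings, k=2, top_n=3):
    """Average the k most similar users' ratings for movies the target has not rated."""
    neighbours = sorted(
        (user for user in ratings if user != target),
        key=lambda user: cosine(ratings[target], ratings[user]),
        reverse=True,
    )[:k]
    collected = {}
    for user in neighbours:
        for movie, rating in ratings[user].items():
            if movie not in ratings[target]:
                collected.setdefault(movie, []).append(rating)
    averages = {movie: sum(rs) / len(rs) for movie, rs in collected.items()}
    return sorted(averages.items(), key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical ratings: {userId: {movieId: rating}}
ratings = {
    1: {10: 5, 20: 4},
    2: {10: 5, 20: 5, 30: 4, 40: 2},
    3: {10: 4, 20: 4, 30: 5},
    4: {40: 5, 50: 4},
}
print(recommend(1, ratings))  # prints [(30, 4.5), (40, 2.0)]
```

Users 2 and 3 rated movies 10 and 20 much like user 1, so they become the two neighbours; user 4 shares no rated movies with user 1 and is left out. Movie 30 is recommended first because both neighbours rated it highly.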
Creating a recommender system can be a complex task for programmers, but it is a valuable tool that can bring substantial benefits. Python offers a variety of libraries, such as pandas and scikit-learn used above, that simplify building and customizing one. However, as with any coding endeavor, problems can arise during development. Being aware of the typical complications and taking steps to address them is critical to the success of a recommender system.
Ultimately, it is important to remember that a recommender system can be a very powerful asset, so it is worth investing the necessary time and effort to ensure that it is built correctly and operates optimally.