User Based and Item Based
Mahout is a collection of machine learning
algorithms for recommendation (collaborative filtering), clustering and
classification.
To implement a recommender we first need an input data file in which
every line contains one record. Each record holds the user ID, item ID
and preference value, in that order, separated by commas.
Input File – input.txt
501,1002,5
501,1012,3
510,1002,2
515,1002,5
501,1020,1
…
Note that the user ID and item ID must be integers; alphanumeric
characters won't serve our purpose. Also, the larger the input file,
the better the quality of the recommendations generally is.
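The record format described above can be checked before a file is handed to Mahout. The following is a minimal sketch in plain Java, not Mahout's own parser; the class PreferenceRecord and its parse method are hypothetical names chosen for illustration. It assumes comma-separated fields and rejects any line whose IDs are not integers.

```java
/** Hypothetical helper (not part of Mahout) that validates one input line. */
final class PreferenceRecord {
    final long userId;
    final long itemId;
    final float preference;

    PreferenceRecord(long userId, long itemId, float preference) {
        this.userId = userId;
        this.itemId = itemId;
        this.preference = preference;
    }

    /** Parses "userId,itemId,preference"; throws IllegalArgumentException on bad input. */
    static PreferenceRecord parse(String line) {
        String[] fields = line.trim().split(",");
        if (fields.length != 3) {
            throw new IllegalArgumentException("Expected 3 comma-separated fields: " + line);
        }
        try {
            long userId = Long.parseLong(fields[0].trim());
            long itemId = Long.parseLong(fields[1].trim());
            float preference = Float.parseFloat(fields[2].trim());
            return new PreferenceRecord(userId, itemId, preference);
        } catch (NumberFormatException e) {
            // Alphanumeric IDs such as "u501" end up here
            throw new IllegalArgumentException(
                "IDs must be integers and the preference numeric: " + line, e);
        }
    }
}
```

With this sketch, PreferenceRecord.parse("501,1002,5") gives user 501, item 1002 and preference 5.0, while a line such as "u501,1002,5" is rejected with an exception.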
Recommenders
Recommenders are broadly classified into two categories based on the
approach they use to generate recommendations:
1. User Based Recommendations
Recommendations are derived from how similar users are to other users,
i.e. to make recommendations for a user (User1) we take into account
the users who share similar tastes and, based on the items they possess,
we recommend items to User1.
2. Item Based Recommendations
Recommendations are derived from how similar items are to other items,
i.e. based on the items a user already has, more similar items are
recommended.
When we make recommendations with Mahout, the key components involved are:
Data Model
It is an encapsulation used by Mahout to hold the input data. It gives
the various recommender algorithms efficient access to that data.
Similarity Algorithm
There are various kinds of similarity algorithms available, and Mahout
has implementations of the popular ones such as Pearson Correlation,
Cosine Measure, Euclidean Distance, Log Likelihood, Tanimoto Coefficient,
etc.
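To make the idea of a similarity value concrete, here is a minimal sketch of the textbook Pearson correlation between two users, computed over the items both have rated. It is plain Java written for illustration and is not Mahout's implementation, which may differ in details; the class PearsonSketch and the map-based representation of a user's preferences are assumptions.

```java
import java.util.Map;

/** Hypothetical illustration (not Mahout code) of Pearson correlation between two users. */
final class PearsonSketch {

    /**
     * Computes the Pearson correlation over the items rated by both users.
     * Each map goes from item ID to preference value.
     * Returns Double.NaN when fewer than two items overlap or a variance is zero.
     */
    static double correlation(Map<Long, Double> a, Map<Long, Double> b) {
        int n = 0;
        double sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
        for (Map.Entry<Long, Double> entry : a.entrySet()) {
            Double other = b.get(entry.getKey());
            if (other == null) {
                continue; // item not rated by both users
            }
            double x = entry.getValue();
            double y = other;
            n++;
            sumA += x;
            sumB += y;
            sumAA += x * x;
            sumBB += y * y;
            sumAB += x * y;
        }
        if (n < 2) {
            return Double.NaN;
        }
        double covariance = sumAB - sumA * sumB / n;
        double varianceA = sumAA - sumA * sumA / n;
        double varianceB = sumBB - sumB * sumB / n;
        if (varianceA <= 0 || varianceB <= 0) {
            return Double.NaN; // a user who rates everything the same has no defined correlation
        }
        return covariance / Math.sqrt(varianceA * varianceB);
    }
}
```

Two users whose ratings rise and fall together score close to +1, users with opposite tastes score close to -1, and users with no items in common get no score at all.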
User Neighborhood
This applies to user based recommendations, which are made on the basis
of user to user similarity. We form a neighborhood of the most similar
users, who share almost the same tastes, so that we get better
recommendations. The algorithms that we use to select the user
neighborhood are:
1. Nearest N User Neighborhood
Here we specify the neighborhood size, i.e. the exact number of most
similar users to be considered for generating recommendations, say 100,
500, etc.
2. Threshold User Neighborhood
We don’t specify the neighborhood size; rather, we specify a similarity
threshold, which is a value between -1 and +1. If we specify a value of
0.7, then only the users whose similarity is greater than 0.7 are
considered part of the neighborhood. The higher the value, the more
similar the users are.
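The difference between the two neighborhood strategies can be sketched in plain Java as follows. This is an illustration only and not how Mahout implements its neighborhood classes; the class NeighborhoodSketch, its method names and the map from user ID to similarity are assumptions. The threshold variant uses a strict "greater than" comparison to match the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration (not Mahout code) of the two neighborhood strategies. */
final class NeighborhoodSketch {

    /** Returns the IDs of the n users with the highest similarity, most similar first. */
    static List<Long> nearestN(Map<Long, Double> similarityByUser, int n) {
        List<Map.Entry<Long, Double>> entries = new ArrayList<>(similarityByUser.entrySet());
        // Sort by similarity, highest first
        entries.sort(Map.Entry.<Long, Double>comparingByValue().reversed());
        List<Long> neighbors = new ArrayList<>();
        for (Map.Entry<Long, Double> entry : entries.subList(0, Math.min(n, entries.size()))) {
            neighbors.add(entry.getKey());
        }
        return neighbors;
    }

    /** Returns the IDs of all users whose similarity is greater than the threshold. */
    static List<Long> aboveThreshold(Map<Long, Double> similarityByUser, double threshold) {
        List<Long> neighbors = new ArrayList<>();
        for (Map.Entry<Long, Double> entry : similarityByUser.entrySet()) {
            if (entry.getValue() > threshold) {
                neighbors.add(entry.getKey());
            }
        }
        return neighbors;
    }
}
```

With a fixed size the neighborhood always has up to n members, however dissimilar the last ones are. With a threshold the neighborhood may be large, small or even empty, but every member is guaranteed to be at least that similar.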
Recommender
It is the final computing object, which couples together the data model,
similarity algorithm and neighborhood to generate recommendations based
on them.
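One common way for a user based recommender to combine these pieces is to estimate a user's preference for an unseen item as the similarity-weighted average of the neighbors' preferences for it. The sketch below shows that idea in plain Java under stated assumptions: the class EstimateSketch and its inputs are hypothetical, neighbors with non-positive similarity are simply skipped, and Mahout's own recommenders may handle details differently.

```java
import java.util.Map;

/** Hypothetical illustration (not Mahout code) of a similarity-weighted estimate. */
final class EstimateSketch {

    /**
     * similarityByNeighbor: neighbor user ID -> similarity to the target user.
     * ratingByNeighbor: neighbor user ID -> that neighbor's preference for the item
     * (no entry if the neighbor has not rated it).
     * Returns Double.NaN when no neighbor with positive similarity has rated the item.
     */
    static double estimate(Map<Long, Double> similarityByNeighbor,
                           Map<Long, Double> ratingByNeighbor) {
        double weightedSum = 0;
        double totalWeight = 0;
        for (Map.Entry<Long, Double> entry : similarityByNeighbor.entrySet()) {
            Double rating = ratingByNeighbor.get(entry.getKey());
            double similarity = entry.getValue();
            if (rating == null || similarity <= 0) {
                continue; // neighbor has not rated the item, or is not similar
            }
            weightedSum += similarity * rating;
            totalWeight += similarity;
        }
        return totalWeight == 0 ? Double.NaN : weightedSum / totalWeight;
    }
}
```

For example, if a neighbor with similarity 0.75 rated an item 5 and a neighbor with similarity 0.25 rated it 1, the estimate is (0.75 * 5 + 0.25 * 1) / (0.75 + 0.25) = 4.0, so the more similar neighbor dominates. The items with the highest estimates become the recommendations.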
Sample code snippets to generate user and item based recommendations are given below.
User Based Recommender
import java.io.File;
import java.io.IOException;
import java.util.List;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
public class UserRecommender {
    public static void main(String[] args) {
        // ID of the user for whom recommendations are generated
        long userId = 510;
        // Number of recommendations to generate
        int noOfRecommendations = 5;
        try {
            // Data model that wraps the input file
            FileDataModel dataModel = new FileDataModel(new File("D://input.txt"));
            // Similarity algorithm used to compare users
            UserSimilarity userSimilarity = new PearsonCorrelationSimilarity(dataModel);
            // NearestNUserNeighborhood is preferred when we need control over the exact number of neighbors
            UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, userSimilarity, dataModel);
            // Initialize the recommender
            Recommender recommender = new GenericUserBasedRecommender(dataModel, neighborhood, userSimilarity);
            // Call the recommend method to generate recommendations
            List<RecommendedItem> recommendations = recommender.recommend(userId, noOfRecommendations);
            // Print the ID of each recommended item
            for (RecommendedItem recommendedItem : recommendations) {
                System.out.println(recommendedItem.getItemID());
            }
        } catch (IOException e) {
            // The input file could not be read
            e.printStackTrace();
        } catch (TasteException e) {
            // Mahout failed while computing the recommendations
            e.printStackTrace();
        }
    }
}
Item Based Recommender
import java.io.File;
import java.io.IOException;
import java.util.List;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.recommender.ItemBasedRecommender;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;
public class ItemRecommender {
    public static void main(String[] args) {
        // ID of the user for whom recommendations are generated
        long userId = 510;
        // Number of recommendations to generate
        int noOfRecommendations = 5;
        try {
            // Data model that wraps the input file
            FileDataModel dataModel = new FileDataModel(new File("D://input.txt"));
            // Similarity algorithm used to compare items
            ItemSimilarity itemSimilarity = new PearsonCorrelationSimilarity(dataModel);
            // Initialize the recommender; no user neighborhood is needed here
            ItemBasedRecommender recommender = new GenericItemBasedRecommender(dataModel, itemSimilarity);
            // Call the recommend method to generate recommendations
            List<RecommendedItem> recommendations = recommender.recommend(userId, noOfRecommendations);
            // Print the ID of each recommended item
            for (RecommendedItem recommendedItem : recommendations) {
                System.out.println(recommendedItem.getItemID());
            }
        } catch (IOException e) {
            // The input file could not be read
            e.printStackTrace();
        } catch (TasteException e) {
            // Mahout failed while computing the recommendations
            e.printStackTrace();
        }
    }
}