With real-world data we can't always guarantee that the input supplied
for generating recommendations contains only integer values for user
and item IDs. If either of these is not an integer, the default data
models that Mahout provides can't process the data. Here, let us
consider the case where the item IDs are strings; for this we define a
custom data model. In it we override one method so that the item ID is
read as a string, converted to a long, and that long value (unique for
practical purposes, since it comes from a hash) is returned.
Data Model Class
import java.io.File;
import java.io.IOException;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;

public class AlphaItemFileDataModel extends FileDataModel {

    // Created lazily and left without an initializer on purpose: in Mahout
    // versions where the FileDataModel constructor loads the file,
    // readItemIDFromString() runs before this class's field initializers.
    private ItemMemIDMigrator memIdMigtr;

    public AlphaItemFileDataModel(File dataFile) throws IOException {
        super(dataFile);
    }

    public AlphaItemFileDataModel(File dataFile, boolean transpose) throws IOException {
        super(dataFile, transpose);
    }

    @Override
    protected long readItemIDFromString(String value) {
        if (memIdMigtr == null) {
            memIdMigtr = new ItemMemIDMigrator();
        }
        long retValue = memIdMigtr.toLongID(value);
        // Store the string form the first time this ID is seen.
        if (null == memIdMigtr.toStringID(retValue)) {
            try {
                memIdMigtr.singleInit(value);
            } catch (TasteException e) {
                e.printStackTrace();
            }
        }
        return retValue;
    }

    // Maps a long item ID back to the original string ID.
    String getItemIDAsString(long itemId) {
        return memIdMigtr.toStringID(itemId);
    }
}
Class that defines the map from the long values back to the original strings
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.common.FastByIDMap;
import org.apache.mahout.cf.taste.impl.model.AbstractIDMigrator;

public class ItemMemIDMigrator extends AbstractIDMigrator {

    private final FastByIDMap<String> longToString;

    public ItemMemIDMigrator() {
        this.longToString = new FastByIDMap<String>(100);
    }

    @Override
    public void storeMapping(long longID, String stringID) {
        synchronized (longToString) {
            longToString.put(longID, stringID);
        }
    }

    @Override
    public String toStringID(long longID) {
        synchronized (longToString) {
            return longToString.get(longID);
        }
    }

    public void singleInit(String stringID) throws TasteException {
        storeMapping(toLongID(stringID), stringID);
    }
}
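To make the string-to-long round trip easier to picture, here is a minimal standalone sketch that uses only the JDK and no Mahout classes. The class name StringIdMapperSketch and its register method are hypothetical and exist only for illustration. The MD5-based hashing is an assumption about how the inherited toLongID derives its value, so treat this as an approximation of the idea rather than the library's exact behaviour. Because the long comes from a hash, collisions are possible in principle, though unlikely for typical catalogue sizes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the migrator above; plain JDK, no Mahout.
public class StringIdMapperSketch {

    // Reverse lookup: long ID -> original string ID.
    private final Map<Long, String> longToString = new HashMap<>();

    // Assumption: hash the string with MD5 and fold the first 8 digest
    // bytes into a long. The same string always yields the same long.
    public static long toLongID(String stringID) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(stringID.getBytes(StandardCharsets.UTF_8));
            long hash = 0L;
            for (int i = 0; i < 8; i++) {
                hash = (hash << 8) | (digest[i] & 0xFFL);
            }
            return hash;
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Computes the long ID and remembers which string it came from.
    public synchronized long register(String stringID) {
        long id = toLongID(stringID);
        longToString.put(id, stringID);
        return id;
    }

    // Returns the original string, or null if the ID was never registered.
    public synchronized String toStringID(long longID) {
        return longToString.get(longID);
    }

    public static void main(String[] args) {
        StringIdMapperSketch mapper = new StringIdMapperSketch();
        long id = mapper.register("ITEM-A42");
        System.out.println(id + " -> " + mapper.toStringID(id));
    }
}
```

The data model plays the role of register while the file is being read, and getItemIDAsString plays the role of toStringID when results need to be shown with their original IDs.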
In your Recommender implementation you
can use this data model class instead of the default file data model to
accept input that contains alphanumeric item IDs, and call
getItemIDAsString to turn a recommended item's long ID back into its
original string. Similarly, you can devise a data model that
accommodates alphanumeric user IDs as well.