A ranking represents the preferences of a user by ordering a set of items from the most to the least preferred. Rankings are a concise way to represent preference information in many situations (rankings of universities in magazines, rankings of movies or restaurants in recommender systems, ...).
When analysing ranking data, one may wonder how to adapt standard machine learning techniques. Aggregation of rankings can be used to produce an "improved" output ranking (Dwork et al., 2001). Clustering the preferences of different users (expressed as rankings) can be used, for instance, to identify users with similar tastes. To use algorithms such as K-means, however, one needs to define a notion of "distance" between rankings, and several possibilities exist (Spearman, Kendall tau, footrule, etc.).
Different distance functions can lead to different clusterings.
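To make the candidate distances concrete, here is a minimal sketch of three of them. It assumes that each ranking is a complete Python list of the same items, most preferred first, and that "Spearman" means the sum of squared position differences (some texts take a square root or normalise). The function names are illustrative only.

```python
from itertools import combinations


def positions(ranking):
    """Map each item to its 0-based position in the ranking."""
    return {item: pos for pos, item in enumerate(ranking)}


def spearman(r1, r2):
    """Sum of squared position differences (assumed definition)."""
    p1, p2 = positions(r1), positions(r2)
    return sum((p1[x] - p2[x]) ** 2 for x in p1)


def footrule(r1, r2):
    """Sum of absolute position differences."""
    p1, p2 = positions(r1), positions(r2)
    return sum(abs(p1[x] - p2[x]) for x in p1)


def kendall_tau(r1, r2):
    """Number of item pairs that the two rankings order differently."""
    p2 = positions(r2)
    # combinations(r1, 2) yields pairs (a, b) with a ranked before b in r1.
    return sum(1 for a, b in combinations(r1, 2) if p2[a] > p2[b])


if __name__ == "__main__":
    r1 = ["a", "b", "c", "d"]
    r2 = ["b", "a", "d", "c"]
    print(spearman(r1, r2), footrule(r1, r2), kendall_tau(r1, r2))
```

The Kendall tau version above is quadratic in the number of items, which matters when comparing running times; faster counting schemes exist but are not shown here.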
As shown in Viappiani (2015), the "centroid" of a group of rankings can be computed very efficiently when using the Spearman distance, or a generalisation of Spearman that gives different weights to different positions in the rankings.
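The following sketch illustrates why the centroid is cheap in this setting. It assumes the positional Spearman distance has the form sum over items of (w[pos1] - w[pos2])^2 for a decreasing weight vector w; this is a reading of the cited paper, not something stated in this brief. Under that assumption, a ranking that minimises the total distance to a group is obtained by sorting items by their total score. The name `scoring_centroid` and the default Borda-style weights are illustrative choices.

```python
def scoring_centroid(rankings, weights=None):
    """Return a centroid ranking by sorting items on their total score.

    rankings: list of complete rankings (lists of the same items,
              most preferred first).
    weights:  weights[p] is the score of position p (assumed decreasing).
              The default m - p corresponds to plain Spearman (Borda scores).
    """
    m = len(rankings[0])
    if weights is None:
        weights = [m - p for p in range(m)]

    totals = {item: 0 for item in rankings[0]}
    for ranking in rankings:
        for pos, item in enumerate(ranking):
            totals[item] += weights[pos]

    # Highest total score first; ties keep the order of the first ranking
    # because sorted() is stable.
    return sorted(totals, key=lambda item: -totals[item])


if __name__ == "__main__":
    group = [["a", "c", "b"]] * 3 + [["b", "c", "a"]] * 2
    print(scoring_centroid(group))             # plain Spearman
    print(scoring_centroid(group, [4, 1, 0]))  # top-heavy weights
```

The cost is one pass over the data plus a sort of the items, so the choice of weights changes the resulting centroid but not the running time.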
This project requires implementing K-means for rank data and reporting the results obtained by clustering under different distance measures (Kendall tau, footrule, Spearman, and the positional Spearman of Viappiani, 2015). You shall analyse the results for different values of k (the number of clusters) and compare the running times. For data, use at least two datasets from PREFLIB (link given below). Justify how you address specific issues related to the data (for instance: how to handle incomplete data?).
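As a starting point, the overall loop could look like the sketch below. It is only one possible design: it assumes complete rankings over the same items, uses the squared Spearman distance with a mean-position centroid, and initialises the centers from randomly chosen distinct rankings. The names `kmeans_rankings`, `spearman` and `borda_centroid` are hypothetical, and the `distance` and `centroid` parameters are there so that other measures can be plugged in.

```python
import random


def spearman(r1, r2):
    """Sum of squared position differences between two complete rankings."""
    p2 = {item: pos for pos, item in enumerate(r2)}
    return sum((pos - p2[item]) ** 2 for pos, item in enumerate(r1))


def borda_centroid(rankings):
    """Sort items by total position (lowest first)."""
    totals = {item: 0 for item in rankings[0]}
    for ranking in rankings:
        for pos, item in enumerate(ranking):
            totals[item] += pos
    return sorted(totals, key=totals.get)


def kmeans_rankings(rankings, k, distance=spearman,
                    centroid=borda_centroid, max_iter=100, seed=0):
    """Cluster rankings into k groups; returns (labels, centers)."""
    rng = random.Random(seed)
    # Initialise from distinct rankings so that no two centers coincide.
    # Raises ValueError if there are fewer than k distinct rankings.
    distinct = list(dict.fromkeys(tuple(r) for r in rankings))
    centers = [list(r) for r in rng.sample(distinct, k)]

    labels = None
    for _ in range(max_iter):
        # Assignment step: each ranking goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: distance(r, centers[c]))
            for r in rankings
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels

        # Update step: recompute the center of every non-empty cluster.
        for c in range(k):
            members = [r for r, lab in zip(rankings, labels) if lab == c]
            if members:
                centers[c] = centroid(members)
    return labels, centers


if __name__ == "__main__":
    data = [["a", "b", "c", "d"]] * 3 + [["d", "c", "b", "a"]] * 3
    labels, centers = kmeans_rankings(data, k=2)
    print(labels, centers)
```

For Kendall tau or footrule, the `centroid` argument would need a matching aggregation procedure, since sorting by total position is only justified here for the Spearman case. Incomplete rankings are not handled and would need preprocessing before calling this function.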
- PREFLIB library
[url removed, login to view]
Cynthia Dwork, Ravi Kumar, Moni Naor, and D. Sivakumar. Rank aggregation methods for the Web. Proceedings of the World Wide Web Conference (WWW), 2001, pp. 613-622.
Paolo Viappiani. Characterization of Scoring Rules with Distances: Application to the Clustering of Rankings. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2015, pp. 104-110.