Recommender Systems @ SIGIR 2009

Jul 24, 2009

There were two sessions on recommender systems at this year’s ACM SIGIR (held in Boston). Overall, it was a good conference: well organised and smoothly run. It quickly became apparent to me (a first-timer at SIGIR) that this is a tight community of researchers; there were many hugs at the opening drinks. Here is a quick summary of the recommender system papers and a couple of other noteworthy papers and events.

On Social Networks and Collaborative Recommendation. A group from Glasgow explored the idea of random walks on the user-user, user-tag, and user-track graphs for personalising recommendations in social web contexts, like last.fm. The authors posted their dataset online (but, at the time of writing, the link is broken). An interesting discussion broke out after the presentation about how best to subsample a graph when crawling online datasets. To avoid the “sparsity problem,” the authors had removed all users with no social links and all tracks that had been listened to by fewer than 8 users (why 8? I don’t know), and this pruning caused some stir in the audience.
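To make the random-walk idea concrete, here is a minimal sketch of one common variant, a random walk with restart, run on a toy graph that mixes users and tracks. This is an illustration under my own assumptions, not the authors' implementation: the graph, the node names, the `restart` value, and the function names are all hypothetical.

```python
def random_walk_with_restart(graph, start, restart=0.15, iterations=50):
    """Approximate how often a walk that keeps restarting at `start` visits each node.

    graph: dict mapping each node to a list of neighbours (undirected edges
    are listed in both directions).
    """
    scores = {node: 0.0 for node in graph}
    scores[start] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, mass in scores.items():
            neighbours = graph[node]
            if not neighbours:
                # Dangling node: send all of its mass back to the start.
                nxt[start] += mass
                continue
            # Spread the non-restarting share evenly over the neighbours.
            share = (1.0 - restart) * mass / len(neighbours)
            for neighbour in neighbours:
                nxt[neighbour] += share
            # The remaining share jumps back to the start node.
            nxt[start] += restart * mass
        scores = nxt
    return scores


def recommend_tracks(graph, user, k=2):
    """Rank tracks the user has not played by their random-walk score."""
    scores = random_walk_with_restart(graph, user)
    already_played = set(graph[user])
    candidates = [
        (score, node)
        for node, score in scores.items()
        if node.startswith("t:") and node not in already_played and score > 0
    ]
    candidates.sort(reverse=True)
    return [node for _, node in candidates[:k]]


# Hypothetical toy data: "u:" nodes are users, "t:" nodes are tracks.
toy_graph = {
    "u:alice": ["u:bob", "t:song1"],
    "u:bob": ["u:alice", "t:song1", "t:song2"],
    "u:carol": ["t:song3"],
    "t:song1": ["u:alice", "u:bob"],
    "t:song2": ["u:bob"],
    "t:song3": ["u:carol"],
}

print(recommend_tracks(toy_graph, "u:alice"))
```

In this toy graph, carol and her track are disconnected from alice, so the walk never reaches them. That is a small-scale version of why pruning decisions (dropping users without social links, dropping rarely played tracks) change what such a method can recommend.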

Learning to Recommend with Social Trust Ensemble. Researchers from a Hong Kong university showed how to merge trust values (such as those found in the Epinions dataset) with state-of-the-art collaborative filtering algorithms that are based on matrix factorisation. An interesting paper, since most trust-based approaches rely on nearest-neighbour algorithms in order to maintain the transparency that seems required in this context. The authors said they may perform a user study to explore this transparency question, which I think would be very interesting.
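As a rough sketch of what merging trust with matrix factorisation can look like, the snippet below blends a user's own factor-based prediction with the trust-weighted predictions of the people they trust. It only illustrates the blending step under my own assumptions: the latent factors are hard-coded rather than learned, and `alpha`, the data, and the function names are hypothetical rather than taken from the paper.

```python
def dot(a, b):
    """Inner product of two equal-length latent factor vectors."""
    return sum(x * y for x, y in zip(a, b))


def predict(user, item, user_factors, item_factors, trust, alpha=0.5):
    """Blend the user's own taste with the tastes of the users they trust.

    alpha = 1.0 ignores trust entirely; alpha = 0.0 relies only on trusted users.
    """
    own = dot(user_factors[user], item_factors[item])
    trusted = trust.get(user, {})
    total_trust = sum(trusted.values())
    if total_trust == 0:
        # No trust information: fall back to plain matrix factorisation.
        return own
    # Trust-weighted average of what the trusted users would be predicted to rate.
    social = sum(
        weight * dot(user_factors[friend], item_factors[item])
        for friend, weight in trusted.items()
    ) / total_trust
    return alpha * own + (1.0 - alpha) * social


# Hypothetical, hand-picked factors; a real system would learn these from ratings.
user_factors = {"ann": [1.0, 0.0], "ben": [0.0, 1.0]}
item_factors = {"film": [2.0, 4.0]}
trust = {"ann": {"ben": 1.0}}  # ann trusts ben; ben trusts nobody

print(predict("ann", "film", user_factors, item_factors, trust))
```

Here ann's own prediction is 2.0 and ben's is 4.0, so an even blend gives 3.0. The transparency concern is visible even in this toy: the trusted user's influence passes through latent factors rather than through an explicit, explainable neighbour rating.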

Fast Nonparametric Matrix Factorization for Large-Scale Collaborative Filtering. An ensemble of NEC Labs and Carnegie Mellon researchers presented a method for performing fast collaborative filtering; as the title suggests, their methods are nonparametric. They presented both accuracy and run-time results; impressively, they ran a number of algorithms, and offered source code to anyone who gets in touch. By far the most math-y paper of the session, but well done, guys!

The Wisdom of the Few. This is a paper that I co-authored with Xavier Amatriain while I was an intern at Telefonica Research; Xavier presented it at SIGIR (I presented the follow-up work at an IJCAI workshop). The basic idea is to use a small dataset of ‘experts’ in order to predict the masses’ preferences. While a small number of experts did not prove to be sufficient to out-predict other methods, an extensive user study pointed the other way: the users liked the expert recommendations more.
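For readers who want a feel for the basic idea, here is a deliberately simplified sketch: predict a user's rating for an item from the ratings of the experts whose tastes resemble that user's. It is not the exact method from the paper; the similarity measure, the `min_sim` threshold, the data, and the function names are all assumptions made for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity over the items that both rating dicts share."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    num = sum(a[i] * b[i] for i in common)
    den = sqrt(sum(a[i] ** 2 for i in common)) * sqrt(sum(b[i] ** 2 for i in common))
    return num / den if den else 0.0


def predict_from_experts(user_ratings, expert_ratings, item, min_sim=0.1):
    """Similarity-weighted average of expert ratings for `item`.

    Returns None when no sufficiently similar expert has rated the item.
    """
    num = den = 0.0
    for ratings in expert_ratings.values():
        if item not in ratings:
            continue
        sim = cosine(user_ratings, ratings)
        if sim >= min_sim:
            num += sim * ratings[item]
            den += sim
    return num / den if den else None


# Hypothetical data: one user and two experts rating items on a 1-5 scale.
user = {"a": 5, "b": 1}
experts = {
    "critic1": {"a": 5, "b": 1, "c": 4},
    "critic2": {"a": 1, "b": 5, "c": 2},
}

print(predict_from_experts(user, experts, "c"))
```

The user agrees closely with critic1, so the prediction for item "c" lands nearer to critic1's rating of 4 than to critic2's rating of 2. Note that only the small expert dataset is consulted; no other users' ratings are needed.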

Personalized Tag Recommendation Using Graph-Based Ranking on Multi-type Interrelated Objects. I find the problem of recommending tags interesting because it suggests to users how to annotate content (rather than the traditional recommender problem of suggesting what content to rate or annotate). I would like to see a study comparing retrieval and recommendation performance on datasets produced with and without tag recommendation; perhaps the tag recommendations denoise the data? The authors approach the tag-recommendation problem by formulating it as a retrieval problem, where the document and user are the query and the suggested tags are the result.

A related paper (external to the Recommenders session) is A Statistical Comparison of Tag and Query Logs, which compares the way people tag to the way that they search, looking at word overlap and term distribution.

Leveraging Sources of Collective Wisdom on the Web for Discovering Technology Synergies. Unfortunately Ziegler (who wrote his PhD thesis on decentralised recommender systems) was not here to present his work, and a Siemens representative replaced him. The paper dealt with an automated way of finding technological synergies between departments in a large R&D organisation (like Siemens!).

Other than the above recommender-related papers, there were two UCL contributions (that appeared in the Retrieval Models session): “Risky Business: Modeling and Exploiting Uncertainty in Information Retrieval” and “Portfolio Theory of Information Retrieval.” I am planning on reading these soon; more details are available on Jun Wang’s (excellent) blog.

There was also an extensive poster session (I was, however, busy attending to mine), sessions on Web 2.0, interactive search, and multimedia (amongst others), a keynote by Barabasi, a banquet at the JFK museum, a boat ride on the harbor, and a number of workshops. I’ll limit this post to the recommender system stuff. More details on the workshop to come.