Lead by Jure Žbontar, the team from University of Ljubljana wins over 126 other entrants in an international competition in predictive data analytics.
Jure’s team consisted of several Orange developers and computer science students: Miha Zidar, Blaž Zupan, Gregor Majcen, Marinka Žitnik in Matic Potočnik. To win, the team had to predict topics for 10.000 MedLine documents that were represented with over 25.000 algorithmically derived numerical features. Given was training set of another 10.000 documents in the same representation but each labeled with a set of topics. From the training set the task was to develop a model to predict labels for documents in the test set. A particular challenge was guessing the right number of topics to be associated with the documents, as these, at least in the training set, varied from one to a dozen.
JRS 2012 is just one in a series of competitions recently organized on servers such as TunedIT and Kaggle. The price for winning was $1000 and a trip to Joint Rough Set Symposium in Chengdu, China, to present a winning strategy and developed data mining techiques.