One of the ranking algorithms I've tried was RankBoost with binary rankers, originally proposed by Freund and Schapire. To be able to separate not only documents with a high value of some feature from documents with a low value of the same feature, but also, for example, documents whose feature value lies somewhere around 0.5 from all the others, I ran additional experiments using ranking features that are functions of other features. For that purpose I selected a truncated Gaussian with mean=0.5 and also a multimodal sine-based function on [0,1]:
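
To make the idea concrete, here is a minimal sketch of such derived features. The exact functions used in the experiments are not reproduced here, so the width `sigma`, the number of peaks `modes`, the squared-sine form and all function names are my assumptions for illustration only.

```python
import math


def truncated_gaussian(x, mean=0.5, sigma=0.15):
    """Gaussian bump centred at `mean`, cut to zero outside [0, 1].

    `sigma` is an assumed width; the original value is not known.
    """
    if x < 0.0 or x > 1.0:
        return 0.0
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))


def multimodal_sine(x, modes=3):
    """sin^2(modes * pi * x): has `modes` equal peaks on [0, 1].

    The squared-sine form and `modes` are assumptions, not the original formula.
    """
    if x < 0.0 or x > 1.0:
        return 0.0
    return math.sin(modes * math.pi * x) ** 2


def expand_features(features):
    """Return the original features followed by both transforms of each one."""
    expanded = list(features)
    for value in features:
        expanded.append(truncated_gaussian(value))
        expanded.append(multimodal_sine(value))
    return expanded


def binary_ranker(value, threshold):
    """Binary weak ranker: 1 if the feature exceeds the threshold, else 0."""
    return 1 if value > threshold else 0
```

With these, thresholding the Gaussian-transformed feature picks out documents near 0.5: `binary_ranker(truncated_gaussian(0.5), 0.8)` fires, while the same ranker applied to a raw value of 0.05 or 0.95 passed through the transform does not.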

The plots show the experiment results in terms of the RankBoost performance value and Yandex DCG:

As for me, I've drawn 3 conclusions:
- Using only the original features to create weak rankers sucks.
- Using weak rankers based on functions of features is slightly better.
- The whole approach still sucks in terms of DCG.
The final DCG value is then obtained by summing the DCGs of all the queries and dividing the sum by the number of queries.
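
The averaging step can be sketched as follows. The per-query `dcg` below uses the common logarithmic discount `rel / log2(position + 1)`, which is an assumption on my part and may differ from the exact Yandex DCG definition; only `mean_dcg` corresponds directly to the sum-and-divide step described above. Function names are hypothetical.

```python
import math


def dcg(relevances, k=None):
    """DCG of one query; `relevances` are listed in ranked order.

    Uses the common log2 discount (an assumed variant, not necessarily
    the one Yandex uses).
    """
    ranked = relevances if k is None else relevances[:k]
    return sum(rel / math.log2(pos + 1) for pos, rel in enumerate(ranked, start=1))


def mean_dcg(per_query_relevances, k=None):
    """Sum the DCGs of all queries and divide by the number of queries."""
    if not per_query_relevances:
        return 0.0
    total = sum(dcg(rels, k) for rels in per_query_relevances)
    return total / len(per_query_relevances)
```

For example, `mean_dcg([[1.0], [0.0]])` averages a query with DCG 1.0 and a query with DCG 0.0.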
Hello,
I do not understand your use of feature transformations. With negative alphas, the boosting process of RankBoost is able, if necessary, to approximate any function of the provided features.
Tanguy
Alex Gorodilov (from Yandex), who took 1st (?) unofficial place in the rating, used the GBM package for R. Nothing to invent =)
Learning to rank is a mature branch of ML, and it isn't productive to just take and try some general ML techniques: the field has its own state of the art.