Summary of the Datasets
NewsRecLib currently integrates two benchmark datasets: MIND and Adressa. Each is supported in two variants that differ in size.
MIND Dataset
NewsRecLib provides downloading, parsing, annotation, and loading functionality for two variants of MIND: MINDsmall and MINDlarge.
Reference: Wu, Fangzhao, Ying Qiao, Jiun-Hung Chen, Chuhan Wu, Tao Qi, Jianxun Lian, Danyang Liu et al. “MIND: A large-scale dataset for news recommendation.” In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 3597-3606. 2020.
For further details, please refer to the paper.
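To give a feel for what the parsing step involves, here is a minimal sketch of reading one line of a MIND `behaviors.tsv` file. It assumes the publicly documented tab-separated layout (impression ID, user ID, time, click history, impressions), where each impression is a news ID optionally followed by `-1` (clicked) or `-0` (not clicked). The names `Impression` and `parse_behaviors_line` are hypothetical and are not part of the NewsRecLib API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Impression:
    """One row of a MIND behaviors.tsv file (hypothetical container)."""
    impression_id: str
    user_id: str
    time: str
    history: List[str]                            # previously clicked news IDs
    candidates: List[Tuple[str, Optional[int]]]   # (news ID, click label or None)


def parse_behaviors_line(line: str) -> Impression:
    """Parse one tab-separated behaviors line (assumed 5-column layout)."""
    impression_id, user_id, time, history, impressions = line.rstrip("\n").split("\t")

    candidates: List[Tuple[str, Optional[int]]] = []
    for item in impressions.split():
        if "-" in item:
            # Labeled split: "N55689-1" -> ("N55689", 1)
            news_id, label = item.rsplit("-", 1)
            candidates.append((news_id, int(label)))
        else:
            # Unlabeled split: only the news ID is given
            candidates.append((item, None))

    # An empty history field yields an empty list
    return Impression(impression_id, user_id, time, history.split(), candidates)


example = "1\tU13740\t11/11/2019 9:05:58 AM\tN55189 N42782\tN55689-1 N35729-0"
parsed = parse_behaviors_line(example)
```

Keeping the time field as a raw string avoids committing to a timestamp format here; a real loader would likely convert it and map news IDs to article content.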
Adressa Dataset
NewsRecLib provides downloading, parsing, annotation, and loading functionality for two variants of Adressa: 1-week and 3-month.
Reference: Gulla, Jon Atle, Lemei Zhang, Peng Liu, Özlem Özgöbek, and Xiaomeng Su. “The Adressa dataset for news recommendation.” In Proceedings of the International Conference on Web Intelligence, pp. 1042-1048. 2017.
For further details, please refer to the paper.