newsreclib.data.components package

Submodules

newsreclib.data.components.adressa_dataframe module

newsreclib.data.components.adressa_user_info module

class newsreclib.data.components.adressa_user_info.UserInfo(train_date_split: int, test_date_split: int)[source]

Bases: object

train_date_split

A string with the date before which click behaviors are included in the history of a user.

test_date_split

A string with the date after which click behaviors are included in the test set.

sort_click()[source]

Sorts user clicks by time in ascending order.

update(nindex: int, click_time: int, date: str)[source]

Parameters:
  • nindex – The index of a news article.

  • click_time – The time when the user clicked on the news article.

  • date – The processed click time used to assign the sample into the history of the user, the train or the test set.
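The routing described above can be illustrated with a minimal sketch. `UserInfoSketch` is a hypothetical stand-in, not the library's implementation. It assumes that dates are comparable integers (the constructor signature annotates the splits as `int`, although the attribute descriptions call them strings), that clicks dated before `train_date_split` go to the history, that clicks dated on or after `test_date_split` go to the test set, and that everything in between goes to the train set. The exact boundary handling is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class UserInfoSketch:
    """Hypothetical stand-in for UserInfo; not the library's implementation."""

    train_date_split: int
    test_date_split: int
    # Each entry is a (click_time, nindex) pair.
    hist: List[Tuple[int, int]] = field(default_factory=list)
    train: List[Tuple[int, int]] = field(default_factory=list)
    test: List[Tuple[int, int]] = field(default_factory=list)

    def update(self, nindex: int, click_time: int, date: int) -> None:
        # Assumed boundaries: [.., train_split) -> history,
        # [train_split, test_split) -> train, [test_split, ..) -> test.
        if date < self.train_date_split:
            self.hist.append((click_time, nindex))
        elif date >= self.test_date_split:
            self.test.append((click_time, nindex))
        else:
            self.train.append((click_time, nindex))

    def sort_click(self) -> None:
        # Sort every bucket by click time in ascending order.
        for clicks in (self.hist, self.train, self.test):
            clicks.sort(key=lambda pair: pair[0])


user = UserInfoSketch(train_date_split=6, test_date_split=7)
user.update(nindex=12, click_time=300, date=5)
user.update(nindex=7, click_time=100, date=5)
user.update(nindex=3, click_time=400, date=6)
user.update(nindex=9, click_time=500, date=7)
user.sort_click()
```

After `sort_click()`, the two history clicks appear in click-time order even though they were added in reverse.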

newsreclib.data.components.batch module

class newsreclib.data.components.batch.NewsBatch(*args, **kwargs)[source]

Bases: dict

Batch used for reshaping the embedding space based on an aspect of the news.

Reference: Iana, Andreea, Goran Glavaš, and Heiko Paulheim. “Train Once, Use Flexibly: A Modular Framework for Multi-Aspect Neural News Recommendation.” arXiv preprint arXiv:2307.16089 (2023). https://arxiv.org/pdf/2307.16089.pdf

news

Dictionary mapping features of news to values.

Type:

Dict[str, Any]

labels

Labels of news based on the specified aspect.

Type:

torch.Tensor

labels: Tensor

news: Dict[str, Any]

class newsreclib.data.components.batch.RecommendationBatch(*args, **kwargs)[source]

Bases: dict

Batch used for recommendation.

batch_hist

Batch of histories of users.

Type:

torch.Tensor

batch_cand

Batch of candidates for each user.

Type:

torch.Tensor

x_hist

Dictionary of news from the users’ histories, mapping news features to values.

Type:

Dict[str, Any]

x_cand

Dictionary of news from the users’ candidates, mapping news features to values.

Type:

Dict[str, Any]

labels

Ground truth specifying whether the news is relevant to the user.

Type:

torch.Tensor

users

Users included in the batch.

Type:

torch.Tensor

batch_cand: Tensor
batch_hist: Tensor
labels: Tensor
users: Tensor
x_cand: Dict[str, Any]
x_hist: Dict[str, Any]
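Because both batch classes derive from `dict`, their fields can be read as dictionary keys. The sketch below shows one way such a container could also expose its keys as attributes. `BatchSketch` is hypothetical and is not the library's class. Plain lists stand in for `torch.Tensor` values, and the example assumes that `batch_hist` and `batch_cand` hold, for each history or candidate item, the position of its user in the batch, which the documentation above does not specify.

```python
from typing import Any


class BatchSketch(dict):
    """Hypothetical dict subclass whose keys are also readable as attributes."""

    def __getattr__(self, name: str) -> Any:
        # Called only when normal attribute lookup fails, so dict methods
        # such as .keys() keep working.
        try:
            return self[name]
        except KeyError as err:
            raise AttributeError(name) from err


# Placeholder values; lists stand in for tensors.
batch = BatchSketch(
    batch_hist=[0, 0, 1],  # assumed: user position of each history item
    batch_cand=[0, 0, 1, 1],  # assumed: user position of each candidate
    x_hist={"title": ["h1", "h2", "h3"]},
    x_cand={"title": ["c1", "c2", "c3", "c4"]},
    labels=[1, 0, 0, 1],
    users=[17, 42],
)

labels_by_attribute = batch.labels
labels_by_key = batch["labels"]
```

Both access styles return the same object, so code written against dictionary keys and code written against attributes can share one batch.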

newsreclib.data.components.data_utils module

newsreclib.data.components.download_utils module

newsreclib.data.components.file_utils module

newsreclib.data.components.file_utils.check_integrity(fpath: str) → bool[source]

Checks whether a file exists.

newsreclib.data.components.file_utils.load_idx_map_as_dict(fpath: str) → Dict[str, int][source]

Loads a table as a dictionary.
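As a rough illustration of these two helpers, the sketch below re-implements their described behavior with the standard library only. The function names ending in `_sketch` are hypothetical, and the file layout (tab-separated, one header row, then one key and one integer index per row) is an assumption, since the documentation does not describe the table format.

```python
import csv
import os
import tempfile
from typing import Dict


def check_integrity_sketch(fpath: str) -> bool:
    # Assumption: "exists" means the path points to a regular file.
    return os.path.isfile(fpath)


def load_idx_map_sketch(fpath: str) -> Dict[str, int]:
    # Assumption: tab-separated file, one header row, then "key<TAB>index" rows.
    with open(fpath, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t")
        next(reader)  # skip the header row
        return {key: int(idx) for key, idx in reader}


with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name and contents, used only for this example.
    path = os.path.join(tmp_dir, "categ2index.tsv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerows([["category", "index"], ["sports", 1], ["news", 2]])

    exists = check_integrity_sketch(path)
    missing = check_integrity_sketch(os.path.join(tmp_dir, "absent.tsv"))
    idx_map = load_idx_map_sketch(path)
```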

newsreclib.data.components.file_utils.to_tsv(df: DataFrame, fpath: str) → None[source]

Stores a dataframe in .tsv format.
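A minimal sketch of such a writer, assuming pandas is installed. `to_tsv_sketch` is a hypothetical name, and the choices to keep the header row and drop the row index are assumptions, not documented behavior of `to_tsv`.

```python
import os
import tempfile

import pandas as pd


def to_tsv_sketch(df: pd.DataFrame, fpath: str) -> None:
    # Assumption: tab separator, header row kept, row index dropped.
    df.to_csv(fpath, sep="\t", index=False)


# Hypothetical column names and values, used only for this example.
df = pd.DataFrame({"nid": ["N1", "N2"], "title": ["First title", "Second title"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "news.tsv")
    to_tsv_sketch(df, path)
    # Read the file back to confirm the round trip preserves the table.
    restored = pd.read_csv(path, sep="\t")
```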

newsreclib.data.components.mind_dataframe module

newsreclib.data.components.news_dataset module

newsreclib.data.components.rec_dataset module

newsreclib.data.components.sentiment_annotator module