newsreclib.models.components.encoders.news package

Submodules

newsreclib.models.components.encoders.news.aspect module

class newsreclib.models.components.encoders.news.aspect.SentimentEncoder(num_sent_classes: int, sent_embed_dim: int, sent_output_dim: int)

Bases: Module

Implements the sentiment encoder from SentiDebias.

Reference: Wu, Chuhan, Fangzhao Wu, Tao Qi, Wei-Qiang Zhang, Xing Xie, and Yongfeng Huang. “Removing AI’s sentiment manipulation of personalized news delivery.” Humanities and Social Sciences Communications 9, no. 1 (2022): 1-9.

For further details, please refer to the paper.

num_sent_classes

Number of sentiment classes.

sent_embed_dim

Number of features in the sentiment embedding.

sent_output_dim

Number of output features in the linear layer (equivalent to the final dimensionality of the sentiment vector).
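As a rough illustration of how the three arguments relate, the sketch below chains an embedding lookup and a linear layer in plain PyTorch. It is an assumption about the general shape of such an encoder, not the SentimentEncoder source; the class name ToySentimentEncoder and the concrete sizes are hypothetical.

```python
import torch
from torch import nn


class ToySentimentEncoder(nn.Module):
    """Hypothetical stand-in: embedding lookup followed by a linear layer."""

    def __init__(self, num_sent_classes: int, sent_embed_dim: int, sent_output_dim: int):
        super().__init__()
        self.embedding = nn.Embedding(num_sent_classes, sent_embed_dim)
        self.linear = nn.Linear(sent_embed_dim, sent_output_dim)

    def forward(self, sentiment: torch.Tensor) -> torch.Tensor:
        # sentiment holds integer class ids of shape (batch_size,)
        return self.linear(self.embedding(sentiment))


toy_encoder = ToySentimentEncoder(num_sent_classes=3, sent_embed_dim=16, sent_output_dim=8)
sentiment_ids = torch.tensor([0, 2, 1, 1])
sentiment_vectors = toy_encoder(sentiment_ids)
print(sentiment_vectors.shape)  # torch.Size([4, 8])
```

The last dimension of the result equals sent_output_dim, which matches the description of that argument above.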

forward(sentiment) → Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the forward pass is defined in this method, call the Module instance itself rather than forward() directly: calling the instance runs any registered hooks, whereas calling forward() directly silently skips them.

training: bool
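The difference between calling a module instance and calling forward() directly can be checked with a forward hook. The sketch below uses a plain nn.Linear as a stand-in for any encoder on this page; the variable names are illustrative only.

```python
import torch
from torch import nn

layer = nn.Linear(4, 2)
hook_calls = []

# A forward hook receives the module, its positional inputs, and its output.
handle = layer.register_forward_hook(
    lambda module, inputs, output: hook_calls.append(tuple(output.shape))
)

x = torch.randn(3, 4)
layer(x)          # goes through __call__, so the hook fires
layer.forward(x)  # bypasses __call__, so the hook is skipped
handle.remove()

print(hook_calls)  # [(3, 2)]
```

Only one entry is recorded even though the forward pass ran twice, which is why the encoders here should be invoked as encoder(inputs).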

newsreclib.models.components.encoders.news.category module

class newsreclib.models.components.encoders.news.category.LinearEncoder(pretrained_embeddings: Optional[Tensor], from_pretrained: bool, freeze_pretrained_emb: bool, num_categories: int, embed_dim: Optional[int], use_dropout: bool, dropout_probability: Optional[float], linear_transform: bool, output_dim: Optional[int])

Bases: Module

Implements a category encoder.

pretrained_embeddings

Matrix of pretrained embeddings.

from_pretrained

If True, the category embedding layer is initialized with the pretrained embeddings. If False, it is initialized with random weights.

freeze_pretrained_emb

If True, it freezes the pretrained embeddings during training. If False, it updates the pretrained embeddings during training.

num_categories

Number of categories.

embed_dim

Number of features in the category vector.

use_dropout

Whether to use dropout after the embedding layer.

dropout_probability

Dropout probability.

linear_transform

Whether to linearly transform the category vector.

output_dim

Number of output features in the category encoder (equivalent to the final dimensionality of the category vector).
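The options above map naturally onto standard PyTorch building blocks. The following is a minimal sketch under that assumption, not the LinearEncoder source: the sizes are made up, and applying the linear layer after dropout is a guess about the ordering.

```python
import torch
from torch import nn

num_categories, embed_dim, output_dim = 5, 4, 3
pretrained_embeddings = torch.randn(num_categories, embed_dim)

# from_pretrained=True with freeze_pretrained_emb=True: weights are copied and frozen.
frozen_embedding = nn.Embedding.from_pretrained(pretrained_embeddings, freeze=True)

# from_pretrained=False: randomly initialized, trainable embedding table.
random_embedding = nn.Embedding(num_categories, embed_dim)

# use_dropout=True and linear_transform=True: dropout, then a projection to output_dim
# (the ordering of the linear layer is an assumption made for this sketch).
dropout = nn.Dropout(p=0.2)
linear = nn.Linear(embed_dim, output_dim)

category_ids = torch.tensor([1, 4, 0])
category_vectors = linear(dropout(frozen_embedding(category_ids)))
print(category_vectors.shape)  # torch.Size([3, 3])
```

With freeze=True the embedding weights do not require gradients, so only the linear layer would be updated during training in this sketch.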

forward(category: Tensor) → Tensor

training: bool

newsreclib.models.components.encoders.news.news module

class newsreclib.models.components.encoders.news.news.KCNN(pretrained_text_embeddings: Tensor, pretrained_entity_embeddings: Tensor, pretrained_context_embeddings: Optional[Tensor], use_context: bool, text_embed_dim: int, entity_embed_dim: int, num_filters: int, window_sizes: List[int])

Bases: Module

Implements the knowledge-aware CNN from DKN.

Reference: Wang, Hongwei, Fuzheng Zhang, Xing Xie, and Minyi Guo. “DKN: Deep knowledge-aware network for news recommendation.” In Proceedings of the 2018 world wide web conference, pp. 1835-1844. 2018.

For further details, please refer to the paper.

pretrained_text_embeddings

Matrix of pretrained text embeddings.

pretrained_entity_embeddings

Matrix of pretrained entity embeddings.

pretrained_context_embeddings

Matrix of pretrained context embeddings.

use_context

Whether to use context embeddings.

text_embed_dim

The number of features in the text vector.

entity_embed_dim

The number of features in the entity vector.

num_filters

The number of filters in the CNN.

window_sizes

List of window sizes for the CNN.
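To make the roles of num_filters and window_sizes concrete, the sketch below runs a multi-channel convolution with several window sizes and max-over-time pooling, in the spirit of the DKN paper. It is an illustrative assumption rather than the KCNN source: the entity projection, the tanh activation, the two-channel layout, and all sizes are hypothetical, and context embeddings are left out.

```python
import torch
from torch import nn

batch_size, seq_len = 2, 10
text_embed_dim, entity_embed_dim = 8, 6
num_filters, window_sizes = 4, [2, 3]

text_vectors = torch.randn(batch_size, seq_len, text_embed_dim)
entity_vectors = torch.randn(batch_size, seq_len, entity_embed_dim)

# Map entity vectors into the text embedding space so both can be stacked as channels.
entity_projection = nn.Linear(entity_embed_dim, text_embed_dim)
channels = torch.stack(
    [text_vectors, torch.tanh(entity_projection(entity_vectors))], dim=1
)  # (batch_size, 2, seq_len, text_embed_dim)

# One convolution per window size; each kernel spans the full embedding width.
convolutions = nn.ModuleList(
    [
        nn.Conv2d(in_channels=2, out_channels=num_filters, kernel_size=(window, text_embed_dim))
        for window in window_sizes
    ]
)

pooled = []
for convolution in convolutions:
    # (batch_size, num_filters, seq_len - window + 1) after dropping the width axis
    feature_map = torch.relu(convolution(channels)).squeeze(3)
    pooled.append(feature_map.max(dim=2).values)  # max-over-time pooling

news_vectors = torch.cat(pooled, dim=1)
print(news_vectors.shape)  # torch.Size([2, 8])
```

In this sketch the output size is num_filters * len(window_sizes), since each window size contributes one pooled vector of num_filters values.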

forward(news: Dict[str, Tensor]) → Tensor

training: bool

class newsreclib.models.components.encoders.news.news.NewsEncoder(dataset_attributes: List[str], attributes2encode: List[str], concatenate_inputs: bool, text_encoder: Optional[Module], category_encoder: Optional[Module], entity_encoder: Optional[Module], combine_vectors: bool, combine_type: Optional[str], input_dim: Optional[int], query_dim: Optional[int], output_dim: Optional[int])

Bases: Module

Implements a news encoder.

dataset_attributes

List of news features available in the dataset being used.

attributes2encode

List of news features used as input to the news encoder.

concatenate_inputs

Whether the inputs (e.g., title and abstract) were concatenated into a single sequence.

text_encoder

The text encoder module.

category_encoder

The category encoder module.

entity_encoder

The entity encoder module.

combine_vectors

Whether to aggregate the representations of multiple news features.

combine_type

The type of aggregation used to combine the representations of multiple news features. Choose between add_att (additive attention), linear, and concat (concatenation).

input_dim

The number of input features in the aggregation layer.

query_dim

The number of features in the query vector.

output_dim

The number of features in the final news vector.
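The three combine_type options can be pictured as three ways of merging one vector per news feature. The sketch below is an assumption about what each option means in general terms, not the NewsEncoder source; in particular, treating linear as a projection of the concatenated vectors is a guess, and all sizes are hypothetical.

```python
import torch
from torch import nn

batch_size, num_features, input_dim, query_dim, output_dim = 2, 3, 8, 5, 8
# One vector per encoded news feature, e.g. title, abstract, category.
feature_vectors = torch.randn(batch_size, num_features, input_dim)

# concat: join the feature vectors end to end.
concatenated = feature_vectors.flatten(start_dim=1)  # (batch, num_features * input_dim)

# linear: project the concatenation to a fixed output size (assumed interpretation).
linear_combined = nn.Linear(num_features * input_dim, output_dim)(concatenated)

# add_att: score each feature vector against a learned query, then take the weighted sum.
projection = nn.Linear(input_dim, query_dim)
query = nn.Linear(query_dim, 1, bias=False)
scores = query(torch.tanh(projection(feature_vectors)))  # (batch, num_features, 1)
weights = torch.softmax(scores, dim=1)
attended = (weights * feature_vectors).sum(dim=1)         # (batch, input_dim)

print(concatenated.shape, linear_combined.shape, attended.shape)
```

Additive attention keeps the per-feature dimensionality, while concatenation grows with the number of features, which is one reason a projection to output_dim can be useful.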

forward(news: Dict[str, Tensor]) → Tensor

training: bool

newsreclib.models.components.encoders.news.text module

class newsreclib.models.components.encoders.news.text.CNNAddAtt(pretrained_embeddings: Tensor, embed_dim: int, num_filters: int, window_size: int, query_dim: int, dropout_probability: float)

Bases: Module

Implements a text encoder based on CNN and additive attention.

Reference: Wu, Chuhan, Fangzhao Wu, Mingxiao An, Jianqiang Huang, Yongfeng Huang, and Xing Xie. “Neural news recommendation with attentive multi-view learning.” arXiv preprint arXiv:1907.05576 (2019).

For further details, please refer to the paper.

pretrained_embeddings

Matrix of pretrained embeddings.

embed_dim

The number of features in the text vector.

num_filters

The number of filters in the CNN.

window_size

The window size in the CNN.

query_dim

The number of features in the query vector.

dropout_probability

Dropout probability.
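A CNN followed by additive attention can be sketched in a few lines of plain PyTorch. The code below is a simplified illustration under assumed sizes, not the CNNAddAtt source: dropout is omitted, and the padding choice and variable names are hypothetical.

```python
import torch
from torch import nn

vocab_size, embed_dim, num_filters, window_size, query_dim = 50, 8, 6, 3, 4
pretrained_embeddings = torch.randn(vocab_size, embed_dim)

embedding = nn.Embedding.from_pretrained(pretrained_embeddings, freeze=False, padding_idx=0)
# padding = window_size // 2 keeps the sequence length unchanged for odd window sizes.
cnn = nn.Conv1d(embed_dim, num_filters, kernel_size=window_size, padding=window_size // 2)
projection = nn.Linear(num_filters, query_dim)
query = nn.Linear(query_dim, 1, bias=False)

token_ids = torch.randint(1, vocab_size, (2, 12))  # (batch, seq_len)
embedded = embedding(token_ids)                     # (batch, seq_len, embed_dim)
# Conv1d expects (batch, channels, seq_len), hence the transposes.
contextual = torch.relu(cnn(embedded.transpose(1, 2))).transpose(1, 2)  # (batch, seq_len, num_filters)

# Additive attention: one scalar score per token, normalized over the sequence.
weights = torch.softmax(query(torch.tanh(projection(contextual))), dim=1)
text_vectors = (weights * contextual).sum(dim=1)    # (batch, num_filters)
print(text_vectors.shape)  # torch.Size([2, 6])
```

Here query_dim only sets the size of the hidden attention space; the text vector itself has num_filters features.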

forward(text: Tensor) → Tensor

training: bool

class newsreclib.models.components.encoders.news.text.CNNMHSAAddAtt(pretrained_embeddings: Tensor, embed_dim: int, num_filters: int, window_size: int, num_heads: int, query_dim: int, dropout_probability: float)

Bases: Module

Implements a text encoder based on CNN, multi-head self-attention, and additive attention.

Reference: Qi, Tao, Fangzhao Wu, Chuhan Wu, Yongfeng Huang, and Xing Xie. “Privacy-Preserving News Recommendation Model Learning.” In Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 1423-1432. 2020.

For further details, please refer to the paper.

pretrained_embeddings

Matrix of pretrained embeddings.

embed_dim

The number of features in the text vector.

num_filters

The number of filters in the CNN.

window_size

The window size in the CNN.

num_heads

The number of heads in the MultiheadAttention.

query_dim

The number of features in the query vector.

dropout_probability

Dropout probability.

forward(text: Tensor) → Tensor

training: bool

class newsreclib.models.components.encoders.news.text.CNNPersAtt(pretrained_embeddings: Tensor, text_embed_dim: int, user_embed_dim: int, num_filters: int, window_size: int, query_dim: int, dropout_probability: float)

Bases: Module

Implements a text encoder based on CNN and Personalized Attention.

Reference: Wu, Chuhan, Fangzhao Wu, Mingxiao An, Jianqiang Huang, Yongfeng Huang, and Xing Xie. “NPA: neural news recommendation with personalized attention.” In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp. 2576-2584. 2019.

For further details, please refer to the paper.

pretrained_embeddings

Matrix of pretrained embeddings.

text_embed_dim

The number of features in the text vector.

user_embed_dim

The number of features in the user vector.

num_filters

The number of filters in the CNN.

window_size

The window size in the CNN.

query_dim

The number of features in the query vector.

dropout_probability

Dropout probability.
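Personalized attention replaces the shared attention query with one derived from the user. The sketch below follows the general idea from the NPA paper under stated assumptions and is not the CNNPersAtt source: the CNN output is replaced by random tensors, the lengths-based masking is omitted, and all names and sizes are hypothetical.

```python
import torch
from torch import nn

batch_size, seq_len, num_filters, query_dim = 2, 7, 6, 4
contextual = torch.randn(batch_size, seq_len, num_filters)  # stand-in for CNN output per token
projected_users = torch.randn(batch_size, query_dim)         # one preference query per user

# Map each user's query into the token representation space.
query_to_token_space = nn.Linear(query_dim, num_filters)
user_keys = torch.tanh(query_to_token_space(projected_users))  # (batch, num_filters)

# Dot product of every token with its user's key gives user-specific scores.
scores = torch.bmm(contextual, user_keys.unsqueeze(2)).squeeze(2)  # (batch, seq_len)
weights = torch.softmax(scores, dim=1)

# Weighted sum of token representations, one text vector per user-news pair.
text_vectors = torch.bmm(weights.unsqueeze(1), contextual).squeeze(1)  # (batch, num_filters)
print(text_vectors.shape)  # torch.Size([2, 6])
```

Because the scores depend on projected_users, the same news text can yield different vectors for different users.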

forward(text: Tensor, lengths: Tensor, projected_users: Tensor) → Tensor

training: bool

class newsreclib.models.components.encoders.news.text.MHSAAddAtt(pretrained_embeddings: Tensor, embed_dim: int, num_heads: int, query_dim: int, dropout_probability: float)

Bases: Module

Implements a text encoder based on multi-head self-attention and additive attention.

Reference: Wu, Chuhan, Fangzhao Wu, Suyu Ge, Tao Qi, Yongfeng Huang, and Xing Xie. “Neural news recommendation with multi-head self-attention.” In Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), pp. 6389-6394. 2019.

For further details, please refer to the paper.

pretrained_embeddings

Matrix of pretrained embeddings.

embed_dim

The number of features in the text vector.

num_heads

The number of heads in the MultiheadAttention.

query_dim

The number of features in the query vector.

dropout_probability

Dropout probability.
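Multi-head self-attention followed by additive attention can be sketched with torch.nn.MultiheadAttention. This is an illustration under assumed sizes, not the MHSAAddAtt source: the embedding lookup and dropout are omitted, and embed_dim is chosen to be divisible by num_heads, as MultiheadAttention requires.

```python
import torch
from torch import nn

embed_dim, num_heads, query_dim = 8, 2, 4
mhsa = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
projection = nn.Linear(embed_dim, query_dim)
query = nn.Linear(query_dim, 1, bias=False)

embedded = torch.randn(2, 12, embed_dim)  # (batch, seq_len, embed_dim)
# Self-attention: the sequence serves as query, key, and value.
contextual, _ = mhsa(embedded, embedded, embedded)  # (batch, seq_len, embed_dim)

# Additive attention pools the contextualized tokens into one vector per text.
weights = torch.softmax(query(torch.tanh(projection(contextual))), dim=1)
text_vectors = (weights * contextual).sum(dim=1)    # (batch, embed_dim)
print(text_vectors.shape)  # torch.Size([2, 8])
```

The self-attention step lets every token attend to every other token before the additive attention decides how much each one contributes to the final text vector.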

forward(text: Tensor) → Tensor

training: bool

class newsreclib.models.components.encoders.news.text.PLM(plm_model, frozen_layers: Optional[List[int]], embed_dim: int, use_mhsa: bool, apply_reduce_dim: bool, reduced_embed_dim: Optional[int], num_heads: Optional[int], query_dim: Optional[int], dropout_probability: float)

Bases: Module

Implements a text encoder based on a pretrained language model.

plm_model

Name of the pretrained language model.

frozen_layers

List of layers to freeze during training.

embed_dim

Number of features in the text vector.

use_mhsa

If True, it aggregates the token embeddings with a multi-head self-attention network into a final text representation. If False, it uses the CLS embedding as the final text representation.

apply_reduce_dim

Whether to linearly reduce the dimensionality of the news vector.

reduced_embed_dim

The number of features in the reduced news vector.

num_heads

The number of heads in the MultiheadAttention.

query_dim

The number of features in the query vector.

dropout_probability

Dropout probability.
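Freezing layers and pooling with the CLS token can be illustrated without downloading a real language model. The sketch below uses a small stack of torch.nn.TransformerEncoderLayer modules as a hypothetical stand-in for the pretrained model; it is not the PLM source, and the layer count, sizes, and variable names are assumptions.

```python
import torch
from torch import nn

embed_dim, reduced_embed_dim = 16, 8
# Hypothetical stand-in for the transformer layers of a pretrained language model.
layers = nn.ModuleList(
    [
        nn.TransformerEncoderLayer(d_model=embed_dim, nhead=2, dim_feedforward=32, batch_first=True)
        for _ in range(4)
    ]
)

frozen_layers = [0, 1]
for index in frozen_layers:
    # Parameters with requires_grad=False receive no gradient updates.
    layers[index].requires_grad_(False)

token_embeddings = torch.randn(2, 10, embed_dim)  # (batch, seq_len, embed_dim)
hidden = token_embeddings
for layer in layers:
    hidden = layer(hidden)

cls_vectors = hidden[:, 0]                            # use_mhsa=False: keep the first (CLS) token
reduce_dim = nn.Linear(embed_dim, reduced_embed_dim)  # apply_reduce_dim=True
text_vectors = reduce_dim(cls_vectors)
print(text_vectors.shape)  # torch.Size([2, 8])
```

Freezing the lower layers reduces the number of trainable parameters, while the upper layers and the reduction layer remain free to adapt.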

forward(text: Tensor) → Tensor

training: bool

Module contents