newsreclib.models.components.encoders.news package
Submodules
newsreclib.models.components.encoders.news.aspect module
- class newsreclib.models.components.encoders.news.aspect.SentimentEncoder(num_sent_classes: int, sent_embed_dim: int, sent_output_dim: int)
Bases: Module
Implements the sentiment encoder from SentiDebias.
Reference: Wu, Chuhan, Fangzhao Wu, Tao Qi, Wei-Qiang Zhang, Xing Xie, and Yongfeng Huang. “Removing AI’s sentiment manipulation of personalized news delivery.” Humanities and Social Sciences Communications 9, no. 1 (2022): 1-9.
For further details, please refer to the paper.
- num_sent_classes
Number of sentiment classes.
- sent_embed_dim
Number of features in the sentiment embedding.
- sent_output_dim
Number of output features in the linear layer (equivalent to the final dimensionality of the sentiment vector).
- forward(sentiment) -> Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
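The parameter descriptions above suggest an embedding lookup followed by a linear layer. The following is a minimal sketch under that assumption, not the NewsRecLib implementation: the class name `SentimentEncoderSketch` and its attribute names are hypothetical, and the real encoder may add activations or other steps. The second half illustrates the note on `forward`: a registered hook fires when the module instance is called, but not when `forward` is called directly.

```python
import torch
from torch import nn


class SentimentEncoderSketch(nn.Module):
    """Hypothetical stand-in: embedding lookup followed by a linear layer."""

    def __init__(self, num_sent_classes: int, sent_embed_dim: int, sent_output_dim: int):
        super().__init__()
        self.embedding = nn.Embedding(num_sent_classes, sent_embed_dim)
        self.linear = nn.Linear(sent_embed_dim, sent_output_dim)

    def forward(self, sentiment: torch.Tensor) -> torch.Tensor:
        # sentiment: integer class ids of shape (batch,)
        return self.linear(self.embedding(sentiment))


encoder = SentimentEncoderSketch(num_sent_classes=3, sent_embed_dim=8, sent_output_dim=16)
sentiment = torch.tensor([0, 2, 1, 1])
sentiment_vector = encoder(sentiment)  # shape (4, 16)

# Hooks run only when the module instance is called.
calls = []
handle = encoder.register_forward_hook(
    lambda module, inputs, output: calls.append(tuple(output.shape))
)
encoder(sentiment)          # hook runs
encoder.forward(sentiment)  # hook is silently skipped
handle.remove()
```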
newsreclib.models.components.encoders.news.category module
- class newsreclib.models.components.encoders.news.category.LinearEncoder(pretrained_embeddings: Optional[Tensor], from_pretrained: bool, freeze_pretrained_emb: bool, num_categories: int, embed_dim: Optional[int], use_dropout: bool, dropout_probability: Optional[float], linear_transform: bool, output_dim: Optional[int])
Bases: Module
Implements a category encoder.
- pretrained_embeddings
Matrix of pretrained embeddings.
- from_pretrained
If True, it initializes the category embedding layer with pretrained embeddings. If False, it initializes the category embedding layer with random weights.
- freeze_pretrained_emb
If True, it freezes the pretrained embeddings during training. If False, it updates the pretrained embeddings during training.
- num_categories
Number of categories.
- embed_dim
Number of features in the category vector.
- use_dropout
Whether to use dropout after the embedding layer.
- dropout_probability
Dropout probability.
- linear_transform
Whether to linearly transform the category vector.
- output_dim
Number of output features in the category encoder (equivalent to the final dimensionality of the category vector).
- forward(category: Tensor) -> Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
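How the flags above could interact is easiest to see in code. This is a minimal sketch, assuming the encoder is an embedding layer with optional dropout and an optional linear projection; `CategoryEncoderSketch` is a hypothetical name and the actual NewsRecLib class may differ in detail. Only standard PyTorch calls (`nn.Embedding.from_pretrained`, `nn.Dropout`, `nn.Linear`) are used.

```python
from typing import Optional

import torch
from torch import nn


class CategoryEncoderSketch(nn.Module):
    """Hypothetical stand-in for a category encoder."""

    def __init__(
        self,
        pretrained_embeddings: Optional[torch.Tensor],
        from_pretrained: bool,
        freeze_pretrained_emb: bool,
        num_categories: int,
        embed_dim: Optional[int],
        use_dropout: bool,
        dropout_probability: Optional[float],
        linear_transform: bool,
        output_dim: Optional[int],
    ):
        super().__init__()
        if from_pretrained:
            # Initialize from the given matrix; freeze=True disables gradient updates.
            self.embedding = nn.Embedding.from_pretrained(
                pretrained_embeddings, freeze=freeze_pretrained_emb
            )
            embed_dim = pretrained_embeddings.shape[1]
        else:
            # Randomly initialized embeddings.
            self.embedding = nn.Embedding(num_categories, embed_dim)
        self.dropout = nn.Dropout(dropout_probability) if use_dropout else nn.Identity()
        self.linear = nn.Linear(embed_dim, output_dim) if linear_transform else nn.Identity()

    def forward(self, category: torch.Tensor) -> torch.Tensor:
        return self.linear(self.dropout(self.embedding(category)))


pretrained = torch.randn(5, 8)  # 5 categories, 8 features each (illustrative values)
encoder = CategoryEncoderSketch(
    pretrained_embeddings=pretrained,
    from_pretrained=True,
    freeze_pretrained_emb=True,
    num_categories=5,
    embed_dim=None,
    use_dropout=True,
    dropout_probability=0.2,
    linear_transform=True,
    output_dim=32,
)
category_vector = encoder(torch.tensor([0, 3, 4]))  # shape (3, 32)
```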
newsreclib.models.components.encoders.news.news module
- class newsreclib.models.components.encoders.news.news.KCNN(pretrained_text_embeddings: Tensor, pretrained_entity_embeddings: Tensor, pretrained_context_embeddings: Optional[Tensor], use_context: bool, text_embed_dim: int, entity_embed_dim: int, num_filters: int, window_sizes: List[int])
Bases: Module
Implements the knowledge-aware CNN from DKN.
Reference: Wang, Hongwei, Fuzheng Zhang, Xing Xie, and Minyi Guo. “DKN: Deep knowledge-aware network for news recommendation.” In Proceedings of the 2018 world wide web conference, pp. 1835-1844. 2018.
For further details, please refer to the paper.
- pretrained_text_embeddings
Matrix of pretrained text embeddings.
- pretrained_entity_embeddings
Matrix of pretrained entity embeddings.
- pretrained_context_embeddings
Matrix of pretrained context embeddings.
- use_context
Whether to use context embeddings.
- text_embed_dim
The number of features in the text vector.
- entity_embed_dim
The number of features in the entity vector.
- num_filters
The number of filters in the CNN.
- window_sizes
List of window sizes for the CNN.
- forward(news: Dict[str, Tensor]) -> Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
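The roles of `num_filters` and `window_sizes` can be illustrated with a multi-window CNN over stacked embedding channels, followed by max-over-time pooling. This is a rough sketch of that general technique, not the KCNN code itself: `MultiWindowCNNSketch` is a hypothetical name, and it assumes the word, entity, and (optionally) context embeddings have already been looked up and mapped to a common dimensionality so they can be stacked as channels.

```python
from typing import List

import torch
from torch import nn


class MultiWindowCNNSketch(nn.Module):
    """Hypothetical multi-channel CNN with one convolution per window size."""

    def __init__(self, embed_dim: int, num_channels: int, num_filters: int, window_sizes: List[int]):
        super().__init__()
        self.convs = nn.ModuleList(
            [
                nn.Conv2d(num_channels, num_filters, kernel_size=(window, embed_dim))
                for window in window_sizes
            ]
        )

    def forward(self, stacked: torch.Tensor) -> torch.Tensor:
        # stacked: (batch, num_channels, seq_len, embed_dim)
        pooled = []
        for conv in self.convs:
            features = torch.relu(conv(stacked)).squeeze(-1)  # (batch, num_filters, seq_len - window + 1)
            pooled.append(features.max(dim=-1).values)        # max-over-time: (batch, num_filters)
        return torch.cat(pooled, dim=-1)                      # (batch, num_filters * len(window_sizes))


# Two channels (e.g., word and entity embeddings), illustrative sizes.
cnn = MultiWindowCNNSketch(embed_dim=50, num_channels=2, num_filters=4, window_sizes=[1, 2, 3])
news_vector = cnn(torch.randn(2, 2, 10, 50))  # shape (2, 12)
```

Under these assumptions the output dimensionality is `num_filters * len(window_sizes)`, since each window size contributes one pooled feature vector.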
- class newsreclib.models.components.encoders.news.news.NewsEncoder(dataset_attributes: List[str], attributes2encode: List[str], concatenate_inputs: bool, text_encoder: Optional[Module], category_encoder: Optional[Module], entity_encoder: Optional[Module], combine_vectors: bool, combine_type: Optional[str], input_dim: Optional[int], query_dim: Optional[int], output_dim: Optional[int])
Bases: Module
Implements a news encoder.
- dataset_attributes
List of news features available in the used dataset.
- attributes2encode
List of news features used as input to the news encoder.
- concatenate_inputs
Whether the inputs (e.g., title and abstract) were concatenated into a single sequence.
- text_encoder
The text encoder module.
- category_encoder
The category encoder module.
- entity_encoder
The entity encoder module.
- combine_vectors
Whether to aggregate the representations of multiple news features.
- combine_type
The type of aggregation to use for combining multiple news features representations. Choose between add_att (additive attention), linear, and concat (concatenate).
- input_dim
The number of input features in the aggregation layer.
- query_dim
The number of features in the query vector.
- output_dim
The number of features in the final news vector.
- forward(news: Dict[str, Tensor]) -> Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
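The three `combine_type` options can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: `CombineSketch` is a hypothetical class, `linear` is assumed to mean concatenation followed by a linear layer, and `add_att` is assumed to be additive attention over the stacked feature vectors. Note that under these assumptions `input_dim` means the per-vector size for `add_att` but the concatenated size for `linear`.

```python
from typing import List, Optional

import torch
from torch import nn


class CombineSketch(nn.Module):
    """Hypothetical aggregation of several per-feature news vectors."""

    def __init__(
        self,
        combine_type: str,
        input_dim: Optional[int] = None,
        query_dim: Optional[int] = None,
        output_dim: Optional[int] = None,
    ):
        super().__init__()
        self.combine_type = combine_type
        if combine_type == "add_att":
            # Additive attention scorer: one scalar score per feature vector.
            self.scorer = nn.Sequential(
                nn.Linear(input_dim, query_dim), nn.Tanh(), nn.Linear(query_dim, 1, bias=False)
            )
        elif combine_type == "linear":
            self.linear = nn.Linear(input_dim, output_dim)
        elif combine_type != "concat":
            raise ValueError(f"Unknown combine_type: {combine_type}")

    def forward(self, vectors: List[torch.Tensor]) -> torch.Tensor:
        if self.combine_type == "add_att":
            stacked = torch.stack(vectors, dim=1)                 # (batch, n_features, dim)
            weights = torch.softmax(self.scorer(stacked), dim=1)  # (batch, n_features, 1)
            return (weights * stacked).sum(dim=1)                 # (batch, dim)
        concatenated = torch.cat(vectors, dim=-1)                 # (batch, n_features * dim)
        if self.combine_type == "linear":
            return self.linear(concatenated)
        return concatenated


text_vector = torch.randn(4, 16)
category_vector = torch.randn(4, 16)
vectors = [text_vector, category_vector]

attended = CombineSketch("add_att", input_dim=16, query_dim=8)(vectors)     # (4, 16)
projected = CombineSketch("linear", input_dim=32, output_dim=20)(vectors)   # (4, 20)
concatenated = CombineSketch("concat")(vectors)                             # (4, 32)
```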
newsreclib.models.components.encoders.news.text module
- class newsreclib.models.components.encoders.news.text.CNNAddAtt(pretrained_embeddings: Tensor, embed_dim: int, num_filters: int, window_size: int, query_dim: int, dropout_probability: float)
Bases: Module
Implements a text encoder based on CNN and additive attention.
Reference: Wu, Chuhan, Fangzhao Wu, Mingxiao An, Jianqiang Huang, Yongfeng Huang, and Xing Xie. “Neural news recommendation with attentive multi-view learning.” arXiv preprint arXiv:1907.05576 (2019).
For further details, please refer to the paper.
- pretrained_embeddings
Matrix of pretrained embeddings.
- embed_dim
The number of features in the text vector.
- num_filters
The number of filters in the CNN.
- window_size
The window size in the CNN.
- query_dim
The number of features in the query vector.
- dropout_probability
Dropout probability.
- forward(text: Tensor) -> Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
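A CNN followed by additive attention can be sketched in a few lines. The classes below (`AdditiveAttentionSketch`, `CNNAddAttSketch`) are hypothetical illustrations of the general technique, assuming token ids as input and an odd `window_size` so that padding preserves the sequence length; where dropout is applied is also an assumption.

```python
import torch
from torch import nn


class AdditiveAttentionSketch(nn.Module):
    """Hypothetical additive attention: score_i = q^T tanh(W x_i + b)."""

    def __init__(self, input_dim: int, query_dim: int):
        super().__init__()
        self.projection = nn.Linear(input_dim, query_dim)
        self.query = nn.Parameter(torch.empty(query_dim).uniform_(-0.1, 0.1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = torch.tanh(self.projection(x)) @ self.query  # (batch, seq_len)
        weights = torch.softmax(scores, dim=1)
        return (weights.unsqueeze(-1) * x).sum(dim=1)         # (batch, input_dim)


class CNNAddAttSketch(nn.Module):
    """Hypothetical text encoder: embedding -> 1D CNN -> additive attention."""

    def __init__(self, pretrained_embeddings, embed_dim, num_filters, window_size, query_dim, dropout_probability):
        super().__init__()
        self.embedding = nn.Embedding.from_pretrained(pretrained_embeddings, freeze=False)
        # For an odd window size, this padding keeps the sequence length unchanged.
        self.cnn = nn.Conv1d(embed_dim, num_filters, kernel_size=window_size, padding=window_size // 2)
        self.attention = AdditiveAttentionSketch(num_filters, query_dim)
        self.dropout = nn.Dropout(dropout_probability)

    def forward(self, text: torch.Tensor) -> torch.Tensor:
        x = self.dropout(self.embedding(text))       # (batch, seq_len, embed_dim)
        x = torch.relu(self.cnn(x.transpose(1, 2)))  # (batch, num_filters, seq_len)
        x = self.dropout(x.transpose(1, 2))          # (batch, seq_len, num_filters)
        return self.attention(x)                     # (batch, num_filters)


encoder = CNNAddAttSketch(
    pretrained_embeddings=torch.randn(100, 32),  # illustrative vocabulary of 100 tokens
    embed_dim=32,
    num_filters=16,
    window_size=3,
    query_dim=8,
    dropout_probability=0.2,
)
text_vector = encoder(torch.randint(0, 100, (2, 12)))  # shape (2, 16)
```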
- class newsreclib.models.components.encoders.news.text.CNNMHSAAddAtt(pretrained_embeddings: Tensor, embed_dim: int, num_filters: int, window_size: int, num_heads: int, query_dim: int, dropout_probability: float)
Bases: Module
Implements a text encoder based on CNN, multi-head self-attention, and additive attention.
Reference: Qi, Tao, Fangzhao Wu, Chuhan Wu, Yongfeng Huang, and Xing Xie. “Privacy-Preserving News Recommendation Model Learning.” In Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 1423-1432. 2020.
For further details, please refer to the paper.
- pretrained_embeddings
Matrix of pretrained embeddings.
- num_filters
The number of filters in the CNN.
- window_size
The window size in the CNN.
- num_heads
The number of heads in the MultiheadAttention.
- query_dim
The number of features in the query vector.
- dropout_probability
Dropout probability.
- forward(text: Tensor) -> Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class newsreclib.models.components.encoders.news.text.CNNPersAtt(pretrained_embeddings: Tensor, text_embed_dim: int, user_embed_dim: int, num_filters: int, window_size: int, query_dim: int, dropout_probability: float)
Bases: Module
Implements a text encoder based on CNN and Personalized Attention.
Reference: Wu, Chuhan, Fangzhao Wu, Mingxiao An, Jianqiang Huang, Yongfeng Huang, and Xing Xie. “NPA: neural news recommendation with personalized attention.” In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp. 2576-2584. 2019.
For further details, please refer to the paper.
- pretrained_embeddings
Matrix of pretrained embeddings.
- text_embed_dim
The number of features in the text vector.
- user_embed_dim
The number of features in the user vector.
- num_filters
The number of filters in the CNN.
- window_size
The window size in the CNN.
- query_dim
The number of features in the query vector.
- dropout_probability
Dropout probability.
- forward(text: Tensor, lengths: Tensor, projected_users: Tensor) -> Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
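The `forward` signature takes `lengths` and `projected_users` in addition to the text. One plausible reading, sketched below, is that the projected user vector acts as the attention query over the CNN features and that `lengths` masks padded positions. Both points are assumptions, and `personalized_attention` is a hypothetical helper, not a NewsRecLib function.

```python
import torch


def personalized_attention(features, lengths, projected_users):
    """Hypothetical personalized attention pooling.

    features:        (batch, seq_len, dim) token features, e.g. CNN outputs
    lengths:         (batch,) number of real (non-padded) tokens, each >= 1
    projected_users: (batch, dim) user-specific query vectors
    """
    # Dot product between every token feature and the user's query vector.
    scores = torch.bmm(features, projected_users.unsqueeze(-1)).squeeze(-1)  # (batch, seq_len)
    # Mask positions beyond each sequence's length so they get zero weight.
    positions = torch.arange(features.size(1)).unsqueeze(0)                  # (1, seq_len)
    mask = positions < lengths.unsqueeze(1)                                  # (batch, seq_len)
    scores = scores.masked_fill(~mask, float("-inf"))
    weights = torch.softmax(scores, dim=1)
    pooled = torch.bmm(weights.unsqueeze(1), features).squeeze(1)            # (batch, dim)
    return pooled, weights


features = torch.randn(2, 5, 6)
lengths = torch.tensor([3, 5])
projected_users = torch.randn(2, 6)
text_vector, attention_weights = personalized_attention(features, lengths, projected_users)
```

Because the query depends on the user, the same text can be pooled into different vectors for different users.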
- class newsreclib.models.components.encoders.news.text.MHSAAddAtt(pretrained_embeddings: Tensor, embed_dim: int, num_heads: int, query_dim: int, dropout_probability: float)
Bases: Module
Implements a text encoder based on multi-head self-attention and additive attention.
Reference: Wu, Chuhan, Fangzhao Wu, Suyu Ge, Tao Qi, Yongfeng Huang, and Xing Xie. “Neural news recommendation with multi-head self-attention.” In Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), pp. 6389-6394. 2019.
For further details, please refer to the paper.
- pretrained_embeddings
Matrix of pretrained embeddings.
- embed_dim
The number of features in the text vector.
- num_heads
The number of heads in the MultiheadAttention.
- query_dim
The number of features in the query vector.
- dropout_probability
Dropout probability.
- forward(text: Tensor) -> Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
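Multi-head self-attention followed by additive attention pooling might look like the sketch below. It is an illustration built on `torch.nn.MultiheadAttention`, not the library's class: `MHSAAddAttSketch` is a hypothetical name, it takes already-embedded text to stay short, and the placement of dropout is an assumption. `embed_dim` must be divisible by `num_heads`.

```python
import torch
from torch import nn


class MHSAAddAttSketch(nn.Module):
    """Hypothetical encoder: multi-head self-attention -> additive attention."""

    def __init__(self, embed_dim: int, num_heads: int, query_dim: int, dropout_probability: float):
        super().__init__()
        self.self_attention = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Additive attention scorer: one scalar score per token.
        self.scorer = nn.Sequential(
            nn.Linear(embed_dim, query_dim), nn.Tanh(), nn.Linear(query_dim, 1, bias=False)
        )
        self.dropout = nn.Dropout(dropout_probability)

    def forward(self, embedded_text: torch.Tensor) -> torch.Tensor:
        # embedded_text: (batch, seq_len, embed_dim)
        x = self.dropout(embedded_text)
        contextual, _ = self.self_attention(x, x, x)             # queries, keys, values are the same
        contextual = self.dropout(contextual)
        weights = torch.softmax(self.scorer(contextual), dim=1)  # (batch, seq_len, 1)
        return (weights * contextual).sum(dim=1)                 # (batch, embed_dim)


encoder = MHSAAddAttSketch(embed_dim=32, num_heads=4, query_dim=8, dropout_probability=0.2)
text_vector = encoder(torch.randn(2, 7, 32))  # shape (2, 32)
```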
- class newsreclib.models.components.encoders.news.text.PLM(plm_model, frozen_layers: Optional[List[int]], embed_dim: int, use_mhsa: bool, apply_reduce_dim: bool, reduced_embed_dim: Optional[int], num_heads: Optional[int], query_dim: Optional[int], dropout_probability: float)
Bases: Module
Implements a text encoder based on a pretrained language model.
- plm_model
Name of the pretrained language model.
- frozen_layers
List of layers to freeze during training.
- embed_dim
Number of features in the text vector.
- use_mhsa
If True, it aggregates the token embeddings with a multi-head self-attention network into a final text representation. If False, it uses the CLS embedding as the final text representation.
- apply_reduce_dim
Whether to linearly reduce the dimensionality of the news vector.
- reduced_embed_dim
The number of features in the reduced news vector.
- num_heads
The number of heads in the MultiheadAttention.
- query_dim
The number of features in the query vector.
- dropout_probability
Dropout probability.
- forward(text: Tensor) -> Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
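Two of the options above, `frozen_layers` and the CLS fallback when `use_mhsa` is False, can be illustrated without loading a real language model. In this sketch the stack of `nn.Linear` layers is only a stand-in for the model's transformer layers, the hidden states are random, and treating position 0 as the CLS token is an assumption that holds for BERT-style models.

```python
import torch
from torch import nn

# Stand-in for the encoder layers of a pretrained language model (hypothetical).
layers = nn.ModuleList([nn.Linear(8, 8) for _ in range(4)])
frozen_layers = [0, 1]

# Freezing a layer means excluding its parameters from gradient updates.
for index, layer in enumerate(layers):
    if index in frozen_layers:
        for parameter in layer.parameters():
            parameter.requires_grad = False

trainable_layers = [
    index
    for index, layer in enumerate(layers)
    if all(parameter.requires_grad for parameter in layer.parameters())
]

# Without MHSA pooling, the embedding of the first (CLS) token represents the text.
token_embeddings = torch.randn(2, 5, 8)  # (batch, seq_len, embed_dim)
cls_embedding = token_embeddings[:, 0]   # (batch, embed_dim)
```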