Model Checkpointing

NewsRecLib integrates the checkpointing functionality supported by PyTorch. We adopt the same notation as PyTorch.

For more details, please refer to the corresponding PyTorch checkpointing page.

This is an example that shows all the options.

model_checkpoint:
  _target_: lightning.pytorch.callbacks.ModelCheckpoint
  dirpath: null # directory to save the model file
  filename: null # checkpoint filename
  monitor: null # name of the logged metric which determines when model is improving
  verbose: False # verbosity mode
  save_last: null # additionally always save an exact copy of the last checkpoint to a file last.ckpt
  save_top_k: 1 # save k best models (determined by above metric)
  mode: "min" # "max" means higher metric value is better, can be also "min"
  auto_insert_metric_name: True # when True, the checkpoints filenames will contain the metric name
  save_weights_only: False # if True, then only the model’s weights will be saved
  every_n_train_steps: null # number of training steps between checkpoints
  train_time_interval: null # checkpoints are monitored at the specified time interval
  every_n_epochs: null # number of epochs between checkpoints
  save_on_train_epoch_end: null # whether to run checkpointing at the end of the training epoch or the end of validation