datagnosis.plugins.generic.plugin_forgetting module

class ForgettingPlugin(model: torch.nn.modules.module.Module, criterion: torch.nn.modules.module.Module, optimizer: torch.optim.optimizer.Optimizer, lr: float, epochs: int, num_classes: int, total_samples: int, device: Optional[torch.device] = device(type='cpu'), logging_interval: int = 100)

Bases: datagnosis.plugins.core.plugin.Plugin

compute_scores(recompute: bool = False) → Union[Tuple[numpy.ndarray, numpy.ndarray], numpy.ndarray]

A method to compute the forgetting scores for the plugin.

Parameters

recompute (bool, optional) – A flag to indicate whether to recompute the scores. Defaults to False.

Raises

ValueError – If the plugin has not been fit yet.

Returns

The forgetting scores.

Return type

np.ndarray
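
The docstring does not say how a forgetting score is defined. As a rough illustration only, the sketch below assumes the common definition of a forgetting event: a sample that was classified correctly at one epoch and incorrectly at the next. The names `count_forgetting_events` and `history` are hypothetical, and this is not the plugin's implementation.

```python
from typing import List, Sequence


def count_forgetting_events(correct_by_epoch: Sequence[Sequence[bool]]) -> List[int]:
    """Count, per sample, the transitions from correct to incorrect.

    correct_by_epoch[e][i] is True when sample i was classified
    correctly at epoch e.
    """
    if not correct_by_epoch:
        return []
    num_samples = len(correct_by_epoch[0])
    events = [0] * num_samples
    # Compare each epoch with the one immediately before it.
    for previous, current in zip(correct_by_epoch, correct_by_epoch[1:]):
        for i in range(num_samples):
            if previous[i] and not current[i]:
                events[i] += 1
    return events


history = [
    [True, False, True],   # epoch 0
    [False, False, True],  # epoch 1: sample 0 is forgotten
    [True, True, True],    # epoch 2
    [False, True, True],   # epoch 3: sample 0 is forgotten again
]
print(count_forgetting_events(history))  # [2, 0, 0]
```

Under this assumed definition, a sample that is never learned (or never forgotten) scores 0, so a count alone cannot tell the two cases apart.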

extract_datapoints(method: Literal['threshold', 'top_n', 'index'] = 'threshold', hardness: Literal['hard', 'easy'] = 'hard', threshold: Optional[float] = None, threshold_range: Optional[Tuple[float, float]] = None, n: Optional[int] = None, indices: Optional[List[int]] = None, sort_by_index: bool = True) → Tuple[Tuple[torch.Tensor, torch.Tensor, List], numpy.ndarray]

Extracts datapoints from the plugin model by applying a threshold or range to the scores, selecting the top n scores, or selecting datapoints by index.

Parameters
  • method (Literal['threshold', 'top_n', 'index'], optional) – The method to use to extract datapoints. Defaults to “threshold”.

  • hardness (Literal['hard', 'easy'], optional) – Flag to indicate whether to extract hard or easy datapoints. Defaults to “hard”.

  • threshold (Optional[float], optional) – The threshold to apply to the scores. Must be provided if the given method is “threshold” and threshold_range is None. Defaults to None.

  • threshold_range (Optional[Tuple[float, float]], optional) – The range of thresholds to apply to the scores. Must be provided if the given method is “threshold” and the value passed to threshold is None. Defaults to None.

  • n (Optional[int], optional) – The number of datapoints to extract. Must be provided if the given method is “top_n”. Defaults to None.

  • indices (Optional[List[int]], optional) – The indices of the datapoints to extract. Must be provided if the given method is “index”. Defaults to None.

  • sort_by_index (bool, optional) – Flag to indicate whether to sort the extracted datapoints by index. Defaults to True.

Raises
  • ValueError – raised if the given method is not one of “threshold”, “top_n”, or “index”.

  • ValueError – raised if the given method is “threshold” but neither threshold nor threshold_range is provided.

  • ValueError – raised if the given method is “top_n” but n is not provided.

  • ValueError – raised if the given method is “index” but a list of indices is not provided.

Returns

The extracted datapoints and their scores, returned in the format ((features, labels, indices), scores).

Return type

Tuple[Tuple[torch.Tensor, torch.Tensor, List], np.ndarray]
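
The three extraction methods and their documented error cases can be sketched on a plain list of scores. This is a minimal illustration, not the library's code: `select_indices` and `high_is_hard` are hypothetical names, and the handling of `hardness`, the inclusive range, and the precedence of `threshold_range` over `threshold` are all assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def select_indices(
    scores: Sequence[float],
    method: str = "threshold",
    hardness: str = "hard",
    threshold: Optional[float] = None,
    threshold_range: Optional[Tuple[float, float]] = None,
    n: Optional[int] = None,
    indices: Optional[List[int]] = None,
    sort_by_index: bool = True,
    high_is_hard: bool = True,  # assumption: stands in for hard_direction()
) -> List[int]:
    """Return the positions of the selected scores (illustrative only)."""
    # Assumption: "hard" points sit at the high end of the scores when
    # high_is_hard is True, and "easy" points sit at the opposite end.
    want_high = high_is_hard == (hardness == "hard")

    if method == "threshold":
        if threshold_range is not None:
            # Assumption: a range, when given, takes precedence and is inclusive.
            low, high = threshold_range
            chosen = [i for i, s in enumerate(scores) if low <= s <= high]
        elif threshold is not None:
            if want_high:
                chosen = [i for i, s in enumerate(scores) if s >= threshold]
            else:
                chosen = [i for i, s in enumerate(scores) if s <= threshold]
        else:
            raise ValueError("method='threshold' needs threshold or threshold_range")
    elif method == "top_n":
        if n is None:
            raise ValueError("method='top_n' needs n")
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=want_high)
        chosen = ranked[:n]
    elif method == "index":
        if indices is None:
            raise ValueError("method='index' needs indices")
        chosen = list(indices)
    else:
        raise ValueError("method must be 'threshold', 'top_n', or 'index'")

    return sorted(chosen) if sort_by_index else chosen


example_scores = [0.0, 3.0, 1.0, 5.0]
print(select_indices(example_scores, method="threshold", threshold=2.0))  # [1, 3]
print(select_indices(example_scores, method="top_n", n=2, hardness="easy"))  # [0, 2]
print(select_indices(example_scores, method="index", indices=[2, 0]))  # [0, 2]
```

The real method also returns the features, labels, and scores for the selected datapoints; the sketch returns only the positions to keep the selection logic visible.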

fit(datahandler: datagnosis.plugins.core.datahandler.DataHandler, use_caches_if_exist: bool = True, workspace: Union[pathlib.Path, str] = PosixPath('workspace'), *args: Any, **kwargs: Any) → typing_extensions.Self

Fit the plugin model.

Parameters
  • datahandler (DataHandler) – The datagnosis.plugins.core.datahandler.DataHandler object that contains the data to be used for fitting.

  • use_caches_if_exist (bool, optional) – A flag to indicate whether or not to use cached data if it exists. Defaults to True.

  • workspace (Union[Path, str], optional) – A path to the workspace directory. Defaults to Path(“workspace/”).

Raises

RuntimeError – raised if the plugin’s fit method has already been called.

classmethod fqdn() → str

The fully qualified name of the plugin.

static hard_direction() → str
Returns

The direction of hardness for the plugin, i.e. whether high or low scores indicate hardness.

Return type

str

static long_name() → str
Returns

The long name of the plugin.

Return type

str

static name() → str
Returns

The name of the plugin.

Return type

str

plot_scores(*args: Any, axis: Optional[int] = None, show: bool = True, plot_type: Literal['scatter', 'dist'] = 'dist', **kwargs: Any) → None

Plot the scores computed by the plugin.

Parameters
  • axis (Optional[int], optional) – The axis to plot. If None, plot a higher-dimensional plot. Defaults to None.

  • show (bool, optional) – Flag to indicate whether to show the plot. Defaults to True.

  • plot_type (Literal['scatter', 'dist'], optional) – The type of plot to show. Can be either “scatter” or “dist”. Defaults to “dist”.

Raises
  • ValueError – raised if the scores have not been computed.

  • ValueError – raised if the scores have more than 2 dimensions. In that case, specify which axis to plot.

static score_description() → str
Returns

A description of the score.

Return type

str

property scores: Union[Tuple[numpy.ndarray, numpy.ndarray], numpy.ndarray]

The scores for the plugin model.

Raises
  • ValueError – raised if the plugin has not been fit.

  • ValueError – raised if the scores have not been computed.

Returns

The scores for the plugin model.

Return type

np.ndarray
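
The error conditions documented for fit, compute_scores, and scores can be summarized with a small stand-in class. `ScoreLifecycleSketch` is hypothetical and mimics only the documented exceptions. It does not train a model or use a cache, its scoring is a placeholder, and treating `recompute=False` as "reuse stored scores" is an assumption.

```python
from typing import List, Optional


class ScoreLifecycleSketch:
    """Hypothetical stand-in that mimics only the documented error contract."""

    def __init__(self) -> None:
        self._has_been_fit = False
        self._values: List[float] = []
        self._scores: Optional[List[float]] = None

    def fit(self, values: List[float]) -> "ScoreLifecycleSketch":
        if self._has_been_fit:
            raise RuntimeError("fit has already been called")
        self._values = list(values)
        self._has_been_fit = True
        return self  # returning self allows call chaining

    def compute_scores(self, recompute: bool = False) -> List[float]:
        if not self._has_been_fit:
            raise ValueError("the plugin has not been fit yet")
        # Assumption: stored scores are reused unless recompute is True.
        if self._scores is None or recompute:
            self._scores = [float(v) for v in self._values]  # placeholder scoring
        return self._scores

    @property
    def scores(self) -> List[float]:
        if not self._has_been_fit:
            raise ValueError("the plugin has not been fit")
        if self._scores is None:
            raise ValueError("the scores have not been computed")
        return self._scores


sketch = ScoreLifecycleSketch().fit([1, 2, 3])
sketch.compute_scores()
print(sketch.scores)  # [1.0, 2.0, 3.0]
```

Whether the real fit computes the scores itself is not stated in this reference, so the explicit compute_scores call in the sketch may not be needed with the actual plugin.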

static type() → str
Returns

The type of the plugin.

Return type

str

plugin

alias of datagnosis.plugins.generic.plugin_forgetting.ForgettingPlugin