datagnosis.plugins.generic.plugin_data_maps module

class DataMapsPlugin(model: torch.nn.modules.module.Module, criterion: torch.nn.modules.module.Module, optimizer: torch.optim.optimizer.Optimizer, lr: float, epochs: int, num_classes: int, device: Optional[torch.device] = device(type='cpu'), logging_interval: int = 100)

Bases: datagnosis.plugins.core.plugin.Plugin

compute_scores(recompute: bool = False) -> Union[Tuple[numpy.ndarray, numpy.ndarray], numpy.ndarray]

A method to compute scores for the plugin. This method is called during the score() method.

Parameters

recompute (bool, optional) – A flag to indicate whether or not to recompute scores from scratch. Defaults to False.

Raises

ValueError – raises a ValueError if the plugin has not been fit yet.

Returns

A tuple of two NumPy arrays: the first holds the confidence scores and the second holds the epistemic uncertainty, also known as variability.

Return type

Tuple[np.ndarray, np.ndarray]
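
As an illustration of what the two returned arrays represent, the sketch below assumes the usual Data Maps definitions: confidence is the mean, and variability the standard deviation, of the probability assigned to the true label across training epochs. The helper name `data_map_scores` and the input layout are hypothetical, and this page does not confirm that the plugin computes its scores exactly this way.

```python
import numpy as np


def data_map_scores(true_label_probs: np.ndarray):
    """Hypothetical helper, not part of datagnosis.

    true_label_probs has shape (epochs, n_samples): for each epoch, the
    probability the model assigned to each sample's correct label.
    """
    # Confidence: average true-label probability over epochs.
    confidence = true_label_probs.mean(axis=0)
    # Variability: spread of that probability over epochs.
    variability = true_label_probs.std(axis=0)
    return confidence, variability


# Three epochs, three samples (made-up numbers for illustration only).
probs = np.array(
    [
        [0.9, 0.2, 0.5],
        [0.9, 0.8, 0.5],
        [0.9, 0.2, 0.5],
    ]
)
confidence, variability = data_map_scores(probs)
```

In this toy input the first sample has high confidence and zero variability, while the second sample's probability fluctuates between epochs and therefore has the highest variability.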

extract_datapoints(method: Literal['threshold', 'top_n', 'index'] = 'threshold', hardness: Literal['hard', 'easy'] = 'hard', threshold: Optional[float] = None, threshold_range: Optional[Tuple[float, float]] = None, n: Optional[int] = None, indices: Optional[List[int]] = None, sort_by_index: bool = True) -> Tuple[Tuple[torch.Tensor, torch.Tensor, List], numpy.ndarray]

Extracts datapoints from the plugin model by applying a threshold or range to the scores, selecting the top n scores, or selecting datapoints by index.

Parameters
  • method (Literal[threshold, top_n, index], optional) – The method to use to extract datapoints. Defaults to “threshold”.

  • hardness (Literal[hard, easy], optional) – Flag to indicate whether to extract hard or easy data points. Defaults to “hard”.

  • threshold (Optional[float], optional) – The threshold to apply to the scores. Must be provided if the given method is “threshold” and threshold_range is None. Defaults to None.

  • threshold_range (Optional[Tuple[float, float]], optional) – The range of thresholds to apply to the scores. Must be provided if the given method is “threshold” and the value passed to threshold is None. Defaults to None.

  • n (Optional[int], optional) – The number of datapoints to extract. Must be provided if the given method is “top_n”. Defaults to None.

  • indices (Optional[List[int]], optional) – The indices of the datapoints to extract. Must be provided if the given method is “index”. Defaults to None.

  • sort_by_index (bool, optional) – Flag to indicate whether to sort the extracted datapoints by index. Defaults to True.

Raises
  • ValueError – raised if the given method is not one of “threshold”, “top_n”, or “index”.

  • ValueError – raised if the given method is “threshold” but neither threshold nor threshold_range is provided.

  • ValueError – raised if the given method is “top_n” but n is not provided.

  • ValueError – raised if the given method is “index” but a list of indices is not provided.

Returns

The extracted datapoints and the scores of the extracted datapoints. Datapoints are returned in the format ((features, labels, indices), scores).

Return type

Tuple[Tuple[torch.Tensor, torch.Tensor, List], np.ndarray]
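
The three extraction methods and their documented ValueError conditions can be sketched on a one-dimensional score array. Everything here is an assumption made for illustration: `select_indices` is a hypothetical helper, not the library's implementation, the `hard_direction` argument stands in for the value reported by hard_direction(), the comparison rules for "threshold" are guesses, and giving threshold_range precedence over threshold is an arbitrary choice.

```python
from typing import List, Optional, Tuple

import numpy as np


def select_indices(
    scores: np.ndarray,
    method: str = "threshold",
    hardness: str = "hard",
    hard_direction: str = "low",  # assumption: low scores mean "hard"
    threshold: Optional[float] = None,
    threshold_range: Optional[Tuple[float, float]] = None,
    n: Optional[int] = None,
    indices: Optional[List[int]] = None,
    sort_by_index: bool = True,
) -> np.ndarray:
    """Hypothetical sketch: return the indices of the selected datapoints."""
    if method not in ("threshold", "top_n", "index"):
        raise ValueError("method must be 'threshold', 'top_n', or 'index'")

    # True when the requested points sit at the low end of the score range.
    want_low = (hardness == "hard") == (hard_direction == "low")

    if method == "threshold":
        if threshold is None and threshold_range is None:
            raise ValueError("provide threshold or threshold_range")
        if threshold_range is not None:
            low, high = threshold_range
            mask = (scores >= low) & (scores <= high)
        elif want_low:
            mask = scores <= threshold
        else:
            mask = scores >= threshold
        selected = np.flatnonzero(mask)
    elif method == "top_n":
        if n is None:
            raise ValueError("provide n when method is 'top_n'")
        order = np.argsort(scores, kind="stable")  # ascending scores
        if not want_low:
            order = order[::-1]
        selected = order[:n]
    else:  # method == "index"
        if indices is None:
            raise ValueError("provide indices when method is 'index'")
        selected = np.asarray(indices, dtype=int)

    return np.sort(selected) if sort_by_index else selected


scores = np.array([0.9, 0.1, 0.5, 0.3])
hardest_two = select_indices(scores, method="top_n", n=2)
below_cutoff = select_indices(scores, method="threshold", threshold=0.3)
easiest_one = select_indices(scores, method="top_n", hardness="easy", n=1)
```

With the assumed "low means hard" direction, the two hardest points in this toy array are the ones with scores 0.1 and 0.3, and the easiest point is the one with score 0.9.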

fit(datahandler: datagnosis.plugins.core.datahandler.DataHandler, use_caches_if_exist: bool = True, workspace: Union[pathlib.Path, str] = PosixPath('workspace'), *args: Any, **kwargs: Any) -> typing_extensions.Self

Fit the plugin model.

Parameters
  • datahandler (DataHandler) – The datagnosis.plugins.core.datahandler.DataHandler object that contains the data to be used for fitting.

  • use_caches_if_exist (bool, optional) – A flag to indicate whether or not to use cached data if it exists. Defaults to True.

  • workspace (Union[Path, str], optional) – A path to the workspace directory. Defaults to Path(“workspace/”).

Raises

RuntimeError – Raises a RuntimeError if the plugin’s fit method has already been called.

classmethod fqdn() -> str

The Fully-Qualified name of the plugin.

static hard_direction() -> str
Returns

The direction of hardness for the plugin, i.e. whether high or low scores indicate hardness.

Return type

str

static long_name() -> str
Returns

The long name of the plugin.

Return type

str

static name() -> str
Returns

The name of the plugin.

Return type

str

plot_scores(*args: Any, axis: Optional[int] = None, show: bool = True, plot_type: Literal['scatter', 'dist'] = 'dist', **kwargs: Any) -> None

Plot the scores computed by the plugin.

Parameters
  • axis (Optional[int], optional) – The axis to plot. If None, a higher-dimensional plot is produced. Defaults to None.

  • show (bool, optional) – Flag to indicate whether to show the plot. Defaults to True.

  • plot_type (Literal[scatter, dist], optional) – The type of plot to show. Can be either “scatter” or “dist”. Defaults to “dist”.

Raises
  • ValueError – raised if the scores have not been computed.

  • ValueError – raised if the scores have more than 2 dimensions; in that case, specify which axis to plot.

static score_description() -> str
Returns

A description of the score.

Return type

str

property scores: Union[Tuple[numpy.ndarray, numpy.ndarray], numpy.ndarray]

The scores for the plugin model

Raises
  • ValueError – raised if the plugin has not been fit.

  • ValueError – raised if the scores have not been computed.

Returns

The scores for the plugin model

Return type

Union[Tuple[np.ndarray, np.ndarray], np.ndarray]
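
Because the annotated type is either a single array or a pair of arrays, calling code may want to normalise the value before using it. The following is a minimal sketch with a hypothetical helper, `split_scores`, which is not part of datagnosis; the example arrays are made up.

```python
from typing import Tuple, Union

import numpy as np

Scores = Union[Tuple[np.ndarray, np.ndarray], np.ndarray]


def split_scores(scores: Scores) -> Tuple[np.ndarray, ...]:
    """Hypothetical helper: always return a tuple of arrays, one per score axis."""
    if isinstance(scores, tuple):
        return tuple(np.asarray(s) for s in scores)
    return (np.asarray(scores),)


# A two-part score, in the (confidence, variability) order described for compute_scores.
example = (np.array([0.9, 0.4]), np.array([0.0, 0.28]))
confidence, variability = split_scores(example)

# A single-array score is wrapped in a one-element tuple.
(single,) = split_scores(np.array([0.1, 0.2]))
```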

static type() -> str
Returns

The type of the plugin.

Return type

str

plugin

alias of datagnosis.plugins.generic.plugin_data_maps.DataMapsPlugin