backwardcompatibilityml package

Submodules

backwardcompatibilityml.comparison_management module

class backwardcompatibilityml.comparison_management.ComparisonManager(dataset, get_instance_image_by_id=None)

Bases: object

The ComparisonManager class is used to handle REST requests made by the ModelComparison widget UI components from within the Jupyter notebook.

Parameters:
  • dataset – The list of dataset samples as (batch_ids, input, target).
  • get_instance_image_by_id
    A function that returns an image representation of the data corresponding to the instance id, in PNG format (see the sketch below). It should be a function of the form:
    get_instance_image_by_id(instance_id)
    instance_id: An integer instance id

    And should return a PNG image.
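
As a rough illustration, a get_instance_image_by_id callback might look like the following sketch. The instance_pixels array is a hypothetical in-memory store of grayscale instances and is not part of this package, and the assumption that the callback returns the raw PNG bytes should be checked against your own setup.

    import io
    import numpy as np
    from PIL import Image

    # Hypothetical store of raw instances, indexed by instance id, with values in [0, 1].
    instance_pixels = np.random.rand(100, 28, 28)

    def get_instance_image_by_id(instance_id):
        # Render the instance as an 8-bit grayscale image and return its PNG bytes.
        pixels = (instance_pixels[instance_id] * 255).astype(np.uint8)
        image = Image.fromarray(pixels, mode="L")
        buffer = io.BytesIO()
        image.save(buffer, format="PNG")
        return buffer.getvalue()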

get_instance_image(instance_id)

backwardcompatibilityml.metrics module

backwardcompatibilityml.metrics.model_accuracy(model, dataset, device='cpu')
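
A minimal usage sketch, assuming dataset follows the batched (batch_ids, input, target) convention used elsewhere in this package and that model is a trained PyTorch classifier; model and test_set below are placeholders:

    from backwardcompatibilityml.metrics import model_accuracy

    # `model` and `test_set` are placeholders for a trained PyTorch classifier
    # and a batched list of (batch_ids, input, target) samples, respectively.
    accuracy = model_accuracy(model, test_set, device="cpu")
    print("model accuracy:", accuracy)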

backwardcompatibilityml.scores module

backwardcompatibilityml.scores.error_compatibility_score(h1_output_labels, h2_output_labels, expected_labels)

The fraction of instances labeled incorrectly by both h1 and h2 out of the total number of instances labeled incorrectly by h1.

Parameters:
  • h1_output_labels – A list of the labels output by the model h1.
  • h2_output_labels – A list of the labels output by the model h2.
  • expected_labels – A list of the corresponding ground truth target labels.
Returns:

If h1 has any errors, then we return the error compatibility score of h2 with respect to h1. If h1 has no errors then we return 0.
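
A small worked example, assuming the label lists are plain Python lists of class indices:

    from backwardcompatibilityml.scores import error_compatibility_score

    expected = [0, 1, 1, 0, 1]
    h1_out = [0, 0, 1, 1, 1]  # h1 is wrong on instances 1 and 3
    h2_out = [0, 0, 1, 0, 0]  # h2 is wrong on instances 1 and 4

    # h1 makes 2 errors and h2 repeats 1 of them (instance 1), so the score is 1/2.
    print(error_compatibility_score(h1_out, h2_out, expected))  # 0.5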

backwardcompatibilityml.scores.trust_compatibility_score(h1_output_labels, h2_output_labels, expected_labels)

The fraction of instances labeled correctly by both h1 and h2 out of the total number of instances labeled correctly by h1.

Parameters:
  • h1_output_labels – A list of the labels output by the model h1.
  • h2_output_labels – A list of the labels output by the model h2.
  • expected_labels – A list of the corresponding ground truth target labels.
Returns:

If h1 labels any instances correctly, then we return the trust compatibility score of h2 with respect to h1. If h1 labels no instances correctly, then we return 0.
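
A small worked example, using the same hypothetical label lists as above:

    from backwardcompatibilityml.scores import trust_compatibility_score

    expected = [0, 1, 1, 0, 1]
    h1_out = [0, 0, 1, 1, 1]  # h1 is correct on instances 0, 2 and 4
    h2_out = [0, 0, 1, 0, 0]  # h2 is correct on instances 0, 2 and 3

    # h1 labels 3 instances correctly and h2 agrees with the ground truth on 2 of
    # them (instances 0 and 2), so the score is 2/3.
    print(trust_compatibility_score(h1_out, h2_out, expected))  # 0.666...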

backwardcompatibilityml.sweep_management module

class backwardcompatibilityml.sweep_management.SweepManager(folder_name, number_of_epochs, h1, h2, training_set, test_set, batch_size_train, batch_size_test, OptimizerClass, optimizer_kwargs, NewErrorLossClass, StrictImitationLossClass, lambda_c_stepsize=0.25, new_error_loss_kwargs=None, strict_imitation_loss_kwargs=None, performance_metric=<function model_accuracy>, get_instance_image_by_id=None, get_instance_metadata=None, device='cpu', use_ml_flow=False, ml_flow_run_name='compatibility_sweep')

Bases: object

The SweepManager class is used to manage an experiment that trains / updates a model h2, with respect to a reference model h1, in a way that preserves compatibility between the models. The experiment sweeps the parameter space of the regularization parameter lambda_c, performing a compatibility training run at each increment of lambda_c for some settable step size.

The sweep manager can run the sweep experiment either synchronously or within a separate thread. In the latter case, it provides helper functions that allow you to check what percentage of the sweep is complete.

Parameters:
  • folder_name – A string value representing the full path of the folder where the results of the compatibility sweep are to be stored.
  • number_of_epochs – The number of training epochs to use on each sweep.
  • h1 – The reference model being used.
  • h2 – The new model being trained / updated.
  • training_set – The list of training samples as (batch_ids, input, target).
  • test_set – The list of testing samples as (batch_ids, input, target).
  • batch_size_train – An integer representing batch size of the training set.
  • batch_size_test – An integer representing the batch size of the test set.
  • OptimizerClass – The class to instantiate an optimizer from for training.
  • optimizer_kwargs – A dictionary of the keyword arguments to be used to instantiate the optimizer.
  • NewErrorLossClass – The class of the New Error style loss function to be instantiated and used to perform compatibility constrained training of our model h2.
  • StrictImitationLossClass – The class of the Strict Imitation style loss function to be instantiated and used to perform compatibility constrained training of our model h2.
  • performance_metric – Optional performance metric to be used when evaluating the model. If not specified then accuracy is used.
  • lambda_c_stepsize – The increments of lambda_c to use as we sweep the parameter space between 0.0 and 1.0.
  • get_instance_image_by_id
    A function that returns an image representation of the data corresponding to the instance id, in PNG format. It should be a function of the form:
    get_instance_image_by_id(instance_id)
    instance_id: An integer instance id

    And should return a PNG image.

  • get_instance_metadata
    A function that returns a text string representation of some metadata corresponding to the instance id. It should be a function of the form:
    get_instance_metadata(instance_id)
    instance_id: An integer instance id

    And should return a string.

  • device – A string with value either “cpu” or “cuda” indicating the device on which PyTorch performs training. By default this value is “cpu”. If your models reside on the GPU, set this to “cuda” so that the input and target tensors are transferred to the GPU during training.
  • use_ml_flow – A boolean flag controlling whether or not to log the sweep with MLflow. If True, an MLflow run will be created with the name specified by ml_flow_run_name.
  • ml_flow_run_name – A string that configures the name of the MLflow run.
get_evaluation(evaluation_id)
get_instance_image(instance_id)
get_sweep_status()
get_sweep_summary()
is_running()
start_sweep()
start_sweep_synchronous()
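
A minimal instantiation sketch follows. The models h1 and h2, the batched training_set and test_set, and the loss classes are placeholders for your own experiment; in particular, the bcloss.BCCrossEntropyLoss and bcloss.StrictImitationCrossEntropyLoss names are assumed from this package's examples and should be verified against your installed version. The sketch also assumes start_sweep() is the threaded variant described above, with start_sweep_synchronous() as the blocking alternative.

    import torch.optim as optim
    from backwardcompatibilityml import loss as bcloss
    from backwardcompatibilityml.sweep_management import SweepManager

    def get_instance_metadata(instance_id):
        # Hypothetical metadata callback: return a short text label for the instance.
        return "instance {}".format(instance_id)

    sweep_manager = SweepManager(
        "sweep_results",                 # folder_name where results are written
        10,                              # number_of_epochs per training run
        h1,                              # reference model (placeholder)
        h2,                              # new model being trained / updated (placeholder)
        training_set,                    # batched (batch_ids, input, target) samples (placeholder)
        test_set,                        # batched (batch_ids, input, target) samples (placeholder)
        64,                              # batch_size_train
        128,                             # batch_size_test
        optim.SGD,                       # OptimizerClass
        {"lr": 0.01, "momentum": 0.5},   # optimizer_kwargs
        bcloss.BCCrossEntropyLoss,       # NewErrorLossClass (assumed name)
        bcloss.StrictImitationCrossEntropyLoss,  # StrictImitationLossClass (assumed name)
        lambda_c_stepsize=0.25,
        get_instance_metadata=get_instance_metadata,
        device="cpu")

    # Start the sweep in a background thread and poll its progress.
    sweep_manager.start_sweep()
    print(sweep_manager.get_sweep_status())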

Module contents