BaseVAE

This is the base VAE architecture module from which all future models should inherit. It contains:

  • a BaseModelConfig instance containing the main model’s parameters (e.g. latent dimension …)
  • a BaseVAE instance which creates a BaseVAE model having a basic autoencoding architecture
  • a BaseSamplerConfig instance containing the main sampler’s parameters used to sample from the latent space of the BaseVAE
  • a BaseSampler instance which creates a BaseSampler.

Note

If you want to build upon this work, try to make any new model inherit from these four classes.

class pyraug.models.base.base_config.BaseModelConfig(input_dim=None, latent_dim=10, uses_default_encoder=True, uses_default_decoder=True)[source]

This is the base configuration instance of the models deriving from BaseConfig.

Parameters
  • input_dim (int) – The input data dimension. Default: None.

  • latent_dim (int) – The latent space dimension. Default: 10.

  • uses_default_encoder (bool) – Whether the default encoder is used. Default: True.

  • uses_default_decoder (bool) – Whether the default decoder is used. Default: True.
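
As a rough illustration of how a model-specific configuration might extend this one, the sketch below uses a stand-in dataclass that mirrors the signature above. It is not pyraug's actual class: SketchModelConfig, MyVAEConfig and the beta field are hypothetical names used only for this example.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SketchModelConfig:
    """Stand-in mirroring the BaseModelConfig signature (not the pyraug class)."""

    input_dim: Optional[int] = None
    latent_dim: int = 10
    uses_default_encoder: bool = True
    uses_default_decoder: bool = True


@dataclass
class MyVAEConfig(SketchModelConfig):
    """Hypothetical child config adding one model-specific parameter."""

    beta: float = 1.0  # hypothetical extra hyperparameter


# Override only what differs from the defaults.
config = MyVAEConfig(input_dim=784, latent_dim=2)

# A plain dict view, convenient for writing the config to a JSON file.
config_dict = asdict(config)
```

Keeping every parameter in one config object means a child model only has to declare the fields it adds.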

class pyraug.models.BaseVAE(model_config, encoder=None, decoder=None)[source]

Base class for VAE based models.

Parameters
  • model_config (BaseModelConfig) – An instance of BaseModelConfig in which all of the model’s parameters are made available.

  • encoder (BaseEncoder) – An instance of BaseEncoder (inheriting from torch.nn.Module) which plays the role of encoder. This argument allows you to use your own neural network architectures if desired. If None is provided, a simple Multi Layer Perceptron (https://en.wikipedia.org/wiki/Multilayer_perceptron) is used. Default: None.

  • decoder (BaseDecoder) – An instance of BaseDecoder (inheriting from torch.nn.Module) which plays the role of decoder. This argument allows you to use your own neural network architectures if desired. If None is provided, a simple Multi Layer Perceptron (https://en.wikipedia.org/wiki/Multilayer_perceptron) is used. Default: None.

Note

For high-dimensional data we advise you to provide your own network architectures. With the provided MLP you may end up with a MemoryError.

forward(inputs)[source]

Main forward pass producing the VAE outputs. This function should return a model_output instance gathering all the model outputs.

Parameters

inputs (Dict[str, torch.Tensor]) – The training data with labels, masks etc…

Returns

The output of the model.

Return type

(ModelOutput)

Note

The loss must be computed in this forward pass and accessed through loss = model_output.loss

classmethod load_from_folder(dir_path)[source]

Class method to be used to load the model from a specific folder

Parameters

dir_path (str) – The path where the model should have been saved.

Note

This function requires the folder to contain:

a model_config.json and a model.pt if no custom architectures were provided

or a model_config.json, a model.pt and an encoder.pkl (resp. decoder.pkl) if a custom encoder (resp. decoder) was provided
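
The folder requirements above can be checked before loading. The following is a minimal sketch that uses only the file names listed in this note; missing_model_files and its arguments are hypothetical and not part of pyraug.

```python
import os
import tempfile


def missing_model_files(dir_path, custom_encoder=False, custom_decoder=False):
    """Return the expected files that are absent from dir_path (hypothetical helper)."""
    expected = ["model_config.json", "model.pt"]
    if custom_encoder:
        expected.append("encoder.pkl")
    if custom_decoder:
        expected.append("decoder.pkl")
    present = set(os.listdir(dir_path))
    return [name for name in expected if name not in present]


# Demonstration on a temporary folder holding empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("model_config.json", "model.pt"):
        open(os.path.join(tmp, name), "w").close()
    missing_default = missing_model_files(tmp)
    missing_custom = missing_model_files(tmp, custom_encoder=True)
```

Here missing_default is empty, while missing_custom reports the absent encoder.pkl.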

save(dir_path)[source]

Method to save the model at a specific location. It saves the model weights as a model.pt file along with the model config as a model_config.json file. If the model to save used a custom encoder (resp. decoder) provided by the user, these are also saved as encoder.pkl (resp. decoder.pkl).

Parameters

dir_path (str) – The path where the model should be saved. If the path does not exist, a folder will be created at the provided location.

set_decoder(decoder)[source]

Set the decoder of the model

set_encoder(encoder)[source]

Set the encoder of the model

update()[source]

Method that allows model update during the training.

If needed, this method must be implemented in a child class.

By default, it does nothing.
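
A minimal sketch of this hook pattern follows. The classes are stand-ins rather than pyraug code, and calling update() once per epoch is an assumption made for illustration; SketchModel, CountingModel and run_epochs are hypothetical names.

```python
class SketchModel:
    """Stand-in for a base model whose update() does nothing by default."""

    def update(self):
        # Default behaviour: no-op.
        pass


class CountingModel(SketchModel):
    """Hypothetical child class that overrides update()."""

    def __init__(self):
        self.n_updates = 0

    def update(self):
        # Example of model-specific state refreshed during training.
        self.n_updates += 1


def run_epochs(model, n_epochs):
    """Hypothetical training loop skeleton calling update() once per epoch."""
    for _ in range(n_epochs):
        # ... forward pass, loss and optimizer step would go here ...
        model.update()
    return model
```

Because the base method is a no-op, the training loop can call update() unconditionally and only models that need it pay any cost.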

class pyraug.models.base.base_config.BaseSamplerConfig(output_dir=None, batch_size=50, samples_per_save=500, no_cuda=False)[source]

This is the base configuration of a model sampler

Parameters
  • samples_number (int) – The number of samples to generate

  • batch_size (int) – The number of samples to generate in each batch. Default: 50.

  • samples_per_save (int) – The number of samples to be saved together. By default, when generating, the generated data is saved in .pt format in several files. This specifies the number of samples to be saved in these files. Amend this argument if you deal with particularly large data. Default: 500.

  • no_cuda (bool) – Disable cuda. Default: False

class pyraug.models.base.base_sampler.BaseSampler(model, sampler_config=None)[source]

Base class for the samplers used to generate data from the VAE models.

Parameters
  • model (BaseVAE) – The vae model to sample from.

  • sampler_config (BaseSamplerConfig) – An instance of BaseSamplerConfig in which all of the sampler’s parameters are made available. If None, a default configuration is used. Default: None.

sample(num_samples)[source]

Main sampling function of the samplers. The data is saved in the output_dir/generation_ folder passed in the pyraug.models.model_config.SamplerConfig instance. If output_dir is None, a folder named dummy_output_dir is created in this folder.

Parameters

num_samples (int) – The number of samples to generate

save(dir_path)[source]

Method to save the sampler config. The config is saved as a sampler_config.json file in dir_path.
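
As a rough sketch of what saving a sampler configuration to sampler_config.json could look like, the example below serialises a stand-in dataclass with the standard json module. SketchSamplerConfig and save_config are hypothetical; pyraug's own serialisation may differ.

```python
import json
import os
import tempfile
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SketchSamplerConfig:
    """Stand-in mirroring the BaseSamplerConfig signature (not the pyraug class)."""

    output_dir: Optional[str] = None
    batch_size: int = 50
    samples_per_save: int = 500
    no_cuda: bool = False


def save_config(config, dir_path):
    """Write the config as sampler_config.json inside dir_path (hypothetical helper)."""
    os.makedirs(dir_path, exist_ok=True)
    path = os.path.join(dir_path, "sampler_config.json")
    with open(path, "w") as f:
        json.dump(asdict(config), f)
    return path


# Round trip through a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_config(SketchSamplerConfig(batch_size=100), tmp)
    with open(saved_path) as f:
        reloaded = json.load(f)
```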

save_data_batch(data, dir_path, number_of_samples, batch_idx)[source]

Method to save a batch of generated data. The data will be saved in the dir_path folder. The batch of data is saved in a file named generated_data_{number_of_samples}_{batch_idx}.pt

Parameters
  • data (torch.Tensor) – The data to save

  • dir_path (str) – The folder where the data and config file must be saved

  • batch_idx (int) – The index of the batch

Note

You can then easily reload the generated data using torch.load on the saved file.
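
A minimal sketch of locating such a file, assuming the naming pattern generated_data_{number_of_samples}_{batch_idx}.pt described above. The helper generated_data_path and the folder name my_output_dir are hypothetical.

```python
import os


def generated_data_path(dir_path, number_of_samples, batch_idx):
    """Build the path of one saved batch from the documented naming pattern."""
    file_name = f"generated_data_{number_of_samples}_{batch_idx}.pt"
    return os.path.join(dir_path, file_name)


# Path of the first batch when 500 samples are stored per file.
path = generated_data_path("my_output_dir", 500, 0)
```

Since the batches are stored in .pt format, the resulting path could then be passed to torch.load to recover the tensor.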