You are viewing the documentation for metbit 9.1.0.

metbit.dl.models

Other module in metbit 9.1.0.

import metbit.dl.models

Classes

SpectralAutoencoder

Symmetric encoder-decoder for unsupervised NMR spectral embedding.

Trains a fully-connected autoencoder on spectral data and exposes methods to retrieve latent embeddings, reconstructions, and diagnostic plots.

Args:

X: Input spectra as a DataFrame (samples x variables) or ndarray of shape (n_samples, n_features).

latent_dim: Dimensionality of the bottleneck layer.

hidden_dims: List of hidden layer sizes for the encoder; the decoder mirrors these in reverse.

epochs: Number of training epochs.

lr: Learning rate for the Adam optimiser.

batch_size: Mini-batch size.

random_state: Seed for reproducibility.

device: Compute device, one of 'auto', 'cpu', 'cuda', or 'mps'. 'auto' selects the first available of CUDA, MPS, then CPU.
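
The 'auto' priority order described for the device argument can be sketched as below. This is an illustration only, not metbit's implementation: `resolve_device` and its two availability flags are hypothetical names, passed in explicitly so the sketch runs without PyTorch. With PyTorch installed, the flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
def resolve_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Resolve a device string using the documented CUDA > MPS > CPU priority.

    Hypothetical helper: availability is supplied by the caller rather than
    queried from a deep-learning framework.
    """
    allowed = {"auto", "cpu", "cuda", "mps"}
    if requested not in allowed:
        raise ValueError(f"device must be one of {sorted(allowed)}, got {requested!r}")
    if requested != "auto":
        return requested  # an explicit choice is passed through unchanged
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


print(resolve_device("auto", cuda_available=False, mps_available=True))  # mps
print(resolve_device("auto", cuda_available=True, mps_available=True))   # cuda
print(resolve_device("cpu", cuda_available=True, mps_available=True))    # cpu
```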

Attributes:

training_loss_: List of per-epoch MSE loss values, populated after calling fit().

Examples:

>>> import numpy as np
>>> from metbit.dl import SpectralAutoencoder
>>> X = np.random.rand(80, 1000).astype("float32")
>>> ae = SpectralAutoencoder(X, latent_dim=4, epochs=5)
>>> ae.fit(verbose=False)
SpectralAutoencoder(latent_dim=4, ...)
>>> emb = ae.encode()
>>> emb.shape
(80, 4)

Methods

__init__(self, X, latent_dim: int=8, hidden_dims: list=None, epochs: int=100, lr: float=0.001, batch_size: int=32, random_state: int=42, device: str='auto')
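
To make the symmetric layout concrete, the sketch below lists the layer widths implied by a given hidden_dims and latent_dim, with the decoder mirroring the encoder in reverse. `autoencoder_layer_sizes` is a hypothetical helper, and the hidden sizes `[256, 64]` are illustrative values, since the default used when hidden_dims is None is not stated here.

```python
def autoencoder_layer_sizes(n_features, hidden_dims, latent_dim):
    """Return (encoder_widths, decoder_widths) for a mirrored autoencoder.

    Hypothetical helper for illustration; it only computes layer widths and
    builds no network.
    """
    encoder = [n_features, *hidden_dims, latent_dim]
    decoder = encoder[::-1]  # the decoder walks the same widths backwards
    return encoder, decoder


enc, dec = autoencoder_layer_sizes(n_features=1000, hidden_dims=[256, 64], latent_dim=8)
print(enc)  # [1000, 256, 64, 8]
print(dec)  # [8, 64, 256, 1000]
```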

fit(self, verbose: bool=True)

Train the autoencoder.

Args:

verbose: If True, print loss every 10 epochs.

Returns: self, to allow method chaining.
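
Returning self is what lets a call such as fit() be followed directly by another method call on the same line. The toy class below mimics only that convention; `ToyModel`, its placeholder loss values, and `n_epochs_recorded` are hypothetical and unrelated to metbit's training loop.

```python
class ToyModel:
    """Hypothetical stand-in that mimics only the fit-returns-self convention."""

    def __init__(self, epochs: int = 3):
        self.epochs = epochs
        self.training_loss_ = []

    def fit(self, verbose: bool = True):
        for epoch in range(self.epochs):
            loss = 1.0 / (epoch + 1)  # placeholder value, not a real loss
            self.training_loss_.append(loss)
            if verbose:
                print(f"epoch {epoch + 1}: loss={loss:.3f}")
        return self  # returning self enables chaining

    def n_epochs_recorded(self) -> int:
        return len(self.training_loss_)


# Construct, train, and query in one chained expression.
n_recorded = ToyModel(epochs=3).fit(verbose=False).n_epochs_recorded()
print(n_recorded)  # 3
```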

encode(self, X=None)

Return latent embeddings for X.

Args:

X: Data to encode. If None, encodes the training data.

Returns: np.ndarray of shape (n_samples, latent_dim).

reconstruct(self, X=None)

Return reconstructed spectra.

Args:

X: Data to reconstruct. If None, reconstructs the training data.

Returns: np.ndarray of shape (n_samples, n_features) in original (un-normalised) scale.

plot_embedding(self, color_: 'pd.Series | list | None'=None, color_dict: 'dict | None'=None, components: list=None, fig_height: int=700, fig_width: int=900)

Scatter plot of two selected latent dimensions.

Args:

color_: Group labels aligned with the training samples, used to colour the points.

color_dict: Mapping from label to hex/CSS colour string. If None, colours are assigned automatically.

components: Two-element list of zero-based latent dimension indices to plot. Defaults to [0, 1].

fig_height: Figure height in pixels.

fig_width: Figure width in pixels.

Returns: plotly.graph_objects.Figure.

Examples:

>>> fig = ae.plot_embedding(color_=labels)
>>> fig.show()

plot_loss(self, fig_height: int=400, fig_width: int=700)

Plot the training MSE loss curve.

Args:

fig_height: Figure height in pixels.

fig_width: Figure width in pixels.

Returns: plotly.graph_objects.Figure.

SpectralMLP

Multi-layer perceptron classifier for NMR spectral data.

Trains a fully-connected MLP with dropout regularisation using cross-entropy loss. Labels are encoded internally via sklearn.preprocessing.LabelEncoder.
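
The effect of the internal label encoding can be sketched in plain Python. scikit-learn's LabelEncoder assigns integer codes to the sorted unique labels, and the sketch reproduces that mapping without importing scikit-learn; `fit_label_mapping` is a hypothetical helper, not part of metbit.

```python
def fit_label_mapping(labels):
    """Map each distinct label to an integer code, in sorted label order.

    Hypothetical helper mirroring the sorted-classes behaviour of
    sklearn.preprocessing.LabelEncoder.
    """
    classes = sorted(set(labels))
    to_index = {label: i for i, label in enumerate(classes)}
    return classes, to_index


y = ["ctrl", "case", "ctrl", "case", "case"]
classes, to_index = fit_label_mapping(y)

encoded = [to_index[label] for label in y]  # integer targets for training
decoded = [classes[i] for i in encoded]     # back to the original labels

print(classes)  # ['case', 'ctrl']
print(encoded)  # [1, 0, 1, 0, 0]
print(decoded == y)  # True
```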

Args:

X: Input spectra as a DataFrame (samples x variables) or ndarray of shape (n_samples, n_features).

y: Class labels as a Series, ndarray, or list. Can be strings or integers.

hidden_dims: List of hidden layer sizes.

epochs: Number of training epochs.

lr: Learning rate for the Adam optimiser.

batch_size: Mini-batch size.

dropout: Dropout probability applied after each hidden ReLU.

random_state: Seed for reproducibility.

device: Compute device, one of 'auto', 'cpu', 'cuda', or 'mps'.

Attributes:

training_loss_: List of per-epoch cross-entropy loss values, populated after calling fit().

label_encoder_: Fitted LabelEncoder instance.

Examples:

>>> import numpy as np, pandas as pd
>>> from metbit.dl import SpectralMLP
>>> X = np.random.rand(60, 500).astype("float32")
>>> y = ["ctrl"] * 30 + ["case"] * 30
>>> clf = SpectralMLP(X, y, epochs=5)
>>> clf.fit(verbose=False)
SpectralMLP(hidden_dims=[256, 128, 64], ...)
>>> preds = clf.predict()
>>> preds.shape
(60,)

Methods

__init__(self, X, y, hidden_dims: list=None, epochs: int=100, lr: float=0.001, batch_size: int=32, dropout: float=0.3, random_state: int=42, device: str='auto')

fit(self, verbose: bool=True)

Train the MLP classifier.

Args:

verbose: If True, print loss every 10 epochs.

Returns: self, to allow method chaining.

predict(self, X=None)

Return predicted class labels.

Args:

X: Data to classify. If None, classifies the training data.

Returns: np.ndarray of class label strings (or original dtype).

predict_proba(self, X=None)

Return class probabilities.

Args:

X: Data to score. If None, scores the training data.

Returns: np.ndarray of shape (n_samples, n_classes).
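
Classifiers trained with cross-entropy loss usually turn their raw output scores (logits) into probabilities with a softmax. The sketch below assumes that is what happens here, which this page does not confirm; `softmax` is a hypothetical stand-alone function operating on one sample's logits.

```python
import math


def softmax(logits):
    """Convert one sample's logits into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 0.0])  # illustrative logits for two classes
print([round(p, 3) for p in probs])  # [0.881, 0.119]
```

Each row of the returned (n_samples, n_classes) array would then hold one such probability vector.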

get_accuracy(self)

Compute training accuracy.

Returns: Fraction of correctly classified training samples.
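
The returned value is the usual accuracy ratio, sketched below with a hypothetical `accuracy` function and made-up labels. Because it is computed on the training samples, it says nothing about performance on unseen spectra.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty sample")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


y_true = ["ctrl", "ctrl", "case", "case"]
y_pred = ["ctrl", "case", "case", "case"]  # one "ctrl" sample is misclassified
print(accuracy(y_true, y_pred))  # 0.75
```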

plot_loss(self, fig_height: int=400, fig_width: int=700)

Plot the training cross-entropy loss curve.

Args:

fig_height: Figure height in pixels.

fig_width: Figure width in pixels.

Returns: plotly.graph_objects.Figure.

plot_confusion_matrix(self, normalize: bool=True, fig_height: int=600, fig_width: int=700)

Plot a confusion matrix heatmap for training predictions.

Args:

normalize: If True, show row-normalised proportions; otherwise show raw counts.

fig_height: Figure height in pixels.

fig_width: Figure width in pixels.

Returns: plotly.graph_objects.Figure.
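
The difference between raw counts and row-normalised proportions can be sketched as follows. The sketch assumes rows are true labels and columns are predicted labels (the common convention, not confirmed for metbit); `confusion_matrix` is a hypothetical helper and the labels are made up.

```python
def confusion_matrix(y_true, y_pred, normalize=True):
    """Build a confusion matrix with true labels as rows, predictions as columns."""
    classes = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        counts[index[t]][index[p]] += 1
    if not normalize:
        return classes, counts
    # Row normalisation: divide each row by the number of true samples in it.
    normalised = []
    for row in counts:
        total = sum(row)
        normalised.append([c / total if total else 0.0 for c in row])
    return classes, normalised


y_true = ["case"] * 4 + ["ctrl"] * 2
y_pred = ["case", "case", "case", "ctrl", "ctrl", "case"]

print(confusion_matrix(y_true, y_pred, normalize=False))
# (['case', 'ctrl'], [[3, 1], [1, 1]])
print(confusion_matrix(y_true, y_pred, normalize=True))
# (['case', 'ctrl'], [[0.75, 0.25], [0.5, 0.5]])
```

With row normalisation each row sums to 1, so the diagonal reads as per-class recall regardless of class size.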

SpectralCNN

1-D Convolutional Neural Network for NMR spectral classification.

Each block applies Conv1d -> BatchNorm1d -> ReLU -> MaxPool1d. An AdaptiveAvgPool1d(1) layer then averages each feature map along the spectral axis before the final linear classifier.
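
The sketch below traces how the spectrum length shrinks through such a stack, using the standard Conv1d and MaxPool1d output-length formulas. Two details are assumptions, since this page does not state them: each convolution uses padding of kernel_size // 2 with stride 1, and each pooling layer has size 2. The helper names are hypothetical, and `[32, 64, 128]` is taken from the class example's repr.

```python
def conv1d_out_len(length, kernel_size, padding=0, stride=1, dilation=1):
    """Output length of a 1-D convolution (standard formula)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def maxpool1d_out_len(length, pool_size):
    """Output length of 1-D max pooling with stride equal to pool_size, no padding."""
    return (length - pool_size) // pool_size + 1


def trace_shapes(n_features, filters, kernel_size=7, pool_size=2):
    """Return (channels, length) after each Conv1d -> ... -> MaxPool1d block."""
    length = n_features
    shapes = []
    for channels in filters:
        # Assumed padding of kernel_size // 2 keeps the length unchanged here.
        length = conv1d_out_len(length, kernel_size, padding=kernel_size // 2)
        length = maxpool1d_out_len(length, pool_size)
        shapes.append((channels, length))
    return shapes


shapes = trace_shapes(n_features=1024, filters=[32, 64, 128])
print(shapes)  # [(32, 512), (64, 256), (128, 128)]
```

Under these assumptions, global average pooling collapses the final (128, 128) output to 128 values per sample, one per filter, which is what the linear classifier would receive.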

Args:

X: Input spectra as a DataFrame (samples x variables) or ndarray of shape (n_samples, n_features).

y: Class labels as a Series, ndarray, or list.

filters: Number of filters in each convolutional block.

kernel_size: Kernel size for all Conv1d layers.

epochs: Number of training epochs.

lr: Learning rate for the Adam optimiser.

batch_size: Mini-batch size.

dropout: Dropout probability before the final linear layer.

random_state: Seed for reproducibility.

device: Compute device, one of 'auto', 'cpu', 'cuda', or 'mps'.

Attributes:

training_loss_: List of per-epoch cross-entropy loss values, populated after calling fit().

label_encoder_: Fitted LabelEncoder instance.

Examples:

>>> import numpy as np
>>> from metbit.dl import SpectralCNN
>>> X = np.random.rand(60, 1024).astype("float32")
>>> y = ["ctrl"] * 30 + ["case"] * 30
>>> cnn = SpectralCNN(X, y, epochs=5)
>>> cnn.fit(verbose=False)
SpectralCNN(filters=[32, 64, 128], ...)
>>> preds = cnn.predict()
>>> preds.shape
(60,)

Methods

__init__(self, X, y, filters: list=None, kernel_size: int=7, epochs: int=100, lr: float=0.001, batch_size: int=32, dropout: float=0.3, random_state: int=42, device: str='auto')

fit(self, verbose: bool=True)

Train the CNN classifier.

Args:

verbose: If True, print loss every 10 epochs.

Returns: self, to allow method chaining.

predict(self, X=None)

Return predicted class labels.

Args:

X: Data to classify. If None, classifies the training data.

Returns: np.ndarray of class label strings (or original dtype).

predict_proba(self, X=None)

Return class probabilities.

Args:

X: Data to score. If None, scores the training data.

Returns: np.ndarray of shape (n_samples, n_classes).

get_accuracy(self)

Compute training accuracy.

Returns: Fraction of correctly classified training samples.

plot_loss(self, fig_height: int=400, fig_width: int=700)

Plot the training cross-entropy loss curve.

Args:

fig_height: Figure height in pixels.

fig_width: Figure width in pixels.

Returns: plotly.graph_objects.Figure.

plot_confusion_matrix(self, normalize: bool=True, fig_height: int=600, fig_width: int=700)

Plot a confusion matrix heatmap for training predictions.

Args:

normalize: If True, show row-normalised proportions; otherwise show raw counts.

fig_height: Figure height in pixels.

fig_width: Figure width in pixels.

Returns: plotly.graph_objects.Figure.

Source

metbit/dl/models.py at v9.1.0
