API Documentation

metbit

Category: Statistical Models

Classes

opls_da

Methods

__init__(X, y, features_name, n_components, scaling_method, kfold, estimator, random_state, auto_ncomp)

Purpose: Initializes the model, validates the inputs, and preprocesses the data for further analysis.

fit()

Purpose: Fits the OPLS-DA model to the data and computes model performance metrics.

get_oplsda_scores()

Get OPLS-DA scores

get_s_scores()

Get S scores

get_oplsda_model()

Get OPLS-DA model

get_cv_model()

Get cross-validation model

permutation_test(n_permutations, cv, n_jobs, verbose)
get_permutation_scores()

Get permutation scores

vip_scores(model, features_name)

Get VIP score

Parameters
  • model: object, default=None OPLS-DA model.
  • features_name: array-like, shape (n_features,), default=None Name of features.
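For orientation, the conventional PLS definition of VIP (variable importance in projection) can be sketched as follows. This is a minimal illustration and not metbit's implementation: the helper name `vip_from_weights`, its two inputs, and the toy numbers are assumptions made for the example, and OPLS-DA implementations may treat the predictive and orthogonal components differently.

```python
import math

def vip_from_weights(weights, explained_ss):
    """Conventional PLS VIP scores (hypothetical helper, not part of metbit).

    weights      : one weight vector per component, each of length n_features
    explained_ss : sum of squares of y explained by each component
    """
    n_features = len(weights[0])
    total_ss = sum(explained_ss)
    vip = []
    for j in range(n_features):
        acc = 0.0
        for w, ss in zip(weights, explained_ss):
            norm_sq = sum(v * v for v in w)  # squared length of the weight vector
            acc += ss * (w[j] ** 2) / norm_sq
        vip.append(math.sqrt(n_features * acc / total_ss))
    return vip

# Toy input (made up): 2 components, 3 features
weights = [[0.8, 0.6, 0.0], [0.0, 0.6, 0.8]]
explained_ss = [3.0, 1.0]
vip = vip_from_weights(weights, explained_ss)
```

Under this definition the squared VIP values sum to the number of features, so their average is 1, which is one reason 1 is a common threshold.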
get_vip_scores(filter_, threshold)

Get VIP score

Parameters
  • filter_: bool, default=False If True, filter VIP score based on threshold.
  • threshold: int, default=1 Threshold of VIP score.
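The effect of filter_ and threshold can be pictured with a small sketch. The helper `filter_vip`, the dictionary input, the feature names, and the choice of a "greater than or equal" comparison are all assumptions for illustration; the source does not say how metbit compares scores against the threshold or what type it returns.

```python
def filter_vip(vip_by_feature, filter_=False, threshold=1):
    """Return VIP scores, optionally keeping only those at or above threshold.

    Hypothetical helper: the name, the dict input, and the >= comparison
    are assumptions for illustration, not metbit's implementation.
    """
    if not filter_:
        # No filtering requested: return every score unchanged
        return dict(vip_by_feature)
    return {name: score for name, score in vip_by_feature.items()
            if score >= threshold}

# Made-up feature names and scores
vip_by_feature = {"feat_a": 1.8, "feat_b": 0.4, "feat_c": 1.0}
all_scores = filter_vip(vip_by_feature)
important = filter_vip(vip_by_feature, filter_=True, threshold=1)
```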
vip_plot(x_range, threshold, marker_size, fig_width, fig_height, filter_, vip_transform, font_size, title_font_size, xaxis_direction)

Purpose: Plots the VIP scores for the features, allowing for customization and thresholding.

plot_oplsda_scores(x_, y_, color_, color_dict, symbol_, symbol_dict, fig_height, fig_width, marker_size, marker_opacity, marker_label, font_size, title_font_size, legend_name, individual_ellipse)

Purpose: Generates an OPLS-DA (Orthogonal Partial Least Squares Discriminant Analysis) scores plot, showing where each sample falls along the predictive component (t_scores) and the orthogonal component (t_ortho).

Parameters
  • x_, y_: Names of the columns in the scores DataFrame (df_opls_scores) used for the x and y axes. Default to 't_scores' and 't_ortho'.
  • color_: array-like, shape (n_samples,), default=None Color group of each sample. If None, colors are based on the groups in y.
  • color_dict: dict, default=None Dictionary mapping each group to a color. If None, colors are based on the groups in y.
  • symbol_: array-like, shape (n_samples,), default=None Symbol group of each sample. If None, symbols are based on the groups in y.
  • symbol_dict: dict, default=None Dictionary mapping each group to a marker symbol. If None, symbols are based on the groups in y.
  • fig_height: int, default=900 Height of the figure.
  • fig_width: int, default=1300 Width of the figure.
  • marker_size: int, default=35 Size of the markers.
  • marker_opacity: float, default=0.7 Opacity of the markers.
  • legend_name: Custom labels for the legend.
  • individual_ellipse: bool, default=True Whether to draw an ellipse around each individual group.

Plot details
  • Uses Plotly (px.scatter) to create an interactive scatter plot.
  • Can display confidence ellipses around each group.
  • Adds annotations for the R²X, R²Y, and Q² statistics.
  • Marker size, opacity, labels, and colors are customizable.
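The geometry behind a confidence ellipse for one group of scores can be sketched as below. This is only one common construction, based on the sample covariance and a chi-square quantile with two degrees of freedom; the source does not state which method or confidence level metbit uses, so the helper name `confidence_ellipse`, the 0.95 level, and the toy scores are assumptions.

```python
import math

def confidence_ellipse(xs, ys, level=0.95):
    """Center, semi-axes and rotation of a covariance-based ellipse.

    Hypothetical helper, not part of metbit. Uses the closed-form
    chi-square quantile for 2 degrees of freedom: -2 * ln(1 - level).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

    # Eigenvalues of the 2x2 symmetric covariance matrix
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)  # guard against rounding below zero

    scale = -2.0 * math.log(1.0 - level)
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)  # rotation of the major axis
    return (mx, my), math.sqrt(scale * lam_major), math.sqrt(scale * lam_minor), angle

# Toy scores lying exactly on a line, so the minor axis collapses to zero
t_scores = [-2.0, -1.0, 0.0, 1.0, 2.0]
t_ortho = [-1.0, -0.5, 0.0, 0.5, 1.0]
center, major, minor, angle = confidence_ellipse(t_scores, t_ortho)
```

Points on the ellipse outline can then be generated from the center, the two semi-axes, and the rotation angle, and drawn as an extra trace on the scatter plot.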
plot_hist(nbins_, fig_height, fig_width, font_size, title_font_size)

Purpose: Plots a histogram of the scores from a permutation test, which is commonly used to evaluate model stability and significance.

Parameters
  • nbins_: int, default=50 Number of bins in the histogram.
  • fig_height: int, default=500 Height of the figure.
  • fig_width: int, default=1000 Width of the figure.
  • font_size, title_font_size: Font sizes for the labels and the title.

Plot details
  • Plots the permutation scores as a histogram with Plotly.
  • Marks the actual model accuracy score with a red dashed line.
  • Adds annotations for the number of permutations, the accuracy score, and the p-value.
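How a p-value is typically derived from permutation scores can be sketched as follows. The formula below is the empirical convention used by scikit-learn's permutation_test_score; whether metbit computes its p-value in exactly this way is an assumption, and the helper name and the scores are made up for illustration.

```python
def permutation_p_value(observed_score, permutation_scores):
    """Empirical p-value of an observed score against permuted-label scores.

    Hypothetical helper, not part of metbit. Counts how many permuted
    models score at least as well as the real one, with a +1 correction
    so the p-value is never exactly zero.
    """
    n_as_good = sum(1 for s in permutation_scores if s >= observed_score)
    return (n_as_good + 1) / (len(permutation_scores) + 1)

# Made-up accuracy scores from 9 label permutations
perm_scores = [0.48, 0.52, 0.55, 0.50, 0.61, 0.47, 0.53, 0.49, 0.58]
p_value = permutation_p_value(0.90, perm_scores)
```

With this convention the smallest attainable p-value is 1 / (n_permutations + 1), so the number of permutations bounds the significance that can be reported.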
plot_s_scores(fig_height, fig_width, range_color_, color_continuous_scale_, marker_size, font_size, title_font_size)

Purpose: Generates an S-plot, a scatter plot of the covariance and the correlation between each feature and the model scores.

Parameters
  • fig_height: int, default=900 Height of the figure.
  • fig_width: int, default=2000 Width of the figure.
  • range_color_: list, default=[-0.05,0.05] Range of colors to display.
  • color_continuous_scale_: str, default='jet' Color scale for the plot.
  • marker_size, font_size, title_font_size: Marker size and font sizes.

Plot details
  • Visualizes the relationship between covariance and correlation for the features.
  • Uses Plotly's scatter plot to create an interactive S-plot.
  • The axes are customizable, and the plot is styled to be visually clean (e.g., axis lines and tick marks).
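The two S-plot coordinates of a feature can be sketched in plain Python as below. This follows the usual S-plot convention (covariance and correlation of each feature with the predictive scores); the helper name `s_plot_coordinates`, the list-of-rows input, and the toy data are assumptions, not metbit's code.

```python
import math

def s_plot_coordinates(X, t):
    """Return one (covariance, correlation) pair per column of X.

    Hypothetical helper, not part of metbit.
    X : list of rows (samples), each a list of feature values
    t : predictive score of each sample
    """
    n = len(t)
    t_mean = sum(t) / n
    t_c = [v - t_mean for v in t]
    t_ss = sum(v * v for v in t_c)
    coords = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        col_c = [v - c_mean for v in col]
        cross = sum(a * b for a, b in zip(col_c, t_c))
        col_ss = sum(v * v for v in col_c)
        cov = cross / (n - 1)
        denom = math.sqrt(col_ss * t_ss)
        corr = cross / denom if denom > 0 else 0.0  # constant columns get 0
        coords.append((cov, corr))
    return coords

# Toy data: feature 0 rises with the scores, feature 1 falls with them
X = [[1.0, 4.0], [2.0, 3.0], [3.0, 2.0], [4.0, 1.0]]
t = [-1.5, -0.5, 0.5, 1.5]
coords = s_plot_coordinates(X, t)
```

Features with both large covariance and correlation close to 1 or -1 end up in the outer corners of the plot, which gives the S-plot its characteristic shape.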
plot_loading(fig_height, fig_width, range_color_, color_continuous_scale_, marker_size, font_size, title_font_size, xaxis_direction, xaxis_title)

Purpose: Generates a loading plot, typically used in multivariate analysis to visualize how each feature relates to the model scores.

Parameters
  • fig_height: int, default=900 Height of the figure.
  • fig_width: int, default=2000 Width of the figure.
  • range_color_: list, default=[-0.05,0.05] Color range used to represent the covariance values.
  • color_continuous_scale_: str, default='jet' Color scale used for continuous color mapping.
  • marker_size, font_size, title_font_size: Marker size and font sizes.
  • xaxis_direction: Direction of the x-axis (e.g., reversed or not).
  • xaxis_title: Title of the x-axis.

Plot details
  • Shows the relationship between the features (usually a set of variables or compounds) and the model scores.
  • Displays the correlation of each feature alongside its covariance, with color representing the covariance value.
  • The plot is interactive and customizable (e.g., marker size, color scale, axis settings).

pca

PCA model

Constructor Parameters

  • X: array-like, shape (n_samples, n_features) Training data, where n_samples is the number of samples and n_features is the number of features.
  • label: array-like, shape (n_samples,) Target data, where n_samples is the number of samples.
  • features_name: array-like, shape (n_features,), default=None Name of features.
  • n_components: int, default=2 Number of components to keep.
  • scale: str, default='pareto' Method of scaling. 'pareto' for Pareto scaling, 'mean' for mean centering, 'uv' for unit variance scaling.
  • random_state: int, default=42 Random state for permutation test.
  • test_size: float, default=0.3 Size of test set.
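The three scaling options named above follow standard definitions, which can be sketched for a single feature column as below. The helper `scale_column` is hypothetical, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption; metbit's own scaler may differ in such details.

```python
import math

def scale_column(values, method="pareto"):
    """Scale one feature column (hypothetical helper, not part of metbit).

    'mean'   : subtract the column mean
    'uv'     : mean-center, then divide by the standard deviation
    'pareto' : mean-center, then divide by the square root of the
               standard deviation
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    if method == "mean":
        return centered
    # Sample standard deviation (n - 1); an assumption for this sketch
    std = math.sqrt(sum(c * c for c in centered) / (n - 1))
    if method == "uv":
        return [c / std for c in centered]
    if method == "pareto":
        return [c / math.sqrt(std) for c in centered]
    raise ValueError(f"unknown scaling method: {method!r}")

column = [2.0, 4.0, 6.0, 8.0]
mean_centered = scale_column(column, "mean")
uv_scaled = scale_column(column, "uv")
pareto_scaled = scale_column(column, "pareto")
```

Pareto scaling sits between the other two: it reduces the dominance of high-variance features less aggressively than unit variance scaling does.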

Methods

__init__(X, label, features_name, n_components, scaling_method, random_state, test_size)
fit()
get_explained_variance()
get_scores()
get_loadings()
get_q2_test()
plot_observe_variance(fig_height, fig_width, font_size)

Visualise explained variance plot

Returns
  • fig: plotly.graph_objects.Figure Explained variance plot.
plot_cumulative_observed(fig_height, fig_width, font_size, marker_size)

Visualise cumulative variance plot

Returns
  • fig: plotly.graph_objects.Figure Cumulative variance plot.
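The quantities behind the explained variance and cumulative variance plots can be sketched as follows. The helper name and the variance values are made up for illustration; in practice the per-component variances would come from the fitted PCA model.

```python
from itertools import accumulate

def explained_variance_ratio(component_variances, total_variance):
    """Fraction of the total variance captured by each component.

    Hypothetical helper, not part of metbit.
    """
    return [v / total_variance for v in component_variances]

# Made-up variances of the first three components; total variance is 8.0
component_variances = [4.0, 2.0, 1.0]
ratio = explained_variance_ratio(component_variances, total_variance=8.0)

# Running total, as shown in a cumulative variance plot
cumulative = list(accumulate(ratio))
```

Here the first three components account for 87.5% of the total variance; the remainder belongs to components that were not kept.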
plot_pca_scores(pc, color_, color_dict, symbol_, symbol_dict, marker_label, fig_height, fig_width, marker_size, marker_opacity, font_size, title_font_size, individual_ellipse, legend_name)

Visualise PCA scores plot

Parameters
  • pc: list, default=['PC1', 'PC2'] List of principal components to plot.
  • color_: array-like, shape (n_samples,), default=None Values used to color each sample, where n_samples is the number of samples.
  • color_dict: dict, default=None Dictionary mapping each group to a color.
  • symbol_: array-like, shape (n_samples,), default=None Values used to choose the marker symbol of each sample, where n_samples is the number of samples.
  • symbol_dict: dict, default=None Dictionary mapping each group to a marker symbol.
  • fig_height: int, default=900 Height of figure.
  • fig_width: int, default=1300 Width of figure.
  • marker_size: int, default=35 Size of marker.
  • marker_opacity: float, default=0.7 Opacity of marker.
  • text_: array-like, shape (n_samples,), default=None Text to display on each point.
Returns
  • fig: plotly.graph_objects.Figure PCA scores plot.
plot_loading_(pc, fig_height, fig_width, font_size, title_font_size, marker_size, x_axis_title, xaxis_direction)

Visualise PCA loadings

Parameters
  • pc: list, default=['PC1', 'PC2'] Principal components to plot.
  • fig_height: int, default=600 Height of figure.
  • fig_width: int, default=1800 Width of figure.
Returns
  • fig: plotly.graph_objects.Figure Plotly figure.
plot_pca_trajectory(time_, time_order, stat_, pc, color_dict, symbol_dict, fig_height, fig_width, marker_size, marker_opacity, title_font_size, font_size, legend_name)
plot_3d_pca(pc, color_, color_dict, symbol_, symbol_dict, fig_height, fig_width, marker_size, marker_opacity, marker_label, font_size, title_font_size, legend_name)