metbit.lazy_opls_da
Analysis and models module in metbit 8.4.4.
import metbit.lazy_opls_da
Classes
lazy_opls_da
Parameters:
• data (pd.DataFrame): DataFrame containing the dataset.
• groups (list): List of class labels for each data sample.
• working_dir (str): Directory path for storing output files.
• feature_names (list, optional): Names of features, defaults to None.
• n_components (int, optional): Number of components for OPLS-DA, defaults to 2.
• scaling (str, optional): Scaling method ('pareto'), defaults to 'pareto'.
• estimator (str, optional): Model estimator, defaults to 'opls'.
• kfold (int, optional): Number of folds in cross-validation, defaults to 3.
• random_state (int, optional): Random seed, defaults to 94.
• auto_ncomp (bool, optional): Automatically choose the optimal number of components, defaults to True.
• permutation (bool, optional): Conduct permutation tests, defaults to True.
• VIP (bool, optional): Calculate VIP scores, defaults to True.
• linear_regression (bool, optional): Conduct linear regression analysis, defaults to True.
Returns:
• A printout of the model summary, including the project name, dataset information, configuration, and directory paths.
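The default scaling value, 'pareto', conventionally means mean-centering each feature and dividing it by the square root of its standard deviation. The sketch below illustrates that definition on a single feature column using only the standard library. It is not metbit's own code: the helper name pareto_scale is hypothetical, and the use of the sample standard deviation (n - 1) is an assumption that the library may not share.

```python
import math
import statistics


def pareto_scale(column):
    """Mean-center one feature column and divide by the square root of its SD."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)  # assumption: sample standard deviation (n - 1)
    if sd == 0:
        # A constant feature carries no variance; return zeros instead of dividing by zero.
        return [0.0 for _ in column]
    return [(value - mean) / math.sqrt(sd) for value in column]


feature = [2.0, 4.0, 4.0, 6.0, 9.0]
scaled = pareto_scale(feature)
```

After Pareto scaling, the column has mean zero and a standard deviation equal to the square root of the original one, so high-variance features are damped without being forced to unit variance.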
fit Method
Fits the OPLS-DA model to the dataset, generates plots, and saves them to the output directory.
Parameters:
• marker_color (dict, optional): Dictionary mapping groups to colors.
• custom_color (list, optional): Custom color grouping.
• custom_shape (list, optional): Custom shape grouping.
• symbol_dict (dict, optional): Dictionary mapping groups to marker symbols.
• custom_legend_name (list, optional): Custom names for the legend, defaults to ['Group', 'Sub-group'].
• marker_label (str or None, optional): Specifies marker labels ('class', 'group', or 'sub-group').
• marker_size (int or None, optional): Size of markers in plots.
• marker_opacity (float or None, optional): Opacity level of markers in plots.
• individual_ellipse (bool, optional): Option to display individual ellipses for each group.
Returns:
• A message indicating the model fitting was successful.
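A typical call sequence might look like the sketch below. It assumes that the constructor and fit accept the parameters listed above as keyword arguments and that the class can be imported from the metbit.lazy_opls_da module. The helper names build_config and run_lazy_opls_da are hypothetical, and the marker values are illustrative only.

```python
def build_config(data, groups, working_dir):
    """Collect the constructor arguments with the defaults documented above."""
    return {
        "data": data,                # pd.DataFrame, one row per sample
        "groups": groups,            # one class label per sample
        "working_dir": working_dir,  # output files are written under this path
        "feature_names": None,
        "n_components": 2,
        "scaling": "pareto",
        "estimator": "opls",
        "kfold": 3,
        "random_state": 94,
        "auto_ncomp": True,
        "permutation": True,
        "VIP": True,
        "linear_regression": True,
    }


def run_lazy_opls_da(data, groups, working_dir):
    """Build the model and fit it; requires metbit to be installed."""
    # Imported inside the function so the sketch loads without metbit present.
    # Assumption: the class is importable from this module path.
    from metbit.lazy_opls_da import lazy_opls_da

    model = lazy_opls_da(**build_config(data, groups, working_dir))
    # Illustrative plot options taken from the fit parameter list.
    model.fit(marker_size=12, marker_opacity=0.8, individual_ellipse=True)
    return model
```

With metbit installed, run_lazy_opls_da(df, labels, "results") would be called with a DataFrame df and a list labels whose length equals the number of rows in df.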
Directory and Project Setup
Creates the necessary folders in the working directory based on project needs (e.g., for VIP score plots, permutation scores, etc.). Paths are stored in a dictionary (self.path).
Directories Created:
• working_dir/project_name/element/plots/... for different plots.
• working_dir/project_name/element/data/... for data outputs.
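The layout above can be approximated with pathlib, as in the sketch below. It is not the library's implementation: the function make_project_dirs, the dictionary key format, and the sub-folder names ("vip_scores", "permutation") are hypothetical stand-ins for whatever the elided parts of the paths contain, and "element" is passed through as a plain path component.

```python
import tempfile
from pathlib import Path


def make_project_dirs(working_dir, project_name, element, outputs):
    """Create plots/ and data/ sub-folders and return them in a path dictionary."""
    base = Path(working_dir) / project_name / element
    paths = {}
    for kind in ("plots", "data"):
        for name in outputs:  # hypothetical output names, one folder each
            folder = base / kind / name
            folder.mkdir(parents=True, exist_ok=True)
            paths[f"{kind}/{name}"] = folder
    return paths


# Demonstrate inside a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    paths = make_project_dirs(tmp, "demo_project", "element", ["vip_scores", "permutation"])
    created = sorted(paths)
    all_exist = all(folder.is_dir() for folder in paths.values())
```

Keeping every path in one dictionary, as self.path does, lets later plotting and export steps look up their target folder by key instead of rebuilding it.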
Plotting and Saving Data
1. Score Plot: Generates OPLS-DA score plots for each group.
2. Loading Plot: Generates and saves loading plots.
3. S Plot: Generates and saves S-score plots.
4. VIP Score Plot: Generates VIP score plots and saves VIP scores as CSV if VIP=True.
5. Permutation Test Plot: Conducts permutation tests and saves permutation scores as CSV if permutation=True.
6. Volcano Plot (Linear Regression): Generates volcano plot and saves data if linear_regression=True.
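The idea behind the permutation test in step 5 can be sketched with the standard library alone. The example shuffles class labels and counts how often the shuffled labels separate the groups at least as well as the real ones. As a deliberate simplification, it uses the gap between group means of a single score as the statistic, not the cross-validated model metrics that an OPLS-DA permutation test would normally refit and compare. The function names and toy data are hypothetical.

```python
import random
import statistics


def group_gap(scores, labels):
    """Absolute difference between the mean scores of the two classes."""
    classes = sorted(set(labels))
    first = [s for s, g in zip(scores, labels) if g == classes[0]]
    second = [s for s, g in zip(scores, labels) if g == classes[1]]
    return abs(statistics.fmean(first) - statistics.fmean(second))


def permutation_p_value(scores, labels, n_permutations=200, seed=94):
    """Return the observed statistic and its empirical permutation p-value."""
    rng = random.Random(seed)
    observed = group_gap(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between scores and labels
        if group_gap(scores, shuffled) >= observed:
            hits += 1
    # Add one to both counts so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_permutations + 1)


# Toy predictive scores for two well-separated classes (made-up values).
scores = [-2.1, -1.8, -2.4, -1.9, -2.2, -2.0, 1.9, 2.3, 2.0, 1.7, 2.2, 2.1]
labels = ["control"] * 6 + ["treated"] * 6
observed, p_value = permutation_p_value(scores, labels)
```

A small p-value indicates that the observed separation is unlikely to arise from randomly assigned labels, which is the question the permutation plot is meant to answer.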