metbit.utility
Statistics and utilities module in metbit 8.1.0.
import metbit.utility
Classes
lazypair
Methods
__init__(self, dataset, column_name)
get_index(self)
get_name(self)
get_meta(self)
get_column_name(self)
get_dataset(self)
gen_page
Methods
__init__(self, data_path)
This function takes in the path to the data folder and returns the HTML files for the OPLS-DA plots.
Parameters
data_path : str
    The path to the data folder.
Examples
gen_page(data_path).get_files()
get_files(self)
oplsda_path
Methods
__init__(self, data_path)
make_path(self)
get_path(self)
Normality_distribution
Methods
__init__(self, data: pd.DataFrame)
plot_distribution(self, feature)
pca_distributions(self)
Normalise
Methods
__init__(self, data: pd.DataFrame, compute_missing: bool=True)
pqn_normalise(self, ref_index: list=None, plot: bool=True)
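The source gives no description for pqn_normalise. Assuming the name refers to probabilistic quotient normalisation, the idea can be sketched as below. The function name pqn_sketch, the rows-as-samples layout, and the reading of ref_index as a list of reference row labels are all assumptions, not confirmed metbit behaviour. The classic formulation usually applies a total-area normalisation first, which this sketch omits, and a zero in the reference profile would produce infinite quotients.

```python
import pandas as pd


def pqn_sketch(data: pd.DataFrame, ref_index=None) -> pd.DataFrame:
    """Illustrative PQN; rows are samples, columns are features (assumed layout)."""
    # Reference profile: feature-wise median over the reference rows
    # (all rows when ref_index is None).
    ref_rows = data if ref_index is None else data.loc[ref_index]
    reference = ref_rows.median(axis=0)
    # Quotient of every value against the reference profile.
    quotients = data.div(reference, axis=1)
    # One dilution factor per sample: the median of its quotients.
    dilution = quotients.median(axis=1)
    # Divide each sample (row) by its own dilution factor.
    return data.div(dilution, axis=0)


# Sample "s2" is "s1" diluted two-fold; the sketch should bring them together.
spectra = pd.DataFrame(
    {"f1": [2.0, 1.0, 2.0], "f2": [4.0, 2.0, 4.0], "f3": [6.0, 3.0, 6.0]},
    index=["s1", "s2", "s3"],
)
normalised = pqn_sketch(spectra)
```

Using the median of the quotients rather than their mean keeps the dilution estimate robust to a few features that genuinely differ between samples.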
decimal_place_normalisation(self, decimals: int=2)
This function returns the dataframe with values rounded to a specified number of decimal places.
Parameters
decimals : int
    The number of decimal places to round to.
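A minimal sketch of the rounding described above, using plain pandas rather than the metbit class. Whether metbit calls DataFrame.round internally is an assumption.

```python
import pandas as pd

# Toy data; the column names are made up for illustration.
df = pd.DataFrame({"a": [1.23456, 2.71828], "b": [10.0, 3.14159]})

# Round every numeric value to two decimal places.
rounded = df.round(2)
```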
z_score_normalisation(self)
This function returns the dataframe normalized using Z-Score.
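As a rough illustration of Z-score scaling, the sketch below centres each column on its mean and divides by its standard deviation. Column-wise scaling and the sample standard deviation (ddof=1, the pandas default) are assumptions; the source does not state which convention metbit uses.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

# Subtract the column mean, then divide by the column standard deviation.
# DataFrame.std() uses ddof=1 unless told otherwise.
z = (df - df.mean()) / df.std()
```

After this step every column has mean 0 and unit sample standard deviation, so features measured on different scales become comparable.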
linear_normalisation(self)
This function returns the dataframe normalized using Min-Max (linear normalization).
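Min-Max scaling maps each column onto the interval [0, 1]. The sketch below shows the arithmetic in plain pandas; applying it per column is an assumption, and a constant column would divide by zero.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 40.0]})

# (x - min) / (max - min), computed column by column.
col_min = df.min()
col_range = df.max() - col_min
scaled = (df - col_min) / col_range
```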
normalize_to_100(self)
This function returns the dataframe with values normalized to 100.
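The source does not say what is scaled to 100. One common reading, assumed here, is that each sample (row) is rescaled so its values sum to 100. The sketch shows only that interpretation and may not match what metbit does.

```python
import pandas as pd

df = pd.DataFrame({"f1": [1.0, 2.0], "f2": [3.0, 2.0]}, index=["s1", "s2"])

# Divide each row by its own total, then scale so the row sums to 100.
row_totals = df.sum(axis=1)
percent = df.div(row_totals, axis=0) * 100
```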
clipping_normalisation(self, lower: float, upper: float)
This function returns the dataframe with values clipped to the specified range.
Parameters
lower : float
    The lower bound for clipping.
upper : float
    The upper bound for clipping.
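Clipping replaces values below lower with lower and values above upper with upper, leaving everything in between untouched. A minimal sketch with pandas follows; whether metbit delegates to DataFrame.clip is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"a": [-5.0, 0.5, 12.0], "b": [3.0, 10.0, 0.0]})

# Values outside [0, 10] are pulled back to the nearest bound.
clipped = df.clip(lower=0.0, upper=10.0)
```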
standard_deviation_normalisation(self)
This function returns the dataframe normalized using Standard Deviation.
univar_stats
A class for generating univariate box or violin plots with statistical annotations using Plotly. Supports group-wise comparisons using t-tests, ANOVA, or nonparametric tests, along with effect size calculation and multiple testing correction.
Parameters
df : pandas.DataFrame
    The input dataframe containing numeric and grouping columns.
x_col : str
    Column name used for group/category (x-axis).
y_col : str
    Column name used for values (y-axis).
group_order : list of str, optional
    Custom ordering of groups on the x-axis. Defaults to the order in the dataframe.
custom_colors : dict, optional
    Dictionary mapping group names to Plotly color codes.
stats_options : list of str, optional
    Statistical tests to perform. Choices:
    - 't-test' : independent two-sample t-test
    - 'anova' : one-way ANOVA
    - 'nonparametric' : Mann-Whitney U test
    - 'effect-size' : computes Cohen's d
p_value_threshold : float, default=0.05
    Threshold for marking comparisons as significant.
annotate_style : str, default="value"
    How to display p-values. Options:
    - 'value' : show exact p-values (e.g., p=0.0031)
    - 'symbol' : show significance level (*, **, ***) or 'ns'
y_offset_factor : float, default=0.35
    Controls spacing between stacked annotation lines (relative to y-axis range).
show_non_significant : bool, default=True
    If False, non-significant comparisons are hidden from the plot.
correct_p : str or None, default="bonferroni"
    Method for multiple testing correction (e.g., "bonferroni", "fdr_bh", or None).
title_ : str, optional
    Title for the plot. Defaults to y_col.
y_label : str, optional
    Custom label for the y-axis. Defaults to y_col.
x_label : str, optional
    Custom label for the x-axis. Defaults to x_col.
fig_height : int, default=800
    Height of the figure in pixels.
fig_width : int, default=600
    Width of the figure in pixels.
plot_type : str, default="box"
    Type of plot. Choices:
    - "box"
    - "violin"
show_axis_lines : bool, default=True
    Whether to show border lines on axes.
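To make the interplay of correct_p, p_value_threshold and annotate_style concrete, here is a small sketch of Bonferroni correction followed by the mapping to significance symbols. The helper names bonferroni and to_symbol are hypothetical, and the star cut-offs (0.01 and 0.001) are the conventional ones, not values confirmed by the source. The method names "bonferroni" and "fdr_bh" resemble those accepted by statsmodels' multipletests, but the source does not say which implementation is used.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]


def to_symbol(p, threshold=0.05):
    """Map a p-value to a star label (cut-offs are conventional, assumed)."""
    if p >= threshold:
        return "ns"
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    return "*"


# Three pairwise comparisons, e.g. A vs B, A vs C, B vs C.
raw = [0.0002, 0.004, 0.03]
adjusted = bonferroni(raw)
labels = [to_symbol(p) for p in adjusted]
```

With three comparisons each raw p-value is tripled, so a comparison that looks significant at 0.03 no longer passes the 0.05 threshold after correction.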
Attributes
df : pandas.DataFrame
    The input data.
plot() : plotly.graph_objects.Figure
    Generates the interactive annotated plot.
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from metbit.utility import univar_stats
>>> # Create mock data
>>> df = pd.DataFrame({
...     "group": np.repeat(["A", "B", "C"], 30),
...     "value": np.concatenate([
...         np.random.normal(5, 1, 30),
...         np.random.normal(6, 1, 30),
...         np.random.normal(7, 1, 30)
...     ])
... })
>>> # Initialize and plot
>>> plotter = univar_stats(
...     df, x_col="group", y_col="value",
...     stats_options=["t-test", "effect-size"],
...     annotate_style="symbol", plot_type="box",
...     show_non_significant=False
... )
>>> fig = plotter.plot()
>>> fig.show()
Methods
__init__(self, df: pd.DataFrame, x_col: str, y_col: str, group_order: Optional[List[str]]=None, custom_colors: Optional[Dict[str, str]]=None, stats_options: Optional[List[str]]=None, p_value_threshold: float=0.05, annotate_style: str='value', y_offset_factor: float=0.35, show_non_significant: bool=True, correct_p: Optional[str]='bonferroni', title_: Optional[str]=None, y_label: Optional[str]=None, x_label: Optional[str]=None, fig_height: int=800, fig_width: int=600, plot_type: str='box', show_axis_lines: bool=True)
compute_effsize(a, b, eftype: str='cohen')
Compute effect size (Cohen's d).
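For reference, Cohen's d for two independent samples is the difference in means divided by the pooled standard deviation. The sketch below is a standalone illustration of that formula, not the metbit implementation; the function name cohen_d is hypothetical.

```python
import numpy as np


def cohen_d(a, b):
    """Cohen's d for two independent samples, using the pooled sample SD."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = a.size, b.size
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)
    ) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


# Means 4 and 3, both sample variances 4, so the pooled SD is 2.
d = cohen_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
```

The sign of d follows the argument order: swapping a and b flips it.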