You are viewing the documentation for metbit 9.1.0.

metbit.nmr.preprocess

NMR and preprocessing module in metbit 9.1.0.

import metbit.nmr.preprocess

Classes

nmr_preprocessing

A class for preprocessing NMR data. This class handles the following preprocessing steps:

1. Reading FID files
2. Zero-filling
3. Fourier Transform
4. Phasing
5. Baseline correction
6. Calibration
7. Data storage in a pandas DataFrame
8. Data visualization
9. Data export
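Steps 2 and 3 can be illustrated without any NMR library. The following is a minimal sketch, not metbit's implementation: `zero_fill` and `dft` are hypothetical helpers, the FID is synthetic, and the naive transform stands in for the FFT a real pipeline would use.

```python
import cmath
import math


def zero_fill(fid, size):
    """Pad a complex FID with zeros up to `size` points (hypothetical helper)."""
    if size < len(fid):
        raise ValueError("size must be at least len(fid)")
    return list(fid) + [0j] * (size - len(fid))


def dft(signal):
    """Naive O(N^2) discrete Fourier transform; only suitable for tiny inputs."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


# Synthetic FID: one decaying complex exponential at 0.25 cycles per point
n_points = 32
fid = [
    cmath.exp(2j * math.pi * 0.25 * t) * math.exp(-t / 10.0)
    for t in range(n_points)
]

padded = zero_fill(fid, 64)   # zero-filling: 32 points become 64
spectrum = dft(padded)        # Fourier transform of the padded FID

# The resonance appears at bin 0.25 * 64 = 16 of the zero-filled spectrum
peak_index = max(range(len(spectrum)), key=lambda k: abs(spectrum[k]))
```

Zero-filling does not add information; it interpolates the spectrum onto a finer frequency grid, which is why the peak position scales with the padded length.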

Parameters:

data_path : str

Path to the directory containing FID files.

bin_size : float

Size of the bins for zero-filling (default: 0.0003).

auto_phasing : bool

If True, automatic phasing is applied (default: False, as given in the __init__ signature).

fn_ : str

Function name for phasing (default: 'acme').

baseline_correction : bool

If True, baseline correction is applied (default: True).

baseline_type : str

Type of baseline correction to apply (default: 'linear').

Options: 'corrector', 'constant', 'explicit', 'median', 'solvent filter'.

calibration : bool

If True, calibration is applied (default: True).

calib_type : str

Type of calibration to apply (default: 'tsp').

Options: 'tsp', 'acetate', 'glucose', 'alanine', 'formate'.

custom_range : tuple

Optional (start, end) PPM range for custom calibration.

export_path : str

Path to save the processed data (default: None).

export_format : str

Format to save the processed data (default: 'csv').

export_name : str

Name of the exported file (default: 'processed_nmr_data').
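To make the calibration parameters concrete, here is a minimal sketch of reference-peak calibration, assuming it works by locating the tallest point inside a PPM window and shifting the axis so that point lands on a target shift. `calibrate_ppm` is a hypothetical helper written for illustration; the way metbit actually applies `calib_type`, `custom_range`, and `custom_target` may differ.

```python
def calibrate_ppm(ppm, intensities, window, target=0.0):
    """Shift a PPM axis so the tallest point inside `window` lands on `target`.

    Hypothetical helper: `window` is a (start, end) PPM range and `target`
    is the chemical shift the reference peak should have (0.0 for TSP).
    """
    lo, hi = min(window), max(window)
    in_window = [i for i, p in enumerate(ppm) if lo <= p <= hi]
    if not in_window:
        raise ValueError("no points fall inside the calibration window")
    # Index of the most intense point within the window
    peak = max(in_window, key=lambda i: intensities[i])
    shift = target - ppm[peak]
    return [p + shift for p in ppm], shift


# Toy spectrum whose reference peak sits 0.05 ppm away from 0.0
ppm = [0.10, 0.05, 0.00, -0.05, -0.10]
intensities = [0.1, 5.0, 0.3, 0.2, 0.1]
calibrated, shift = calibrate_ppm(ppm, intensities, window=(-0.1, 0.1))
```

Only the axis moves in this sketch; the intensities are untouched, so the spectrum keeps its shape.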

Attributes:

nmr_data : pd.DataFrame

Processed NMR data.

ppm : np.ndarray

PPM scale.

dic_array : dict

Dictionary containing metadata from the FID files.

phase_data : pd.DataFrame

DataFrame containing phase information.

Methods:

get_data() : pd.DataFrame
    Returns the processed NMR data.
get_ppm() : np.ndarray
    Returns the PPM scale.
get_metadata() : dict
    Returns the metadata from the FID files.
get_phase() : pd.DataFrame
    Returns the phase information.
plot_data() : None
    Plots the processed NMR data.
export_data() : None
    Exports the processed NMR data to a specified format.

Example:

>>> from metbit.nmr.preprocess import nmr_preprocessing
>>> fid = 'dev/launch/data/test_nmr_data'
>>> nmr = nmr_preprocessing(fid, bin_size=0.0005, auto_phasing=False,
...                         fn_='acme', baseline_correction=True,
...                         baseline_type='corrector', calibration=True,
...                         calib_type='glucose')
>>> data = nmr.get_data()
>>> ppm = nmr.get_ppm()
>>> metadata = nmr.get_metadata()

Methods

__init__(self, data_path: str, bin_size: float=0.0003, auto_phasing: bool=False, fn_='acme', baseline_correction: bool=True, baseline_type: str='linear', calibration: bool=True, calib_type: str='tsp', custom_range: tuple | None=None, custom_target: float | None=None, align: bool=False, align_reference: str='median', align_max_shift_ppm: float=0.02, align_top_n: int=30, align_windows: list[tuple[float, float]] | None=None)

get_data(self, flip_data=True)

Return the processed NMR data as a DataFrame.

Args:

flip_data: If True (default), reverse the column order so that the PPM axis runs from low to high field (left to right).

Returns: pd.DataFrame: Processed NMR spectra with PPM values as column headers and sample identifiers as the index.

Examples:

>>> nmr = nmr_preprocessing('data/cohort_fids', calib_type='tsp')
>>> spectra = nmr.get_data()
>>> print(spectra.shape)
(20, 39936)

get_ppm(self)

Return the PPM scale array for the processed spectra.

Returns: numpy.ndarray: 1-D array of PPM values corresponding to the columns of the DataFrame returned by get_data().

Examples:

>>> nmr = nmr_preprocessing('data/cohort_fids', calib_type='tsp')
>>> ppm = nmr.get_ppm()
>>> print(ppm.min(), ppm.max())
-3.012 11.987

get_metadata(self)

Return the Bruker acquisition metadata for all processed samples.

Returns: dict: Mapping of sample folder name to its nmrglue parameter dictionary (dic) from the Bruker read step.

Examples:

>>> nmr = nmr_preprocessing('data/cohort_fids', calib_type='tsp')
>>> meta = nmr.get_metadata()
>>> sample_key = list(meta.keys())[0]
>>> print(meta[sample_key]['acqus']['SFO1'])
600.13
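Because the return value is a plain nested dictionary, one acquisition parameter can be gathered across all samples with ordinary dict operations. The sketch below uses a hand-built stand-in for the metadata (the values are made up) and a hypothetical helper, `collect_parameter`, that tolerates samples where the parameter is missing.

```python
# Stand-in for the dict returned by get_metadata(); values are illustrative only
meta = {
    "sample_001": {"acqus": {"SFO1": 600.13, "NS": 64}},
    "sample_002": {"acqus": {"SFO1": 600.13, "NS": 128}},
    "sample_003": {"acqus": {}},  # a sample where the parameter is absent
}


def collect_parameter(metadata, name, section="acqus"):
    """Return {sample: value} for one parameter, with None where it is absent."""
    return {
        sample: dic.get(section, {}).get(name)
        for sample, dic in metadata.items()
    }


# Number of scans per sample, keyed by sample folder name
scans = collect_parameter(meta, "NS")
```

Using `.get` with a default avoids a `KeyError` when a cohort mixes acquisitions with different parameter sets.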

get_phase(self)

Return the phase correction values applied to each spectrum.

Returns: pd.DataFrame: DataFrame with columns 'p0' and 'p1' containing the zero-order and first-order phase angles (degrees) used for each sample.

Examples:

>>> nmr = nmr_preprocessing('data/cohort_fids', auto_phasing=True)
>>> phase_df = nmr.get_phase()
>>> print(phase_df.head())
     p0     p1
1  12.3 -245.1
2   9.8 -231.4

Functions

read_fid(data_path: str)

Read a Bruker FID file from the given directory.

Args:

data_path: Path to the Bruker FID directory.

Returns: tuple: (dic, data) where dic is the parameter dictionary and data is the raw FID array.

Examples:

>>> dic, data = read_fid('data/sample_001')
>>> print(type(data))
<class 'numpy.ndarray'>
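For intuition about what the raw FID array holds, the sketch below decodes a byte buffer of interleaved real and imaginary values into complex points. It assumes 32-bit integers, which is a common Bruker layout but not the only one; the actual data type and byte order are described by the acquisition parameters, and `read_fid` handles them for you. `unpack_interleaved_fid` is a hypothetical helper, not part of metbit.

```python
import struct


def unpack_interleaved_fid(raw_bytes, little_endian=True):
    """Decode interleaved 32-bit integer (real, imag) pairs into complex points."""
    count = len(raw_bytes) // 4  # number of whole 4-byte integers available
    fmt = ("<" if little_endian else ">") + f"{count}i"
    values = struct.unpack(fmt, raw_bytes[: count * 4])
    # Even positions hold real parts, odd positions hold imaginary parts
    return [complex(re, im) for re, im in zip(values[0::2], values[1::2])]


# Three complex points packed the way the sketch expects to read them
raw = struct.pack("<6i", 10, -2, 7, 3, -1, 0)
points = unpack_interleaved_fid(raw)
```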

remove_digital_filter(dic, data)

Remove the Bruker digital filter from the FID data.

Args:

dic: Bruker parameter dictionary returned by read_fid.

data: Raw FID array returned by read_fid.

Returns: numpy.ndarray: FID data with the digital filter removed.

Examples:

>>> dic, data = read_fid('data/sample_001')
>>> data = remove_digital_filter(dic, data)
>>> print(data.shape)
(65536,)

generate_ppm_scale(dic, data)

Generate a PPM scale array for a processed NMR spectrum.

Args:

dic: Bruker parameter dictionary containing acquisition parameters.

data: Processed spectrum array (used only to determine the number of points).

Returns: numpy.ndarray: PPM values corresponding to each data point, running from high to low PPM.

Examples:

>>> dic, raw = read_fid('data/sample_001')
>>> raw = remove_digital_filter(dic, raw)
>>> ppm = generate_ppm_scale(dic, raw)
>>> print(ppm[0], ppm[-1])
11.9873 -3.0124
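The arithmetic behind a PPM axis is short enough to show directly. This is a sketch under stated assumptions rather than the code generate_ppm_scale runs: `ppm_scale` and its arguments (sweep width in Hz, observe frequency in MHz, carrier position in PPM) are hypothetical names, and the axis is assumed to be centred on the carrier.

```python
def ppm_scale(n_points, sweep_width_hz, obs_freq_mhz, carrier_ppm):
    """Build a descending PPM axis centred on the carrier (hypothetical helper)."""
    sw_ppm = sweep_width_hz / obs_freq_mhz   # Hz divided by MHz gives PPM
    first = carrier_ppm + sw_ppm / 2.0       # highest PPM value, at index 0
    step = sw_ppm / n_points                 # PPM spacing between points
    return [first - i * step for i in range(n_points)]


# A 6000 Hz window at 600 MHz spans 10 ppm around a 4.7 ppm carrier
scale = ppm_scale(n_points=8, sweep_width_hz=6000.0, obs_freq_mhz=600.0,
                  carrier_ppm=4.7)
```

The resulting list starts at the highest PPM value and decreases, matching the ordering of `ppm[0]` and `ppm[-1]` in the example output.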

phasing(data, index, auto=True, fn='peak_minima', p0=0.0, p1=0.0)

Apply phase correction to an NMR spectrum at the given index.

Args:

data: 2-D array of spectra where rows are individual spectra.

index: Row index of the spectrum to phase.

auto: If True, automatic phasing is applied using nmrglue autops (default: True).

fn: Algorithm name passed to autops (default: 'peak_minima').

p0: Zero-order phase correction in degrees (default: 0.0).

p1: First-order phase correction in degrees (default: 0.0).

Returns: numpy.ndarray: The data array with the spectrum at *index* phase-corrected in place.

Examples:

>>> import numpy as np
>>> data = np.random.randn(5, 65536) + 1j * np.random.randn(5, 65536)
>>> data = phasing(data, index=0, auto=True, fn='peak_minima')
>>> print(data.shape)
(5, 65536)
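The roles of p0 and p1 are easier to see in code. The sketch below applies the conventional linear phase model, in which point k of an N-point spectrum is multiplied by exp(i(p0 + p1·k/N)) with both angles converted from degrees to radians. `apply_phase` is a hypothetical helper, and whether phasing uses exactly this sign convention is an assumption.

```python
import cmath
import math


def apply_phase(spectrum, p0=0.0, p1=0.0):
    """Apply zero- and first-order phase correction given in degrees."""
    n = len(spectrum)
    p0_rad = math.radians(p0)
    p1_rad = math.radians(p1)
    # p0 rotates every point equally; p1 adds a rotation that grows with k
    return [
        value * cmath.exp(1j * (p0_rad + p1_rad * k / n))
        for k, value in enumerate(spectrum)
    ]


# A purely imaginary signal becomes purely real after a -90 degree rotation
spectrum = [1j, 1j, 1j, 1j]
phased = apply_phase(spectrum, p0=-90.0)
```

With p1 left at zero the correction is the same for every point, which is why a single p0 value is enough to rotate this toy signal onto the real axis.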

Source

metbit/nmr/preprocess.py at v9.1.0
