Classes
nmr_preprocessing
A class for preprocessing NMR data. This class handles the following preprocessing steps:
1. Reading FID files
2. Zero-filling
3. Fourier Transform
4. Phasing
5. Baseline correction
6. Calibration
7. Data storage in a pandas DataFrame
8. Data visualization
9. Data export
Parameters:
data_path (str): Path to the directory containing FID files.
bin_size (float): Size of the bins for zero-filling (default: 0.0003).
auto_phasing (bool): If True, automatic phasing is applied (default: False, per the __init__ signature).
fn_ (str): Function name for phasing (default: 'acme').
baseline_correction (bool): If True, baseline correction is applied (default: True).
baseline_type (str): Type of baseline correction to apply (default: 'linear'). Options: 'corrector', 'constant', 'explicit', 'median', 'solvent filter'.
calibration (bool): If True, calibration is applied (default: True).
calib_type (str): Type of calibration to apply (default: 'tsp'). Options: 'tsp', 'acetate', 'glucose', 'alanine', 'formate'.
custom_range (tuple): Optional (start, end) PPM range for custom calibration.
export_path (str): Path to save the processed data (default: None).
export_format (str): Format to save the processed data (default: 'csv').
export_name (str): Name of the exported file (default: 'processed_nmr_data').
Attributes:
nmr_data (pd.DataFrame): Processed NMR data.
ppm (np.ndarray): PPM scale.
dic_array (dict): Dictionary containing metadata from the FID files.
phase_data (pd.DataFrame): DataFrame containing phase information.
Methods:
get_data() : pd.DataFrame
    Returns the processed NMR data.
get_ppm() : np.ndarray
    Returns the PPM scale.
get_metadata() : dict
    Returns the metadata from the FID files.
get_phase() : pd.DataFrame
    Returns the phase information.
plot_data() : None
    Plots the processed NMR data.
export_data() : None
    Exports the processed NMR data to a specified format.
Example:
>>> fid = 'dev/launch/data/test_nmr_data'
>>> nmr = nmr_preprocessing(fid, bin_size=0.0005, auto_phasing=False, fn_='acme', baseline_correction=True, baseline_type='corrector', calibration=True, calib_type='glucose')
>>> data = nmr.get_data()
>>> ppm = nmr.get_ppm()
>>> metadata = nmr.get_metadata()
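The zero-filling and Fourier transform steps listed in the class description can be illustrated on a synthetic FID. This is a minimal NumPy sketch under stated assumptions, not the class's actual implementation: the helper name `zero_fill_and_transform`, the sampling settings, and the synthetic signal are all invented for illustration.

```python
import numpy as np


def zero_fill_and_transform(fid: np.ndarray, target_size: int) -> np.ndarray:
    """Zero-fill a 1-D complex FID to target_size points and Fourier transform it.

    Hypothetical helper for illustration; not part of nmr_preprocessing.
    """
    if target_size < fid.size:
        raise ValueError("target_size must be at least the FID length")
    padded = np.zeros(target_size, dtype=complex)
    padded[: fid.size] = fid  # acquired points first, zeros appended after them
    # fftshift moves the zero-frequency component to the centre of the spectrum
    return np.fft.fftshift(np.fft.fft(padded))


# Synthetic decaying FID: a single resonance at +50 Hz, sampled every 1 ms
n_points, dwell = 1024, 1e-3
t = np.arange(n_points) * dwell
fid = np.exp(2j * np.pi * 50.0 * t) * np.exp(-t / 0.2)

spectrum = zero_fill_and_transform(fid, target_size=2048)
freqs = np.fft.fftshift(np.fft.fftfreq(2048, d=dwell))
peak_hz = freqs[np.argmax(np.abs(spectrum))]  # frequency of the tallest point
```

Zero-filling does not add information; it interpolates the spectrum onto a finer frequency grid, which is why the peak is located to within one (smaller) frequency bin of 50 Hz.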
Methods
__init__(self, data_path: str, bin_size: float=0.0003, auto_phasing: bool=False, fn_='acme', baseline_correction: bool=True, baseline_type: str='linear', calibration: bool=True, calib_type: str='tsp', custom_range: tuple | None=None, custom_target: float | None=None, align: bool=False, align_reference: str='median', align_max_shift_ppm: float=0.02, align_top_n: int=30, align_windows: list[tuple[float, float]] | None=None)
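The `custom_range` and `custom_target` arguments suggest a calibration that searches a PPM window for a reference peak and shifts the axis so that peak lands on a target value. The sketch below shows one plausible way to do this; it is an assumption about the behaviour, not the package's code, and `calibrate_ppm` is a hypothetical helper.

```python
import numpy as np


def calibrate_ppm(ppm, spectrum, search_range, target_ppm):
    """Shift the PPM axis so the tallest peak in search_range lands on target_ppm.

    Hypothetical helper; the real calibration logic may differ.
    """
    lo, hi = min(search_range), max(search_range)
    mask = (ppm >= lo) & (ppm <= hi)
    if not mask.any():
        raise ValueError("search_range does not overlap the PPM axis")
    # Position of the tallest point inside the search window
    peak_ppm = ppm[mask][np.argmax(spectrum[mask])]
    return ppm - (peak_ppm - target_ppm)


# Synthetic spectrum: one Gaussian reference peak sitting at 0.03 ppm
ppm = np.linspace(12.0, -3.0, 1501)  # 0.01 ppm per point
spectrum = np.exp(-((ppm - 0.03) ** 2) / (2 * 0.005 ** 2))

calibrated = calibrate_ppm(ppm, spectrum, search_range=(-0.2, 0.2), target_ppm=0.0)
peak_after = calibrated[np.argmax(spectrum)]  # reference peak position after the shift
```

Only the axis is shifted here; the intensities are left untouched, so the correction is limited to a rigid offset.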
get_data(self, flip_data=True)
Return the processed NMR data as a DataFrame.
Args:
flip_data: If True (default), reverse the column order so that the PPM axis runs from low to high field (left to right).
Returns: pd.DataFrame: Processed NMR spectra with PPM values as column headers and sample identifiers as the index.
Examples:
>>> nmr = nmr_preprocessing('data/cohort_fids', calib_type='tsp')
>>> spectra = nmr.get_data()
>>> print(spectra.shape)
(20, 39936)
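The effect of `flip_data` can be pictured as a column reversal on a small DataFrame. This is a sketch that assumes the spectra are stored internally with PPM columns in descending order; the toy values and sample names are made up.

```python
import numpy as np
import pandas as pd

# Toy spectra: two samples, four points, PPM columns stored high to low
ppm = np.array([3.0, 2.0, 1.0, 0.0])
spectra = pd.DataFrame(
    [[0.1, 0.9, 0.3, 0.0], [0.2, 0.8, 0.4, 0.1]],
    index=["sample_1", "sample_2"],
    columns=ppm,
)

# Reverse the column order; labels travel with their intensities
flipped = spectra.iloc[:, ::-1]
```

Because the column labels move together with the values, label-based lookups such as `flipped.loc["sample_1", 2.0]` return the same intensity before and after the flip.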
get_ppm(self)
Return the PPM scale array for the processed spectra.
Returns: numpy.ndarray: 1-D array of PPM values corresponding to the columns of the DataFrame returned by get_data().
Examples:
>>> nmr = nmr_preprocessing('data/cohort_fids', calib_type='tsp')
>>> ppm = nmr.get_ppm()
>>> print(ppm.min(), ppm.max())
-3.012 11.987
get_metadata(self)
Return the Bruker acquisition metadata for all processed samples.
Returns:
dict: Mapping of sample folder name to its nmrglue parameter dictionary (dic) from the Bruker read step.
Examples:
>>> nmr = nmr_preprocessing('data/cohort_fids', calib_type='tsp')
>>> meta = nmr.get_metadata()
>>> sample_key = list(meta.keys())[0]
>>> print(meta[sample_key]['acqus']['SFO1'])
600.13
get_phase(self)
Return the phase correction values applied to each spectrum.
Returns: pd.DataFrame: DataFrame with columns 'p0' and 'p1' containing the zero-order and first-order phase angles (degrees) used for each sample.
Examples:
>>> nmr = nmr_preprocessing('data/cohort_fids', auto_phasing=True)
>>> phase_df = nmr.get_phase()
>>> print(phase_df.head())
     p0      p1
1  12.3  -245.1
2   9.8  -231.4
Functions
read_fid(data_path: str)
Read a Bruker FID file from the given directory.
Args:
data_path: Path to the Bruker FID directory.
Returns:
tuple: (dic, data) where dic is the parameter dictionary and data is the raw FID array.
Examples:
>>> dic, data = read_fid('data/sample_001')
>>> print(type(data))
<class 'numpy.ndarray'>
remove_digital_filter(dic, data)
Remove the Bruker digital filter from the FID data.
Args:
dic: Bruker parameter dictionary returned by read_fid.
data: Raw FID array returned by read_fid.
Returns: numpy.ndarray: FID data with the digital filter removed.
Examples:
>>> dic, data = read_fid('data/sample_001')
>>> data = remove_digital_filter(dic, data)
>>> print(data.shape)
(65536,)
generate_ppm_scale(dic, data)
Generate a PPM scale array for a processed NMR spectrum.
Args:
dic: Bruker parameter dictionary containing acquisition parameters.
data: Processed spectrum array (used only to determine the number of points).
Returns: numpy.ndarray: PPM values corresponding to each data point, running from the highest to the lowest PPM value.
Examples:
>>> dic, raw = read_fid('data/sample_001')
>>> raw = remove_digital_filter(dic, raw)
>>> ppm = generate_ppm_scale(dic, raw)
>>> print(ppm[0], ppm[-1])
11.9873 -3.0124
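A descending PPM axis of this kind can be built from three acquisition values: the spectral width in Hz, the observe frequency in MHz, and the PPM value at the centre of the spectrum. The sketch below assumes those values are already known; `ppm_axis` and its arguments are hypothetical and do not reflect how `generate_ppm_scale` reads them from the Bruker dictionary.

```python
import numpy as np


def ppm_axis(n_points, sw_hz, obs_mhz, center_ppm):
    """Build a descending PPM axis from basic acquisition parameters.

    All four arguments are hypothetical stand-ins for values that would
    normally be read from the Bruker parameter dictionary.
    """
    sw_ppm = sw_hz / obs_mhz  # Hz divided by MHz gives the width in ppm
    left_edge = center_ppm + sw_ppm / 2
    # One point per frequency bin, stepping downwards from the left edge
    return left_edge - np.arange(n_points) * (sw_ppm / n_points)


# 9000 Hz at 600 MHz is a 15 ppm window centred on 4.5 ppm
axis = ppm_axis(n_points=65536, sw_hz=9000.0, obs_mhz=600.0, center_ppm=4.5)
```

With these inputs the axis starts at 12 ppm and stops one bin short of -3 ppm, since the last bin's right edge is not itself a sampled point.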
phasing(data, index, auto=True, fn='peak_minima', p0=0.0, p1=0.0)
Apply phase correction to an NMR spectrum at the given index.
Args:
data: 2-D array of spectra where rows are individual spectra.
index: Row index of the spectrum to phase.
auto: If True, automatic phasing is applied using nmrglue autops (default: True).
fn: Algorithm name passed to autops (default: 'peak_minima').
p0: Zero-order phase correction in degrees (default: 0.0).
p1: First-order phase correction in degrees (default: 0.0).
Returns: numpy.ndarray: The data array with the spectrum at *index* phase-corrected in place.
Examples:
>>> import numpy as np
>>> data = np.random.randn(5, 65536) + 1j * np.random.randn(5, 65536)
>>> data = phasing(data, index=0, auto=True, fn='peak_minima')
>>> print(data.shape)
(5, 65536)
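When `auto` is False, the `p0` and `p1` arguments describe a manual correction. A common way to apply such a correction is to multiply the complex spectrum by a phase ramp that is constant for `p0` and linear across the points for `p1`. The sketch below illustrates that general technique with a hypothetical `apply_phase` helper; whether `phasing` uses exactly this form is an assumption.

```python
import numpy as np


def apply_phase(spectrum, p0_deg=0.0, p1_deg=0.0):
    """Apply zero- and first-order phase correction to a 1-D complex spectrum.

    p0_deg is a constant phase; p1_deg scales a phase that grows linearly
    across the spectrum, reaching nearly p1_deg at the last point.
    Hypothetical helper for illustration only.
    """
    p0 = np.deg2rad(p0_deg)
    p1 = np.deg2rad(p1_deg)
    n = spectrum.shape[-1]
    ramp = np.exp(1j * (p0 + p1 * np.arange(n) / n))
    return spectrum * ramp


rng = np.random.default_rng(0)
spec = rng.standard_normal(8) + 1j * rng.standard_normal(8)

rotated = apply_phase(spec, p0_deg=90.0)       # equivalent to multiplying by 1j
restored = apply_phase(rotated, p0_deg=-90.0)  # opposite angle undoes the rotation
```

The correction is a pure rotation in the complex plane, so it changes the split between real and imaginary parts without altering the magnitude of any point.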