You are viewing the documentation for metbit 9.1.0.

metbit._native

_native.py - Compute-backend dispatcher for metbit large-scale kernels.

import metbit._native

Functions

native_available()

Return True when the compiled C extension is active.

gpu_available()

Return True when a CUDA-capable GPU backend is available.

backend_info()

Return a dict describing the active compute backends.

pearson_columns(data, anchor_index: int, chunk_size: int=_DEFAULT_CHUNK, n_jobs: int=_N_JOBS)

Pearson r between one column and all other columns of a 2D matrix.

Backend auto-selected based on dataset size and available hardware:

GPU (cupy/torch): n*p > LARGE_THRESH and GPU available

C + OpenMP (parallel): n*p > SMALL_THRESH and C ext available

C single-threaded: n*p <= SMALL_THRESH and C ext available

multiprocessing + NumPy: C ext absent, n_jobs > 1

chunked NumPy (1 process): absolute fallback
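The selection order above can be read as a short decision chain. The sketch below is illustrative only and is not the module's code: `choose_backend`, its keyword arguments, and the returned labels are hypothetical names, and the thresholds are passed in explicitly because the actual values of SMALL_THRESH and LARGE_THRESH are not stated on this page.

```python
def choose_backend(n, p, *, gpu_ok, c_ext_ok, n_jobs, small_thresh, large_thresh):
    """Return a label for the backend the documented hierarchy would select.

    Hypothetical helper: the labels and argument names are illustrative.
    """
    size = n * p
    if gpu_ok and size > large_thresh:
        return "gpu"
    if c_ext_ok:
        # With the C extension present, size decides parallel vs single-threaded.
        return "c_openmp" if size > small_thresh else "c_single"
    if n_jobs > 1:
        return "multiprocessing_numpy"
    return "chunked_numpy"


# Threshold values below are made up for the example.
label = choose_backend(
    200, 50_000,
    gpu_ok=False, c_ext_ok=True, n_jobs=4,
    small_thresh=100_000, large_thresh=50_000_000,
)
```

Under these assumed thresholds, a 200 x 50,000 matrix without a GPU lands on the parallel C path, since n*p exceeds the small threshold and the C extension is present.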

Parameters

data: array-like, shape (n_samples, n_features)

anchor_index: int

chunk_size: int

Feature chunk size for the chunked NumPy / multiprocessing paths.

n_jobs: int

Worker processes for the multiprocessing path.
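To make the chunked NumPy fallback concrete, here is a minimal reference sketch of an anchor-versus-all Pearson correlation computed one feature chunk at a time. It is not the library's implementation: `pearson_columns_reference` is a hypothetical name, and returning one value per column (including r = 1 for the anchor itself) is an assumption, since the return shape is not documented here.

```python
import numpy as np


def pearson_columns_reference(data, anchor_index, chunk_size=1024):
    """Pearson r between column `anchor_index` and every column of `data`."""
    x = np.asarray(data, dtype=np.float64)
    anchor = x[:, anchor_index] - x[:, anchor_index].mean()
    anchor_norm = np.sqrt(anchor @ anchor)

    r = np.empty(x.shape[1], dtype=np.float64)
    # Process the features in blocks so only one centered block is in memory.
    for start in range(0, x.shape[1], chunk_size):
        block = x[:, start:start + chunk_size]
        centered = block - block.mean(axis=0)
        norms = np.sqrt((centered ** 2).sum(axis=0))
        r[start:start + chunk_size] = (anchor @ centered) / (anchor_norm * norms)
    return r


rng = np.random.default_rng(0)
demo = rng.normal(size=(30, 10))
r_demo = pearson_columns_reference(demo, anchor_index=3, chunk_size=4)
```

Constant columns have zero norm and would yield NaN in this sketch; how the real function treats them is not specified on this page.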

column_variances(data, chunk_size: int=_DEFAULT_CHUNK, n_jobs: int=_N_JOBS)

Per-column sample variance for feature pre-selection.

Auto-dispatches to GPU, C extension, multiprocessing, or NumPy using the same hierarchy as pearson_columns.

Parameters

data: array-like, shape (n_samples, n_features)

chunk_size: int

n_jobs: int

Returns

np.ndarray of shape (n_features,), float64
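As a point of comparison, a chunked per-column variance can be sketched in plain NumPy. This assumes "sample variance" means the unbiased estimator (`ddof=1`), which this page does not confirm; `column_variances_reference` is a hypothetical name, not part of metbit.

```python
import numpy as np


def column_variances_reference(data, chunk_size=1024):
    """Per-column variance with ddof=1 (assumed), computed in feature chunks."""
    x = np.asarray(data, dtype=np.float64)
    out = np.empty(x.shape[1], dtype=np.float64)
    for start in range(0, x.shape[1], chunk_size):
        block = x[:, start:start + chunk_size]
        out[start:start + chunk_size] = block.var(axis=0, ddof=1)
    return out


rng = np.random.default_rng(1)
var_demo_input = rng.normal(size=(25, 7))
var_demo = column_variances_reference(var_demo_input, chunk_size=3)
```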

vip_scores(t_scores: np.ndarray, x_weights: np.ndarray, y_loadings: np.ndarray)

Vectorised VIP scores.

VIP[i] = sqrt( p * sum_h( S[h] * (w[i,h]/||w[:,h]||)^2 ) / sum(S) ) where S[h] = ||t[:,h]||^2 * q[h]^2.

Parameters

t_scores: (n_samples, n_components) float64

x_weights: (n_features, n_components) float64

y_loadings: (n_components,) or (1, n_components) float64

Returns

np.ndarray of shape (n_features,), float64
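The VIP formula above translates almost directly into NumPy. The following is a sketch of that formula only, under the stated shapes; `vip_reference` is a hypothetical name, and nothing here should be taken as the module's actual code path.

```python
import numpy as np


def vip_reference(t_scores, x_weights, y_loadings):
    """VIP[i] = sqrt(p * sum_h(S[h] * (w[i,h]/||w[:,h]||)^2) / sum(S))."""
    t = np.asarray(t_scores, dtype=np.float64)
    w = np.asarray(x_weights, dtype=np.float64)
    q = np.asarray(y_loadings, dtype=np.float64).ravel()  # accepts (h,) or (1, h)

    p = w.shape[0]
    s = (t ** 2).sum(axis=0) * q ** 2             # S[h] = ||t[:,h]||^2 * q[h]^2
    w_unit_sq = (w / np.linalg.norm(w, axis=0)) ** 2
    return np.sqrt(p * (w_unit_sq @ s) / s.sum())


rng = np.random.default_rng(2)
t_demo = rng.normal(size=(12, 3))
w_demo = rng.normal(size=(6, 3))
q_demo = rng.normal(size=3)
vip_demo = vip_reference(t_demo, w_demo, q_demo)
```

A quick sanity check on any implementation of this formula: because each normalised weight column has unit length, the squared VIP values sum to p, the number of features.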

nipals(x: np.ndarray, y: np.ndarray, tol: float=1e-10, max_iter: int=1000)

NIPALS-PLS1. Returns (w, u, c, t).

Dispatches to the C extension when available; falls back to pure NumPy.

Parameters

x: (n, p) float64

y: (n,) float64

tol: convergence tolerance

max_iter: maximum iterations

Returns

w: (p,) weights

u: (n,) y-scores

c: float y-weight

t: (n,) x-scores
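For orientation, one component of textbook NIPALS for a single response can be sketched as below. This follows the standard algorithm and may differ from metbit's pure-NumPy fallback in normalisation, convergence test, or handling of degenerate input; `nipals_pls1_reference` is a hypothetical name. The sketch assumes x and y are already centred and that x.T @ y is non-zero.

```python
import numpy as np


def nipals_pls1_reference(x, y, tol=1e-10, max_iter=1000):
    """One PLS1 component via textbook NIPALS. Returns (w, u, c, t)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)

    u = y.copy()                       # start the y-scores at y itself
    t_old = np.zeros(x.shape[0])
    for _ in range(max_iter):
        w = x.T @ u
        w /= np.linalg.norm(w)         # unit-length x-weights
        t = x @ w                      # x-scores
        c = (y @ t) / (t @ t)          # scalar y-weight
        u = y / c                      # y-scores: y * c / c**2
        if np.linalg.norm(t - t_old) < tol * np.linalg.norm(t):
            break
        t_old = t
    return w, u, float(c), t


rng = np.random.default_rng(3)
x_demo = rng.normal(size=(15, 5))
x_demo -= x_demo.mean(axis=0)
y_demo = rng.normal(size=15)
y_demo -= y_demo.mean()
w_out, u_out, c_out, t_out = nipals_pls1_reference(x_demo, y_demo)
```

With a single response the loop settles after the second pass, because w is always proportional to x.T @ y; the iteration is kept only to mirror the `tol` and `max_iter` parameters in the signature.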

scale_transform(X: np.ndarray, mean: np.ndarray, s: np.ndarray)

Element-wise (X - mean) / s.

s is std for standard scaling or sqrt(std) for pareto scaling. Dispatches to the C extension when available; otherwise NumPy broadcast.

Parameters

X: (n, p) float64

mean: (p,) float64 column means

s: (p,) float64 divisors (std or sqrt(std))

Returns

np.ndarray (n, p) float64
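The NumPy broadcast path amounts to a single expression. The sketch below shows how the two divisors described above might be prepared; `scale_transform_reference` is a hypothetical stand-in, and using `ddof=1` for the standard deviation is an assumption, not something this page specifies.

```python
import numpy as np


def scale_transform_reference(X, mean, s):
    """Element-wise (X - mean) / s, broadcasting (p,) vectors across rows."""
    return (np.asarray(X, dtype=np.float64) - mean) / s


rng = np.random.default_rng(4)
X_demo = rng.normal(loc=5.0, scale=2.0, size=(20, 4))
col_mean = X_demo.mean(axis=0)
col_std = X_demo.std(axis=0, ddof=1)   # ddof=1 is an assumption

standard_scaled = scale_transform_reference(X_demo, col_mean, col_std)
pareto_scaled = scale_transform_reference(X_demo, col_mean, np.sqrt(col_std))
```

After standard scaling each column has unit standard deviation; after pareto scaling each column's standard deviation is sqrt(std), so large-variance features are damped but not fully equalised.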

xcorr_max_shift(template: np.ndarray, query: np.ndarray, max_shift: int)

Find the integer shift in [-max_shift, max_shift] maximising cross-correlation.

Parameters

template: (n,) float64 reference spectrum window

query: (n,) float64 sample spectrum window

max_shift: int half-width of the shift search range

Returns

(shift, corr) – best integer shift and its cross-correlation value
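A brute-force version of the shift search can be sketched as follows. The sign convention (a positive shift means the query is displaced towards higher indices relative to the template), the unnormalised dot product used as the correlation value, and the tie-breaking rule are all assumptions made for illustration; `xcorr_max_shift_reference` is a hypothetical name.

```python
import numpy as np


def xcorr_max_shift_reference(template, query, max_shift):
    """Return (shift, corr) maximising sum_i template[i] * query[i + shift].

    Requires max_shift < len(template). Ties keep the smallest shift.
    """
    t = np.asarray(template, dtype=np.float64)
    q = np.asarray(query, dtype=np.float64)
    n = t.shape[0]

    best_shift, best_corr = 0, -np.inf
    for k in range(-max_shift, max_shift + 1):
        if k >= 0:
            corr = float(t[:n - k] @ q[k:])      # overlap for a forward shift
        else:
            corr = float(t[-k:] @ q[:n + k])     # overlap for a backward shift
        if corr > best_corr:
            best_shift, best_corr = k, corr
    return best_shift, best_corr


template_demo = np.zeros(50)
template_demo[20], template_demo[21] = 1.0, 0.5
query_demo = np.zeros(50)
query_demo[23], query_demo[24] = 1.0, 0.5      # same peak, moved 3 points right

shift_demo, corr_demo = xcorr_max_shift_reference(template_demo, query_demo, max_shift=5)
```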

pqn_median_quotient(sample: np.ndarray, reference: np.ndarray)

Median of (sample / reference) over non-zero reference entries.

Parameters

sample: (n,) float64

reference: (n,) float64

Returns

float – median quotient (1.0 if no valid entries)
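The description above maps onto a masked median. This sketch treats "valid" as simply "reference entry is non-zero"; whether the real function also excludes NaN or negative values is not stated here, and `pqn_median_quotient_reference` is a hypothetical name.

```python
import numpy as np


def pqn_median_quotient_reference(sample, reference):
    """Median of sample / reference where reference != 0; 1.0 if none qualify."""
    s = np.asarray(sample, dtype=np.float64)
    r = np.asarray(reference, dtype=np.float64)
    mask = r != 0
    if not mask.any():
        return 1.0
    return float(np.median(s[mask] / r[mask]))


# Quotients over the non-zero reference entries are [2, 2, 4].
quotient_demo = pqn_median_quotient_reference([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 0.0, 2.0])
fallback_demo = pqn_median_quotient_reference([1.0, 2.0], [0.0, 0.0])
```

In probabilistic quotient normalisation the sample spectrum would then typically be divided by this quotient, so the 1.0 fallback leaves the sample unchanged.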

Source

metbit/_native.py at v9.1.0
