Finds and fits peaks and extracts peak parameters from a list of chromatograms at the specified wavelengths.
Arguments
- chrom_list
A list of profile matrices, each of the same dimensions (timepoints × wavelengths).
- lambdas
Character vector of wavelengths to find peaks at.
- fit
What type of fit to use. Current options are exponential-gaussian hybrid (egh), gaussian, or raw. The raw setting performs trapezoidal integration directly on the raw data without fitting a peak shape.
- sd.max
Maximum width (standard deviation) for peaks. Defaults to 50.
- max.iter
Maximum number of iterations for non-linear least squares in fit_peaks.
- time.units
Units of sd, FWHM, area, and tau (if applicable). Options are minutes ("min"), seconds ("s"), or milliseconds ("ms").
- estimate_purity
Logical. Whether to estimate purity or not. Defaults to FALSE. (If TRUE, this will slow down the function significantly.)
- noise_threshold
Noise threshold. Argument to get_purity.
- show_progress
Logical. Whether to show progress bar. Defaults to TRUE if pbapply is installed.
- cl
Argument to pblapply or mclapply. Either an integer specifying the number of clusters to use for parallel processing or a cluster object created by makeCluster. Defaults to 2. On Windows, integer values will be ignored.
- collapse
Logical. Whether to collapse multiple peak lists per sample into a single list when multiple wavelengths (lambdas) are provided.
- ...
Additional arguments to find_peaks. Arguments provided to find_peaks can be used to fine-tune the peak-finding algorithm. Most importantly, the smooth_window should be increased if features are being split into multiple bins. Other arguments that can be used here include smooth_type, slope_thresh, and amp_thresh.
Value
The result is an S3 object of class peak_list, containing a nested list of data.frames with information about the peaks fitted for each chromatogram at each of the wavelengths specified by the lambdas argument. Each row in these data.frames is a peak and the columns contain information about various peak parameters:
- rt: The retention time of the peak maximum.
- start: The retention time where the peak is estimated to begin.
- end: The retention time where the peak is estimated to end.
- sd: The standard deviation of the fitted peak shape.
- tau: The value of parameter \(\tau\). This parameter determines peak asymmetry for peaks fit with an exponential-gaussian hybrid function. (This column will only appear if fit = "egh".)
- FWHM: The full-width at half maximum.
- height: The height of the peak.
- area: The area of the peak as determined by trapezoidal approximation.
- r.squared: The coefficient of determination (\(R^2\)) of the fitted model to the raw data. (Note: this value is calculated by fitting a linear model of the fitted peak values to the raw data. This approach is statistically questionable, since the models are fit using non-linear least squares. Nevertheless, it can still be useful as a rough metric for "goodness-of-fit".)
- purity: The peak purity.
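The r.squared caveat can be made concrete with a small sketch. The Python below is an illustration only, not the package's R code; the helper name r_squared_linear and the synthetic traces are assumptions made for the example. It regresses the raw signal on the fitted values with a straight line and reports the resulting \(R^2\).

```python
import numpy as np

def r_squared_linear(fitted, raw):
    """R^2 of the simple linear regression raw ~ fitted (hypothetical helper)."""
    slope, intercept = np.polyfit(fitted, raw, 1)
    predicted = slope * fitted + intercept
    ss_res = np.sum((raw - predicted) ** 2)       # residual sum of squares
    ss_tot = np.sum((raw - np.mean(raw)) ** 2)    # total sum of squares
    return float(1.0 - ss_res / ss_tot)

# Synthetic "fitted" peak: a unit Gaussian sampled on a regular grid.
t = np.linspace(-3.0, 3.0, 61)
fitted = np.exp(-t**2 / 2.0)

# Raw data that differ from the fit only by a scale and an offset.
raw_rescaled = 2.0 * fitted + 1.0
# Raw data that carry a deterministic ripple the fit does not capture.
raw_rippled = fitted + 0.05 * np.sin(7.0 * t)

r2_rescaled = r_squared_linear(fitted, raw_rescaled)
r2_rippled = r_squared_linear(fitted, raw_rippled)
```

Because the linear model absorbs any scale or offset, r2_rescaled comes out at essentially 1 even though the fitted height is off by a factor of two. This is one reason to treat the value as a rough indicator of shape agreement rather than a strict goodness-of-fit statistic.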
Details
Peaks are located by finding zero-crossings in the smoothed first derivative of the specified chromatographic traces (function find_peaks). At the given positions, an exponential-gaussian hybrid (or regular gaussian) function is fit to the signal using fit_peaks according to the value of fit. Finally, the area is calculated using trapezoidal approximation.
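The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the package's R implementation: the function names (egh, find_peak_indices, trapezoid_area), the moving-average smoother, and the synthetic trace are all chosen for the example. The egh function follows the hybrid form given by Lan & Jorgenson (2001), which is zero wherever the denominator is not positive.

```python
import numpy as np

def egh(t, height, center, sigma, tau):
    """Exponential-gaussian hybrid peak shape (after Lan & Jorgenson 2001)."""
    dt = t - center
    denom = 2.0 * sigma**2 + tau * dt
    out = np.zeros_like(t, dtype=float)
    ok = denom > 0                      # the function is defined as 0 elsewhere
    out[ok] = height * np.exp(-dt[ok] ** 2 / denom[ok])
    return out

def find_peak_indices(signal, smooth_window=5):
    """Indices where the smoothed first derivative crosses from + to -."""
    deriv = np.gradient(signal)
    kernel = np.ones(smooth_window) / smooth_window   # assumed moving-average smoother
    smoothed = np.convolve(deriv, kernel, mode="same")
    return np.where((smoothed[:-1] > 0) & (smoothed[1:] <= 0))[0]

def trapezoid_area(t, y):
    """Area under y(t) by the trapezoidal rule."""
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(t)))

# Synthetic trace: one tailed peak at t = 4 and one symmetric peak at t = 7.
t = np.linspace(0.0, 10.0, 1001)
tailed = egh(t, height=1.0, center=4.0, sigma=0.3, tau=0.2)
symmetric = egh(t, height=0.5, center=7.0, sigma=0.3, tau=0.0)  # tau = 0 gives a Gaussian
signal = tailed + symmetric

peaks = find_peak_indices(signal)
peak_times = t[peaks]
symmetric_area = trapezoid_area(t, symmetric)
```

In this sketch the model parameters are known in advance; in practice they would be estimated by non-linear least squares at each detected position before the area is integrated.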
Additional arguments can be provided to find_peaks to fine-tune the peak-finding algorithm. For example, the smooth_window can be increased to prevent peaks from being split into multiple features. Overly aggressive smoothing may cause small peaks to be overlooked.
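The effect of the smoothing window can be shown with a contrived example. The Python sketch below is an assumption-laden illustration rather than the package's algorithm: find_maxima is a hypothetical helper, the smoother is a plain moving average, and the amplitude cutoff only loosely mimics what an argument such as amp_thresh might do. A single broad peak carries a small periodic ripple; without smoothing the ripple produces many derivative zero-crossings, while a window matching the ripple period averages it out.

```python
import numpy as np

def find_maxima(signal, smooth_window, amp_thresh=0.1):
    """+ to - zero-crossings of the smoothed derivative, above an amplitude cutoff."""
    deriv = np.gradient(signal)
    kernel = np.ones(smooth_window) / smooth_window   # assumed moving-average smoother
    smoothed = np.convolve(deriv, kernel, mode="same")
    idx = np.where((smoothed[:-1] > 0) & (smoothed[1:] <= 0))[0]
    return idx[signal[idx] > amp_thresh]              # ignore crossings in the baseline

n = 1001
t = np.linspace(0.0, 10.0, n)
peak = np.exp(-((t - 5.0) ** 2) / (2.0 * 0.5**2))
# Ripple with a period of exactly 10 samples, so a 10-point window cancels it.
ripple = 0.02 * np.sin(2.0 * np.pi * np.arange(n) / 10.0)
signal = peak + ripple

split = find_maxima(signal, smooth_window=1)    # no smoothing: one peak is split
merged = find_maxima(signal, smooth_window=10)  # wider window: a single maximum
```

Here split contains several indices clustered around the apex, whereas merged contains one index near t = 5. The same averaging that removes the ripple would also flatten a genuine narrow peak of similar width, which is the trade-off noted above.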
The standard deviation (sd), full-width at half maximum (FWHM), tau, and area are returned in units determined by time.units. By default, the units are in minutes. To compare directly with 'ChemStation' integration results, the time units should be in seconds.
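A short Python sketch shows why the choice of time units matters for these columns. It is an illustration with assumed values (a synthetic Gaussian peak and a hypothetical trapezoid_area helper), not output from the function: widths are times, so they scale by 60 when going from minutes to seconds, and area is signal multiplied by time, so it scales by the same factor.

```python
import math
import numpy as np

def trapezoid_area(t, y):
    """Area under y(t) by the trapezoidal rule."""
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(t)))

# Synthetic Gaussian peak on a time axis in minutes (assumed values).
t_min = np.linspace(0.0, 10.0, 2001)
height, center_min, sd_min = 100.0, 5.0, 0.1
signal = height * np.exp(-((t_min - center_min) ** 2) / (2.0 * sd_min**2))

area_min = trapezoid_area(t_min, signal)   # units: signal x minutes

# The same trace on a time axis in seconds.
t_s = t_min * 60.0
area_s = trapezoid_area(t_s, signal)       # units: signal x seconds
sd_s = sd_min * 60.0

# For a pure Gaussian, FWHM = 2 * sqrt(2 * ln 2) * sd (not valid for tailed peaks).
fwhm_s = 2.0 * math.sqrt(2.0 * math.log(2.0)) * sd_s
```

Areas reported in minutes are therefore 60 times smaller than the same areas reported in seconds, so the two sets of results should not be compared without converting first.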
Note
The bones of this function are adapted from the getAllPeaks function authored by Ron Wehrens (though the underlying algorithms for peak identification and peak-fitting are not the same).
References
Lan, K. & Jorgenson, J. W. 2001. A hybrid of exponential and gaussian functions as a simple model of asymmetric chromatographic peaks. Journal of Chromatography A 915:1-13. doi:10.1016/S0021-9673(01)00594-5.
Naish, P. J. & Hartwell, S. 1988. Exponentially Modified Gaussian functions - A good model for chromatographic peaks in isocratic HPLC? Chromatographia 26:285-296. doi:10.1007/BF02268168.
O'Haver, Tom. Pragmatic Introduction to Signal Processing: Applications in scientific measurement. https://terpconnect.umd.edu/~toh/spectrum/ (Accessed January, 2022).
Wehrens, R., Carvalho, E. & Fraser, P. D. 2015. Metabolite profiling in LC–DAD using multivariate curve resolution: the alsace package for R. Metabolomics 11:143-154. doi:10.1007/s11306-014-0683-5.