Returns a peak_table object. The first slot contains a matrix of
intensities, where rows correspond to samples and columns correspond to
aligned features. The rest of the slots contain various meta-data about peaks,
samples, and experimental settings.
Arguments
- peak_list
A peak_list object created by get_peaks, containing a nested list of peak tables: the first level is the sample, and the second level is the spectral wavelength. Every component is described by a data.frame with a row for each peak and columns containing information on various peak parameters.
- chrom_list
A list of chromatographic matrices.
- response
Indicates whether peak area or peak height is to be used as the intensity measure. Defaults to the area setting.
- use.cor
Logical. Indicates whether to use corrected retention times (rt.cor column) or raw retention times (rt column). Unless otherwise specified, the rt.cor column will be used by default if it exists in the provided peak_list.
- hmax
Height at which the complete linkage dendrogram will be cut. Can be interpreted as the maximal intercluster retention time difference.
- plot_it
Logical. If TRUE, a strip plot indicating the clustering will be shown for every component.
- ask
Logical. Ask before showing new plot? Defaults to TRUE.
- clust
Specify whether to perform hierarchical clustering based on spectral similarity and retention time (sp.rt) or retention time alone (rt). Defaults to rt. The sp.rt option is experimental and should be used with caution.
- sigma.t
Width of the Gaussian in the retention time distance function. Controls the weight given to retention time if sp.rt is selected.
- sigma.r
Width of the Gaussian in the spectral similarity function. Controls the weight given to spectral correlation if sp.rt is selected.
- deepSplit
Logical. Controls sensitivity to cluster splitting. If TRUE, the function will return a larger number of smaller clusters. See the documentation for cutreeDynamic for additional information.
- verbose
Logical. Whether to print a warning when combining peaks into a single time window. Defaults to FALSE.
- out
Specify data.frame or matrix as output. Defaults to data.frame.
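The nested structure expected for peak_list, and the default choice between the rt.cor and rt columns, can be pictured with a small hand-built stand-in. This is only a sketch in base R: the numbers are invented, the area and height column names are assumptions made for illustration, and a real peak_list from get_peaks may carry additional columns and attributes.

```r
# Toy stand-in for a peak_list: sample -> wavelength -> data.frame of peaks.
# All values, and the `area`/`height` column names, are made up for illustration.
peak_list <- list(
  sample_1 = list(
    "210" = data.frame(rt = c(5.02, 7.41), rt.cor = c(5.00, 7.40),
                       area = c(120, 45), height = c(30, 11)),
    "260" = data.frame(rt = 5.03, rt.cor = 5.01, area = 80, height = 19)
  ),
  sample_2 = list(
    "210" = data.frame(rt = c(5.08, 7.46), rt.cor = c(5.01, 7.39),
                       area = c(98, 51), height = c(26, 13)),
    "260" = data.frame(rt = 5.09, rt.cor = 5.02, area = 75, height = 17)
  )
)

# Mimic the documented default for use.cor: prefer rt.cor when it is present.
first_tab <- peak_list[[1]][[1]]
rt_column <- if ("rt.cor" %in% names(first_tab)) "rt.cor" else "rt"

# Collect the chosen retention times for one wavelength across samples.
rts_210 <- lapply(peak_list, function(smp) smp[["210"]][[rt_column]])
```

Here rts_210 is a named list with one vector of retention times per sample, which is the kind of input the clustering step operates on.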
Value
The function returns an S3 peak_table object containing the following elements:
- tab: The peak table itself, a data.frame of intensities in a sample x peak configuration.
- pk_meta: A data.frame containing peak meta-data (e.g., the spectral component, peak number, and average retention time).
- sample_meta: A data.frame of sample meta-data. Must be added using attach_metadata.
- ref_spectra: A data.frame of reference spectra (in a wavelength x peak configuration). Must be added using attach_ref_spectra.
- args: A vector of arguments given to get_peaktable to generate the peak table.
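To make the shape of the returned object concrete, the following sketch builds a mock list with the element names listed above and the peak_table class. It is not output from get_peaktable: the values are invented, and the layout of pk_meta (one column per peak, matching the columns of tab) and the contents of args are assumptions made for illustration.

```r
# Hand-built mock with the documented element names; not produced by get_peaktable.
tab <- data.frame(V1 = c(120, 98), V2 = c(45, 51),
                  row.names = c("sample_1", "sample_2"))

# Assumed layout: one column per peak, rows holding per-peak meta-data.
pk_meta <- data.frame(V1 = c(210, 1, 5.01), V2 = c(210, 2, 7.40),
                      row.names = c("component", "peak", "rt"))

pt <- structure(
  list(tab = tab,
       pk_meta = pk_meta,
       sample_meta = NA,   # would be filled in by attach_metadata
       ref_spectra = NA,   # would be filled in by attach_ref_spectra
       args = c(response = "area", clust = "rt")),  # illustrative entries only
  class = "peak_table"
)

# Elements are accessed like those of any list.
intensities <- pt$tab           # samples in rows, peaks in columns
peak_rts <- unlist(pt$pk_meta["rt", ])
```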
Details
In the simple case, the function performs complete-linkage clustering of
retention times across all samples and cuts the resulting dendrogram at a
height given by the user (hmax), which can be understood as the maximal
inter-cluster retention time difference. Clustering can also incorporate
information about spectral similarity, using a distance function adapted from
Broeckling et al., 2014:
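$$e^{-\frac{(1-c_{ij})^2}{2\sigma_r^2}} \cdot e^{-\frac{(t_i-t_j)^2}{2\sigma_t^2}}$$
where c_ij is the spectral correlation between peaks i and j, t_i and t_j are their retention times, and sigma_r and sigma_t correspond to the sigma.r and sigma.t arguments.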
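As a rough illustration of how a spectral and retention time score of this kind could feed into hierarchical clustering, the sketch below uses base R only. Several things here are assumptions rather than the package's implementation: the retention time term is taken to be a plain Gaussian in t_i - t_j (as in RAMClust), the similarity is converted to a distance as 1 - S, the tree is cut with cutree instead of cutreeDynamic, and all input values are made up.

```r
# Toy inputs (made up): retention times and pairwise spectral correlations
# for four peaks.
rt <- c(5.00, 5.02, 5.03, 7.40)
cors <- matrix(c(1.00, 0.99, 0.20, 0.10,
                 0.99, 1.00, 0.25, 0.12,
                 0.20, 0.25, 1.00, 0.15,
                 0.10, 0.12, 0.15, 1.00), nrow = 4, byrow = TRUE)
sigma.t <- 0.1  # illustrative value only
sigma.r <- 0.5  # illustrative value only

# Pairwise retention time differences t_i - t_j.
dt <- outer(rt, rt, "-")

# Product of the spectral and retention time Gaussian terms (a similarity in [0, 1]).
S <- exp(-(1 - cors)^2 / (2 * sigma.r^2)) * exp(-dt^2 / (2 * sigma.t^2))

# Assumption: turn the similarity into a distance as 1 - S before clustering.
D <- as.dist(1 - S)
hc <- hclust(D, method = "complete")

# Simple stand-in for the tree-cutting step: ask for three clusters.
clusters <- cutree(hc, k = 3)
```

In this toy case the first two peaks co-elute and have nearly identical spectra, so they share a cluster, while the third peak elutes at almost the same time but is kept apart by its low spectral correlation.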
If two peaks from the same sample are assigned to the same cluster, a warning
message is printed to the console. These warnings can usually be ignored, but
one could also consider reducing the hmax argument. However, this may
lead to splitting of peaks across multiple clusters. Another option is to
filter the peaks by intensity to remove small features.
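The trade-off described above can be seen in a small base R sketch of retention-time-only clustering. The helper names cluster_rt and has_duplicates are hypothetical, the retention times are invented, and the code only approximates the idea of complete-linkage clustering cut at hmax; it is not the package's implementation.

```r
# Toy retention times (made up) for one spectral component across two samples.
peaks <- data.frame(
  sample = c("s1", "s1", "s1", "s2", "s2"),
  rt     = c(5.00, 5.15, 7.40, 5.05, 7.45)
)

# Complete-linkage clustering on retention time, cut at height hmax.
cluster_rt <- function(rt, hmax) {
  cutree(hclust(dist(rt), method = "complete"), h = hmax)
}

# Flag cluster assignments that put more than one peak from a sample
# into the same cluster.
has_duplicates <- function(sample, cluster) {
  any(table(sample, cluster) > 1)
}

cl_wide   <- cluster_rt(peaks$rt, hmax = 0.2)   # merges both early s1 peaks
cl_narrow <- cluster_rt(peaks$rt, hmax = 0.07)  # keeps them apart

dup_wide   <- has_duplicates(peaks$sample, cl_wide)
dup_narrow <- has_duplicates(peaks$sample, cl_narrow)
```

With the wider cut, the two early peaks from sample s1 land in one cluster, which is the situation that triggers the warning. The narrower cut removes the duplicate but leaves the 5.15 peak in a cluster of its own, illustrating how a smaller hmax can split features across clusters.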
Note
This function is adapted from the getPeakTable function in the alsace package by Ron Wehrens.
References
Broeckling, C. D., Afsar, F. A., Neumann, S., Ben-Hur, A., and Prenni, J. E. 2014. RAMClust: A Novel Feature Clustering Method Enables Spectral-Matching-Based Annotation for Metabolomics Data. Anal. Chem. 86:6812-6817. doi:10.1021/ac501530d.
Wehrens, R., Carvalho, E., and Fraser, P. D. 2015. Metabolite profiling in LC–DAD using multivariate curve resolution: the alsace package for R. Metabolomics 11:143-154. doi:10.1007/s11306-014-0683-5.