Standard pre-processing of response matrices, consisting of a time axis and
a spectral axis (e.g. HPLC-DAD/UV data). For smooth data, like UV-VIS data,
the size of the matrix can be reduced by interpolation. By default,
the data are baseline-corrected in the time direction (baseline.corr) and
smoothed in the spectral dimension using cubic smoothing splines
(smooth.spline).
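The smoothing and interpolation steps described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the matrix is synthetic, the variable names (tpoints, lambdas, X_small) are made up for the example, and baseline correction is left out because baseline.corr lives in a separate package.

```r
# Synthetic chromatogram: 200 time points x 60 wavelengths (illustrative data only)
set.seed(1)
tpoints <- seq(0, 10, length.out = 200)        # retention times
lambdas <- seq(200, 318, by = 2)               # wavelengths
elution  <- dnorm(tpoints, mean = 5, sd = 0.3) # one peak in the time direction
spectrum <- dnorm(lambdas, mean = 260, sd = 20) # one broad absorbance band
X <- outer(elution, spectrum) +
  matrix(rnorm(length(tpoints) * length(lambdas), sd = 1e-4),
         nrow = length(tpoints))
dimnames(X) <- list(as.character(tpoints), as.character(lambdas))

# Smooth every spectrum (row) with a cubic smoothing spline
X_smooth <- t(apply(X, 1, function(spec) smooth.spline(lambdas, spec)$y))

# New, shorter axes that stay inside the original ranges
dim1 <- seq(1, 9, length.out = 81)
dim2 <- seq(210, 310, by = 5)

# Linear interpolation along the time axis (columns), then the spectral axis (rows)
X_time  <- apply(X_smooth, 2, function(chrom) approx(tpoints, chrom, xout = dim1)$y)
X_small <- t(apply(X_time, 1, function(spec) approx(lambdas, spec, xout = dim2)$y))
dimnames(X_small) <- list(as.character(dim1), as.character(dim2))

dim(X_small)
```

The reduced matrix has one row per entry of dim1 and one column per entry of dim2. Linear interpolation via approx is an assumption made for this sketch; the function itself may interpolate differently.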
Usage
preprocess(
  X,
  dim1,
  dim2,
  remove.time.baseline = TRUE,
  spec.smooth = TRUE,
  maxI = NULL,
  parallel = NULL,
  interpolate_rows = TRUE,
  interpolate_cols = TRUE,
  mc.cores,
  cl = 2,
  show_progress = NULL,
  ...
)
Arguments
- X
A numerical data matrix, or list of data matrices. Missing values are not allowed. If rownames or colnames attributes are used, they should be numerical and signify time points and wavelengths, respectively.
- dim1
A new, usually shorter, set of time points (numerical). The range of these should not exceed the range of the original time points.
- dim2
A new, usually shorter, set of wavelengths (numerical). The range of these should not exceed the range of the original wavelengths.
- remove.time.baseline
Logical, indicating whether baseline correction should be done in the time direction, according to baseline.corr. Default is TRUE.
- spec.smooth
Logical, indicating whether smoothing should be done in the spectral direction, according to smooth.spline. Default is TRUE.
- maxI
If given, the maximum intensity in the matrix is set to this value.
- parallel
Logical, indicating whether to use parallel processing. Defaults to TRUE (unless you're on Windows).
- interpolate_rows
Logical. Whether to interpolate along the time axis (dim1). Defaults to TRUE.
- interpolate_cols
Logical. Whether to interpolate along the spectral axis (dim2). Defaults to TRUE.
- mc.cores
How many cores to use for parallel processing. Defaults to 2. This argument has been deprecated and replaced by cl.
- cl
Argument to pblapply or mclapply. Either an integer specifying the number of clusters to use for parallel processing or a cluster object created by makeCluster. Defaults to 2. On Windows, integer values will be ignored.
- show_progress
Logical. Whether to show a progress bar. Defaults to TRUE if pbapply is installed.
- ...
Further optional arguments to baseline.corr.
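As a rough illustration of the constraints on X, dim1, dim2 and maxI, the base R sketch below checks that new axes stay within the range of the original ones and rescales a matrix to a chosen maximum. The helper axis_within is hypothetical (it is not part of the package), and the linear rescaling shown for maxI is an assumption about how that argument behaves.

```r
# Small input matrix whose dimnames encode time points and wavelengths
set.seed(42)
X <- matrix(runif(12), nrow = 4,
            dimnames = list(as.character(c(0, 0.5, 1, 1.5)),
                            as.character(c(200, 210, 220))))
times   <- as.numeric(rownames(X))
lambdas <- as.numeric(colnames(X))

# Hypothetical helper: TRUE if the new axis does not exceed the original range
axis_within <- function(new_axis, old_axis) {
  min(new_axis) >= min(old_axis) && max(new_axis) <= max(old_axis)
}

ok_time   <- axis_within(seq(0.25, 1.25, by = 0.25), times)  # inside 0..1.5
ok_lambda <- axis_within(seq(190, 220, by = 10), lambdas)    # 190 is below 200

# Assumed meaning of maxI: scale the matrix so its largest value equals maxI
maxI <- 100
X_scaled <- X * maxI / max(X)
```

Here ok_time is TRUE and ok_lambda is FALSE, since 190 lies below the smallest original wavelength.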
Value
The function returns the preprocessed data matrix (or list of matrices), with row names and column names indicating the time points and wavelengths, respectively.
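Because the axes are stored in the dimnames, they can be recovered from the returned matrix and used for indexing. The sketch below uses a small stand-in matrix with arbitrary values in place of real output; the object names (X_pp, chrom_254, spec_near) are illustrative only.

```r
# Stand-in for a preprocessed matrix: 5 time points x 3 wavelengths
X_pp <- matrix(seq_len(15), nrow = 5,
               dimnames = list(as.character(seq(1, 3, by = 0.5)),
                               as.character(c(220, 254, 280))))

# Recover the numeric axes from the dimnames
times   <- as.numeric(rownames(X_pp))
lambdas <- as.numeric(colnames(X_pp))

# Chromatogram at 254 nm (one column), indexed by its column name
chrom_254 <- X_pp[, "254"]

# Spectrum at the time point closest to 2.1 (one row)
spec_near <- X_pp[which.min(abs(times - 2.1)), ]
```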
Note
Adapted from the preprocess function in the alsace package by Ron Wehrens.
References
Wehrens, R., Bloemberg, T.G., and Eilers, P.H.C. 2015. Fast parametric time warping of peak lists. Bioinformatics 31:3063-3065. doi:10.1093/bioinformatics/btv299.
Wehrens, R., Carvalho, E., and Fraser, P.D. 2015. Metabolite profiling in LC–DAD using multivariate curve resolution: the alsace package for R. Metabolomics 11:143-154. doi:10.1007/s11306-014-0683-5.