library(chromatographR)
data("Sa_warp")
pks <- get_peaks(Sa_warp,lambdas=210)
pktab <- get_peaktable(pks)
pktab <- attach_ref_spectra(pktab, ref = "max.cor")
After assigning reference spectra to our peak table, we can use
pearson correlations to easily locate peaks falling within a certain
similarity threshold to a reference spectrum. To do this, we must first
construct a correlation matrix using the cor
function.
Analysis of spectral correlations
cor_mat <- cor(pktab$ref_spectra)
We can then plot spectra falling within a desired similarity
threshold (t
) to a reference peak (in this case ‘V21’).
par(mfrow=c(2,2))
idx_90 <- which(cor_mat[,"V21"] > .9)
matplot(pktab$ref_spectra[,idx_90],type='l', ylab="Abs")
legend("top", "t = 0.90", bty = "n")
idx_95 <- which(cor_mat[,"V21"] > .95)
matplot(pktab$ref_spectra[,idx_95],type='l', ylab="Abs")
legend("top", "t = 0.95", bty = "n")
idx_97 <- which(cor_mat[,"V21"] > .97)
matplot(pktab$ref_spectra[,idx_97],type='l', ylab="Abs")
legend("top", "t = 0.97", bty = "n")
idx_99 <- which(cor_mat[,"V21"] > .99)
matplot(pktab$ref_spectra[,idx_99],type='l', ylab="Abs")
legend("top", "t = 0.99", bty = "n")
We can also extract retention times for spectra above the desired spectral similarity threshold with the reference spectra.
rts_99 = pktab$pk_meta["rt", idx_99]
rts_99
#> V9 V17 V21 V26
#> rt 12.18 14.44 15.51 16.78
Below we plot traces of our four chromatograms at 210 nm. The 4 peaks that match the reference spectrum at the 99% level are marked with solid black lines and the 6 peaks that match the reference spectrum at the 97% level are marked with dotted gray lines.
par(mfrow = c(1, 1))
plot_chroms(Sa_warp, lambdas=210)
abline(v = rts_99)
abline(v = pktab$pk_meta["rt", idx_97], lty=2, col = "darkgray")
If we are interested in the combined area of these peaks, we could also use these indices to sum them together.
apply(pktab$tab[,idx_99], 1, sum)
#> 119 121 122 458
#> 143.6671 100.9055 112.6122 171.6950
Hierarchical clustering of spectra
chromatographR
also includes a function
(cluster_spectra
) to group spectra using hierarchical
clustering. We can set the seed with the iseed
argument to
ensure repeatability among instantiations of this vignette. For
simplicity, we use the argument max.only = TRUE
to return
only the largest clusters.
par(mfrow=c(2,2))
cl <- cluster_spectra(pktab, alpha = 0.01, iseed = 1, max.only = TRUE)
The output is a dendrogram of peaks annotated with bootstrap values
showing the support for each node and spectra of the clusters with
p-values below the significance level specified by alpha
.
In this case, we recovered two clusters, representing two different
classes of compounds distinguished by their chromophores. The second
cluster closely matches the spectrum represented by peak
'V21'
. We can access the peaks in cluster two as
follows:
cl2 <- cl$clusters$c2@peaks
cl2
#> [1] "V6" "V9" "V11" "V14" "V15" "V17" "V19" "V21" "V22" "V23" "V26"
In this case, the peaks in cluster two happen to overlap completely with the peaks recovered through pearson correlation at the 90% threshold.