
Read chromatogram data streams from 'Shimadzu' GCD files.

Usage

read_shimadzu_gcd(
  path,
  format_out = c("matrix", "data.frame", "data.table"),
  data_format = c("wide", "long"),
  read_metadata = TRUE,
  metadata_format = c("chromconverter", "raw")
)

Arguments

path

Path to GCD file.

format_out

Class of output. Either matrix, data.frame, or data.table.

data_format

Either wide (default) or long.

read_metadata

Logical. Whether to attach metadata.

metadata_format

Format to output metadata. Either chromconverter or raw.

Value

A 2D chromatogram from the chromatogram stream, returned as a matrix, data.frame, or data.table according to the value of format_out. The chromatogram is returned in wide or long format according to the value of data_format.
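A minimal usage sketch follows. The file name is a placeholder, not a file shipped with the package, and the call is guarded so that nothing runs unless the package is installed and the file exists.

```r
# Placeholder path: replace with the location of a real GCD file.
path <- "example.gcd"

chrom <- NULL
if (requireNamespace("chromConverter", quietly = TRUE) && file.exists(path)) {
  # Request a long-format data.frame with metadata attached.
  chrom <- chromConverter::read_shimadzu_gcd(
    path,
    format_out = "data.frame",
    data_format = "long",
    read_metadata = TRUE
  )
  str(chrom)
}
```

Passing format_out = "matrix" with data_format = "wide" instead would match the first listed value of each argument.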

Details

A parser to read chromatogram data streams from 'Shimadzu' .gcd files. GCD files are encoded as 'Microsoft' OLE documents. The parser relies on the olefile package in Python to unpack the files. The PDA data is encoded in a stream called PDA 3D Raw Data:3D Raw Data. The GCD data stream contains a segment for each retention time, beginning with a 24-byte header.

The 24-byte header consists of the following fields:

  • 4 bytes: Segment label (17234).

  • 4 bytes: Little-endian integer specifying the sampling interval in milliseconds.

  • 4 bytes: Little-endian integer specifying the number of values in the file.

  • 4 bytes: Little-endian integer specifying the total number of bytes in the file (although this value appears to be off by a few bytes).

  • 8 bytes: Zero padding (all 00).
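The field layout above can be sketched in base R as follows. This is an illustration rather than the package's own code: parse_gcd_header is a hypothetical helper, the header bytes are synthetic, and reading the segment label as a little-endian integer is an assumption.

```r
# Hypothetical helper (not part of chromConverter): decode one 24-byte
# segment header from a raw vector, following the field layout listed above.
parse_gcd_header <- function(bytes) {
  stopifnot(is.raw(bytes), length(bytes) >= 24)
  # Bytes 1-16 hold four 4-byte fields. Treating the label as an integer is
  # an assumption; the other three are documented as little-endian integers.
  fields <- readBin(bytes[1:16], what = "integer", n = 4, size = 4,
                    endian = "little")
  list(
    label       = fields[1],
    interval_ms = fields[2],
    n_values    = fields[3],
    n_bytes     = fields[4],
    # Bytes 17-24 are expected to be zero padding.
    padding_ok  = all(bytes[17:24] == as.raw(0))
  )
}

# Synthetic header for illustration only (the values are made up).
header <- c(
  writeBin(c(17234L, 500L, 3L, 48L), raw(), size = 4, endian = "little"),
  raw(8)
)
hdr <- parse_gcd_header(header)
str(hdr)
```

Because the byte count stored in real files may not match exactly, a reader would probably do better to rely on the number of values than on the byte total.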

After the header, the data are encoded as 64-bit little-endian floating-point numbers. The retention times can be derived, at least approximately, from the number of values and the sampling interval encoded in the header.
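As a rough sketch of that derivation, the values can be decoded with readBin and paired with a time axis built from the sampling interval. decode_gcd_values is a hypothetical helper, the payload is synthetic, and both the zero start time and the conversion to minutes are assumptions, not documented behaviour.

```r
# Hypothetical helper (not part of chromConverter): decode the values that
# follow a header and build an approximate retention-time axis.
decode_gcd_values <- function(payload, n_values, interval_ms) {
  stopifnot(is.raw(payload), length(payload) >= 8 * n_values)
  # Each value is a 64-bit little-endian float.
  intensity <- readBin(payload, what = "double", n = n_values, size = 8,
                       endian = "little")
  # Assumption: the first point sits at time zero and spacing is uniform.
  # Milliseconds are converted to minutes (60000 ms per minute).
  rt_min <- (seq_len(n_values) - 1) * interval_ms / 60000
  data.frame(rt = rt_min, intensity = intensity)
}

# Synthetic payload for illustration only (the values are made up).
payload <- writeBin(c(1.5, 2.5, 4.0), raw(), size = 8, endian = "little")
chrom <- decode_gcd_values(payload, n_values = 3, interval_ms = 500)
print(chrom)
```

If the acquisition does not start at time zero, a constant offset would have to be added to the time axis.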

Note

This parser is experimental and may still need some work. It is not yet able to interpret much metadata from the files.

Author

Ethan Bass