This function parses SMILES strings and computes chemical descriptors using rcdk. It stores cleaned, non-redundant descriptors in tools$chem_descriptors.

Usage

compute_chem_descriptors(
  data,
  compound_column = NULL,
  treatment_ids = NULL,
  r_squared = 0.6,
  descriptors = NULL
)

Arguments

data

A tidyseurat object with a smiles column.

compound_column

Name of the metadata column that holds compound identifiers. Defaults to combined_ids when left as NULL.

treatment_ids

A list of unique sample identifiers. Defaults to combined_ids when left as NULL.

r_squared

R-squared value. Defaults to 0.6.

descriptors

Optional subset of rcdk descriptors to compute, given by name. The example below passes a fully qualified CDK descriptor class name.
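
This page does not say how r_squared is applied. One plausible reading, given that the function keeps only "non-redundant" descriptors, is that a descriptor is dropped when its squared correlation with an already-kept descriptor exceeds the threshold. The base R sketch below illustrates that idea only. The helper drop_redundant() and the toy table are hypothetical and are not part of the package, and the package's actual filtering may differ.

```r
# Hypothetical helper (NOT the package's implementation): greedily keep a
# descriptor only if its squared Pearson correlation with every descriptor
# kept so far is at or below the threshold.
drop_redundant <- function(desc, r_squared = 0.6) {
  # Drop columns that contain missing values or are constant
  ok <- vapply(desc, function(x) !anyNA(x) && stats::sd(x) > 0, logical(1))
  desc <- desc[, ok, drop = FALSE]
  if (ncol(desc) < 2) return(desc)

  r2 <- stats::cor(desc)^2          # pairwise squared correlations
  keep <- character(0)
  for (nm in colnames(desc)) {
    # all() over an empty selection is TRUE, so the first column is always kept
    if (all(r2[nm, keep] <= r_squared)) keep <- c(keep, nm)
  }
  desc[, keep, drop = FALSE]
}

# Toy descriptor table with made-up numbers, for illustration only
toy <- data.frame(
  desc_a     = c(180.2, 194.2, 151.2, 206.3, 46.1),
  desc_b     = c(180.2, 194.2, 151.2, 206.3, 46.1) * 2, # duplicate of desc_a, rescaled
  desc_c     = c(0.5, 0.1, 0.9, 0.2, 0.4),              # weakly correlated with desc_a
  desc_const = rep(1, 5)                                # constant column
)

filtered <- drop_redundant(toy, r_squared = 0.6)
colnames(filtered)
```

In this toy case desc_const is removed as constant and desc_b is removed as redundant with desc_a, leaving desc_a and desc_c. A lower r_squared makes the filter stricter under this reading.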

Value

The same tidyseurat object with a new entry in tools$chem_descriptors.

Examples

# \donttest{
mock_data <- tibble::tibble(
  Treatment = c("Aspirin", "Caffeine", "NonExistentCompound_123")
)
result <- compute_smiles(mock_data, compound_column = "Treatment")
data <- compute_chem_descriptors(
  result,
  compound_column = "Treatment",
  treatment_ids = mock_data$Treatment,
  descriptors = "org.openscience.cdk.qsar.descriptors.molecular.FractionalCSP3Descriptor"
)
# }
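
To find candidate values for the descriptors argument, one option is to list the descriptor class names that rcdk itself exposes. The sketch below assumes rcdk and a working Java runtime are installed, and it assumes that compute_chem_descriptors() accepts the same fully qualified names that rcdk reports, as the example above suggests. The variable names are illustrative.

```r
# Only runs when rcdk (and its Java backend) can be loaded
if (requireNamespace("rcdk", quietly = TRUE)) {
  # All descriptor class names known to the installed rcdk/CDK version
  all_desc <- rcdk::get.desc.names(type = "all")

  # Narrow down to names matching a pattern, e.g. the Fsp3 descriptor
  fsp3_desc <- grep("FractionalCSP3", all_desc, value = TRUE)
  print(fsp3_desc)
}
```

The names available depend on the installed rcdk and CDK versions, so it is worth checking this list before passing a name to descriptors.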