I/O

Source and sink blocks handle reading and writing molecules in various file formats.

Supported Formats

Extension   Format
.sdf        SD file
.sdf.gz     Gzipped SD file
.mol2       Mol2 file
.smi        SMILES file
.smi.gz     Gzipped SMILES file
.csv        CSV with SMILES column
.parquet    Parquet with SMILES column
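
For the tabular formats, the molecules come from a column of SMILES strings. The sketch below only illustrates what such input might look like, using the standard library; the column name `smiles`, the extra `name` column, and the helper `read_smiles_column` are all assumptions, since this page does not say which column name cmxflow expects.

```python
import csv
import io

# Hypothetical CSV content; the "smiles" header is an assumed column name,
# not one confirmed by the cmxflow documentation.
CSV_TEXT = "smiles,name\nCCO,ethanol\nc1ccccc1,benzene\n"


def read_smiles_column(text, column="smiles"):
    """Return the SMILES strings found in the given column of CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [row[column] for row in reader]


smiles = read_smiles_column(CSV_TEXT)
```

Each entry in `smiles` is one molecule in SMILES notation; other columns (here `name`) travel alongside it as per-molecule data.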

Sources

cmxflow.sources.reader.MoleculeSourceBlock(wrap: bool = True)

Bases: SourceBlock

Source block for reading molecules from various file formats.

Supports .sdf, .sdf.gz, .mol2, .smi, .smi.gz, .csv, and .parquet files. The file format is detected automatically from the file extension.
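
Extension-based detection has one subtlety: compound extensions such as .sdf.gz must be matched before the trailing .gz alone. The following is a minimal sketch of how that could work, not cmxflow's actual implementation; the `SUPPORTED` table, the `detect_format` function, and the format labels are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Extensions from the Supported Formats table, mapped to illustrative labels.
# These labels are assumptions, not identifiers used by cmxflow.
SUPPORTED = {
    ".sdf": "sdf",
    ".sdf.gz": "sdf",
    ".mol2": "mol2",
    ".smi": "smiles",
    ".smi.gz": "smiles",
    ".csv": "csv",
    ".parquet": "parquet",
}


def detect_format(path):
    """Return (format_label, is_gzipped) for a path, or raise ValueError."""
    suffixes = [s.lower() for s in Path(path).suffixes]
    # Try the two-part suffix first so ".sdf.gz" is not read as a bare ".gz".
    for n in (2, 1):
        if len(suffixes) >= n:
            ext = "".join(suffixes[-n:])
            if ext in SUPPORTED:
                return SUPPORTED[ext], ext.endswith(".gz")
    raise ValueError(f"Unsupported file extension: {path!r}")
```

With this sketch, `detect_format("ligands.sdf.gz")` yields the SD label with the gzip flag set, while a path such as `notes.txt` raises `ValueError`.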

Example
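from cmxflow.sources.reader import MoleculeSourceBlock
from cmxflow.sinks.writer import MoleculeSinkBlock

# `workflow` is an existing workflow object; its construction is not shown here.
workflow.add(
    MoleculeSourceBlock(),
    MoleculeSinkBlock(),
)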

Initialize the molecule source block.

Parameters:

Name   Type   Description                                                                                  Default
wrap   bool   If True (default), wrap molecules in Mol so that properties are preserved through pickling.   True

Sinks

cmxflow.sinks.writer.MoleculeSinkBlock()

Bases: SinkBlock

Sink block for writing molecules to various file formats.

Supports .sdf, .sdf.gz, .mol2, .smi, .smi.gz, .csv, and .parquet files. The file format is detected automatically from the file extension.

Example
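from cmxflow.sources.reader import MoleculeSourceBlock
from cmxflow.sinks.writer import MoleculeSinkBlock

# `workflow` is an existing workflow object; its construction is not shown here.
workflow.add(
    MoleculeSourceBlock(),
    MoleculeSinkBlock(),
)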

Initialize the molecule sink block.