Building An Agentic Cheminformatics Workflow Tool
Published:
One of my goals for 2026 is to figure out exactly how useful agentic systems can be for cheminformatics and drug discovery. To get started, I built chemscii to learn the basics of Model Context Protocol (MCP) server and Claude Code tool development. However, the chemscii tool itself was a toy example. In this post, we will start to explore building agentic tools that might be useful for answering real cheminformatics and drug discovery research questions.
MCP servers make it straightforward to deploy tools for agentic systems. One approach to cheminformatics tool development is to parameterize opinionated workflows and use agents to parse the inputs (e.g., ligand preparation with ligprep, ligand-based virtual screening with FastROCS, and docking with gnina). The advantage of this method is that we can enshrine best practices and democratize access to existing production software that was previously run only by experts. However, it is rigid and may not effectively handle new applications or problems that deviate from the norm. In addition, as an experienced cheminformatician, I find it underwhelming to watch an agent parse a user query and construct a CLI command when the prompt takes more characters to type than the command itself. Agentic coding tools are powerful not because they run a set of fixed scripts but because they can build and run their own. What if, instead of building plugins for legacy software, we built a toolbox of capabilities that agents could use to build their own workflows?
Agentic systems have demonstrated the ability to optimize workflows and outputs when given guardrails and feedback mechanisms. For example, a Ralph loop with unit tests to guide requirements and pre-commit hooks or continuous integration (CI) for style and type checking can deliver impressive results. However, when a workflow has a well-defined set of parameters and a clear objective, this approach is inefficient compared to the principled algorithms developed for black-box optimization. Consider a virtual screen (VS) to identify hits in early drug discovery. The data-driven approach to building a VS workflow would be to identify the parameters that could impact the outcome of the screen, construct a benchmark set of known active and inactive (or decoy) molecules, and optimize the protocol for enrichment. What if agents constructed workflows based on context and then passed the workflow to an optimization algorithm to select the best configuration of parameters?
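To make "optimize the protocol for enrichment" concrete, here is a minimal sketch of two common enrichment metrics computed from a ranked benchmark: ROC AUC and the enrichment factor in the top fraction of the ranking. The function names and the toy data are my own illustration; this is not necessarily how cmxflow scores a screen.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Probability that a random active outscores a random inactive (ties count 0.5)."""
    actives = [s for s, y in zip(scores, labels) if y]
    inactives = [s for s, y in zip(scores, labels) if not y]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive molecule")
    wins = 0.0
    for a in actives:
        for d in inactives:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


def enrichment_factor(
    scores: Sequence[float], labels: Sequence[bool], fraction: float = 0.1
) -> float:
    """Hit rate in the top-scoring fraction divided by the overall hit rate."""
    n_total = len(scores)
    n_hits = sum(1 for y in labels if y)
    if n_total == 0 or n_hits == 0:
        raise ValueError("need a non-empty benchmark with at least one active")
    n_top = max(1, round(n_total * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_hits = sum(1 for _, y in ranked[:n_top] if y)
    return (top_hits / n_top) / (n_hits / n_total)


# Toy benchmark: 10 molecules, 2 actives, scored by some similarity method.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [True, False, True, False, False, False, False, False, False, False]

auc = roc_auc(scores, labels)
ef20 = enrichment_factor(scores, labels, fraction=0.2)
print(f"ROC AUC = {auc:.4f}, EF(20%) = {ef20:.2f}")
```

Either number can serve as the objective of a black-box optimizer: each trial picks a configuration, scores the benchmark, and reports the metric.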
The result of exploring the above questions is cmxflow, a Python package for building, running, and optimizing arbitrarily composable cheminformatics workflows.
What It Does
While the main goal of this project was to explore building a general agentic tool for cheminformatics and drug discovery, the package is grounded in a programmatic API that can be used and tested directly in scripts and notebooks. From a development point of view, this allows us to extend the capabilities and associated unit tests in a familiar way, by following the schema enforced by the abstractions. MCP server integration is then as simple as exposing a new block and some context.
The core abstraction is a Block. Each block operates on an input stream of molecules and performs an operation, like standardization, substructure filtering, fingerprint similarity, conformer generation, docking, or clustering. Any number of blocks can be chained together in (almost) any order to construct a workflow: a SourceBlock reads molecules from a file, MoleculeBlock objects transform them, and a terminal block either writes the results (SinkBlock) or computes a score (ScoreBlock) to be used as a Bayesian optimization objective.
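The chaining idea can be sketched with plain Python generators. The class names below borrow the vocabulary from the paragraph above, but the method names, signatures, and the dictionary stand-in for a molecule are all assumptions made for illustration; they are not cmxflow's actual API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, Iterator, List

Molecule = Dict[str, Any]  # hypothetical stand-in for a real molecule object


class SourceBlock:
    """Starts a workflow by emitting molecules (an in-memory list here, not a file)."""

    def __init__(self, records: List[Molecule]) -> None:
        self.records = records

    def read(self) -> Iterator[Molecule]:
        for record in self.records:
            yield dict(record)  # copy so downstream blocks cannot mutate the input


class MoleculeBlock(ABC):
    """Transforms a stream of molecules into another stream of molecules."""

    @abstractmethod
    def process(self, molecules: Iterable[Molecule]) -> Iterator[Molecule]:
        ...


class HeavyAtomFilter(MoleculeBlock):
    """Toy filter: drops molecules above a heavy-atom count."""

    def __init__(self, max_heavy_atoms: int) -> None:
        self.max_heavy_atoms = max_heavy_atoms

    def process(self, molecules: Iterable[Molecule]) -> Iterator[Molecule]:
        for mol in molecules:
            if mol["heavy_atoms"] <= self.max_heavy_atoms:
                yield mol


class ActiveFractionScore:
    """Toy terminal block: consumes the stream and returns one number."""

    def score(self, molecules: Iterable[Molecule]) -> float:
        mols = list(molecules)
        if not mols:
            return 0.0
        return sum(1 for m in mols if m["active"]) / len(mols)


class Workflow:
    """Chains a source, any number of transforming blocks, and a terminal block."""

    def __init__(
        self,
        source: SourceBlock,
        blocks: List[MoleculeBlock],
        terminal: ActiveFractionScore,
    ) -> None:
        self.source = source
        self.blocks = blocks
        self.terminal = terminal

    def run(self) -> float:
        stream: Iterable[Molecule] = self.source.read()
        for block in self.blocks:
            stream = block.process(stream)  # lazy: nothing runs until consumed
        return self.terminal.score(stream)


records = [
    {"name": "a", "heavy_atoms": 12, "active": True},
    {"name": "b", "heavy_atoms": 40, "active": False},
    {"name": "c", "heavy_atoms": 20, "active": False},
]
workflow = Workflow(SourceBlock(records), [HeavyAtomFilter(25)], ActiveFractionScore())
result = workflow.run()
print(result)
```

Because every transforming block shares one interface, a new capability only needs a `process` method to become composable with everything else.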
Example Prompt With Claude Code
I need to build a ligand-based virtual screening workflow. I’m not sure if 2D or 3D is better. Can you optimize two workflows? I want to see the results of 2D first. The benchmark is in “benchmark.parquet” with hits labeled in the “active” column and the query is in “reference.sdf”.
⏺ Let me start by checking available blocks and existing workflows.
⏺ cmxflow - build_workflow (MCP)(action: "list_blocks")
⎿ {
"status": "success",
"blocks": {
… +37 lines (ctrl+o to expand)
⏺ cmxflow - manage_workflows (MCP)(action: "list")
⎿ {
"status": "success",
"message": "Registered workflows:",
… +6 lines (ctrl+o to expand)
⏺ There's an existing prep workflow but not what we need. Let me build the 2D similarity workflow first, then we'll do 3D after.
For virtual screening benchmarking, we'll use EnrichmentScoreBlock which should be maximized. Does that sound right to you before I proceed?
Let me build the 2D workflow now.
⏺ cmxflow - build_workflow (MCP)(action: "create")
⎿ {
"status": "success",
"message": "Created new workflow with MoleculeSourceBlock",
… +3 lines (ctrl+o to expand)
⏺ cmxflow - build_workflow (MCP)(action: "add_block", block_type: "MoleculeStandardizeBlock")
⎿ {
"status": "success",
"message": "Added MoleculeStandardize",
… +4 lines (ctrl+o to expand)
⏺ cmxflow - build_workflow (MCP)(action: "add_block", block_type: "MoleculeSimilarityBlock")
⎿ {
"status": "success",
"message": "Added Molecule2DSimilarity",
… +10 lines (ctrl+o to expand)
⏺ cmxflow - build_workflow (MCP)(action: "add_block", block_type: "EnrichmentScoreBlock", block_config:
{"property":"max_similarity","label_column":"active"})
⎿ {
"status": "success",
"message": "Added EnrichmentScoreBlock",
… +13 lines (ctrl+o to expand)
⏺ cmxflow - build_workflow (MCP)(action: "validate")
⎿ {
"status": "success",
"message": "Workflow is valid",
… +15 lines (ctrl+o to expand)
⏺ Now let me check the required inputs and set them, then get the optimizable parameters.
⏺ cmxflow - run_workflow (MCP)(action: "set_inputs", inputs: {"2.file@queries":"reference.sdf","3.text@target":"active"})
⎿ {
"status": "success",
"message": "Inputs set successfully"
}
⏺ cmxflow - build_workflow (MCP)(action: "get_params")
⎿ {
"status": "success",
"params": [
… +30 lines (ctrl+o to expand)
⏺ The workflow is validated and ready. Let me confirm the optimization setup with you:
The optimizer will tune these 2D similarity parameters:
- fingerprint_type: morgan, rdkit, maccs, atom_pair, topological_torsion
- similarity_metric: tanimoto, dice, cosine, sokal, russel
- radius: 1–4
- nbits: 512–4096
The enrichment score will be maximized. 30 trials is typical — does that work for you, or would you prefer more/fewer?
⏺ User answered Claude's questions:
⎿ · How many optimization trials for the 2D workflow? → 30 trials (Recommended)
⏺ Great, 30 trials it is. Let me kick off the 2D optimization.
---
🔬 2D Fingerprint Virtual Screening Optimization 🔬
MoleculeSource → MoleculeStandardize → Molecule2DSimilarity → EnrichmentScoreBlock
"The good thing about science is that it's true whether or not you believe in it." — Carl Sagan (close enough to chemistry!)
---
⏺ cmxflow - optimize_workflow (MCP)(action: "start", n_trials: 30, input_file: "benchmark.parquet", inputs:
{"2.file@queries":"reference.sdf","3.text@target":"active"}, direction: "maximize")
⎿ {
"status": "started",
"message": "Optimization started with 30 trials",
… +3 lines (ctrl+o to expand)
⏺ The 2D optimization is running (30 trials, maximizing enrichment AUC). I'll check the status when you're ready — just let me know when you'd like to see results, or I can check progress at any point.
While we wait, shall I go ahead and build the 3D workflow so it's ready to optimize next?
✻ Sautéed for 53s
❯ It should have been quick. How did the optimization turn out?
⏺ cmxflow - optimize_workflow (MCP)(action: "status")
⎿ {
"status": "completed",
"message": "Optimization completed successfully",
… +8 lines (ctrl+o to expand)
⏺ The 2D optimization is done. Here are the results:
Best Enrichment AUC: 0.65
┌───────────────────┬─────────────────────┐
│ Parameter │ Best Value │
├───────────────────┼─────────────────────┤
│ fingerprint_type │ topological_torsion │
├───────────────────┼─────────────────────┤
│ similarity_metric │ cosine │
├───────────────────┼─────────────────────┤
│ radius │ 1 │
├───────────────────┼─────────────────────┤
│ nbits │ 738 │
└───────────────────┴─────────────────────┘
Interesting — topological torsion fingerprints with cosine similarity came out on top over the more common Morgan/Tanimoto combo. A score of 0.65 is decent but leaves room for improvement.
Want me to set the best parameters and save this workflow, then build and optimize the 3D workflow to see if we can do better?
❯ yes, save it and build the 3D workflow
⏺ cmxflow - optimize_workflow (MCP)(action: "set_best_params")
⎿ {
"status": "success",
"message": "Best parameters applied to workflow",
… +7 lines (ctrl+o to expand)
⏺ User answered Claude's questions:
⎿ · What name would you like for the saved 2D workflow? → 2d-virtual-screen
⏺ cmxflow - manage_workflows (MCP)(action: "save", name: "2d-virtual-screen")
⎿ {
"status": "success",
"message": "Workflow registered as '2d-virtual-screen'. List registered workflows with the 'list' action."
}
⏺ Saved. Now let me build the 3D workflow.
⏺ cmxflow - build_workflow (MCP)(action: "create")
⎿ {
"status": "success",
"message": "Created new workflow with MoleculeSourceBlock",
… +3 lines (ctrl+o to expand)
⏺ cmxflow - build_workflow (MCP)(action: "add_block", block_type: "MoleculeStandardizeBlock")
⎿ {
"status": "success",
"message": "Added MoleculeStandardize",
… +4 lines (ctrl+o to expand)
⏺ cmxflow - build_workflow (MCP)(action: "add_block", block_type: "EnumerateStereoBlock")
⎿ {
"status": "success",
"message": "Added EnumerateStereochemistry",
… +5 lines (ctrl+o to expand)
⏺ cmxflow - build_workflow (MCP)(action: "add_block", block_type: "ConformerGenerationBlock")
⎿ {
"status": "success",
"message": "Added ConformerGeneration",
… +7 lines (ctrl+o to expand)
⏺ cmxflow - build_workflow (MCP)(action: "add_block", block_type: "MoleculeAlignBlock")
⎿ {
"status": "success",
"message": "Added MoleculeAlign",
… +12 lines (ctrl+o to expand)
⏺ cmxflow - build_workflow (MCP)(action: "add_block", block_type: "Molecule3DSimilarityBlock")
⎿ {
"status": "success",
"message": "Added Molecule3DSimilarity",
… +14 lines (ctrl+o to expand)
⏺ cmxflow - build_workflow (MCP)(action: "add_block", block_type: "EnrichmentScoreBlock", block_config:
{"property":"similarity_3d","label_column":"active"})
⎿ {
"status": "success",
"message": "Added EnrichmentScoreBlock",
… +17 lines (ctrl+o to expand)
⏺ Now let me make the conformer generation step parallel (it's the slow step), validate, and set inputs.
⏺ cmxflow - build_workflow (MCP)(action: "make_parallel", index: 3, block_config: {"max_workers":4})
⎿ {
"status": "success",
"message": "Parallelized ParallelConformerGeneration at index 3",
… +17 lines (ctrl+o to expand)
⏺ cmxflow - build_workflow (MCP)(action: "validate")
⎿ {
"status": "success",
"message": "Workflow is valid",
… +18 lines (ctrl+o to expand)
⏺ cmxflow - run_workflow (MCP)(action: "set_inputs", inputs:
{"4.file@query":"reference.sdf","5.file@query":"reference.sdf","6.text@target":"active"})
⎿ {
"status": "success",
"message": "Inputs set successfully"
}
⏺ Workflow is validated and inputs are set. Ready to optimize. The 3D workflow will be slower due to conformer generation, so 30 trials will take longer. Shall I start with 30 trials again?
---
🧬 3D Shape-Based Virtual Screening Optimization 🧬
MoleculeSource → MoleculeStandardize → EnumerateStereochemistry → ParallelConformerGeneration → MoleculeAlign → Molecule3DSimilarity →
EnrichmentScoreBlock
"The structure of a molecule is the key to understanding its properties." — Linus Pauling
---
⏺ cmxflow - optimize_workflow (MCP)(action: "start", n_trials: 30, input_file: "benchmark.parquet", inputs:
{"4.file@query":"reference.sdf","5.file@query":"reference.sdf","6.text@target":"active"}, direction:
"maximize")
⎿ {
"status": "started",
"message": "Optimization started with 30 trials",
… +3 lines (ctrl+o to expand)
⏺ The 3D optimization is running. This one will take longer due to conformer generation and 3D alignment at each trial. I'll check results when you ask — just let me know!
✻ Cooked for 57s
❯ check the 3D optimization status
⏺ cmxflow - optimize_workflow (MCP)(action: "status")
⎿ {
"status": "running",
"message": "Optimization in progress",
… +12 lines (ctrl+o to expand)
⏺ Still running — 17/30 trials completed so far.
Best score so far: 0.65 (same as the 2D result)
Current best parameters:
┌──────────────────┬───────────────┐
│ Parameter │ Value │
├──────────────────┼───────────────┤
│ numConfs │ 62 │
├──────────────────┼───────────────┤
│ pruneRmsThresh │ 0.017 │
├──────────────────┼───────────────┤
│ useRandomCoords │ false │
├──────────────────┼───────────────┤
│ alignment_method │ crippen_o3a │
├──────────────────┼───────────────┤
│ method │ shape_tversky │
├──────────────────┼───────────────┤
│ tversky_alpha │ 0.558 │
├──────────────────┼───────────────┤
│ tversky_beta │ 0.229 │
└──────────────────┴───────────────┘
Interesting that it's gravitating toward Tversky similarity (asymmetric) rather than standard shape Tanimoto. 13 trials left — it may still improve. Let me know when you'd like to check again.
❯ check again
⏺ cmxflow - optimize_workflow (MCP)(action: "status")
⎿ {
"status": "completed",
"message": "Optimization completed successfully",
… +11 lines (ctrl+o to expand)
⏺ The 3D optimization is done.
Best Enrichment AUC: 0.65 — same as 2D.
┌──────────────────┬───────────────┐
│ Parameter │ Best Value │
├──────────────────┼───────────────┤
│ numConfs │ 62 │
├──────────────────┼───────────────┤
│ pruneRmsThresh │ 0.017 │
├──────────────────┼───────────────┤
│ useRandomCoords │ false │
├──────────────────┼───────────────┤
│ alignment_method │ crippen_o3a │
├──────────────────┼───────────────┤
│ method │ shape_tversky │
├──────────────────┼───────────────┤
│ tversky_alpha │ 0.558 │
├──────────────────┼───────────────┤
│ tversky_beta │ 0.229 │
└──────────────────┴───────────────┘
Summary: Both 2D and 3D workflows achieved the same enrichment (0.65). The 2D workflow (topological torsion + cosine) is significantly faster since it skips conformer generation and alignment. For this target, the 2D approach gives you the same performance at a fraction of the computational cost.
Would you like me to:
1. Save the 3D workflow as well?
2. Set the best params and run either workflow on the full dataset?
What I Learned About Building Agentic Tools
This was the real point of the project. A few things stood out:
Composability is effective. The block-and-workflow abstraction isn’t just clean software design; it’s what makes the tool legible to an LLM. Claude can reason about a workflow (e.g., “conformer generation requires stereochemistry enumeration first”) much more naturally than it can reason about raw function calls with dozens of keyword arguments. Giving the agent a small vocabulary of well-defined building blocks turns out to be a pretty effective way to get reliable but diverse behavior. It also gives the agent multiple pathways to success.
Bayesian optimization is still relevant. Agents are amazing at parsing context and building complex workflows. But once a workflow is built, Bayesian parameter optimization tunes it efficiently without burning through tokens.
Background execution is worth the complexity. Bayesian optimization runs can take minutes or hours. The MCP server was designed to run optimization in a background thread so Claude can keep conversing while it runs, then check the status on demand. Getting the state management right (thread pool, global optimizer state) added real complexity, but the alternative, the agent just going silent for five minutes, would have made the tool painful to use.
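The start-then-poll pattern might look roughly like the sketch below. It is an illustration under stated assumptions, not the cmxflow server code: the class and method names are hypothetical, and plain random search stands in for the Bayesian optimizer so the example stays dependency-free.

```python
import random
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, Optional

Params = Dict[str, Any]


class BackgroundOptimizer:
    """Runs trials on a worker thread and exposes a thread-safe status snapshot."""

    def __init__(self) -> None:
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._lock = threading.Lock()
        self._future: Optional[Future] = None
        self._state: Dict[str, Any] = {"status": "idle"}

    def start(
        self,
        objective: Callable[[Params], float],
        sample: Callable[[], Params],
        n_trials: int,
    ) -> Dict[str, Any]:
        with self._lock:
            if self._state["status"] == "running":
                return {"status": "error", "message": "Optimization already running"}
            self._state = {
                "status": "running",
                "completed": 0,
                "n_trials": n_trials,
                "best_score": None,
                "best_params": None,
            }
            self._future = self._executor.submit(self._run, objective, sample, n_trials)
        return {"status": "started"}  # returns immediately; work continues in background

    def _run(
        self,
        objective: Callable[[Params], float],
        sample: Callable[[], Params],
        n_trials: int,
    ) -> None:
        try:
            for _ in range(n_trials):
                params = sample()          # random search stand-in for a Bayesian sampler
                score = objective(params)  # the slow part: run the workflow once
                with self._lock:
                    self._state["completed"] += 1
                    best = self._state["best_score"]
                    if best is None or score > best:  # maximize
                        self._state["best_score"] = score
                        self._state["best_params"] = params
            with self._lock:
                self._state["status"] = "completed"
        except Exception as exc:  # report failures through status instead of hiding them
            with self._lock:
                self._state["status"] = "failed"
                self._state["message"] = str(exc)

    def status(self) -> Dict[str, Any]:
        with self._lock:
            return dict(self._state)  # copy so callers never see a half-updated state

    def wait(self, timeout: Optional[float] = None) -> None:
        if self._future is not None:
            self._future.result(timeout=timeout)


rng = random.Random(0)
optimizer = BackgroundOptimizer()
started = optimizer.start(
    objective=lambda p: -((p["x"] - 3.0) ** 2),
    sample=lambda: {"x": rng.uniform(0.0, 10.0)},
    n_trials=20,
)
optimizer.wait()  # a real agent would poll status() between other turns instead
final = optimizer.status()
print(final["status"], final["completed"])
```

The lock around every read and write of the shared state is what lets a status call land mid-run and still report a consistent trial count and best score.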
CI guardrails are important. This echoes what I said in the chemscii post: having pytest, black, ruff, and mypy in CI from the start was helpful when developing a larger application. Claude Code writes a lot of code fast. It also introduces subtle bugs fast. The feedback loop (agent writes code, CI catches the issue, agent fixes it) worked well, but only because the guardrails were already in place. This larger project would have been a mess without them.
What’s Next
There are still a few capabilities to add and kinks to work out in this project. However, I believe that cmxflow may be useful for some cheminformatics and drug discovery applications. The next step will be to finish the package and share it with the community. Then you can decide for yourself. Stay tuned!
