Scores¶
Score blocks compute a single numeric value from a stream of molecules, used as the optimization objective during Bayesian optimization.
Note: Score blocks return a score only when called during Bayesian optimization; otherwise molecules pass through. This gives agents two ways to convert an optimized workflow into a runnable one: delete the score block and add a sink, or leave the score block in place and add a sink at the end.
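As a rough illustration of the note above, the toy class below models a block with two modes. It is not cmxflow's implementation: `ToyScoreBlock`, its `optimizing` flag, and the dict-based molecules are all hypothetical stand-ins, since this page does not describe how optimization mode is signalled.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Union

Molecule = Dict[str, float]  # hypothetical stand-in for a real molecule object


class ToyScoreBlock:
    """Toy model of a score block; not the cmxflow API."""

    def __init__(self, objective: Callable[[List[Molecule]], float]) -> None:
        self.objective = objective

    def __call__(
        self, molecules: Iterable[Molecule], optimizing: bool = False
    ) -> Union[float, Iterator[Molecule]]:
        if optimizing:
            # Optimization mode: consume the stream and return one number.
            return self.objective(list(molecules))
        # Normal mode: hand the molecules on to the next step (e.g. a sink).
        return iter(molecules)


mols = [{"logp": 1.0}, {"logp": 3.0}]
block = ToyScoreBlock(lambda ms: sum(m["logp"] for m in ms) / len(ms))

score = block(mols, optimizing=True)  # a single float
passed = list(block(mols))            # the molecules, passed through
```

Under this model, a workflow that ends in a score block yields nothing useful outside optimization until a sink is placed after it, which is why either of the two conversion paths works.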
Enrichment¶
cmxflow.scores.automatic.EnrichmentScoreBlock(pooler: Callable[[Iterator[Any]], pd.DataFrame] = mol_to_dataframe, metric: Callable[[np.ndarray, np.ndarray], float] = enrichment_auc, **kwargs: Any)
Bases: ScoreBlock
ScoreBlock for enrichment-based molecular scoring.
Uses molecule properties as features and computes enrichment AUC as the optimization metric. Non-numeric properties are automatically filtered.
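To make the metric signature above concrete, here is a minimal sketch of a function with the same `(scores, labels) -> float` shape. It assumes "enrichment AUC" means the area under the curve of hits recovered versus fraction of the ranked list screened; the library's own `enrichment_auc` may be defined differently, and `enrichment_auc_sketch` is a hypothetical name.

```python
from typing import Sequence


def enrichment_auc_sketch(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the hits-recovered vs. fraction-screened curve.

    Assumed definition for illustration; not the library's enrichment_auc.
    """
    n_total = len(labels)
    n_hits = sum(labels)
    if n_total == 0 or n_hits == 0:
        return 0.0  # assumption: with no hits there is nothing to enrich

    # Rank molecules from highest to lowest score (ties keep input order).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    area = 0.0
    recovered = 0.0        # fraction of all hits found so far
    step = 1.0 / n_total   # each molecule screens 1/n of the list
    for _, label in ranked:
        previous = recovered
        recovered += label / n_hits
        # Trapezoid between consecutive points on the enrichment curve.
        area += 0.5 * (previous + recovered) * step
    return area


perfect = enrichment_auc_sketch([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
inverted = enrichment_auc_sketch([0.9, 0.8, 0.2, 0.1], [0, 0, 1, 1])
```

With this definition a ranking that places every hit first scores higher than one that places them last, so maximizing it during optimization rewards feature combinations that separate hits from non-hits.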
Required Inputs
- target (text): Name of the binary label column (1 = hit, 0 = non-hit).
Output Properties
- workflow_score: Best enrichment score assigned during optimization.
Example
Initialize with scoring configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `pooler` | `Callable[[Iterator[Any]], DataFrame]` | Function to convert iterator to DataFrame. | `mol_to_dataframe` |
| `metric` | `Callable[[ndarray, ndarray], float]` | Function to compute metric from scores and labels. | `enrichment_auc` |
| `**kwargs` | `Any` | Keyword arguments passed to | `{}` |
Average Property¶
cmxflow.scores.automatic.AverageScoreBlock(pooler: Callable[[Iterator[Any]], pd.DataFrame] = mol_to_dataframe, **kwargs: Any)
Bases: ScoreBlock
ScoreBlock that computes average of a molecular property.
Uses the same pooler approach as EnrichmentScoreBlock to convert molecules to a DataFrame, then computes the mean of the specified property column.
Required Inputs
- property (text): Name of the numeric property column to average.
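The pool-then-average step described above can be sketched without the library. Here a list of dicts stands in for the pooled DataFrame, and `average_property` is a hypothetical helper; skipping molecules that lack the property is an assumption (it mirrors the default NaN-skipping behaviour of pandas `Series.mean`), not documented behaviour of this block.

```python
from statistics import fmean
from typing import Any, Dict, List


def average_property(records: List[Dict[str, Any]], prop: str) -> float:
    """Mean of one numeric property across pooled molecule records."""
    # Assumption: molecules without the property are skipped, not counted as 0.
    values = [rec[prop] for rec in records if rec.get(prop) is not None]
    if not values:
        raise ValueError(f"no molecule carries the property {prop!r}")
    return fmean(values)


# Hypothetical pooled records: one dict of properties per molecule.
pooled = [
    {"name": "mol_a", "logp": 1.5},
    {"name": "mol_b", "logp": 2.5},
    {"name": "mol_c"},  # property missing, skipped in this sketch
]
mean_logp = average_property(pooled, "logp")
```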
Example
Initialize with pooler configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `pooler` | `Callable[[Iterator[Any]], DataFrame]` | Function to convert molecule iterator to DataFrame. | `mol_to_dataframe` |
Shape Overlay¶
cmxflow.scores.shape.ShapeOverlayScoreBlock(**kwargs: Any)
Bases: ScoreBlock
ScoreBlock for shape overlay-based molecular scoring.
Computes the average maximum shape Tanimoto similarity between input molecules and reference ligands. Both input and reference molecules must have pre-existing 3D conformers.
The objective function is computed in two steps:
- For each input molecule, find the maximum shape Tanimoto similarity across all conformer pairs (input conformer x reference conformer).
- Return the average of these maximum similarities across all molecules.
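The two steps above reduce to a max over each molecule's conformer-pair similarities followed by a mean over molecules. The sketch below shows only that aggregation; it assumes the shape Tanimoto values have already been computed, and the nested-list layout and the name `average_max_similarity` are hypothetical.

```python
from typing import List

# sims[m][i][j]: shape Tanimoto between conformer i of input molecule m
# and reference conformer j (hypothetical, precomputed values).
SimilarityMatrix = List[List[float]]


def average_max_similarity(sims: List[SimilarityMatrix]) -> float:
    """Average over molecules of the best conformer-pair similarity."""
    if not sims:
        raise ValueError("at least one molecule is required")
    # Step 1: best similarity over all (input conformer, reference conformer) pairs.
    best_per_molecule = [max(max(row) for row in matrix) for matrix in sims]
    # Step 2: mean of those maxima across molecules.
    return sum(best_per_molecule) / len(best_per_molecule)


example = average_max_similarity([
    [[0.2, 0.5], [0.7, 0.1]],  # molecule with two conformers, best pair 0.7
    [[0.3, 0.4]],              # molecule with one conformer, best pair 0.4
])
```

Taking the maximum first means a molecule is credited for its single best-matching pose, so extra poorly matching conformers do not lower the score.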
Required Inputs
- query (file): Path to reference ligand file with 3D conformers.
Output Properties
- shape_overlay_score: Maximum shape Tanimoto similarity to any reference conformer.
- shape_overlay_reference: Name of the most similar reference molecule.
Example
Initialize the shape overlay score block.
Cluster Quality¶
cmxflow.scores.cluster.ClusterScoreBlock()
Bases: ScoreBlock
Score clustering quality from RepresentativeClusterBlock.
Computes mean intra-cluster similarity (excluding singletons) minus the fraction of singleton molecules. Designed to be used with RepresentativeClusterBlock upstream.
Score formula
score = mean_similarity - (n_single / n_molecules)
Both terms are in [0, 1], so the score range is [-1, 1].
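A short worked instance of the formula, as a sketch: `mean_similarity` is taken as already computed, since this page does not specify how intra-cluster similarity is averaged, and `cluster_score` with its cluster-size input is a hypothetical helper rather than part of the library.

```python
from typing import Sequence


def cluster_score(mean_similarity: float, cluster_sizes: Sequence[int]) -> float:
    """score = mean_similarity - (n_single / n_molecules)."""
    n_molecules = sum(cluster_sizes)
    if n_molecules == 0:
        raise ValueError("at least one molecule is required")
    # Singletons are clusters that contain exactly one molecule.
    n_single = sum(1 for size in cluster_sizes if size == 1)
    return mean_similarity - n_single / n_molecules


# Ten molecules in five clusters, three of them singletons: 0.8 - 3/10.
example = cluster_score(0.8, [4, 3, 1, 1, 1])

# Lower bound: every molecule is a singleton. Assumes mean_similarity
# is 0 when there are no multi-member clusters to average over.
all_singletons = cluster_score(0.0, [1, 1, 1])
```

The singleton penalty keeps the optimizer from reaching high mean similarity by shattering the dataset into tiny, trivially tight clusters.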
Initialize the cluster score block.