Medchem Rules and Filters Catalog

Comprehensive catalog of all available medicinal chemistry rules, structural alerts, and filters in medchem.

Drug-Likeness Rules
Lead-Likeness Rules
Fragment Rules
CNS Rules
Structural Alert Filters
Chemical Group Patterns
Filter Selection Guidelines
Important Considerations
References

Drug-Likeness Rules

Rule of Five (Lipinski)

Reference: Lipinski et al., Adv Drug Deliv Rev (1997) 23:3-25

Purpose: Predict oral bioavailability

Criteria:
- Molecular Weight ≤ 500 Da
- LogP ≤ 5
- Hydrogen Bond Donors ≤ 5
- Hydrogen Bond Acceptors ≤ 10

Usage:

import medchem as mc  # alias used by every snippet in this catalog
mc.rules.basic_rules.rule_of_five(mol)

Notes:
- One of the most widely used filters in drug discovery
- About 90% of orally active drugs comply with these rules
- Exceptions exist, especially for natural products and antibiotics
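To make the four thresholds concrete, here is a minimal sketch in plain Python that reports which criteria a compound exceeds. It does not call medchem: the `Descriptors` class and `ro5_violations` function are hypothetical names, and the descriptor values are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties; how they are calculated is left to the caller."""
    mw: float    # molecular weight in Da
    logp: float  # calculated LogP
    hbd: int     # hydrogen bond donors
    hba: int     # hydrogen bond acceptors


def ro5_violations(d: Descriptors) -> list[str]:
    """Return the Rule of Five criteria (as listed above) that `d` exceeds."""
    checks = {
        "MW <= 500": d.mw <= 500,
        "LogP <= 5": d.logp <= 5,
        "HBD <= 5": d.hbd <= 5,
        "HBA <= 10": d.hba <= 10,
    }
    return [name for name, ok in checks.items() if not ok]


# Illustrative values only, not measured data
small = Descriptors(mw=320.4, logp=2.1, hbd=2, hba=5)
large = Descriptors(mw=812.0, logp=6.3, hbd=4, hba=12)

print(ro5_violations(small))  # empty list: all four criteria met
print(ro5_violations(large))  # MW, LogP and HBA limits exceeded
```

Returning the failed criteria instead of a single boolean makes it easier to see why a natural product or antibiotic was flagged.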

Rule of Veber

Reference: Veber et al., J Med Chem (2002) 45:2615-2623

Purpose: Additional criteria for oral bioavailability

Criteria:
- Rotatable Bonds ≤ 10
- Topological Polar Surface Area (TPSA) ≤ 140 Ų

Usage:

mc.rules.basic_rules.rule_of_veber(mol)

Notes:
- Complements Rule of Five
- TPSA correlates with cell permeability
- Rotatable bonds affect molecular flexibility

Rule of Drug

Purpose: Combined drug-likeness assessment

Criteria:
- Passes Rule of Five
- Passes Veber rules
- Does not contain PAINS substructures

Usage:

mc.rules.basic_rules.rule_of_drug(mol)
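The combined rule is a logical AND of three independent checks. The sketch below shows that composition in plain Python on precomputed properties. It is an illustration of the logic, not medchem's implementation: the function names, the property keys, and the `pains_hits` field (assumed to hold the result of a PAINS search run elsewhere) are all hypothetical.

```python
from typing import Callable, Mapping

Props = Mapping[str, float]
Rule = Callable[[Props], bool]


def passes_ro5(p: Props) -> bool:
    return p["mw"] <= 500 and p["logp"] <= 5 and p["hbd"] <= 5 and p["hba"] <= 10


def passes_veber(p: Props) -> bool:
    return p["rotatable_bonds"] <= 10 and p["tpsa"] <= 140


def no_pains(p: Props) -> bool:
    # Assumes a PAINS substructure search was run elsewhere and its
    # match count stored under the hypothetical key "pains_hits".
    return p["pains_hits"] == 0


def all_of(*rules: Rule) -> Rule:
    """Combine rules so the result passes only if every rule passes."""
    return lambda p: all(rule(p) for rule in rules)


passes_drug_rule = all_of(passes_ro5, passes_veber, no_pains)

# Illustrative values only
candidate = {"mw": 410.5, "logp": 3.2, "hbd": 2, "hba": 6,
             "rotatable_bonds": 7, "tpsa": 95.0, "pains_hits": 0}
flexible = {**candidate, "rotatable_bonds": 14}  # breaks the Veber limit

print(passes_drug_rule(candidate))
print(passes_drug_rule(flexible))
```

Keeping each rule as its own function means a failing compound can be re-run against the individual checks to find the cause.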

REOS (Rapid Elimination Of Swill)

Reference: Walters & Murcko, Adv Drug Deliv Rev (2002) 54:255-271

Purpose: Filter out compounds unlikely to be drugs

Criteria:
- Molecular Weight: 200-500 Da
- LogP: -5 to 5
- Hydrogen Bond Donors: 0-5
- Hydrogen Bond Acceptors: 0-10

Usage:

mc.rules.basic_rules.rule_of_reos(mol)
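Unlike the Rule of Five, the REOS criteria listed above are two-sided ranges, so very small molecules are rejected as well as very large ones. A minimal sketch of a range-table check follows. The `REOS_RANGES` table copies only the four ranges given in this catalog, and `out_of_range` is a hypothetical helper, not a medchem function.

```python
# (lower, upper) bounds, inclusive, as listed in the criteria above
REOS_RANGES = {
    "mw": (200, 500),
    "logp": (-5, 5),
    "hbd": (0, 5),
    "hba": (0, 10),
}


def out_of_range(props: dict, ranges: dict) -> dict:
    """Map each failing property to (value, lower, upper)."""
    failures = {}
    for name, (lower, upper) in ranges.items():
        value = props[name]
        if not lower <= value <= upper:
            failures[name] = (value, lower, upper)
    return failures


# Illustrative values for a very small, polar molecule
tiny = {"mw": 151.2, "logp": 0.5, "hbd": 2, "hba": 2}
print(out_of_range(tiny, REOS_RANGES))  # only the MW lower bound is violated
```

Because the ranges live in a plain dictionary, the same helper can be reused with a different table for other range-based rules.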

Golden Triangle

Reference: Johnson et al., Bioorg Med Chem Lett (2009) 19:5560-5564

Purpose: Balance lipophilicity and molecular weight

Criteria:
- 200 ≤ MW ≤ 50 × LogP + 400
- LogP: -2 to 5

Usage:

mc.rules.basic_rules.golden_triangle(mol)

Notes:
- Defines a favorable region of physicochemical space
- The region forms a triangle on a plot of MW against lipophilicity
- The original analysis uses logD at pH 7.4 rather than LogP

Lead-Likeness Rules

Rule of Oprea

Reference: Oprea et al., J Chem Inf Comput Sci (2001) 41:1308-1315

Purpose: Identify lead-like compounds for optimization

Criteria:
- Molecular Weight: 200-350 Da
- LogP: -2 to 4
- Rotatable Bonds ≤ 7
- Number of Rings ≤ 4

Usage:

mc.rules.basic_rules.rule_of_oprea(mol)

Rationale: Lead compounds should have "room to grow" during optimization

Rule of Leadlike (Soft)

Purpose: Permissive lead-like criteria

Criteria:
- Molecular Weight: 250-450 Da
- LogP: -3 to 4
- Rotatable Bonds ≤ 10

Usage:

mc.rules.basic_rules.rule_of_leadlike_soft(mol)

Rule of Leadlike (Strict)

Purpose: Restrictive lead-like criteria

Criteria:
- Molecular Weight: 200-350 Da
- LogP: -2 to 3.5
- Rotatable Bonds ≤ 7
- Number of Rings: 1-3

Usage:

mc.rules.basic_rules.rule_of_leadlike_strict(mol)
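The soft and strict criteria listed above use different molecular weight windows (250-450 Da and 200-350 Da), so neither set of passing compounds contains the other. The sketch below applies both sets of thresholds, copied from this catalog, to three made-up compounds. The function names and property keys are hypothetical and do not come from medchem.

```python
def leadlike_soft(c: dict) -> bool:
    """Soft criteria as listed above: MW 250-450, LogP -3 to 4, rotatable bonds <= 10."""
    return (250 <= c["mw"] <= 450
            and -3 <= c["logp"] <= 4
            and c["rotatable_bonds"] <= 10)


def leadlike_strict(c: dict) -> bool:
    """Strict criteria as listed above: MW 200-350, LogP -2 to 3.5,
    rotatable bonds <= 7, 1-3 rings."""
    return (200 <= c["mw"] <= 350
            and -2 <= c["logp"] <= 3.5
            and c["rotatable_bonds"] <= 7
            and 1 <= c["rings"] <= 3)


# Made-up descriptor values chosen to land in different regions
compounds = {
    "cpd_1": {"mw": 300.0, "logp": 2.0, "rotatable_bonds": 4, "rings": 2},
    "cpd_2": {"mw": 420.0, "logp": 3.8, "rotatable_bonds": 9, "rings": 3},
    "cpd_3": {"mw": 220.0, "logp": 1.0, "rotatable_bonds": 2, "rings": 1},
}

for name, c in compounds.items():
    print(f"{name}: soft={leadlike_soft(c)} strict={leadlike_strict(c)}")
```

Here `cpd_3` passes the strict rule but not the soft one because its molecular weight is below the soft lower bound of 250 Da.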

Fragment Rules

Rule of Three

Reference: Congreve et al., Drug Discov Today (2003) 8:876-877

Purpose: Screen fragment libraries for fragment-based drug discovery

Criteria:
- Molecular Weight ≤ 300 Da
- LogP ≤ 3
- Hydrogen Bond Donors ≤ 3
- Hydrogen Bond Acceptors ≤ 3
- Rotatable Bonds ≤ 3
- Polar Surface Area ≤ 60 Ų

Usage:

mc.rules.basic_rules.rule_of_three(mol)

Notes:
- Fragments are grown into leads during optimization
- Lower complexity allows more starting points
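As a rough illustration of screening a fragment library, the sketch below applies the six upper limits listed above to a small dictionary of fragments. The library, its descriptor values, and the `passes_ro3` helper are invented for the example; in practice the descriptors would come from a cheminformatics toolkit.

```python
# Upper limits from the criteria above; every check is "value <= limit"
RO3_LIMITS = {
    "mw": 300,
    "logp": 3,
    "hbd": 3,
    "hba": 3,
    "rotatable_bonds": 3,
    "psa": 60,
}


def passes_ro3(props: dict) -> bool:
    """True when every property is at or below its Rule of Three limit."""
    return all(props[name] <= limit for name, limit in RO3_LIMITS.items())


# Hypothetical fragment library with made-up descriptor values
library = {
    "frag_a": {"mw": 162.2, "logp": 1.4, "hbd": 1, "hba": 2,
               "rotatable_bonds": 1, "psa": 41.0},
    "frag_b": {"mw": 288.3, "logp": 2.9, "hbd": 2, "hba": 5,
               "rotatable_bonds": 3, "psa": 58.0},  # too many acceptors
    "frag_c": {"mw": 310.4, "logp": 2.0, "hbd": 1, "hba": 3,
               "rotatable_bonds": 2, "psa": 35.0},  # too heavy
}

kept = [name for name, props in library.items() if passes_ro3(props)]
print(kept)
```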

CNS Rules

Rule of CNS

Purpose: Central nervous system drug-likeness

Criteria:
- Molecular Weight ≤ 450 Da
- LogP: -1 to 5
- Hydrogen Bond Donors ≤ 2
- TPSA ≤ 90 Ų

Usage:

mc.rules.basic_rules.rule_of_cns(mol)

Rationale:
- Blood-brain barrier penetration requires specific properties
- Lower TPSA and HBD count improve BBB permeability
- Tight constraints reflect CNS challenges

Structural Alert Filters

PAINS (Pan Assay INterference compoundS)

Reference: Baell & Holloway, J Med Chem (2010) 53:2719-2740

Purpose: Identify compounds that interfere with assays

Categories:
- Catechols
- Quinones
- Rhodanines
- Hydroxyphenylhydrazones
- Alkyl/aryl aldehydes
- Michael acceptors (specific patterns)

Usage:

mc.rules.basic_rules.pains_filter(mol)
# Returns True if NO PAINS found

Notes:
- PAINS compounds show activity in multiple assays through non-specific mechanisms
- Common false positives in screening campaigns
- Should be deprioritized in lead selection

Common Alerts Filters

Source: Derived from ChEMBL curation and medicinal chemistry literature

Purpose: Flag common problematic structural patterns

Alert Categories:

1. Reactive Groups
   - Epoxides
   - Aziridines
   - Acid halides
   - Isocyanates
2. Metabolic Liabilities
   - Hydrazines
   - Thioureas
   - Anilines (certain patterns)
3. Aggregators
   - Polyaromatic systems
   - Long aliphatic chains
4. Toxicophores
   - Nitro aromatics
   - Aromatic N-oxides
   - Certain heterocycles

Usage:

alert_filter = mc.structural.CommonAlertsFilters()
has_alerts, details = alert_filter.check_mol(mol)

Return Format:

{
    "has_alerts": True,
    "alert_details": ["reactive_epoxide", "metabolic_hydrazine"],
    "num_alerts": 2
}
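When screening many molecules it is often useful to know which alert categories dominate. The sketch below tallies categories across a list of results, assuming each result is a dictionary in the shape shown above and that the text before the first underscore in an alert name identifies its category. Both assumptions are inferred from the example names and should be checked against the actual output; the sample data are made up.

```python
from collections import Counter

# Results in the shape shown above; the values are made up for illustration
results = [
    {"has_alerts": True,
     "alert_details": ["reactive_epoxide", "metabolic_hydrazine"],
     "num_alerts": 2},
    {"has_alerts": False, "alert_details": [], "num_alerts": 0},
    {"has_alerts": True,
     "alert_details": ["reactive_aziridine"],
     "num_alerts": 1},
]


def tally_categories(results: list) -> Counter:
    """Count alerts per category, using the prefix before the first '_'."""
    counts = Counter()
    for result in results:
        for alert in result["alert_details"]:
            category, _, _ = alert.partition("_")
            counts[category] += 1
    return counts


counts = tally_categories(results)
flagged = sum(1 for r in results if r["has_alerts"])

print(counts.most_common())
print(f"{flagged} of {len(results)} molecules flagged")
```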

NIBR Filters

Source: Novartis Institutes for BioMedical Research

Purpose: Industrial medicinal chemistry filtering rules

Features:
- Proprietary filter set developed from Novartis experience
- Balances drug-likeness with practical medicinal chemistry
- Includes both structural alerts and property filters

Usage:

nibr_filter = mc.structural.NIBRFilters()
results = nibr_filter(mols=mol_list, n_jobs=-1)

Return Format: Boolean list (True = passes)

Lilly Demerits Filter

Reference: Based on Eli Lilly medicinal chemistry rules

Source: 275 structural patterns accumulated over 18 years

Purpose: Identify assay interference and problematic functionalities

Mechanism:
- Each matched pattern adds demerits
- Molecules with >100 demerits are rejected
- Some patterns add 10-50 demerits, others add 100+ (instant rejection)

Demerit Categories:

High Demerits (>50):
- Known toxic groups
- Highly reactive functionalities
- Strong metal chelators

Medium Demerits (20-50):
- Metabolic liabilities
- Aggregation-prone structures
- Frequent hitters

Low Demerits (5-20):
- Minor concerns
- Context-dependent issues

Usage:

lilly_filter = mc.structural.LillyDemeritsFilters()
results = lilly_filter(mols=mol_list, n_jobs=-1)

Return Format:

{
    "demerits": 35,
    "passes": True,  # (demerits ≤ 100)
    "matched_patterns": [
        {"pattern": "phenolic_ester", "demerits": 20},
        {"pattern": "aniline_derivative", "demerits": 15}
    ]
}
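The demerit mechanism described above amounts to summing per-pattern penalties and comparing the total with the rejection threshold. A minimal sketch, assuming the pattern matching has already been done: the `DEMERITS` table is hypothetical (two entries reuse the names and values from the return format above, the rest are invented), and `score` is not a medchem function.

```python
REJECT_ABOVE = 100  # molecules with more than 100 demerits are rejected

# Hypothetical pattern names and demerit values, for illustration only
DEMERITS = {
    "phenolic_ester": 20,
    "aniline_derivative": 15,
    "long_alkyl_chain": 40,
    "michael_acceptor": 50,
    "acyl_halide": 150,  # large enough to reject on its own
}


def score(matched_patterns: list) -> dict:
    """Sum demerits for the matched patterns and apply the threshold."""
    total = sum(DEMERITS[name] for name in matched_patterns)
    return {
        "demerits": total,
        "passes": total <= REJECT_ABOVE,
        "matched_patterns": [
            {"pattern": name, "demerits": DEMERITS[name]}
            for name in matched_patterns
        ],
    }


mild = score(["phenolic_ester", "aniline_derivative"])
stacked = score(["long_alkyl_chain", "michael_acceptor", "aniline_derivative"])
reactive = score(["acyl_halide"])

for label, result in [("mild", mild), ("stacked", stacked), ("reactive", reactive)]:
    print(label, result["demerits"], result["passes"])
```

The `stacked` case shows that several moderate penalties can reject a molecule even when no single pattern would.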

Chemical Group Patterns

Hinge Binders

Purpose: Identify kinase hinge-binding motifs

Common Patterns:
- Aminopyridines
- Aminopyrimidines
- Indazoles
- Benzimidazoles

Usage:

group = mc.groups.ChemicalGroup(groups=["hinge_binders"])
has_hinge = group.has_match(mol_list)

Application: Kinase inhibitor design

Phosphate Binders

Purpose: Identify phosphate-binding groups

Common Patterns:
- Basic amines in specific geometries
- Guanidinium groups
- Arginine mimetics

Usage:

group = mc.groups.ChemicalGroup(groups=["phosphate_binders"])

Application: Kinase inhibitors, phosphatase inhibitors

Michael Acceptors

Purpose: Identify electrophilic Michael acceptor groups

Common Patterns:
- α,β-Unsaturated carbonyls
- α,β-Unsaturated nitriles
- Vinyl sulfones
- Acrylamides

Usage:

group = mc.groups.ChemicalGroup(groups=["michael_acceptors"])

Notes:
- Can be desirable for covalent inhibitors
- Often flagged as reactive alerts in screening

Reactive Groups

Purpose: Identify generally reactive functionalities

Common Patterns:
- Epoxides
- Aziridines
- Acyl halides
- Isocyanates
- Sulfonyl chlorides

Usage:

group = mc.groups.ChemicalGroup(groups=["reactive_groups"])

Custom SMARTS Patterns

Define custom structural patterns using SMARTS:

custom_patterns = {
    "my_warhead": "[C;H0](=O)C(F)(F)F",  # Trifluoromethyl ketone
    "my_scaffold": "c1ccc2c(c1)ncc(n2)N",  # 2-Aminoquinoxaline (fused six-membered diazine ring)
}

group = mc.groups.ChemicalGroup(
    groups=["hinge_binders"],
    custom_smarts=custom_patterns
)

Filter Selection Guidelines

Initial Screening (High-Throughput)

Recommended filters:
- Rule of Five
- PAINS filter
- Common Alerts (permissive settings)

rfilter = mc.rules.RuleFilters(rule_list=["rule_of_five", "pains_filter"])
alert_filter = mc.structural.CommonAlertsFilters()

Hit-to-Lead

Recommended filters:
- Rule of Oprea or Leadlike (soft)
- NIBR filters
- Lilly Demerits

rfilter = mc.rules.RuleFilters(rule_list=["rule_of_oprea"])
nibr_filter = mc.structural.NIBRFilters()
lilly_filter = mc.structural.LillyDemeritsFilters()

Lead Optimization

Recommended filters:
- Rule of Drug
- Leadlike (strict)
- Full structural alert analysis
- Complexity filters

rfilter = mc.rules.RuleFilters(rule_list=["rule_of_drug", "rule_of_leadlike_strict"])
alert_filter = mc.structural.CommonAlertsFilters()
complexity_filter = mc.complexity.ComplexityFilter(max_complexity=400)

CNS Targets

Recommended filters:
- Rule of CNS
- Reduced PAINS criteria (CNS-focused)
- BBB permeability constraints

rfilter = mc.rules.RuleFilters(rule_list=["rule_of_cns"])
constraints = mc.constraints.Constraints(
    tpsa_max=90,
    hbd_max=2,
    mw_range=(300, 450)
)

Fragment-Based Drug Discovery

Recommended filters:
- Rule of Three
- Minimal complexity
- Basic reactive group check

rfilter = mc.rules.RuleFilters(rule_list=["rule_of_three"])
complexity_filter = mc.complexity.ComplexityFilter(max_complexity=250)
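The stage-specific recommendations above can be kept in one lookup table so a pipeline selects its rule list by name. The sketch below is one possible arrangement: the stage keys and the `rules_for` helper are hypothetical, while the rule names are the ones passed to `rule_list` in the snippets above.

```python
# Stage keys are invented for this sketch; rule names mirror the snippets above
STAGE_RULES = {
    "initial_screening": ["rule_of_five", "pains_filter"],
    "hit_to_lead": ["rule_of_oprea"],
    "lead_optimization": ["rule_of_drug", "rule_of_leadlike_strict"],
    "cns": ["rule_of_cns"],
    "fragment": ["rule_of_three"],
}


def rules_for(stage: str) -> list:
    """Return a copy of the rule list recommended for `stage`."""
    try:
        return list(STAGE_RULES[stage])
    except KeyError:
        known = ", ".join(sorted(STAGE_RULES))
        raise ValueError(
            f"unknown stage {stage!r}; expected one of: {known}"
        ) from None


print(rules_for("hit_to_lead"))
print(rules_for("lead_optimization"))
```

The returned list could then be supplied as the `rule_list` argument used throughout this section, for example `mc.rules.RuleFilters(rule_list=rules_for("fragment"))`.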

Important Considerations

False Positives and False Negatives

Filters are guidelines, not absolutes:

False Positives (good drugs flagged):
- ~10% of marketed drugs fail Rule of Five
- Natural products often violate standard rules
- Prodrugs intentionally break rules
- Antibiotics and antivirals are frequently non-compliant

False Negatives (bad compounds passing):
- Passing filters doesn't guarantee success
- Target-specific issues are not captured
- In vivo properties are not fully predicted

Context-Specific Application

Different contexts require different criteria:

Target Class: Kinases vs GPCRs vs ion channels have different optimal spaces
Modality: Small molecules vs PROTACs vs molecular glues
Administration Route: Oral vs IV vs topical
Disease Area: CNS vs oncology vs infectious disease
Stage: Screening vs hit-to-lead vs lead optimization

Complementing with Machine Learning

Modern approaches combine rules with ML:

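import medchem as mc

# Placeholders supplied by the caller: `mols` (list of molecules),
# `ml_model` (a fitted model exposing predict) and `threshold` (score cutoff).

# Rule-based pre-filtering
rule_results = mc.rules.RuleFilters(rule_list=["rule_of_five"])(mols)
filtered_mols = [mol for mol, r in zip(mols, rule_results) if r["passes"]]

# ML model scoring on the filtered set only
ml_scores = ml_model.predict(filtered_mols)

# Combined decision: keep molecules that pass the rules and score above the cutoff
final_candidates = [
    mol for mol, score in zip(filtered_mols, ml_scores)
    if score > threshold
]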

References

Lipinski CA et al. Adv Drug Deliv Rev (1997) 23:3-25
Veber DF et al. J Med Chem (2002) 45:2615-2623
Oprea TI et al. J Chem Inf Comput Sci (2001) 41:1308-1315
Congreve M et al. Drug Discov Today (2003) 8:876-877
Baell JB & Holloway GA. J Med Chem (2010) 53:2719-2740
Johnson TW et al. Bioorg Med Chem Lett (2009) 19:5560-5564
Walters WP & Murcko MA. Adv Drug Deliv Rev (2002) 54:255-271
Hann MM & Oprea TI. Curr Opin Chem Biol (2004) 8:255-263
Rishton GM. Drug Discov Today (1997) 2:382-384
