ChemFTO

Turn patent PDFs into molecular data you can actually use.

Patent documents flowing into AI-driven chemical intelligence dashboards
Current benchmark signal: live local data

What makes this different

    What the local artifacts already show

    A proof page first. A broader benchmark next.

    This page uses real outputs already generated inside the ChemFTO workspace. Nothing below is a placeholder score. Every figure comes from local benchmark files, curated PDFs, or a recent live scheduler run.

    Anchor patent

    -

    Structure-heavy extraction run used as the first public benchmark anchor.

    Structured rows

    -

    Rows emitted into the integrated analysis table for the anchor patent.

    Canonical SMILES coverage

    -

    Rows with canonical SMILES populated in the current anchor run.

    Conflict review rows

    -

    Rows deliberately routed for review instead of being hidden.

    Current benchmark corpus

    -

    A curated chemistry-heavy PDF set already lives inside the ChemFTO repo. It is the basis for the first externally presentable benchmark program.
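The row, coverage, and review figures above are counts over the analysis table. A minimal sketch of how they could be derived follows. `ExtractedRow`, its field names, and `summarize` are hypothetical illustrations rather than ChemFTO's actual schema, and the sample rows are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedRow:
    """Hypothetical shape of one row in the integrated analysis table."""
    compound_id: str
    canonical_smiles: Optional[str]  # None or "" when no structure was resolved
    source_page: int                 # provenance: page the row was extracted from
    needs_review: bool = False       # True when routed to the conflict review queue


def summarize(rows: list[ExtractedRow]) -> dict:
    """Count structured rows, SMILES coverage, and conflict review rows."""
    total = len(rows)
    with_smiles = sum(1 for r in rows if r.canonical_smiles)
    review = sum(1 for r in rows if r.needs_review)
    return {
        "structured_rows": total,
        "rows_with_smiles": with_smiles,
        "smiles_coverage": with_smiles / total if total else 0.0,
        "conflict_review_rows": review,
    }


# Illustrative rows only, not real benchmark data.
rows = [
    ExtractedRow("ex-1", "CCO", 12),
    ExtractedRow("ex-2", None, 14, needs_review=True),
    ExtractedRow("ex-3", "c1ccccc1", 15),
    ExtractedRow("ex-4", "", 15, needs_review=True),
]
summary = summarize(rows)
```

Keeping `needs_review` on the row itself, rather than filtering such rows out, matches the stated intent of routing conflicts for review instead of hiding them.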

    Workflow

    Search first. Filter fast. Hand off only what deserves deep analysis.

    The product story should not start with a generic “AI platform” claim. It should start with a disciplined pipeline that turns raw patent noise into traceable chemistry output.

    01

    Search and collect

    Target-driven patent retrieval, publication deduplication, family consolidation, and source-level logging across the enabled search stack.

    02

    Filter for relevance

    Nanobot-assisted screening separates keep, target-pending, and exclude candidates before expensive downstream work begins.

    03

    Extract structures and evidence

    Stage05 turns selected patents into structured molecular rows, provenance-rich outputs, and review queues instead of raw screenshots.
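The first two steps above can be sketched as follows. This is an illustration under stated assumptions, not the ChemFTO implementation: the `Hit` fields, the source names, and the publication numbers are made up, and the threshold-based `triage` function is only a stand-in for the Nanobot-assisted screening, whose actual logic is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical search hit; all field names are assumptions."""
    publication_number: str
    family_id: str
    source: str
    relevance: float  # assumed screening score in [0, 1]


def dedupe_publications(hits):
    """Keep the first hit per publication number and log every source that returned it."""
    first_seen = {}
    sources = defaultdict(set)
    for hit in hits:
        key = hit.publication_number.replace(" ", "").upper()
        first_seen.setdefault(key, hit)
        sources[key].add(hit.source)
    return list(first_seen.values()), dict(sources)


def consolidate_families(hits):
    """Group deduplicated publications by patent family."""
    families = defaultdict(list)
    for hit in hits:
        families[hit.family_id].append(hit)
    return dict(families)


def triage(family_hits, keep_at=0.7, exclude_below=0.3):
    """Three-way decision based on the best-scoring family member (assumed thresholds)."""
    best = max(h.relevance for h in family_hits)
    if best >= keep_at:
        return "keep"
    if best < exclude_below:
        return "exclude"
    return "target-pending"


# Made-up hits for illustration only.
hits = [
    Hit("US 1234567 B2", "F1", "source_a", 0.82),
    Hit("US1234567B2", "F1", "source_b", 0.80),
    Hit("EP7654321A1", "F1", "source_a", 0.40),
    Hit("WO2020000001A1", "F2", "source_a", 0.50),
    Hit("CN100000000A", "F3", "source_b", 0.10),
]
unique, sources = dedupe_publications(hits)
families = consolidate_families(unique)
decisions = {fid: triage(members) for fid, members in families.items()}
```

Deciding per family rather than per publication means only one representative of each family needs to reach the more expensive extraction stage.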

    Recent live pipeline snapshot

    -

    -

    Snapshots like this are why the product needs a benchmark page. The pipeline already produces measurable stage-by-stage outputs, even when a given search ends with zero keep candidates.
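A stage-by-stage snapshot of this kind can be reported even when nothing survives screening. The sketch below assumes hypothetical stage names and a list of per-family screening decisions. The counts in the example are invented and are not taken from any real scheduler run.

```python
from collections import Counter


def funnel_snapshot(stage_counts, decisions):
    """Summarize a search run, including the case with zero keep candidates."""
    buckets = Counter(decisions)
    keep = buckets.get("keep", 0)
    return {
        "stages": dict(stage_counts),
        "keep": keep,
        "target_pending": buckets.get("target-pending", 0),
        "exclude": buckets.get("exclude", 0),
        # Reported explicitly so a skipped extraction stage is visible, not silent.
        "extraction_skipped": keep == 0,
    }


# Invented numbers for illustration only.
snapshot = funnel_snapshot(
    {"collected": 40, "deduplicated": 31},
    ["exclude"] * 25 + ["target-pending"] * 6,
)
```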

    Benchmark snapshot

    Show the numbers that matter now, and label the rest honestly.

    Early product pages lose trust when they mix validated numbers with aspirational claims. ChemFTO should separate current extraction evidence, current search funnel evidence, and the benchmark program still in progress.

    Anchor patent extraction

    -

    Rows with SMILES: -
    Rows with IUPAC: -
    Rows with IC50 evidence: -
    Run time: -

    Text-forward chemistry pass

    -

    Final rows: -
    Rows with bioactivity evidence: -
    Rows with IC50: -
    Wall time: -

    How to present this publicly

    Benchmark framing that will survive scrutiny

    1. Use a fixed chemistry-heavy corpus and publish the patent IDs.
    2. Separate document search metrics from molecule extraction metrics.
    3. Report review queues and skipped stages instead of hiding them.
    4. Publish accuracy claims only after bond-level manual validation.
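The four rules above can be enforced in the shape of the published report itself. A minimal sketch, assuming a hypothetical `build_report` helper and made-up field names and patent IDs:

```python
from typing import Optional


def build_report(patent_ids, search_metrics, extraction_metrics,
                 review_queue_size, skipped_stages,
                 validated_accuracy: Optional[float] = None,
                 bond_level_validated: bool = False) -> dict:
    """Assemble a benchmark report that follows the four framing rules."""
    if validated_accuracy is not None and not bond_level_validated:
        # Rule 4: refuse to emit an accuracy figure without manual validation.
        raise ValueError("accuracy requires bond-level manual validation")
    return {
        "corpus": {"patent_ids": sorted(patent_ids)},  # rule 1: fixed, published corpus
        "search": dict(search_metrics),                # rule 2: search metrics kept apart
        "extraction": dict(extraction_metrics),        #         from extraction metrics
        "review_queue_size": review_queue_size,        # rule 3: review queues reported
        "skipped_stages": list(skipped_stages),        # rule 3: skipped stages reported
        "accuracy": validated_accuracy,
    }


# Placeholder IDs and counts for illustration only.
report = build_report(
    patent_ids=["PATENT-B", "PATENT-A"],
    search_metrics={"families_screened": 31},
    extraction_metrics={"rows_with_smiles": 2},
    review_queue_size=2,
    skipped_stages=["extraction"],
)
```

Raising an error instead of silently dropping the unvalidated accuracy value is a design choice. It makes the omission a deliberate decision by whoever builds the report.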

    Contact

    Need a benchmark run on your own patent set?

    If you want a pilot on structure-heavy patents, a benchmark dataset built around your target, or a review of whether your current workflow can surface SMILES, IUPAC, and bioactivity data in one pass, start with a short benchmark engagement.

    What to send

    Tell us which target, disease area, patent set, or molecule extraction problem you want benchmarked. We will use that to scope a first-pass review and a realistic pilot.

    • Target or program name
    • Representative patents or a sample dataset
    • Which outputs matter: SMILES, IUPAC names, IC50 values, in vivo data, tables, or a review workflow
    • Expected timing and delivery constraints

    The static site is live first. Contact submission will be re-enabled once the hosted form is wired up.