Methodology

Version 0.2 — last updated May 31, 2026

Purpose

The OncologyAI Registry exists to provide a single, curated, source-cited reference for AI/ML-based diagnostic, predictive, and prognostic tools available or in late-stage development for U.S. oncology. The registry is independent, open-data (CC BY 4.0), and updated quarterly.

Inclusion criteria

The product uses an AI or machine-learning algorithm as a core component of clinical decision support, diagnosis, prognosis, or treatment selection.
The intended clinical use is in oncology (any cancer type, any modality).
The product is available, in clinical trials, or formally announced for the U.S. market. This includes Laboratory-Developed Tests (LDTs), FDA-cleared/approved devices, and Breakthrough Device-designated products.
Each entry is supported by at least one verifiable public source (FDA database, peer-reviewed publication, or company-issued press release).

Exclusion criteria

Pure research-only tools without a stated clinical translation pathway.
Algorithms embedded in non-oncology workflows (e.g., AI for general radiology triage that doesn't specifically address oncology).
Unsourced claims or products mentioned only in non-attributable secondary coverage.
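
The inclusion and exclusion criteria above can be read as one eligibility check. The following is a minimal Python sketch under stated assumptions, not the registry's actual tooling: the `Candidate` class, its field names, and the status and source labels are all hypothetical.

```python
from dataclasses import dataclass, field

# Source types accepted under the inclusion criteria (labels are hypothetical).
ACCEPTED_SOURCE_TYPES = {"fda_database", "peer_reviewed", "company_press_release"}

# U.S. market statuses accepted under the inclusion criteria (labels are hypothetical).
ACCEPTED_MARKET_STATUSES = {"available", "clinical_trial", "announced"}


@dataclass
class Candidate:
    """Hypothetical record describing a product under consideration."""
    uses_ai_ml_core: bool          # AI/ML is a core component of the product
    oncology_specific: bool        # intended use specifically addresses oncology
    us_market_status: str          # one of ACCEPTED_MARKET_STATUSES, or anything else
    has_translation_pathway: bool  # False for pure research-only tools
    source_types: list = field(default_factory=list)


def eligibility_problems(c):
    """Return the reasons a candidate fails; an empty list means it is eligible."""
    problems = []
    if not c.uses_ai_ml_core:
        problems.append("AI/ML is not a core component")
    if not c.oncology_specific:
        problems.append("intended use does not specifically address oncology")
    if c.us_market_status not in ACCEPTED_MARKET_STATUSES:
        problems.append("not available, in trial, or announced for the U.S. market")
    if not c.has_translation_pathway:
        problems.append("research-only tool without a clinical translation pathway")
    if not any(s in ACCEPTED_SOURCE_TYPES for s in c.source_types):
        problems.append("no verifiable public source")
    return problems
```

Returning the list of failed criteria, rather than a bare boolean, makes it easier to explain to a contributor why a submission was declined.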

Data fields

Each entry captures the following, where verifiable:

Product name and company
Cancer type(s) and modality (histopathology, radiology, genomic, liquid biopsy, multi-omics)
Intended use as stated by the manufacturer or in the FDA labeling
Regulatory status: FDA pathway or regulatory category (510(k), De Novo, PMA, Breakthrough Device designation, LDT), decision date, NY CLEP status, CLIA/CAP accreditation
Validation summary: structured fields capturing the evidence behind the tool — see "Validation summary schema" below
Deployment: market availability, partner labs, estimated cancer-center adoption
Reimbursement: CPT / PLA codes, payer coverage notes (where public)
Sources: every claim is linked to a verifiable URL with date accessed
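
As an illustration of how these fields might fit together in one record, here is a minimal Python sketch. It is an assumption about shape only: the class names, field names, and the `unverified_fields` helper are hypothetical and do not describe the registry's published data format. Optional fields default to `None` to reflect the "where verifiable" qualifier.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SourceRef:
    """A cited source: a URL plus the date it was accessed."""
    url: str
    accessed: date


@dataclass
class RegistryEntry:
    """Hypothetical top-level record; None means 'not yet verified'."""
    product_name: str
    company: str
    cancer_types: list
    modality: str                            # e.g. "histopathology", "radiology"
    intended_use: Optional[str] = None
    regulatory_status: Optional[str] = None  # e.g. "510(k)", "De Novo", "LDT"
    deployment: Optional[str] = None
    reimbursement: Optional[str] = None
    sources: list = field(default_factory=list)  # list of SourceRef


def unverified_fields(entry):
    """List the optional fields that are still empty for this entry."""
    optional = ["intended_use", "regulatory_status", "deployment", "reimbursement"]
    return [name for name in optional if getattr(entry, name) is None]
```

A placeholder entry such as `RegistryEntry("Example Tool", "Example Co", ["prostate"], "histopathology")` would report all four optional fields as unverified until sources are attached.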

Validation summary schema

To make the registry useful for comparison and citation, each entry has a structured validation summary capturing the evidence behind the tool — not just its regulatory pathway. FDA clearance is a regulatory bar; it is not synonymous with rigorous clinical validation. Two FDA-cleared tools can rest on very different evidence bases. The validation summary surfaces those differences.

Fields per entry:

Study design: prospective, retrospective, RCT, meta-analysis, bench-only, or unpublished
Cohort size: separate counts for patients vs. samples (slides / images / scans), with a unit note for clarity
Number of sites and site geography: single-center US, multi-center US, or multi-center international
Comparator: what the AI's performance was measured against. Allowed values:
- pathologist_consensus — ground truth set by expert reviewer agreement
- gold_standard_test — orthogonal reference assay (e.g., IHC, FISH, sequencing)
- clinical_outcomes — actual events (recurrence, metastasis, death) used as ground truth
- clinicopathologic_factors — AI compared against baseline demographic/pathologic variables in multivariable analysis to show incremental prognostic value (common for prognostic genomic classifiers)
- predicate_device — comparison to an existing FDA-cleared predicate (typical for 510(k)s)
- none — no formal comparator
Primary endpoint and result: e.g., AUC 0.89 (95% CI 0.85–0.92). Allowed endpoint categories include:
- AUC
- sensitivity
- specificity
- sensitivity_specificity
- ppv_npv (positive/negative predictive value; common for screening tests like multi-cancer early detection)
- hazard_ratio
- time_to_event_risk_strata (Kaplan–Meier event rates by pre-specified risk categories; common for prognostic classifiers)
- concordance
- other
External validation: whether an independent cohort was used, with description and result
Peer-reviewed: whether the pivotal evidence is peer-reviewed
Key publications: list of cited publications, with the pivotal trial flagged
Limitations noted: study-level caveats sourced from the publication or FDA summary
FDA decision summary: link to accessdata.fda.gov where available
Data completeness: full, partial, or stub — explicit signaling of how much we have verified for each entry
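
The controlled vocabularies in this schema map naturally onto enumerations. The sketch below shows one possible Python encoding. It is illustrative only: the class and attribute names are hypothetical, the enum values are copied from the field descriptions above, and the registry's actual serialization may differ.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class StudyDesign(str, Enum):
    PROSPECTIVE = "prospective"
    RETROSPECTIVE = "retrospective"
    RCT = "RCT"
    META_ANALYSIS = "meta-analysis"
    BENCH_ONLY = "bench-only"
    UNPUBLISHED = "unpublished"


class Comparator(str, Enum):
    PATHOLOGIST_CONSENSUS = "pathologist_consensus"
    GOLD_STANDARD_TEST = "gold_standard_test"
    CLINICAL_OUTCOMES = "clinical_outcomes"
    CLINICOPATHOLOGIC_FACTORS = "clinicopathologic_factors"
    PREDICATE_DEVICE = "predicate_device"
    NONE = "none"


class EndpointCategory(str, Enum):
    AUC = "AUC"
    SENSITIVITY = "sensitivity"
    SPECIFICITY = "specificity"
    SENSITIVITY_SPECIFICITY = "sensitivity_specificity"
    PPV_NPV = "ppv_npv"
    HAZARD_RATIO = "hazard_ratio"
    TIME_TO_EVENT_RISK_STRATA = "time_to_event_risk_strata"
    CONCORDANCE = "concordance"
    OTHER = "other"


class Completeness(str, Enum):
    FULL = "full"
    PARTIAL = "partial"
    STUB = "stub"


@dataclass
class ValidationSummary:
    """Hypothetical container; only data_completeness is required."""
    data_completeness: Completeness
    study_design: Optional[StudyDesign] = None
    cohort_patients: Optional[int] = None
    cohort_samples: Optional[int] = None
    cohort_unit_note: Optional[str] = None     # e.g. "slides", "scans"
    n_sites: Optional[int] = None
    site_geography: Optional[str] = None
    comparator: Optional[Comparator] = None
    endpoint_category: Optional[EndpointCategory] = None
    endpoint_result: Optional[str] = None      # e.g. "AUC 0.89 (95% CI 0.85–0.92)"
    external_validation: Optional[str] = None
    peer_reviewed: Optional[bool] = None
    key_publications: list = field(default_factory=list)
    limitations: list = field(default_factory=list)
    fda_decision_summary_url: Optional[str] = None
```

With this encoding, `Comparator("predicate_device")` resolves to a valid member, while an unlisted value such as `Comparator("vendor_claim")` raises `ValueError`, so out-of-vocabulary entries are caught at load time. A stub entry is simply `ValidationSummary(data_completeness=Completeness.STUB)` with every other field left empty.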

v0.2 ships with a subset of entries fully populated and the remainder marked as stubs. Stub entries are still useful for the regulatory and deployment fields; their validation blocks will be filled in over subsequent releases as primary sources are reviewed for each tool. We are explicit about this asymmetry rather than implying uniform depth across all entries.

Source hierarchy

When a fact is supported by multiple sources, we prioritize them in the following order, highest priority first:

U.S. federal database entries (FDA 510(k), De Novo, PMA, Breakthrough Devices, ClinicalTrials.gov)
Peer-reviewed publications (PubMed-indexed)
Manufacturer regulatory documentation (FDA submissions, labeling)
Company-issued press releases attributable to a named source
Major trade press (STAT, Endpoints, MedTech Dive, BusinessWire)

Performance metrics are reported only from peer-reviewed publications, never from press releases or marketing materials.
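
The ordering above, together with the rule on performance metrics, can be sketched as a small ranking helper. This is a minimal Python illustration under assumptions: the tier labels and function names are hypothetical, and the URLs in the usage note are placeholders.

```python
# Source tiers in priority order, highest first (tier labels are hypothetical).
SOURCE_PRIORITY = [
    "federal_database",
    "peer_reviewed",
    "manufacturer_regulatory_doc",
    "company_press_release",
    "trade_press",
]

# Lower rank number means higher priority.
RANK = {tier: i for i, tier in enumerate(SOURCE_PRIORITY)}


def preferred_source(sources):
    """Given (tier, url) pairs, return the pair with the highest-priority tier.

    When two sources share a tier, the one listed first is kept.
    """
    return min(sources, key=lambda s: RANK[s[0]])


def may_report_performance(tier):
    """Performance metrics are reported only from peer-reviewed publications."""
    return tier == "peer_reviewed"
```

For example, given a trade-press article at `https://example.org/news` and a peer-reviewed paper at `https://example.org/paper`, `preferred_source` selects the paper, and only that source passes `may_report_performance`.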

Refresh cycle

The registry is reviewed quarterly. Material updates (new FDA decisions, major partnerships, or significant publications) are applied within two weeks of public availability.
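
The two-week window for material updates is simple date arithmetic. The sketch below assumes "two weeks" means 14 calendar days counted from the date of public availability; the function names are hypothetical.

```python
from datetime import date, timedelta

# Assumption: "two weeks" is interpreted as 14 calendar days.
MATERIAL_UPDATE_WINDOW = timedelta(weeks=2)


def update_deadline(public_on):
    """Latest date by which a material update should appear in the registry."""
    return public_on + MATERIAL_UPDATE_WINDOW


def is_overdue(public_on, today):
    """True if the update window has passed without the change being applied."""
    return today > update_deadline(public_on)
```

Under this reading, an FDA decision made public on May 31, 2026 would be due in the registry by June 14, 2026.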

Conflict-of-interest disclosure

The curator (Ahmed Elbakri) is currently employed by Valar Labs. Valar Labs products are listed in the registry under the same inclusion criteria and source-citation standards as every other entry. The curator has no equity, advisory, or compensation relationships with any other listed company. Disclosures are reviewed and republished with each quarterly update.

Citation

Please cite the registry as:

Elbakri A. OncologyAI Registry [v0.2]. Available at: https://oncologyairegistry.org. Accessed [date].

To cite a specific tool, use the per-tool URL — e.g., https://oncologyairegistry.org/tools/<tool-id>.html.
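
For anyone generating citations programmatically, the two formats above can be produced with a short helper. This is a sketch, not an official utility: the function names are hypothetical, the ISO 8601 access-date format is an assumption (the registry does not prescribe one), and `example-tool` in the usage note is a placeholder identifier.

```python
from datetime import date

BASE_URL = "https://oncologyairegistry.org"


def registry_citation(version, accessed):
    """Build the registry-level citation string for a given version and access date."""
    return (
        f"Elbakri A. OncologyAI Registry [v{version}]. "
        f"Available at: {BASE_URL}. Accessed {accessed.isoformat()}."
    )


def tool_url(tool_id):
    """Build the per-tool URL from a tool identifier."""
    return f"{BASE_URL}/tools/{tool_id}.html"
```

For instance, `tool_url("example-tool")` yields `https://oncologyairegistry.org/tools/example-tool.html`.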

A formal methods paper describing the registry is in preparation and will be the canonical citation upon publication.

Contributing

Suggestions, corrections, and new-entry submissions are welcome via GitHub pull request. Each contribution must include a verifiable source URL and meet the inclusion criteria above.