Methodology
Version 0.2 — last updated May 31, 2026
Purpose
The OncologyAI Registry exists to provide a single, curated, source-cited reference for AI/ML-based diagnostic, predictive, and prognostic tools available or in late-stage development for U.S. oncology. The registry is independent, open-data (CC BY 4.0), and updated quarterly.
Inclusion criteria
- The product uses an AI or machine-learning algorithm as a core component of clinical decision support, diagnosis, prognosis, or treatment selection.
- The intended clinical use is in oncology (any cancer type, any modality).
- The product is available, in clinical trial, or formally announced for the U.S. market — including Laboratory-Developed Tests (LDTs), FDA-cleared/approved devices, and Breakthrough Device-designated products.
- Each entry is supported by at least one verifiable public source (FDA database, peer-reviewed publication, or company-issued press release).
Exclusion criteria
- Pure research-only tools without a stated clinical translation pathway.
- Algorithms embedded in non-oncology workflows (e.g., AI for general radiology triage that doesn't specifically address oncology).
- Unsourced claims or products mentioned only in non-attributable secondary coverage.
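As a rough illustration, the inclusion and exclusion criteria above can be read as a screening check on a candidate entry. The sketch below is not the registry's actual tooling; the `Candidate` fields, the source-type labels, and the status labels are hypothetical names chosen to mirror the criteria.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical labels mirroring the source kinds and market statuses named in the criteria.
VERIFIABLE_SOURCE_TYPES = {"fda_database", "peer_reviewed_publication", "company_press_release"}
US_MARKET_STATUSES = {"available", "in_clinical_trial", "formally_announced"}


@dataclass
class Candidate:
    name: str
    uses_ai_ml_core: bool            # AI/ML is a core component of the product
    oncology_specific: bool          # intended clinical use is in oncology
    us_market_status: Optional[str]  # one of US_MARKET_STATUSES, or None
    has_translation_pathway: bool    # False for pure research-only tools
    source_types: List[str] = field(default_factory=list)


def exclusion_reasons(c: Candidate) -> List[str]:
    """Return every criterion the candidate fails; an empty list means eligible."""
    reasons = []
    if not c.uses_ai_ml_core:
        reasons.append("AI/ML is not a core component")
    if not c.oncology_specific:
        reasons.append("not specific to oncology")
    if c.us_market_status not in US_MARKET_STATUSES:
        reasons.append("not available, in trial, or announced for the U.S. market")
    if not c.has_translation_pathway:
        reasons.append("research-only with no stated clinical translation pathway")
    if not any(s in VERIFIABLE_SOURCE_TYPES for s in c.source_types):
        reasons.append("no verifiable public source")
    return reasons


def is_eligible(c: Candidate) -> bool:
    return not exclusion_reasons(c)
```

Returning the list of failed criteria, rather than a bare boolean, keeps the reason for a rejection visible when a submission is reviewed.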
Data fields
Each entry captures the following, where verifiable:
- Product name and company
- Cancer type(s) and modality (histopathology, radiology, genomic, liquid biopsy, multi-omics)
- Intended use as stated by the manufacturer or in the FDA labeling
- Regulatory status: FDA pathway or designation (510(k), De Novo, PMA, Breakthrough Device designation, LDT), decision date, NY CLEP status, CLIA/CAP accreditation
- Validation summary: structured fields capturing the evidence behind the tool — see "Validation summary schema" below
- Deployment: market availability, partner labs, estimated cancer-center adoption
- Reimbursement: CPT / PLA codes, payer coverage notes (where public)
- Sources: every claim is linked to a verifiable URL with date accessed
Validation summary schema
To make the registry useful for comparison and citation, each entry has a structured validation summary capturing the evidence behind the tool — not just its regulatory pathway. FDA clearance is a regulatory bar; it is not synonymous with rigorous clinical validation. Two FDA-cleared tools can rest on very different evidence bases. The validation summary surfaces those differences.
Fields per entry:
- Study design: prospective, retrospective, RCT, meta-analysis, bench-only, or unpublished
- Cohort size: separate counts for patients vs. samples (slides / images / scans), with a unit note for clarity
- Number of sites and site geography: single-center US, multi-center US, or multi-center international
- Comparator: what the AI's performance was measured against. Allowed values:
  - pathologist_consensus: ground truth set by expert reviewer agreement
  - gold_standard_test: orthogonal reference assay (e.g., IHC, FISH, sequencing)
  - clinical_outcomes: actual events (recurrence, metastasis, death) used as ground truth
  - clinicopathologic_factors: AI compared against baseline demographic/pathologic variables in multivariable analysis to show incremental prognostic value (common for prognostic genomic classifiers)
  - predicate_device: comparison to an existing FDA-cleared predicate (typical for 510(k)s)
  - none: no formal comparator
- Primary endpoint and result: e.g., AUC 0.89 (95% CI 0.85–0.92). Allowed endpoint categories include AUC, sensitivity, specificity, sensitivity_specificity, ppv_npv (positive/negative predictive value, common for screening tests like multi-cancer early detection), hazard_ratio, time_to_event_risk_strata (Kaplan–Meier event rates by pre-specified risk categories, common for prognostic classifiers), concordance, and other.
- External validation: whether an independent cohort was used, with description and result
- Peer-reviewed: whether the pivotal evidence is peer-reviewed
- Key publications: list of cited publications, with the pivotal trial flagged
- Limitations noted: study-level caveats sourced from the publication or FDA summary
- FDA decision summary: link to accessdata.fda.gov where available
- Data completeness: full, partial, or stub, explicitly signaling how much we have verified for each entry
v0.2 ships with a subset of entries fully populated and the remainder marked as stubs. Stub entries are still useful for their regulatory and deployment fields; the validation block will be filled in over subsequent releases as primary sources are reviewed for each tool. We are explicit about this asymmetry rather than implying uniform depth across all entries.
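One way to picture the validation summary is as a typed record whose controlled fields accept only the allowed values listed above. The following is a minimal sketch under that assumption, not the registry's published data format: the class names, field names, and the `parse_validation` helper are hypothetical, and only the enum values come from this page.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Comparator(str, Enum):
    PATHOLOGIST_CONSENSUS = "pathologist_consensus"
    GOLD_STANDARD_TEST = "gold_standard_test"
    CLINICAL_OUTCOMES = "clinical_outcomes"
    CLINICOPATHOLOGIC_FACTORS = "clinicopathologic_factors"
    PREDICATE_DEVICE = "predicate_device"
    NONE = "none"


class EndpointCategory(str, Enum):
    AUC = "AUC"
    SENSITIVITY = "sensitivity"
    SPECIFICITY = "specificity"
    SENSITIVITY_SPECIFICITY = "sensitivity_specificity"
    PPV_NPV = "ppv_npv"
    HAZARD_RATIO = "hazard_ratio"
    TIME_TO_EVENT_RISK_STRATA = "time_to_event_risk_strata"
    CONCORDANCE = "concordance"
    OTHER = "other"


class DataCompleteness(str, Enum):
    FULL = "full"
    PARTIAL = "partial"
    STUB = "stub"


@dataclass
class ValidationSummary:
    # Hypothetical field names; every field except completeness may be unknown.
    data_completeness: DataCompleteness
    study_design: Optional[str] = None
    n_patients: Optional[int] = None
    n_samples: Optional[int] = None
    sample_unit: Optional[str] = None      # e.g. "slides", "images", "scans"
    n_sites: Optional[int] = None
    site_geography: Optional[str] = None
    comparator: Optional[Comparator] = None
    endpoint_category: Optional[EndpointCategory] = None
    endpoint_result: Optional[str] = None
    external_validation: Optional[bool] = None
    peer_reviewed: Optional[bool] = None
    key_publications: List[str] = field(default_factory=list)
    limitations: List[str] = field(default_factory=list)
    fda_decision_summary_url: Optional[str] = None


def parse_validation(raw: dict) -> ValidationSummary:
    """Build a summary from plain strings; unknown controlled values raise ValueError."""
    data = dict(raw)
    data["data_completeness"] = DataCompleteness(data["data_completeness"])
    if data.get("comparator") is not None:
        data["comparator"] = Comparator(data["comparator"])
    if data.get("endpoint_category") is not None:
        data["endpoint_category"] = EndpointCategory(data["endpoint_category"])
    return ValidationSummary(**data)
```

Under this sketch a stub entry is simply `parse_validation({"data_completeness": "stub"})`, with every evidence field left as `None` instead of being filled with a guess.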
Source hierarchy
When a fact is supported by multiple sources, we prioritize them in the following order, highest first:
- U.S. federal database entries (FDA 510(k), De Novo, PMA, Breakthrough Devices, ClinicalTrials.gov)
- Peer-reviewed publications (PubMed-indexed)
- Manufacturer regulatory documentation (FDA submissions, labeling)
- Company-issued press releases attributable to a named source
- Major trade press (STAT, Endpoints, MedTech Dive, BusinessWire)
Performance metrics are reported only from peer-reviewed publications, never from press releases or marketing materials.
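The hierarchy and the performance-metric rule can be sketched as two small lookups. This is an illustrative reading of the policy above, assuming each source carries a `kind` label; the labels, the `Source` class, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical labels for the five tiers listed above, highest priority first.
SOURCE_PRIORITY = [
    "federal_database",
    "peer_reviewed_publication",
    "manufacturer_regulatory_doc",
    "company_press_release",
    "trade_press",
]


@dataclass
class Source:
    kind: str
    url: str


def preferred_source(sources: List[Source]) -> Optional[Source]:
    """Pick the highest-tier source for a fact; unrecognized kinds are ignored."""
    ranked = [s for s in sources if s.kind in SOURCE_PRIORITY]
    if not ranked:
        return None
    return min(ranked, key=lambda s: SOURCE_PRIORITY.index(s.kind))


def metric_source(sources: List[Source]) -> Optional[Source]:
    """Performance metrics may only come from a peer-reviewed publication."""
    for s in sources:
        if s.kind == "peer_reviewed_publication":
            return s
    return None
```

Keeping the two functions separate reflects the distinction in the text: a press release can support a general fact when nothing better exists, but it never supplies a performance metric.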
Refresh cycle
The registry is reviewed quarterly. Material updates (new FDA decisions, major partnerships, or significant publications) are applied within two weeks of public availability.
Conflict-of-interest disclosure
The curator (Ahmed Elbakri) is currently employed by Valar Labs. Valar Labs products are listed in the registry under the same inclusion criteria and source-citation standards as every other entry. The curator has no equity, advisory, or compensation relationships with any other listed company. Disclosures are reviewed and republished with each quarterly update.
Citation
Please cite the registry as:
Elbakri A. OncologyAI Registry [v0.2]. Available at: https://oncologyairegistry.org. Accessed [date].
To cite a specific tool, use the per-tool URL — e.g., https://oncologyairegistry.org/tools/<tool-id>.html.
A formal methods paper describing the registry is in preparation and will be the canonical citation upon publication.
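For anyone generating citations programmatically, the format above can be filled in with a short helper. This is only a sketch: the function names are hypothetical, and rendering the access date in ISO 8601 form is an assumption, since the citation template does not prescribe a date format.

```python
from datetime import date

REGISTRY_URL = "https://oncologyairegistry.org"


def cite_registry(accessed: date, version: str = "v0.2") -> str:
    """Fill the registry citation template; the ISO date format is an assumption."""
    return (
        f"Elbakri A. OncologyAI Registry [{version}]. "
        f"Available at: {REGISTRY_URL}. Accessed {accessed.isoformat()}."
    )


def tool_url(tool_id: str) -> str:
    """Build the per-tool URL following the pattern given above."""
    return f"{REGISTRY_URL}/tools/{tool_id}.html"
```

Here `tool_id` stands in for a real identifier from the registry; none is assumed.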
Contributing
Suggestions, corrections, and new-entry submissions are welcome via GitHub pull request. Each contribution must include a verifiable source URL and meet the inclusion criteria above.