How to Access ClinicalTrials.gov Data Programmatically in 2026

May 2, 2026 · Web Data Labs

ClinicalTrials.gov is the largest public registry of clinical research in the world. More than 500,000 trials. Decades of pharmaceutical history. Recruitment status, enrollment counts, sponsor relationships, intervention details — all sitting in one government database.

And yet, if you've ever tried to actually use this data at scale, you know the catch: the web UI is fine for clinicians looking up a single trial, but it's hopeless for anyone who needs structured data across thousands of studies. There's no clean export button. No "download all trials sponsored by Pfizer in Phase 3" link. Just a search box, a table, and a lot of clicking.

Why bulk access is hard

The challenge isn't access; the data is genuinely public. The challenge is shape. Each trial is a deeply nested record whose metadata is split across many modules: identification, status, design, sponsor and collaborators, eligibility, and more. Pulling one trial is easy. Pulling a useful slice of the registry (filtered by phase, status, sponsor, and indication) and getting back a flat, analysis-ready dataset is the work.
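To make the shape problem concrete, here is a minimal sketch of flattening one nested record into a row. The key names (protocolSection, identificationModule, nctId, and so on) are assumptions modeled on the module names above rather than a confirmed schema, and the sample record is invented for illustration.

```python
from typing import Any


def dig(record: dict, *path: str, default: Any = None) -> Any:
    """Walk a nested dict along `path`; return `default` if any key is missing."""
    node: Any = record
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def flatten_trial(study: dict) -> dict:
    """Flatten one nested study record into a single-level row.

    Every key name used here is an assumption made for illustration.
    """
    proto = study.get("protocolSection", {})
    return {
        "nct_id": dig(proto, "identificationModule", "nctId"),
        "title": dig(proto, "identificationModule", "briefTitle"),
        "status": dig(proto, "statusModule", "overallStatus"),
        "phases": dig(proto, "designModule", "phases", default=[]),
        "sponsor": dig(proto, "sponsorCollaboratorsModule", "leadSponsor", "name"),
        "min_age": dig(proto, "eligibilityModule", "minimumAge"),
    }


# Invented sample record, shaped like the assumed schema.
# The eligibility module is left out on purpose to show missing-key handling.
sample = {
    "protocolSection": {
        "identificationModule": {"nctId": "NCT06123456", "briefTitle": "Example Study"},
        "statusModule": {"overallStatus": "RECRUITING"},
        "designModule": {"phases": ["PHASE3"]},
        "sponsorCollaboratorsModule": {"leadSponsor": {"name": "Example Sponsor"}},
    }
}

row = flatten_trial(sample)
print(row)
```

Missing modules come back as None (or an empty list for phases) instead of raising, which matters when older registry entries lack fields that newer ones carry.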

Most teams that need this data end up writing the same plumbing over and over: pagination handling, response flattening, status filtering, deduplication, retry logic for transient errors. It's not hard, exactly. It's just hours that nobody charges for.
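The plumbing in question tends to look something like the sketch below: a pagination loop with retries for transient errors and deduplication by trial ID. It assumes a token-based pagination scheme and an injected fetch_page callable (both hypothetical, chosen for illustration), so it shows the pattern and not any particular client.

```python
import time
from typing import Callable, Iterator, Optional, Tuple

# A page fetcher takes a page token (None for the first page) and returns
# (records, next_token). It is injected so the loop stays transport-agnostic.
PageFetcher = Callable[[Optional[str]], Tuple[list, Optional[str]]]


def fetch_with_retry(
    fetch_page: PageFetcher,
    token: Optional[str],
    retries: int = 3,
    backoff: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[list, Optional[str]]:
    """Call fetch_page, retrying transient errors with exponential backoff."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        try:
            return fetch_page(token)
        except (ConnectionError, TimeoutError):
            if attempt == retries - 1:
                raise  # out of attempts: surface the error
            sleep(backoff * 2 ** attempt)
    raise AssertionError("unreachable")


def iter_trials(
    fetch_page: PageFetcher,
    max_items: int = 200,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield unique records across pages, stopping at max_items."""
    seen = set()
    token: Optional[str] = None
    yielded = 0
    while True:
        records, token = fetch_with_retry(fetch_page, token, sleep=sleep)
        for rec in records:
            if rec["nct_id"] in seen:
                continue  # skip duplicates that span page boundaries
            seen.add(rec["nct_id"])
            yield rec
            yielded += 1
            if yielded >= max_items:
                return
        if token is None:
            return  # no further pages
        sleep(delay)  # pause between page requests
```

Passing sleep as a parameter keeps the loop testable without real waiting; in production the default time.sleep applies.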

Who actually uses this data

Pharma competitive intelligence. Tracking what trials a competitor has open, in which indications, at what phase, with what enrollment numbers.
Biotech investment due diligence. Pre-IPO and pre-acquisition checks: is the pipeline real? Are the trials actually recruiting? What's the realistic primary completion date?
Academic research. Meta-analyses across thousands of trials, especially in rare-disease or oncology subfields.
Patient advocacy. Building disease-specific dashboards of recruiting trials with eligibility summaries.
CRO business development. Identifying sponsors with active trials in therapeutic areas the CRO supports.
Regulatory consultancy. Tracking submission patterns by sponsor or country.

The scraper

We built a ClinicalTrials.gov scraper actor that handles the messy parts — pagination, nested-field flattening, status and phase filtering, sponsor matching — and gives you back clean rows. You give it a query, you get structured trial records. That's it.

Input

{
  "query": "breast cancer immunotherapy",
  "status": "RECRUITING",
  "phase": "PHASE3",
  "maxItems": 50
}

Search by indication, drug, sponsor name, or any keyword. Filter by recruitment status (recruiting, completed, active, not yet recruiting). Optionally filter by phase. Cap results with maxItems — up to 200 per run.
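If you generate inputs from code, a small builder can catch mistakes before a run starts. This is a sketch under the assumption that the four fields shown above are the whole input; build_input is a hypothetical helper, not part of the actor.

```python
from typing import Any, Dict, Optional

MAX_ITEMS_LIMIT = 200  # per-run cap stated above


def build_input(
    query: str,
    status: Optional[str] = None,
    phase: Optional[str] = None,
    max_items: int = 50,
) -> Dict[str, Any]:
    """Build an actor input dict, leaving out filters that were not set."""
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    if not 1 <= max_items <= MAX_ITEMS_LIMIT:
        raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_LIMIT}")
    run_input: Dict[str, Any] = {"query": query, "maxItems": max_items}
    if status is not None:
        run_input["status"] = status
    if phase is not None:
        run_input["phase"] = phase
    return run_input


example = build_input("breast cancer immunotherapy", status="RECRUITING", phase="PHASE3")
print(example)
```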

Output

[
  {
    "nct_id": "NCT06123456",
    "url": "https://clinicaltrials.gov/study/NCT06123456",
    "title": "Phase 3 Study of Pembrolizumab in Triple-Negative Breast Cancer",
    "official_title": "A Randomized Phase 3 Study of Pembrolizumab Plus Chemotherapy...",
    "status": "RECRUITING",
    "phase": "PHASE3",
    "study_type": "INTERVENTIONAL",
    "conditions": ["Breast Cancer", "Triple Negative Breast Neoplasms"],
    "interventions": ["pembrolizumab (DRUG)", "carboplatin (DRUG)"],
    "sponsor": "Merck Sharp & Dohme LLC (INDUSTRY)",
    "enrollment": 450,
    "start_date": "2024-03-15",
    "primary_completion_date": "2027-09-30",
    "last_update_date": "2026-04-22",
    "scraped_at": "2026-05-02T10:48:00Z"
  }
]

Every record is a single-level JSON object, ready to load into a database or a notebook, or to write to a CSV once the array fields are joined. conditions and interventions are arrays so you can filter or join on them. The sponsor field ends with the funder class in parentheses (for example INDUSTRY, OTHER, NIH, or FED), so you can split commercial from academic trials in one line of code.
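As a rough sketch of working with these records, the snippet below splits the funder class off the sponsor string, filters to industry trials, and writes a CSV with the conditions array joined into one cell. The "Name (CLASS)" pattern is inferred from the sample output above; split_sponsor and to_csv are hypothetical helper names.

```python
import csv
import io
import re
from typing import List, Optional, Tuple

# Matches "Some Sponsor Name (FUNDER_CLASS)" as in the sample output.
SPONSOR_RE = re.compile(r"^(?P<name>.*)\s\((?P<cls>[A-Z_]+)\)$")


def split_sponsor(sponsor: str) -> Tuple[str, Optional[str]]:
    """Return (name, funder_class); funder_class is None if the suffix is absent."""
    match = SPONSOR_RE.match(sponsor)
    if match is None:
        return sponsor, None
    return match.group("name"), match.group("cls")


def to_csv(records: List[dict]) -> str:
    """Serialize records to CSV text, joining the conditions array with '; '."""
    fields = ["nct_id", "sponsor_name", "funder_class", "conditions", "enrollment"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        name, cls = split_sponsor(rec["sponsor"])
        writer.writerow({
            "nct_id": rec["nct_id"],
            "sponsor_name": name,
            "funder_class": cls or "",
            "conditions": "; ".join(rec["conditions"]),
            "enrollment": rec["enrollment"],
        })
    return buf.getvalue()


records = [{
    "nct_id": "NCT06123456",
    "sponsor": "Merck Sharp & Dohme LLC (INDUSTRY)",
    "conditions": ["Breast Cancer", "Triple Negative Breast Neoplasms"],
    "enrollment": 450,
}]

# The one-line commercial/academic split.
industry = [r for r in records if split_sponsor(r["sponsor"])[1] == "INDUSTRY"]
print(to_csv(industry))
```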

Common queries

Goal	Input
Recruiting Phase 3 trials that mention a company	`{"query": "Pfizer", "status": "RECRUITING", "phase": "PHASE3"}`
Recruiting CAR-T trials	`{"query": "CAR-T", "status": "RECRUITING"}`
Completed trials in a rare indication	`{"query": "Niemann-Pick", "status": "COMPLETED"}`
Pediatric trials about to start recruiting	`{"query": "pediatric", "status": "NOT_YET_RECRUITING"}`

Pricing

Pay per result. One cent per trial extracted — no monthly subscription, no minimum, no proxy fees. A 50-trial run costs you fifty cents. A bulk pull of 200 trials runs two dollars. You only pay for the records that come back populated.
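The arithmetic is simple enough to check in a few lines. This sketch works in whole cents to avoid floating-point rounding and assumes the price stays at one cent per populated record; the function names are made up for illustration.

```python
PRICE_CENTS_PER_RESULT = 1  # one cent per populated record, per the pricing above


def run_cost_cents(populated_results: int) -> int:
    """Cost of a run in cents, counting only records that came back populated."""
    if populated_results < 0:
        raise ValueError("populated_results must not be negative")
    return populated_results * PRICE_CENTS_PER_RESULT


def format_usd(cents: int) -> str:
    """Format a whole number of cents as a dollar amount."""
    return f"${cents // 100}.{cents % 100:02d}"


print(format_usd(run_cost_cents(50)))   # 50-trial run
print(format_usd(run_cost_cents(200)))  # 200-trial bulk pull
```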

Compliance note. ClinicalTrials.gov data is public under U.S. federal law. The scraper uses the official public endpoints with a polite User-Agent and a one-second delay between pagination requests. No authentication, no rate-limit gymnastics, no anti-bot bypass — this is the way the registry was meant to be used.
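For readers who want to see what polite access looks like in practice, here is a minimal sketch of an identifying User-Agent plus a one-second throttle. The endpoint URL and parameter names are assumptions based on the registry's public v2 API, since this post does not say which endpoints the actor calls, and the User-Agent string is a placeholder.

```python
import time
import urllib.parse
import urllib.request
from typing import Callable, Optional

# Assumed endpoint and parameter names (not confirmed by this post).
BASE_URL = "https://clinicaltrials.gov/api/v2/studies"
# Placeholder: identify your own tool and a contact address here.
USER_AGENT = "example-trials-client/0.1 (contact: you@example.com)"


def build_request(
    term: str,
    status: Optional[str] = None,
    page_size: int = 100,
    page_token: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a GET request with an identifying User-Agent."""
    params = {"query.term": term, "pageSize": str(page_size)}
    if status:
        params["filter.overallStatus"] = status
    if page_token:
        params["pageToken"] = page_token
    url = f"{BASE_URL}?{urllib.parse.urlencode(params)}"
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(
        self,
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Sleep just long enough that calls are min_interval apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


req = build_request("CAR-T", status="RECRUITING")
print(req.full_url)
```

Calling throttle.wait() before each urllib.request.urlopen(req) would space page requests at least one second apart.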

Try it

The actor is on Apify. Paste your query, set the filters, hit run. Results land in your dataset within seconds — downloadable as JSON, CSV, or Excel.
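If you would rather trigger runs from code than from the console, a sketch along these lines should work with Apify's Python client. The actor ID below is a hypothetical placeholder (this post does not state it), and the client is passed in as a parameter so the function can be exercised without network access.

```python
from typing import Any, Dict, List

# Hypothetical placeholder: replace with the actor's real ID from its Apify page.
ACTOR_ID = "your-username/clinicaltrials-scraper"


def run_and_collect(client: Any, actor_id: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the actor, wait for the run to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client` and a real API token):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_API_TOKEN>")
#   trials = run_and_collect(client, ACTOR_ID, {
#       "query": "breast cancer immunotherapy",
#       "status": "RECRUITING",
#       "phase": "PHASE3",
#       "maxItems": 50,
#   })
#   print(len(trials), "trials")
```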

Run the ClinicalTrials.gov Scraper →