NLP & Knowledge Systems — Case Study

Healthcare Clinical Trial Data Q&A Platform

The client was a clinical research organization supporting pharmaceutical and biotech sponsors across late-phase trials. Their medical-affairs team fielded roughly two hundred sponsor inquiries per week, each requiring synthesis across multiple completed trials — efficacy comparisons, safety signal analyses, subpopulation outcomes. The existing approach used a research librarian who pulled relevant papers from an internal repository and produced a synthesis memo, which took two to three weeks per inquiry.

-95% synthesis time
10k+ trials indexed
90%+ paired-review pass
+2 new sponsors won
Category: NLP & Knowledge Systems

Industry: HealthTech, Pharmaceuticals

Timeline: 14 weeks from kickoff to sponsor-facing rollout

Team size: 5 specialists

Project Overview

The full story

The practical problem was that ten thousand trial reports existed in a mix of PDF, scanned image, and structured XML formats with inconsistent metadata. Search returned page-level hits across reports, but cross-trial synthesis still meant reading and comparing dozens of reports by hand. The three-person library team could not scale, while sponsors were beginning to ask for synthesis turnaround in days, not weeks, and competing CROs were starting to offer faster service.

We built a trial-aware Q&A platform that ingested the entire repository into a structured index with per-trial metadata (phase, indication, population, endpoints), supported filtered cross-trial retrieval, and produced synthesis-style answers with citations to specific trial reports and pages. Every output preserved a traceable lineage so medical-affairs could defend conclusions to sponsors, and the platform operated entirely within HIPAA-compliant infrastructure with PHI screening on uploads.

What shipped was a researcher workspace where a medical-affairs analyst types a synthesis question — "compare cardiovascular safety signals across phase three trials in our HFrEF library" — and gets a draft synthesis with cited findings per trial in under five minutes. The library team’s three people moved from synthesis production to synthesis review, which expanded effective throughput by an order of magnitude. Sponsor turnaround dropped to days for routine inquiries and the CRO took on two large new sponsor accounts within the next two quarters.

The Problem

Medical-affairs inquiries took weeks because cross-trial synthesis required reading dozens of papers by hand.

01 Friction point

Ten thousand trial reports lived in mixed formats (PDF, scanned image, XML), with metadata conventions that varied by report vintage.

02 Friction point

Cross-trial synthesis required manual reading and comparison, with no tooling for filtered retrieval by phase or endpoint.

03 Friction point

Sponsor expectations shifted toward turnaround in days rather than weeks, and a three-person library team could not scale to match.

04 Friction point

PHI screening on uploads was manual, which created compliance exposure and slowed every new sponsor onboarding cycle.

05 Friction point

Outputs from generic search tools lacked traceable lineage, which made conclusions hard to defend in sponsor review meetings.

Our Approach

How we structured the engagement

Indexed every trial with structured metadata so cross-trial synthesis became a filtered retrieval problem instead of a manual reading problem.

  1. Phase 01, Weeks 1-3

    Discovery

    Audited the trial report library to taxonomize formats, extracted metadata patterns by vintage, and worked with medical-affairs on the typical shape of a synthesis question. Output: a per-trial metadata schema, a HIPAA-compliant infrastructure spec, and a synthesis answer template the team would approve.

  2. Phase 02, Weeks 4-5

    Architecture

    Designed a hybrid retrieval system that pairs ElasticSearch for filtered structured search with dense embeddings for semantic relevance, under strict tenant isolation per sponsor. Built PHI screening into the ingestion path so no unscreened data crossed into the index. Used LangChain to compose the synthesis-style answers.

  3. Phase 03, Weeks 6-12

    Build

    Shipped ingestion and metadata extraction first because the index quality gated everything else. Built the filtered retrieval and synthesis composition next, then the citation-preserving answer renderer. Implemented sponsor-scoped tenancy with row-level isolation and a HIPAA audit log for every retrieval and answer.

  4. Phase 04, Weeks 13-14

    Launch

    Rolled out to medical-affairs with a four-week paired-review period where every synthesis produced by the platform was reviewed by the library team. Tuned retrieval and answer templates against actual reviewer feedback. Promoted to sponsor-facing turnaround service once paired-review pass rate cleared ninety percent.
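
The promotion rule from the Launch phase can be sketched as a small gate over paired-review results. This is an illustration only, not the tooling used on the engagement: `ReviewOutcome`, `pass_rate`, and `ready_for_sponsors` are hypothetical names, and the 0.90 default simply mirrors the ninety percent bar described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewOutcome:
    """One paired review: a platform synthesis checked by a library reviewer."""
    synthesis_id: str
    approved_without_substantive_correction: bool


def pass_rate(outcomes: list[ReviewOutcome]) -> float:
    """Share of reviewed syntheses approved without substantive correction."""
    if not outcomes:
        raise ValueError("no paired reviews recorded yet")
    approved = sum(o.approved_without_substantive_correction for o in outcomes)
    return approved / len(outcomes)


def ready_for_sponsors(outcomes: list[ReviewOutcome], threshold: float = 0.90) -> bool:
    """True once the pass rate is at or above the promotion threshold."""
    return pass_rate(outcomes) >= threshold
```

Keeping the raw review outcomes, rather than only the aggregate rate, is what lets the same records double as labeled quality data for tuning.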

System Architecture

What we built, component by component

  1. 01

    Ingestion pipeline

    Parses mixed-format trial reports, extracts per-trial metadata, and runs PHI screening before any indexing.

  2. 02

    Structured metadata store

    Per-trial fields including phase, indication, population, endpoints, and outcomes, used for filtered retrieval.

  3. 03

    Hybrid retrieval

    ElasticSearch for structured filters plus dense embeddings for semantic relevance, combined with score fusion.

  4. 04

    Synthesis composer

    LangChain orchestration that composes a synthesis-style answer with per-finding citations to specific trials and pages.

  5. 05

    Tenant isolation layer

    Per-sponsor row-level isolation, audit logging, and access controls aligned with HIPAA-compliant infrastructure.

  6. 06

    Review surface

    Library-team interface for paired-review during rollout and ongoing quality monitoring on sponsor-facing outputs.
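
To make the structured metadata store concrete, here is a minimal sketch of a per-trial record and the kind of filter it enables. The field names follow the list above (phase, indication, population, endpoints, outcomes), but the `TrialRecord` class, the `filter_trials` helper, and all sample values are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialRecord:
    """Hypothetical per-trial metadata row used for filtered retrieval."""
    trial_id: str
    phase: int
    indication: str
    population: str
    endpoints: tuple[str, ...] = ()
    outcomes: tuple[str, ...] = ()


def filter_trials(trials, *, phase=None, indication=None, endpoint=None):
    """Keep trials that match every filter that was supplied (None = no filter)."""
    selected = []
    for trial in trials:
        if phase is not None and trial.phase != phase:
            continue
        if indication is not None and trial.indication.lower() != indication.lower():
            continue
        if endpoint is not None and endpoint.lower() not in {
            e.lower() for e in trial.endpoints
        }:
            continue
        selected.append(trial)
    return selected


# Illustrative values only; these are not trials from the client library.
library = [
    TrialRecord("T-001", 3, "HFrEF", "adults", ("MACE", "all-cause mortality")),
    TrialRecord("T-002", 2, "HFrEF", "adults", ("MACE",)),
    TrialRecord("T-003", 3, "HFpEF", "adults 65+", ("hospitalization",)),
]
phase_three_hfref = filter_trials(library, phase=3, indication="HFrEF")
```

A question such as "phase three trials in HFrEF" then narrows the candidate set before any semantic ranking runs.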

Data Flow

Trial reports flow through the ingestion pipeline with PHI screening, are indexed with structured metadata into both ElasticSearch and a dense embedding store, and become available for filtered retrieval. A synthesis query runs hybrid retrieval, the composer produces a cited answer, the tenant isolation layer enforces sponsor scoping, and outputs land in the review surface for human approval before sponsor delivery.
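
The sponsor-scoping and audit steps in this flow can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not the production isolation layer: `IndexedChunk`, `TenantScopedIndex`, and the audit-entry fields are hypothetical, and a plain substring match stands in for hybrid retrieval.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class IndexedChunk:
    """Hypothetical indexed passage, tagged with the sponsor that owns it."""
    sponsor_id: str
    trial_id: str
    page: int
    text: str


class TenantScopedIndex:
    """Every retrieval is filtered by sponsor and recorded in an audit log."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        self.audit_log = []

    def retrieve(self, sponsor_id: str, user: str, term: str):
        # Row-level scoping: the sponsor check runs before any matching.
        hits = [
            c for c in self._chunks
            if c.sponsor_id == sponsor_id and term.lower() in c.text.lower()
        ]
        # One audit entry per retrieval, including the cited trial and page.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "sponsor_id": sponsor_id,
            "user": user,
            "query": term,
            "returned": [f"{c.trial_id}:p{c.page}" for c in hits],
        })
        return hits
```

Because each hit carries its trial and page, the same records that enforce scoping also give the review surface the lineage it needs.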

[Diagram: Ingestion pipeline, Structured metadata store, Hybrid retrieval, Synthesis composer, Tenant isolation layer]
Key Decisions

The trade-offs we made and why

Decision 01 (lead trade-off)

Chose hybrid retrieval over purely structured or purely semantic search

Pure structured search missed semantic relevance across differently-worded endpoints. Pure semantic search missed the filtered precision that synthesis questions required — phase, indication, population. Hybrid retrieval with score fusion gave the precision and the relevance simultaneously.
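
This write-up does not name the fusion method, so the sketch below uses reciprocal rank fusion (RRF), one common way to merge a filtered keyword ranking with a dense-embedding ranking. Treat it as an assumption rather than the project's actual formula; the function name and the constant `k = 60` are illustrative defaults.

```python
def reciprocal_rank_fusion(rankings, k: int = 60):
    """Merge several ranked lists of document IDs into one fused ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical trial IDs returned by each retriever, best match first.
structured_hits = ["T1", "T2", "T3"]
semantic_hits = ["T3", "T1", "T4"]
fused = reciprocal_rank_fusion([structured_hits, semantic_hits])
```

RRF works on ranks rather than raw scores, which sidesteps the problem that keyword scores and embedding similarities live on different scales.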

Decision 02

Ran PHI screening in ingestion, not at retrieval time

Screening at retrieval would have introduced latency on every query and risked PHI escape through cached embeddings. Screening at ingestion meant the index was clean by construction, which simplified the audit story and removed runtime risk.
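
A minimal sketch of an ingestion-time gate shows why the index ends up clean by construction. The patterns and names here (`PHI_PATTERNS`, `screen_for_phi`, `ingest`) are hypothetical and deliberately simplistic; real PHI detection covers far more identifier types than three regular expressions.

```python
import re

# Illustrative patterns only; a production screen would be much broader.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "mrn": re.compile(r"\bMRN[:#]?\s*\d{6,10}\b", re.IGNORECASE),
}


def screen_for_phi(text: str) -> list[str]:
    """Return the names of the PHI patterns found in the text, sorted."""
    return sorted(name for name, pattern in PHI_PATTERNS.items() if pattern.search(text))


def ingest(doc_id: str, text: str, index: dict) -> list[str]:
    """Index the document only if the screen finds nothing.

    Returns the findings; an empty list means the document was indexed.
    """
    findings = screen_for_phi(text)
    if not findings:
        index[doc_id] = text
    return findings
```

Anything flagged never reaches the index or the embedding store, so downstream components need no PHI logic of their own.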

Decision 03

Paired-review during rollout, not auto-delivery

Sponsor-facing outputs in pharma are high-stakes and the library team’s credibility was on the line. Paired-review during rollout produced labeled quality data, built medical-affairs trust, and let us tune the system against real reviewer feedback before any sponsor saw an output.

Decision 04

Used ElasticSearch over a pure vector store

The filtered queries — phase, indication, endpoint — were a structured-search problem first and a semantic problem second. ElasticSearch handled the filters natively and combined cleanly with dense embeddings, which a vector-store-only design would have made awkward.
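
As a rough illustration of how the filters and embeddings can share one request, the helper below builds a search body that applies the same structured filters to a keyword `query` and to a `knn` clause. It assumes an Elasticsearch 8.x-style search request that accepts a top-level `knn` section; the field names (`phase`, `indication`, `body`, `body_embedding`) and the `build_hybrid_query` helper are hypothetical, not the project's mapping.

```python
def build_hybrid_query(query_vector, *, phase, indication, text, k=20):
    """Build a search request body combining filters, keywords, and kNN."""
    # Structured filters shared by both the keyword and the vector side.
    filters = [
        {"term": {"phase": phase}},
        {"term": {"indication": indication}},
    ]
    return {
        "query": {
            "bool": {
                "must": [{"match": {"body": text}}],
                "filter": filters,
            }
        },
        "knn": {
            "field": "body_embedding",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": 10 * k,
            # Restrict the nearest-neighbor search to the same filtered set.
            "filter": {"bool": {"filter": filters}},
        },
        "size": k,
    }
```

The point of the design is visible in the body itself: phase and indication are exact-match constraints, and semantic similarity only ranks what survives them.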

Outcomes

What changed for the client

-95% synthesis time

Median time from synthesis question to draft answer with citations across thirty representative medical-affairs inquiries.

10k+ trials indexed

Trial reports ingested with structured metadata at cutover, with quarterly refresh as new completed trials are added.

90%+ paired-review pass

Share of platform-generated syntheses approved by library reviewers without substantive correction during rollout.

+2 new sponsors won

New sponsor accounts onboarded in the two quarters following launch, attributed by sales to the faster turnaround capability.

Tech Stack

The tools behind the system

Built with a deliberate stack chosen for production reliability and operational velocity.

4 components, production-grade
LangChain, FastAPI, ElasticSearch, AWS
What we’d carry forward

Lessons learned from the build

01 Lesson

Hybrid retrieval was the only design that fit the question shape. Pure-vector approaches looked cleaner architecturally but failed on the filtered queries that made up most of medical-affairs work. Always shape retrieval to the question, not to the model.

02 Lesson

Ingestion-time PHI screening was a structural simplification we underestimated. Screening at retrieval would have made every query a compliance question. Doing it once at ingest meant the rest of the system did not have to know about PHI at all, which removed an entire class of operational risk.

03 Lesson

Paired-review built more trust than any accuracy metric. Library reviewers saw the system improve under their own corrections and became advocates for the rollout. Skipping that phase and going straight to auto-delivery would have lost the team and the sponsor confidence at the same time.

