Healthcare Clinical Trial Data Q&A Platform
The client was a clinical research organization supporting pharmaceutical and biotech sponsors across late-phase trials. Their medical-affairs team fielded roughly two hundred sponsor inquiries per week, each requiring synthesis across multiple completed trials — efficacy comparisons, safety signal analyses, subpopulation outcomes. The existing approach used a research librarian who pulled relevant papers from an internal repository and produced a synthesis memo, which took two to three weeks per inquiry.
NLP & Knowledge Systems
HealthTech, Pharmaceuticals
14 weeks from kickoff to sponsor-facing rollout
5 specialists
The full story
The practical problem was that ten thousand trial reports existed in a mix of PDF, scanned image, and structured XML formats with inconsistent metadata. Search returned page-level hits across reports, but cross-trial synthesis still meant reading and comparing dozens of papers by hand. The three-person library team could not scale, while sponsors were beginning to ask for synthesis turnaround in days, not weeks, and competing CROs were starting to offer faster service.
We built a trial-aware Q&A platform that ingested the entire repository into a structured index with per-trial metadata (phase, indication, population, endpoints), supported filtered cross-trial retrieval, and produced synthesis-style answers with citations to specific trial reports and pages. Every output preserved a traceable lineage so medical-affairs could defend conclusions to sponsors, and the platform operated entirely within HIPAA-compliant infrastructure with PHI screening on uploads.
What shipped was a researcher workspace where a medical-affairs analyst types a synthesis question — "compare cardiovascular safety signals across phase three trials in our HFrEF library" — and gets a draft synthesis with cited findings per trial in under five minutes. The library team’s three people moved from synthesis production to synthesis review, which expanded effective throughput by an order of magnitude. Sponsor turnaround dropped to days for routine inquiries and the CRO took on two large new sponsor accounts within the next two quarters.
Medical-affairs inquiries took weeks because cross-trial synthesis required reading dozens of papers by hand.
Ten thousand trial reports lived in mixed formats (PDF, scanned image, XML) with inconsistent metadata across report vintages.
Cross-trial synthesis required manual reading and comparison, with no tooling for filtered retrieval by phase or endpoint.
Sponsor expectations shifted toward days-not-weeks turnaround, but a three-person library team could not scale linearly.
PHI screening on uploads was manual, which created compliance exposure and slowed every new sponsor onboarding cycle.
Outputs from generic search tools lacked traceable lineage, which made conclusions hard to defend in sponsor review meetings.
How we structured the engagement
Indexed every trial with structured metadata so cross-trial synthesis became a filtered retrieval problem instead of a manual reading problem.
- 01 Phase 01, Weeks 1-3
Discovery
Audited the trial report library to taxonomize formats, extracted metadata patterns by vintage, and worked with medical-affairs on the typical shape of a synthesis question. Output: a per-trial metadata schema, a HIPAA-compliant infrastructure spec, and a synthesis answer template the team would approve.
- 02 Phase 02, Weeks 4-5
Architecture
Designed a hybrid retrieval system that combined ElasticSearch for filtered structured search with dense embeddings for semantic relevance, with strict tenant isolation per sponsor. Built PHI screening into the ingestion path so no unscreened data crossed into the index. Used LangChain to compose the synthesis-style answers.
- 03 Phase 03, Weeks 6-12
Build
Shipped ingestion and metadata extraction first because the index quality gated everything else. Built the filtered retrieval and synthesis composition next, then the citation-preserving answer renderer. Implemented sponsor-scoped tenancy with row-level isolation and a HIPAA audit log for every retrieval and answer.
- 04 Phase 04, Weeks 13-14
Launch
Rolled out to medical-affairs with a four-week paired-review period where every synthesis produced by the platform was reviewed by the library team. Tuned retrieval and answer templates against actual reviewer feedback. Promoted to sponsor-facing turnaround service once paired-review pass rate cleared ninety percent.
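The promotion rule in the launch phase can be sketched as a simple gate over reviewer decisions. This is a minimal illustration, not the platform's actual code: the `approved` field, the function names, and the choice to treat exactly ninety percent as passing are all assumptions.

```python
def pass_rate(reviews):
    """Share of syntheses approved without substantive correction."""
    if not reviews:
        raise ValueError("no reviews recorded yet")
    return sum(1 for r in reviews if r["approved"]) / len(reviews)


def ready_for_sponsors(reviews, threshold=0.90):
    # Assumption: "cleared ninety percent" is read as >= 0.90.
    return pass_rate(reviews) >= threshold


# Hypothetical review log: 19 approvals and 1 substantive correction.
reviews = [{"approved": True}] * 19 + [{"approved": False}]
rate = pass_rate(reviews)
promote = ready_for_sponsors(reviews)
```

Keeping the gate as an explicit function means the threshold is visible and can be rechecked as new paired reviews arrive.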
What we built, component by component
- 01
Ingestion pipeline
Parses mixed-format trial reports, extracts per-trial metadata, and runs PHI screening before any indexing.
- 02
Structured metadata store
Per-trial fields including phase, indication, population, endpoints, and outcomes, used for filtered retrieval.
- 03
Hybrid retrieval
ElasticSearch for structured filters plus dense embeddings for semantic relevance, combined with score fusion.
- 04
Synthesis composer
LangChain orchestration that composes a synthesis-style answer with per-finding citations to specific trials and pages.
- 05
Tenant isolation layer
Per-sponsor row-level isolation, audit logging, and access controls aligned with HIPAA-compliant infrastructure.
- 06
Review surface
Library-team interface for paired-review during rollout and ongoing quality monitoring on sponsor-facing outputs.
Trial reports flow through the ingestion pipeline with PHI screening, are indexed with structured metadata into both ElasticSearch and a dense embedding store, and become available for filtered retrieval. A synthesis query runs hybrid retrieval, the composer produces a cited answer, the tenant isolation layer enforces sponsor scoping, and outputs land in the review surface for human approval before sponsor delivery.
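The ordering of that flow is easier to see in a small in-memory sketch: screen before indexing, scope every read to one sponsor, log the read, and park the draft for review. Everything here is hypothetical (class names, field names, the `contains_phi` callback, the sample text), and the real system used ElasticSearch and an embedding store rather than a Python list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Chunk:
    sponsor_id: str  # tenant key (hypothetical field name)
    trial_id: str
    page: int
    phase: str
    indication: str
    text: str


@dataclass
class Workspace:
    chunks: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)
    review_queue: list = field(default_factory=list)

    def ingest(self, chunk, contains_phi):
        # Screening runs before indexing, so rejected text never reaches the index.
        if contains_phi(chunk.text):
            return False
        self.chunks.append(chunk)
        return True

    def retrieve(self, user, sponsor_id, phase, indication):
        # Sponsor scoping is applied on every read, and every read is logged.
        hits = [c for c in self.chunks
                if c.sponsor_id == sponsor_id
                and c.phase == phase and c.indication == indication]
        self.audit_log.append({"user": user, "sponsor_id": sponsor_id, "hits": len(hits)})
        return hits

    def draft(self, question, hits):
        # Each finding keeps its trial and page so a reviewer can trace it.
        answer = {
            "question": question,
            "findings": [{"text": c.text, "citation": f"{c.trial_id}, p. {c.page}"}
                         for c in hits],
            "status": "pending_review",
        }
        self.review_queue.append(answer)
        return answer


def no_phi(text):
    return False  # stand-in screener for the example


ws = Workspace()
ws.ingest(Chunk("sponsor-a", "TRIAL-001", 12, "3", "HFrEF", "Example finding text."), no_phi)
ws.ingest(Chunk("sponsor-b", "TRIAL-900", 4, "3", "HFrEF", "Another sponsor's text."), no_phi)
hits = ws.retrieve("analyst-1", "sponsor-a", phase="3", indication="HFrEF")
answer = ws.draft("Compare cardiovascular safety signals", hits)
```

Only the sponsor-a chunk comes back, and the draft sits in `review_queue` with status `pending_review` until a reviewer approves it.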
The trade-offs we made and why
Built hybrid retrieval over either pure structured or pure semantic
Pure structured search missed semantic relevance across differently-worded endpoints. Pure semantic search missed the filtered precision that synthesis questions required — phase, indication, population. Hybrid retrieval with score fusion gave the precision and the relevance simultaneously.
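The case study does not say which fusion method was used. As one common option, reciprocal rank fusion (RRF) merges the keyword ranking and the semantic ranking using only rank positions. The sketch below assumes RRF with the conventional constant k=60; the trial IDs are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents ranked highly in any list get a larger contribution.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["trial-07", "trial-02", "trial-11"]   # filtered structured search
semantic_hits = ["trial-02", "trial-05", "trial-07"]  # dense-embedding search
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
# fused == ["trial-02", "trial-07", "trial-05", "trial-11"]
```

Trials that appear in both lists rise to the top, which is the behavior the hybrid design relies on: filtered precision and semantic relevance reinforce each other.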
Ran PHI screening in ingestion, not at retrieval time
Screening at retrieval would have introduced latency on every query and risked PHI escape through cached embeddings. Screening at ingestion meant the index was clean by construction, which simplified the audit story and removed runtime risk.
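A minimal sketch of an ingestion-time gate, assuming a simple pattern-based screener: documents that trip any pattern are quarantined and never written to the index. The three patterns and all names are illustrative assumptions only; real PHI detection covers far more identifier types and would not rely on regular expressions alone.

```python
import re

# Illustrative patterns only, not a complete PHI definition.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "mrn": re.compile(r"\bMRN[:\s#]*\d{6,10}\b", re.IGNORECASE),
    "dob": re.compile(r"\b(?:DOB|date of birth)[:\s]*\d{1,2}/\d{1,2}/\d{2,4}\b",
                      re.IGNORECASE),
}


def find_phi(text):
    """Return the sorted names of the patterns that match the text."""
    return sorted(name for name, pat in PHI_PATTERNS.items() if pat.search(text))


def ingest(documents, index):
    """Index clean documents; return (doc_id, matched_patterns) for the rest."""
    quarantined = []
    for doc_id, text in documents:
        hits = find_phi(text)
        if hits:
            quarantined.append((doc_id, hits))  # held back for manual handling
        else:
            index[doc_id] = text
    return quarantined


index = {}
docs = [
    ("report-1", "Primary endpoint met in the treatment arm."),
    ("report-2", "Subject MRN: 12345678, DOB 01/02/1960"),
]
quarantined = ingest(docs, index)
```

Because the check sits in front of the index write, nothing downstream (retrieval, embedding cache, answer composition) has to repeat it.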
Paired-review during rollout, not auto-delivery
Sponsor-facing outputs in pharma are high-stakes and the library team’s credibility was on the line. Paired-review during rollout produced labeled quality data, built medical-affairs trust, and let us tune the system against real reviewer feedback before any sponsor saw an output.
Used ElasticSearch over a pure vector store
The filtered queries — phase, indication, endpoint — were a structured-search problem first and a semantic problem second. ElasticSearch handled the filters natively and combined cleanly with dense embeddings, which a vector-store-only design would have made awkward.
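To illustrate how structured filters and a vector query can share one request, here is a sketch of a search body in the style of the Elasticsearch 8.x search API, where a top-level `knn` section sits alongside a `query` section. The index field names (`sponsor_id`, `phase`, `indication`, `text`, `embedding`) and the helper function are assumptions, not the client's actual mapping.

```python
def build_hybrid_query(question, query_vector, sponsor_id, phase, indication, k=10):
    """Build a search body combining filtered keyword search with kNN search."""
    # The same exact-match filters constrain both halves of the query.
    filters = [
        {"term": {"sponsor_id": sponsor_id}},
        {"term": {"phase": phase}},
        {"term": {"indication": indication}},
    ]
    return {
        "query": {
            "bool": {
                "must": [{"match": {"text": question}}],
                "filter": filters,
            }
        },
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": 10 * k,
            "filter": {"bool": {"filter": filters}},
        },
        "size": k,
    }


body = build_hybrid_query(
    question="cardiovascular safety signals",
    query_vector=[0.1, 0.2, 0.3],  # placeholder embedding
    sponsor_id="sponsor-a",
    phase="3",
    indication="HFrEF",
)
```

The body is a plain dictionary, so it can be inspected or unit-tested before it is handed to a search client.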
What changed for the client
synthesis time
Median time from synthesis question to draft answer with citations across thirty representative medical-affairs inquiries.
trials indexed
Trial reports ingested with structured metadata at cutover, with quarterly refresh as new completed trials are added.
paired-review pass
Share of platform-generated syntheses approved by library reviewers without substantive correction during rollout.
new sponsors won
New sponsor accounts onboarded in the two quarters following launch, attributed by sales to the faster turnaround capability.
The tools behind the system
Built with a deliberate stack chosen for production reliability and operational velocity.
Lessons learned from the build
Hybrid retrieval was the only design that fit the question shape. Pure-vector approaches looked cleaner architecturally but failed on the filtered queries that made up most of medical-affairs work. Always shape retrieval to the question, not to the model.
Ingestion-time PHI screening was a structural simplification we underestimated. Screening at retrieval would have made every query a compliance question. Doing it once at ingest meant the rest of the system did not have to know about PHI at all, which removed an entire class of operational risk.
Paired-review built more trust than any accuracy metric. Library reviewers saw the system improve under their own corrections and became advocates for the rollout. Skipping that phase and going straight to auto-delivery would have lost the team and the sponsor confidence at the same time.