QA & Test Engineers

Six weeks of legal review.
Or one command.

You have four test environments and every one of them is waiting on data. Your test deadline is Friday. Legal will not approve production extracts. Your QA team is testing against 14 patients, all named Test Patient, all born January 1, 1980. And you already know: what works in dev will break in prod, because your test data looks nothing like production.

The Test Data Problem

You have three options today. None of them work.

Option 1

Copy production data.

Six weeks of legal review. HIPAA risk assessment. De-identification vendor contract. Budget approval. By the time you get the data, your test deadline has passed.

Option 2

Build test data by hand.

14 patients with identical demographics. No clinical variance. No edge cases. Your tests pass because your data is too simple to fail — not because your system is correct.

Option 3

Use a generic data generator.

Patients with diabetes prescribed pediatric antibiotics. Lab results that are physiologically impossible. Data that looks synthetic because it is — and your system handles it differently than real data.

The common thread: your test environment does not reflect production. And the failures you find in production are the failures your test data was too simple to reveal. The 99-year-old patient. The neonate. The polypharmacy case. The patient with 47 active diagnoses. The edge cases that cause production failures are the edge cases your test environment never contains.

What Changes With Pidgeon

De-Identify Without Waiting

On-device. No data leaves your machine. No vendor contract. No legal review. Your existing production messages become safe test data in one command, with referential integrity preserved across related messages.

pidgeon-cli

$ pidgeon deident --in ./production_samples --out ./safe_test_data --date-shift 30d
Processed 2,847 messages
✓ Names replaced (consistent cross-message)
✓ MRNs hashed (referential integrity preserved)
✓ Dates shifted +30 days (temporal coherence maintained)
✓ Addresses replaced
✓ SSNs removed
✓ 18 HIPAA Safe Harbor identifier categories handled

Zero PHI in output. Nothing to approve. Nothing to breach.
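
To make "referential integrity preserved" and "temporal coherence maintained" concrete, here is a minimal Python sketch of the general technique. It is not Pidgeon's implementation: the key, the function names, and the sample records are all hypothetical. A keyed hash maps the same MRN to the same token in every message, and a fixed date shift leaves the interval between related events unchanged.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Assumption: a per-project secret kept on the local machine (hypothetical value).
SECRET_KEY = b"hypothetical-per-project-secret"
# Mirrors the --date-shift 30d flag shown above.
SHIFT = timedelta(days=30)


def pseudonymize_mrn(mrn: str, key: bytes = SECRET_KEY) -> str:
    """Map an MRN to a stable token: same input and key give the same output."""
    digest = hmac.new(key, mrn.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # truncated for readability


def shift_date(d: date, shift: timedelta = SHIFT) -> date:
    """Apply one fixed offset so intervals between events are preserved."""
    return d + shift


def deidentify(record: dict) -> dict:
    """De-identify a simplified record holding only an MRN and an event date."""
    return {"mrn": pseudonymize_mrn(record["mrn"]), "date": shift_date(record["date"])}


# Two hypothetical messages for the same patient: an admission and a lab result.
admit = {"mrn": "MRN0001", "date": date(2024, 1, 10)}
result = {"mrn": "MRN0001", "date": date(2024, 1, 12)}

safe_admit = deidentify(admit)
safe_result = deidentify(result)
```

After this runs, `safe_admit["mrn"]` and `safe_result["mrn"]` are identical tokens that differ from the original MRN, and the result still falls two days after the admission.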

Generate What Production Looks Like

Flock generates patient populations grounded in real epidemiological data. CDC WONDER prevalence rates. Census demographic distributions. NHANES lab value ranges. Schema-aware output that respects your database's foreign key constraints.

Not 14 test patients — 14,000

Realistic age distributions, comorbidity correlations, and clinical variance — including the edge cases that break production systems.

Epidemiologically grounded

CDC WONDER prevalence rates, Census demographic distributions, and NHANES lab value ranges. Your synthetic data matches what real populations look like.
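
As a rough illustration of what "grounded in prevalence rates" means, the Python sketch below draws a synthetic population in which each patient has a condition with a fixed probability, then checks the observed rate against the target. It does not show how Flock works. The prevalence value is an illustrative placeholder, not a CDC figure, and the function name is hypothetical.

```python
import random

# Assumption: placeholder prevalence for illustration only, NOT a real CDC number.
PLACEHOLDER_PREVALENCE = 0.15
POPULATION_SIZE = 10_000


def generate_population(size: int, prevalence: float, seed: int = 42) -> list[dict]:
    """Create `size` patients, each flagged with the condition at rate `prevalence`."""
    rng = random.Random(seed)  # seeded so test runs are reproducible
    return [
        {"patient_id": i, "has_condition": rng.random() < prevalence}
        for i in range(size)
    ]


population = generate_population(POPULATION_SIZE, PLACEHOLDER_PREVALENCE)

# Validate the generated distribution against the target rate.
observed_rate = sum(p["has_condition"] for p in population) / len(population)
```

A real generator would also have to model age, comorbidity correlations, and lab values jointly. This sketch covers only the single-rate case and the validation step.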

Schema-aware output

SQL INSERT (FK-ordered), CSV, HL7 message streams, or FHIR bundles. Shaped to your schema, not a generic template.
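
"FK-ordered" means parent rows are inserted before the rows that reference them. One common way to get that order is a topological sort of the foreign key graph, sketched below in Python with the standard library's `graphlib`. The table names and dependencies are hypothetical, and this is not a description of Flock's schema reader.

```python
from graphlib import TopologicalSorter

# Assumption: a hypothetical schema. Each table maps to the tables it references
# through foreign keys, so those must be populated first.
fk_dependencies = {
    "patients": set(),
    "providers": set(),
    "encounters": {"patients", "providers"},
    "lab_results": {"encounters"},
}


def insert_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return table names ordered so every referenced table comes first."""
    return list(TopologicalSorter(dependencies).static_order())


order = insert_order(fk_dependencies)
```

Emitting INSERT statements in `order` avoids foreign key violations on load. A schema with circular references would raise `graphlib.CycleError` and needs separate handling, such as deferred constraints.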

The test environment that actually reflects production

No more 'works in dev, breaks in prod.' Your test environments contain the demographic variance, clinical complexity, and edge cases that production throws at you. When your tests pass, they mean something.

Edge cases on demand

The 99-year-old patient. The neonate. The polypharmacy case. The patient with 47 active diagnoses. Generate the exact edge cases that cause production failures.

Clinically coherent scenarios

ICD-10 diagnoses, LOINC lab codes, NDC drug codes, and CVX vaccine codes applied consistently. Your test data reflects real clinical relationships, not random combinations.

Referential integrity preserved

Cross-message consistency across entire patient records. The same patient MRN, the same visit dates, the same provider — cohesive across every message in a test run.

Population-scale output

10,000 patients with realistic Mississippi diabetes prevalence. Age distributions, comorbidity correlations, and lab values that match CDC data — generated in under a minute.

The QA Manager Whose Test Data Matches Production

It is Tuesday morning. Your test environment needs 10,000 patients with realistic Mississippi diabetes prevalence for a population health module validation. Flock generates them in under a minute, with age distributions, comorbidity correlations, and lab values that match CDC data. Your QA team starts testing before lunch. The compliance officer asks about PHI exposure. You tell her: “No PHI ever existed. There is nothing to scrub, nothing to approve, nothing to breach.”

The right Pidgeon product for each QA challenge

Post by Pidgeon

Free CLI

Generate clinically coherent HL7/FHIR test messages at volume. Validate against specs. De-identify production messages on-device in seconds. The free CLI that every QA engineer should have installed.

On-device de-identification — 18 HIPAA Safe Harbor categories
Temporal coherence across admission, order, and result messages
Strict and compatibility validation modes
HL7 v2.3.1–v2.8, FHIR R4, NCPDP SCRIPT

$ dotnet tool install pidgeon

$ pidgeon deident --in ./samples --out ./safe --date-shift 30d

Flock by Pidgeon

Coming Soon

Generate entire synthetic patient populations for database-level testing. Schema-aware, relationally consistent, epidemiologically grounded. The tool for QA teams who need their test environment to look like a real production database.

CDC WONDER disease prevalence — grounded in real epidemiology
SQL DDL schema reader with FK constraint graph
FK-ordered SQL INSERT, CSV, HL7 streams, FHIR bundles
Population analytics — validate distributions against CDC data

Flock population generation is in development. Join the waitlist to be first.

De-Identification Is Free. Start Today.

pidgeon deident ships with the free CLI. No PHI ever touches the network — processing runs entirely on your machine. Enter your email to get the secure Desktop download.