Repository: github.com/srodriguezloya/omop-development-environment
Introduction
In my previous post, I covered the OHDSI ecosystem: what each tool does, when you need it, and how the components work together. That guide focused on understanding the architecture and making informed deployment decisions for production environments.
This post tackles a different but equally important challenge: how do you actually learn and experiment with the OHDSI stack without breaking the bank?
The OHDSI community provides an excellent quick-start solution called OHDSI-in-a-Box, designed for rapid deployment on AWS. It’s purpose-built for personal learning and training environments—you can have a complete OHDSI stack running in minutes.
There’s just one problem: it costs approximately $40-44 per month to run on AWS.
For personal learning, weekend experimentation, or training workshops, those costs add up quickly. I wanted the same “one-click” deployment experience, but without the monthly AWS bill.
So I set up a local alternative—and deployed it on an old laptop sitting in my closet.
What’s Included
Core Services:
- PostgreSQL - Database server with multiple schemas
- WebAPI - Spring Boot REST API for OHDSI tools
- Atlas - Web-based research interface
- Achilles - Database characterization tool (via Docker profile)
Supporting Infrastructure:
- OMOP Vocabulary loading - Automated scripts for vocabulary management
- Synthea Integration - Generates synthetic patient data and ETL to OMOP CDM
Quick Start
Requirements
- Docker and Docker Compose installed
- Java 11+ (for Synthea data generation)
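Before cloning, it can save some time to confirm that the required tools are on your PATH. The helper below is a minimal sketch: the function names `check_cmd` and `check_prereqs` are made up for this post and are not part of the repository.

```shell
# Report whether a single command is available on PATH.
# Returns 0 if it exists, 1 otherwise.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Check several commands; the exit status is non-zero if any is missing.
check_prereqs() {
  rc=0
  for cmd in "$@"; do
    check_cmd "$cmd" || rc=1
  done
  return "$rc"
}

# Usage: check_prereqs docker java git
```

This only checks that the commands exist. To confirm versions, `docker compose version` and `java -version` print what is installed.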
Installation
1. Clone and configure:
git clone https://github.com/srodriguezloya/omop-development-environment.git
cd omop-development-environment
# Copy environment file
cp .env.example .env
# Edit .env and change POSTGRES_PASSWORD
vim .env
2. Initial setup:
# Make scripts executable
chmod +x scripts/*.sh
# Run setup (first time only)
./scripts/setup.sh
This creates directories, downloads OMOP CDM DDL files, and prepares the environment.
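If you would rather not edit `.env` by hand for the password change in step 1, something like the following could set it from the command line. This is a sketch that assumes `.env` uses plain `KEY=VALUE` lines; `set_env_var` is a hypothetical helper, not a script shipped with the repository.

```shell
# Replace (or add) KEY=VALUE in an env file without opening an editor.
# Avoids `sed -i`, whose syntax differs between GNU and BSD/macOS.
# Note: the key ends up on the last line of the file.
set_env_var() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  # Keep every line except an existing definition of KEY...
  grep -v "^${key}=" "$file" > "$tmp" || true
  # ...then append the new definition.
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Overwrite in place so the original file's permissions are preserved.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Usage: set_env_var .env POSTGRES_PASSWORD "$(openssl rand -hex 24)"
```

A hex string is used in the usage line because it contains no characters that need quoting in shell or connection strings. Set the password before the first start, since a database that has already been initialized will typically keep its original password.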
Download OMOP Vocabularies
CRITICAL: OMOP vocabularies are required for the ETL to work.
- Go to https://athena.ohdsi.org/
- Create a free account
- Select vocabularies (minimum required):
  - SNOMED (required)
  - RxNorm (required for medications)
  - LOINC (required for labs)
  - Optional but recommended: ICD10CM, CPT4
- Click “Download Vocabularies”
- Wait for email notification (a few minutes)
- Download the ZIP file
- Extract to ./sample-data/vocabulary/
Note: If disk space is limited, select only SNOMED, RxNorm, and LOINC.
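After extracting, a quick check that the expected files landed in the right directory can catch the common case where the ZIP unpacks into a nested subfolder. The sketch below uses the file names found in a standard Athena bundle; which of them the repository's loader actually requires is an assumption, and `check_vocab_dir` is a hypothetical helper.

```shell
# Tables shipped in a standard Athena vocabulary bundle.
VOCAB_FILES="CONCEPT.csv VOCABULARY.csv DOMAIN.csv CONCEPT_CLASS.csv RELATIONSHIP.csv CONCEPT_RELATIONSHIP.csv CONCEPT_SYNONYM.csv CONCEPT_ANCESTOR.csv DRUG_STRENGTH.csv"

# Report each expected file as found or missing (missing includes empty files).
# Returns non-zero if anything is missing.
check_vocab_dir() {
  dir=$1
  missing=0
  for f in $VOCAB_FILES; do
    if [ -s "$dir/$f" ]; then
      echo "found: $f"
    else
      echo "missing: $f"
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_vocab_dir ./sample-data/vocabulary
```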
Start Services
./scripts/start.sh
Wait 2-3 minutes for all services to become healthy. The script will show you when they’re ready.
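If you want to script the wait yourself (for example, in a workshop setup script), a small retry loop is one way to do it. This is a generic sketch, not what `start.sh` does internally; `wait_for` is a hypothetical helper, and the `/WebAPI/info` endpoint in the usage line is an assumption about the WebAPI version you are running.

```shell
# Retry a command until it succeeds or the attempts run out.
# Arguments: max attempts, seconds between attempts, then the command to run.
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up after $attempts attempt(s)" >&2
  return 1
}

# Usage: wait_for 36 5 curl -fsS http://localhost:8080/WebAPI/info
```

With 36 attempts at 5-second intervals, the usage line gives WebAPI about three minutes to come up, matching the estimate above.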
Configure Data Source
Register your CDM database with WebAPI:
./scripts/add-webapi-source.sh 1 "Synthea OMOP CDM" SYNTHEA
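To confirm the registration took effect, you can ask WebAPI for its list of sources and look for the key. The sketch below assumes that the third argument above (`SYNTHEA`) is the source key, that sources are listed at `/WebAPI/source/sources`, and that each entry carries a `sourceKey` field; check these against your WebAPI version. `has_source_key` is a hypothetical helper.

```shell
# Succeeds if the JSON on stdin mentions the given source key.
# A grep-based check keeps this dependency-free; it is not a JSON parser.
has_source_key() {
  grep -q "\"sourceKey\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Usage:
#   curl -fsS http://localhost:8080/WebAPI/source/sources | has_source_key SYNTHEA \
#     && echo "source registered"
```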
Load Sample Data
# Load 50 synthetic patients (recommended for first run)
./scripts/load-synthea-data.sh 50
# Or load 100 patients (takes longer)
./scripts/load-synthea-data.sh 100
# Or omit the count to load the default of 1000 patients
./scripts/load-synthea-data.sh
The first run takes longer because it also loads the vocabularies. Subsequent runs are faster.
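A quick way to see whether the load worked is to count rows in the `person` table. Since this post does not name the CDM schema, the sketch below looks it up in `information_schema` rather than guessing. The container, user, and database names are the ones used in the Database Access section; `person_counts` is a hypothetical helper.

```shell
# Print the row count of the person table in every schema that has one.
person_counts() {
  # -A (unaligned) and -t (tuples only) make psql print bare values.
  schemas=$(docker exec omop-postgres psql -U ohdsi_admin -d ohdsi -At \
    -c "SELECT table_schema FROM information_schema.tables WHERE table_name = 'person'")
  for s in $schemas; do
    n=$(docker exec omop-postgres psql -U ohdsi_admin -d ohdsi -At \
      -c "SELECT count(*) FROM \"$s\".person")
    echo "$s.person: $n rows"
  done
}

# Usage: person_counts
```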
Run Achilles
Generate database characterization statistics:
docker compose --profile tools up achilles
Results appear in Atlas under “Data Sources”. Runtime varies by data size and hardware.
Access Services
| Service | URL | Description |
|---|---|---|
| Atlas | http://localhost:8081/atlas | OHDSI Atlas - Data exploration UI |
| WebAPI | http://localhost:8080/WebAPI | OHDSI WebAPI - REST API |
| PostgreSQL | localhost:5432 | Database (credentials in .env) |
Common Operations
Add More Data
# Add 200 more patients
./scripts/load-synthea-data.sh 200
# Generate patients from a specific state
./scripts/load-synthea-data.sh 100 California
View Logs
# All services
docker compose logs -f
# Specific service
docker compose logs -f webapi
docker compose logs -f atlas
Database Access
# Using the helper script
./scripts/psql.sh
# Or directly
docker exec -it omop-postgres psql -U ohdsi_admin -d ohdsi
Stop Services
./scripts/stop.sh
# Or with Docker Compose
docker compose down
Reset Everything
docker compose down -v # Removes volumes (deletes all data!)
./scripts/setup.sh # Re-initialize
./scripts/start.sh # Start fresh
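Because `docker compose down -v` deletes the database volume, it may be worth taking a dump before running the reset above. This is a sketch using `pg_dump` inside the container; the container, user, and database names come from the Database Access section, and `backup_name` and `backup_db` are hypothetical helpers.

```shell
# Build a timestamped file name such as ohdsi-20250101-120000.dump.
backup_name() {
  echo "ohdsi-$(date +%Y%m%d-%H%M%S).dump"
}

# Dump the database in PostgreSQL's custom format to a file on the host.
# Optional argument: output path (defaults to a timestamped name).
backup_db() {
  out=${1:-$(backup_name)}
  # No -t flag on docker exec: a TTY would corrupt the binary dump.
  docker exec omop-postgres pg_dump -U ohdsi_admin -d ohdsi -Fc > "$out" \
    && echo "wrote $out"
}

# Usage: backup_db
```

A custom-format dump can be loaded back with `pg_restore`. Expect the file to be large if the full vocabularies are loaded.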
Troubleshooting
“No Data Sources” in Atlas
No data source has been registered with WebAPI yet.
Solution:
./scripts/add-webapi-source.sh 1 "Synthea OMOP CDM" SYNTHEA
docker compose restart webapi atlas
Services Won’t Start
Check if ports are already in use:
# Check port 5432 (PostgreSQL)
lsof -i :5432
# Check port 8080 (WebAPI)
lsof -i :8080
# Check port 8081 (Atlas)
lsof -i :8081
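The three checks above can be folded into one loop. This is a small sketch built on the same `lsof` call; `check_ports` is a hypothetical helper.

```shell
# Report whether each given port already has a process attached.
check_ports() {
  for port in "$@"; do
    if ! command -v lsof >/dev/null 2>&1; then
      echo "port $port: unknown (lsof not installed)"
    elif lsof -i ":$port" >/dev/null 2>&1; then
      echo "port $port: in use"
    else
      echo "port $port: free"
    fi
  done
}

# Usage: check_ports 5432 8080 8081
```

Without root privileges, `lsof` may only see your own processes, so a port reported as free can still be held by another user's service. Running the check with `sudo` gives a more complete picture.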
Vocabularies Not Loading
Ensure vocabulary files are in the correct location:
ls -la ./sample-data/vocabulary/
# Should show: CONCEPT.csv, VOCABULARY.csv, etc.
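If the files are present but the ETL still complains about missing concepts, it can help to check which vocabularies the download actually contains. The sketch below counts rows per `vocabulary_id` in `CONCEPT.csv`, assuming the file is tab-delimited with a header row, as Athena exports are. `vocab_counts` is a hypothetical helper.

```shell
# Count rows per vocabulary_id in a tab-delimited CONCEPT.csv.
# The column is located by header name rather than by position.
vocab_counts() {
  awk -F '\t' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "vocabulary_id") col = i; next }
    col     { count[$col]++ }
    END     { for (v in count) print v "\t" count[v] }
  ' "$1" | sort
}

# Usage: vocab_counts ./sample-data/vocabulary/CONCEPT.csv
```

SNOMED, RxNorm, and LOINC should all appear in the output. If one is missing, it was probably not selected in Athena.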
Conclusion
Running the OHDSI stack locally removes the financial barrier to learning. Cost: $0/month instead of roughly $40-44/month on AWS.
That old laptop you have sitting around is probably sufficient—mine is a 7-year-old HP Pavilion, and it works great.
Plus, there’s something satisfying about pointing at an old laptop in the corner and saying “that’s my OHDSI research server.” :)