De-identifying OMOP Databases
Before building anything, I wanted to understand what’s already out there. I’m planning to build an open-source de-identification pipeline for OMOP data. But before writing any code, I needed to survey the landscape: what tools exist, what’s been proven in production, and what the research says actually works. These are my notes.

Why De-identification Matters

De-identification is the process of removing or transforming data elements that could identify individuals. In healthcare, this typically means addressing the 18 identifiers specified under HIPAA’s Safe Harbor provision, or demonstrating through Expert Determination that re-identification risk is “very small.” ...
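To make "removing or transforming" concrete, here is a minimal sketch of a few Safe Harbor-style generalizations applied to a single record. It is an illustration under stated assumptions, not a compliant pipeline: the function name `generalize_person`, the `small_zip3` parameter, and the flattened record layout are all hypothetical. The field names loosely follow OMOP's person table, with a `zip` value copied in from the location table for brevity. Only three rules are covered (dropping a source identifier, reducing birth dates to the year, and truncating ZIP codes), plus the "90 or older" age bucket, with age approximated by year difference.

```python
from datetime import date


def generalize_person(record, as_of, small_zip3=frozenset()):
    """Return a copy of `record` with a few Safe Harbor-style generalizations.

    `small_zip3` is a hypothetical set of 3-digit ZIP prefixes that cover
    20,000 or fewer people; those prefixes are replaced with "000".
    """
    out = dict(record)

    # Drop the source-system identifier (e.g. a medical record number).
    out.pop("person_source_value", None)

    # Keep only the year of birth; blank out finer-grained date parts.
    for field in ("month_of_birth", "day_of_birth", "birth_datetime"):
        if field in out:
            out[field] = None

    # Fold ages over 89 into one "90 or older" bucket by capping the year.
    # Age is approximated as a simple difference of calendar years.
    year = out.get("year_of_birth")
    if year is not None and as_of.year - year > 89:
        out["year_of_birth"] = as_of.year - 90

    # Keep the first three ZIP digits; low-population prefixes become "000".
    zip_code = out.get("zip")
    if zip_code:
        prefix = zip_code[:3]
        out["zip"] = "000" if prefix in small_zip3 else prefix

    return out


# Made-up example record; none of these values come from real data.
example = {
    "person_id": 1,
    "person_source_value": "MRN-0001",
    "year_of_birth": 1930,
    "month_of_birth": 4,
    "day_of_birth": 12,
    "birth_datetime": "1930-04-12T00:00:00",
    "zip": "03601",
}
cleaned = generalize_person(example, as_of=date(2025, 1, 1), small_zip3={"036"})
print(cleaned)
```

The function returns a copy rather than mutating its input, so the original record stays available for auditing what changed. A real pipeline would have to cover all 18 identifier categories, including dates and free text in the clinical tables, which this sketch does not attempt.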