OMOP Common Data Model

The purpose of the OMOP Common Data Model (CDM) is to standardize the format and content of observational data so that standardized applications, tools and methods can be applied to it. This page explains the Common Data Model and provides a collection of ETL implementations for a number of popular databases. All material is licensed under the Apache License Version 2.0, which makes it available to the public as Open Source with minimal restrictions. To learn what you need to do to use the CDM, please review the License.


Below is the latest CDM, Version 4.0. In addition to person, condition, drug, procedure and visit information, it now models provider and cost information. This supports health economics use cases and medical treatment outcome studies, including medical device safety, comparative effectiveness and healthcare quality.

This is the DRAFT specification for OMOP CDM Version 5.0.

The ETL step is the most time-consuming part of creating a database in OMOP format. You need to write a script or program that converts your data to meet the specification. To make this easier and keep the work organized, below is a template for mapping:

In addition to mapping and transforming the data to the CDM, the content also has to conform to the Standard Vocabularies. This is done through a process called Vocabulary Mapping, described in the Standard Vocabulary Specifications.
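As a rough illustration of how structural mapping and Vocabulary Mapping fit together, here is a minimal Python sketch that converts one source record into a PERSON-style row. The source field names (`patient_id`, `sex`, `dob`), the `PersonRow` class and the hard-coded lookup table are hypothetical; the target column names and the concept IDs shown should be checked against the CDM specification and the Standard Vocabularies before use. Falling back to concept ID 0 for unmapped values is likewise an assumption.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical source-to-standard lookup. A real ETL would load this from
# the Standard Vocabulary tables instead of hard-coding it.
GENDER_TO_CONCEPT = {"M": 8507, "F": 8532}
UNMAPPED_CONCEPT_ID = 0  # assumed placeholder for values with no mapping


@dataclass
class PersonRow:
    """A small subset of PERSON-style columns, for illustration only."""
    person_id: int
    gender_concept_id: int
    year_of_birth: int
    gender_source_value: str


def to_person_row(source: dict) -> PersonRow:
    """Convert one hypothetical source record into a PERSON-style row."""
    raw_gender = source["sex"].strip().upper()
    birth = date.fromisoformat(source["dob"])
    return PersonRow(
        person_id=int(source["patient_id"]),
        # Vocabulary Mapping: the source code becomes a standard concept ID.
        gender_concept_id=GENDER_TO_CONCEPT.get(raw_gender, UNMAPPED_CONCEPT_ID),
        year_of_birth=birth.year,
        # The original value is kept so the mapping can be audited later.
        gender_source_value=source["sex"],
    )
```

Keeping the untouched source value next to the mapped concept ID makes it possible to review or redo the mapping without going back to the raw data.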

The following is a list of existing ETL implementations for a number of popular databases. All of them are Open Source. Note, however, that some of them convert to earlier versions of the CDM.



1. Scalable Architecture for Federated Therapeutic Inquiries Network (SaftiNet)

2. Clinical Practice Research Datalink (CPRD)




3. Generalized ERA Logic Developer (GERALD)
The last step in transforming your data is to create eras for drugs and conditions. GERALD does this on data that is already in OMOP format.
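To give a sense of what era creation involves, the following Python sketch merges individual exposure records for the same person and concept into continuous eras. It is not GERALD's actual logic: the function name `build_eras`, the tuple layout and the 30-day gap allowance are assumptions chosen for illustration.

```python
from datetime import date, timedelta
from itertools import groupby
from operator import itemgetter

# Assumed gap allowance: records separated by no more than this many days
# are treated as one continuous era.
PERSISTENCE_WINDOW = timedelta(days=30)


def build_eras(exposures, gap=PERSISTENCE_WINDOW):
    """Merge (person_id, concept_id, start, end) tuples into eras.

    Returns a list of (person_id, concept_id, era_start, era_end, count),
    where count is the number of source records folded into the era.
    """
    eras = []
    # Sort so each person/concept group is contiguous and ordered by start.
    ordered = sorted(exposures, key=lambda r: (r[0], r[1], r[2]))
    for (person_id, concept_id), rows in groupby(ordered, key=itemgetter(0, 1)):
        era_start = era_end = None
        count = 0
        for _, _, start, end in rows:
            if era_start is None:
                # First record of the group opens the first era.
                era_start, era_end, count = start, end, 1
            elif start <= era_end + gap:
                # Close enough to the current era: extend it.
                era_end = max(era_end, end)
                count += 1
            else:
                # Gap too large: close the current era and open a new one.
                eras.append((person_id, concept_id, era_start, era_end, count))
                era_start, era_end, count = start, end, 1
        eras.append((person_id, concept_id, era_start, era_end, count))
    return eras
```

The same routine can be applied to condition records by passing condition occurrences instead of drug exposures, possibly with a different `gap`.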