Standard Vocabulary

As of June 30, 2016, the IMEDS Research Lab will be suspended for the remainder of the year, until further notice. Thank you for your support. For additional information or questions about this update, or to learn more about other areas of the IMEDS program, please contact IMEDS@reaganudall.org.

All this material is licensed under the Apache License Version 2.0, making it available to the public as Open Source with minimal restrictions.

The Standard Vocabulary originated in the Observational Medical Outcomes Partnership (OMOP). It is a foundational tool that enables transparent and consistent content across disparate observational databases, and it supports the research community in conducting efficient and reproducible observational research. It contains all of the code sets, terminologies, vocabularies, nomenclatures, lexicons, thesauri, ontologies, taxonomies, classifications, abstractions, and other such data that are required for:

  1. Creating the transformed (i.e., standardized) data from the raw data sets,
  2. Searching and querying the transformed data, and browsing and navigating the hierarchies of classes and abstractions inherent in the transformed data, and
  3. Interpreting the meanings of the data.


Click on the document titles below to obtain the Standard Vocabulary Implementation Specification and latest vocabulary data files and tools.


Using the vocabulary with your data

All content in your data, such as drugs or conditions, is referred to by concepts. You therefore need the Standard Vocabulary to understand and make use of these concepts. The Standard Vocabulary also provides additional class concepts, relationships and ancestry relationships between concepts, and a source-to-concept map that you need to convert non-standard vocabularies to the Standard Vocabulary during the ETL process of your data. To use the Standard Vocabulary, you must load it into the database or SAS file system next to your data. For information about this, download the specification, the DDL files and the vocabulary data files from the list above.
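As an illustration of the conversion step described above, the following is a minimal sketch of how a source-to-concept map might be applied during ETL. Everything in it is an assumption made for the example: the vocabulary names, the codes, the concept IDs, the `UNMAPPED_CONCEPT_ID` placeholder and the function names are hypothetical and do not come from a vocabulary release or from the Implementation Specification.

```python
from typing import Dict, List, Tuple

# Hypothetical, hand-made rows standing in for a source-to-concept map.
# Keys are (source vocabulary, source code); values are standard concept IDs.
# None of these identifiers come from a real vocabulary release.
SOURCE_TO_CONCEPT: Dict[Tuple[str, str], int] = {
    ("LOCAL_DRUG", "A100"): 9000001,
    ("LOCAL_DRUG", "A200"): 9000002,
    ("LOCAL_DX", "D-17"): 9100001,
}

# Assumed placeholder for source codes that have no mapping.
UNMAPPED_CONCEPT_ID = 0


def map_source_code(vocabulary: str, code: str) -> int:
    """Return the standard concept ID for a source code, or the placeholder."""
    return SOURCE_TO_CONCEPT.get((vocabulary, code), UNMAPPED_CONCEPT_ID)


def transform(records: List[Tuple[str, str]]) -> List[int]:
    """Convert raw (vocabulary, code) records into standard concept IDs."""
    return [map_source_code(vocabulary, code) for vocabulary, code in records]


if __name__ == "__main__":
    raw = [("LOCAL_DRUG", "A100"), ("LOCAL_DX", "D-17"), ("LOCAL_DX", "??")]
    print(transform(raw))
```

In a real ETL the map would be read from the loaded vocabulary tables rather than hard-coded, and unmapped codes would typically be logged for review instead of being silently replaced.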

Vocabulary releases

The Standard Vocabulary undergoes constant changes: concepts are created, improved or deprecated because of bug fixes, changes in the data sources or changes in the underlying reality of health care. For example, new drugs are developed and enter the market, new procedures are invented, and new diagnostic codes are introduced. The same is true for relationships and mappings. We therefore release the vocabulary files quarterly and provide Release Notes. Sometimes, interim releases are necessary for urgent bug fixes or additions. Check the release schedule regularly, or sign up for release notifications. The process by which we build the Standard Vocabulary is available as Open Source. We encourage the community to help improve the vocabulary build process. If you find problems or know of a better way, contact IMEDS with your input.

Querying the vocabulary

The Standard Vocabulary is organized so that all vocabularies are represented in the same fashion, no matter their origin. They can therefore be queried in a standardized fashion. IMEDS maintains a collection of Standard Queries that answer typical questions relevant to data researchers, such as identification of conditions and drugs, membership in classifications, etc.
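To give a sense of what a classification-membership query can look like, here is a small sketch that uses Python's built-in `sqlite3` module and an in-memory database. It is not one of the IMEDS Standard Queries. The two tables are a deliberately simplified, assumed layout (a concept table and an ancestry table), and all concept IDs and names are invented for the example; consult the Implementation Specification and DDL files for the actual table definitions.

```python
import sqlite3
from typing import List


def build_demo_vocabulary() -> sqlite3.Connection:
    """Create an in-memory database with a tiny, made-up vocabulary."""
    conn = sqlite3.connect(":memory:")
    # Simplified, assumed layout; the real tables have more columns.
    conn.executescript(
        """
        CREATE TABLE concept (
            concept_id   INTEGER PRIMARY KEY,
            concept_name TEXT NOT NULL
        );
        CREATE TABLE concept_ancestor (
            ancestor_concept_id   INTEGER NOT NULL,
            descendant_concept_id INTEGER NOT NULL
        );
        """
    )
    # Hypothetical concepts: one class, two members, one unrelated concept.
    conn.executemany(
        "INSERT INTO concept VALUES (?, ?)",
        [
            (1, "Example drug class"),
            (2, "Example drug A"),
            (3, "Example drug B"),
            (4, "Unrelated condition"),
        ],
    )
    conn.executemany(
        "INSERT INTO concept_ancestor VALUES (?, ?)",
        [(1, 2), (1, 3)],
    )
    return conn


def descendants_of(conn: sqlite3.Connection, ancestor_id: int) -> List[str]:
    """Return the names of all concepts recorded as descendants of a class."""
    rows = conn.execute(
        """
        SELECT c.concept_name
        FROM concept_ancestor AS a
        JOIN concept AS c ON c.concept_id = a.descendant_concept_id
        WHERE a.ancestor_concept_id = ?
        ORDER BY c.concept_name
        """,
        (ancestor_id,),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    connection = build_demo_vocabulary()
    print(descendants_of(connection, 1))
```

Because every vocabulary is stored in the same shape, the same join would apply whether the class concept comes from a drug classification or a condition hierarchy; only the ancestor concept ID changes.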

License information

For the most part, vocabularies have been adapted from public or proprietary sources. Very few vocabularies were created de novo by IMEDS. All publicly available vocabularies are called "unrestricted" and are distributed in a file that is Open Source, licensed under the Apache License Version 2.0. Some third-party vocabularies (called "restricted") are available only for certain research purposes, and an End User License Agreement (EULA) has to be executed before they can be used.

  • Table of Standard Vocabularies - Unrestricted and Restricted List