OSIM2 - Observational Medical Dataset Simulator Generation 2

Note: OSIM2 is not available in OMOP CDM V4, only in CDM V2 format.

The Observational Medical Dataset Simulator (OSIM) is an open-source software application, written in R, that allows users to create simulated datasets that conform to the OMOP Common Data Model. OSIM2 is an alternative design that accommodates additional complexities observed in real-world data, including advanced modeling of the correlations between drugs and conditions. OSIM2 allows more direct comparisons between simulated data and real observational databases, and should enable more thorough evaluation of methods by allowing assessment of how they handle these complex interrelationships. OSIM2 can be used to benchmark the performance of methods that estimate the strength of association between drug treatment and outcome.

Please contact IMEDS to share your experience with the OSIM2 datasets.

OSIM2 source code, documentation, and databases are available for download:

  • OSIM2 Introduction (without audio)
  • OSIM2 Introduction (with audio narration)
  • OSIM2 Architecture and Execution
  • OSIM2 Source Code and Documentation
  • OSIM2 validation dashboard procedures
  • Download of OSIM2 Datasets
    There are 16 OSIM2 datasets available for download. Each is a 10-million-person dataset modeled after the Thomson Reuters MarketScan® Lab Database (MSLR). One dataset has no signals injected; the other 15 have injected signals of different sizes and types (relative risk: 1.25, 1.5, 2, 4, 10; risk type: acute onset (equivalent to 'any exposure': events occurring within 30d of exposure start), insidious, and accumulative). MSLR covers 2003 – 2009 and represents a privately insured population, with administrative claims from inpatient, outpatient, and pharmacy services supplemented by laboratory results.

    The datasets listed below are freely available for download through an anonymous FTP server. For example, you can download OSIM2_10M_MSLR_MEDDRA_6, which has a set of signals injected at RR=1.50 with insidious onset (during exposure or 30d afterwards).

    OSIM2 Dataset               Injected Relative Risk   Risk Type      Size
    OSIM2_10M_MSLR_MEDDRA_0     None                     None           3.5GB
    OSIM2_10M_MSLR_MEDDRA_3     1.25                     Insidious      3.5GB
    OSIM2_10M_MSLR_MEDDRA_6     1.5                      Insidious      3.5GB
    OSIM2_10M_MSLR_MEDDRA_9     2                        Insidious      3.5GB
    OSIM2_10M_MSLR_MEDDRA_12    4                        Insidious      3.5GB
    OSIM2_10M_MSLR_MEDDRA_15    10                       Insidious      3.8GB
    OSIM2_10M_MSLR_MEDDRA_2     1.25                     Any Exposure   3.5GB
    OSIM2_10M_MSLR_MEDDRA_5     1.5                      Any Exposure   3.5GB
    OSIM2_10M_MSLR_MEDDRA_8     2                        Any Exposure   3.5GB
    OSIM2_10M_MSLR_MEDDRA_11    4                        Any Exposure   3.5GB
    OSIM2_10M_MSLR_MEDDRA_14    10                       Any Exposure   3.6GB
    OSIM2_10M_MSLR_MEDDRA_1     1.25                     Accumulative   3.5GB
    OSIM2_10M_MSLR_MEDDRA_4     1.5                      Accumulative   3.5GB
    OSIM2_10M_MSLR_MEDDRA_7     2                        Accumulative   3.5GB
    OSIM2_10M_MSLR_MEDDRA_10    4                        Accumulative   3.5GB
    OSIM2_10M_MSLR_MEDDRA_13    10                       Accumulative   3.7GB
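
The numeric suffix of each dataset name appears to follow a regular pattern in the table above: each group of three consecutive numbers shares one relative risk, and within a group the risk types run Accumulative, Any Exposure, Insidious. The following Python sketch decodes a dataset name on that basis. The pattern is inferred from the table, not taken from OSIM2 documentation, and the function name `describe_dataset` is hypothetical.

```python
import re

# Relative risks in the order they appear across groups of three suffixes.
RELATIVE_RISKS = [1.25, 1.5, 2.0, 4.0, 10.0]
# Risk-type order observed within each group of three (inferred from the table).
RISK_TYPES = ["Accumulative", "Any Exposure", "Insidious"]

NAME_PATTERN = re.compile(r"^OSIM2_10M_MSLR_MEDDRA_(\d+)$")


def describe_dataset(name):
    """Return (relative_risk, risk_type) for a dataset name.

    Suffix 0 is the dataset without injected signals and yields (None, None).
    """
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not an OSIM2 MSLR MedDRA dataset name: {name!r}")
    index = int(match.group(1))
    if index == 0:
        return (None, None)
    if not 1 <= index <= 15:
        raise ValueError(f"dataset number out of range 0-15: {index}")
    relative_risk = RELATIVE_RISKS[(index - 1) // 3]
    risk_type = RISK_TYPES[(index - 1) % 3]
    return (relative_risk, risk_type)


if __name__ == "__main__":
    for i in range(16):
        dataset = f"OSIM2_10M_MSLR_MEDDRA_{i}"
        print(dataset, describe_dataset(dataset))
```

A lookup like this can help label results when a method is run across all 16 datasets, but the table remains the authoritative listing.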

    These are very large files. We have tested the OSIM2 dataset downloads using FileZilla and WS-FTP. FileZilla is a free, open-source FTP client that can be downloaded from: http://filezilla-project.org/download.php

    To log in to the anonymous FTP server, use the following credentials:

    Login: anonymous
    Password: (leave blank)
    The server supports the SFTP protocol (port 22).
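
For scripted access instead of a graphical client, the login above could be sketched in Python as follows. This is a minimal sketch under several assumptions: it uses the third-party `paramiko` library, the server accepts password authentication with an empty password for the anonymous user, and the host name is supplied by the caller because it is not stated on this page. The function names `open_sftp` and `list_top_level` are hypothetical.

```python
ANON_USER = "anonymous"
ANON_PASSWORD = ""  # the credentials above say to leave the password blank
SFTP_PORT = 22


def open_sftp(host, port=SFTP_PORT):
    """Open an SFTP session and return (transport, sftp_client).

    `host` must be provided by the caller; it is not given in this document.
    """
    import paramiko  # third-party dependency, imported lazily

    transport = paramiko.Transport((host, port))
    try:
        # No host-key verification is done here; pass hostkey=... to
        # connect() if the server's key is known in advance.
        transport.connect(username=ANON_USER, password=ANON_PASSWORD)
    except Exception:
        transport.close()
        raise
    sftp = paramiko.SFTPClient.from_transport(transport)
    return transport, sftp


def list_top_level(host):
    """Return the names of the entries in the server's starting directory."""
    transport, sftp = open_sftp(host)
    try:
        return sftp.listdir(".")
    finally:
        sftp.close()
        transport.close()
```

Individual files could then be fetched with the client's `get(remotepath, localpath)` method; given the multi-gigabyte file sizes, a client with resume support such as FileZilla may still be the more practical choice.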

    On the server, there are two main folders:
    ● MedDRA: All data in this folder use MedDRA-based condition concepts.
        ○ Transition matrices. Transition matrices are currently available for the following databases: GE, MDCD, MDCR, MSLR.
        ○ OSIM2 datasets. All 16 OSIM2 datasets are available in individual directories. OSIM2 is not available in CDM V4, only in V2 format.

    ● SNOMED: All data in this folder use SNOMED-CT-based condition concepts.
        ○ Transition matrices. Transition matrices are currently available for the following databases: CCAE, MDCD, MDCR, MSLR.
        ○ In the future, OSIM2 data will be available in SNOMED format.