Simulated Observational Data - OSIM2
OSIM2 is not available in OMOP CDM V4, only in CDM V2 format.
The Observational Medical Dataset Simulator (OSIM) is an open-source software application, written in R, that allows users to create simulated datasets that conform to the OMOP Common Data Model. OSIM2 represents an alternative design to accommodate additional complexities observed in real-world data, including advanced modeling of the correlations between drugs and conditions. OSIM2 allows for more direct comparisons between simulated data and real observational databases, and should enable more thorough evaluation of methods by allowing assessment of how they accommodate these complex interrelationships. OSIM2 can be used to benchmark the performance of methods that estimate the strength of association between drug treatment and outcome.
Please contact IMEDS to share your experience with the OSIM2 datasets.
OSIM2 source code, documentation, and databases are available for download:
Download of OSIM2 Datasets
There are 16 OSIM2 datasets available for download. Each is a 10-million-person dataset modeled after the Thomson Reuters MarketScan® Lab Database (MSLR). One dataset has no signals injected; the other 15 have signals of different sizes and types, defined by relative risk (1.25, 1.5, 2, 4, or 10) and risk type (acute onset, insidious, or accumulative). Acute onset corresponds to 'any exposure' events occurring within 30 days of exposure start. MSLR, covering 2003 – 2009, represents a privately insured population, with administrative claims from inpatient, outpatient, and pharmacy services supplemented by laboratory results.
The datasets listed below are freely available for download through an anonymous FTP server. For example, you can download OSIM2_10M_MSLR_MEDDRA_6, which has a set of signals injected at RR=1.50 with insidious onset (during exposure or up to 30 days afterwards).
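The count of 16 follows from the relative risks and risk types listed above: five relative risks times three risk types gives 15 signal configurations, plus one dataset with no signals. A minimal Python sketch of that arithmetic is shown here. The names `RELATIVE_RISKS`, `RISK_TYPES`, and `signal_configurations` are illustrative and not part of OSIM2, the sketch assumes each of the 15 signal datasets corresponds to exactly one relative-risk and risk-type pair, and the enumeration order is arbitrary and does not reflect how the dataset files are numbered.

```python
from itertools import product

# Values taken from the description above; the variable names are illustrative.
RELATIVE_RISKS = [1.25, 1.5, 2, 4, 10]
RISK_TYPES = ["acute onset", "insidious", "accumulative"]


def signal_configurations():
    """Return one baseline entry plus every relative-risk / risk-type pair.

    Assumption: each signal dataset uses exactly one (relative risk, risk type)
    pair. The order here does NOT correspond to the dataset file numbering.
    """
    # Baseline dataset: no signals injected.
    configs = [{"relative_risk": None, "risk_type": None}]
    for risk_type, rr in product(RISK_TYPES, RELATIVE_RISKS):
        configs.append({"relative_risk": rr, "risk_type": risk_type})
    return configs


configs = signal_configurations()
print(len(configs))  # 16
```

A grid like this can be useful when planning a benchmark run, for example to check that a method has been evaluated against every relative risk and risk type before comparing results.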
|OSIM2 Datasets||Injected Signals at Relative Risk Equals||Risk Type||Size|
These are very large files. We have tested the OSIM2 dataset downloads using FileZilla and WS-FTP. FileZilla is a free, open-source client that can be downloaded from http://filezilla-project.org/download.php
To log in to the anonymous FTP server use the following credentials:
Our FTP server supports the SFTP protocol (port 22).
On the server, there are two main folders:
● MedDRA: All data in this folder use MedDRA based condition concepts.
○ Transition Matrices. Currently there are transition matrices available for the following databases: GE, MDCD, MDCR, MSLR
○ OSIM2 dataset. All 16 OSIM2 datasets are available in individual directories. OSIM2 is not available in CDM V4, only in V2 format.
● SNOMED: All data in this folder use SNOMED-CT based condition concepts.
○ Transition Matrices. Currently there are transition matrices available for the following databases: CCAE, MDCD, MDCR, MSLR
○ IN THE FUTURE: OSIM2 data will be available in SNOMED format.