4 Database Schema

The OpenClinica data model is designed to mirror the structure and nomenclature of the CDISC ODM standard as closely as possible. Key tables in the physical schema represent studies, study subjects, CRFs, items, item data, and other objects, with the relationships between them modelled as foreign keys. The data model follows an 'Entity Attribute Value', or EAV, approach, where data values are saved as individual records in a 'long & skinny' table (item_data) with the entity name and attributes (metadata and other properties) related to the value through foreign key relationships [1]. 
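As a minimal sketch of the EAV idea described above, the following Python example builds a heavily simplified, hypothetical schema in an in-memory SQLite database and pivots the 'long & skinny' rows back into one record per subject. Only the table name item_data comes from the text above. The other tables, every column name (including subject_label on event_crf), and the sample values are assumptions made for illustration and do not reproduce the actual OpenClinica schema.

```python
import sqlite3

# Hypothetical, simplified schema: only the name item_data is taken from the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (
    item_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL              -- the attribute (what was measured)
);
CREATE TABLE event_crf (
    event_crf_id  INTEGER PRIMARY KEY,
    subject_label TEXT NOT NULL        -- the entity (assumed shortcut column)
);
CREATE TABLE item_data (
    item_data_id INTEGER PRIMARY KEY,
    item_id      INTEGER NOT NULL REFERENCES item(item_id),
    event_crf_id INTEGER NOT NULL REFERENCES event_crf(event_crf_id),
    ordinal      INTEGER NOT NULL DEFAULT 1,
    value        TEXT                  -- every value is stored as its own row
);
""")

conn.executemany("INSERT INTO item VALUES (?, ?)",
                 [(1, "HEIGHT_CM"), (2, "WEIGHT_KG")])
conn.executemany("INSERT INTO event_crf VALUES (?, ?)",
                 [(10, "SUBJ-001"), (11, "SUBJ-002")])
conn.executemany(
    "INSERT INTO item_data (item_id, event_crf_id, ordinal, value) "
    "VALUES (?, ?, ?, ?)",
    [(1, 10, 1, "172"), (2, 10, 1, "68.5"), (1, 11, 1, "160")],
)


def pivot_wide(connection):
    """Join each value to its entity and attribute, then group by subject."""
    rows = connection.execute("""
        SELECT ec.subject_label, i.name, d.value
        FROM item_data d
        JOIN item i       ON i.item_id = d.item_id
        JOIN event_crf ec ON ec.event_crf_id = d.event_crf_id
        ORDER BY ec.subject_label, i.name
    """)
    wide = {}
    for subject, name, value in rows:
        wide.setdefault(subject, {})[name] = value
    return wide


wide = pivot_wide(conn)
print(wide)
```

In this sketch, adding a new item to a CRF only requires a new row in item, not a new column. The cost is that reading a complete record means joining and pivoting, and a value that was never entered is simply absent rather than stored as NULL.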

OpenClinica Database Schema Conceptual View

The diagram above is a 'cheat sheet' version of the OpenClinica logical model [2], showing key tables in the schema and their relationships. Note that shorthand abbreviations are used rather than the full table names, so for example 'IG' is used in place of 'item_group'. The arrows represent foreign keys, pointing toward the primary keys. The circled stars mark repeating objects (those with an 'ordinal' column). The lines through IGM and IFM indicate that they are ternary: each of their instances describes a 1:1 relation between a CRFV and an Item.

For a more comprehensive diagram of the current physical data model, see https://dev.openclinica.com/tools/db/relationships.html. From there you can also use the tabs at the top of the page to navigate to more detailed technical views of the database objects. Alternatively, a technical report on the OpenClinica 3.1 Database Model can be downloaded as a PDF.

Here is how key tables in the data model map to CDISC ODM [3], first for study metadata:



and next, for clinical data that is part of a study:


In principle, the OpenClinica data model is designed to closely mirror the structure and nomenclature of CDISC ODM. In practice there are deviations, either where the logical design of OpenClinica differs from ODM or where the physical implementation of the data model differs from ODM's XML structure. While we try to avoid both types of deviation, they are sometimes unavoidable. Other deviations are simply legacy artifacts, with no good reason for departing from the standard. Where deviations exist, we look for ways to refactor the database schema to harmonize it better with ODM, since we believe this makes OpenClinica more consistent, more easily understood, and easier to develop for.
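To make the correspondence with ODM's XML structure more concrete, the sketch below nests flat, item_data-style rows into the ODM clinical data hierarchy (ClinicalData, SubjectData, StudyEventData, FormData, ItemGroupData, ItemData). The element and attribute names are those of ODM. The row layout, the OIDs, the MetaDataVersionOID value, and the idea that the 'ordinal' column supplies ItemGroupRepeatKey are assumptions for illustration, not OpenClinica's actual export code.

```python
import xml.etree.ElementTree as ET


def _child(parent, tag, **attrs):
    """Return the existing child with exactly these attributes, or create it."""
    for el in parent.findall(tag):
        if all(el.get(k) == v for k, v in attrs.items()):
            return el
    return ET.SubElement(parent, tag, attrs)


def to_odm_clinical_data(study_oid, rows):
    """Nest flat rows (one per stored value) into ODM ClinicalData."""
    clinical = ET.Element(
        "ClinicalData",
        {"StudyOID": study_oid, "MetaDataVersionOID": "v1.0.0"},  # assumed OID
    )
    for r in rows:
        subject = _child(clinical, "SubjectData", SubjectKey=r["subject"])
        event = _child(subject, "StudyEventData", StudyEventOID=r["event"])
        form = _child(event, "FormData", FormOID=r["form"])
        group = _child(
            form,
            "ItemGroupData",
            ItemGroupOID=r["group"],
            ItemGroupRepeatKey=str(r["ordinal"]),  # assumed role of 'ordinal'
        )
        ET.SubElement(group, "ItemData", {"ItemOID": r["item"], "Value": r["value"]})
    return clinical


# Hypothetical rows: two vitals in a non-repeating group, one repeating group.
base = {"subject": "SS_001", "event": "SE_VISIT1", "form": "F_VITALS_V1"}
rows = [
    {**base, "group": "IG_VITALS", "ordinal": 1, "item": "I_HEIGHT", "value": "172"},
    {**base, "group": "IG_VITALS", "ordinal": 1, "item": "I_WEIGHT", "value": "68.5"},
    {**base, "group": "IG_MEDS", "ordinal": 1, "item": "I_MED", "value": "aspirin"},
    {**base, "group": "IG_MEDS", "ordinal": 2, "item": "I_MED", "value": "ibuprofen"},
]

root = to_odm_clinical_data("S_DEMO", rows)
print(ET.tostring(root, encoding="unicode"))
```

Each flat row contributes one ItemData element, while the enclosing elements are shared by all rows with the same keys. This is the kind of structural difference between a relational implementation and ODM's nested XML that the paragraph above refers to.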

[1] - For more on EAV data models, see Wikipedia and Nadkarni et al., "Organization of heterogeneous scientific data using the EAV/CR representation," Journal of the American Medical Informatics Association, 1999 Nov-Dec; 6(6):478-93.

[2] - Many thanks to Marco van Zwetselaar of Kilimanjaro Clinical Research Institute for contributing this diagram on the OpenClinica users mailing list.

[3] - See the OpenClinica Blog for more detail on how the data model maps to ODM.
