Extract Data

Getting data out of OpenClinica is as important as getting data in. You can create a dataset based on any data that’s been entered into OpenClinica, and can combine data from different events and forms as needed. Once you define the data that you want to extract, you can select from a number of different formats to view and work with that data. See details below on how to extract data from OpenClinica.

Create a Dataset

To create a dataset, follow the steps below:

  1. From the Tasks menu, select Create Dataset.

  2. On the left panel, expand the event that contains the form data that you want to extract. If the form is in multiple events, select one event now, and you can add more events and forms in subsequent steps. In the following example, the Baseline event has been expanded.

  3. Select the form that contains the data you want to extract. This populates the center of the page with the items available in the selected form. In this example, Physical Exam was selected.

  4. Select individual items or, to select all items in that form, check the Select All Items checkbox above the item list.
  5. To select additional items, either from the same form in other events or from a different form, click Save and Add More Items, and then repeat steps 2 through 4 until you have all the items you’d like to see in the extracted dataset.

    Note that you also have options to select Event, Participant, and CRF Attributes. The available attributes are listed below:

    You also have the option to Select All Items in the Study.

  6. Once you have selected all the desired items and attributes, click Save and Define Scope.
    The Name and Description page displays.

  7. Provide a name and description for the dataset. The name may contain only alphanumeric characters and underscores. Then select the completion status of the data you want to include in the extract. Ignore the message and fields displayed in the lower portion of the page, and click Confirm and Save.
    The Select Format page displays.

  8. Select the output format for the extracted dataset and click Run Now to extract the data. Note that the first option listed is the most complete extract format. It is the ONLY option that includes the audit log data as well as all of the clinical data and metadata.

    For more information on the extract formats, see Formats for Dataset Files. Though this references a chapter in the OpenClinica 3 user documentation, the extract formats are the same in OpenClinica 4.

  9. OpenClinica displays a page that indicates your extract is running. Wait a few moments (or longer, for large datasets), and then click Back to Dataset to view the extracted data.
    The Select Format page displays again, and your dataset (if it completed) is listed at the bottom of the page. If it is not listed, it is still running or is queued to run. When it completes, it will display at the bottom of the page. Refresh your browser periodically, or at a later time, go to Tasks > View Datasets to see the results.


  10. To download the dataset, in the Action column, click the download link. To delete the dataset (this only deletes the data extract – it does NOT delete data from the database), click the delete icon.

    The data in the dataset reflects the OpenClinica database at the time the dataset file was generated – not at the time that you download the file. The dataset file name includes the date and time that the file was generated.

    You can run the same extract in a number of different formats. OpenClinica retains one dataset file per format for each dataset definition. If you generate a dataset in a format that already has a file, the new file overwrites the existing one. For example, in the above screenshot, an Excel extract was created. If the same extract were run again with Excel selected, the new file would overwrite the Excel file shown above. However, if HTML were selected for the second run of this dataset, two files would be listed: one for Excel and one for HTML.
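
Two rules from the steps above can be sketched in code: the naming rule in step 7 and the one-file-per-format retention rule described in step 10. This is only an illustrative model, not OpenClinica code. The names `is_valid_dataset_name` and `ExtractFileStore`, the file-name layout, and the timestamps are hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters and digits and that an empty name is rejected.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits; empty names are rejected.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_dataset_name(name: str) -> bool:
    """Return True if the name contains only letters, digits, and underscores."""
    return _NAME_PATTERN.fullmatch(name) is not None


class ExtractFileStore:
    """Hypothetical model: one retained file per (dataset, format) pair."""

    def __init__(self) -> None:
        self._files: dict[tuple[str, str], str] = {}

    def run_extract(self, dataset: str, fmt: str, generated_at: str) -> str:
        if not is_valid_dataset_name(dataset):
            raise ValueError(f"invalid dataset name: {dataset!r}")
        # Hypothetical file-name layout; the text says only that the name
        # includes the date and time the file was generated.
        file_name = f"{dataset}_{fmt}_{generated_at}"
        # Assigning to an existing key replaces the earlier file of that format.
        self._files[(dataset, fmt)] = file_name
        return file_name

    def list_files(self, dataset: str) -> list[str]:
        return [name for (ds, _), name in self._files.items() if ds == dataset]


# Illustrative run with made-up timestamps.
store = ExtractFileStore()
store.run_extract("Physical_Exam", "excel", "2024-01-01-0900")
store.run_extract("Physical_Exam", "excel", "2024-01-02-0900")  # replaces the first Excel file
store.run_extract("Physical_Exam", "html", "2024-01-02-0905")   # kept alongside the Excel file
print(store.list_files("Physical_Exam"))
```

Keying the store on the (dataset, format) pair is what makes a second Excel run replace the first while an HTML run adds a second entry.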