This page explains how you can create and manage datasets in OpenClinica. A dataset defines the specific events, forms, items, and attributes you want to extract from your study. Once created, you can run the extract on demand or through a scheduled job, and download the resulting files in multiple formats.

What Is an Extract?

A dataset is a saved definition of the events, forms, items, and attributes you want to include in your output; an extract is the set of files OpenClinica generates when you run that dataset. In addition to form items, you can add event attributes, participant attributes, and CRF attributes to your dataset. This gives you fine-grained control over which data are included when you generate an extract.

OpenClinica supports datasets that include:

Any clinical data collected in the study, including data from Participate forms
Items from multiple events and forms
Event-, participant-, and CRF-level attributes
Either all available items or only those you select

Why Use a Dataset?

Datasets allow you to:

Generate customized extracts for analysis or reporting
Combine data across events, forms, and sites
Produce consistent extracts over time using the same dataset definition
Download data in multiple output formats
Run extracts on demand or schedule them to run automatically

When you run a dataset, OpenClinica generates an extract containing data for all participants and sites you have permission to access.

Who Can Create and Run Extracts?

The following roles typically have access to create and run datasets:

Data Managers
Data Specialists
Investigators
Monitors

Access to specific extracts may vary depending on your study configuration and role permissions.

How Extracts Work

When you run a dataset:

OpenClinica compiles the selected data from all permitted participants and sites
You can download the extract in multiple formats
Extracts can be run manually or scheduled to run automatically

ℹ️ Note: Archived or removed forms are not included in extracts by default. Archived form versions are included. This behavior also applies to Participant Casebooks and API-based extracts unless overridden.

💡 Tip: To extract all data for a single participant, use a Participant Casebook. For details, refer to Generating Participant Casebooks.

Create a Dataset

This section guides you through creating a dataset by selecting Events, Forms, and Items to include in your extract.

Open the Create Dataset Page
1. In Study Runner, click Tasks in the header bar.
2. Select Create Dataset.

Select Events, Forms, and Items
Use the left panel to choose the data you want to include.
Select an Event
1. Expand the Event that contains the Form data you want to extract.
  - If the same Form appears in multiple Events, select one now—you can add additional Events later.

Select a Form

1. Click the Form containing the Items you want to extract.

Select Items

1. Choose specific Items, or select all Items in the Form:

  - Check Select All Items to include every Item within the Form.
  - Otherwise, select Items individually.

⚠️ Important:

Forms that contain only contact data Items do not appear during dataset creation. Similarly, contact data Items are not available for inclusion in the extract.

All non–contact data items are available for selection when building a dataset, and Item metadata is always visible. Actual data access is enforced only when the extract is run or downloaded.

Add More Items (Optional)

1. To select additional Items from other Events or Forms:

  - Click Save and Add More Items.
  - Repeat the selection steps above until all desired Items are included.

ℹ️ Note: You may also add data from Events, Participants, and CRFs using the Event/Participant/CRF Attributes screens.
To include the entire study, click Select All Items in the Study.

Name and Describe the Dataset
1. Click Save and Define Scope.
  The Name and Description page appears.
2. Enter a Name and Description for the dataset.

ℹ️ Note: Dataset names must use alphanumeric characters; underscores are allowed.

Choose Item Status
From the Item Status field, select which CRF status you want to include:
- CRFs marked Complete
- CRFs not marked Complete
- All CRFs

Save the Dataset Definition
Ignore the message and optional fields shown in the lower portion of the screen, then click Confirm and Save.
Your dataset definition is now saved. As additional data are entered in the study, future extracts will include any new data that match your saved definition.

Run an Extract

On the Select Format screen, choose an output format for your dataset.
Click Run Now.
You will receive an email notification when the extract has finished processing.

OpenClinica displays a progress message while the extract is running. Large datasets may take several minutes to complete. While processing, the dataset status appears as IN PROGRESS.

ℹ️ Note: The first format option—CDISC ODM XML 1.3 Full with OpenClinica extensions—is the most complete extract. It is the only format that includes:
• Audit log data
• All clinical data
• All metadata

Download an Extract

You can download an extract using either method below:

Click the link in the email you receive after the extract is complete, or
Download from the user interface:
1. Click Back to Dataset to return to the status screen.
2. Locate your dataset at the bottom of the page.
3. Click Download in the Actions column next to the extract file.

ℹ️ Note: Some extract files may display “Filtered” in the file name or may be unavailable for download. Refer to the following section for details on filtered extracts and access restrictions.

Extract Permissions and Access Rules

OpenClinica enforces form-level permissions when running and downloading the extract. These rules ensure you only access data from Forms you are authorized to view. Metadata (Item names and labels) is always visible.

Permissions When Running an Extract
When you run an extract:
- OpenClinica checks your access to all permission tags associated with selected Forms.
- If you lack access to any Form:
  - Metadata remains visible, but the restricted Form data is omitted.
  - The system generates a filtered extract with the prefix filtered_.
Permissions When Downloading an Extract
Before a dataset (full or filtered) is downloaded, OpenClinica performs a second permission check:
- You must have access to every Form included in the dataset.
- If you lack access to one or more Forms:
  - The download is blocked.
  - An error message appears.
  - You cannot view, open, or delete the dataset file.
This behavior applies to extracts you created and extracts created by other users with different permission sets.

For more information about form permissions, refer to Managing Form Access and Permissions.

Deleting a Dataset

To delete a dataset, click the Delete button in the Actions column. Deleting a dataset removes only the dataset definition—it does not remove any data from the OpenClinica database.

ℹ️ Notes:

The dataset file reflects the data that existed at the time the extract was generated. The file name includes the date and time of generation.
If you do not have access to one or more Forms in the extracted dataset file, you cannot delete the dataset or filtered dataset.

File Overwrite Behavior

OpenClinica retains one dataset file per format for each dataset definition. If you run the same extract again in a format that already exists:

The new file overwrites the previous file of that format.
If you run the extract in a new format, both files remain available.

For example, if an extract was originally generated as Excel and you run it again as Excel, the Excel file is replaced. If you then run the same extract as HTML, both the Excel and HTML files will be available.

Scheduled Export Jobs

Scheduled export jobs allow you to automate dataset extracts so they run at predefined intervals without manual intervention.

View Scheduled Jobs

To view scheduled export jobs:

In Study Runner, click Tasks in the header bar.
Under Extract Data, select Jobs.
The Scheduled Export Data Jobs screen appears.

View Job Details

Go to Tasks > Jobs under Extract Data.
Click View in the Actions column.

Create a Scheduled Job

Only users with the Admin User Type can create scheduled jobs.

Click Tasks > Jobs under Extract Data.
Select Create New Scheduled Extract.
Complete all required fields.
Click Confirm and Save to create the job, or Cancel to discard it.
You will receive an email when the job completes.

Retention Setting

The Number of files to save field allows you to retain up to 10 past extract files.

The most recent file is always emailed to recipients.
Older files can be accessed through the API.

ℹ️ Note: The default date/time is the current server time. Any date/time after the server time is valid.

Edit a Scheduled Job

Navigate to Tasks > Jobs.
Click Edit in the Actions column.
Update one or more fields.
Click Confirm and Save, or Cancel.

Remove a Scheduled Job

Open Tasks > Jobs.
Click Remove in the Actions column.
- Removing a job stops it from running but allows it to be restored later.
Confirm removal.

Restore a Scheduled Job

Go to Tasks > Jobs.
Click Restore in the Actions column.

Delete a Scheduled Job

Go to Tasks > Jobs.
Click Delete in the Actions column.
Confirm the deletion.

⚠️ Warning: Deleting a job permanently removes it. Unlike the Remove action, deleted jobs cannot be restored.

Role-Based Access to Scheduled Job APIs

Some users may also access scheduled job files through the Scheduled Jobs API on the Web Services Information screen.

User Type	Roles	Allowed Actions	Access Limitations
Study-level roles	Data Managers; Data Specialists; Data Monitors	Call the job execution API; call the job file retrieval API; download dataset files	Access is not limited by site, but is controlled by study permissions and permission tags
Site-level roles	Site Monitors; Investigators	Call the job execution API and view job UUIDs for their site; call the job file retrieval API and download files for jobs scheduled for their site only; access only datasets they have both role and permission tag access to	Cannot access job files for other sites or study-level jobs
No API access	Clinical Research Coordinators (CRCs); Data Entry Persons; Site Viewers; Study Viewers	None	Cannot execute jobs or retrieve files via API

Permission Enforcement

Retrieving a job file via API follows the same form-level permission rules as manual downloads:

If you lack access to any Form included in the dataset, API retrieval fails.

Relevant API Endpoints

Get job execution UUIDs:
GET /auth/api/extractJobs/{jobUuid}/jobExecutions
Retrieve a dataset file for a job execution:
GET /auth/api/extractJobs/jobExecutions/{jobExecutionUuid}/dataset
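
As a rough illustration, the two endpoints above could be called from a script along the lines of the sketch below. The base URL, the bearer-token Authorization header, and the function names are assumptions made for this example and are not taken from this page; check the Web Services Information screen for the host and authentication scheme your instance actually uses.

```python
import urllib.request

# Hypothetical host; replace with the base URL of your OpenClinica instance.
BASE_URL = "https://example.openclinica.io"


def job_executions_url(job_uuid: str) -> str:
    """URL that lists the execution UUIDs for one scheduled job."""
    return f"{BASE_URL}/auth/api/extractJobs/{job_uuid}/jobExecutions"


def dataset_url(job_execution_uuid: str) -> str:
    """URL that returns the dataset file for one job execution."""
    return f"{BASE_URL}/auth/api/extractJobs/jobExecutions/{job_execution_uuid}/dataset"


def fetch(url: str, token: str) -> bytes:
    """GET a URL, sending the token as a bearer token (an assumed auth scheme)."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return response.read()
```

A script would typically call fetch(job_executions_url(...), token) first to find an execution UUID, then fetch(dataset_url(...), token) to retrieve the file for that execution.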

Dataset Formats

OpenClinica allows you to download datasets in several formats based on how you want to view the data. Tabular formats (Tab Delimited Text, HTML, and Excel) are the easiest to read.

Available Dataset Formats

Tabular Formats

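The tabular formats (Tab-Delimited Text, HTML, and Excel) are the easiest to read and are commonly used for review and reporting. The table below lists all available formats, including the non-tabular ones.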

Format	File Type(s)	Applications	Description
Tab-Delimited Text	tsv	Text Editor	Easy to read; Includes a table with information on the dataset and a table that contains the data; can be parsed by other programs
HTML	html	Internet Browser	Easy to read; Includes a table with information on the dataset and a table that contains the data
Excel	xls	Excel	Easy to read; Includes a table with information on the dataset and a table that contains the data
SPSS	dat sps	IBM SPSS	.sps file contains information about the dataset; .dat file contains the data; uses different syntax; useful for analysis
CDISC ODM XML (1.2 or 1.3, With extensions, or Full)	xml	XML Editor or Internet Browser	The most complete extract; Contains information about the dataset, data, and metadata; limitations
SAS Data and Syntax	xml sas	SAS Data and Syntax	Requires the most set-up; uses different syntax; useful for analysis

For more information, refer to OC Data Extracts and Reporting Types.

Below Are Some Images of Extract Formats:

Tab-Delimited

HTML Format

💡 Tip: When viewing an HTML file, you can click an Item’s column header to view its metadata.

Excel Format

CDISC ODM XML Format

CDISC

CDISC ODM is a vendor-neutral, platform-independent format used for the interchange and archiving of data collected in clinical trials. It represents study metadata, clinical data, and administrative data, and is designed to comply with guidance and regulations published by the FDA for computer systems used in clinical research.

ODM Data Model Structure

The ODM model organizes clinical study data into structured entities, including:

Subjects
Study Events
Forms
Item Groups
Items
Annotations

Metadata and Clinical Data

Metadata defines the types of Study Events, Forms, Item Groups, and Items permitted in the study.
Clinical Data consists of the actual collected entities that correspond to those metadata definitions.

ODM File Composition

An ODM file is an XML document structured as a hierarchical tree of elements. Each element represents an entity and contains required and optional attributes.

File Components

An ODM file consists of two main sections:

Metadata
Includes Study unit OIDs, Event information, CRF details, Item Groups, Items, validation rules, and user account information.
Subject Data
Includes Subject details, Event data, CRF data, and collected Item values.

ODM File Types

An ODM file must be one of the following:

Snapshot
Represents the current state of the included data.
Transactional
Represents the latest state and, optionally, prior states of the included entities.

Granularity Attribute

Each ODM file includes a Granularity attribute that defines the scope and coverage of the data contained within the file.

CDISC ODM Format Options

When you select a CDISC ODM format for a dataset, OpenClinica exports the data as an .xml file that complies with the Clinical Data Interchange Standards Consortium (CDISC) Operational Data Model (ODM).

Available ODM Variants

You can choose from the following options:

1.2 or 1.3 – Specifies the version of the ODM standard used for the export.
With Extensions – Includes OpenClinica-specific entities that are not part of the ODM specification, such as OpenClinica:SdvStatus.
Full – Includes Discrepancy Notes (Queries) and the Audit Log.

ℹ️ Note: In the Full ODM XML format, contact data is always masked in the audit log, regardless of user permissions.

SPSS Format

When you select the SPSS format, the extracted data is provided as a .DAT file that you can open in a text editor. The SPSS output displays data in a table layout similar to Excel for easier review and analysis.

Variable Naming and Identifier Conventions

To prevent duplication and ensure accurate identification of data collected across multiple Events and CRFs, OpenClinica appends identifiers and ordinal numbers to each variable name.

Because the same Item name can be used in multiple CRFs across multiple Events, these appended identifiers show which Event, CRF, and occurrence each value was collected in.

Where Identifiers Are Defined

For Tab-Delimited, HTML, and Excel formats, identifiers are defined in the header table.
For SPSS, identifiers are defined in the separate syntax file (.sps).

The following conventions apply to Tab-Delimited, HTML, and Excel formats:

E1
- E represents the Event identifier.
- 1 indicates which Event the variable originated from, as defined in the header table.
- For repeating Events, this appears as E1_1, E1_2, E1_3, and so on.
C1
- C represents the CRF identifier.
- 1 indicates which CRF the variable originated from, as defined in the header table.
- For repeating Events or repeating Item Groups, an ordinal value _X is appended to specify the occurrence.

For example:

An item named DEMO in the 3rd occurrence of a repeating event and the 5th repeat of a group would appear as: DEMO_E1_3_C1_5
An item in a repeating event but not part of a repeating group would appear as: DEMO_E1_3_C1

The [EVENT HANDLE] and [CRF HANDLE] represent system-generated identifiers appended to each Item name to prevent duplication across repeating data points.
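
The naming convention above can be sketched as a small helper. This is only an illustration of how the pieces fit together, not OpenClinica's own code; the function and parameter names are made up for the example.

```python
from typing import Optional


def extract_variable_name(
    item_name: str,
    event_id: int,
    crf_id: int,
    event_occurrence: Optional[int] = None,
    group_repeat: Optional[int] = None,
) -> str:
    """Build a column name such as DEMO_E1_3_C1_5 from its parts."""
    # Event handle: E<id>, plus _<occurrence> for repeating Events.
    event_handle = f"E{event_id}"
    if event_occurrence is not None:
        event_handle += f"_{event_occurrence}"
    # CRF handle: C<id>, plus _<repeat> for repeating Item Groups.
    crf_handle = f"C{crf_id}"
    if group_repeat is not None:
        crf_handle += f"_{group_repeat}"
    return f"{item_name}_{event_handle}_{crf_handle}"


print(extract_variable_name("DEMO", 1, 1, event_occurrence=3, group_repeat=5))  # DEMO_E1_3_C1_5
print(extract_variable_name("DEMO", 1, 1, event_occurrence=3))                  # DEMO_E1_3_C1
```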

Structure of Tabular Extracts

Tabular formats (Tab-Delimited Text, HTML, and Excel) contain two distinct sections:

Header Table
The header table includes the following information:

Dataset name
Dataset description
Study name
Protocol ID
Date
Subjects
Study Event Definitions
CRFs

For each included Study Event Definition, the event name and associated identifier appear for reference in the data table.
For each included CRF, the CRF name and associated identifier appear for reference in the data table.

Data Table
The data table includes the data you selected for the dataset.

SAS

This section explains how to work with datasets exported in SAS format, including how to prepare files and configure SAS Studio to generate usable output tables.

SAS Output Files

When you select the SAS format, OpenClinica generates the following files:

SAS_DATA.xml – The extracted data.
SAS_MAP.xml – A mapping file that maps the data to the appropriate structures
SAS_Format.sas – For items defined as select_one or select_multiple, OpenClinica creates the library and maps response values to the appropriate response text

ℹ️ Note: Select_multiple and checkbox Items appear as comma-separated values in OpenClinica (for example, 1,2,7); these values cannot be mapped to individual response text options.

Prepare SAS Studio

If you use SAS Studio, follow the steps below. For other versions of SAS, the same files must be uploaded and the same code must be run, although the interface steps may differ.

After Creating a Dataset in OpenClinica and Downloading it in SAS Format:

1. Create a SAS Studio account by going to SAS Studio | SAS.
  - Sign in and select SAS® Studio (Launch).
2. In SAS Studio, right-click Files.
  - Select New > Folder, enter a folder name, and click Save.
3. To upload the data file (xml) and the map file (xml), click Upload Files at the top of the sidebar, or right-click the folder and select Upload Files.
  - Click Choose Files after confirming the folder.
  - Select the SAS_MAP and SAS_DATA xml files to upload and click Open.
  - Verify the information and click Upload.
4. Click New at the top of the sidebar, or right-click your folder and select New > SAS Program (F4), to open a new Program window.
5. Open the SAS_FORMAT file in an external text editor.
6. Before running this code, edit the first three lines of the code by replacing the ~ with the path of the files.
  - Find the paths by right-clicking the folder that contains these files and selecting Properties.

Example

BEFORE – The First 3 Lines of Your Format File:
FILENAME S100_155 "~/SAS_DATA.xml";
FILENAME map "~/SAS_MAP.xml";
LIBNAME S100_155 xml xmlmap=map access=readonly;

AFTER – The First 3 Lines of Your Format File:
FILENAME S100_155 "/home/u62714010/sasuser.v94/Demographics/SAS_DATA.xml";
FILENAME map "/home/u62714010/sasuser.v94/Demographics/SAS_MAP.xml";
LIBNAME S100_155 xml xmlmap=map access=readonly;
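
If you prefer to script this edit rather than make it by hand, the sketch below shows one way to do it. It assumes the FILENAME statements use straight double quotes and a ~/ prefix, as in the BEFORE example; the function name is hypothetical.

```python
import re


def set_sas_paths(format_text: str, folder: str) -> str:
    """Replace the ~ placeholder in FILENAME statements with a folder path."""
    folder = folder.rstrip("/")
    # Matches lines of the form: FILENAME <fileref> "~/<file name>";
    pattern = re.compile(r'^(FILENAME\s+\w+\s+)"~/([^"]+)";', re.MULTILINE)
    return pattern.sub(lambda m: f'{m.group(1)}"{folder}/{m.group(2)}";', format_text)


before = (
    'FILENAME S100_155 "~/SAS_DATA.xml";\n'
    'FILENAME map "~/SAS_MAP.xml";\n'
    "LIBNAME S100_155 xml xmlmap=map access=readonly;\n"
)
print(set_sas_paths(before, "/home/u62714010/sasuser.v94/Demographics"))
```

The LIBNAME line contains no path, so it passes through unchanged.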

Run the Program

Paste the updated syntax into the SAS Program window.
Click Run.
View results in the Output Data tab.

Generated Output

Table Generation

SAS generates data tables based on OpenClinica Item Groups. Each Item Group produces a corresponding SAS table.

Tables are generated from OpenClinica metadata.
All Item Groups included in the extract produce a table.
If no data exists for a specific Item Group, the corresponding SAS table is still created but remains empty.

Column Naming

OpenClinica Items are used as SAS column names.
Tables include the complete master set of Items defined by the Item Group, even if Items span multiple CRF Versions.
The SAS output does not indicate which CRF Version the Item originated from.

Data Type Classification

SAS output supports two data types:

Numeric
Includes all OpenClinica Items defined as Integer or Real.
Char
Includes all other OpenClinica Item data types.

Constraints and System Rules

OpenClinica supports a maximum of 3,999 single-byte characters in a text field. When extracted to SAS, the full value is preserved in the SAS_DATA.xml file.

SAS Dataset Naming Rules

SAS dataset names must:

Not exceed 32 characters
Begin with a letter (A–Z) or underscore (_)

To comply with these rules, OpenClinica generates dataset names using a modified Item Group OID, based on the following logic:

If the Item Group is Ungrouped, the CRF Name is used as the dataset name.
Otherwise:
- The prefixed IG is removed to reduce character length.
- The resulting format is:
  _[First 5 characters of CRF Name]_GROUPLABEL
- If the resulting name exceeds 32 characters, OpenClinica appends a three- or four-digit number derived from the original IG_OID to ensure uniqueness.

SAS Column Naming Rules

SAS column names must:

Not exceed 32 characters
Begin with a letter (A–Z) or underscore (_)

To meet these requirements, OpenClinica modifies the Item OID when generating column names as follows:

Truncates the I_5CHAR prefix from the left.
Retains the portion beginning with _ITEMNAME to prevent numeric-leading names.
Any appended three- or four-digit numbers are retained to preserve uniqueness.
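
To check whether a name satisfies the SAS rules listed above, a simple validator such as the following sketch can help. The two rules stated on this page are the length limit and the first character; restricting the remaining characters to letters, digits, and underscores reflects standard SAS naming and is an assumption added here. The function name is hypothetical.

```python
import re

# A letter or underscore first, then letters, digits, or underscores;
# 32 characters at most in total.
_SAS_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,31}")


def is_valid_sas_name(name: str) -> bool:
    """Return True if the name fits the dataset and column naming rules."""
    return _SAS_NAME.fullmatch(name) is not None


print(is_valid_sas_name("_DEMOG_GROUP1"))  # True
print(is_valid_sas_name("1ST_VISIT"))      # False: starts with a digit
print(is_valid_sas_name("A" * 33))         # False: longer than 32 characters
```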

SPSS File Structure and Access

When you select the SPSS format, the extracted .zip file contains the following files:

.DAT file – A tab-delimited data file containing the dataset values
.SPS file – An SPSS syntax file that defines the dataset structure and formatting

Load Data into SPSS

To access the dataset:

Save both the .dat and .sps files to the same folder.
Open IBM SPSS.
Open the .sps file in SPSS.
If the files are not in the same location, update the file path in the .sps file to point to the physical location of the .dat file.
Select Run > All to load the data into SPSS.

You can preview the raw data by opening the .dat file in a text editor.
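
Because the .dat file is tab-delimited, it can also be previewed with a short script. The sketch below reads the first few rows; the sample values and the function name are hypothetical, and no assumption is made about whether the file starts with a header row.

```python
import csv
import io
from itertools import islice
from typing import Iterable, List


def preview_rows(lines: Iterable[str], limit: int = 5) -> List[List[str]]:
    """Return the first `limit` rows of tab-delimited text as lists of strings."""
    return list(islice(csv.reader(lines, delimiter="\t"), limit))


# Hypothetical sample standing in for the contents of a .dat file.
sample = io.StringIO("P001\t34\t2024-01-15\nP002\t41\t2024-01-16\n")
for row in preview_rows(sample):
    print(row)
```

To read a real file, pass an open file object instead of the sample, for example one returned by open(path, newline="").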

SPSS File Specifications

When you select the SPSS format, OpenClinica generates a package of files for use in IBM SPSS. These files have been tested with SPSS for Windows, version 20.

Although SPSS can read almost any ASCII file and deduce some variable attributes on its own, the remaining attributes must be typed in by hand, which is tedious for large datasets.

Instead of using a generic ASCII dataset file, select the SPSS Syntax format (.sps). When used in conjunction with the associated .dat file, this format automatically loads the dataset into SPSS with the correct variable definitions and attributes applied.

SPSS Data Definitions Cover Ten Main Properties for Any Variable:

Name
Type
Width
Decimals
Label
Values
Missing
Columns
Align
Measure

OpenClinica Currently Supports Automated Definition of:

Name
Type
Width
Decimals
Label
Values

SPSS Conceptual Mapping

This table presents the conceptual mapping of SPSS Data Definitions to OpenClinica data element metadata:

SPSS Data Definition Metadata	OpenClinica CRF Metadata
Name	Item Name
Type	Mapped to Item Types
Width	Calculated from the Widest Value in the Field
Decimals	If the Item Type is Decimal, it is Calculated from the Most Precise Value in the Field
Label	Item Label
Values	Generated from Choice Labels and Choice Names
Missing	N/A
Columns	N/A
Align	N/A
Measure	N/A

Mapping between SPSS types and OpenClinica CRF Item Types

The table below describes the mapping of OpenClinica CRF ITEM data types to SPSS types.

CRF Data Type	CRF Width (Decimal)	CDISC ODM XML Data Type	SPSS Variable Type	SPSS Syntax for Type Format
text, select_one, select_multiple	n	text	String	An
integer	n	integer	Numeric	Fn.0
decimal	n(d)	float	Numeric	Fn.d
file, image, audio, video	n	text	String	An
date	N/A	date	Date	ADATE10
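
The mapping in the table above can be expressed as a small lookup. This is a sketch of the table only, with a hypothetical function name; it does not reproduce how OpenClinica calculates the width or the number of decimals.

```python
from typing import Optional

# CRF data types that the table maps to an SPSS string (An).
STRING_TYPES = {"text", "select_one", "select_multiple", "file", "image", "audio", "video"}


def spss_format(crf_type: str, width: Optional[int] = None, decimals: int = 0) -> str:
    """Return the SPSS type format for a CRF data type, following the table."""
    if crf_type == "date":
        return "ADATE10"  # dates use a fixed format, so no width is needed
    if width is None:
        raise ValueError(f"a width is required for CRF data type {crf_type!r}")
    if crf_type in STRING_TYPES:
        return f"A{width}"
    if crf_type == "integer":
        return f"F{width}.0"
    if crf_type == "decimal":
        return f"F{width}.{decimals}"
    raise ValueError(f"unknown CRF data type: {crf_type!r}")


print(spss_format("text", 20))        # A20
print(spss_format("decimal", 8, 2))   # F8.2
print(spss_format("date"))            # ADATE10
```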

ℹ️ Note:

Multi-Select Item Behavior
Items with a data type of ST, INT, or REAL are treated as multi-select Items when associated with a CRF response type of multi-select or checkbox.
In this case:

- The Item is defined as a string (A) in SPSS.
- Selected values appear as a comma-separated list in the field, even if the original CRF Item data type is INT or REAL.
Numeric Precision Limitation
SPSS supports a maximum of 17 significant figures. Values exceeding this limit lose precision during export. This is a limitation of SPSS and not of OpenClinica.

Examples

Significant Figures	Entered Value	Value Stored in SPSS
20	12345678901234567890	12345678901234567000
19	0.1234567890123456789	0.123456789012345
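
To find values that may lose precision before you export, you can count significant figures the same way the first column of the table above does: every digit that remains once the sign, decimal point, and leading zeros are dropped. The sketch below is one way to do that; the function names are hypothetical.

```python
SPSS_MAX_SIGNIFICANT_FIGURES = 17


def significant_figures(value: str) -> int:
    """Count digits after dropping the sign, the decimal point, and leading zeros."""
    digits = value.lstrip("+-").replace(".", "")
    return len(digits.lstrip("0"))


def exceeds_spss_precision(value: str) -> bool:
    """Return True if the value has more significant figures than SPSS keeps."""
    return significant_figures(value) > SPSS_MAX_SIGNIFICANT_FIGURES


print(significant_figures("12345678901234567890"))   # 20
print(significant_figures("0.1234567890123456789"))  # 19
print(exceeds_spss_precision("98.6"))                 # False
```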

Mapping Between SPSS Values and OpenClinica Choice Labels

In SPSS, the VALUE LABELS section of the syntax file maps OpenClinica choice labels to the corresponding discrete values used in SPSS.

Only Items with a response type of select_one or select_multiple appear in the VALUE LABELS section.

VALUE LABELS Syntax Structure

Value labels are defined for each variable using the following format:

Syntax Pattern – Example

Variable 1
VARNAME1
Choice Name[0] "Choice Label[0]"
Choice Name[1] "Choice Label[1]"
Choice Name[2] "Choice Label[2]"
Variable 2
VARNAME2
Choice Name[0] "Choice Label[0]"
Choice Name[1] "Choice Label[1]"
Choice Name[2] "Choice Label[2]"
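
As an illustration of the pattern above, the sketch below renders a VALUE LABELS command from a mapping of variable names to choices. It follows general SPSS syntax (a slash between variables, a period to end the command, and doubled quotation marks inside labels); the exact layout OpenClinica writes to the .sps file may differ, and the variable names and choices are hypothetical.

```python
from typing import Dict


def value_labels_syntax(choices: Dict[str, Dict[str, str]]) -> str:
    """Render {variable: {choice name: choice label}} as a VALUE LABELS command."""
    lines = ["VALUE LABELS"]
    for index, (variable, options) in enumerate(choices.items()):
        # Variables after the first are separated by a slash.
        prefix = "/" if index else ""
        lines.append(f"  {prefix}{variable}")
        for name, label in options.items():
            escaped = label.replace('"', '""')  # double any quotes inside a label
            lines.append(f'    {name} "{escaped}"')
    lines[-1] += "."  # the period terminates the command
    return "\n".join(lines)


print(value_labels_syntax({
    "SEX": {"1": "Male", "2": "Female"},
    "SMOKER": {"0": "No", "1": "Yes"},
}))
```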

SPSS Data Definitions for Built-in System Fields

Subject Attribute: Subject Status

SPSS Data Definition Property	Value	Encoding
Name	SubjectStatus	SubjectStatus
Type	String	A
Width	[maximum length of subject status string across all the subjects]	[maximum length of subject status string across all the subjects]
Decimals	N/A
Label	Subject Status	Subject Status
Values	None
Missing	None
Columns	[maximum length of subject status string across all the subjects]	[maximum length of subject status string across all the subjects]
Align	Left
Measure	Unknown

Event Attribute: Start Date

SPSS Data Definition Property	Value	Encoding
Name	STARTDATE_[EVENT HANDLE]	STARTDATE_[EVENT HANDLE]
Type	Date	ADATE10
Width	N/A
Decimals	N/A
Label	Start Date for [EVENT NAME] (EVENT HANDLE)	Start Date for [EVENT NAME] (EVENT HANDLE)
Values	None
Missing	None
Columns	10
Align	Right
Measure	Unknown

Event Attribute: Status

SPSS Data Definition Property	Value	Encoding
Name	EventStatus_[EVENT HANDLE]	EventStatus_[EVENT HANDLE]
Type	String	A
Width	[maximum length of event status string across all the subjects]	[maximum length of event status string across all the subjects]
Decimals	N/A
Labels	Event Status For [EVENT NAME] (EVENT HANDLE)	Event Status For [EVENT NAME] (EVENT HANDLE)
Values	None
Missing	None
Columns	[maximum length of event status string across all the subjects]	[maximum length of event status string across all the subjects]
Align	Right
Measure	Unknown

CRF Attribute: CRF Version Status

SPSS Data Definition Property	Value	Encoding
Name	CRFVersionStatus_[EVENT HANDLE]_[CRF HANDLE]	CRFVersionStatus_[EVENT HANDLE]_[CRF HANDLE]
Type	String	A
Width	[maximum length of CRF version status string across all the event CRFs]	[maximum length of CRF version status string across all the event CRFs]
Decimals	N/A
Labels	CRF Version Status For [EVENT NAME]	CRF Version Status For [EVENT NAME]
Values	None
Missing	None
Columns	[maximum length of CRF version status string across all the event CRFs]	[maximum length of CRF version status string across all the event CRFs]
Align	Left
Measure	Unknown

CRF Attribute: CRF Version Name

SPSS Data Definition Property	Value	Encoding
Name	VersionName_[EVENT HANDLE]_[CRF HANDLE]	VersionName_[EVENT HANDLE]_[CRF HANDLE]
Type	String	A
Width	[maximum length of CRF version name string across all the event CRFs]	[maximum length of CRF version name string across all the event CRFs]
Decimals	N/A
Labels	Version Name For [EVENT NAME]	Version Name For [EVENT NAME]
Values	None
Missing	None
Columns	[maximum length of CRF version name string across all the event CRFs]	[maximum length of CRF version name string across all the event CRFs]
Align	Left
Measure	Unknown

The Following Rules Apply to Variable Names in SPSS:

General Requirements

Must begin with a letter.
Remaining characters can include:
- Letters
- Digits
- Period ( . )
- Symbols: @, #, _, $
Must be unique.
Must not exceed 64 bytes:
- Typically 64 characters in single-byte languages (for example, English, French, German, Spanish, Italian, Hebrew, Russian, Greek, Arabic, Thai).
- Typically 32 characters in double-byte languages (e.g., Japanese, Chinese, Korean).

Character Restrictions

Cannot contain spaces or special characters such as:
- !, ?, ', *
Avoid ending with:
- A period ( . ), as it may be interpreted as a command terminator.
- An underscore ( _ ), to prevent conflict with system-generated variables.
The $ symbol:
- Indicates a system variable when used as the first character.
- Is not permitted as the first character of a user-defined variable.

Reserved Keywords

Variable names cannot use the following reserved keywords:

ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, or WITH

Case Sensitivity

Variable names may include any mixture of uppercase and lowercase letters.
Case is preserved for display purposes only.

Line Wrapping Behavior

When long variable names wrap across multiple lines in SPSS output, line breaks occur at:

Underscores
Periods
Transitions from lowercase to uppercase characters

OpenClinica Variable Name Conversion Rules

When an invalid variable name is encountered, OpenClinica automatically converts it to a valid SPSS variable name using the following logic:

If the first character is not a letter, V is prefixed to the name.
Invalid characters are replaced with #.
If the final character is a period or underscore, it is replaced with #.
Names longer than 64 characters are truncated to 64 characters.
If truncation results in non-unique names:
- Sequential numbers are appended to ensure uniqueness.
- The default size of the sequential number is 3 digits.
If a reserved keyword is used:
- Sequential numbers are appended to create a valid variable name.
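
The conversion rules above can be sketched as follows. This is an approximation for illustration, not OpenClinica's implementation: the order in which the rules are applied, the treatment of only ASCII letters as valid letters, the case-insensitive uniqueness check, and shortening a name to make room for the number are all assumptions. The function names are hypothetical.

```python
import re
from typing import Iterable, List

RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE", "LT", "NE", "NOT", "OR", "TO", "WITH"}
MAX_LENGTH = 64
SUFFIX_DIGITS = 3  # default size of the sequential number


def _clean(name: str) -> str:
    """Apply the character, prefix, length, and final-character rules to one name."""
    # Replace anything other than letters, digits, and . @ # _ $ with #.
    name = re.sub(r"[^A-Za-z0-9.@#_$]", "#", name)
    # Prefix V when the first character is not a letter.
    if not name or not name[0].isalpha():
        name = "V" + name
    name = name[:MAX_LENGTH]
    # A trailing period or underscore becomes #.
    if name[-1] in "._":
        name = name[:-1] + "#"
    return name


def to_spss_names(names: Iterable[str]) -> List[str]:
    """Convert names in order, numbering reserved keywords and duplicates."""
    used = set()
    result = []
    for raw in names:
        candidate = _clean(raw)
        unique = candidate
        counter = 0
        while unique.upper() in RESERVED or unique.upper() in used:
            counter += 1
            suffix = f"{counter:0{SUFFIX_DIGITS}d}"
            # Shorten the name if needed so the number still fits in 64 characters.
            unique = candidate[: MAX_LENGTH - len(suffix)] + suffix
        used.add(unique.upper())
        result.append(unique)
    return result


print(to_spss_names(["1st_visit", "weight (kg)", "BY", "score_"]))
# ['V1st_visit', 'weight##kg#', 'BY001', 'score#']
```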

Revision	Published	Approved By
Updates for stack 20	2025-12-12 10:39AM	Kate Lambert
Updated screenshots to include Help link and updated tables to no longer include table link	2025-02-10 13:47PM	Paul Bowen
Minor formatting update.	2024-07-18 00:29AM	Paul Bowen
Removed Downloading Participant Casebooks section in order to move it to a new page being created and updated screenshots as part of Stack 19.	2024-07-18 00:16AM	Paul Bowen
updated screenshot for scheduled job for stack 17.4 (text changed form Monthly to Every 4 weeks)	2023-01-24 15:38PM	Riley Bianchi
Updated broken screenshots and updated SAS section per ticket request.	2023-01-05 14:08PM	Riley Bianchi
S17 - updated scheduled export jobs	2022-07-27 19:32PM	Paul Bowen
updated link for "OC Data Extracts and Reporting Types" with Excel doc rather than the Google doc.	2022-02-04 15:12PM	Paul Bowen
Updated Casebook screenshot for PDP changes	2020-12-01 15:51PM	Kerry Tamm
	2020-11-16 16:18PM	Kerry Tamm
	2020-11-16 10:51AM	Kerry Tamm
OpenClinica:SdvStatus	2020-09-17 08:27AM	Kerry Tamm
Formatting	2020-09-14 09:11AM	Kerry Tamm
Added user type	2020-09-11 15:50PM	Kerry Tamm
Added API	2020-09-11 15:47PM	Kerry Tamm
Added Sched Jobs	2020-09-11 15:07PM	Kerry Tamm
Added section for Scheduled Jobs	2020-09-11 14:59PM	Kerry Tamm
Added section	2020-09-02 13:54PM	Kerry Tamm
Updates	2020-09-02 13:50PM	Kerry Tamm
Updated	2020-09-02 13:42PM	Kerry Tamm
Add links	2020-07-29 13:51PM	Kerry Tamm
Added note about Bulk Actions Log and email notification	2020-07-29 13:40PM	Kerry Tamm
Fixed Note	2020-03-05 10:30AM	Kerry Tamm
Fixed title	2020-03-05 10:20AM	Kerry Tamm
Updated title	2020-03-05 10:19AM	Kerry Tamm
publish	2020-03-03 16:20PM	Ben Baumann
Publishing (signed for Review by mistake last time)	2019-04-28 19:51PM	Paul Bowen
publish	2019-02-01 08:41AM	Ben Baumann
publish laura's changes	2018-10-11 16:27PM	Ben Baumann
just changed title	2018-07-12 09:06AM	Ben Baumann
publish	2018-06-29 09:58AM	Ben Baumann
publish	2018-06-28 21:39PM	Ben Baumann
test	2017-06-09 16:15PM	Laura Keita
