Creating A Submission Package
ABSTRACT
Creating a clinical trial data package for electronic submission to a regulatory agency is a daunting task.
There are many steps that must be executed with precision and efficiency to create a good quality
submission package. If you are working on a submission study for the first time, then this paper is for you.
Each electronic submission contains 5 modules; however, this paper will focus on steps involved in
creating data set components for Module 5. This includes validating SDTM and ADaM data sets using
Pinnacle 21 Enterprise software, generating reviewer's guides (cSDRG and ADRG), creating the
define.xml for SDTM and ADaM, and much more! We will also look at important points from FDA’s Study
Data Technical Conformance Guide and sprinkle in various lessons learned from earlier submissions in
pulling all this together to create a high-quality submission package.
INTRODUCTION
Many components of an electronic submission package for FDA are prepared hands-on by statistical
programming teams and included in Module 5 of the electronic Common Technical Document (eCTD).
This module consists of everything related to human clinical trial data, such as data sets, reviewer’s
guides, define.xml, additional definition documents (if needed), and data set/TLF/macro programs (as
agreed with FDA in a Type C meeting or correspondence). In this paper we walk through good practices
for creating these items.
Note: This paper assumes that a submission package is created in adherence with CDISC clinical data
structures and standards, thus those aspects will not be discussed in this paper.
Figure 1. M5 Folder Structure from the FDA TCG
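As a rough illustration of the layout in Figure 1, the sketch below creates an empty folder skeleton for one study. The subfolder list is our reading of the TCG layout and covers only the folders discussed in this paper; the function name and the study identifier are hypothetical, and the current TCG should be checked before relying on it.

```python
from pathlib import Path
from typing import List

# Assumed, partial subset of the TCG study folder layout (verify against the current TCG).
M5_SUBFOLDERS = [
    "tabulations/sdtm",
    "analysis/adam/datasets",
    "analysis/adam/programs",
    "misc",
]


def build_m5_skeleton(root: Path, study_id: str) -> List[Path]:
    """Create m5/datasets/<study_id>/... under root and return the folders made."""
    study_root = root / "m5" / "datasets" / study_id
    created = []
    for sub in M5_SUBFOLDERS:
        folder = study_root / sub
        folder.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(folder)
    return created
```

Calling `build_m5_skeleton(Path("."), "study-001")` would create the folders under the current directory; the study identifier here is only a placeholder.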
TABULATIONS FOLDER
This folder stores all components related to the SDTM eSub data package, such as SDTM data sets (as
SAS® V5 transport files), SDTM annotated CRF (aCRF), SDTM reviewer's guide (cSDRG), define.xml
(and define style sheet), printable define PDF (optional), and any additional definitions document or
study-specific supplemental files/documents. In our experience, companies do not submit SAS programs
for SDTM data sets, because these data sets are sufficiently standardized to be understood without
programming code to reference; such programs also operate on raw data which are typically not provided
to FDA, thus further reducing the degree to which the code may be meaningful to reviewers.
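Because data sets are delivered as SAS V5 transport files, it can help to screen specification metadata against the format's length limits before conversion. The sketch below is a minimal example that checks only two limits (8 characters for data set and variable names, 40 characters for labels); the class and function names are hypothetical, and it is not a substitute for Pinnacle 21 validation.

```python
from dataclasses import dataclass
from typing import List

MAX_NAME_LEN = 8    # SAS V5 transport limit for data set and variable names
MAX_LABEL_LEN = 40  # SAS V5 transport limit for labels


@dataclass
class VariableMeta:
    dataset: str
    name: str
    label: str


def v5_violations(variables: List[VariableMeta]) -> List[str]:
    """Return one message per name or label that exceeds a V5 transport limit."""
    problems = []
    flagged_datasets = set()
    for var in variables:
        if len(var.dataset) > MAX_NAME_LEN and var.dataset not in flagged_datasets:
            flagged_datasets.add(var.dataset)  # report each data set only once
            problems.append(f"{var.dataset}: data set name longer than {MAX_NAME_LEN} characters")
        if len(var.name) > MAX_NAME_LEN:
            problems.append(f"{var.dataset}.{var.name}: variable name longer than {MAX_NAME_LEN} characters")
        if len(var.label) > MAX_LABEL_LEN:
            problems.append(f"{var.dataset}.{var.name}: label longer than {MAX_LABEL_LEN} characters")
    return problems
```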
ANALYSIS FOLDER
This contains all information regarding ADaMs such as the ADaM reviewer’s guide (ADRG), define.xml
and style sheet, printable define PDF (optional), ADaM data sets (as SAS® V5 transport files), and
ADaM/TLF/macro SAS or R programs in ASCII format (as agreed with FDA). Note that a common
misperception is that when the TCG mentions "ASCII format," it means renaming SAS or R programs to a
*.txt extension. SAS and R programs are actually already in ASCII format by default regardless of their
file extension, and they can be submitted directly as *.sas or *.r without being renamed to *.txt.
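Although programs are normally plain ASCII already, characters pasted from word processors (curly quotes, non-breaking spaces) can slip into comments. A simple byte-level check, sketched below with hypothetical function names, can confirm that a program file contains only 7-bit ASCII without renaming it.

```python
from pathlib import Path
from typing import List


def is_ascii_file(path: Path) -> bool:
    """Return True if every byte in the file is 7-bit ASCII."""
    return path.read_bytes().isascii()


def non_ascii_programs(paths: List[Path]) -> List[Path]:
    """Return the subset of files that contain at least one non-ASCII byte."""
    return [p for p in paths if not is_ascii_file(p)]
```

Any file returned by `non_ascii_programs` would need its offending characters replaced; the file extension itself plays no role in the check.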
ANNOTATED CRF
Annotated CRFs are blank casebooks mapped to SDTM data set and variable information. While some
may consider an annotated CRF as something to be developed right before a submission, several
companies, including ours, actually see an upside to making this the very first step towards developing
SDTM data sets. Doing so helps expedite and bring efficiencies to the process of building SDTM data
sets. During the lifecycle of a study, a working version of this blank CRF can be maintained that has
annotations for both raw database extract variables as well as SDTM variables, as it helps with internal
traceability and fast development of SDTM data sets. Note that for submission purposes the requirements
are more strict: a blank CRF without raw data variable annotations should be used to annotate SDTM
data sets and variables. This is the aCRF file that will be submitted to the regulatory agency.
Display 1. Annotated CRF with raw data and SDTM variable annotations
Display 2. Annotated CRF with SDTM variable annotations only, prepared for submission
STEPS TO FLATTEN ANNOTATED CRF: USING ADOBE® ACROBAT PRO VERSION 2017
To flatten the file, go to Tools > Optimize PDF, find the option “Flatten annotations and form fields,” and
click Edit to its right.
Display 3
Display 4. Check that the flattening options are unlocked, then select Save > Analyse and Fix > Save as
PDF. Rename the file to acrf.pdf and save.
PINNACLE 21 ENTERPRISE
This section is applicable only for subscribers to Pinnacle 21 Enterprise software, installed on the user’s
computer. It is expected that the relevant project, study, dictionary, and standards information and
versions have been accurately entered into Pinnacle 21 Enterprise before it is used for package
validation and for generating define.xml and reviewer’s guides.
Once the study CRFs are annotated with SDTM mapping, the next step is to develop SDTM and ADaM
data sets based on data specifications, the SDTM and ADaM IG, study SAP, TLF mock shells, etc. Once
a considerable number of data sets are programmed, it is recommended to validate them using Pinnacle
21 software. Currently, two versions of Pinnacle 21 are available to users. The Community
version is free to download from the website, and Pinnacle 21 Enterprise is the paid version,
which is built on the Community version and performs additional checks, provides data fitness scorecards,
and facilitates generation of define.xml and reviewer’s guides.
This section will cover some tips and tricks in using Pinnacle 21 Enterprise to leverage additional features
of the enhanced software.
Display 5. Pinnacle 21 Enterprise Dashboard
Use Pinnacle 21 Enterprise instructions to upload and validate the data. Once the data is uploaded to
Pinnacle 21 Enterprise, based on the quality and completeness of the data, the dashboard will look
something like the image shown in Display 5.
While there is no specific target score that makes or breaks your submission, in general a higher data
fitness score is likely to indicate better quality and structural integrity of the data. Users
can improve the data fitness score by reviewing the list of issues in the “Issues” tab and updating the data
package accordingly. Once the data is updated, it must be uploaded again to see the change in score.
Pinnacle 21 Enterprise also facilitates issue explanation and tracking, to help build the “Issues Summary”
section in the reviewer's guides.
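For teams that also track issue explanations outside the tool, a small completeness check can flag unresolved issues that still lack an explanation before the reviewer's guide is finalized. The sketch below assumes a hypothetical list of dictionaries with `rule_id` and `explanation` keys; it does not reflect any Pinnacle 21 export format or API.

```python
from typing import Dict, List


def issues_missing_explanation(issues: List[Dict[str, str]]) -> List[str]:
    """Return the rule IDs of issues whose explanation is absent or blank."""
    return [
        issue["rule_id"]
        for issue in issues
        if not issue.get("explanation", "").strip()
    ]
```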
Once you have used your Excel spec to develop several data sets on your study, you can create the
Pinnacle 21 load file based upon it. This will usually involve a degree of transformation of your internal
spec to the load file structure. At Seagen we capitalized on our department-wide standard specification
structure by developing a central macro that creates the load files automatically based on any of our
internal study specs; one of the many benefits of a centralized and structurally consistent representation
of data specs. Once your load file is ready, you can bring it into Pinnacle 21 as follows:
1. Click on the ‘Define.xml’ link at the right-hand corner of the dashboard.
2. A new window will open which will give tab options such as Import, Export, Preview, etc.
3. Use the ‘Import’ tab and upload the data set specification using the ‘Excel Spec’ option.
4. Once the specification is uploaded, use the ‘Export’ tab, and download the ‘Define.xml’,
‘Define.pdf’ and ‘Stylesheet’, all in the same folder.
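The transformation from an internal spec to the load file structure, mentioned before the steps above, can be pictured as a column mapping. The sketch below is illustrative only: every column name on both sides of `COLUMN_MAP` is a hypothetical placeholder, not the actual Pinnacle 21 template or the Seagen macro, and a real load file has several tabs beyond the single variable list shown here.

```python
from typing import Dict, List

# Hypothetical mapping: internal spec column -> load file column.
COLUMN_MAP = {
    "domain": "Dataset",
    "varname": "Variable",
    "varlabel": "Label",
    "derivation": "Method",
}


def to_load_rows(internal_rows: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Rename columns and add a per-data-set Order counter starting at 1."""
    next_order: Dict[str, int] = {}
    rows = []
    for row in internal_rows:
        out: Dict[str, object] = {
            target: str(row.get(source, "")).strip()
            for source, target in COLUMN_MAP.items()
        }
        dataset = str(out["Dataset"])
        next_order[dataset] = next_order.get(dataset, 0) + 1
        out["Order"] = next_order[dataset]
        rows.append(out)
    return rows
```

Keeping the mapping in one dictionary mirrors the benefit described above: when the internal spec structure is consistent across studies, a single central routine can serve all of them.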
Display 9. Example define.xml created using Pinnacle 21 Enterprise. Sections are accurately
hyperlinked. SDTM versions also have correct link between aCRF pages and corresponding
define.xml data specifications
SOME IMPORTANT POINTS TO CONSIDER:
1. The define.xml file and corresponding style sheet should be in the same location for the define.xml to
open in a browser in the expected display format.
2. Check for any truncations in the ‘Comment’ or ‘Method’ column of define.xml, especially for variables
that have a long derivation logic.
3. Check if all the hyperlinks work and open as desired.
4. A define.pdf can also be generated using Pinnacle 21 Enterprise, if desired. This is optional if
define.xml v2.0 or higher is submitted, yet we recommend producing it regardless, so reviewers have
a more easily printable version of these critical metadata.
5. CRF page numbers can be automatically generated by Pinnacle 21 if both the aCRF and the Excel load
file are provided. Hence, users do not need to manually maintain the CRF pointers or build tools for it
if they have access to Pinnacle 21 Enterprise.
6. Additional documents or datasets can be hyperlinked in define.xml as well. For links to work, they
should be mentioned in the ‘Documents’ tab of the Pinnacle 21 spec template and stored in the same
folder as the specs. If any documents are part of the package and mentioned in the ‘Documents’
tab, they should also be carried over to the ‘Documents’ column in the ‘Methods’ or ‘Comments’ tab of
the Pinnacle 21 spec template.
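The first point in the list above (define.xml and its style sheet in the same location) lends itself to a quick automated check. The sketch below reads the `xml-stylesheet` processing instruction from a define.xml file and tests whether the referenced file exists relative to the define.xml folder. The function name is hypothetical, and the regular expression assumes the `href` value is double-quoted.

```python
import re
from pathlib import Path
from typing import Optional

# Matches e.g. <?xml-stylesheet type="text/xsl" href="define2-0-0.xsl"?>
_STYLESHEET_PI = re.compile(r'<\?xml-stylesheet\b[^>]*?\bhref="([^"]+)"')


def stylesheet_beside_define(define_path: Path) -> Optional[bool]:
    """Return True/False if the referenced style sheet exists, None if none is referenced."""
    text = define_path.read_text(encoding="utf-8")
    match = _STYLESHEET_PI.search(text)
    if match is None:
        return None
    # The href is resolved relative to the folder holding define.xml.
    return (define_path.parent / match.group(1)).is_file()
```

Running this against the define.xml in its final submission folder, rather than in a working folder, catches the case where the style sheet was left behind during packaging.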
REVIEWER'S GUIDE
Pinnacle 21 can be used to generate reviewer's guides as well for both SDTM and ADaM, using the link
at the right-hand corner of the dashboard. An efficient feature of Pinnacle 21 Enterprise is that all
unresolved issues can dynamically be extracted into the ‘Issue Summary’ section of the reviewer's guide.
Display 11. Pinnacle 21 Enterprise: Link to generate Reviewer’s Guide on the dashboard
If you do not have access to Pinnacle 21 Enterprise, no worries! PHUSE provides stepwise instructions
along with a template which can help to generate reviewer’s guides as well.
https://advance.phuse.global/display/WEL/Deliverables
SOME IMPORTANT POINTS TO CONSIDER:
1. The reviewer’s guide is a good place to document important information regarding the study as well
as data set-related structural or logical decisions taken throughout the course of the study, to be
shared with the reviewers of a submission.
2. Check that all hyperlinks work accurately throughout the document in its final location – not just in the
location where you first wrote the guide.
3. The ‘Issue Summary’ section should be completed with proper explanation for any unresolved errors,
warnings, or notices, such that reviewers can easily understand why a specific item was not resolved
before submission.
4. The reviewer’s guide for SDTM should be named ‘csdrg.pdf’ and for ADaM ‘adrg.pdf,’ to follow the
correct eCTD naming conventions as expected by FDA.
3. Use hyphens instead of underscores in file names for all files included in the submission package.
The Agency understands that if companies submit SAS programs that were finalized with
underscores in their name, or SAS macros that were automatically included in the programs via
the SAS global options mautosource and sasautos rather than with explicit %include
statements, such programs may not exactly align to the file names they reference in headers or
macro names in all cases. This is acceptable, especially if briefly explained in the ADRG.
4. Programs submitted as part of the package should be cleaned of any "dead code" and unwanted
comments, while retaining explanatory comments wherever necessary.
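The hyphen convention noted above can be screened with a short script before packaging. The sketch below, using hypothetical function names, lists file names that contain underscores together with a hyphenated suggestion; it does not rename anything.

```python
from typing import Dict, List


def hyphenated_name(filename: str) -> str:
    """Return the file name with every underscore replaced by a hyphen."""
    return filename.replace("_", "-")


def names_needing_review(filenames: List[str]) -> Dict[str, str]:
    """Map each file name containing an underscore to its hyphenated suggestion."""
    return {name: hyphenated_name(name) for name in filenames if "_" in name}
```

For programs that were finalized with underscores in their names, the output may be more useful as input to the ADRG explanation than as a rename list, since renaming can break the references inside program headers and macro calls.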
CONCLUSION
As daunting a task as it may seem, creating an acceptable electronic submission package for FDA
benefits greatly from early and detailed planning. It is very important to gain familiarity and knowledge of
guidelines provided by FDA. With the help of efficient planning, knowledge, and the various tools and
software available to industry, such as Pinnacle 21, sponsors can create a high-quality package for
regulatory agencies.
REFERENCES
Study Data Technical Conformance Guide: https://www.fda.gov/media/143550/download
Technical Rejection Criteria for Study Data: https://www.fda.gov/media/100743/download
Providing Regulatory Submissions in Electronic Format - Certain Human Pharmaceutical Product
Applications and Related Submissions Using the eCTD Specifications:
https://www.fda.gov/media/135373/download
PHUSE Reviewer’s Guide Templates (ADRG and cSDRG):
https://advance.phuse.global/display/WEL/Deliverables
Ready, Set, Go: Planning and Preparing a CDISC Submission:
https://www.lexjansen.com/phuse-us/2018/sp/SP03.pdf
ACKNOWLEDGMENTS
I would like to thank Shefalica Chand, John Shaik and Michiel Hagendoorn for their valuable feedback
and constant support and guidance.
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
Lyma Faroz
Seagen Inc.
21823 - 30th Drive S.E.
Bothell, WA 98021
lfaroz@seagen.com