Documentation

Summary

EMDB map data model

The EM Data Bank (EMDB) accepts and distributes 3D map volumes derived from several types of EM reconstruction methods, including single particle averaging, helical averaging, 2D crystallography, and tomography. Since its inception in 2002, the EMDB map distribution format has followed the CCP4 definition (CCP4 map format), which is widely recognized by software packages used by the structural biology community. The CCP4 map format is closely related to the MRC map format used in the 3DEM community (MRC map format); CCP4 is slightly more restrictive, in that voxel positions are limited to a grid that includes the Cartesian coordinate origin (0,0,0). Further details can be found here.
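
As an illustration of the map format, the following is a minimal sketch of decoding the first fields of a CCP4/MRC map header using only the Python standard library. It assumes the usual layout: a 1024-byte header, the identifier "MAP " at byte offset 208, and a machine stamp at byte offset 212. The function name and the returned dictionary keys are illustrative and are not part of any EMDB tool. Treating every stamp that does not start with 0x44 as big-endian is a simplification.

```python
import struct

HEADER_SIZE = 1024  # size in bytes of the fixed CCP4/MRC header


def read_map_header(raw: bytes) -> dict:
    """Decode a few leading fields of a CCP4/MRC map header (sketch)."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("header must be at least 1024 bytes")
    if raw[208:212] != b"MAP ":
        raise ValueError("missing 'MAP ' identifier at byte offset 208")
    # Machine stamp: a first byte of 0x44 marks little-endian data.
    # Assumption: anything else is treated as big-endian here.
    endian = "<" if raw[212] == 0x44 else ">"
    # Words 1-10 are integers, 11-16 are floats, 17-19 are integers.
    w = struct.unpack(endian + "10i6f3i", raw[:76])
    return {
        "dims": w[0:3],         # columns, rows, sections stored in the file
        "mode": w[3],           # voxel data type code (2 = 32-bit float)
        "start": w[4:7],        # integer grid index of the first voxel
        "sampling": w[7:10],    # grid intervals along the unit cell axes
        "cell": w[10:16],       # cell lengths (Angstrom) and angles (degrees)
        "axis_order": w[16:19], # which cell axis maps to column/row/section
    }
```

With a real map file, the function could be called as `read_map_header(open(path, "rb").read(1024))`, where `path` is a placeholder for a local file. The `start` values are whole grid indices, which reflects the restriction described above: voxel positions are expressed in grid steps relative to a grid that contains the origin.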

EMDB header data model

Every EMDB entry has a header file containing metadata (e.g., sample, detector, microscope, image processing) describing the experiment. The header file is an XML file whose structure and content are described by an XSD data model. Because cryo-EM is a highly dynamic field, the schema constantly needs to be adapted to keep it up to date with the most recent developments. We consult extensively with the EM community regarding such issues and version the schema according to the policy described here.
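
Because the header is plain XML, it can be inspected with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` to flatten a header into path/value pairs. The sample document and all of its element and attribute names (`entry`, `sample`, `microscopy`, and so on) are hypothetical placeholders chosen for illustration; the real names are defined by the XSD schema for each data model version.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a header file; real element names come from the XSD.
SAMPLE_HEADER = """<entry id="EXAMPLE-0001" version="0.0">
  <sample><name>example complex</name></sample>
  <microscopy><detector>example detector</detector></microscopy>
</entry>
"""


def leaf_values(xml_text: str) -> dict:
    """Map each leaf element's path to its text content (sketch).

    Repeated paths overwrite earlier ones, so this is only suitable
    for a quick look at a document, not for faithful extraction.
    """
    root = ET.fromstring(xml_text)
    values = {}

    def walk(elem, parent_path):
        path = f"{parent_path}/{elem.tag}"
        text = (elem.text or "").strip()
        if len(elem) == 0 and text:  # no children: treat as a leaf
            values[path] = text
        for child in elem:
            walk(child, path)

    walk(root, "")
    return values


for path, value in leaf_values(SAMPLE_HEADER).items():
    print(path, "=", value)
```

If a header declares an XML namespace, `ElementTree` reports tags in the form `{namespace}tag`, so the paths would include the namespace URI.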

Data model version 1.9

This has been a long-term stable version of the data model. It was replaced in 2018 with an updated model, but XML header files in version 1.9 continue to be distributed in parallel for at least one year to give EMDB users ample time to switch. Note that version 1.9 header files are generated on a best-effort basis by back-translating from more recent versions that are richer in content; they will therefore not contain all the information found in those versions.

Download schema
Browse schema documentation
Download Python code to facilitate reading and writing XML version 1.9 header files

Data model version 3.0 (current model)

This data model replaced version 1.9. However, header files corresponding to both data models will be distributed in parallel, with a view to stopping distribution of the version 1.9 files in 2019 once users have had a chance to adopt version 3.0.

This version adds a number of features including:

  • An improved description of direct electron detectors, specimen preparation and tomography experiments.
  • A hierarchical description of the overall sample composition in combination with a low-level description of the macromolecular composition to allow the description of both molecular and cellular samples.
  • Specific data items describing the half-maps and segmentations included with the entry.
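
Since header files for both data models are distributed in parallel, client code may need to check which model a given file follows before reading it. The following is a minimal sketch of one way to do that. It assumes that the root element of a header carries a `version` attribute whose value begins with the data model version (for example "1.9" or "3."); this is an assumption to verify against the schema documentation, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET


def data_model_family(xml_text: str) -> str:
    """Return "1.9", "3.x" or "unknown" for a header document (sketch)."""
    root = ET.fromstring(xml_text)
    version = root.get("version")  # assumed attribute name on the root
    if version is None:
        return "unknown"
    parts = version.split(".")
    if parts[:2] == ["1", "9"]:
        return "1.9"
    if parts[0] == "3":
        return "3.x"
    return "unknown"
```

A caller could use the returned value to select the matching reader, falling back to an error for "unknown" rather than guessing.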

Download schema
Browse schema documentation
Download Python code to facilitate reading and writing XML version 3.0 header files

Segmentation data model prototype

Find out more about the project here.

Publications