Step-by-step Instructions for using the pdb_extract webserver

pdb_extract can be used to convert PDB format files to PDBx/mmCIF format and is a vital resource for the submission of large structures. You will be offered the option of using a template file containing data that can be shared among several depositions.

  1. Select one of the three methods (X-ray, NMR, or EM).
    • X-ray is fully supported for all current crystallographic applications, including neutron and X-ray/neutron hybrid structures. pdb_extract has a data template file that can describe the data collection. Log files can be processed to extract metadata about data processing. The extracted data will be combined with the template and merged into one model mmCIF file.
    • The NMR option is supported. A data template file (for author information and experimental conditions) is also provided.
    • The EM option is supported. A data template file is available.
  2. Select the appropriate file type. Note: PDB or mmCIF files can be used as input.
  3. Browse to find the structure model coordinate file to be converted.
  4. Click the "Run" button to start pdb_extract.
  5. If your coordinate file has no chain IDs, pdb_extract will try to assign them by best guess.
  6. If pdb_extract detects any formatting or data errors, it will stop and provide an error message. After the errors are fixed, you can re-upload the file.
  7. If everything is ok after you click the run button, you will get a second webpage titled "Extracting Information for PDB Deposition." This new page contains various sections depending on the experimental method.
    • "Information about Authors, Detectors…" provides a template that can be used for inserting various data items into the extracted mmCIF file. This is optional, as the information it provides can also be entered via the deposition interface.
    • For X-ray structures, there are also options to convert structure factors to mmCIF format, extract information from the various steps of structure determination, and provide or verify the unit cell parameters.
    • The "Extracting Information for PDB Deposition" page provides macromolecular sequence information that has been extracted from the provided coordinate file. You can then review and edit each sequence and its associated chain ID and polymer type. Please note the following points:
      • Sequence information extracted by pdb_extract is automatically read into the wwPDB Deposition System. It is therefore important to correct all issues with sequence at this stage.
      • Missing, concatenated, or otherwise unusual sequence information presented on the “Extracting Information for PDB Deposition” page may indicate the presence of data or formatting errors in the original coordinates that were not directly detected by pdb_extract. Please review these sequences carefully.
      • The extracted sequence information should be corrected to include all residues of each unique biopolymer, including any unmodeled terminal regions or gaps. Unobserved/missing residues should be included in the experimental sequence.
  8. Press "Run". On the next page, a link will be available for downloading the extracted mmCIF format file (pdb_extract_coord_xxxx.cif).
  9. The file(s) are now ready for deposition. To deposit the structure to the wwPDB, please go to: https://deposit.wwpdb.org/deposition
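
Before uploading, it can help to check locally for the issues mentioned in steps 5 and 7: missing chain IDs, and modeled sequences that look incomplete or concatenated. The following is a minimal sketch, not part of pdb_extract itself. It assumes a standard fixed-column PDB format file (residue name in columns 18-20, chain ID in column 22, residue number and insertion code in columns 23-27). The function name summarize_chains is hypothetical, and only ATOM records are read, so modified residues stored as HETATM are not listed.

```python
def summarize_chains(pdb_text):
    """Summarize chain IDs and modeled residues in PDB-format text.

    Returns a tuple (missing, residues):
      missing  - number of ATOM records whose chain ID column is blank
      residues - dict mapping chain ID to residue names in file order
    """
    missing = 0
    residues = {}
    seen = set()  # (chain ID, residue number + insertion code) already listed
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM  "):
            continue
        line = line.ljust(27)      # guard against truncated lines
        chain_id = line[21]        # column 22
        if chain_id == " ":
            missing += 1
        key = (chain_id, line[22:27])  # columns 23-27
        if key not in seen:
            seen.add(key)
            res_name = line[17:20].strip()  # columns 18-20
            residues.setdefault(chain_id, []).append(res_name)
    return missing, residues
```

To use it, read the coordinate file into a string and pass it to summarize_chains. A non-zero count of blank chain IDs means pdb_extract will have to guess them (step 5). The per-chain residue lists contain only modeled residues, so compare them against the full experimental sequence to spot gaps and unmodeled termini that need to be added on the sequence review page.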