The ihm.startmodel Python module

Classes to handle starting models.

class ihm.startmodel.SequenceIdentityDenominator(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

The denominator used when calculating the sequence identity. Pass one of these constants as the denominator argument of SequenceIdentity.

MEAN_LENGTH = 4

Arithmetic mean sequence length

NUM_ALIGNED_WITHOUT_GAPS = 3

Number of aligned residue pairs (not including the gaps)

NUM_ALIGNED_WITH_GAPS = 2

Number of aligned positions (including gaps)

OTHER = 5

Another method not covered here

SHORTER_LENGTH = 1

Length of the shorter sequence
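To make the difference between the denominators concrete, here is a minimal sketch that computes percentage identity from a pairwise alignment. It does not use the ihm library: Denominator is a local stand-in that only mirrors the constant values listed above, percent_identity is a hypothetical helper, and the reading of each denominator (for example, that "aligned positions including gaps" means the full alignment length) is an assumption.

```python
from enum import IntEnum


class Denominator(IntEnum):
    """Local stand-in mirroring the documented constant values."""
    SHORTER_LENGTH = 1
    NUM_ALIGNED_WITH_GAPS = 2
    NUM_ALIGNED_WITHOUT_GAPS = 3
    MEAN_LENGTH = 4
    OTHER = 5


def percent_identity(aligned_target, aligned_template, denominator):
    """Percent identity of two aligned strings of equal length.

    '-' marks a gap. Hypothetical helper, not part of ihm.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(aligned_target, aligned_template))
    # Identical residue pairs; a gap never counts as a match
    identical = sum(1 for a, b in pairs if a == b and a != '-')
    # Ungapped lengths of each sequence
    n_target = sum(1 for a in aligned_target if a != '-')
    n_template = sum(1 for b in aligned_template if b != '-')
    if denominator == Denominator.SHORTER_LENGTH:
        denom = min(n_target, n_template)
    elif denominator == Denominator.NUM_ALIGNED_WITH_GAPS:
        denom = len(pairs)
    elif denominator == Denominator.NUM_ALIGNED_WITHOUT_GAPS:
        denom = sum(1 for a, b in pairs if a != '-' and b != '-')
    elif denominator == Denominator.MEAN_LENGTH:
        denom = (n_target + n_template) / 2
    else:
        raise ValueError("no formula defined for %r" % denominator)
    return 100.0 * identical / denom


# Toy alignment: 4 identical pairs, one mismatch, one gap in the target
target = "ACDE-G"
template = "ACDFHG"
results = {d.name: percent_identity(target, template, d)
           for d in Denominator if d != Denominator.OTHER}
```

The same alignment gives different percentages depending on the denominator, which is why the choice is recorded alongside the value.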

class ihm.startmodel.SequenceIdentity(value, denominator=SequenceIdentityDenominator.SHORTER_LENGTH)

Describe the identity between template and target sequences. See Template.

Parameters:
  • value – Percentage sequence identity.

  • denominator – How the sequence identity was calculated; see SequenceIdentityDenominator.

class ihm.startmodel.Template(dataset, asym_id, seq_id_range, template_seq_id_range, sequence_identity, alignment_file=None)

A PDB file used as a comparative modeling template for part of a starting model.

See StartingModel.

Parameters:
  • dataset (Dataset) – Pointer to where this template is stored.

  • asym_id (str) – The asymmetric unit (chain) to use from the template dataset (not necessarily the same as the starting model’s asym_id or the ID of the asym_unit in the final IHM model).

  • seq_id_range (tuple) – The sequence range in the dataset that is modeled by this template. Note that this numbering may differ from the IHM numbering. See offset in StartingModel.

  • template_seq_id_range (tuple) – The sequence range of the template that is used in comparative modeling.

  • sequence_identity (SequenceIdentity or float) – Sequence identity between template and the target sequence.

  • alignment_file (Location) – Reference to the external file containing the template-target alignment.
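Since sequence_identity accepts either a SequenceIdentity or a float, the following sketch shows one plausible way a bare float could be normalized. It does not use the ihm library: Identity and Denominator are local stand-ins, as_identity is a hypothetical helper, and whether the library treats a float exactly this way is an assumption. The SHORTER_LENGTH default is taken from the SequenceIdentity signature documented above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Denominator(IntEnum):
    """Local stand-in mirroring two of the documented constants."""
    SHORTER_LENGTH = 1
    MEAN_LENGTH = 4


@dataclass
class Identity:
    """Local stand-in for a (value, denominator) pair."""
    value: float
    denominator: Denominator = Denominator.SHORTER_LENGTH


def as_identity(sequence_identity):
    """Return an Identity, wrapping a plain number if needed.

    Hypothetical helper: a bare number is assumed to use the
    default denominator.
    """
    if isinstance(sequence_identity, Identity):
        return sequence_identity
    return Identity(float(sequence_identity))


plain = as_identity(42.0)
explicit = as_identity(Identity(42.0, Denominator.MEAN_LENGTH))
```

Passing an explicit object is the safer choice whenever the identity was not computed against the shorter sequence length.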

class ihm.startmodel.StartingModel(asym_unit, dataset, asym_id, templates=None, offset=0, metadata=None, software=None, script_file=None, description=None)

A starting guess for modeling of an asymmetric unit.

See ihm.representation.Segment and ihm.System.orphan_starting_models.

Parameters:
  • asym_unit (AsymUnit or AsymUnitRange) – The asymmetric unit (or part of one) this starting model represents.

  • dataset (Dataset) – Pointer to where this model is stored.

  • asym_id (str) – The asymmetric unit (chain) to use from the starting model’s dataset (not necessarily the same as the ID of the asym_unit in the final model).

  • templates (list) – A list of Template objects, if this is a comparative model.

  • offset (int) – Offset between the residue numbering in the dataset and the IHM model (the offset is added to the starting model numbering to give the IHM model numbering).

  • metadata (list) – List of PDB metadata, such as HELIX records.

  • software (Software) – The software used to generate the starting model.

  • script_file (Location) – Reference to the external file containing the script used to generate the starting model (usually a WorkflowFileLocation).

  • description (str) – Additional text describing the starting model.
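The offset arithmetic described above can be sketched as follows. The helpers dataset_to_ihm and shift_range are hypothetical names used only for illustration, and the residue numbers are made up; the only rule taken from the documentation is that the offset is added to the dataset numbering to give the IHM numbering.

```python
def dataset_to_ihm(db_seq_id, offset=0):
    """Convert a dataset residue number to IHM model numbering."""
    return db_seq_id + offset


def shift_range(seq_id_range, offset=0):
    """Apply the same offset to a (start, end) residue range."""
    start, end = seq_id_range
    return (start + offset, end + offset)


# Suppose the dataset numbers its first modeled residue 5, while the
# IHM model numbers the same residue 1: the offset is then -4.
offset = -4
first = dataset_to_ihm(5, offset)
ihm_range = shift_range((5, 120), offset)
```

A negative offset is common when the dataset file starts its numbering partway into the sequence.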

add_atom(atom)

Add to the model’s set of Atom objects.

See get_atoms() for more details.

add_seq_dif(seq_dif)

Add to the model’s set of SeqDif objects.

See get_seq_dif() and get_atoms() for more details.

get_atoms()

Yield Atom objects that represent this starting model. This allows the starting model coordinates to be embedded in the mmCIF file, which is useful if the starting model is not available elsewhere (or it has been modified).

The default implementation returns an internal list of atoms; it is usually necessary to subclass and override this method. See ihm.model.Model.get_spheres() for more details.

Note that the returned atoms should be those used in modeling, not those stored in the file. In particular, the numbering scheme should be that used in the IHM model (add offset to the dataset numbering). If any residues were changed (for example, it is common to mutate MSE in the dataset to MET for modeling), the final mutated name should be used (MET in this case) and get_seq_dif() overridden to note the change.

get_seq_dif()

Yield SeqDif objects for any sequence changes between the dataset and the starting model. See get_atoms().

The default implementation returns an internal list of objects; it is usually necessary to subclass and override this method.

Note that this is always called after get_atoms().
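Because get_seq_dif() is called after get_atoms(), an override of get_atoms() can record sequence changes as it goes and get_seq_dif() can simply report them. The sketch below illustrates that pattern without importing ihm: Atom and SeqDif are namedtuple stand-ins with illustrative fields, and SketchStartingModel is a hypothetical class, not a StartingModel subclass. Real code would subclass StartingModel and yield the library's own objects.

```python
from collections import namedtuple

# Stand-ins for illustration only; field names are assumptions
Atom = namedtuple("Atom", "seq_id comp_id atom_id x y z")
SeqDif = namedtuple("SeqDif", "db_seq_id seq_id db_comp_id details")


class SketchStartingModel:
    """Hypothetical stand-in showing the get_atoms/get_seq_dif pairing."""

    def __init__(self, dataset_atoms, offset=0):
        self._dataset_atoms = dataset_atoms  # atoms in dataset numbering
        self.offset = offset
        self._seq_difs = []

    def get_atoms(self):
        """Yield atoms in IHM numbering, with MSE renamed to MET."""
        self._seq_difs = []
        seen = set()
        for atom in self._dataset_atoms:
            seq_id = atom.seq_id + self.offset
            comp_id = atom.comp_id
            if comp_id == "MSE":
                comp_id = "MET"
                # Record each changed residue once, not once per atom
                if atom.seq_id not in seen:
                    seen.add(atom.seq_id)
                    self._seq_difs.append(SeqDif(
                        atom.seq_id, seq_id, "MSE",
                        "Conversion of modified residue MSE to MET"))
            yield atom._replace(seq_id=seq_id, comp_id=comp_id)

    def get_seq_dif(self):
        """Yield the changes recorded by the last get_atoms() pass."""
        for seq_dif in self._seq_difs:
            yield seq_dif


model = SketchStartingModel(
    [Atom(10, "ALA", "CA", 0.0, 0.0, 0.0),
     Atom(11, "MSE", "CA", 3.8, 0.0, 0.0),
     Atom(11, "MSE", "CB", 4.5, 1.2, 0.0)],
    offset=-9)
atoms = list(model.get_atoms())
seq_difs = list(model.get_seq_dif())
```

The sketch changes only the residue name and numbering; any atom-level changes that accompany a real MSE to MET conversion are left out.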

class ihm.startmodel.PDBHelix(line)

Represent a HELIX record from a PDB file.
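For readers unfamiliar with the record layout, here is a minimal sketch of splitting a HELIX line into fields by fixed columns, following the wwPDB format description. It is not the implementation of PDBHelix: parse_helix_record and its dictionary keys are hypothetical, and the sample line is synthetic. The sketch also assumes the numeric fields are all present.

```python
def parse_helix_record(line):
    """Split a PDB HELIX record into named fields (hypothetical helper)."""
    if not line.startswith("HELIX"):
        raise ValueError("not a HELIX record")
    line = line.ljust(80)  # tolerate lines with trailing blanks stripped
    return {
        "serial": int(line[7:10]),          # columns 8-10
        "helix_id": line[11:14].strip(),    # columns 12-14
        "start_resname": line[15:18].strip(),
        "start_chain": line[19],
        "start_seq_num": int(line[21:25]),  # columns 22-25
        "start_icode": line[25].strip(),
        "end_resname": line[27:30].strip(),
        "end_chain": line[31],
        "end_seq_num": int(line[33:37]),    # columns 34-37
        "end_icode": line[37].strip(),
        "helix_class": int(line[38:40]),    # columns 39-40
        "comment": line[40:70].strip(),
        "length": int(line[71:76]),         # columns 72-76
    }


# Build a synthetic record with a format string so that every field
# lands in the right column
sample = "HELIX  %3d %3s %3s %1s %4d%1s %3s %1s %4d%1s%2d%30s %5d" % (
    1, "HA", "GLY", "A", 86, "", "GLY", "A", 94, "", 1, "", 9)
record = parse_helix_record(sample)
```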

class ihm.startmodel.SeqDif(db_seq_id, seq_id, db_comp_id, details=None)

Annotate a sequence difference between a dataset and starting model. See StartingModel.get_seq_dif() and MSESeqDif.

Parameters:
  • db_seq_id (int) – The residue index in the dataset.

  • seq_id (int) – The residue index in the starting model. This should normally be db_seq_id + offset.

  • db_comp_id (str) – The name of the residue in the dataset.

  • details (str) – Descriptive text for the sequence difference.

class ihm.startmodel.MSESeqDif(db_seq_id, seq_id, details='Conversion of modified residue MSE to MET')

Denote that a residue was mutated from MSE to MET. See SeqDif for a description of the parameters.