The ihm.reader Python module

Utility classes to read in information in mmCIF or BinaryCIF format

ihm.reader.read(fh, model_class=<class 'ihm.model.Model'>, format='mmCIF', handlers=[], warn_unknown_category=False, warn_unknown_keyword=False, read_starting_model_coord=True, starting_model_class=<class 'ihm.startmodel.StartingModel'>, reject_old_file=False, variant=<class 'ihm.reader.IHMVariant'>)

Read data from the file handle fh.

Note that the reader currently expects a file compliant with the PDBx and/or IHM dictionaries. It is not particularly tolerant of noncompliant or incomplete files, and will probably raise an exception rather than warn about such files and try to handle them. Please open an issue if you encounter such a problem.

Files can be read in either the text-based mmCIF format or the BinaryCIF format. The mmCIF reader works by breaking the file into tokens, and using this stream of tokens to populate Python data structures. Two tokenizers are available: a pure Python implementation and a C-accelerated version. The C-accelerated version is much faster and so is used if built. The BinaryCIF reader needs the msgpack Python module to function.
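To make the token-based approach concrete, here is a toy sketch in pure Python. It is not the library's tokenizer: the names tokenize and populate are hypothetical, and the sketch handles only simple "_category.keyword value" pairs, ignoring loops, quoted strings, and multi-line values.

```python
def tokenize(text):
    """Yield whitespace-separated tokens, skipping blank and comment lines."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        yield from line.split()


def populate(tokens):
    """Build {category: {keyword: value}} from a stream of tokens."""
    data = {}
    tokens = iter(tokens)
    for token in tokens:
        # Only "_category.keyword" tokens start a key/value pair; anything
        # else (such as a "data_..." block header) is skipped in this toy.
        if token.startswith("_") and "." in token:
            category, keyword = token.split(".", 1)
            value = next(tokens, None)
            data.setdefault(category, {})[keyword] = value
    return data


parsed = populate(tokenize("data_example\n_custom.key1 x\n_custom.key2 y\n"))
print(parsed)
```

The point of the two-stage design is that the tokenizer can be swapped (for example for a faster implementation) without changing the code that consumes the tokens.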

The file handle should be opened in text mode for mmCIF files. Traditionally, mmCIF files used ASCII encoding. More and more recent files are UTF-8 encoded instead, but some use other encodings such as latin-1. To handle most current files use something like:

try:
    with open('input.cif', encoding='utf-8') as fh:
        systems = ihm.reader.read(fh)
except UnicodeDecodeError:
    with open('input.cif', encoding='latin-1') as fh:
        systems = ihm.reader.read(fh)

The file handle should be opened in binary mode for BinaryCIF files:

with open('input.bcif', 'rb') as fh:
    systems = ihm.reader.read(fh, format='BCIF')
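
The two cases above can be folded into one helper. The following is only a sketch under stated assumptions: read_any is a hypothetical name, choosing the format from a ".bcif" extension is a convention assumed here rather than anything the library enforces, and fake_read is a stand-in for ihm.reader.read so that the example runs without the library installed.

```python
import os
import tempfile


def read_any(path, read_func):
    """Open path in the mode its extension suggests and pass it to read_func."""
    if path.lower().endswith(".bcif"):
        # BinaryCIF: binary mode, and tell the reader which format to expect
        with open(path, "rb") as fh:
            return read_func(fh, format="BCIF")
    try:
        with open(path, encoding="utf-8") as fh:
            return read_func(fh, format="mmCIF")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so decoding cannot fail
        with open(path, encoding="latin-1") as fh:
            return read_func(fh, format="mmCIF")


def fake_read(fh, format):
    """Stand-in for ihm.reader.read that just reports what it was given."""
    return format, fh.read()


with tempfile.TemporaryDirectory() as tmpdir:
    cif_path = os.path.join(tmpdir, "input.cif")
    with open(cif_path, "wb") as out:
        out.write(b"caf\xe9")  # valid latin-1, invalid UTF-8
    bcif_path = os.path.join(tmpdir, "input.bcif")
    with open(bcif_path, "wb") as out:
        out.write(b"\x00\x01")
    text_result = read_any(cif_path, fake_read)
    binary_result = read_any(bcif_path, fake_read)

print(text_result)
print(binary_result)
```

With the library installed, the intended call would be read_any('input.cif', ihm.reader.read).
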
Parameters:
  • fh (file) – The file handle to read from. (For BinaryCIF files, the file should be opened in binary mode. For mmCIF files, files opened in binary mode with Python 3 will be treated as if they are Latin-1-encoded.)

  • model_class – The class to use to store model information (such as coordinates). For use with other software, it is recommended to subclass ihm.model.Model and override add_sphere() and/or add_atom(), and provide that subclass here. See ihm.model.Model.get_spheres() for more information.

  • format (str) – The format of the file. This can be 'mmCIF' (the default) for the (text-based) mmCIF format or 'BCIF' for BinaryCIF.

  • handlers (list) – A list of Handler classes (not objects). These can be used to read extra categories from the file.

  • warn_unknown_category (bool) – If set, emit an UnknownCategoryWarning for each unknown category encountered in the file.

  • warn_unknown_keyword (bool) – If set, emit an UnknownKeywordWarning for each unknown keyword (within an otherwise-handled category) encountered in the file.

  • read_starting_model_coord (bool) – If set, read coordinates for starting models, if provided in the file.

  • starting_model_class – The class to use to store starting model information. If read_starting_model_coord is also set, it is recommended to subclass ihm.startmodel.StartingModel and override add_atom() and/or add_seq_dif().

  • reject_old_file (bool) – If True, raise an ihm.reader.OldFileError if the file conforms to an older version of the dictionary than this library supports (by default the library will read what it can from the file).

  • variant (Variant) – A class or object that selects the type of file to read. This primarily controls the set of tables that are read from the file. In most cases the default IHMVariant should be used.

Returns:

A list of ihm.System objects.
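When warn_unknown_category or warn_unknown_keyword is set, the warnings can be collected with the standard warnings module rather than printed. The sketch below shows one way to do that. The helper read_collecting_warnings is a hypothetical name, and fake_read and DemoCategoryWarning are stand-ins for ihm.reader.read and ihm.reader.UnknownCategoryWarning so that the example runs without the library; the warning text is invented for the demonstration.

```python
import warnings


def read_collecting_warnings(read_func, fh, **kwargs):
    """Call read_func(fh, **kwargs) and return (result, list_of_warnings)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let filters hide repeats
        result = read_func(fh, **kwargs)
    return result, caught


class DemoCategoryWarning(Warning):
    """Stand-in for ihm.reader.UnknownCategoryWarning."""


def fake_read(fh, warn_unknown_category=False):
    """Stand-in for ihm.reader.read that 'sees' one unknown category."""
    if warn_unknown_category:
        warnings.warn("unknown category in file", DemoCategoryWarning)
    return []


systems, caught = read_collecting_warnings(
    fake_read, None, warn_unknown_category=True)
quiet_systems, quiet_caught = read_collecting_warnings(fake_read, None)

for w in caught:
    print(w.category.__name__, w.message)
```

With the library installed, the equivalent call would pass ihm.reader.read and a real file handle in place of the stand-ins.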

exception ihm.reader.UnknownCategoryWarning

Warning for unknown categories encountered in the file by read()

exception ihm.reader.UnknownKeywordWarning

Warning for unknown keywords encountered in the file by read()

exception ihm.reader.OldFileError

Exception raised if a file conforms to too old a version of the IHM extension dictionary. See read().

class ihm.reader.Handler(sysr)

Base class for all handlers of mmCIF data. Each class handles a single category in the mmCIF or BinaryCIF file. To add a new handler (for example to handle a custom category) make a subclass and set the class attribute category to the mmCIF category name (e.g. _struct). Provide a __call__ method. This method will be called for each occurrence of the category (multiple times for loop constructs, once per row) with the parameters to __call__ filled in with the same-named mmCIF keywords. For example, the class:

class CustomHandler(Handler):
    category = "_custom"
    def __call__(self, key1, key2):
        pass

will be called with arguments "x", "y" when given the mmCIF input:

_custom.key1 x
_custom.key2 y

Note that the arguments will always be strings when reading an mmCIF file. To convert to integer, floating point, or boolean, use the utility methods get_int(), get_float() or get_bool() respectively.
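The name-based matching between keywords and __call__ parameters can be illustrated with a small standalone sketch. This is not how the library implements it: dispatch_row is a hypothetical helper, and CustomHandler here is a plain class standing in for a Handler subclass so that the example runs without ihm installed.

```python
import inspect


class CustomHandler:
    """Stand-in for an ihm.reader.Handler subclass (no ihm import needed)."""
    category = "_custom"
    not_in_file = None  # value used for keywords absent from the file

    def __init__(self, sysr=None):
        self.sysr = sysr
        self.rows = []

    def __call__(self, key1, key2):
        self.rows.append((key1, key2))


def dispatch_row(handler, row):
    """Call handler with one value per __call__ parameter, looked up by name."""
    keywords = list(inspect.signature(handler.__call__).parameters)
    # Keywords the handler does not name are dropped; named keywords that
    # are missing from the row fall back to handler.not_in_file.
    args = {k: row.get(k, handler.not_in_file) for k in keywords}
    handler(**args)
    return args


handler = CustomHandler()
dispatch_row(handler, {"key1": "x", "key2": "y"})
dispatch_row(handler, {"key1": "z", "extra": "ignored"})
print(handler.rows)
```

The second call shows the two edge cases: a keyword the handler does not declare is never passed, and a declared keyword that is absent arrives as not_in_file.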

copy_if_present(obj, data, keys=[], mapkeys={})

Set obj.x from data['x'] for each x in keys if present in data. The dict mapkeys is handled similarly, except that each of its keys is looked up in data and the corresponding mapkeys value is used as the name of the attribute to set on obj.

end_save_frame()

Called at the end of each save frame.

finalize()

Called at the end of each data block.

get_bool(val)

Convert val to bool and return, or leave as is if None or ihm.unknown

get_float(val)

Return float(val) or leave as is if None or ihm.unknown

get_int(val)

Return int(val) or leave as is if None or ihm.unknown

get_int_or_string(val)

Return val as an int or str as appropriate, or leave as is if None or ihm.unknown

get_lower(val)

Return lowercase string val or leave as is if None or ihm.unknown

ignored_keywords = []

Keywords which are explicitly ignored (read() will not warn about their presence in the file). These are usually things like ordinal fields which we don’t use.

not_in_file = None

Value passed to __call__ for keywords not in the file

omitted = None

Value passed to __call__ for data marked as omitted ('.') in the file

sysr

Utility class to map IDs to Python objects.

property system

The ihm.System object to read into

unknown = ?

Value passed to __call__ for data marked as unknown ('?') in the file
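The interplay between the omitted and unknown values and the get_int() and get_float() conversions can be sketched as follows. Everything here is a stand-in written for illustration: UNKNOWN plays the role of ihm.unknown, interpret is a hypothetical helper that ignores quoting, and the two conversion functions only mirror the behavior documented above rather than reproducing the library's code.

```python
class _Unknown:
    """Stand-in for the ihm.unknown sentinel; prints as '?'."""

    def __repr__(self):
        return "?"


UNKNOWN = _Unknown()


def interpret(raw):
    """Map an unquoted mmCIF token to the value a handler would receive."""
    if raw == ".":
        return None     # omitted
    if raw == "?":
        return UNKNOWN  # unknown
    return raw          # everything else stays a string


def get_int(val):
    """Return int(val), or val unchanged if it is None or UNKNOWN."""
    return val if val is None or val is UNKNOWN else int(val)


def get_float(val):
    """Return float(val), or val unchanged if it is None or UNKNOWN."""
    return val if val is None or val is UNKNOWN else float(val)


converted = [get_int(interpret(tok)) for tok in ("42", ".", "?")]
print(converted)
```

Passing the two special values through unchanged means a handler can convert every field unconditionally and still distinguish "omitted" from "unknown" afterwards.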

class ihm.reader.SystemReader(model_class, starting_model_class)

Utility class to track global information for an ihm.System being read from a file, such as the mapping from IDs to objects (as IDMapper objects). This can be used by Handler subclasses.

alignments

Mapping from ID to ihm.reference.Alignment objects

analyses

Mapping from ID to ihm.analysis.Analysis objects

analysis_steps

Mapping from ID to ihm.analysis.Step objects

assemblies

Mapping from ID to ihm.Assembly objects

asym_units

Mapping from ID to ihm.AsymUnit objects

centers

Mapping from ID to ihm.geometry.Center objects

chem_comps

Mapping from ID to ihm.ChemComp objects

chem_descriptors

Mapping from ID to ihm.ChemDescriptor objects

citations

Mapping from ID to ihm.Citation objects

Mapping from ID to ihm.restraint.CrossLinkPseudoSite

Mapping from ID to ihm.restraint.CrossLink

data_transformations

Mapping from ID to ihm.geometry.Transformation objects used by ihm.dataset.TransformedDataset objects (this is distinct from transformations since they are stored in separate tables, with different IDs, in the mmCIF file).

dataset_groups

Mapping from ID to ihm.dataset.DatasetGroup objects

datasets

Mapping from ID to ihm.dataset.Dataset objects

db_locations

Mapping from ID to ihm.location.DatabaseLocation objects

densities

Mapping from ID to ihm.model.LocalizationDensity objects

dist_restraint_groups

Mapping from ID to ihm.restraint.RestraintGroup of ihm.restraint.DerivedDistanceRestraint objects

dist_restraints

Mapping from ID to ihm.restraint.DerivedDistanceRestraint objects

em2d_restraints

Mapping from ID to ihm.restraint.EM2DRestraint objects

em3d_restraints

Mapping from ID to ihm.restraint.EM3DRestraint objects

ensembles

Mapping from ID to ihm.model.Ensemble objects

entities

Mapping from ID to ihm.Entity objects

experimental_xl_groups

Mapping from ID to groups of ihm.restraint.ExperimentalCrossLink objects

experimental_xls

Mapping from ID to ihm.restraint.ExperimentalCrossLink objects

external_files

Mapping from ID to ihm.location.FileLocation objects

features

Mapping from ID to ihm.restraint.Feature objects

flr_data

Mapping from ID to ihm.flr.FLRData objects

flr_entity_assemblies

Mapping from ID to ihm.flr.EntityAssembly objects

flr_exp_conditions

Mapping from ID to ihm.flr.ExpCondition objects

flr_experiments

Mapping from ID to ihm.flr.Experiment objects

flr_fps_av_modeling

Mapping from ID to ihm.flr.FPSAVModeling objects

flr_fps_av_parameters

Mapping from ID to ihm.flr.FPSAVParameter objects

flr_fps_global_parameters

Mapping from ID to ihm.flr.FPSGlobalParameters objects

flr_fps_mean_probe_positions

Mapping from ID to ihm.flr.FPSMeanProbePosition objects

flr_fps_modeling

Mapping from ID to ihm.flr.FPSModeling objects

flr_fps_mpp_atom_position_groups

Mapping from ID to ihm.flr.FPSMPPAtomPositionGroup objects

flr_fps_mpp_atom_positions

Mapping from ID to ihm.flr.FPSMPPAtomPosition objects

flr_fps_mpp_modeling

Mapping from ID to ihm.flr.FPSMPPModeling objects

flr_fret_analyses

Mapping from ID to ihm.flr.FRETAnalysis objects

flr_fret_calibration_parameters

Mapping from ID to ihm.flr.FRETCalibrationParameters objects

flr_fret_distance_restraint_groups

Mapping from ID to ihm.flr.FRETDistanceRestraintGroup objects

flr_fret_distance_restraints

Mapping from ID to ihm.flr.FRETDistanceRestraint objects

flr_fret_forster_radius

Mapping from ID to ihm.flr.FRETForsterRadius objects

flr_fret_model_distances

Mapping from ID to ihm.flr.FRETModelDistance objects

flr_fret_model_qualities

Mapping from ID to ihm.flr.FRETModelQuality objects

flr_inst_settings

Mapping from ID to ihm.flr.InstSetting objects

flr_instruments

Mapping from ID to ihm.flr.Instrument objects

flr_kinetic_rate_fret_analysis_connection

Mapping from ID to ihm.flr.KineticRateFretAnalysisConnection objects

flr_lifetime_fit_models

Mapping from ID to ihm.flr.LifetimeFitModel objects

flr_peak_assignments

Mapping from ID to ihm.flr.PeakAssignment objects

flr_poly_probe_conjugates

Mapping from ID to ihm.flr.PolyProbeConjugate objects

flr_poly_probe_positions

Mapping from ID to ihm.flr.PolyProbePosition objects

flr_probes

Mapping from ID to ihm.flr.Probe objects

flr_ref_measurement_groups

Mapping from ID to ihm.flr.RefMeasurementGroup objects

flr_ref_measurement_lifetimes

Mapping from ID to ihm.flr.RefMeasurementLifetime objects

flr_ref_measurements

Mapping from ID to ihm.flr.RefMeasurement objects

flr_relaxation_time_fret_analysis_connection

Mapping from ID to ihm.flr.RelaxationTimeFretAnalysisConnection objects

flr_sample_conditions

Mapping from ID to ihm.flr.SampleCondition objects

flr_sample_probe_details

Mapping from ID to ihm.flr.SampleProbeDetails objects

flr_samples

Mapping from ID to ihm.flr.Sample objects

geom_restraints

Mapping from ID to ihm.restraint.GeometricRestraint objects

geometries

Mapping from ID to ihm.geometry.GeometricObject objects

hdx_restraints

Mapping from ID to ihm.restraint.HDXRestraint objects

kinetic_rates

Mapping from ID to ihm.multi_state_scheme.KineticRate objects

model_groups

Mapping from ID to ihm.model.ModelGroup objects

models

Mapping from ID to ihm.model.Model objects

multi_state_scheme_connectivities

Mapping from ID to ihm.multi_state_scheme.Connectivity objects

multi_state_schemes

Mapping from ID to ihm.multi_state_scheme.MultiStateScheme objects

ordered_procs

Mapping from ID to ihm.model.OrderedProcess objects

ordered_steps

Mapping from ID to ihm.model.ProcessStep objects

pred_cont_restraint_groups

Mapping from ID to ihm.restraint.RestraintGroup of ihm.restraint.PredictedContactRestraint objects

pred_cont_restraints

Mapping from ID to ihm.restraint.PredictedContactRestraint objects

protocols

Mapping from ID to ihm.protocol.Protocol objects

pseudo_sites

Mapping from ID to ihm.restraint.PseudoSite objects

ranges

Mapping from ID to ihm.AsymUnitRange or ihm.EntityRange objects

references

Mapping from ID to ihm.reference.Reference objects

relaxation_times

Mapping from ID to ihm.multi_state_scheme.RelaxationTime objects

repos

Mapping from ID to ihm.location.Repository objects

representations

Mapping from ID to ihm.representation.Representation objects

sas_restraints

Mapping from ID to ihm.restraint.SASRestraint objects

software

Mapping from ID to ihm.Software objects

src_gens

Mapping from ID to ihm.source.Manipulated objects

src_nats

Mapping from ID to ihm.source.Natural objects

src_syns

Mapping from ID to ihm.source.Synthetic objects

starting_models

Mapping from ID to ihm.startmodel.StartingModel objects

state_groups

Mapping from ID to ihm.model.StateGroup objects

states

Mapping from ID to ihm.model.State objects

system

The ihm.System object being read in

transformations

Mapping from ID to ihm.geometry.Transformation objects

xl_restraints

Mapping from ID to ihm.restraint.CrossLinkRestraint objects

class ihm.reader.IDMapper(system_list, cls, *cls_args, **cls_keys)

Utility class to handle mapping from mmCIF IDs to Python objects.

Parameters:
  • system_list (list) – The list in ihm.System that keeps track of these objects.

  • cls (class) – The base class for the Python objects.

get_all()

Yield all objects seen so far (unordered)

get_by_id(objid, newcls=None)

Get the object with given ID, creating it if it doesn’t already exist. If newcls is specified, the object will be an instance of that class (this is commonly used when different subclasses are employed depending on a type specified in the mmCIF file, such as the various subclasses of ihm.dataset.Dataset).

get_by_id_or_none(objid, newcls=None)

Get the object with given ID, creating it if it doesn’t already exist. If ID is None or ihm.unknown, return None instead.
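The get-or-create behavior described for get_by_id() can be sketched with a much simplified stand-in. MiniIDMapper is a hypothetical class written for this example, not the library's IDMapper: it takes no extra constructor arguments, ignores newcls for objects that already exist, and checks only for None (not ihm.unknown) in get_by_id_or_none(). The Dataset and CXMSDataset classes are likewise local placeholders.

```python
class MiniIDMapper:
    """Simplified stand-in for ihm.reader.IDMapper (get-or-create by ID)."""

    def __init__(self, system_list, cls):
        self.system_list = system_list
        self._cls = cls
        self._obj_by_id = {}

    def get_by_id(self, objid, newcls=None):
        """Return the object for objid, creating and registering it if new."""
        if objid not in self._obj_by_id:
            obj = (newcls or self._cls)()
            self._obj_by_id[objid] = obj
            self.system_list.append(obj)
        return self._obj_by_id[objid]

    def get_by_id_or_none(self, objid, newcls=None):
        """Like get_by_id(), but map a None ID to None."""
        if objid is None:
            return None
        return self.get_by_id(objid, newcls)

    def get_all(self):
        """Yield every object created so far."""
        return iter(self._obj_by_id.values())


class Dataset:
    """Placeholder base class for the demonstration."""


class CXMSDataset(Dataset):
    """Placeholder subclass selected via newcls."""


datasets = []
mapper = MiniIDMapper(datasets, Dataset)
first = mapper.get_by_id("1", CXMSDataset)
again = mapper.get_by_id("1")
print(first is again, type(first).__name__, len(datasets))
```

Because a lookup never fails, two handlers that mention the same ID end up sharing one object regardless of which of them runs first.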

class ihm.reader.RangeIDMapper

Utility class to handle mapping from mmCIF IDs to ihm.AsymUnitRange or ihm.EntityRange objects.

get(asym_or_entity, range_id)

Get a range from an ID.

Parameters:
  • asym_or_entity – An ihm.Entity or ihm.AsymUnit object representing the part of the system to which the range will be applied.

  • range_id (str) – mmCIF ID

Returns:

A range as an ihm.Entity, ihm.AsymUnit, ihm.EntityRange or ihm.AsymUnitRange object.

set(range_id, seq_id_begin, seq_id_end)

Add a range.

Parameters:
  • range_id (str) – mmCIF ID

  • seq_id_begin (int) – Index of the start of the range

  • seq_id_end (int) – Index of the end of the range

class ihm.reader.Variant

Utility class to select the type of file to read with read().

get_audit_conform_handler(sysr)

Get a Handler to check the audit_conform table. If read() is called with reject_old_file=True, this handler is used to check the audit_conform table and reject the file if it is deemed to be too old.

Parameters:

sysr (SystemReader) – class to track global file information.

Returns:

a suitable handler.

Return type:

Handler

get_handlers(sysr)

Get the Handler objects to use to parse input.

Parameters:

sysr (SystemReader) – class to track global file information.

Returns:

a list of Handler objects.

system_reader = None

Class to track global file information, e.g. SystemReader

class ihm.reader.IHMVariant

Used to select typical PDBx/IHM file input. See read().