The ihm.reader Python module

Utility classes to read in information in mmCIF or BinaryCIF format

ihm.reader.read(fh, model_class=<class 'ihm.model.Model'>, format='mmCIF', handlers=[], warn_unknown_category=False, warn_unknown_keyword=False, read_starting_model_coord=True, starting_model_class=<class 'ihm.startmodel.StartingModel'>, reject_old_file=False, variant=<class 'ihm.reader.IHMVariant'>)

Read data from the file handle fh.

Note that the reader currently expects to see a file compliant with the PDBx and/or IHM dictionaries. It is not particularly tolerant of noncompliant or incomplete files, and will probably throw an exception rather than warning about and trying to handle such files. Please open an issue if you encounter such a problem.

Files can be read in either the text-based mmCIF format or the BinaryCIF format. The mmCIF reader works by breaking the file into tokens, and using this stream of tokens to populate Python data structures. Two tokenizers are available: a pure Python implementation and a C-accelerated version. The C-accelerated version is much faster, so it is used whenever it has been built. The BinaryCIF reader needs the msgpack Python module to function.
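
The token-based approach described above can be illustrated with a toy example. This is only a sketch under simplifying assumptions (single-line key/value pairs, no loops, no multi-line strings, simplified quoting); the names tokenize and populate are hypothetical and are not part of ihm.reader, and this is not how the library's own tokenizers are implemented.

```python
import re

# A token is a single-quoted string, a double-quoted string, or a run of
# non-whitespace characters (a deliberate simplification of mmCIF quoting).
_TOKEN = re.compile(r"""'([^']*)'|"([^"]*)"|(\S+)""")


def tokenize(text):
    """Yield tokens from simple, single-line mmCIF key/value text."""
    for line in text.splitlines():
        if line.startswith('#'):
            continue  # skip whole-line comments
        for match in _TOKEN.finditer(line):
            # Exactly one of the three groups took part in the match
            yield next(g for g in match.groups() if g is not None)


def populate(text):
    """Group key/value tokens into {category: {keyword: value}}."""
    data = {}
    tokens = tokenize(text)
    # Consume the stream two tokens at a time: a key, then its value
    for key, value in zip(tokens, tokens):
        category, keyword = key.split('.', 1)
        data.setdefault(category, {})[keyword] = value
    return data


result = populate("_custom.key1 x\n_custom.key2 'two words'\n")
print(result)
```

Note that every value comes out as a string, which is why the Handler conversion helpers described later in this page exist.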

The file handle should be opened in text mode for mmCIF files. Traditionally, mmCIF files used ASCII encoding. Increasingly, recent files are UTF-8 encoded instead, but some use other encodings such as Latin-1. To handle most current files, try UTF-8 first and fall back to Latin-1:

try:
    with open('input.cif', encoding='utf-8') as fh:
        systems = ihm.reader.read(fh)
except UnicodeDecodeError:
    with open('input.cif', encoding='latin-1') as fh:
        systems = ihm.reader.read(fh)

The file handle should be opened in binary mode for BinaryCIF files:

with open('input.bcif', 'rb') as fh:
    systems = ihm.reader.read(fh, format='BCIF')
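
When a script has to accept both kinds of file, the choice of open mode and format string can be made in one place. The helper below is a minimal sketch; open_args_for is a hypothetical name, and recognizing BinaryCIF by a '.bcif' extension is an assumption rather than anything the library requires. It does not implement the Latin-1 fallback shown earlier.

```python
import os


def open_args_for(filename):
    """Return (keyword arguments for open(), format string for read()).

    Assumption: BinaryCIF files carry a '.bcif' extension; anything else
    is treated as text-based mmCIF.
    """
    ext = os.path.splitext(filename)[1].lower()
    if ext == '.bcif':
        # BinaryCIF must be read from a binary-mode file handle
        return {'mode': 'rb'}, 'BCIF'
    # mmCIF is read in text mode; UTF-8 covers most current files
    return {'mode': 'r', 'encoding': 'utf-8'}, 'mmCIF'


# Intended use (requires the ihm package and a real file):
#   kwargs, fmt = open_args_for('input.bcif')
#   with open('input.bcif', **kwargs) as fh:
#       systems = ihm.reader.read(fh, format=fmt)
print(open_args_for('input.bcif'))
```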

Parameters:
  • fh (file) – The file handle to read from. (For BinaryCIF files, the file should be opened in binary mode. For mmCIF files, files opened in binary mode with Python 3 will be treated as if they are Latin-1-encoded.)
  • model_class – The class to use to store model information (such as coordinates). For use with other software, it is recommended to subclass ihm.model.Model and override add_sphere() and/or add_atom(), and provide that subclass here. See ihm.model.Model.get_spheres() for more information.
  • format (str) – The format of the file. This can be ‘mmCIF’ (the default) for the (text-based) mmCIF format or ‘BCIF’ for BinaryCIF.
  • handlers (list) – A list of Handler classes (not objects). These can be used to read extra categories from the file.
  • warn_unknown_category (bool) – If set, emit an UnknownCategoryWarning for each unknown category encountered in the file.
  • warn_unknown_keyword (bool) – If set, emit an UnknownKeywordWarning for each unknown keyword (within an otherwise-handled category) encountered in the file.
  • read_starting_model_coord (bool) – If set, read coordinates for starting models, if provided in the file.
  • starting_model_class – The class to use to store starting model information. If read_starting_model_coord is also set, it is recommended to subclass ihm.startmodel.StartingModel and override add_atom() and/or add_seq_dif().
  • reject_old_file (bool) – If True, raise an ihm.reader.OldFileError if the file conforms to an older version of the dictionary than this library supports (by default the library will read what it can from the file).
  • variant (Variant) – A class or object that selects the type of file to read. This primarily controls the set of tables that are read from the file. In most cases the default IHMVariant should be used.
Returns:

A list of ihm.System objects.

exception ihm.reader.UnknownCategoryWarning

Warning for unknown categories encountered in the file by read()

exception ihm.reader.UnknownKeywordWarning

Warning for unknown keywords encountered in the file by read()

exception ihm.reader.OldFileError

Exception raised if a file conforms to too old a version of the IHM extension dictionary. See read().

class ihm.reader.Handler(sysr)

Base class for all handlers of mmCIF data. Each class handles a single category in the mmCIF or BinaryCIF file. To add a new handler (for example to handle a custom category) make a subclass and set the class attribute category to the mmCIF category name (e.g. _struct). Provide a __call__ method. This will be called for each category (multiple times for loop constructs) with the parameters to __call__ filled in with the same-named mmCIF keywords. For example the class:

class CustomHandler(Handler):
    category = "_custom"
    def __call__(self, key1, key2):
        pass

will be called with arguments “x”, “y” when given the mmCIF input:

_custom.key1 x
_custom.key2 y

Note that the arguments will always be strings when reading an mmCIF file. To convert to integer, floating point, or boolean, use the utility methods get_int(), get_float() or get_bool() respectively.
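
The by-name matching between mmCIF keywords and __call__ parameters can be pictured with a small stand-in. This sketch does not import ihm: CustomHandler here is a plain class rather than a real Handler subclass, and dispatch is a hypothetical helper that only imitates the described behavior (keywords missing from the file are passed as None, matching not_in_file).

```python
import inspect


class CustomHandler:
    """Stand-in for a Handler subclass (does not use ihm.reader)."""
    category = "_custom"

    def __init__(self):
        self.rows = []

    def __call__(self, key1, key2):
        # Record what was received so it can be inspected afterwards
        self.rows.append((key1, key2))


def dispatch(handler, row):
    """Call handler with the keywords it names; absent ones become None."""
    names = inspect.signature(handler.__call__).parameters
    handler(**{name: row.get(name) for name in names})


handler = CustomHandler()
dispatch(handler, {"key1": "x", "key2": "y"})  # both keywords in the file
dispatch(handler, {"key1": "z", "other": "w"})  # key2 absent, 'other' unused
print(handler.rows)
```

Keywords that the handler does not name (such as 'other' above) are simply not passed to it.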

copy_if_present(obj, data, keys=[], mapkeys={})

Set obj.x from data['x'] for each x in keys if present in data. The dict mapkeys is handled similarly, except that its keys are looked up in data and the corresponding values name the attributes of obj to set.
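
A minimal sketch of these semantics follows. It is a standalone function written for illustration, not the library's code, and it assumes that "present" means the value in data is not None. The attribute and key names in the example are made up.

```python
from types import SimpleNamespace


def copy_if_present(obj, data, keys=(), mapkeys=None):
    """Sketch: copy selected entries of data onto attributes of obj."""
    # Same name in data and on obj
    for key in keys:
        if data.get(key) is not None:
            setattr(obj, key, data[key])
    # mapkeys maps a key in data to a differently named attribute of obj
    for key, attr in (mapkeys or {}).items():
        if data.get(key) is not None:
            setattr(obj, attr, data[key])


target = SimpleNamespace(name=None, version="1.0", location=None)
row = {"name": "prog", "version": None, "url": "https://example.org"}
copy_if_present(target, row, keys=("name", "version"),
                mapkeys={"url": "location"})
print(target)
```

Because version is None in row, the existing value on target is left untouched.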

end_save_frame()

Called at the end of each save frame.

finalize()

Called at the end of each data block.

get_bool(val)

Convert val to bool and return, or leave as is if None or ihm.unknown

get_float(val)

Return float(val) or leave as is if None or ihm.unknown

get_int(val)

Return int(val) or leave as is if None or ihm.unknown

get_int_or_string(val)

Return val as an int or str as appropriate, or leave as is if None or ihm.unknown

get_lower(val)

Return lowercase string val or leave as is if None or ihm.unknown
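
The common pattern in these helpers (convert the string, but pass None and ihm.unknown through unchanged) can be sketched without the library. In the code below, UNKNOWN is a stand-in for ihm.unknown, and the functions are illustrative reimplementations of the documented behavior, not the library's code. In particular, the rule used by get_int_or_string to decide what counts as an integer is an assumption.

```python
UNKNOWN = object()  # stand-in for ihm.unknown


def get_int(val):
    """Return int(val), or val unchanged if it is None or UNKNOWN."""
    return val if val is None or val is UNKNOWN else int(val)


def get_float(val):
    """Return float(val), or val unchanged if it is None or UNKNOWN."""
    return val if val is None or val is UNKNOWN else float(val)


def get_int_or_string(val):
    """Return an int where val parses as one, otherwise the string."""
    if val is None or val is UNKNOWN:
        return val
    try:
        return int(val)
    except ValueError:
        return val


print(get_int("42"), get_float("1.5"), get_int_or_string("A"))
```

Passing the special values through unchanged lets a __call__ method convert every argument unconditionally and deal with missing data later.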

ignored_keywords = []

Keywords which are explicitly ignored (read() will not warn about their presence in the file). These are usually things like ordinal fields which we don’t use.

not_in_file = None

Value passed to __call__ for keywords not in the file

omitted = None

Value passed to __call__ for data marked as omitted (‘.’) in the file

sysr = None

Utility class to map IDs to Python objects.

system

The ihm.System object to read into

unknown = ?

Value passed to __call__ for data marked as unknown (‘?’) in the file

class ihm.reader.SystemReader(model_class, starting_model_class)

Utility class to track global information for an ihm.System being read from a file, such as the mapping from IDs to objects (as IDMapper objects). This can be used by Handler subclasses.

alignments = None

Mapping from ID to ihm.reference.Alignment objects

analyses = None

Mapping from ID to ihm.analysis.Analysis objects

analysis_steps = None

Mapping from ID to ihm.analysis.Step objects

assemblies = None

Mapping from ID to ihm.Assembly objects

asym_units = None

Mapping from ID to ihm.AsymUnit objects

centers = None

Mapping from ID to ihm.geometry.Center objects

chem_comps = None

Mapping from ID to ihm.ChemComp objects

chem_descriptors = None

Mapping from ID to ihm.ChemDescriptor objects

citations = None

Mapping from ID to ihm.Citation objects

Mapping from ID to ihm.restraint.CrossLinkPseudoSite

Mapping from ID to ihm.restraint.CrossLink

data_transformations = None

Mapping from ID to ihm.geometry.Transformation objects used by ihm.dataset.TransformedDataset objects (this is distinct from transformations since they are stored in separate tables, with different IDs, in the mmCIF file).

dataset_groups = None

Mapping from ID to ihm.dataset.DatasetGroup objects

datasets = None

Mapping from ID to ihm.dataset.Dataset objects

db_locations = None

Mapping from ID to ihm.location.DatabaseLocation objects

densities = None

Mapping from ID to ihm.model.LocalizationDensity objects

dist_restraint_groups = None

Mapping from ID to ihm.restraint.RestraintGroup of ihm.restraint.DerivedDistanceRestraint objects

dist_restraints = None

Mapping from ID to ihm.restraint.DerivedDistanceRestraint objects

em2d_restraints = None

Mapping from ID to ihm.restraint.EM2DRestraint objects

em3d_restraints = None

Mapping from ID to ihm.restraint.EM3DRestraint objects

ensembles = None

Mapping from ID to ihm.model.Ensemble objects

entities = None

Mapping from ID to ihm.Entity objects

experimental_xl_groups = None

Mapping from ID to groups of ihm.restraint.ExperimentalCrossLink objects

experimental_xls = None

Mapping from ID to ihm.restraint.ExperimentalCrossLink objects

external_files = None

Mapping from ID to ihm.location.FileLocation objects

features = None

Mapping from ID to ihm.restraint.Feature objects

flr_data = None

Mapping from ID to ihm.flr.FLRData objects

flr_entity_assemblies = None

Mapping from ID to ihm.flr.EntityAssembly objects

flr_exp_conditions = None

Mapping from ID to ihm.flr.ExpCondition objects

flr_experiments = None

Mapping from ID to ihm.flr.Experiment objects

flr_fps_av_modeling = None

Mapping from ID to ihm.flr.FPSAVModeling objects

flr_fps_av_parameters = None

Mapping from ID to ihm.flr.FPSAVParameter objects

flr_fps_global_parameters = None

Mapping from ID to ihm.flr.FPSGlobalParameters objects

flr_fps_mean_probe_positions = None

Mapping from ID to ihm.flr.FPSMeanProbePosition objects

flr_fps_modeling = None

Mapping from ID to ihm.flr.FPSModeling objects

flr_fps_mpp_atom_position_groups = None

Mapping from ID to ihm.flr.FPSMPPAtomPositionGroup objects

flr_fps_mpp_atom_positions = None

Mapping from ID to ihm.flr.FPSMPPAtomPosition objects

flr_fps_mpp_modeling = None

Mapping from ID to ihm.flr.FPSMPPModeling objects

flr_fret_analyses = None

Mapping from ID to ihm.flr.FRETAnalysis objects

flr_fret_calibration_parameters = None

Mapping from ID to ihm.flr.FRETCalibrationParameters objects

flr_fret_distance_restraint_groups = None

Mapping from ID to ihm.flr.FRETDistanceRestraintGroup objects

flr_fret_distance_restraints = None

Mapping from ID to ihm.flr.FRETDistanceRestraint objects

flr_fret_forster_radius = None

Mapping from ID to ihm.flr.FRETForsterRadius objects

flr_fret_model_distances = None

Mapping from ID to ihm.flr.FRETModelDistance objects

flr_fret_model_qualities = None

Mapping from ID to ihm.flr.FRETModelQuality objects

flr_inst_settings = None

Mapping from ID to ihm.flr.InstSetting objects

flr_instruments = None

Mapping from ID to ihm.flr.Instrument objects

flr_lifetime_fit_models = None

Mapping from ID to ihm.flr.LifetimeFitModel objects

flr_peak_assignments = None

Mapping from ID to ihm.flr.PeakAssignment objects

flr_poly_probe_conjugates = None

Mapping from ID to ihm.flr.PolyProbeConjugate objects

flr_poly_probe_positions = None

Mapping from ID to ihm.flr.PolyProbePosition objects

flr_probes = None

Mapping from ID to ihm.flr.Probe objects

flr_ref_measurement_groups = None

Mapping from ID to ihm.flr.RefMeasurementGroup objects

flr_ref_measurement_lifetimes = None

Mapping from ID to ihm.flr.RefMeasurementLifetime objects

flr_ref_measurements = None

Mapping from ID to ihm.flr.RefMeasurement objects

flr_sample_conditions = None

Mapping from ID to ihm.flr.SampleCondition objects

flr_sample_probe_details = None

Mapping from ID to ihm.flr.SampleProbeDetails objects

flr_samples = None

Mapping from ID to ihm.flr.Sample objects

geom_restraints = None

Mapping from ID to ihm.restraint.GeometricRestraint objects

geometries = None

Mapping from ID to ihm.geometry.GeometricObject objects

model_groups = None

Mapping from ID to ihm.model.ModelGroup objects

models = None

Mapping from ID to ihm.model.Model objects

ordered_procs = None

Mapping from ID to ihm.model.OrderedProcess objects

ordered_steps = None

Mapping from ID to ihm.model.ProcessStep objects

pred_cont_restraint_groups = None

Mapping from ID to ihm.restraint.RestraintGroup of ihm.restraint.PredictedContactRestraint objects

pred_cont_restraints = None

Mapping from ID to ihm.restraint.PredictedContactRestraint objects

protocols = None

Mapping from ID to ihm.protocol.Protocol objects

pseudo_sites = None

Mapping from ID to ihm.restraint.PseudoSite objects

ranges = None

Mapping from ID to ihm.AsymUnitRange or EntityRange objects

references = None

Mapping from ID to ihm.reference.Reference objects

repos = None

Mapping from ID to ihm.location.Repository objects

representations = None

Mapping from ID to ihm.representation.Representation objects

sas_restraints = None

Mapping from ID to ihm.restraint.SASRestraint objects

software = None

Mapping from ID to ihm.Software objects

src_gens = None

Mapping from ID to ihm.source.Manipulated objects

src_nats = None

Mapping from ID to ihm.source.Natural objects

src_syns = None

Mapping from ID to ihm.source.Synthetic objects

starting_models = None

Mapping from ID to ihm.startmodel.StartingModel objects

state_groups = None

Mapping from ID to ihm.model.StateGroup objects

states = None

Mapping from ID to ihm.model.State objects

system = None

The ihm.System object being read in

transformations = None

Mapping from ID to ihm.geometry.Transformation objects

xl_restraints = None

Mapping from ID to ihm.restraint.CrossLinkRestraint objects

class ihm.reader.IDMapper(system_list, cls, *cls_args, **cls_keys)

Utility class to handle mapping from mmCIF IDs to Python objects.

Parameters:
  • system_list (list) – The list in ihm.System that keeps track of these objects.
  • cls (class) – The base class for the Python objects.

get_all()

Yield all objects seen so far (unordered)

get_by_id(objid, newcls=None)

Get the object with given ID, creating it if it doesn’t already exist. If newcls is specified, the object will be an instance of that class (this is commonly used when different subclasses are employed depending on a type specified in the mmCIF file, such as the various subclasses of ihm.dataset.Dataset).

get_by_id_or_none(objid, newcls=None)

Get the object with given ID, creating it if it doesn’t already exist. If ID is None or ihm.unknown, return None instead.
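
The get-or-create behavior of these two methods can be sketched with a toy mapper. SimpleIDMapper, Dataset and CXMSDataset below are hypothetical stand-ins defined only for this example; the real IDMapper also takes a system list and constructor arguments, and may treat newcls differently for objects that already exist. The ihm.unknown case is omitted.

```python
class SimpleIDMapper:
    """Toy get-or-create mapper; not the ihm.reader.IDMapper implementation."""

    def __init__(self, cls):
        self._cls = cls
        self._obj_by_id = {}

    def get_by_id(self, objid, newcls=None):
        if objid not in self._obj_by_id:
            # Simplification: newcls is only honored on first creation
            self._obj_by_id[objid] = (newcls or self._cls)()
        return self._obj_by_id[objid]

    def get_by_id_or_none(self, objid, newcls=None):
        # A missing ID maps to no object at all
        return None if objid is None else self.get_by_id(objid, newcls)

    def get_all(self):
        return self._obj_by_id.values()


class Dataset:
    pass


class CXMSDataset(Dataset):
    pass


datasets = SimpleIDMapper(Dataset)
first = datasets.get_by_id("1", CXMSDataset)  # created on first lookup
again = datasets.get_by_id("1")               # same object returned
print(first is again, type(first).__name__)
```

Creating objects on first lookup means a handler can refer to an ID before the table that defines it has been read.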

class ihm.reader.RangeIDMapper

Utility class to handle mapping from mmCIF IDs to ihm.AsymUnitRange or EntityRange objects.

get(asym_or_entity, range_id)

Get a range from an ID.

Parameters:
  • asym_or_entity – An ihm.Entity or ihm.AsymUnit object representing the part of the system to which the range will be applied.
  • range_id (str) – mmCIF ID
Returns:

A range as an ihm.Entity, ihm.AsymUnit, ihm.EntityRange or ihm.AsymUnitRange object.

set(range_id, seq_id_begin, seq_id_end)

Add a range.

Parameters:
  • range_id (str) – mmCIF ID
  • seq_id_begin (int) – Index of the start of the range
  • seq_id_end (int) – Index of the end of the range

class ihm.reader.Variant

Utility class to select the type of file to read with read().

get_audit_conform_handler(sysr)

Get a Handler to check the audit_conform table. If read() is called with reject_old_file=True, this handler is used to check the audit_conform table and reject the file if it is deemed to be too old.

Parameters:
  • sysr (SystemReader) – class to track global file information.
Returns:

a suitable handler.

Return type:

Handler

get_handlers(sysr)

Get the Handler objects to use to parse input.

Parameters:
  • sysr (SystemReader) – class to track global file information.
Returns:

a list of Handler objects.

system_reader = None

Class to track global file information, e.g. SystemReader

class ihm.reader.IHMVariant

Used to select typical PDBx/IHM file input. See read().