The ihm.format Python module

Utility classes to handle CIF format.

This module provides classes to read in and write out mmCIF files. It is only concerned with handling syntactically correct CIF - it does not know the set of tables or the mapping to ihm objects. For that, see ihm.dumper for writing and ihm.reader for reading.

See also the stream parser example.

class ihm.format.CifWriter(fh)

Write information to a CIF file. The constructor takes a single argument, a Python file-like object to write to, and provides methods to write Python objects to that file. Most simple Python types are supported (string, float, bool, int). The Python bool type is mapped to the CIF strings ‘NO’ and ‘YES’. Floats are always represented with 3 decimal places (or in scientific notation with 3 digits of precision if smaller than 1e-3); if a different precision is desired, convert the float to a string first.
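
As a rough illustration of these formatting rules, the following standalone sketch mimics the conversions described above. It is not the library's code: the helper name cif_repr is hypothetical, and applying the 1e-3 threshold to the absolute value is an assumption.

```python
def cif_repr(value):
    """Hypothetical helper mimicking the documented type conversions."""
    # bool must be tested before other types, since bool is a subclass of int
    if isinstance(value, bool):
        return "YES" if value else "NO"
    if isinstance(value, float):
        # Assumption: the 1e-3 threshold applies to the absolute value
        if abs(value) < 1e-3:
            return "%.3g" % value   # scientific notation, 3 significant digits
        return "%.3f" % value       # fixed notation, 3 decimal places
    return str(value)


default_precision = cif_repr(1.23456)
small_value = cif_repr(0.000012345)
# To keep more digits, convert the float to a string before writing it
custom_precision = cif_repr("%.6f" % 1.23456789)
```

Here default_precision keeps three decimal places, while custom_precision keeps six because the value was already a string when it was passed in.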

category(category)

Return a context manager to write a CIF category. A CIF category is a simple list of key:value pairs.

Parameters:

category (str) – the name of the category (e.g. “_struct_conf_type”).

Returns:

an object with a single method write which takes keyword arguments.

For example:

with writer.category("_struct_conf_type") as l:
    l.write(id='HELX_P', criteria=writer.unknown)

loop(category, keys)

Return a context manager to write a CIF loop.

Parameters:
  • category (str) – the name of the category (e.g. “_struct_conf”)

  • keys (list) – the field keys in that category

Returns:

an object with a single method write which takes keyword arguments; this can be called any number of times to add entries to the loop. Any field keys in keys that are not provided as arguments to write, or values that are the Python value None, will get the CIF omitted value (‘.’), while arguments to write that are not present in keys will be ignored.

For example:

with writer.loop("_struct_conf", ["id", "conf_type_id"]) as l:
    for i in range(5):
        l.write(id='HELX_P1%d' % i, conf_type_id='HELX_P')
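
The handling of missing and unexpected keyword arguments described above can be sketched in plain Python. This is only an illustration of the documented rule, not the library's implementation; the function name loop_row is hypothetical.

```python
def loop_row(keys, **kwargs):
    """Hypothetical helper: return the values for one loop row, ordered by keys."""
    row = []
    for key in keys:
        value = kwargs.get(key)  # None if the key was not passed to write()
        # Missing keys and explicit None both become the CIF omitted value
        row.append("." if value is None else value)
    # Keyword arguments not listed in keys are never looked up, so they are ignored
    return row


keys = ["id", "conf_type_id"]
missing_key = loop_row(keys, id="HELX_P1")
none_value = loop_row(keys, id=None, conf_type_id="HELX_P")
extra_key = loop_row(keys, id="HELX_P2", conf_type_id="HELX_P", details="dropped")
```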

start_block(name)

Start a new data block in the file with the given name.

write_comment(comment)

Write a simple comment to the CIF file. The comment will be wrapped if necessary for readability. See _set_line_wrap().
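
To give a feel for what wrapping a comment means, here is a minimal sketch that uses the standard textwrap module. The function name format_comment and the default width of 78 characters are assumptions made for illustration; the library's actual wrapping behavior may differ.

```python
import textwrap


def format_comment(comment, width=78):
    """Hypothetical helper: wrap a comment and prefix each line with '# '."""
    # textwrap.wrap splits the text on whitespace into lines of at most
    # `width` characters; the '# ' prefix is added afterwards
    return ["# " + line for line in textwrap.wrap(comment, width)]


short_lines = format_comment("A short comment")
long_lines = format_comment("word " * 30, width=40)
```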

class ihm.format.CifReader(fh, category_handler, unknown_category_handler=None, unknown_keyword_handler=None)

Class to read an mmCIF file and extract some or all of its data.

Use read_file() to actually read the file.

Parameters:
  • fh (file) – Open handle to the mmCIF file

  • category_handler (dict) – A dict to handle data extracted from the file. Keys are category names (e.g. “_entry”) and values are objects that have a __call__ method and not_in_file, omitted, and unknown attributes. The names of the arguments to this __call__ method are the mmCIF keywords to extract from the file (for the keywords tr_vector[N] and rot_matrix[N][M] simply omit the [ and ] characters, since these are not valid in Python identifiers). The object will be called with the data from the file as a set of strings, or with not_in_file, omitted, or unknown for any keyword that is not present in the file, has the mmCIF omitted value (.), or has the mmCIF unknown value (?), respectively. (mmCIF keywords are case insensitive, so this class always treats them as lowercase regardless of the file contents.)

  • unknown_category_handler – A callable (or None) that is called for each category in the file that isn’t handled; it is given two arguments: the name of the category, and the line in the file at which the category was encountered (if known, otherwise None).

  • unknown_keyword_handler – A callable (or None) that is called for each keyword in the file that isn’t handled (within a category that is handled); it is given three arguments: the names of the category and keyword, and the line in the file at which the keyword was encountered (if known, otherwise None).
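
The shape of these handler objects can be sketched as follows. The class and function names are hypothetical, the sentinel values are arbitrary choices, and the final call only simulates what the reader would do for one row of data; no file is actually parsed here.

```python
class EntryHandler:
    """Hypothetical handler for the _entry category."""
    # Values passed in place of data when a keyword is absent from the file,
    # has the omitted value (.), or has the unknown value (?)
    not_in_file = None
    omitted = None
    unknown = "?"

    def __init__(self):
        self.ids = []

    # Argument names are the lowercase mmCIF keywords to extract; a keyword
    # such as tr_vector[1] would be written as the argument tr_vector1
    def __call__(self, id):
        self.ids.append(id)


unhandled = []


def note_unknown_category(category, line):
    """Hypothetical unknown_category_handler: takes the name and line number."""
    unhandled.append((category, line))


handler = EntryHandler()
# This dict is what would be passed as the category_handler argument
category_handler = {"_entry": handler}

# Simulate the calls the reader would make while reading a file
category_handler["_entry"](id="model")
note_unknown_category("_audit_author", None)
```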

read_file()

Read the file and extract data. Category handlers will be called as data becomes available - for loop_ constructs, this will be once for each row in the loop; for categories (e.g. _entry.id model), this will be once at the very end of the file.

If the C-accelerated _format module is available, then it is used instead of the (much slower) Python tokenizer.

CifParserError will be raised if the file cannot be parsed.

Returns:

True iff more data blocks are available to be read.
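
Because the return value signals whether further data blocks remain, a file with several blocks can be read by calling read_file() in a loop. The sketch below shows that calling pattern against a hypothetical stub (StubReader) rather than a real CifReader, so it illustrates only the control flow.

```python
class StubReader:
    """Hypothetical stand-in for a reader over a fixed number of data blocks."""

    def __init__(self, num_blocks):
        self._remaining = num_blocks

    def read_file(self):
        # Consume one block; report whether any blocks are left
        self._remaining -= 1
        return self._remaining > 0


def read_all_blocks(reader):
    """Call read_file() until it reports that no more data blocks remain."""
    blocks_read = 0
    while True:
        more_data = reader.read_file()
        blocks_read += 1
        if not more_data:
            break
    return blocks_read


count = read_all_blocks(StubReader(3))
```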

exception ihm.format.CifParserError

Exception raised for mmCIF files with an invalid format.