Data Models

This module defines the core data models that represent mmCIF file structures. SLOTH’s data hierarchy mirrors the mmCIF format:

  • MMCIFDataContainer — top-level container (one or more data blocks)

  • DataBlock β€” a named data block (e.g., data_1ABC)

  • Category β€” a category within a block (e.g., _atom_site)

  • Row β€” a single record in a category

  • Item β€” a column/field in a category
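
The hierarchy above can be pictured with plain Python containers. This is only an illustrative sketch: the nested dicts and the row helper below are stand-ins invented for this example, not SLOTH's classes or API, and the sample values are made up.

```python
# Stand-in structure only: plain dicts and lists, not SLOTH's classes.
container = {                                  # MMCIFDataContainer
    "data_1ABC": {                             # DataBlock
        "_atom_site": {                        # Category
            "group_PDB": ["ATOM", "ATOM"],     # Item (one column)
            "id": ["1", "2"],
            "type_symbol": ["N", "C"],
        }
    }
}

category = container["data_1ABC"]["_atom_site"]


def row(cat, index):
    """Hypothetical helper: a Row is the index-th value of every item."""
    return {name: values[index] for name, values in cat.items()}


print(row(category, 1))
# {'group_PDB': 'ATOM', 'id': '2', 'type_symbol': 'C'}
```

Items are stored column-wise, so a row is assembled by reading the same position from each column.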

Enumerations

class sloth.mmcif.models.DataSourceFormat[source]

Bases: Enum

Enum to track the format source of mmCIF data.

MMCIF = 1
JSON = 2
DICT = 3
UNKNOWN = 4
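
As a small illustration of how such an enum is typically used, the sketch below rebuilds the four documented members locally so that it runs without SLOTH installed. The parse_format helper is hypothetical and not part of the library.

```python
from enum import Enum


class DataSourceFormat(Enum):
    """Local replica of the documented members, for illustration only."""
    MMCIF = 1
    JSON = 2
    DICT = 3
    UNKNOWN = 4


def parse_format(label: str) -> DataSourceFormat:
    """Hypothetical helper: map a free-form label to a member.

    Unrecognized labels fall back to UNKNOWN instead of raising.
    """
    try:
        return DataSourceFormat[label.strip().upper()]
    except KeyError:
        return DataSourceFormat.UNKNOWN


print(parse_format("json"))    # DataSourceFormat.JSON
print(parse_format("xml"))     # DataSourceFormat.UNKNOWN
print(DataSourceFormat(1))     # DataSourceFormat.MMCIF
```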

Abstract Base Classes

class sloth.mmcif.models.DataNode[source]

Bases: ABC

Abstract base class for all data nodes in the hierarchy.

abstract property name: str

Get the name of the node.

class sloth.mmcif.models.DataContainer[source]

Bases: DataNode

Abstract base class for containers that hold other nodes.

register(name, plugin)[source]

Register a plugin for dot-notation access on this node.

Parameters:
  • name (str) – The attribute name (e.g. "validate").

  • plugin – A Plugin instance or a plain callable.

Return type:

None
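
One way dot-notation access to registered plugins could work is through __getattr__. The sketch below is an assumption about the mechanism, not SLOTH's implementation: PluginHost is a hypothetical class, and passing the host node as the plugin's first argument is a design choice made for this example.

```python
class PluginHost:
    """Hypothetical stand-in for a node that supports register()."""

    def __init__(self):
        self._plugins = {}

    def register(self, name, plugin):
        """Store a callable under the attribute name it should answer to."""
        self._plugins[name] = plugin

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup fails.
        plugins = self.__dict__.get("_plugins", {})
        if name in plugins:
            plugin = plugins[name]
            # Assumption: the plugin receives the host as its first argument.
            return lambda *args, **kwargs: plugin(self, *args, **kwargs)
        raise AttributeError(name)


host = PluginHost()
host.register("validate", lambda node, strict=False: ("ok", strict))
print(host.validate(strict=True))   # ('ok', True)
```

Because lookup goes through a dictionary, registering a second plugin under the same name simply replaces the first.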

Top-level Container

class sloth.mmcif.models.MMCIFDataContainer[source]

Bases: DataContainer

A class to represent an mmCIF data container.

__init__(data_blocks=None, source_format=DataSourceFormat.MMCIF)[source]

Parameters:
  • data_blocks – Optional; defaults to None.

  • source_format – Defaults to DataSourceFormat.MMCIF.

property name: str

Get the name of the node.

delete(block_name)[source]

Delete a data block by name (string-based API).

Parameters:

block_name (str) – The block name (with or without data_ prefix).

Raises:

KeyError – If the block does not exist.

Return type:

None
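
Accepting a name with or without its prefix usually comes down to normalizing the key before the lookup. The following is a minimal sketch of that idea under stated assumptions: BlockStore and its internals are hypothetical, and storing blocks under their bare names is a choice made for this example rather than a documented detail.

```python
class BlockStore:
    """Hypothetical container that keys blocks by their bare name."""

    PREFIX = "data_"

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def _bare(self, name):
        # Strip the prefix once, if present, so both spellings match.
        if name.startswith(self.PREFIX):
            return name[len(self.PREFIX):]
        return name

    def delete(self, block_name):
        key = self._bare(block_name)
        if key not in self._blocks:
            raise KeyError(block_name)
        del self._blocks[key]

    @property
    def blocks(self):
        # Names are reported with the prefix, whatever form was passed in.
        return [self.PREFIX + key for key in self._blocks]


store = BlockStore({"1ABC": {}, "2XYZ": {}})
store.delete("data_1ABC")     # prefixed form
print(store.blocks)           # ['data_2XYZ']
```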

property blocks: LazyKeyList

Provides a lazy list of data block names, created in O(1) time. Names carry the data_ prefix for consistency.

property data: DataBlockCollection

Provides access to data blocks with both list and dict interfaces.

Data Block

class sloth.mmcif.models.DataBlock[source]

Bases: DataContainer

A class to represent a data block in an mmCIF file.

__init__(name, categories=None)[source]

Parameters:
  • name – The block name.

  • categories – Optional; defaults to None.

property name: str

Get the name of the node.

property categories: LazyKeyList

Get the names of the contained categories as a lazy list created in O(1) time. Names carry the prefix used by the external API.

property data: CategoryCollection

Provides read-only access to the category objects.

delete(category_name)[source]

Delete a category by name (string-based API).

Parameters:

category_name (str) – The category name to remove (with or without _ prefix).

Raises:

KeyError – If the category does not exist.

Return type:

None

Category

class sloth.mmcif.models.Category[source]

Bases: DataContainer

A class to represent a category in a data block.

__init__(name)[source]
Parameters:

name (str)

property name: str

Get the name of the node.

property items: LazyKeyList

Get the names of the contained items as a lazy list created in O(1) time.

delete(item_name)[source]

Delete an mmCIF item by name (string-based API).

Parameters:

item_name (str) – The item name to remove.

Raises:

KeyError – If the item does not exist.

Return type:

None

property data: LazyItemDict

Provides lazy, read-only access to the data. The wrapper is created in O(1) time and items are loaded on demand.

property row_count: int

Returns the number of rows in this category.

property rows: LazyRowList

Returns all rows in this category as a lazy list. Creating the list is O(1), and the result is cached for performance.

get_item(item_name)[source]

Get the raw item (Item object or list), without forcing lazy loading.

Return type:

Union[Item, List[str]]

Parameters:

item_name (str)

is_lazy_loaded(item_name)[source]

Check if an item is lazy-loaded.

Return type:

bool

Parameters:

item_name (str)

Row

class sloth.mmcif.models.Row[source]

Bases: DataNode

Represents a single row of data in a Category.

__init__(category, row_index)[source]
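
Parameters:
  • category – The Category the row belongs to.

  • row_index – The index of the row within the category.
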
property name: str

Return name from the first item value in the row if available, otherwise the row index.

property data: Dict[str, str]

Return all item values for this row as a dictionary.
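
The relationship between column-wise storage and row access can be sketched as follows. RowView and LazyRows are hypothetical stand-ins written for this example; they mirror the documented behavior (rows created on access, name taken from the first item value or the row index) but are not SLOTH's code.

```python
class RowView:
    """Hypothetical stand-in for a row: a view over shared column lists."""

    def __init__(self, columns, row_index):
        self._columns = columns          # item name -> list of values
        self._row_index = row_index

    @property
    def data(self):
        # The dictionary is built on request; nothing is copied at creation.
        return {name: values[self._row_index]
                for name, values in self._columns.items()}

    @property
    def name(self):
        # First item value if the category has any items, else the row index.
        for values in self._columns.values():
            return values[self._row_index]
        return str(self._row_index)


class LazyRows:
    """Hypothetical list-like object that builds RowView objects on access."""

    def __init__(self, columns, row_count):
        self._columns = columns
        self._row_count = row_count
        self._cache = {}                 # index -> RowView already created

    def __len__(self):
        return self._row_count

    def __getitem__(self, index):
        if index < 0:
            index += self._row_count
        if not 0 <= index < self._row_count:
            raise IndexError(index)
        if index not in self._cache:
            self._cache[index] = RowView(self._columns, index)
        return self._cache[index]


columns = {"id": ["1", "2"], "type_symbol": ["N", "C"]}
rows = LazyRows(columns, row_count=2)
print(rows[1].data)   # {'id': '2', 'type_symbol': 'C'}
```

Creating LazyRows touches no row data, which is why construction cost does not depend on the number of rows.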

Item

class sloth.mmcif.models.Item[source]

Bases: DataNode

Represents a column/item in a category. Always uses eager loading.

__init__(name, values=None)[source]

Initialize an Item with pre-loaded values.

Parameters:
  • name (str) – The name of the item

  • values (Optional[List[str]]) – Pre-loaded values

property name: str

Read-only access to the item name.

property values: List[str]

Values with automatic caching via @cached_property.
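
For readers unfamiliar with functools.cached_property, the sketch below shows the caching behavior the description refers to: the decorated method runs once and its result is then stored on the instance. ColumnSketch and its loads counter are hypothetical and exist only to make the effect visible.

```python
from functools import cached_property


class ColumnSketch:
    """Hypothetical column whose values are computed once, then cached."""

    def __init__(self, name, raw):
        self._name = name
        self._raw = raw
        self.loads = 0                   # counts how often values is computed

    @property
    def name(self):
        return self._name                # read-only: no setter is defined

    @cached_property
    def values(self):
        self.loads += 1
        return [value.strip() for value in self._raw]


col = ColumnSketch("_atom_site.id", [" 1", "2 "])
first = col.values
second = col.values
print(first, col.loads)   # ['1', '2'] 1
```

The second access returns the very same list object, so in-place changes to that list are visible on later accesses.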

add_value(value)[source]

Add a value directly (for small datasets or immediate loading).

Return type:

None

Parameters:

value (str)

Lazy Loading Internals

These classes provide efficient lazy-loading wrappers over gemmi data structures. They are used internally and typically not instantiated directly.

class sloth.mmcif.models.LazyGemmiColumn[source]

Bases: list

Lazy wrapper for gemmi loop columns: data is extracted only when accessed. Behaves like a list but loads data from gemmi on first access.

__init__(gemmi_loop, column_index)[source]

Initialize lazy column wrapper.

Parameters:
  • gemmi_loop – The gemmi loop object containing the data

  • column_index (int) – The column index in the loop
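
A list subclass that fills itself on first access could look like the sketch below. It is a guess at the general technique, not the library's code: FakeLoop is a stand-in with a made-up layout (flat row-major values plus a column count) and does not reproduce gemmi's API, and LazyColumn is a hypothetical name.

```python
class FakeLoop:
    """Stand-in loop: flat row-major values and a column count (not gemmi)."""

    def __init__(self, width, values):
        self.width = width
        self.values = values


class LazyColumn(list):
    """Hypothetical list subclass that extracts its column on first access."""

    def __init__(self, loop, column_index):
        super().__init__()               # start empty; nothing is read yet
        self._loop = loop
        self._column_index = column_index
        self._loaded = False

    def _ensure_loaded(self):
        if not self._loaded:
            self._loaded = True
            # Stride through the flat values to pick out one column.
            start, step = self._column_index, self._loop.width
            super().extend(self._loop.values[start::step])

    # Only the methods overridden here trigger loading in this sketch.
    def __len__(self):
        self._ensure_loaded()
        return super().__len__()

    def __getitem__(self, index):
        self._ensure_loaded()
        return super().__getitem__(index)

    def __iter__(self):
        self._ensure_loaded()
        return super().__iter__()


loop = FakeLoop(width=2, values=["1", "N", "2", "C"])
column = LazyColumn(loop, column_index=1)
print(column._loaded)   # False
print(column[0])        # N
print(column._loaded)   # True
```

A complete wrapper would need to cover the remaining list methods as well, since built-in operations such as equality bypass the overrides shown here.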

class sloth.mmcif.models.LazyRowList[source]

Bases: object

A list-like object that creates Row objects only when accessed.

__init__(category, row_count)[source]
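
Parameters:
  • category – The Category whose rows are exposed.

  • row_count – The number of rows in the category.
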
class sloth.mmcif.models.LazyItemDict[source]

Bases: object

A dict-like object that only loads Item values when accessed, providing O(1) creation.

__init__(items)[source]
Parameters:

items (Dict[str, List[str] | Item])

keys()[source]
values()[source]
items()[source]
get(key, default=None)[source]
Parameters:

key (str)
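
The idea of a dict-like wrapper that defers work until a key is read can be sketched as below. ItemSketch and LazyItems are hypothetical names, and wrapping a raw list into an object on first access is an assumption about what loading means here, not a description of SLOTH's internals.

```python
class ItemSketch:
    """Hypothetical stand-in for a loaded item."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)


class LazyItems:
    """Hypothetical dict-like wrapper that materializes entries on access."""

    def __init__(self, items):
        self._items = items              # kept by reference: O(1) creation

    def __getitem__(self, key):
        value = self._items[key]
        if not isinstance(value, ItemSketch):
            value = ItemSketch(key, value)
            self._items[key] = value     # remember the wrapped object
        return value

    def __contains__(self, key):
        return key in self._items

    def __len__(self):
        return len(self._items)

    def keys(self):
        return self._items.keys()

    def values(self):
        return [self[key] for key in list(self._items)]

    def items(self):
        return [(key, self[key]) for key in list(self._items)]

    def get(self, key, default=None):
        return self[key] if key in self._items else default

    def is_materialized(self, key):
        return isinstance(self._items[key], ItemSketch)


lazy = LazyItems({"id": ["1", "2"], "type_symbol": ["N", "C"]})
print(lazy.is_materialized("id"))   # False
print(lazy["id"].values)            # ['1', '2']
print(lazy.is_materialized("id"))   # True
```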

class sloth.mmcif.models.LazyKeyList[source]

Bases: object

A list-like object that generates prefixed keys dynamically without storing them, providing O(1) creation.

__init__(collection, prefix='')[source]
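
Parameters:
  • collection – The collection whose keys are exposed.

  • prefix – The prefix added to each key. Defaults to '' (empty string).
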
index(item)[source]
Return type:

int

Parameters:

item (str)

count(item)[source]
Return type:

int

Parameters:

item (str)
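
A key list that adds its prefix on the fly, instead of storing prefixed copies, might be sketched like this. PrefixedKeys is a hypothetical class written for illustration; it assumes the underlying collection is a dict keyed by bare names, which the documentation does not state.

```python
class PrefixedKeys:
    """Hypothetical list-like view that prefixes keys when they are read."""

    def __init__(self, collection, prefix=""):
        self._collection = collection    # kept by reference: O(1) creation
        self._prefix = prefix

    def __len__(self):
        return len(self._collection)

    def __iter__(self):
        return (self._prefix + key for key in self._collection)

    def __getitem__(self, index):
        # Positional access walks the keys, so it is O(n) in this sketch.
        return self._prefix + list(self._collection)[index]

    def __contains__(self, item):
        if not item.startswith(self._prefix):
            return False
        return item[len(self._prefix):] in self._collection

    def index(self, item):
        for position, key in enumerate(self):
            if key == item:
                return position
        raise ValueError(f"{item!r} is not in list")

    def count(self, item):
        # Keys are unique, so the count is either 0 or 1.
        return 1 if item in self else 0


categories = {"atom_site": {}, "entity": {}}
names = PrefixedKeys(categories, prefix="_")
print(list(names))             # ['_atom_site', '_entity']
print(names.index("_entity"))  # 1
```

Because the view holds a reference rather than a copy, keys added to the underlying dict later appear in the list automatically.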