Usage Guide

This guide covers the main access patterns and data manipulation features of SLOTH.

Access Patterns

Dot Notation

The most natural way to access mmCIF data:

from sloth import MMCIFHandler

handler = MMCIFHandler()
mmcif = handler.read("1abc.cif")

block = mmcif.data_1ABC
atom_site = block._atom_site
print(atom_site.Cartn_x[0])

Dictionary Notation

Useful when category or field names are dynamic:

category_name = "_atom_site"
field_name = "Cartn_x"
x = mmcif.data[0][category_name][field_name]

Row-wise Access

Index a category to get a single row, or iterate over it to visit every row:

first_atom = atom_site[0]
print(first_atom.type_symbol, first_atom.Cartn_x)

# Iterate all rows
for atom in atom_site:
    print(atom.label_atom_id, atom.Cartn_x)

Column-wise Access

Access entire columns at once:

x_coords = atom_site.Cartn_x      # All X coordinates
atom_ids = atom_site.label_atom_id  # All atom labels
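
Row-wise and column-wise access can be two views over the same stored data. The sketch below shows one way to get both from a single column-oriented table. ToyColumnStore is a hypothetical class written for this guide, not SLOTH's Category, and it assumes values are stored as one list per item.

```python
from types import SimpleNamespace


class ToyColumnStore:
    """Hypothetical column-oriented table; not SLOTH's actual Category class."""

    def __init__(self, columns):
        # Maps item name -> list of values; all lists are assumed equal length.
        self._columns = dict(columns)

    @property
    def row_count(self):
        return len(next(iter(self._columns.values()), []))

    def __getattr__(self, name):
        # Column-wise access. Reached only when normal attribute lookup fails.
        columns = self.__dict__.get("_columns", {})
        try:
            return columns[name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, index):
        # Row-wise access: take the value at `index` from every column.
        return SimpleNamespace(
            **{name: values[index] for name, values in self._columns.items()}
        )

    def __iter__(self):
        return (self[i] for i in range(self.row_count))


atoms = ToyColumnStore({
    "label_atom_id": ["N", "CA"],
    "Cartn_x": ["10.123", "11.234"],
})
print(atoms.Cartn_x)            # whole column
print(atoms[0].label_atom_id)   # one field of one row
```

Storing columns and assembling rows on demand keeps column access cheap, which suits per-column operations such as averaging a coordinate.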

Filtering and Slicing

Use list comprehensions to select rows and generator expressions to aggregate column values:

# CA atoms from chain A
ca_atoms = [
    a for a in atom_site
    if a.label_atom_id == "CA" and a.label_asym_id == "A"
]

# Mean X coordinate
avg_x = sum(float(x) for x in atom_site.Cartn_x) / atom_site.row_count
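
The mean above assumes every value in the column parses as a number. In mmCIF files, "?" marks an unknown value and "." an inapplicable one, so a column read from a real file may contain either. Here is a minimal sketch of a tolerant mean, assuming column values arrive as strings. mean_numeric is a hypothetical helper, not part of SLOTH.

```python
# Hypothetical helper, not part of SLOTH: averages a column of mmCIF string
# values while skipping the CIF placeholders "?" (unknown) and "." (inapplicable).
MISSING = {"?", "."}


def mean_numeric(values):
    numbers = [float(v) for v in values if v not in MISSING]
    if not numbers:
        raise ValueError("no numeric values to average")
    return sum(numbers) / len(numbers)


x_column = ["10.0", "?", "12.0", "."]
print(mean_numeric(x_column))
```

Note that the divisor is the number of parsed values rather than the row count, so placeholders do not drag the mean toward zero.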

Iterating Over Structure

Enumerate all categories and items in a block:

for cat_name in block.categories:
    category = block[cat_name]
    for item_name in category.items:
        print(f"{cat_name}.{item_name}: {len(category[item_name])} values")

Data Creation

Instead of writing raw CIF text by hand:

sample = """data_1ABC
_entry.id 1ABC_STRUCTURE
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
ATOM 1 N 10.123 20.456 30.789
ATOM 2 C 11.234 21.567 31.890
"""
with open("sample.cif", "w") as f:
    f.write(sample)

…use SLOTH’s API to build structures programmatically.

Programmatic Creation

Build structures using the object model:

from sloth.mmcif import MMCIFDataContainer, DataBlock, Category

mmcif = MMCIFDataContainer()
block = DataBlock("1ABC")

cat = Category("_entry")
cat["id"] = ["1ABC_STRUCTURE"]
block["_entry"] = cat

mmcif["1ABC"] = block

Dot-based Auto-creation

The most concise approach: objects are created on the fly.

mmcif = MMCIFDataContainer()
mmcif.data_1ABC._entry.id = ["1ABC_STRUCTURE"]
mmcif.data_1ABC._atom_site.Cartn_x = ["10.1", "11.2"]

Safety Features

SLOTH provides several mechanisms to catch mistakes early without restricting the API.

Pending Proxies

Accessing a category or data block that does not yet exist returns a lightweight pending proxy instead of silently creating an empty object:

handler = MMCIFHandler()
mmcif = handler.read("1abc.cif")
block = mmcif.data_1ABC

pending = block._brand_new       # returns a _PendingCategory (not committed yet)
bool(pending)                     # False: nothing has been written

# Writing commits the proxy automatically
pending.id = ["NEW1"]            # category is now part of the block
bool(block._brand_new)           # True

Reading from a pending proxy raises AttributeError with fuzzy suggestions:

block._atm_site.Cartn_x          # AttributeError: ... Did you mean '_atom_site'?
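
The pending-proxy idea can be sketched with plain Python attribute hooks. The classes below are toys written for this guide and are not SLOTH's implementation: they only mirror the behaviour described above, where a proxy is falsy, commits on first write, and refuses reads.

```python
# Toy classes for illustration only; SLOTH's real classes may work differently.

class ToyCategory:
    """A committed category: maps item names to lists of values."""

    def __init__(self):
        object.__setattr__(self, "_items", {})

    def __setattr__(self, item, values):
        self._items[item] = list(values)

    def __getattr__(self, item):
        # Reached only when normal lookup fails, i.e. for item names.
        try:
            return self._items[item]
        except KeyError:
            raise AttributeError(f"no item {item!r}") from None


class ToyPending:
    """Placeholder for a category that has not been written to yet."""

    def __init__(self, block, name):
        object.__setattr__(self, "_block", block)
        object.__setattr__(self, "_name", name)

    def __bool__(self):
        return False  # nothing has been committed

    def __setattr__(self, item, values):
        # First write: create the real category in the block, then store.
        category = self._block._categories.setdefault(self._name, ToyCategory())
        setattr(category, item, values)

    def __getattr__(self, item):
        raise AttributeError(f"category {self._name!r} does not exist yet")


class ToyBlock:
    def __init__(self):
        self._categories = {}

    def __getattr__(self, name):
        # Existing categories are returned as-is; unknown names get a proxy.
        categories = self.__dict__["_categories"]
        if name in categories:
            return categories[name]
        return ToyPending(self, name)


block = ToyBlock()
pending = block._brand_new
print(bool(pending))         # the proxy is falsy and the block is unchanged
pending.id = ["NEW1"]        # the write commits the category
print(block._brand_new.id)
```

Deferring creation until the first write means a mistyped name in a read-only expression never leaves an empty category behind.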

Schema Warnings

When you assign data to a category or item that is not in the bundled mmCIF dictionary, SLOTH emits a SchemaWarning:

import warnings
from sloth.mmcif import SchemaWarning

with warnings.catch_warnings(record=True) as w:
    warnings.simplefilter("always")
    block._atom_site.my_custom_field = ["x"]
    # w[0].category == SchemaWarning
    # "Item 'my_custom_field' is not in the mmCIF dictionary. Did you mean ...?"

To suppress these warnings:

warnings.filterwarnings("ignore", category=SchemaWarning)
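
The warn-but-allow behaviour relies only on Python's standard warnings module. The sketch below reproduces the pattern without SLOTH. ToySchemaWarning, KNOWN_ITEMS, and check_item are hypothetical names, and the set of known items is a tiny assumed excerpt rather than the bundled dictionary.

```python
import warnings


class ToySchemaWarning(UserWarning):
    """Hypothetical stand-in for SLOTH's SchemaWarning."""


# Assumed, tiny excerpt of _atom_site items; the real dictionary is far larger.
KNOWN_ITEMS = {"id", "type_symbol", "Cartn_x", "Cartn_y", "Cartn_z"}


def check_item(name):
    """Warn, but do not fail, when an item name is not in the known set."""
    if name not in KNOWN_ITEMS:
        warnings.warn(
            f"Item {name!r} is not in the mmCIF dictionary.",
            ToySchemaWarning,
            stacklevel=2,
        )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_item("Cartn_x")          # known item: silent
    check_item("my_custom_field")  # unknown item: one warning recorded

print([str(w.message) for w in caught])
```

Because the check emits a warning rather than raising, custom items remain usable, and callers who want strictness can turn the warning into an error with warnings.simplefilter("error", ToySchemaWarning).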

Tab Completion

All three model classes (MMCIFDataContainer, DataBlock, and Category) implement __dir__() to expose item names, category names, block names, and registered plugin names. This enables tab completion in IPython, Jupyter, and any IDE that introspects __dir__.
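
As a rough illustration of the mechanism, the sketch below shows how a class with dynamic attributes can advertise them through __dir__. CompletableCategory is a hypothetical class, not SLOTH's code.

```python
class CompletableCategory:
    """Hypothetical class showing __dir__-based completion; not SLOTH's code."""

    def __init__(self, items):
        self._items = dict(items)

    def __getattr__(self, name):
        # Reached only when normal lookup fails, i.e. for dynamic item names.
        try:
            return self.__dict__["_items"][name]
        except KeyError:
            raise AttributeError(name) from None

    def __dir__(self):
        # Start from the default listing, then add the dynamic item names.
        return sorted(set(super().__dir__()) | set(self._items))


cat = CompletableCategory({"Cartn_x": ["10.1"], "label_atom_id": ["N"]})
print([name for name in dir(cat) if not name.startswith("_")])
```

Without the __dir__ override, names served by __getattr__ would still resolve but would not appear in completion lists.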

Fuzzy Matching

AttributeError messages on Category, DataBlock, and MMCIFDataContainer include “Did you mean …?” suggestions powered by difflib.get_close_matches:

block._atom_site.Cartn_X  # AttributeError: ... Did you mean 'Cartn_x'?
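
To show how such a suggestion can be produced, here is a minimal sketch built directly on difflib.get_close_matches. The suggest and lookup helpers and the exact message wording are assumptions made for illustration. SLOTH's own messages may differ.

```python
from difflib import get_close_matches


def suggest(name, known_names):
    """Return the closest known name, or None if nothing is similar enough."""
    matches = get_close_matches(name, list(known_names), n=1, cutoff=0.6)
    return matches[0] if matches else None


def lookup(categories, name):
    # `categories` is a plain dict here; the error text is an assumed format.
    if name in categories:
        return categories[name]
    message = f"no category {name!r}"
    hint = suggest(name, categories)
    if hint is not None:
        message += f". Did you mean {hint!r}?"
    raise AttributeError(message)


categories = {"_atom_site": [], "_entry": [], "_cell": []}
try:
    lookup(categories, "_atm_site")
except AttributeError as exc:
    print(exc)
```

The cutoff of 0.6 is difflib's default similarity threshold. Raising it yields fewer but more confident suggestions.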

Deleting Data

Remove categories from a block or items from a category using del or the .delete() method.

Using del

# Delete a category
del block._obsolete_category

# Delete an item from a category
del block._atom_site.auth_asym_id

Using .delete()

The string-based API is useful when the name is dynamic:

block.delete("_obsolete_category")
block._atom_site.delete("auth_asym_id")

Both raise an error if the target does not exist (AttributeError for del, KeyError for .delete()).
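
The split between the two exception types follows Python convention: attribute operations report AttributeError, while string-keyed operations report KeyError. The sketch below shows that split on a hypothetical DeletableCategory class, which is not SLOTH's implementation.

```python
class DeletableCategory:
    """Hypothetical class; mirrors only the error behaviour described above."""

    def __init__(self, items):
        self._items = dict(items)

    def delete(self, name):
        # String-based removal: a missing name surfaces as KeyError.
        del self._items[name]

    def __delattr__(self, name):
        # `del obj.name` is an attribute operation, so report AttributeError.
        try:
            self.delete(name)
        except KeyError:
            raise AttributeError(f"no item {name!r}") from None


cat = DeletableCategory({"auth_asym_id": ["A"], "Cartn_x": ["10.1"]})
del cat.auth_asym_id       # attribute style
cat.delete("Cartn_x")      # string style
print(cat._items)
```

When the name comes from user input or a loop variable, the .delete() form avoids building attribute access with delattr().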