ModelArrayIO

ModelArrayIO is a Python package that converts between neuroimaging formats (fixel .mif, voxel NIfTI, CIFTI-2 dscalar/pscalar/pconn) and the HDF5 (.h5) layout used by the R package ModelArray. It can also write ModelArray statistical results back to imaging formats.

Relationship to ConFixel: The earlier project ConFixel is superseded by ModelArrayIO. The ConFixel repository is retained for history (including links from publications) and will be archived; new work should use this repository.

Installation and usage documentation: ModelArrayIO on GitHub (this README). For conda setup, HDF5 libraries, and installing the ModelArray R package, see the Installation vignette in the ModelArray documentation.

Overview

ModelArrayIO covers three converter areas (fixel .mif, voxel NIfTI, and CIFTI-2), each with import and export support. Once ModelArrayIO is installed, these commands are available in your terminal:

  • Neuroimaging data (CIFTI, NIfTI, or MRtrix .mif):

    • Neuroimaging → .h5: modelarrayio to-modelarray

    • .h5 → Neuroimaging: modelarrayio export-results

Storage backends: HDF5 and TileDB

ModelArrayIO supports two on-disk backends for the subject-by-element matrix:

  • HDF5 (default), implemented in modelarrayio/h5_storage.py

  • TileDB, implemented in modelarrayio/tiledb_storage.py

Both backends expose a similar API:

  • create a dense 2D array (subjects, items) and write all values at once

  • create an empty array with the same shape and write by column stripes

  • write/read column names alongside the data
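
The three operations listed above can be illustrated with a small in-memory stand-in. This is a minimal sketch only: the class InMemoryMatrix, its methods, and the helper column_stripes are hypothetical names invented for illustration, not the functions in modelarrayio/h5_storage.py or modelarrayio/tiledb_storage.py. It shows that writing the whole matrix at once and writing it stripe by stripe should produce the same result.

```python
from typing import Iterator, List, Sequence, Tuple


class InMemoryMatrix:
    """Hypothetical stand-in for a storage backend (not ModelArrayIO's real API)."""

    def __init__(self, n_subjects: int, n_items: int, column_names: Sequence[str]):
        if len(column_names) != n_items:
            raise ValueError("need exactly one column name per item")
        self.shape = (n_subjects, n_items)
        # Column names are kept alongside the data, one per item.
        self.column_names = list(column_names)
        # Rows are subjects, columns are items; zero-filled until written.
        self.data: List[List[float]] = [[0.0] * n_items for _ in range(n_subjects)]

    def write_all(self, values: Sequence[Sequence[float]]) -> None:
        """Write the full dense matrix in one call."""
        self.write_column_stripe(0, values)

    def write_column_stripe(self, start: int, stripe: Sequence[Sequence[float]]) -> None:
        """Write a block covering all subjects and columns [start, start + width)."""
        if len(stripe) != self.shape[0]:
            raise ValueError("stripe must have one row per subject")
        width = len(stripe[0])
        if start < 0 or start + width > self.shape[1]:
            raise ValueError("stripe exceeds the column range")
        for row, new_values in zip(self.data, stripe):
            if len(new_values) != width:
                raise ValueError("all stripe rows must have the same width")
            row[start:start + width] = new_values


def column_stripes(n_items: int, stripe_width: int) -> Iterator[Tuple[int, int]]:
    """Yield zero-based, half-open (start, stop) column ranges."""
    for start in range(0, n_items, stripe_width):
        yield start, min(start + stripe_width, n_items)


# Toy data: 3 subjects x 5 items, value = 10 * subject + item.
source = [[float(10 * s + i) for i in range(5)] for s in range(3)]
names = [f"item_{i}" for i in range(5)]

# Pattern 1: create the array and write all values at once.
dense = InMemoryMatrix(3, 5, names)
dense.write_all(source)

# Pattern 2: create an empty array of the same shape, then fill it by column stripes.
striped = InMemoryMatrix(3, 5, names)
for start, stop in column_stripes(5, stripe_width=2):
    striped.write_column_stripe(start, [row[start:stop] for row in source])
```

Stripe-wise writing is mainly useful when the full subjects-by-items matrix does not fit in memory: only one stripe of columns has to be held at a time.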

Notes and minor differences:

  • Chunking vs tiling: HDF5 uses chunks; TileDB uses tiles. Tile sizes are computed analogously to chunk sizes so that write and read patterns stay similar across backends.

  • Compression: HDF5 uses gzip by default; TileDB defaults to zstd with shuffle for a better balance of speed and compression ratio. You can switch TileDB to gzip for parity with HDF5.

  • Metadata: HDF5 stores column_names as a dataset attribute; TileDB stores names as JSON metadata on the array/group.

  • Layout: Both backends keep dimensions in the same order and use zero-based indices.
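
To make the chunking and metadata notes above more concrete, here is a rough sketch of one way a block shape could be derived and then reused for both backends, and of how column names round-trip through JSON. The function stripe_shape, its parameters, and the 1 MiB target are assumptions made for illustration; the heuristic ModelArrayIO actually uses may differ.

```python
import json
from typing import Tuple


def stripe_shape(
    n_subjects: int,
    n_items: int,
    itemsize: int = 4,              # bytes per value, e.g. float32 (assumed)
    target_bytes: int = 1_048_576,  # aim for roughly 1 MiB per block (assumed)
) -> Tuple[int, int]:
    """Pick a (rows, cols) block that spans all subjects and is near target_bytes."""
    if n_subjects < 1 or n_items < 1:
        raise ValueError("array dimensions must be positive")
    bytes_per_column = n_subjects * itemsize
    cols = max(1, target_bytes // bytes_per_column)
    return n_subjects, min(cols, n_items)


# The same shape could serve as the HDF5 chunk shape and as the TileDB tile
# extents, which keeps write and read patterns comparable between backends.
chunk_shape = stripe_shape(n_subjects=200, n_items=600_000)
tile_extents = chunk_shape

# Column names stored as JSON metadata round-trip without loss.
names = ["item_0", "item_1", "item_2"]
names_json = json.dumps(names)
restored = json.loads(names_json)
```

Spanning all subjects in each block matches the column-stripe access pattern: reading or writing one stripe of items touches a small number of whole chunks or tiles.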