modelarrayio.cli.mif_to_h5.mif_to_h5

modelarrayio.cli.mif_to_h5.mif_to_h5(index_file, directions_file, cohort_long, backend='hdf5', output=PosixPath('fixelarray.h5'), storage_dtype='float32', compression='gzip', compression_level=4, shuffle=True, chunk_voxels=0, target_chunk_mb=2.0, workers=1, s3_workers=1, split_outputs=False)

Load all fixeldb data and write to an HDF5 or TileDB file.

Parameters:
  • index_file (pathlib.Path) – Path to a Nifti2 index file

  • directions_file (pathlib.Path) – Path to a Nifti2 directions file

  • cohort_long (pandas.DataFrame) – Normalised long-format cohort dataframe (from load_and_normalize_cohort()).

  • backend (str) – Backend to use for storage ('hdf5' or 'tiledb')

  • output (pathlib.Path) – Output path. For the hdf5 backend, path to an .h5 file; for the tiledb backend, path to a .tdb directory.

  • storage_dtype (str) – Floating-point dtype used to store values. Default 'float32'.

  • compression (str) – Compression filter. gzip works for both backends; lzf is HDF5-only; zstd is TileDB-only.

  • compression_level (int) – Compression level (codec-dependent)

  • shuffle (bool) – Enable shuffle filter

  • chunk_voxels (int) – Chunk/tile size along the fixel axis (0 = auto)

  • target_chunk_mb (float) – Target chunk/tile size in MiB, used to auto-compute the chunk length along the fixel axis when chunk_voxels is 0

  • workers (int) – Maximum number of parallel TileDB write workers. Default 1. Has no effect when backend='hdf5'.

  • s3_workers (int) – Number of parallel workers for S3 downloads. Default 1.

  • split_outputs (bool) – If True, write one output file per scalar. Default False.
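
The codec constraints listed for compression can be checked before a long conversion starts. The following is a minimal sketch: check_compression and the SUPPORTED table are hypothetical helpers, not part of modelarrayio, and they encode only the three codecs named in the parameter description. The library itself may accept other values.

```python
# Hypothetical pre-flight check; NOT part of modelarrayio.
# Encodes only the codec/backend pairs stated in the parameter list.
SUPPORTED = {
    "hdf5": {"gzip", "lzf"},
    "tiledb": {"gzip", "zstd"},
}


def check_compression(backend: str, compression: str) -> None:
    """Raise ValueError if the codec is not documented for the backend."""
    if backend not in SUPPORTED:
        raise ValueError(f"unknown backend: {backend!r}")
    if compression not in SUPPORTED[backend]:
        allowed = ", ".join(sorted(SUPPORTED[backend]))
        raise ValueError(
            f"{compression!r} is not documented for backend {backend!r}; "
            f"expected one of: {allowed}"
        )


check_compression("hdf5", "lzf")     # passes silently
check_compression("tiledb", "zstd")  # passes silently
```

Calling check_compression("hdf5", "zstd") raises ValueError, since zstd is documented as TileDB-only.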

Returns:
  status – 0 if successful, 1 if failed.

Return type:
  int
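
Because the function reports failure through its integer return value, a caller may want to convert a non-zero status into an exception. The sketch below shows one way to do that. The convert wrapper and the options dictionary are illustrative assumptions; only the import path, the argument names, and the 0/1 status convention come from the documentation above.

```python
from pathlib import Path

# Keyword arguments taken from the documented signature.
options = dict(
    backend="hdf5",
    output=Path("fixelarray.h5"),
    compression="gzip",
    compression_level=4,
    chunk_voxels=0,       # 0 = auto
    target_chunk_mb=2.0,
)


def convert(index_file, directions_file, cohort_long, **kwargs):
    """Hypothetical wrapper: call mif_to_h5 and raise on a non-zero status."""
    # Imported lazily so this module loads even without modelarrayio installed.
    from modelarrayio.cli.mif_to_h5 import mif_to_h5

    status = mif_to_h5(
        Path(index_file), Path(directions_file), cohort_long, **kwargs
    )
    if status != 0:
        raise RuntimeError(f"mif_to_h5 failed with status {status}")
    # Fall back to the documented default output path.
    return kwargs.get("output", Path("fixelarray.h5"))
```

A call would then look like convert(index_path, directions_path, cohort_long, **options), where cohort_long is the normalised dataframe produced by load_and_normalize_cohort() and the two paths are supplied by the caller.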