modelarrayio.cli.nifti_to_h5.nifti_to_h5
- modelarrayio.cli.nifti_to_h5.nifti_to_h5(group_mask_file, cohort_long, backend='hdf5', output=PosixPath('voxelarray.h5'), storage_dtype='float32', compression='gzip', compression_level=4, shuffle=True, chunk_voxels=0, target_chunk_mb=2.0, workers=1, s3_workers=1, split_outputs=False)
Load all volume data and write to an HDF5 or TileDB file.
- Parameters:
group_mask_file (str) – Path to a NIfTI-1 binary group mask file.
cohort_long (pandas.DataFrame) – Normalised long-format cohort dataframe (from load_and_normalize_cohort()).
backend (str) – Storage backend ('hdf5' or 'tiledb').
output (pathlib.Path) – Output path. For the hdf5 backend, path to an .h5 file; for the tiledb backend, path to a .tdb directory.
storage_dtype (str) – Floating-point type used to store values. Options: 'float32' (default), 'float64'.
compression (str) – Compression filter. gzip works for both backends; lzf is HDF5-only; zstd is TileDB-only.
compression_level (int) – Compression level (codec-dependent). Default 4.
shuffle (bool) – Enable shuffle filter. Default True.
chunk_voxels (int) – Chunk/tile size along the voxel axis. If 0, auto-compute. Default 0.
target_chunk_mb (float) – Target chunk/tile size in MiB when auto-computing. Default 2.0.
workers (int) – Maximum number of parallel TileDB write workers. Default 1. Has no effect when backend='hdf5'.
s3_workers (int) – Number of parallel workers for S3 downloads. Default 1.
split_outputs (bool) – If True, write one output file per scalar. Default False.
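
The parameter list above implies some constraints between options, for example that zstd cannot be combined with the hdf5 backend. The following is a minimal sketch of how a caller might check those combinations before starting a long conversion, and how a chunk size could be estimated from target_chunk_mb. The helper names check_options and estimate_chunk_voxels are hypothetical and not part of modelarrayio, and the sizing formula assumes that one chunk spans every subject for a block of voxels, which may differ from what the library actually computes when chunk_voxels=0.

```python
# Codec support per backend, taken from the `compression` parameter description.
SUPPORTED_CODECS = {
    "hdf5": {"gzip", "lzf"},
    "tiledb": {"gzip", "zstd"},
}

# Bytes per stored value for the two documented storage dtypes.
BYTES_PER_VALUE = {"float32": 4, "float64": 8}


def check_options(backend, compression, storage_dtype="float32"):
    """Hypothetical helper: reject option combinations the docs rule out."""
    if backend not in SUPPORTED_CODECS:
        raise ValueError(f"backend must be 'hdf5' or 'tiledb', got {backend!r}")
    if compression not in SUPPORTED_CODECS[backend]:
        raise ValueError(
            f"compression {compression!r} is not available for backend {backend!r}"
        )
    if storage_dtype not in BYTES_PER_VALUE:
        raise ValueError(f"storage_dtype must be 'float32' or 'float64', got {storage_dtype!r}")


def estimate_chunk_voxels(n_subjects, storage_dtype="float32", target_chunk_mb=2.0):
    """Hypothetical helper: voxels per chunk so one chunk is about target_chunk_mb MiB.

    Assumption: a chunk holds all subjects for a contiguous run of voxels.
    """
    target_bytes = int(target_chunk_mb * 1024 * 1024)
    bytes_per_voxel_column = n_subjects * BYTES_PER_VALUE[storage_dtype]
    # Never return fewer than one voxel per chunk.
    return max(1, target_bytes // bytes_per_voxel_column)


# Example: validate an HDF5 + lzf configuration for a 100-subject cohort.
check_options("hdf5", "lzf", "float32")
chunk = estimate_chunk_voxels(n_subjects=100, storage_dtype="float32", target_chunk_mb=2.0)
```

Under these assumptions, the resulting value could be passed explicitly as chunk_voxels=chunk alongside backend='hdf5' and compression='lzf'; leaving chunk_voxels=0 defers the choice to the library's own auto-computation.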