modelarrayio.utils.nifti.load_cohort_voxels
- modelarrayio.utils.nifti.load_cohort_voxels(cohort_long, group_mask_matrix, s3_workers)
Load all voxel rows from the cohort, optionally in parallel.
When s3_workers > 1, a ThreadPoolExecutor is used. Threads share memory, so group_mask_matrix is accessed directly with no copying overhead. Results arrive via as_completed, in completion order, and are indexed by (scalar_name, subj_idx) so that the final ordered lists are reconstructed correctly regardless of which load finishes first.
- Parameters:
cohort_long (pandas.DataFrame) – Long-format cohort dataframe with columns ‘scalar_name’, ‘source_file’, and ‘source_mask_file’.
group_mask_matrix (numpy.ndarray) – Boolean group mask array.
s3_workers (int) – Number of parallel workers for loading.
- Returns:
scalars (dict[str, list[np.ndarray]]) – Per-scalar ordered list of 1-D subject arrays, ready for stripe-write.
sources_lists (dict[str, list[str]]) – Per-scalar ordered list of source file paths (for HDF5 metadata).
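
The parallel strategy described above (submit one task per row, collect with as_completed, place each result by its (scalar_name, subj_idx) key) can be sketched as follows. This is a minimal illustration, not the library's implementation: load_one, load_all, and the job tuples are hypothetical names, the in-memory value lists stand in for real image files, the mask is 1-D for brevity, and the serial fallback for workers <= 1 is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

import numpy as np


def load_one(values, mask):
    """Hypothetical stand-in for reading one subject's image and masking it."""
    return np.asarray(values)[mask]


def load_all(jobs, mask, workers):
    """Load every job and return per-scalar lists ordered by subj_idx.

    jobs: list of (scalar_name, subj_idx, source_file, values) tuples.
    """
    # Pre-size the output lists so each result can be placed by index.
    counts = {}
    for scalar_name, _, _, _ in jobs:
        counts[scalar_name] = counts.get(scalar_name, 0) + 1
    scalars = {name: [None] * n for name, n in counts.items()}
    sources = {name: [None] * n for name, n in counts.items()}

    if workers > 1:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Map each future to its key; threads share `mask` without copying.
            futures = {
                pool.submit(load_one, values, mask): (scalar_name, subj_idx, source_file)
                for scalar_name, subj_idx, source_file, values in jobs
            }
            # Futures complete in arbitrary order; the key restores the order.
            for future in as_completed(futures):
                scalar_name, subj_idx, source_file = futures[future]
                scalars[scalar_name][subj_idx] = future.result()
                sources[scalar_name][subj_idx] = source_file
    else:
        # Assumed serial fallback when no parallelism is requested.
        for scalar_name, subj_idx, source_file, values in jobs:
            scalars[scalar_name][subj_idx] = load_one(values, mask)
            sources[scalar_name][subj_idx] = source_file

    return scalars, sources


mask = np.array([True, False, True, True])
jobs = [
    ("FA", 0, "sub-01_FA.nii.gz", [1.0, 2.0, 3.0, 4.0]),
    ("FA", 1, "sub-02_FA.nii.gz", [5.0, 6.0, 7.0, 8.0]),
    ("MD", 0, "sub-01_MD.nii.gz", [0.1, 0.2, 0.3, 0.4]),
]
scalars, sources = load_all(jobs, mask, workers=2)
```

Because every result is written into a pre-sized slot rather than appended, the returned lists have the same order whether the work runs in threads or serially.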