modelarrayio.utils.misc.load_and_normalize_cohort
- modelarrayio.utils.misc.load_and_normalize_cohort(cohort_file, scalar_columns=None) → tuple[DataFrame, str]
Load a cohort CSV, normalise it, and detect the neuroimaging modality.
This is the single entry-point for cohort ingestion shared by all
*_to_h5 converters. It performs, in order:
1. pd.read_csv the file.
2. cohort_to_long_dataframe to normalise wide/long format.
3. Empty-cohort validation.
4. Modality detection from every unique source_file extension.
5. Mixed-modality validation (all rows must be the same modality).
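The validation and detection steps listed above can be illustrated with a minimal, standard-library sketch. It is not the library's implementation: the names ASSUMED_EXTENSIONS, guess_modality, and detect_cohort_modality are hypothetical, and the extension table is an assumption about which suffixes map to each modality.

```python
# Assumed suffix table (not taken from modelarrayio). CIFTI comes first
# because CIFTI-2 files also end in ".nii" and would otherwise match NIfTI.
ASSUMED_EXTENSIONS = (
    ("cifti", (".dscalar.nii", ".dtseries.nii", ".pscalar.nii")),
    ("mif", (".mif", ".mif.gz")),
    ("nifti", (".nii", ".nii.gz")),
)


def guess_modality(source_file):
    """Map one file name to a modality string, or raise ValueError."""
    name = str(source_file).lower()
    for modality, suffixes in ASSUMED_EXTENSIONS:
        # str.endswith accepts a tuple of candidate suffixes.
        if name.endswith(suffixes):
            return modality
    raise ValueError(f"Unrecognised extension: {source_file}")


def detect_cohort_modality(source_files):
    """Hypothetical stand-in for the empty, detection, and mixed-modality checks."""
    unique_files = set(source_files)
    if not unique_files:
        raise ValueError("Cohort is empty")
    # Detect a modality for every unique source file.
    modalities = {guess_modality(f) for f in unique_files}
    if len(modalities) > 1:
        raise ValueError(f"Mixed modalities: {sorted(modalities)}")
    return modalities.pop()
```

Checking the more specific suffixes first is the main design point of the sketch: a plain ".nii" test alone could not distinguish CIFTI from NIfTI files.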
- Parameters:
cohort_file (path-like) – Path to the cohort CSV file.
scalar_columns (list of str, optional) – Column names for wide-format cohort files. If omitted, the CSV must already contain scalar_name and source_file columns.
- Returns:
cohort_long (pandas.DataFrame) – Normalised long-format cohort dataframe.
modality (str) – Detected modality: 'nifti', 'mif', or 'cifti'.
- Raises:
ValueError – If the cohort is empty after normalisation, if a source file has an unrecognised extension, or if the cohort contains mixed modalities.
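A possible calling pattern, based only on the signature, return values, and ValueError behaviour documented above, is sketched below. The wrapper name load_cohort_or_exit is hypothetical, and the import is done lazily so the function can be defined even where modelarrayio is not installed.

```python
import sys


def load_cohort_or_exit(cohort_file, scalar_columns=None):
    """Hypothetical wrapper: load a cohort or exit with a readable message."""
    # Lazy import: only needed when the wrapper is actually called.
    from modelarrayio.utils.misc import load_and_normalize_cohort

    try:
        cohort_long, modality = load_and_normalize_cohort(
            cohort_file, scalar_columns=scalar_columns
        )
    except ValueError as err:
        # Documented causes: empty cohort, unrecognised extension, mixed modalities.
        sys.exit(f"Invalid cohort file {cohort_file}: {err}")

    print(f"Loaded {len(cohort_long)} rows; detected modality: {modality}")
    return cohort_long, modality
```

Because all three documented failure modes raise ValueError, a single except clause is enough for a command-line caller that only needs to report the problem and stop.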