NestedLogit.preprocess_model_data
- NestedLogit.preprocess_model_data(choice_df, utility_equations)
Pre-process the model initiation inputs into a format that can be used by the PyMC model.
This method prepares the 3D design matrix X, the fixed covariate matrix F (if applicable), and the encoded response vector y, while also extracting and storing relevant metadata such as alternatives, fixed covariate names, product index mappings, and nesting structures.
- Parameters:
- choice_df
pd.DataFrame A pandas DataFrame containing the observed choices and covariates for each alternative. Each row represents an individual choice observation.
- utility_equations
list[str] A list of model formulas, one per alternative. Each formula should be of the form:
"alt_name ~ alt_covariates | fixed_covariates". The left-hand side identifies the alternative name; the right-hand side specifies the covariates used to explain utility for that alternative.
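To make the formula layout concrete, the sketch below splits strings of the form "alt_name ~ alt_covariates | fixed_covariates" into their three parts using plain string operations. It is only an illustration of the documented layout, not the parser the library uses; the helper name split_utility_equation and the example column names are hypothetical.

```python
def split_utility_equation(equation: str) -> tuple[str, list[str], list[str]]:
    """Split 'alt ~ x1 + x2 | f1 + f2' into (alt, [x1, x2], [f1, f2]).

    Hypothetical helper for illustration only.
    """
    # Left of "~" is the alternative name, right of it the covariates.
    target, _, rhs = equation.partition("~")
    # Left of "|" are alternative-specific covariates, right of it fixed ones.
    alt_part, _, fixed_part = rhs.partition("|")
    alt_covs = [term.strip() for term in alt_part.split("+") if term.strip()]
    fixed_covs = [term.strip() for term in fixed_part.split("+") if term.strip()]
    return target.strip(), alt_covs, fixed_covs


# One formula per alternative (column names are made up for this example).
utility_equations = [
    "bus ~ bus_cost + bus_time | income",
    "car ~ car_cost + car_time | income",
    "train ~ train_cost + train_time | income",
]

parsed = [split_utility_equation(eq) for eq in utility_equations]
alternatives = [alt for alt, _, _ in parsed]
```

A formula without a "|" section, such as "walk ~ walk_time", yields an empty list of fixed covariates under this sketch.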
- Returns:
- X
np.ndarray A 3D numpy array of shape (n_observations, n_alternatives, n_covariates), representing the covariate tensor for alternative-specific attributes.
- F
np.ndarray|None A 2D numpy array of shape (n_observations, n_fixed_covariates) for covariates shared across alternatives, or None if no such covariates are used.
- y
np.ndarray A 1D numpy array of encoded target labels (integers), where each entry represents the chosen alternative for an observation.
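The shapes of the three return values can be illustrated by building them by hand with NumPy and pandas. This is a minimal sketch under assumed column names (choice, bus_cost, income, and so on); it shows the documented shapes only and does not reproduce how the method constructs its outputs internally.

```python
import numpy as np
import pandas as pd

# Toy data: one row per choice observation (all column names are assumptions).
choice_df = pd.DataFrame(
    {
        "choice": ["bus", "car", "car", "train"],
        "bus_cost": [2.0, 2.5, 2.0, 3.0],
        "car_cost": [5.0, 4.0, 4.5, 6.0],
        "train_cost": [3.0, 3.5, 3.0, 2.5],
        "bus_time": [40.0, 35.0, 50.0, 45.0],
        "car_time": [20.0, 25.0, 30.0, 35.0],
        "train_time": [30.0, 30.0, 25.0, 20.0],
        "income": [30.0, 55.0, 70.0, 40.0],
    }
)
alternatives = ["bus", "car", "train"]
alt_covariates = ["cost", "time"]

# X: one (n_observations, n_covariates) slab per alternative, stacked on axis 1.
X = np.stack(
    [
        choice_df[[f"{alt}_{cov}" for cov in alt_covariates]].to_numpy()
        for alt in alternatives
    ],
    axis=1,
)

# F: covariates shared across alternatives, shape (n_observations, n_fixed_covariates).
F = choice_df[["income"]].to_numpy()

# y: integer code of the chosen alternative, following the order of `alternatives`.
y = pd.Categorical(choice_df["choice"], categories=alternatives).codes
```

Here X has shape (4, 3, 2), F has shape (4, 1), and y holds one integer per observation.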
Notes
- Updates internal state: assigns X_data, F, alternatives, fixed_covar, y, prod_indices, nest_indices, all_nests, lambda_lkup, and coords.
- Handles multi-level nesting structures if provided in self.nesting_structure.
- Assumes the existence of instance attributes depvar, covariates, and nesting_structure.
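As a rough illustration of what index mappings for a nesting structure might look like, the sketch below derives positions of alternatives within nests. This page does not specify the layout of nesting_structure or the contents of prod_indices and nest_indices, so the dictionary format (nest name mapped to a list of alternatives) and all variable names here are assumptions, shown for a single level of nesting only.

```python
alternatives = ["bus", "car", "train"]

# Assumed layout: nest name -> list of alternatives in that nest.
nesting_structure = {"road": ["bus", "car"], "rail": ["train"]}

# Ordered list of nest names.
nests = list(nesting_structure)

# Reverse lookup: which nest does each alternative belong to?
alt_to_nest = {
    alt: nest for nest, members in nesting_structure.items() for alt in members
}

# For each alternative (in order), the integer index of its nest.
nest_index_per_alt = [nests.index(alt_to_nest[alt]) for alt in alternatives]

# For each nest, the positions of its member alternatives.
members_by_nest = {
    nest: [alternatives.index(alt) for alt in members]
    for nest, members in nesting_structure.items()
}
```

Integer lookups of this kind are what allow per-nest quantities to be gathered with array indexing rather than name matching.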