A high-dimensional multinomial logit model

Research output: Contribution to journal › Article › Research › peer-review


The number of parameters in a standard multinomial logit model increases linearly with the number of choice alternatives and number of explanatory variables. Because many modern applications involve large choice sets with categorical explanatory variables, which enter the model as large sets of binary dummies, the number of parameters in a multinomial logit model is often large. This paper proposes a new method for data-driven two-way parameter clustering over outcome categories and explanatory dummy categories in a multinomial logit model. A Bayesian Dirichlet process mixture model encourages parameters to cluster over the categories, which reduces the number of unique model parameters and provides interpretable clusters of categories. In an empirical application, we estimate the holiday preferences of 11 household types over 49 holiday destinations and identify a small number of household segments with different preferences across clusters of holiday destinations.
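To make the dimensionality argument concrete, the following minimal sketch counts the free coefficients of a standard multinomial logit for the dimensions mentioned in the abstract (49 destinations, 11 household types), and then shows how two-way cluster labels can map a small matrix of unique values onto the full coefficient matrix. It is an illustration only, not the paper's estimation procedure: the normalisation of one base alternative, the cluster counts (3 and 5), the random labels, and the helper names `mnl_param_count` and `expand_clustered` are all assumptions introduced here. In the paper the clustering is inferred from the data through a Dirichlet process mixture rather than fixed in advance.

```python
import numpy as np


def mnl_param_count(n_alternatives, n_regressors):
    """Free coefficients in a standard MNL, assuming one base
    alternative has its coefficients normalised to zero."""
    return (n_alternatives - 1) * n_regressors


def expand_clustered(unique_params, row_labels, col_labels):
    """Build a full coefficient matrix from a small matrix of unique values.

    Entry (k, j) of the result equals
    unique_params[row_labels[k], col_labels[j]].
    """
    return unique_params[np.ix_(row_labels, col_labels)]


# Dimensions taken from the abstract.
n_types, n_dest = 11, 49
full_count = mnl_param_count(n_dest, n_types)  # 48 * 11 = 528

# Hypothetical cluster structure, chosen only for illustration.
rng = np.random.default_rng(0)
n_type_clusters, n_dest_clusters = 3, 5
row_labels = rng.integers(0, n_type_clusters, size=n_types)
col_labels = rng.integers(0, n_dest_clusters, size=n_dest)
unique_params = rng.normal(size=(n_type_clusters, n_dest_clusters))

# Full 11 x 49 matrix that contains at most 3 * 5 distinct values.
beta = expand_clustered(unique_params, row_labels, col_labels)

print("unclustered coefficients:", full_count)
print("distinct values after clustering:", len(np.unique(beta)))
```

Under these assumptions, the unclustered model carries several hundred coefficients, while the clustered matrix is fully described by the small table of unique values plus the two label vectors, which is what makes the resulting clusters of categories interpretable.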

Original language: English
Number of pages: 17
Journal: Journal of Applied Econometrics
Publication status: Accepted/In press - 2024


  • Dirichlet process prior
  • high-dimensional models
  • large choice sets
  • multinomial logit model
