Research Article

Identification and Estimation of Discrete Choice Models with Unobserved Choice Sets

Published online: 22 May 2024
 

Abstract

We propose a framework for nonparametric identification and estimation of discrete choice models with unobserved choice sets. We recover the joint distribution of choice sets and preferences from a cross-section of repeated choices. We assume that either the latent choice sets are sparse or that the number of repeated choices is sufficiently large. Sparsity requires the number of possible choice sets to be relatively small. It is satisfied, for instance, when the choice sets are nested or when they form a partition. Our estimation procedure is computationally fast and uses mixed-integer programming to recover the sparse support of choice sets. Analyzing the ready-to-eat cereal industry using a household scanner dataset, we find that ignoring the unobservability of choice sets can lead to incorrect estimates of preferences.
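To make the sparsity condition concrete, the sketch below counts how many choice sets are possible without restrictions and checks two families of the kind the abstract mentions (nested sets and a partition). The function names and the five-alternative example are illustrative assumptions, not taken from the article.

```python
from itertools import combinations

def all_nonempty_subsets(alternatives):
    """Every nonempty subset of the alternatives: 2^Y - 1 of them."""
    items = sorted(alternatives)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def is_nested(family):
    """True if the sets form a strictly increasing chain under inclusion."""
    ordered = sorted(family, key=len)
    return all(a < b for a, b in zip(ordered, ordered[1:]))

def is_partition(family, alternatives):
    """True if the sets are pairwise disjoint and cover all alternatives."""
    sets = list(family)
    disjoint = all(a.isdisjoint(b) for a, b in combinations(sets, 2))
    covers = frozenset().union(*sets) == frozenset(alternatives)
    return disjoint and covers

# Hypothetical example with Y = 5 alternatives labelled 1..5.
alternatives = range(1, 6)
unrestricted = all_nonempty_subsets(alternatives)
nested = [frozenset({5}), frozenset({4, 5}), frozenset({3, 4, 5})]
partition = [frozenset({1, 2}), frozenset({3}), frozenset({4, 5})]

print(len(unrestricted), len(nested), len(partition))
print(is_nested(nested), is_partition(partition, alternatives))
```

A nested family or a partition of Y alternatives contains at most Y sets, far fewer than the 2^Y - 1 sets that are possible without restrictions.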

Acknowledgments

We would like to thank Roy Allen, Daniel Chaves, Mingshi Kang, Yuichi Kitamura, Mathieu Marcoux, Salvador Navarro, Joris Pinkse, David Rivers, Bruno Salcedo, Susanne Schennach, Tomasz Strzalecki, and David Wei for their helpful comments and suggestions. Researcher(s)’ own analyses calculated (or derived) based in part on data from Nielsen Consumer LLC and marketing databases provided through the NielsenIQ Datasets at the Kilts Center for Marketing Data Center at The University of Chicago Booth School of Business. The conclusions drawn from the NielsenIQ data are those of the researcher(s) and do not reflect the views of NielsenIQ. NielsenIQ is not responsible for, had no role in, and was not involved in analyzing and preparing the results reported herein.

Disclosure Statement

The authors report there are no competing interests to declare.

Notes

1 See, for instance, Hall and Zhou (2003), Hu (2008), Kasahara and Shimotsu (2009), Bonhomme, Jochmans, and Robin (2016), Kitamura and Laage (2018), and references therein.

2 For applications in the context of auctions and discrete games, see Hu, McAdams, and Shum (2013), Xiao (2018), and Luo (2020).

3 The strict monotonicity condition requires that some known function of the data be monotone in the latent types. See Hu (2017) for examples.

4 See also Goeree (2008) and Barseghyan, Molinari, and Thirkettle (2021b) for applications of consideration sets driven by item-dependent attention.

5 In fact, since they allow prices to diverge to infinity, they effectively allow for menu variation: when the price of an alternative goes to infinity, the alternative becomes unavailable to the DMs.

6 Lu (2022) also provides a set of conditions that ensure that a system of moment inequalities he builds uniquely identifies the parameter of interest.

7 We use boldface font (e.g., $\mathbf{D}$) to denote random objects and regular font (e.g., $D$) for deterministic ones.

8 |S| denotes the cardinality of S.

9 Similar to Fox, Yang, and Hsu (2018), we implicitly assume that the assignment of label s to consumers is either known (e.g., one uses observable information) or immaterial because of exchangeability of consumers within a market.

10 Mbakop (2017) uses Markovianity of order statistics to identify the distribution of private valuations in auction settings.

11 A condition that requires independence of observed data across time periods is also standard in the analysis of differentiated products demand systems using market-level data (e.g., Berry, Levinsohn, and Pakes 1995; Nevo 2000, 2001). For instance, in this literature, the independent markets are often defined using a time interval (e.g., week, quarter, or year) and location (e.g., town or zip code). As a result, it is often assumed that the market shares of a product in the same location but in different time periods are, conditional on observables, independent draws from the same distribution.

12 Indeed, if $m$ is identified, then we can identify the joint distribution of $\mathbf{D}$, $\mathbf{x}$, and $\mathbf{y}_1$. Hence, if we also identify $F^{\mathrm{RUM}}$, we identify the joint distribution of $(\mathbf{y}_s)_{s=1}^{2}$, $\mathbf{D}$, and $\mathbf{x}$. Using Markovianity, we can then identify the joint distribution of $(\mathbf{y}_s)_{s=1}^{S}$, $\mathbf{D}$, and $\mathbf{x}$.

13 Under Assumptions 2’ and 3, the choices are conditionally iid and $S_i$ can be any collection of $K$ distinct elements of $S$.

14 In the context of auctions, a similar assumption for K = 1 has been made in An (2017), Mbakop (2017), and Luo (2020).

15 Under Assumption 2’, one does not need to condition on the previous choices. As a result, the identification and estimation of the model simplifies. For example, we need to observe at least 3 choices instead of 5.

16 The columns of this matrix correspond to different elements of $D_{x,y,y}$. The rows correspond to different values $y_K$ can take.

17 Abaluck and Adams-Prassl (2021) consider two models: the alternative-specific model in Manzini and Mariotti (2014) and a variant of the sleeping agent.

18 In general, a sparsity condition is not needed for identification of finite mixtures if the dependent variable is continuously distributed and the latent heterogeneity is discrete (e.g., Hu, McAdams, and Shum 2013). In our setting, the dependent variable has finite support; thus, we have to reduce the dimensionality of the problem by bounding the cardinality of the support of the latent choice sets.

19 If one takes $\alpha(D)=\sum_{a\in D}\exp(u(a))/|D|$, where $\exp(u(i))=i$ for $i\in\{1,2,3,4,5\}$, as the normalized McFadden’s surplus of a consumer that follows a logit additive random utility choice probability; $\tau=5$; and $K(t)=t^{2}/2$, then the only active sets are $\{5\}$ and $\{4,5\}$.

20 Zhong (2022) formally describes under what conditions dynamic information acquisition is smoothed across time.

21 Recent work on the Nielsen Homescan (Shapiro, Hitsch, and Tuchman 2021) has estimated that the elasticity of demand to TV ads is very low in general and is not statistically different from zero for two-thirds of the goods.

22 The elasticity estimates of Shapiro, Hitsch, and Tuchman (2021) take into account dynamics by considering the stock of advertisement. They estimate long-run elasticities that are low and, in many cases, close to zero.

23 For a recent application of a sieve estimator with continuous covariates, see, for instance, Kashaev (2023).

24 Since there are finitely many collections of subsets (i.e., the parameter space is discrete), this estimator of the support of choice sets will converge arbitrarily fast.

25 Our simulations indicate that the estimation time of our procedure grows exponentially with Y. However, it is substantially faster than the procedure that checks all sets. For instance, our method takes about 6 hr to estimate a model with Y = 10. The alternative would require solving $1023!/(1023-10)! > 10^{18}$ optimization problems.
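The count in Note 25 can be checked with a few lines of arithmetic. The sketch below reads the expression as 1023!/(1023 − 10)!, where 1023 = 2^10 − 1 is the number of nonempty subsets of Y = 10 alternatives; treating the 10 in the denominator as the number of sets being selected (the hypothetical name `k`) is an assumption made for illustration.

```python
import math

Y = 10                      # number of alternatives, as in Note 25
n_sets = 2**Y - 1           # nonempty subsets of the alternatives
k = 10                      # assumed number of sets selected (hypothetical name)

# Ordered selections of k distinct sets: n_sets! / (n_sets - k)!
n_problems = math.perm(n_sets, k)

print(n_sets)
print(n_problems > 10**18)
print(f"{n_problems:.3e}")
```

Under this reading, the lower bound of 10^18 stated in the note is conservative.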

26 Instead, one can also use the estimator based on a diagonalization argument as in Hu, McAdams, and Shum (2013), or any other asymptotically consistent estimator. Unfortunately, it suffers from the same issues in finite samples and performs worse in our simulations.

27 If covariates are discrete, then instead of minimizing the Euclidean distance, one can also minimize the Kullback-Leibler divergence and obtain maximum-likelihood estimates.
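The link between Kullback-Leibler minimization and maximum likelihood in Note 27 can be illustrated numerically: for discrete data, the KL divergence from the empirical frequencies to the model probabilities equals a constant minus the average log-likelihood, so both criteria select the same parameter. The sketch below uses made-up counts and a hypothetical one-parameter family `model`; it is not the article's estimator.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def log_likelihood(counts, q):
    """Multinomial log-likelihood up to an additive constant."""
    return sum(c * math.log(qi) for c, qi in zip(counts, q) if c > 0)

def model(theta):
    """Hypothetical one-parameter family over three outcomes."""
    weights = [1.0, math.exp(theta), math.exp(2 * theta)]
    total = sum(weights)
    return [w / total for w in weights]

counts = [30, 50, 20]                      # made-up cell counts
n = sum(counts)
p_hat = [c / n for c in counts]            # empirical frequencies

grid = [i / 100 for i in range(-200, 201)]  # coarse grid search over theta
theta_kl = min(grid, key=lambda t: kl_divergence(p_hat, model(t)))
theta_ml = max(grid, key=lambda t: log_likelihood(counts, model(t)))

print(theta_kl, theta_ml)
```

A grid search is used only to keep the example dependency-free; any optimizer would serve the same purpose.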

28 In our application and simulations, we set ε=0.01.

29 In applications, one can always conduct sensitivity analysis and make $\varepsilon$ smaller until the results do not change. One can also use $\varepsilon_n$ that converges to 0 sufficiently slowly (e.g., $\varepsilon_n=\log(\log(n))/n$ if $\alpha_n=n$). In this case, Assumption 7 is not needed.

30 If some of the estimated sets appear more than once (i.e., two columns of $\tilde{F}_{s_1}$ have the same zero components), then we can just drop one of them.

31 Similar to the Step-1 estimator, instead of minimizing the Euclidean distance, we can conduct maximum likelihood estimation here when covariates are discrete.

32 We consider a short time window, which we believe allows us to disregard habit formation. Also, we consider frequent buyers who usually buy a few units of cereal each shopping trip and then repeat their shopping trip weekly.

33 Here, we use an additive random utility framework where the mean utility is assumed to be stable in the time window, but taste shocks are idiosyncratic. Arguably, in a short time window, it is less likely that the DM will adapt her mean utility due to structural environmental changes.

34 Our DMs are frequent RTE cereal buyers. Hence, we do not allow for the option of not buying anything.

35 The market shares computed directly from the data are not the true market shares but rather a mixture of the market shares from different choice sets.
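The point of Note 35 can be seen in a small numerical sketch: when consumers face different latent choice sets, the share computed from pooled data is a weighted average of the set-specific shares. The mean utilities, the two choice sets, and their probabilities below are invented for illustration, and a plain logit is assumed for the choice probabilities.

```python
import math

def logit_shares(mean_utils, choice_set):
    """Logit choice probabilities restricted to the available alternatives."""
    weights = {j: math.exp(mean_utils[j]) for j in choice_set}
    total = sum(weights.values())
    # Alternatives outside the choice set get probability zero.
    return {j: weights.get(j, 0.0) / total for j in mean_utils}

# Hypothetical mean utilities and distribution over latent choice sets.
mean_utils = {"A": 1.0, "B": 0.5, "C": 0.0}
set_probs = {frozenset("ABC"): 0.6, frozenset("AB"): 0.4}

# Shares seen in pooled data: a mixture over the latent choice sets.
observed = {
    j: sum(p * logit_shares(mean_utils, s)[j] for s, p in set_probs.items())
    for j in mean_utils
}

# Shares that would arise if every consumer faced the full set.
full = logit_shares(mean_utils, frozenset("ABC"))

for j in mean_utils:
    print(j, round(observed[j], 4), round(full[j], 4))
```

In this example, product C is understated in the pooled shares because 40% of consumers never see it, so fitting a full-choice-set logit to the pooled shares would attribute its low share to low utility.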

36 The model can be extended to the Generalized Extreme Value model (McFadden 1978), which includes the Nested Logit model, and to the case when coefficients are random (e.g., Nevo 2001).

37 The details of construction of instruments can be found in our replication files.

38 See Goeree (2008) for similar exclusion restrictions.

39 In Online Appendix C, we report the median own-price elasticities across markets; the results are similar.

Additional information

Funding

We gratefully acknowledge financial support from the Western Social Science Faculty Grant (FRDF R5533A02) and Social Sciences and Humanities Research Council Insight Development Grant.
