* See Keane.des for description of the data. Example taken from
* Econometric Analysis of Cross Section and Panel Data
* by Jeffrey M. Wooldridge, 2002, pp. 498 - 450.
* Original panel data from Keane and Wolpin, 1997

* A cross-section of men for the year 1987. The three possible
* outcomes are 1) enrolled in school (status = 0), 2) at home, i.e.
* neither in school nor working (status = 1), and 3) working
* (status = 2). The explanatory variables are education, a quadratic
* in past work experience, and a black binary indicator. The base
* category is enrolled in school. Out of 1,717 observations, 99 are
* enrolled in school, 332 are at home, and 1,286 are working.

* Multinomial Logit Estimates of School and Labor Market Decisions
* Table 15.2 in Wooldridge, p. 499.

use c:\data\keane.dta
keep if (year == 87)

* Tabulate the dependent variable (status)
tabulate status

* Multinomial logit with base outcome alternative 1 (status=0)
mlogit status educ exper expersq black, baseoutcome(1) nolog

* Relative-risk ratios (exponentiated coefficients) - Multinomial logit with base outcome alternative 1 (status=0)
mlogit status educ exper expersq black, rrr baseoutcome(1) nolog

* Predict probabilities of each status and compare to actual freqs
quietly mlogit status educ exper expersq black, baseoutcome(1)
predict pmlogit1 pmlogit2 pmlogit3, pr
summarize pmlogit* status, separator(3)
list pmlogit* in 1/10

* Create Classification Table and get accuracy rate
* Each observation is assigned to the outcome with the largest predicted probability
egen pred_max = rowmax(pmlogit*)
generate pred_choice = .
forvalues i = 1/3 {
    replace pred_choice = `i' if (pred_max == pmlogit`i')
}
* Reuse the value label of status so both variables display the same categories
local status_label: value label status
label values pred_choice `status_label'
tabulate pred_choice status

* Accuracy rate = (12 + 130 + 1224)/1717 = 0.796
* In comparison, naively classifying every observation into the majority
* class (status = 2 - work) would give an accuracy rate of 1286/1717 = 74.9%.
* See the previous tabulation result for the dependent variable - status.
* Thus, the current mlogit classifier is providing a LIFT of 79.6/74.9 = 1.06.
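
* A minimal sketch (not part of the original example) of computing the
* accuracy rate, the majority-class baseline, and the lift directly,
* instead of reading them off the tabulations by hand. It assumes that
* pred_choice and status use the same numeric codes, as the loop above
* implies. The names correct, freq, acc, and base are hypothetical
* helpers introduced here for illustration.
generate byte correct = (pred_choice == status) if !missing(pred_choice, status)
quietly summarize correct
scalar acc = r(mean)
* Baseline: share of the most frequent status category
quietly tabulate status, matcell(freq)
mata: st_numscalar("base", max(st_matrix("freq")) / sum(st_matrix("freq")))
display "Accuracy = " %5.3f scalar(acc) "  Baseline = " %5.3f scalar(base) "  Lift = " %4.2f scalar(acc)/scalar(base)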