*** Table 6.1 in W&B book: ORDERED MULTINOMIAL LOGIT EXAMPLE: CHOICE OF SECONDARY SCHOOL

* This is an ordered multinomial problem. This program reproduces Table 6.1 in the
* W&B textbook. The regressors in this problem are case-specific only.

* Read in dataset and describe dependent variable and regressors.
use c:\data\school.dta, clear
describe

* Summarize dependent variable and regressors
summarize, separator(0)

* Tabulate the dependent variable
tabulate school

* Table of log of income by school
table school, contents(N linc mean linc sd linc)

* Table of mother's education by school
table school, contents(N motheduc mean motheduc sd motheduc)

********** Table 6.1 in W&B ORDERED MULTINOMIAL LOGIT/PROBIT MODEL OF SCHOOL CHOICE

* Creation of year dummies
generate d95 = (year == 1995)
generate d96 = (year == 1996)
generate d97 = (year == 1997)
generate d98 = (year == 1998)
generate d99 = (year == 1999)
generate d00 = (year == 2000)
generate d01 = (year == 2001)
generate d02 = (year == 2002)

* Create full-time dummy for mother's employment. 1 = full-time employment,
* 0 = otherwise.
* Create work dummy for whether the mother works at all. 1 = employed
* (either full-time or part-time), 0 = otherwise.
generate mothftime = (mothemp == 1)
generate mothwork = (mothemp < 3)

* The Ordered Probit results reported in Table 6.1 in W and B
* Remember the or option is not applicable to Ordered Probit
oprobit school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02

* The Ordered Logit results reported in Table 6.1 in W and B
ologit school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02

* The odds-ratio (or) version of the coefficient report (not in Table 6.1) with the
* accompanying Brant command to test the Single Index (Parallel Regressions)
* assumption. The Brant test does not reject the Single Index assumption for
* these data. See the Likelihood Ratio test of the Single Index
* assumption at the end of this program.
ologit school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02, or
brant, detail

* Predict probabilities of choice of each alternative and compare to actual freqs
predict pologit1 pologit2 pologit3, pr
summarize pologit*, separator(3)

* List predicted values of alternatives for first 10 observations
list pologit* in 1/10

* Create Classification Table and get accuracy rate
egen pred_max = rowmax(pologit*)
generate pred_choice = .
* Assign each observation to the alternative with the highest predicted probability
forvalues i = 1/3 {
    replace pred_choice = `i' if (pred_max == pologit`i')
}
local school_label: value label school
label values pred_choice `school_label'
tabulate pred_choice school

* Accuracy rate = (107 + 46 + 215)/675 = 0.545
* In comparison, the accuracy rate that one would expect from naively classifying
* using the majority class (Gymnasium) would be 41.04% accuracy on average.
* See the previous tabulation result for the dependent variable - school.
* Thus, the current ologit classifier is providing a LIFT of 54.5/41.04 = 1.328.

* Recall the following calculation of the lift of the unordered MNL model reported
* in the program Table5_1_WandB.do:
* Accuracy rate = (113 + 49 + 208)/675 = 0.548
* In comparison, the accuracy rate that one would expect from naively classifying
* using the majority class (Gymnasium) would be 41.04% accuracy on average.
* See the previous tabulation result for the dependent variable - school.
* Thus, the mlogit classifier is providing a LIFT of 54.8/41.04 = 1.335.

* The Lift Ratios are about the same whether unordered or ordered.
* Of course we have not generated a classification table using the generalized
* ordered logit (gologit2). But the Brant test indicates that this is not
* necessary.

* As an alternative to the Brant test we could do a likelihood ratio test
* using the generalized ordered logit model. If the restriction to the
* single index model is appropriate, we should get a high probability value
* (p-value) for the Likelihood Ratio test.
gologit2 school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02

* With OR report
gologit2 school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02, or

* Now for the computation of the Likelihood Ratio test of the Single Index hypothesis.
* From the Ordered Logit model above we have a log likelihood value of -630.1549.
* This is the restricted model that imposes the Single Index assumption.
* From the above Generalized Ordered Logit model we obtain the log likelihood
* value of -622.32471. This represents the fit of the unrestricted model because
* we are not imposing the Single Index assumption. Then the likelihood ratio
* statistic is
*   -2log(lambda) = -2(logl(restricted model) - logl(unrestricted model))
*                 = -2(-630.1549 - (-622.32471)) = -2(-7.83019) = 15.66038.
* The number of degrees of freedom of the chi-square test is 28 - 15 = 13
* where the number of parameters in the unrestricted model (gologit2) is 28
* while the number of parameters in the restricted model (ologit) is 15.
* It follows that the p-value associated with the observed statistic of
* 15.66038 is 0.267957 > 0.05. Therefore we fail to reject the null hypothesis
* of a Single Index. You can use the EXCEL function chisq.dist.rt to obtain
* this probability value. Therefore, we see that the Brant and
* Likelihood Ratio tests give the same result. The Single Index model
* (ologit) is to be preferred. That is, the Ordered Logit model, in this
* case, is to be preferred over the Generalized Ordered Logit model.
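
* A minimal sketch of the same Likelihood Ratio computation done in Stata
* rather than in EXCEL. The scalar names ll_r, ll_u and lr_stat are
* illustrative assumptions and are not part of the original program. The log
* likelihood values and the 13 degrees of freedom are copied from the
* comments above, not re-estimated here.
scalar ll_r = -630.1549
scalar ll_u = -622.32471
scalar lr_stat = -2*(ll_r - ll_u)
display "LR statistic = " lr_stat
* chi2tail(df, x) returns the upper-tail probability of a chi-square
* distribution with df degrees of freedom evaluated at x.
display "p-value      = " chi2tail(13, lr_stat)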