*** Table 6.1 in W&B book: ORDERED MULTINOMIAL LOGIT EXAMPLE: CHOICE OF SECONDARY SCHOOL

* This is an ordered multinomial problem.  This program reproduces Table 6.1 in the
* W&B textbook.  The regressors in this problem are case-specific only.

* Read in dataset and describe dependent variable and regressors.  
       
use c:\data\school.dta, clear
describe

* Summarize dependent variable and regressors
summarize, separator(0) 

* Tabulate the dependent variable
tabulate school

* Table of log of income by school 
table school, contents(N linc mean linc sd linc)

* Table of mother's education by school 
table school, contents(N motheduc mean motheduc sd motheduc)

********** Table 6.1 in W&B ORDERED MULTINOMIAL LOGIT/PROBIT MODEL OF SCHOOL CHOICE

* Creation of year dummies

generate d95 = (year == 1995)
generate d96 = (year == 1996)
generate d97 = (year == 1997)
generate d98 = (year == 1998)
generate d99 = (year == 1999)
generate d00 = (year == 2000)
generate d01 = (year == 2001)
generate d02 = (year == 2002)

* Create full-time dummy for mother's employment.  1 = full-time employment,
* 0 = otherwise.
generate mothftime = (mothemp == 1)

* Create work dummy for whether the mother is employed at all.  1 = employed
* (either full-time or part-time), 0 = otherwise.
generate mothwork = (mothemp < 3)
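
* A quick check of the two dummies, assuming mothemp is coded 1 = full-time,
* 2 = part-time and 3 = not employed (the coding implied by the definitions
* above; confirm it against the describe output).  Stata treats a missing value
* as larger than any number, so a missing mothemp would give mothftime = 0 and
* mothwork = 0 rather than missing.  The missing option makes such cases visible.
tabulate mothemp mothftime, missing
tabulate mothemp mothwork, missing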

* The Ordered Probit results reported in Table 6.1 in W and B
* Remember that the or (odds ratio) option is not applicable to Ordered Probit.
oprobit school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02

* The Ordered Logit results reported in Table 6.1 in W and B
ologit school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02

* The odds-ratio (or) version of the coefficient report (not in Table 6.1),
* followed by the Brant test of the Single Index (Parallel Regressions)
* assumption.  brant is a user-written command from the SPost package and must
* be installed separately.  The Brant test does not reject the Single Index
* assumption for these data.  See also the Likelihood Ratio test of the Single
* Index assumption at the end of this program.
ologit school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02, or
brant, detail

* Predict probabilities of choice of each school type and compare to actual freqs
predict pologit1 pologit2 pologit3, pr
summarize pologit*, separator(3)

* List predicted values of alternatives for first 10 observations

list pologit* in 1/10

* Create Classification Table and get accuracy rate

egen pred_max = rowmax(pologit*)
generate pred_choice = .
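* Assign the category whose predicted probability equals the row maximum.
* The predicted probabilities are stored in pologit1-pologit3 (see predict above).
forvalues i = 1/3 {
    replace pred_choice = `i' if (pred_max == pologit`i')
}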
local school_label: value label school
label values pred_choice `school_label'
tabulate pred_choice school

* Accuracy rate = (107 + 46 + 215)/675 = 0.545
* In comparison, the accuracy rate that one would expect from naively classifying
* using the majority class (Gymnasium) would be 41.04% accuracy on average.
* See the previous tabulation result for the dependent variable - school.
* Thus, the current ologit classifier is providing a LIFT of 54.5/41.04 = 1.328. 
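
* A minimal sketch of computing the accuracy rate and lift in Stata instead of
* by hand.  The variable name correct is illustrative only, and 0.4104 is the
* majority-class (Gymnasium) share quoted in the comment above.
generate byte correct = (pred_choice == school) if !missing(pred_choice, school)
quietly summarize correct
display "Accuracy rate = " %6.4f r(mean)
display "Lift          = " %6.4f r(mean)/0.4104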

* For comparison, the lift calculation for the unordered MNL model reported
* in the program Table5_1_WandB.do is repeated below.
* The Lift Ratios are about the same whether unordered or ordered.
* Note that we have not generated a classification table using the generalized
* ordered logit (gologit2), but the Brant test indicates that this is not
* necessary.
  
* Accuracy rate = (113 + 49 + 208)/675 = 0.548
* In comparison, the accuracy rate that one would expect from naively classifying
* using the majority class (Gymnasium) would be 41.04% accuracy on average.
* See the previous tabulation result for the dependent variable - school.
* Thus, the current mlogit classifier is providing a LIFT of 54.8/41.04 = 1.335.

* As an alternative to the Brant test we can do a likelihood ratio test
* using the generalized ordered logit model.  If the Single Index restriction
* is appropriate, the Likelihood Ratio test should yield a high p-value.
* gologit2 is a user-written command (ssc install gologit2).

gologit2 school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02

* With odds ratios reported
gologit2 school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02, or

* Now for the computation of the Likelihood Ratio test of the Single Index hypothesis.
* From the Ordered Logit model above we have a log likelihood value of -630.1549.
* This is the restricted model that imposes the Single Index assumption.
* From the above Generalized Ordered Logit model we obtain the log likelihood
* value of -622.32471.  This represents the fit of the unrestricted model because
* we are not imposing the Single Index assumption.  Then the likelihood ratio
* statistic is -2log(lambda) = -2(logl(restricted model) - logl(unrestricted model))     
* = -2(-630.1549 -(-622.32471)) = -2(-7.83019) = 15.66038.
* The number of degrees of freedom of the chi-square test is 28 - 15 = 13
* where the number of parameters in the unrestricted model (gologit2) is 28
* while the number of parameters in the restricted model (ologit) is 15.
* It follows that the p-value associated with the observed statistic of
* 15.66038 is 0.267957 > 0.05.  Therefore we fail to reject the null hypothesis
* of a Single Index.  You can use the EXCEL function chisq.dist.rt to obtain
* this probability value.  The Brant and Likelihood Ratio tests thus give the
* same result: the Single Index model (ologit) is to be preferred.  That is,
* the Ordered Logit model, in this case, is preferred over the Generalized
* Ordered Logit model.
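
* A minimal sketch of the same Likelihood Ratio calculation done in Stata.
* The two log likelihood values and the 13 degrees of freedom are the ones
* quoted in the comments above; the scalar names are illustrative only.
* chi2tail(df, x) returns the upper-tail probability of the chi-square
* distribution, which is the p-value of the test.
scalar ll_restricted   = -630.1549
scalar ll_unrestricted = -622.32471
scalar lr_stat = -2*(ll_restricted - ll_unrestricted)
display "LR statistic = " %9.5f lr_stat "   p-value = " %8.6f chi2tail(13, lr_stat)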