/*
   Note that, unlike other regression procedures, the MDC procedure does
   not include an intercept term automatically.

   The dependent variable decision takes the value 1 when a specific
   alternative is chosen; otherwise it takes the value 0. Each individual
   chooses one and only one of the possible alternatives, so decision
   equals 1 exactly once per individual.

   If each individual has three elements (1, 2, and 3) in the choice set,
   the NCHOICE=3 option can be specified instead of CHOICE=(mode 1 2 3).

   Consider the following trinomial data from Daganzo (1979). The original
   data (origdata) contains travel time (ttime1-ttime3) and choice (choice)
   variables. ttime1-ttime3 are the travel times for three different modes
   of transportation, and choice indicates which of the three modes is
   chosen. The choice variable must have integer values.

   This program is taken directly from the Proc MDC help file.
*/

data origdata;
   input ttime1 ttime2 ttime3 choice @@;
   datalines;
16.481 16.196 23.89  2   15.123 11.373 14.182 2
19.469 8.822  20.819 2   18.847 15.649 21.28  2
12.578 10.671 18.335 2   11.513 20.582 27.838 1
10.651 15.537 17.418 1   8.359  15.675 21.05  1
11.679 12.668 23.104 1   23.237 10.356 21.346 2
13.236 16.019 10.087 3   20.052 16.861 14.168 3
18.917 14.764 21.564 2   18.2   6.868  19.095 2
10.777 16.554 15.938 1   20.003 6.377  9.314  2
19.768 8.523  18.96  2   8.151  13.845 17.643 2
22.173 18.045 15.535 1   13.134 11.067 19.108 2
14.051 14.247 15.764 1   14.685 10.811 12.361 3
11.666 10.758 16.445 1   17.211 15.201 17.059 3
13.93  16.227 22.024 1   15.237 14.345 19.984 2
10.84  11.071 10.188 1   16.841 11.224 13.417 2
13.913 16.991 26.618 3   13.089 9.822  19.162 2
16.626 10.725 15.285 3   13.477 15.509 24.421 2
20.851 14.557 19.8   2   11.365 12.673 22.212 2
13.296 10.076 17.81  2   15.417 14.103 21.05  1
15.938 11.18  19.851 2   19.034 14.125 19.764 2
10.466 12.841 18.54  1   15.799 16.979 13.074 3
12.713 15.105 13.629 2   16.908 10.958 19.713 2
17.098 6.853  14.502 2   18.608 14.286 18.301 2
11.059 10.812 20.121 1   15.641 10.754 24.669 2
7.822  18.949 16.904 1   12.824 5.697  19.183 2
11.852 12.147 15.672 2   15.557 8.307  22.286 2
;

/* Here we convert the data to satisfy the data format of Proc MDC:
   one row per individual (pid) and alternative (mode). */
data newdata(keep=pid decision mode ttime);
   set origdata;
   array tvec{3} ttime1 - ttime3;
   retain pid 0;
   pid + 1;
   do i = 1 to 3;
      mode = i;
      ttime = tvec{i};
      decision = ( choice = i );   /* 1 for the chosen mode, 0 otherwise */
      output;
   end;
run;

proc print data = newdata;
run;

/* Now we run the multinomial logit on the Daganzo data. */
proc mdc data=newdata;
   model decision = ttime / type=clogit nchoice=3
                            optmethod=qn covest=hess;
   id pid;
run;

/* Here we set up the out-of-sample data for predicting the choice of
   the 51st individual. The decision variable is left missing. */
data extra;
   input pid mode decision ttime;
   datalines;
51 1 . 5.0
51 2 . 15.0
51 3 . 14.0
;

data extdata;
   set newdata extra;
run;

/* Now we use the estimated multinomial logit model to predict the
   out-of-sample data. */
proc mdc data=extdata;
   model decision = ttime / type=clogit covest=hess nchoice=3;
   id pid;
   output out=probdata pred=p;
run;

/* Now we print out the predictions for the 49th through 51st
   individuals, the 51st individual being the out-of-sample individual. */
proc print data=probdata( where=( pid >= 49 ) );
   var mode decision p ttime;
   id pid;
run;
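
/*
   A minimal sketch, not part of the original help-file program, of how
   the predictions in probdata might be checked and summarized. It assumes
   the OUTPUT statement above has created probdata with the predicted
   probability p for every pid/mode row. Under the conditional logit model
   the probability that individual i chooses mode j is
      exp(beta*ttime_ij) / sum over k of exp(beta*ttime_ik),
   so the values of p should sum to 1 within each pid. The data set names
   probcheck, probsorted, and predmode and the variable psum are
   hypothetical names introduced here for illustration.
*/

/* Sum the predicted probabilities within each individual. */
proc sql;
   create table probcheck as
   select pid, sum(p) as psum
   from probdata
   where pid >= 49
   group by pid;
quit;

proc print data=probcheck;
run;

/* Keep, for each individual, the mode with the largest predicted
   probability. Ties, if any, keep whichever row sorts first. */
proc sort data=probdata out=probsorted;
   by pid descending p;
run;

data predmode;
   set probsorted;
   by pid;
   if first.pid;
run;

proc print data=predmode( where=( pid >= 49 ) );
   var mode p ttime;
   id pid;
run;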