use "C:\data\03728-0001-data.dta", clear

* Keep women only
keep if sex == 2
* Keep women aged 40 or older (completed fertility)
keep if age >= 40
* Keep every fourth survey year from 1974 to 2002
keep if inlist(year, 1974, 1978, 1982, 1986, 1990, 1994, 1998, 2002)

ren childs kids
drop if kids == .
drop if age == .
drop if sibs == .
drop if educ == .

gen afb = agekdbrn
gen city16 = (res16 >= 4) & (res16 <= 6)
gen lowinc16 = (incom16 == 1) | (incom16 == 2)
gen immig = (born == 2) | (parborn == 8)
replace race = racecen1 if year == 2002
gen white = race == 1

label var afb "woman's age when 1st child born"
label var white "=1 if r's race is white"
label var immig "=1 if respondent or both r's parents born abroad"
label var lowinc16 "=1 if income is below average income at age 16"
label var city16 "=1 if respondent lived in a city (pop>50000) at age 16"

keep year sibs kids age afb educ white city16 lowinc16 immig
order kids age educ year sibs afb white city16 lowinc16 immig
gen trend = year - 1974

* Now replicate the results of Table 8.7 in the W&B textbook

* Poisson Regression Model
poisson kids educ trend white immig lowinc16 city16

* As discussed in the W&B textbook, one can get robust standard
* errors for the Poisson regression coefficient estimates by
* using the vce(robust) option. This is what is called
* Quasi-Maximum Likelihood Estimation (QMLE).
poisson kids educ trend white immig lowinc16 city16, vce(robust)

* Negative Binomial Regression I
* (note: nbreg defaults to dispersion(mean); dispersion(constant)
* selects the constant-dispersion variance function instead)
nbreg kids educ trend white immig lowinc16 city16

* Negative Binomial Regression II
gnbreg kids educ trend white immig lowinc16 city16

* Tabulate the kids variable.
* From this information we can see whether there is
* an excess number of zeroes. This can be done
* by calculating Prob(y = 0) using a Poisson model (with no
* explanatory variables) and the lambda value set equal
* to the sample mean of the counts.
* If there appears to be an excess number of zeroes when comparing
* the proportion of zeroes in the sample with the Poisson
* Prob(y = 0), then we can model the counts using a Hurdle
* model or an excess-zeroes model. See the W&B book.
tabulate kids
summarize kids

* From the tabulation and summarize we see that the proportion
* of zeroes in the sample is 744/5150 = 0.1445. The mean of the
* counts is 2.59. Then Prob(y = 0) = exp(-2.59) = 0.075 is
* what we would expect of a Poisson distribution with a mean
* count of 2.59 (i.e. lambda = 2.59). We are then interested
* in whether or not there is a significant difference between
* the two proportions, 0.1445 and 0.075. A 95% confidence interval
* for the true proportion, say p, is given by
*   0.1445 +- z(.025)*sqrt[(0.1445)*(1 - 0.1445)/5150]
*     = 0.1445 +- 1.96*0.004899 = 0.1445 +- 0.0096,
* i.e. roughly (0.135, 0.154), which clearly does not contain
* 0.075. Given this result we probably should continue our
* analysis assuming an excess of zeroes by using Hurdle or
* zero-inflated negative binomial models.
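
* The excess-zeroes check described above can also be computed directly
* from the data instead of by hand. This is only a sketch: the scalar
* names (n_zero, p0_obs, p0_pois, se_obs, ci_lo, ci_hi) are illustrative
* and are not part of the original analysis.
quietly count if kids == 0
scalar n_zero = r(N)
quietly summarize kids
scalar p0_obs  = n_zero/r(N)
scalar p0_pois = exp(-r(mean))
scalar se_obs  = sqrt(p0_obs*(1 - p0_obs)/r(N))
scalar ci_lo   = p0_obs - invnormal(0.975)*se_obs
scalar ci_hi   = p0_obs + invnormal(0.975)*se_obs
display "Sample proportion of zeroes: " %6.4f p0_obs
display "Poisson Prob(y = 0):         " %6.4f p0_pois
display "95% CI for the proportion:   " %6.4f ci_lo " to " %6.4f ci_hi

* One possible way to continue is a zero-inflated negative binomial
* model. Using the same regressors in the inflate() equation is an
* assumption made here for illustration, not a choice taken from the
* W&B textbook.
zinb kids educ trend white immig lowinc16 city16, ///
    inflate(educ trend white immig lowinc16 city16)

* A hurdle model can be sketched in two parts: a logit for crossing
* the hurdle (any children at all) and a zero-truncated negative
* binomial for the positive counts. The indicator name anykids is
* hypothetical, and this assumes a Stata version that provides tnbreg.
gen byte anykids = (kids > 0)
logit anykids educ trend white immig lowinc16 city16
tnbreg kids educ trend white immig lowinc16 city16 if kids > 0, ll(0)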