/* The following data were obtained from the website for the Stock and Watson
   textbook Introduction to Econometrics (Addison-Wesley, 2007, second edition).
   They are used in the Chapter 1 presentation of the empirical question "Does
   Reducing Class Size Improve Elementary School Education?" The data were
   gathered on 420 California school districts in 1998, so this is a
   cross-section data set.

   We first analyze the data using a one-way analysis of variance (i.e. a test
   of the difference of means, assuming the two populations are normally
   distributed with equal variances). Later we use multiple regression
   analysis, first with OLS and then adjusting for heteroskedasticity by using
   White's heteroskedasticity-consistent standard errors for the OLS
   coefficient estimates.

   Variables:
     obs          = observation number
     score        = district average test score (fifth grade)
     st_ratio     = average student-to-teacher ratio in the fifth grade classes
                    in the district
     expend_pupil = expenditure per fifth grade pupil in the district
     english      = average percentage of fifth grade students learning English
                    in the district
*/

data combined;
  /* The trailing @@ holds the input line so that several observations can be
     read from each data line. */
  input obs score st_ratio expend_pupil english @@;
  datalines;
1 690.8 17.89 6385 0.0 2 661.2 21.52 5099 4.6 3 643.6 18.70 5502 30.0 4 647.7 17.36 7102 0.0 5 640.8 18.67 5236 13.9 6 605.6 21.41 5580 12.4 7 606.8 19.50 5253 68.7 8 609.0 20.89 4566 47.0 9 612.5 19.95 5356 30.1 10 612.7 20.81 5036 40.3 11 615.8 21.24 4548 52.9 12 616.3 21.00 5447 54.6 13 616.3 20.60 6567 42.7 14 616.3 20.01 4819 20.5 15 616.5 18.03 5621 80.1 16 617.3 20.25 6026 49.4 17 618.1 16.98 6723 85.5 18 618.3 16.51 5590 58.9 19 619.8 22.70 5065 77.0 20 620.3 19.91 5434 49.8 21 620.5 18.33 5726 40.7 22 621.4 22.62 4542 16.2 23 621.8 19.45 5107 45.1 24 622.1 25.05 4660 39.1 25 622.6 20.68 4555 76.7 26 623.1 18.68 5415 40.5 27 623.2 22.85 4998 73.7 28 623.5 19.27 5224 70.0 29 623.6 19.25 5139 56.0 30 624.2 20.55 4614 11.1 31 624.6 20.61 5342 80.4 32 625.0 21.07 5347 63.1 33 625.3 21.54 5036 65.1 34 625.8 19.90 5117 53.4 35 626.1 21.19 5117 49.8 36 626.8 21.87 5272 35.5 37 626.9 18.33 5226 56.1 38 627.1 16.23 6517 32.4 39 627.3 19.18 4559 65.5 40 627.3 20.28 5119 53.1 41 628.3 22.99 5338 49.6 42 628.4 20.44 5090 45.1 43 628.6 19.82 5485 30.3 44 628.7 23.21 4793 52.2 45 628.8 19.27 5093 36.8 46 629.8 23.30 4360 30.3 47 630.3 21.19 5645 49.9 48 630.4 20.87 4518 13.8 49 630.5 19.02 5864 28.9 50 630.5 21.92 5258 52.8 51 631.1 20.10 5017 44.1 52 631.4 21.48 4720 35.3 53 631.8 20.07 5471 37.5 54 631.9 20.38 5615 50.4 55 632.0 22.45 5245 31.1 56 632.0 22.90 4838 18.3 57 632.2 20.50 5368 34.7 58 632.3 20.00 5526 33.3 59 632.4 22.26 4353 33.5 60 632.8 21.56 5034 38.2 61 633.0 19.48 4692 36.9 62 633.0 17.67 5607 33.0 63 633.2 21.95 4969 58.2 64 633.7 21.78 4676 17.0 65 633.9 19.14 5306 17.7 66 634.0 18.11 5694 7.3 67 634.1 20.68 5182 31.2 68 634.1 22.62 5229 16.6 69 634.1 21.79 5339 58.1 70 634.2 18.58 6056 55.9 71 634.2 21.55 4846 5.5 72 634.4 21.15 4827 14.4 73 634.5 16.63 5367 22.8 74 634.7 21.14 5743 38.8 75 634.9 19.78 4136 64.2 76 634.9 18.98 5268 25.2 77 635.0 17.67 5238 5.4 78 635.2
17.75 5463 6.1 79 635.5 15.27 6313 0.0 80 635.6 14.00 6653 0.0 81 635.6 20.60 5533 34.9 82 635.8 16.31 6119 13.1 83 636.0 21.13 5099 36.0 84 636.1 17.49 5653 34.7 85 636.5 17.89 5329 28.7 86 636.6 19.31 5930 1.8 87 636.7 20.89 4897 30.6 88 636.9 21.29 5100 59.0 89 637.0 20.20 4826 13.6 90 637.0 24.95 4079 5.0 91 637.1 18.13 5349 1.0 92 637.3 20.00 5869 0.7 93 637.7 18.73 6462 38.5 94 637.9 18.25 6232 0.0 95 638.0 18.99 4994 17.2 96 638.0 19.89 4664 9.9 97 638.2 19.38 6107 19.4 98 638.3 20.46 5324 36.5 99 638.3 22.29 5221 39.4 100 638.3 20.70 5158 28.7 101 638.5 19.06 5131 13.7 102 638.7 20.23 5279 11.3 103 639.3 19.69 4704 3.4 104 639.3 20.36 6090 15.4 105 639.3 19.75 5357 18.0 106 639.5 19.38 5145 22.7 107 639.8 22.92 4906 18.4 108 639.8 19.37 5490 16.2 109 639.8 19.16 5719 2.0 110 639.9 21.30 4436 9.6 111 640.1 18.30 4895 41.5 112 640.2 21.08 5159 9.9 113 640.5 18.79 5491 16.1 114 640.8 19.63 5172 43.5 115 640.9 19.59 4442 8.8 116 641.1 20.87 5220 39.0 117 641.4 21.12 5253 53.9 118 641.4 20.08 5519 41.1 119 641.5 19.91 5609 1.4 120 641.8 17.81 4945 35.4 121 642.2 18.13 5223 8.6 122 642.2 19.22 4757 15.3 123 642.4 18.66 5522 19.9 124 642.8 19.60 6088 3.1 125 643.0 19.28 4961 9.9 126 643.2 22.82 4880 16.1 127 643.3 18.81 6538 43.5 128 643.4 21.37 4926 45.0 129 643.4 20.02 5205 18.2 130 643.5 21.50 4907 15.5 131 643.5 15.43 5923 0.9 132 643.7 22.40 4742 7.6 133 643.7 20.13 4954 29.1 134 644.2 19.04 5500 0.1 135 644.2 17.34 5361 50.9 136 644.4 17.02 5686 11.5 137 644.5 20.80 5594 13.5 138 644.5 21.15 4693 4.7 139 644.5 18.46 5085 21.4 140 644.5 19.14 5456 12.4 141 644.7 19.41 5105 30.1 142 645.0 19.57 5520 15.9 143 645.1 21.50 5066 43.8 144 645.3 17.53 5182 0.0 145 645.5 16.43 5960 0.0 146 645.6 19.80 5125 16.7 147 645.6 17.19 5655 3.2 148 645.8 17.62 5812 0.0 149 645.8 20.13 5468 48.5 150 646.0 22.17 5213 0.8 151 646.2 19.96 4613 1.9 152 646.3 19.04 4887 13.5 153 646.4 15.22 6455 0.0 154 646.5 21.14 5125 16.7 155 646.5 19.64 5194 5.9 156 646.7 21.05 5003 36.2 157 
646.9 20.18 5338 45.0 158 646.9 21.39 5161 16.7 159 647.0 20.01 5159 15.2 160 647.3 20.29 5123 34.3 161 647.3 17.67 5783 0.0 162 647.6 18.22 4843 13.1 163 647.6 20.27 4219 4.3 164 648.0 20.20 5081 39.6 165 648.2 21.38 5145 21.7 166 648.3 20.97 4674 3.3 167 648.3 20.00 4830 7.9 168 648.7 17.15 5622 39.6 169 648.9 22.35 4949 10.1 170 649.2 22.17 5101 27.5 171 649.3 18.18 5133 14.0 172 649.5 18.96 5359 8.8 173 649.7 19.75 5149 6.4 174 649.8 16.43 5373 2.4 175 650.4 16.63 6485 6.0 176 650.5 16.38 5504 8.2 177 650.6 20.07 5106 15.5 178 650.7 18.00 5635 0.0 179 650.9 19.39 4980 0.0 180 650.9 16.43 6114 0.0 181 651.2 16.73 5850 31.8 182 651.2 24.41 4548 12.8 183 651.3 18.26 5012 0.0 184 651.4 18.96 5261 3.8 185 651.5 21.04 4276 2.5 186 651.8 20.74 4566 10.7 187 651.8 18.10 6049 0.0 188 651.9 19.85 4974 1.9 189 652.0 21.60 4432 4.6 190 652.1 22.44 4925 5.3 191 652.1 23.01 4604 27.5 192 652.3 17.75 5974 15.0 193 652.3 18.29 5216 47.9 194 652.3 19.27 4882 0.0 195 652.4 22.67 4146 1.8 196 652.4 19.29 7542 0.0 197 652.5 17.36 5247 0.0 198 652.8 19.82 5170 7.3 199 653.1 20.43 5951 17.0 200 653.4 21.04 4747 10.4 201 653.5 19.92 4944 2.3 202 653.5 19.01 6306 28.2 203 653.6 23.82 4260 8.8 204 653.7 19.37 4718 7.5 205 653.8 19.83 4751 9.8 206 653.8 15.26 5653 12.5 207 653.9 17.16 5920 0.0 208 654.1 21.81 4826 31.2 209 654.2 19.07 5533 1.4 210 654.2 25.79 3926 9.6 211 654.3 18.21 5806 0.0 212 654.6 18.17 4899 24.3 213 654.8 16.97 4885 9.6 214 654.8 21.50 5140 5.9 215 654.9 20.60 5249 1.0 216 655.0 16.99 5399 18.3 217 655.1 20.78 4965 5.0 218 655.1 15.51 6210 4.3 219 655.2 19.89 4798 3.4 220 655.3 21.40 5397 32.7 221 655.3 20.50 5079 9.5 222 655.3 19.36 4734 6.3 223 655.4 17.66 4963 13.3 224 655.5 21.02 5431 3.2 225 655.7 19.06 5382 17.0 226 655.8 22.54 5724 0.9 227 655.8 21.11 4843 6.6 228 656.4 20.05 5730 0.1 229 656.5 14.20 5636 0.0 230 656.6 18.48 5672 6.9 231 656.7 18.64 7071 9.6 232 656.7 20.95 5097 0.9 233 656.8 21.09 5483 1.2 234 656.8 18.69 5643 40.1 235 657.0 20.87 5280 
16.7 236 657.0 19.83 5433 22.0 237 657.2 19.75 4529 8.9 238 657.4 19.50 6040 0.9 239 657.5 18.39 6168 0.0 240 657.6 18.79 5775 3.9 241 657.7 19.77 4807 11.9 242 657.8 19.33 4841 4.0 243 657.8 21.46 4954 22.9 244 657.9 23.08 4024 2.5 245 658.0 21.06 5179 9.9 246 658.3 18.69 4385 0.5 247 658.6 20.77 4778 9.0 248 658.8 19.31 4863 6.5 249 659.1 20.13 5486 20.7 250 659.2 20.67 5486 0.2 251 659.3 22.28 4631 14.0 252 659.4 20.60 5190 8.7 253 659.4 20.83 4601 20.0 254 659.8 19.22 6736 1.4 255 659.9 17.65 5475 2.1 256 660.1 17.00 5991 28.6 257 660.1 16.50 7203 0.2 258 660.2 19.78 4778 13.7 259 660.3 22.30 4303 0.0 260 660.8 17.73 5630 0.2 261 660.9 20.45 4709 3.1 262 661.3 20.37 5630 47.4 263 661.5 20.16 4891 22.7 264 661.6 21.62 4929 29.1 265 661.6 20.56 4905 4.4 266 661.8 19.96 5157 4.8 267 661.8 21.18 4942 5.2 268 661.8 18.81 5409 1.9 269 661.9 20.58 4981 8.7 270 661.9 18.32 4877 10.5 271 662.0 18.82 5112 1.2 272 662.4 20.82 5202 0.0 273 662.4 20.00 4138 9.4 274 662.5 19.68 5744 1.4 275 662.5 19.39 5703 35.3 276 662.6 20.93 4802 9.6 277 662.6 19.94 5442 2.2 278 662.7 20.79 4522 0.4 279 662.7 19.20 4966 0.6 280 662.8 19.02 5358 0.0 281 662.9 17.62 5785 0.9 282 663.3 20.24 5073 5.5 283 663.4 19.29 5556 0.0 284 663.5 18.83 5190 0.0 285 663.8 20.34 5211 2.4 286 663.8 19.23 5284 20.8 287 663.9 17.89 5332 2.2 288 664.0 19.52 5600 13.6 289 664.0 19.08 4882 13.3 290 664.2 19.94 4700 1.9 291 664.2 18.87 5650 23.1 292 664.3 20.14 5318 0.6 293 664.4 23.56 4709 3.3 294 664.4 21.46 5143 0.0 295 664.7 19.19 5133 2.2 296 664.8 20.13 5081 18.6 297 664.9 25.80 4016 6.2 298 665.0 18.78 5374 18.5 299 665.1 19.11 5428 32.1 300 665.2 19.70 5180 4.0 301 665.3 18.62 5400 15.3 302 665.7 21.00 5726 27.7 303 665.9 20.00 5004 12.5 304 666.0 20.98 4970 12.7 305 666.0 21.64 4747 6.9 306 666.1 20.03 4543 15.0 307 666.1 19.81 5230 11.3 308 666.2 18.00 4507 9.7 309 666.2 19.36 5303 3.3 310 666.4 20.18 5546 0.7 311 666.6 21.12 4616 5.1 312 666.6 23.39 5398 5.0 313 666.7 22.18 5118 7.0 314 666.7 19.94 
5008 8.0 315 666.7 17.79 4906 0.5 316 666.8 14.71 6870 2.5 317 666.8 19.04 5302 3.8 318 667.2 20.89 5182 1.3 319 667.2 19.84 4999 4.9 320 667.5 19.52 6732 0.0 321 667.5 20.69 4613 0.2 322 667.6 18.18 5479 0.0 323 668.0 18.89 5667 17.9 324 668.1 24.89 4743 4.5 325 668.4 18.58 4907 9.5 326 668.6 18.04 5700 4.0 327 668.7 17.73 5192 20.7 328 668.8 21.45 4451 7.6 329 668.9 19.92 5628 27.5 330 669.0 20.34 4923 10.9 331 669.1 22.55 4969 5.0 332 669.3 21.10 4830 14.2 333 669.3 18.20 6717 0.2 334 669.3 20.11 5206 2.1 335 669.3 19.16 4358 8.6 336 669.8 19.55 5196 0.0 337 669.8 20.89 6002 4.8 338 670.0 18.39 5091 4.0 339 670.0 19.18 5112 1.6 340 670.7 19.40 5287 3.6 341 671.3 21.68 4889 8.7 342 671.3 19.29 5118 0.0 343 671.6 20.35 5119 0.3 344 671.6 20.96 4792 1.4 345 671.7 19.46 5450 0.1 346 671.7 19.29 5211 0.0 347 671.8 20.92 4715 3.7 348 671.9 20.90 4632 0.3 349 671.9 20.60 5029 0.0 350 671.9 19.38 5695 4.5 351 672.0 19.95 4504 2.4 352 672.1 18.85 5156 0.0 353 672.3 18.12 5434 0.0 354 672.3 19.18 5255 5.4 355 672.5 22.00 4593 0.0 356 672.6 21.58 4321 6.4 357 672.7 20.39 4921 6.5 358 673.0 16.29 6906 14.3 359 673.3 18.28 5730 3.7 360 673.3 19.37 4825 0.5 361 673.5 18.91 5388 0.0 362 673.5 16.41 5134 22.7 363 673.9 15.59 5346 0.0 364 674.3 18.71 4789 0.1 365 675.4 18.33 5330 2.7 366 675.7 17.90 5371 1.7 367 676.2 18.91 5771 0.0 368 676.5 20.32 4994 0.1 369 676.6 20.02 4854 0.0 370 676.8 24.00 4393 11.0 371 676.9 17.61 5884 0.0 372 677.3 19.35 4274 0.0 373 678.0 19.68 5147 0.1 374 678.1 18.73 5352 0.0 375 678.4 15.88 7668 0.0 376 678.8 20.05 5282 15.1 377 679.4 17.99 5369 1.5 378 679.5 16.97 6429 2.6 379 679.7 19.24 5387 1.5 380 679.8 19.20 5236 4.5 381 679.8 19.60 4830 0.3 382 680.1 20.54 5089 2.5 383 680.5 18.59 5390 0.5 384 681.3 15.60 6588 11.6 385 681.3 15.29 6197 11.4 386 681.6 17.66 4645 0.0 387 681.9 17.58 6490 1.4 388 682.2 22.33 5095 2.2 389 682.5 18.75 4842 5.3 390 682.5 18.10 6315 2.8 391 682.7 20.26 5399 0.0 392 683.3 18.80 5429 17.7 393 683.4 18.77 5644 0.0 394 
684.3 20.41 6060 0.0 395 684.3 18.65 5320 0.4 396 684.8 20.71 4820 6.1 397 685.0 22.00 5208 10.1 398 686.1 17.70 5860 3.4 399 686.7 21.48 4963 8.6 400 687.5 16.70 7614 12.3 401 689.1 19.58 5566 1.6 402 691.0 17.26 7040 1.5 403 691.3 17.38 6604 10.4 404 691.9 17.35 6180 6.1 405 694.0 16.26 6461 2.3 406 694.3 17.70 6415 2.5 407 694.8 20.13 5231 0.9 408 695.2 18.27 5838 2.4 409 695.3 14.54 7712 3.8 410 696.6 19.15 5593 2.0 411 698.2 17.37 5933 0.6 412 698.3 15.14 7593 2.8 413 698.4 17.84 6500 1.4 414 699.1 15.41 7217 1.2 415 700.3 18.87 5393 2.1 416 704.3 16.47 7290 6.0 417 706.8 17.86 5741 4.7 418 645.0 21.89 4403 24.3 419 672.2 20.20 4776 3.0 420 655.8 19.04 5993 5.0
;

/* Here we create a "dummy" variable that indicates whether a given observation
   is from a "small class" school district (dummy = 1) or a "large class"
   school district (dummy = 0). */
data combined;
  set combined;
  dummy = (st_ratio < 20);
run;

/* Here we run the regression equivalent of the exact t-test of the difference
   in means: the coefficient on dummy estimates the difference in mean score
   between the two groups. We assume that the variance in the two populations
   (small class vs. large class) is the same. */
proc reg data = combined;
  model score = dummy;
run;

/* Now, instead of breaking the score data into two groups and comparing the
   group means, we run a simple regression of score on the student-to-teacher
   ratio. Here we assume that the errors in the regression are homoskedastic.
   You can check this assumption by using SAS INSIGHT to produce a residual
   plot. */
proc reg data = combined;
  model score = st_ratio;
  output out = result r = resid;
run;

/* Use SAS INSIGHT and the data set "get" (created in the next data step) to
   plot the Ordinary Least Squares (OLS) residuals, labeled "resid", on the
   y-axis against the student-to-teacher ratio, labeled "st_ratio", on the
   x-axis. Do the residuals appear to be homoskedastic? Compare the scatter in
   the residuals when st_ratio < 20 with the scatter when st_ratio >= 20.
*/
data get;
  merge combined result;
run;

/* Here we generate the ingredients for an F-test of equal variances on the OLS
   residuals from the regression above. The null hypothesis is that the
   variance of the residuals of the small-classroom school districts is equal
   to the variance of the residuals of the large-classroom school districts.
   The alternative hypothesis is that the variances of the residuals of the two
   groups are not equal, and thus that there is heteroskedasticity in the
   errors of our regression model. If there is heteroskedasticity, we cannot
   base our tests of the significance of variables on the usual OLS results.
   We must either adjust the standard errors non-parametrically or re-estimate
   the model by Weighted Least Squares. */
data small;
  set get;
  if st_ratio < 20;
run;

/* Here we get the sample standard deviation of the residuals associated with
   the small-classroom school districts. */
proc means data = small;
  var resid;
run;

data large;
  set get;
  if st_ratio >= 20;
run;

/* Here we get the sample standard deviation of the residuals associated with
   the large-classroom school districts. */
proc means data = large;
  var resid;
run;

/* The F-statistic is the ratio of the sample variance of the residuals for the
   small-classroom school districts to the sample variance of the residuals for
   the large-classroom school districts. In other words,

     F = [std(small group residuals)]^2 / [std(large group residuals)]^2.

   Under the null hypothesis that the two groups have equal variances, this
   F-statistic has [N(small) - 1] numerator degrees of freedom and
   [N(large) - 1] denominator degrees of freedom. Below, F(p, df1, df2) denotes
   the value that leaves probability p in the right-hand tail. Because the test
   is two-sided, the rejection region is two-sided. The right-hand-side
   critical region is F > F(alpha/2, N(small)-1, N(large)-1). The
   left-hand-side critical region is F < F(1-alpha/2, N(small)-1, N(large)-1).
   In case your F-table only has right-hand-tail critical values, the following
   identity helps you calculate the left-hand-side critical value:

     F(1-alpha/2, N(small)-1, N(large)-1) = 1 / F(alpha/2, N(large)-1, N(small)-1).

   Notice the reversal of the numerator and denominator degrees of freedom in
   the above identity. Of course, if you have access to Microsoft EXCEL you can
   use the statistical function FINV(probability,degrees_freedom1,degrees_freedom2)
   with probability = 0.025 to get the right-hand-tail critical value and with
   probability = 0.975 to get the left-hand-tail critical value (for
   alpha = 0.05).

   To calculate the EXACT two-tailed p-value for the observed F-statistic x,
   use the EXCEL function FDIST(x,degrees_freedom1,degrees_freedom2) to get the
   right-tail p-value Pr(F > x). Then, to get the matching left-tail p-value,
   calculate [1 - FDIST(1/x,degrees_freedom1,degrees_freedom2)]. Notice that
   the argument in FDIST is 1/x and that the degrees of freedom stay in the
   same order. The subtraction from one is needed because FDIST gives the
   right-hand-tail probability and we want the left-tail probability. Finally,
   the two-tailed p-value is the sum of the left-tail and right-tail p-values.
   This recipe presumes x > 1; if x < 1, interchange the roles of x and 1/x. */
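
/* A minimal sketch, not part of the original program, of how the F-test
   described above might be carried out in SAS itself instead of EXCEL. The
   data set names "stats" and "ftest" and every variable created below are
   illustrative assumptions. The sketch relies on the variable "dummy"
   (1 = small class, 0 = large class) being present in the data set "get".
   Note that the SAS functions work with LEFT-tail probabilities:
   PROBF(x, df1, df2) returns Pr(F <= x) and FINV(p, df1, df2) returns the p-th
   quantile, whereas the EXCEL functions FDIST and FINV work with right-tail
   probabilities. */

/* Sample size and sample variance of the residuals, one row per group. */
proc means data = get noprint nway;
  class dummy;
  var resid;
  output out = stats n = n var = s2;
run;

data ftest;
  set stats end = last;
  retain n_small s2_small n_large s2_large;

  /* Collect the two group rows into a single observation. */
  if dummy = 1 then do;
    n_small = n;
    s2_small = s2;
  end;
  else do;
    n_large = n;
    s2_large = s2;
  end;

  if last then do;
    df1 = n_small - 1;            /* numerator degrees of freedom   */
    df2 = n_large - 1;            /* denominator degrees of freedom */
    f   = s2_small / s2_large;    /* observed F-statistic           */

    alpha   = 0.05;               /* assumed significance level     */
    f_upper = finv(1 - alpha/2, df1, df2);  /* right-hand critical value */
    f_lower = finv(alpha/2, df1, df2);      /* left-hand critical value  */

    /* Two-tailed p-value following the comment above: the right-tail area
       beyond the larger of f and 1/f plus the left-tail area below the
       smaller of the two, so the formula also covers the case f < 1. */
    p_two = (1 - probf(max(f, 1/f), df1, df2))
            + probf(min(f, 1/f), df1, df2);
    output;
  end;

  keep n_small n_large s2_small s2_large df1 df2 f alpha f_lower f_upper p_two;
run;

proc print data = ftest;
run;

/* Reject the null hypothesis of equal variances at level alpha if
   f > f_upper or f < f_lower, or equivalently for practical purposes if
   p_two is small. */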