* ------------------------------------------------ *
* This is a text log of a live regression analysis
* in Stata from class on Sep 14, 2017.
* ------------------------------------------------ *

. use RECS_subset1.dta
(RECS 2009)

. describe

Contains data from RECS_subset1.dta
  obs:        12,083                          RECS 2009
 vars:             7                          14 Sep 2017 08:02
 size:       193,328
------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------------------
doeid           int     %8.0g                 DOE id
regionc         byte    %8.0g                 Census Region
nweight         float   %9.0g                 NWEIGHT
yearmade        int     %8.0g                 Year Constructed
totsqft         int     %8.0g                 Total Square Feet
kwh             long    %12.0g                Energy Consumption (kwh)
urban           byte    %9.0g      urban_rural
                                              Urban / Rural
------------------------------------------------------------------------------
Sorted by:

. by regionc: summarize yearmade
not sorted
r(5);

. sort regionc

. describe

Contains data from RECS_subset1.dta
  obs:        12,083                          RECS 2009
 vars:             7                          14 Sep 2017 08:02
 size:       193,328
------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------------------
doeid           int     %8.0g                 DOE id
regionc         byte    %8.0g                 Census Region
nweight         float   %9.0g                 NWEIGHT
yearmade        int     %8.0g                 Year Constructed
totsqft         int     %8.0g                 Total Square Feet
kwh             long    %12.0g                Energy Consumption (kwh)
urban           byte    %9.0g      urban_rural
                                              Urban / Rural
------------------------------------------------------------------------------
Sorted by: regionc

. by regionc: summarize yearmade

------------------------------------------------------------------------------
-> regionc = 1

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    yearmade |      2,266    1957.739    26.68195       1920       2009

------------------------------------------------------------------------------
-> regionc = 2

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    yearmade |      2,843    1967.567    25.84303       1920       2009

------------------------------------------------------------------------------
-> regionc = 3

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    yearmade |      4,090    1978.948    21.49131       1920       2009
--Break--
r(1);

. bysort regionc: summarize yearmade

------------------------------------------------------------------------------
-> regionc = 1

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    yearmade |      2,266    1957.739    26.68195       1920       2009

------------------------------------------------------------------------------
-> regionc = 2

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    yearmade |      2,843    1967.567    25.84303       1920       2009

------------------------------------------------------------------------------
-> regionc = 3

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    yearmade |      4,090    1978.948    21.49131       1920       2009

------------------------------------------------------------------------------
-> regionc = 4

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    yearmade |      2,884    1973.793     21.4755       1920       2009

.
.
.
. by regionc, sort: summarize yearmade

------------------------------------------------------------------------------
-> regionc = 1

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    yearmade |      2,266    1957.739    26.68195       1920       2009

------------------------------------------------------------------------------
-> regionc = 2

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    yearmade |      2,843    1967.567    25.84303       1920       2009

------------------------------------------------------------------------------
-> regionc = 3

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    yearmade |      4,090    1978.948    21.49131       1920       2009

------------------------------------------------------------------------------
-> regionc = 4

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    yearmade |      2,884    1973.793     21.4755       1920       2009

. display r(mean)
1973.793

. summarize yearmade

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    yearmade |     12,083    1971.062    24.81791       1920       2009

. display r(mean)
1971.0624

. regress kwh yearmade

      Source |       SS           df       MS      Number of obs   =    12,083
-------------+----------------------------------   F(1, 12081)     =    572.66
       Model |  3.1926e+10         1  3.1926e+10   Prob > F        =    0.0000
    Residual |  6.7352e+11    12,081  55749983.6   R-squared       =    0.0453
-------------+----------------------------------   Adj R-squared   =    0.0452
       Total |  7.0544e+11    12,082  58387797.5   Root MSE        =    7466.6

------------------------------------------------------------------------------
         kwh |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    yearmade |   65.49926   2.737081    23.93   0.000     60.13414    70.86437
       _cons |    -117815   5395.386   -21.84   0.000    -128390.8   -107239.1
------------------------------------------------------------------------------

. display e(r2_a)
.04517749

. generate year_since1920 = yearmade - 1920

. regress kwh year_since1920 totsqf

      Source |       SS           df       MS      Number of obs   =    12,083
-------------+----------------------------------   F(2, 12080)     =   1288.92
       Model |  1.2406e+11         2  6.2032e+10   Prob > F        =    0.0000
    Residual |  5.8138e+11    12,080  48127243.5   R-squared       =    0.1759
-------------+----------------------------------   Adj R-squared   =    0.1757
       Total |  7.0544e+11    12,082  58387797.5   Root MSE        =    6937.4

------------------------------------------------------------------------------
         kwh |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
year_si~1920 |   50.95976   2.564703    19.87   0.000     45.93253    55.98699
     totsqft |   1.915754   .0437839    43.75   0.000     1.829931    2.001578
       _cons |   4524.337   164.1721    27.56   0.000     4202.534    4846.141
------------------------------------------------------------------------------

. display %3.1f 100*e(r2_a)
17.6

. tabluate regionc
command tabluate is unrecognized
r(199);

. tabulate regionc

     Census |
     Region |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |      2,266       18.75       18.75
          2 |      2,843       23.53       42.28
          3 |      4,090       33.85       76.13
          4 |      2,884       23.87      100.00
------------+-----------------------------------
      Total |     12,083      100.00

. regress kwh year_since1920 regionc

      Source |       SS           df       MS      Number of obs   =    12,083
-------------+----------------------------------   F(2, 12080)     =    295.37
       Model |  3.2889e+10         2  1.6445e+10   Prob > F        =    0.0000
    Residual |  6.7255e+11    12,080  55674824.6   R-squared       =    0.0466
-------------+----------------------------------   Adj R-squared   =    0.0465
       Total |  7.0544e+11    12,082  58387797.5   Root MSE        =    7461.6

------------------------------------------------------------------------------
         kwh |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
year_si~1920 |   62.55812   2.825115    22.14   0.000     57.02045     68.0958
     regionc |    279.904   67.27818     4.16   0.000     148.0279      411.78
       _cons |   7358.114   209.5708    35.11   0.000     6947.322    7768.906
------------------------------------------------------------------------------

. regress kwy year_since1920 i.regionc
variable kwy not found
r(111);

. regress kwh year_since1920 i.regionc

      Source |       SS           df       MS      Number of obs   =    12,083
-------------+----------------------------------   F(4, 12078)     =    514.99
       Model |  1.0279e+11         4  2.5696e+10   Prob > F        =    0.0000
    Residual |  6.0266e+11    12,078  49897003.6   R-squared       =    0.1457
-------------+----------------------------------   Adj R-squared   =    0.1454
       Total |  7.0544e+11    12,082  58387797.5   Root MSE        =    7063.8

------------------------------------------------------------------------------
         kwh |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
year_si~1920 |   43.42686   2.723126    15.95   0.000      38.0891    48.76463
             |
     regionc |
          2  |   2932.142   200.7163    14.61   0.000     2538.706    3325.578
          3  |   5908.107   193.7921    30.49   0.000     5528.243     6287.97
          4  |   360.2732   203.0577     1.77   0.076    -37.75247    758.2988
             |
       _cons |    6294.94   180.5021    34.87   0.000     5941.127    6648.753
------------------------------------------------------------------------------

. summarize kwh

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         kwh |     12,083    11288.16    7641.191         17     150254

. generate lkwh = log(kwh)

. regress year_since1920 totsqf i.regionc

      Source |       SS           df       MS      Number of obs   =    12,083
-------------+----------------------------------   F(4, 12078)     =    417.75
       Model |  904420.999         4   226105.25   Prob > F        =    0.0000
    Residual |  6537229.95    12,078  541.251031   R-squared       =    0.1215
-------------+----------------------------------   Adj R-squared   =    0.1212
       Total |  7441650.95    12,082  615.928733   Root MSE        =    23.265

------------------------------------------------------------------------------
year_si~1920 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     totsqft |   .0027759   .0001475    18.81   0.000     .0024867    .0030651
             |
     regionc |
          2  |   9.032937   .6565259    13.76   0.000     7.746041    10.31983
          3  |   21.81215   .6100975    35.75   0.000     20.61626    23.00803
          4  |   17.00249   .6550354    25.96   0.000     15.71852    18.28647
             |
       _cons |   31.46533   .5916405    53.18   0.000     30.30562    32.62504
--Break--
r(1);

. regress lkwh year_since1920 totsqf i.regionc,

      Source |       SS           df       MS      Number of obs   =    12,083
-------------+----------------------------------   F(5, 12077)     =   1117.69
       Model |  1858.06765         5   371.61353   Prob > F        =    0.0000
    Residual |  4015.39966    12,077  .332483205   R-squared       =    0.3163
-------------+----------------------------------   Adj R-squared   =    0.3161
       Total |  5873.46732    12,082  .486133696   Root MSE        =    .57661

--------------------------------------------------------------------------------
          lkwh |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
year_since1920 |   .0029855   .0002255    13.24   0.000     .0025434    .0034275
       totsqft |   .0001798   3.71e-06    48.46   0.000     .0001725     .000187
               |
       regionc |
            2  |   .2813622   .0163989    17.16   0.000     .2492177    .3135066
            3  |   .6617856   .0159012    41.62   0.000     .6306168    .6929545
            4  |   .1126703   .0166816     6.75   0.000     .0799717    .1453689
               |
         _cons |   8.252236   .0162904   506.57   0.000     8.220304    8.284168
--------------------------------------------------------------------------------

. regress lkwh year_since1920 totsqf i.regionc, eform("%Change") cformat("%5.3f")

      Source |       SS           df       MS      Number of obs   =    12,083
-------------+----------------------------------   F(5, 12077)     =   1117.69
       Model |  1858.06765         5   371.61353   Prob > F        =    0.0000
    Residual |  4015.39966    12,077  .332483205   R-squared       =    0.3163
-------------+----------------------------------   Adj R-squared   =    0.3161
       Total |  5873.46732    12,082  .486133696   Root MSE        =    .57661

--------------------------------------------------------------------------------
          lkwh |    %Change   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
year_since1920 |      1.003      0.000    13.24   0.000        1.003       1.003
       totsqft |      1.000      0.000    48.46   0.000        1.000       1.000
               |
       regionc |
            2  |      1.325      0.022    17.16   0.000        1.283       1.368
            3  |      1.938      0.031    41.62   0.000        1.879       2.000
            4  |      1.119      0.019     6.75   0.000        1.083       1.156
               |
         _cons |   3836.195     62.493   506.57   0.000     3715.633    3960.668
--------------------------------------------------------------------------------

. replace totsqft = totsqft / 100
variable totsqft was int now float
(12,083 real changes made)

. regress lkwh year_since1920 totsqf i.regionc, eform("%Change") cformat("%5.3f")

      Source |       SS           df       MS      Number of obs   =    12,083
-------------+----------------------------------   F(5, 12077)     =   1117.69
       Model |  1858.06765         5   371.61353   Prob > F        =    0.0000
    Residual |  4015.39967    12,077  .332483205   R-squared       =    0.3163
-------------+----------------------------------   Adj R-squared   =    0.3161
       Total |  5873.46732    12,082  .486133696   Root MSE        =    .57661

--------------------------------------------------------------------------------
          lkwh |    %Change   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
year_since1920 |      1.003      0.000    13.24   0.000        1.003       1.003
       totsqft |      1.018      0.000    48.46   0.000        1.017       1.019
               |
       regionc |
            2  |      1.325      0.022    17.16   0.000        1.283       1.368
            3  |      1.938      0.031    41.62   0.000        1.879       2.000
            4  |      1.119      0.019     6.75   0.000        1.083       1.156
               |
         _cons |   3836.195     62.493   506.57   0.000     3715.633    3960.668
--------------------------------------------------------------------------------

. regress lkwh c.totsqf c.year_since1920 i.regionc /// split across lines
/ invalid name
r(198);

. regress lkwh e1920#i.regionc, eform("%Change") cformat("%5.3f")
variable e1920 not found
r(111);

. regress lkwh c.totsqf c.year_since1920 i.regionc c.year_since1920#i.regionc, eform("%Change") cformat("%5.3f")

      Source |       SS           df       MS      Number of obs   =    12,083
-------------+----------------------------------   F(8, 12074)     =    705.73
       Model |  1871.38315         8  233.922894   Prob > F        =    0.0000
    Residual |  4002.08417    12,074  .331462992   R-squared       =    0.3186
-------------+----------------------------------   Adj R-squared   =    0.3182
       Total |  5873.46732    12,082  .486133696   Root MSE        =    .57573

------------------------------------------------------------------------------------------
                    lkwh |    %Change   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
                 totsqft |      1.018      0.000    48.39   0.000        1.017       1.019
          year_since1920 |      1.004      0.000     9.08   0.000        1.003       1.005
                         |
                 regionc |
                      2  |      1.491      0.046    12.95   0.000        1.404       1.584
                      3  |      2.119      0.071    22.29   0.000        1.984       2.264
                      4  |      1.049      0.038     1.34   0.181        0.978       1.125
                         |
regionc#c.year_since1920 |
                      2  |      0.997      0.001    -4.41   0.000        0.996       0.998
                      3  |      0.998      0.001    -3.12   0.002        0.997       0.999
                      4  |      1.001      0.001     1.27   0.202        1.000       1.002
                         |
                   _cons |   3677.473     81.493   370.48   0.000     3521.153    3840.734
------------------------------------------------------------------------------------------

. regress lkwh c.totsqf c.year_since1920##i.regionc, eform("%Change") cformat("%5.3f")

      Source |       SS           df       MS      Number of obs   =    12,083
-------------+----------------------------------   F(8, 12074)     =    705.73
       Model |  1871.38315         8  233.922894   Prob > F        =    0.0000
    Residual |  4002.08417    12,074  .331462992   R-squared       =    0.3186
-------------+----------------------------------   Adj R-squared   =    0.3182
       Total |  5873.46732    12,082  .486133696   Root MSE        =    .57573

------------------------------------------------------------------------------------------
                    lkwh |    %Change   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
                 totsqft |      1.018      0.000    48.39   0.000        1.017       1.019
          year_since1920 |      1.004      0.000     9.08   0.000        1.003       1.005
                         |
                 regionc |
                      2  |      1.491      0.046    12.95   0.000        1.404       1.584
                      3  |      2.119      0.071    22.29   0.000        1.984       2.264
                      4  |      1.049      0.038     1.34   0.181        0.978       1.125
                         |
regionc#c.year_since1920 |
                      2  |      0.997      0.001    -4.41   0.000        0.996       0.998
                      3  |      0.998      0.001    -3.12   0.002        0.997       0.999
                      4  |      1.001      0.001     1.27   0.202        1.000       1.002
                         |
                   _cons |   3677.473     81.493   370.48   0.000     3521.153    3840.734
------------------------------------------------------------------------------------------

. generate year_sqft = yearmade*totsqf

. correlate yearmade totsqf year_sqft
(obs=12,083)

             | yearmade  totsqft year_s~t
-------------+---------------------------
    yearmade |   1.0000
     totsqft |   0.1296   1.0000
   year_sqft |   0.1491   0.9997   1.0000

. generate year1920_sqft = year_since1920*totsqft

. correlate year_since1920 totsqf year1920_sqft
(obs=12,083)

             | yea~1920  totsqft year19~t
-------------+---------------------------
year_si~1920 |   1.0000
     totsqft |   0.1296   1.0000
year1920_s~t |   0.6021   0.8006   1.0000

. quietly summarize yearmade

. generate c_year = (yearmade - r(mean))/10 // Make unit decades
10/ invalid name
r(198);

. label variable c_year "10 Years (centered)"
variable c_year not found
r(111);

. generate c_year = (yarmade - r(mean)) / 10
yarmade not found
r(111);

. generate c_year = (yearmade - r(mean))/10

. quietly summarize totsqf

. generate c_totsqf = (totsqf - r(mean))

. generate cyear_csqft = c_totsqf*c_year

. correlate c_year c_totsqf cyear_csqft
(obs=12,083)

             |   c_year c_totsqf cyear_~t
-------------+---------------------------
      c_year |   1.0000
    c_totsqf |   0.1296   1.0000
 cyear_csqft |   0.1232   0.2486   1.0000

. regress lkwh c.c_totsqf##c.c_year i.regionc, eform("%Change")

      Source |       SS           df       MS      Number of obs   =    12,083
-------------+----------------------------------   F(6, 12076)     =    953.65
       Model |   1888.2726         6    314.7121   Prob > F        =    0.0000
    Residual |  3985.19471    12,076  .330009499   R-squared       =    0.3215
-------------+----------------------------------   Adj R-squared   =    0.3212
       Total |  5873.46732    12,082  .486133696   Root MSE        =    .57446

-------------------------------------------------------------------------------------
               lkwh |    %Change   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
           c_totsqf |   1.019024   .0003878    49.52   0.000     1.018264    1.019785
             c_year |   1.032284   .0023286    14.09   0.000      1.02773    1.036859
                    |
c.c_totsqf#c.c_year |   .9986472   .0001413    -9.57   0.000     .9983703    .9989243
                    |
            regionc |
                 2  |   1.317106    .021534    16.85   0.000     1.275565    1.359999
                 3  |   1.934322   .0306461    41.64   0.000     1.875174    1.995335
                 4  |   1.120045   .0186147     6.82   0.000     1.084146    1.157134
                    |
              _cons |   6657.254   83.08795   705.36   0.000     6496.365    6822.129
-------------------------------------------------------------------------------------

. log close
      name:
       log:  /afs/umich.edu/user/j/b/jbhender/Stats506/Stata/RECS/macros.txt
  log type:  text
 closed on:  14 Sep 2017, 11:15:52
--------------------------------------------------------------------------------------------------------
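
* Note on the eform() columns in the log above: the values under "%Change"
* appear to be exp(coefficient) from the log-scale fit, i.e. multiplicative
* factors on kwh rather than percentages (1.325 is roughly 32.5% higher).
* The sketch below is not part of the class session and is written in Python
* rather than Stata; it re-derives a few of those numbers from coefficients
* copied out of the log. The names log_coefs and eform are illustrative only.

```python
import math

# Coefficients copied from the log-scale fit in the log above
# (regress lkwh year_since1920 totsqf i.regionc, before eform()).
log_coefs = {
    "regionc 2": 0.2813622,
    "regionc 3": 0.6617856,
    "regionc 4": 0.1126703,
    "_cons": 8.252236,
}


def eform(b):
    """Exponentiate a log-scale coefficient (hypothetical helper)."""
    return math.exp(b)


for name, b in log_coefs.items():
    print(f"{name:>10}  coef = {b:.7f}  exp(coef) = {eform(b):.3f}")

# Dividing totsqft by 100 multiplies its coefficient by 100, which is
# why the displayed factor moves from 1.000 to 1.018 in the log.
b_sqft = 0.0001798
print(f"per 1 sq ft:   {eform(b_sqft):.3f}")
print(f"per 100 sq ft: {eform(100 * b_sqft):.3f}")
```

* The t statistics and p-values are identical with and without eform() in
* the log, which is consistent with eform() changing only how the fit is
* displayed.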