UNITED STATES DEPARTMENT OF ENERGY
MEETING OF THE AMERICAN STATISTICAL ASSOCIATION (ASA) COMMITTEE ON ENERGY STATISTICS WITH THE ENERGY INFORMATION ADMINISTRATION (EIA)
Washington, D.C.
Friday, October 29, 2004

PARTICIPANTS:
F. JAY BREIDT
NICOLAS HENGARTNER
JOHNNY BLAIR
MARK BURTON
MOSHE FEDER
BARBARA FORSYTH
NEHA KHANA
NAGARAJ K. NEERCHAL
SUSAN M. SEREIKA
RANDY R. SITTER
HOWARD BRADSHER-FREDRICK
ROBERT RUTCHIK
NANCY KIRKENDALL
PRESTON McDOWNEY
GUY CARUSO
TOM BROENE
HENRY S. BROOKS
BRENDA COX
GRACE SUTHERLAND
SHAWNA WAUGH
TOM LORENZ
PHILLIP TSENG
JOHN WOOD
HOWARD GRUENSPECHT
WILLIAM WEINIG
INDUJIT KUNDRA
JOE SEDRANSK
KAREN NORMAN

BETA REPORTING & VIDEOGRAPHY SERVICES www.betareporting.com (202) 638-2400 1-800-522-2382 (703) 684-2382

* * * * *

CONTENTS
AGENDA:
Data Analysis on the EIA-826/906
Post-Stratification Methodology for the 2002 Manufacturing Energy Consumption Survey (MECS)
Time Series Edits for the Electric Power EIA-906
If you were King
Committee Suggestions

* * * * *

PROCEEDINGS
(8:37 a.m.)
CHAIRMAN BREIDT: Okay, I guess we'll go ahead and start the meeting. First I'd like to ask any committee member, guest, EIA staff, or member of the public who was not present yesterday to introduce yourself from one of the microphones.
MR. WEINIG: If you are --
CHAIRMAN BREIDT: In that category?
MR. WEINIG: Directly under a speaker in the ceiling, it will cause the thing to wind and break, so we'll use the other microphone.
SPEAKER: Okay.
MR. COLE: My name is Stacey Cole.
I'm from the Bureau of Census, I'll be speaking here today about the post stratification we did for the MECS survey.
MR. WEINIG: Anybody else?
MR. SLANDA: John Slanda, U.S. Bureau of the Census. I'm with Stacey Cole.
MS. BUCCI: Susan Bucci, U.S. Census Bureau and I'm with Rick.
MR. WEINIG: Anybody else?
SPEAKER: Okay.
SPEAKER: Okay.
SPEAKER: I think that's it, thank you.
CHAIRMAN BREIDT: Okay, thank you. And anyone who did not sign in yesterday is asked to do so, out front at the break or before you leave. Lunch for the committee will be held at the conclusion of this session on suggestions for the spring agenda, lunch is downstairs as usual, and we'll probably be discussing a variety of things including future committee appointments and so on. So with that I will turn it over to Joe Sedransk, who will be speaking on data analysis on the EIA-826/906.
MR. SEDRANSK: You're doing the pushing, or I might as well do it. This is of course joint work, so as one typically says when giving the talk, the other -- the co-author will answer all the questions. At any rate, our work is an evaluation of methodology in current use, but we're not talking about things we've invented, and we need to make recommendations to CNEAF about several things including about methodology and possibly there's been talk about changing the sample. This is just rumor.
But anyway this is the -- background is to estimate -- we're trying to estimate total monthly sales and revenues for all non-IOUs by end user; first of all, IOU is investor owned utility, and the end users are residential, commercial, industrial and other; in all these discussions just forget about other, we'll just talk about the first three.
So there are two variables: sales and revenues. Other entities such as IOUs and wholesalers are censused. So we're really just talking about the non-IOUs essentially. 861 is an annual census, it covers everyone and it provides a frame for the 826, which is a monthly survey used to provide estimates of total sales and revenue.
And to give you an idea, the sample sizes were about 150, 160, 170. There are over 2000 in the population of non-IOUs. Sample design is a cut off sample which was developed in the early 90s, and at that time it included the IOUs which are now censused, and of course there have been additions and deletions since then.
Sample coverage rates, I cite below; they range from 76 percent to 88 percent, depending on the variable and end user, 68 percent to 95 percent by geographic region. The estimation procedure: very standard, linear regression through the origin with a slope different for different geographic regions.
So Y is the current 826 value for company I and region R, X is the past census value for that company in the region. The only thing to bring to your attention is the variance, sigma squared, proportional to X to the two gamma, and the current usage has gamma as 0.8.
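[Editor's illustration, not part of the proceedings. The model just described, regression through the origin within a region with error variance proportional to X to the two gamma, implies a weighted least squares slope with weights X to the minus two gamma. The following is a minimal sketch under that reading; the function names and the toy numbers are hypothetical and are not EIA's production code or data.]

```python
def wls_slope_through_origin(x, y, gamma=0.8):
    """Weighted least squares slope for y = beta * x,
    assuming Var(error_i) is proportional to x_i ** (2 * gamma)."""
    weights = [xi ** (-2.0 * gamma) for xi in x]
    numerator = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    denominator = sum(w * xi * xi for w, xi in zip(weights, x))
    return numerator / denominator


def residual_variance(x, y, beta_hat, gamma=0.8):
    """Estimate sigma squared from the weighted residual sum of squares,
    using n - 1 degrees of freedom (one slope is estimated)."""
    n = len(x)
    weighted_ss = sum(
        (yi - beta_hat * xi) ** 2 / xi ** (2.0 * gamma) for xi, yi in zip(x, y)
    )
    return weighted_ss / (n - 1)


# Hypothetical toy data: x = past census value, y = current monthly value.
x_toy = [100.0, 250.0, 400.0, 1200.0]
y_toy = [9.0, 20.0, 35.0, 98.0]
beta_hat = wls_slope_through_origin(x_toy, y_toy)
sigma2_hat = residual_variance(x_toy, y_toy, beta_hat)
print(beta_hat, sigma2_hat)
```

[With gamma set to 0.5 the same formula reduces to the classical ratio estimator, the sum of Y over the sum of X, which is one way to see what the choice of gamma controls.]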
That was developed in the '90s, long before I had any connection with this, thought to be desirable, and I could -- we could discuss this afterwards.
The current method is simply, add up the units, the sampled units, to get this sum for a State. For example, add up the sample units and make predictions for the non sampled ones. It's possible that there are some added -- there are some outliers and that might not be included in estimating beta hat for the non sampled ones but they would always be added in for the -- to get the totals, assuming that there has been a check for outliers and that the value is correct.
Evaluation: what we've been doing is looking at standard exploratory analysis using scatter plots, standardized residual plots and so forth. Then I want to point out the ten regions that are currently being used for estimation, that is related to the beta R's, they depend on region, and the R's here -- bring your attention to the fact that NEA is very large.
These were developed, I think again in a year -- early to the mid '90s, based on temperature and climate considerations. NEA is very large; it ranges from and -- possibly diverse from Maine all the way down to Maryland. The other thing to bring to your attention is West, which is somewhere in the middle of California divide; there are only three sample companies in that region.
So with this large number of regions, the sample sizes as you see, as we go through here, are not too -- pretty small. Four regions are aggregations of the 10 that we've looked at at various times.
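[Editor's illustration, not part of the proceedings. The current method as described above, summing the reported values of the sampled units and predicting the non sampled ones from the regional slope, with flagged outliers left out of the slope but still counted in the total, might be sketched as follows. The function name, the tuple layout and the example figures are hypothetical.]

```python
def estimate_total(sampled, nonsampled_x, gamma=0.8):
    """Estimate a regional total.

    sampled      : list of (x, y, is_outlier) for sampled companies, where
                   x is the past census value and y the current monthly value
    nonsampled_x : list of past census values for non sampled companies
    """
    # Outliers are excluded from the slope estimate only.
    fit = [(x, y) for x, y, is_outlier in sampled if not is_outlier]
    numerator = sum(x ** (1.0 - 2.0 * gamma) * y for x, y in fit)
    denominator = sum(x ** (2.0 - 2.0 * gamma) for x, _ in fit)
    beta_hat = numerator / denominator

    # Every sampled value, outlier or not, is added in as reported
    # (assumed to have been checked with the company).
    observed_part = sum(y for _, y, _ in sampled)
    predicted_part = beta_hat * sum(nonsampled_x)
    return observed_part + predicted_part


# Hypothetical example: the third sampled company is a verified outlier.
sampled_toy = [(10.0, 20.0, False), (20.0, 40.0, False), (30.0, 300.0, True)]
print(estimate_total(sampled_toy, nonsampled_x=[5.0, 5.0]))
```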
Occasionally, we'll talk about all the data, and sometimes about all the data except for those that have been deemed outliers or influential and those are the criteria we've actually -- we've been using.
That's what we've been using for investigation as opposed to what is being used in the current estimation. Here are a couple of -- about the five scatter plots on top, standardized residual plots down below. Essentially if you've looked through them on the website, there's very little that we can find to suggest that the model doesn't fit. If you look carefully enough, you'll find examples where there are things that you might be a little unhappy about.
But generally the residual plots look okay. Here's one, here's another one. I picked ones where there are a decent number of points, like 14 in this case, where you could see something. Some of the others, you can't see much. If you have only four or five points, you just can't tell anything. Here's another one with nine. Then I put two and here's one which looks like an outlying observation in the top for industrial sales and it's minus 3 point -- larger than minus 3.5 here and, here's another one, which probably -- it's probably an influential point.
Just a quick comment; I apologize that, there was a gremlin that got in, there should be a second set of plots and that should've been on the website. But a gremlin got in and the ones, the plots, which were "the after," these are "the before," this is the -- you know, in other words, this is all the data, the ones where the things were removed, never got into the data set, never got into the -- okay.
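[Editor's illustration, not part of the proceedings. A standardized residual of the kind plotted in these slides can be computed by dividing each raw residual by its model standard deviation, sigma times X to the gamma. The sketch below assumes that definition; the cutoff is a placeholder parameter, since the transcript does not state the criterion actually used, and the names are hypothetical.]

```python
import math


def standardized_residuals(x, y, gamma=0.8):
    """Standardized residuals for regression through the origin with
    Var(error_i) = sigma^2 * x_i ** (2 * gamma)."""
    numerator = sum(xi ** (1.0 - 2.0 * gamma) * yi for xi, yi in zip(x, y))
    denominator = sum(xi ** (2.0 - 2.0 * gamma) for xi in x)
    beta_hat = numerator / denominator

    # Residuals scaled by x ** gamma have constant variance under the model.
    scaled = [(yi - beta_hat * xi) / xi ** gamma for xi, yi in zip(x, y)]
    sigma_hat = math.sqrt(sum(s * s for s in scaled) / (len(x) - 1))
    return [s / sigma_hat for s in scaled]


def flag_candidates(x, y, cutoff, gamma=0.8):
    """Indices whose absolute standardized residual exceeds the cutoff.
    The cutoff is an assumption supplied by the caller."""
    residuals = standardized_residuals(x, y, gamma)
    return [i for i, r in enumerate(residuals) if abs(r) > cutoff]
```

[With only a handful of points in a region the largest attainable standardized residual is limited by the sample size, which is consistent with the remark that with four or five points you cannot tell anything.]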
Issues and questions: choice of criteria for deleting observed values and classifying them as outliers; we need to make a recommendation about that. Second one: composition of estimation groups. I've just given you some examples of 11 regions, why 10, why 11? It's Hawaii, Hawaii is censused, and sometimes we call them 10 and sometimes we call them 11, and that's my fault, it's really 11.
Smaller number of regions, the four I cited before, these were examples. Other regions could be -- you could use other criteria for defining the regions based on temperature and climate. Another completely different category is ownership, which we've looked at. We don't use the IOUs at all to make any predictions for the non-IOUs. It turns out that these seemed to be different, different in terms of the regression coefficients.
But there's a possibility of using them within the non-IOUs, there are also some sub-sets, including things like the federal entity, the federal company -- federal installations. Unfortunately they may be different but we can't do much about it because there aren't very many of them. I think something like six to eight. (b) was something I was looking for advice about; it's fairly clear, in some cases, how you work from the ground up?
How would one be using ownership or geography is sort of clear, but suppose you just started with all these companies and so you say, how am I going to form the strata just from basic micro data? Not so obvious to me. You can get the estimates of slopes, but how trustworthy they are, not so clear. So this is just methodology working from the ground up to form strata.
Okay, number three, deals with macro-level. We have a lot of extra data that we're not making any use of, and perhaps that can be used to retain precision and reduce the sample size. And marked here, remember there are two variables and three end users, and here are a couple of plots. Why don't you look at the bottom one because the colors are sharp and the before and after is before -- all the data and "the after" is outliers.
Clearly there is a time series pattern. What's plotted here are the estimated regression coefficients over a 24-month period starting from January '02 to December '03. And so each point is the estimated regression coefficient for a region. So first thing to notice, there's clearly a time-series pattern for this. As a Bayesian, what I would do -- or contemplate doing, is, suppose I want to make an estimate for September '03, I might use the data prior to that to do a forecast, and let the forecast mean and the forecast variance be the empirical mean and variance of a prior distribution to put together.
I mean this is just a suggestion; we're looking for other suggestions as well. I want to -- before going on, I want to point, this is residential, this is important, I want to point out one thing so I don't have to go back to this slide. Notice towards September -- let's take any cross-sectional view, sometimes looks like these beta hats are very -- they look like they're estimating the same thing. In other periods it looks like there are other groupings.
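[Editor's illustration, not part of the proceedings. One simple way to read the suggestion above, using a forecast from earlier months as a prior for the current month's slope, is a normal prior combined with a normal sampling distribution for the current beta hat. The sketch below shows only that textbook conjugate update; the normality assumptions, the function name and the numbers are the editor's, not the speaker's.]

```python
def combine_forecast_and_estimate(forecast_mean, forecast_var, estimate, estimate_var):
    """Posterior mean and variance for a slope, assuming
    prior:      beta ~ Normal(forecast_mean, forecast_var)
    data:   beta_hat ~ Normal(beta, estimate_var)."""
    prior_precision = 1.0 / forecast_var
    data_precision = 1.0 / estimate_var
    posterior_var = 1.0 / (prior_precision + data_precision)
    posterior_mean = posterior_var * (
        prior_precision * forecast_mean + data_precision * estimate
    )
    return posterior_mean, posterior_var


# Hypothetical numbers: a forecast of 1.00 for the month and a direct
# regression estimate of 1.20, each with variance 0.04.
print(combine_forecast_and_estimate(1.00, 0.04, 1.20, 0.04))
```

[The posterior variance is never larger than either input variance, which is the sense in which the extra months could buy precision or allow a smaller sample.]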
That there are groupings but it is not everything together, and some methodology, which I've been working on for years, might be useful here, in trying to improve the precision, letting the data tell which sorts of regression should be combined. So I'll come back to that.
Here's another one, commercial, these are again the beta hats, same plot, not as much seasonality as there was on the residential. And then let's look at industrial and the reason -- one reason is, this is much less here as well, but note in the bottom the spike for North-West. And I think these plots -- the reason I'm pointing this out is this is sort of pointed out, something that may be very well known to everybody in CNEAF, but this is sort of an interesting thing and I'll come back to it in a second.
These are comparable plots for the estimates of sigma squared, the residual variance. It was my assumption before we started that in fact sigma squared might be quite stable, and you could use some of this to improve the estimation of sigma squared. This is residential, here's commercial, there's another, sort of wild, but most of them behave very -- they don't change much over time.
This is the one I wanted to point to; look at the industrial, look at the bottom; that's that same region we had before. So, in the summer -- spring/summer of '03, there was both a change in beta hat and this large increase in sigma squared, so something went on. And this may be fairly useful, we have not tracked this down yet. I've asked one person who has a hypothesis about what happened, but this may be a useful thing to look at.
Point number four is about use of company level longitudinal data, and there are two examples here. What is plotted here is again same 24 months, each point is for a region, and what's plotted is the monthly value divided by the annual, divided by 12. And the annual divided by 12 is to put everything on the same scales, so they should operate around one. And the dark thicker blue line is the estimated regression coefficient for that region over the 24 months.
What are the reasons for this -- the possible use, utility of these plots. One is to see whether the large and small companies, all of them, seem to have the same patterns; that would be a validation -- informal validation of the model. The second is, if you see a lot of divergence, maybe the stratification -- post stratification isn't any good. So here's the second one, this looks even better from my perspective. This is a second one, the difference between these two, these are both residential sales, one is NWE and one was another.
We've looked at some other -- a couple of others, which don't look quite as good as this. But suggestions looked for here about what to do with these data of this sort. And the last one is what I referred to before but I want to rephrase now; you have a choice of a large number of estimation groups and of course the advantage of that or post strata, large advantage of that is greater homogeneity within the groups. Disadvantage is small sample sizes within each of the groups.
Conversely you can have very few estimation groups and you have a larger and then you get more heterogeneity but larger sample sizes.
Methodology that is referred to, if you've all done your homework and looked at least at the paper I wrote, alluded to this research, which is the data -- a data-based pooling of like regression coefficients. What it is, is an extension of the standard shrinkage methodology.
Shrinkage methodology is kind of dumb; it just says which things to put together. This method is dynamic and it basically takes the data and tells you which things to put together; it works quite well. So the question here with regard to this is any suggestions about that or other ways of choosing between large number of estimation groups or -- and with the consequent gains and benefits and other things.
Here are the questions for the committee; they're just actually a repetition of the five points I just went through now, and good point to stop. The first two are things for which we need to make recommendations, the last three, any suggestions about things other than what we've talked about or comments about anything.
CHAIRMAN BREIDT: Okay, thanks. Let me start with quick questions and you talked about deleting an outlier, so that really means just setting it into -- setting it outside the regression, using it as an observed value?
MR. SEDRANSK: Yeah, we're assuming that when such an outlier occurs that there is contact with the company to make sure the value is okay.
CHAIRMAN BREIDT: It's not an omission?
MR. SEDRANSK: Yeah.
CHAIRMAN BREIDT: Okay.
MR. SEDRANSK: And that's an important -- I mean, that's a -- I see Scott's up there -- you know, I mean, that's a standard part of practice is going back.
Right now there are programs that do these, what we call, scatter plot edits and they have been used heavily in the past, now not being used very much. We think they should be used a lot more.
CHAIRMAN BREIDT: And is this cut off sample maintained or is it pretty much the same thing?
MR. SEDRANSK: No in fact that it was a cut off sample and it's now -- no, it's not maintained.
MS. KIRKENDALL: It needs to be evaluated again?
MR. SEDRANSK: It needs to be evaluated. In fact that was what we were saying before, as I was saying at the beginning, there was a rumor about changing this sample composition. What happens, I think it would be fair to say is that, somebody perceives a problem, maybe in one of the sectors and in one area and just adds some sample size to it and it's not done in an overall principled way, so in fact, it's not maintained, so this is no longer, you know, it's no longer a cut off sample.
DR. NEERCHAL: Why does the beta estimate hover around one?
MR. SEDRANSK: It does --
DR. NEERCHAL: You explained it but I didn't get it.
MR. SEDRANSK: Oh, they do, oh, you mean in those plots in the end?
DR. NEERCHAL: Yes.
MR. SEDRANSK: Oh, yeah, what I said in the company level plots, what we try to do, this was the first attempt, was we plotted the observed value for the month divided by annual, which is an estimate of the slope. But then the values went all over the place, so the simple idea, divided by 12, put the annual over 12 on the same scale as the month. So it made the plots -- the plots look better.
MS.
KIRKENDALL: Is the regression actually based on the annual over 12 also, so when you put the regression you'd expect that to be around one --
DR. NEERCHAL: Oh.
MS. KIRKENDALL: To show the seasonality or whatever, or you could show market trends, or you could show whatever is going on. But in this case it sort of looks like it's primarily the seasonality.
MR. SEDRANSK: Something I should say, I know Nick is -- I'll just comment -- just remark; part of this work or much of it is really a warm up, it's like an exhibition game for dealing with the more complex 906 and 920 surveys that appeared in the original title. Those, the data -- the relationships are not as nice or nowhere nearly as nice as what they've showed here. We were trying to go through the exercise for something that was, looked good to get sort of what we were going to do, sort of down pat.
And then the next thing is to turn to that, which you presumably will hear about in succeeding talks. This is the data, it really is well behaved and when I talked about the fact that there are these problems, it's not problems but number three, what you may like to do is get some more precision, so that you can reduce the sample size; that's even more important than the other surveys. Nick?
DR. HENGARTNER: You talked about outliers and things like that, and I looked at the plots and my first inclination would be to do everything on a log-log scale.
Because if you look at some of the plots there, I always have one guy way out there that drives the regression line and just like rescaling often things look -- it's just that the type of plots that's what suggest that -- the other thing, I was wondering you said shrinkage, were you thinking of mixed effects models or even hierarchical models?
MR. SEDRANSK: Yeah, you know that was what I -- yeah, when I used shrinkage, I mean -- what I was talking about very specifically, let's take the example of the beta hats, if you used -- if you looked, if I knew where this was, maybe Lorenz can go -- how do I go back? Go back on -- go back down there. I'll give you an example. Now, I can go forward, I know where to go as long as you got me into a -- no, not pushing out, I'm not going anywhere. Okay, at any rate, take a cross sectional view of those beta hats, keep going, Lorenz, yeah stop there, great.
SPEAKER: Can you speak to the microphone?
DR. FEDER: Can you speak to the microphone?
MR. SEDRANSK: No, I'm finished.
DR. HENGARTNER: But here are those regions, what you have is you have one that's on an annual cycle and the other one is on a bi-annual cycle, and those are the regions on the bi-annual cycles are the ones in which you need both heating and air-conditioning.
MR. SEDRANSK: Yeah, okay no, no -- I say no.
DR. HENGARTNER: Why?
MR. SEDRANSK: Anyway, the point I was going to try to make a technical point --
DR. HENGARTNER: Okay, go ahead.
MR. SEDRANSK: Shrinkage Methodology, if you -- the standard shrinkage -- I'll stay over here.
Standard Shrinkage Methodology, look at September '03 where there is a kind of a large thing, here we just say, your estimate for one region, take NWE for example, will be that regression estimate, weighted average and that of the others, it's dumb, it doesn't realize the fact that some of the data, not that all data doesn't come from a single source. The methodology I have developed lets the data decide which things should go together.
DR. HENGARTNER: But you could actually, since you have the time series, you can think of modeling this d(t) as a function of (t). And you mentioned that.
MR. SEDRANSK: Yes, that's right. That was, okay I've told two different things, right? I would do -- that's the first thing is the time series effects are probably -- in fact you might do it both ways actually. You might have both the time series and all the --
DR. HENGARTNER: Yes, the next link are the time series coefficient.
MR. SEDRANSK: Yes, in fact you really want to do this both ways, you really want to do the cross section and the time series parts of this. But sort of not the dumb shrinkage methodology -- well, shrinkage methodology is same as like mixed effects; you don't want to assume that the regression coefficients in some of these time periods come from the same source. They don't. In other periods they do. So the real solution to this, which is really interesting statistical problem, is the time series and the cross section.
DR. NEERCHAL: You mentioned there are only three sample points in the ---
MR. SEDRANSK: West.
DR. NEERCHAL: In the West?
MR. SEDRANSK: Yes.
DR. NEERCHAL: In the frame, how many are there?
MR. SEDRANSK: Do we know?
DR.
NEERCHAL: What percentage?
MS. KIRKENDALL: We don't know.
MR. SEDRANSK: Do we know off hand.
DR. NEERCHAL: Are there lots?
MS. KIRKENDALL: More than three. I haven't looked at that.
MR. SEDRANSK: Yeah.
MS. KIRKENDALL: But that's probably small. There's been other discussion that maybe California should be broken into two regions. North and south are very different.
DR. NEERCHAL: But there aren't that many IOUs in California?
MR. SEDRANSK: Non IOUs.
DR. NEERCHAL: Non IOUs?
MR. SEDRANSK: Non IOUs. There may not be, there may not be too many.
MS. KIRKENDALL: We have to look at the data.
MR. SEDRANSK: Yes. There are more, I'm sure there is more, I'm sure there are more.
CHAIRMAN BREIDT: Can you say more about the background behind this pooling methodology? Is there kind of a -- is it like a model averaging sort of procedure?
MR. SEDRANSK: Yes.
CHAIRMAN BREIDT: With prior subsequent model.
MR. SEDRANSK: Oh what you do -- oh it's very simple. You consider all the partitions of all those 10 points, put a prior probability on it, usually equal, and then within a grouping, within the subset in the partition, you assume everything is exchangeable. It works very well, I develop -- first person to ever -- this is -- there is some literature back to the early 90s, and it works very, very well. I now have a paper to show how to do this for small area estimation, which is really easy to do.
If you start with the estimates -- this is actually really easy to do because you -- I'm planning to do it, student who's worked with me on it is very busy.
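[Editor's illustration, not part of the proceedings. The first step of the pooling approach described above, listing every partition of the regions and giving each an equal prior probability, can be sketched as follows. This shows only the enumeration and the equal prior; the within-group exchangeable model and the posterior computation are not shown, and the function names and region labels are hypothetical.]

```python
def set_partitions(items):
    """Yield every partition of a list as a list of groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put the first item in a group of its own ...
        yield [[first]] + partition
        # ... or add it to each existing group in turn.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def bell_number(n):
    """Number of partitions of n items, computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


# Hypothetical labels for four regions; an equal prior puts the same
# probability on each candidate grouping.
regions = ["A", "B", "C", "D"]
partitions = list(set_partitions(regions))
equal_prior = 1.0 / len(partitions)
print(len(partitions), equal_prior)
print(bell_number(10))
```

[For 10 regions the count is 115,975 partitions, which gives a sense of why the number of partitions becomes a computational hurdle as the number of groups grows.]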
And as soon as we finish -- the reason this is easy is if you start with the summary statistics like the regression coefficient is easy. If you want to read the papers, a Biometrika paper in '92 and one about 2001; there is a Statistica Sinica paper in '99. But it really works, it works surprisingly well this --
DR. HENGARTNER: It sounds like what Hartigan was also doing.
MR. SEDRANSK: He assumes -- Okay, he assumes is there's not a mistake in what he does, ours is much more general --
DR. HENGARTNER: I'm glad you say that.
MR. SEDRANSK: No, no, it is, it is, it is. It's the same, it's the same thing, we were actually working on it at the same time, started working on it at the same time. He assumed something, we assumed that the regression coefficients in the subset are not exactly the same. He assumes they are.
DR. HENGARTNER: Yes.
MR. SEDRANSK: But once you get into a subset that they have exactly the same values, we allow variability in that. And that's what the difference is. There's unfortunately one of his students wrote a thesis based on it, he's got a mistake in that.
DR. HENGARTNER: Yes, I know.
MR. SEDRANSK: Oh, you know.
DR. HENGARTNER: But I mean in some sense the Bayesian form of clustering.
MR. SEDRANSK: That's right.
DR. HENGARTNER: I mean, that's really what we're doing here.
MR. SEDRANSK: That is exactly right. And in fact if you could do this computationally, this would be Bayesian clustering.
DR. HENGARTNER: Yes.
MR. SEDRANSK: If you want another reference, which is funny, David Binder, whose doctoral dissertation was on Bayesian clustering, and has some early papers and it -- was something I looked at when we had started out.
Now this would be good. The hurdle is really that with all these partitions, unless you have a way of structuring the population, you get too many, you just, there are just too many of these.
DR. HENGARTNER: Couldn't we try to think in terms of the tree, and somehow -- you know split your things in two and then four like a CART.
MR. SEDRANSK: You want to try and do CART. I haven't actually thought of that. But that's not --
DR. HENGARTNER: Because if you consider all the subsets.
MR. SEDRANSK: Oh, it's too many.
DR. HENGARTNER: Forget it, but if you do it like they do in CART and things like that, you actually have a prior.
MR. SEDRANSK: There is a paper by -- people realized after I wrote my first -- consoniun(?) weren't easy in biometrical. After I published the first paper, did it for the binomial and they -- similar work but similar structure and whatever, but they recognized that there are some structures where you can put in the factorial designs, you only look at sort of some main effects and some interactions. Problems such as this, not so clear how to do it.
DR. NEERCHAL: One thing you might want to do is to start by plotting the beta values against those variables that you're not using right now.
MR. SEDRANSK: Yes.
DR. NEERCHAL: To see if there is some pattern.
MR. SEDRANSK: Yes. See what's going on.
DR. NEERCHAL: If there is some, some information there.
MS. KIRKENDALL: What value of data that we're not using?
DR. NEERCHAL: Like some other characteristics of the business, sales revenue, and things like that.
MR. SEDRANSK: Yes.
DR. NEERCHAL: I don't know what else do you have there, so.
MS.
KIRKENDALL: Well, one thing, there should be a relationship between sales and revenues. I mean those are reported by companies, right now we're doing independent estimates.
DR. NEERCHAL: No, no, plot the beta values against some other characteristics of the -- whatever you have -- let's try to see if there is something you can pull up and that might help you at least in terms of cutting down the number of subsets you need to look at.
MR. SEDRANSK: Anyway let me switch the discussion because Nancy -- Nancy is not squirming but probably afterwards we do need to make recommendations probably about these. Does anybody know anything about the out -- go back to question number one. Sort of anything about looking at other than influential observations and --
MS. KIRKENDALL: I do want to address implementation as one of the things.
MR. SEDRANSK: Yes.
MS. KIRKENDALL: What's happened in the past is that the people who process, who collect the data and process it, they get it ready, they do whatever edits they think they need to, they think it's fairly clean and then it goes to an estimation program. And there's a lot of information that comes out of the estimation as a result of these scatter plots and outliers and what not. And there has not been a feedback to the staff. So one way you could do something -- and then they do more detailed analysis, but they typically find it when somebody looks at a total number and it's off.
So you get an error and it goes through and really messes up your total estimate. And may be caught at the end, or it may be caught later, and it may not be.
So the thought on the outlier detection was just take it out and make that an outlier so it doesn't affect your estimate for everybody else, and feed back that information about the outlier so people can check up on it. So it's to try to minimize the impact of one funny observation on the estimate of the total that you're producing.
MR. SEDRANSK: But basically this is an automatic, sort of an automated procedure. Particularly this survey, which is operating well, it doesn't get a lot of -- it doesn't get as much attention in some ways as others do. So if you don't -- you've got to have something that's automated. But you know -- it's sort of, if something strange, really strange, happens, that you don't get blown out of the water.
MS. KIRKENDALL: I mean sometimes these are real data too -- there are huge changes in the company.
DR. NEERCHAL: If you classify something as an outlier, then it will not be used for the estimate prior?
MS. KIRKENDALL: Yes, it would just not be used on data. It would be added back in at the end.
MR. SEDRANSK: Yes. We always, we always use it for -- I mean assuming it's correct, you know, it's deemed to be correct.
MS. KIRKENDALL: But then you do want to check on it.
DR. NEERCHAL: I mean if something -- some -- they have some special thing going on in that particular data value, then it will be marked as an outlier --
DR. BURTON: It will still be added back at the end of it as far as -- if it's sure to be valid.
MS. KIRKENDALL: Right, actually that's one thing, there was some discussion yesterday about over-editing. At EIA, we do editing but we typically do not impute for failed data.
We try to phone on it, certainly all the big companies. So while we do the editing as an agency, which is across most of our offices, we do not rely on imputation for failed data. So we are a little different from many statistical agencies.
MR. SEDRANSK: Let me switch to a different --
DR. BURTON: Would your data be cleaner?
MR. SEDRANSK: I am going to ask another question which I tried to get the -- draw out the answer to. If in a single, in an establishment survey and you try to do stratification -- here we're doing post stratification, not doing stratification -- but you sort of do probably something like sales or revenue are probably highly correlated, profits but pretty not -- not maybe profits, sales and revenues, to stratify, if you had the opportunity -- you would just do something like the cumulative square root of F method, which has been around for forty, or forty or more years.
In this case you don't have that, you're looking for common features of the regression coefficients. So I just have a question about how you might build up strata, just forget about climate and temperature. I mean this is not smart, just totally forget about it. How do you sort of build -- how do you build up strata from the ground up, is the only way I can describe it, any ideas?
MS. KIRKENDALL: By strata you mean the estimation?
MR. SEDRANSK: Estimation groups in post strata, yes, yes, right. Or turn it around and say we're getting a new -- these were cutoff samples -- say you're doing a probability design, forming strata in the first place. I'm not suggesting we're doing this but --
MS.
KIRKENDALL: Actually one thing, for all of these observations, we do have the state that they operate in. So we can look at any different groupings that we think we need to.
MR. SEDRANSK: Yes. It's just I was thinking of a just primitive method, or where you start from. I mean there are things you can do, you can just get estimates of the regression coefficients that are just monthly over annual, you have a whole batch of them for each company. I just -- I mean I've suggested that, but I -- just curious if anyone has any other ideas.
CHAIRMAN BREIDT: But the regression coefficients you get at the company level are, these are based on --
MR. SEDRANSK: One observation.
CHAIRMAN BREIDT: Yes.
MR. SEDRANSK: Per month. But you'd actually get a bunch of months of them, so you --
CHAIRMAN BREIDT: Then the same month across the years and that regression coefficient.
SPEAKER: Yeah.
CHAIRMAN BREIDT: We're talking about.
MR. SEDRANSK: Well, we've got both those, we've got the same month over the years, we've got all this, we've got the time series data and whatever, you could take all of this and see what -- put it in the cluster analysis package and see what you get.
CHAIRMAN BREIDT: Right, it doesn't seem that you want to throw all the months together on some of these plots.
MR. SEDRANSK: I'm not sure if this is important. I mean I think the temperature based thing makes sense, although I think the ones we've got are -- I think ones we've got -- I think there are some other units. I did that and I'd be surprised -- this one just looks too large, that northeast one.
DR.
SITTER: Well, I mean you could also use it as a validation, if we came up with a method to see if it actually is a decent grouping. But you can just quantify the problem, you've got a variance estimator, obviously?
MR. SEDRANSK: Yeah.
DR. SITTER: Okay, it's going to depend on the group, throw it into your favorite search algorithm for finite grouping searches, clustering, you can just tick away random groupings. I mean computers are fast, you can quickly find out if things are reasonable. And I've seen them do this with genetic algorithms or simulated annealing, anything that doesn't require any derivatives or anything. They're not really fast, but you know speed isn't of the essence.
DR. HENGARTNER: Randy, we're trying to group thousands of companies, correct?
DR. SITTER: No, just states --
MS. KIRKENDALL: There are only -- there are less than 200 that we're actually using in this regression; we have about 400 observations if you count the IOUs that are currently not used in any regression.
DR. HENGARTNER: So 200 and you are trying to group them in?
DR. SITTER: It's not too bad. I mean you can even use some of the scheduling methods where they want to schedule jobs to computers. There are some very simple algorithms that are very fast; they are not going to get you the optimum, but they will get you something very good. And they are very simple, they just order it on some reasonable proxy and then they take the biggest ones, put one in each group, and then they just take the next guy and say, well, which group can I put it in that will increase the variance the least, and they just do that bang, bang, bang, bang; it's very fast.
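A minimal sketch of the greedy grouping heuristic Dr. Sitter describes, under stated assumptions. The transcript gives no data layout or variance criterion, so the tuple format `(name, proxy, value)`, the function names, and the use of within-group sum of squares as the "variance" being increased are all hypothetical choices made for illustration.

```python
def sse_increase(group, value):
    """Increase in within-group sum of squares if `value` joins `group`.

    For a group of n values with mean m, adding v raises the sum of
    squared deviations by n / (n + 1) * (v - m) ** 2.
    """
    n = len(group)
    mean = sum(unit[2] for unit in group) / n
    return n / (n + 1) * (value - mean) ** 2


def greedy_groups(units, k):
    """Assign units to k groups with a one-pass greedy rule.

    units: list of (name, proxy, value) tuples (hypothetical layout).
    Step 1: order units by the proxy, largest first.
    Step 2: seed each group with one of the k largest units.
    Step 3: place each remaining unit in the group whose within-group
            sum of squares grows the least.
    The result is generally sub-optimal; it is a fast starting point.
    """
    ordered = sorted(units, key=lambda unit: unit[1], reverse=True)
    groups = [[unit] for unit in ordered[:k]]
    for unit in ordered[k:]:
        best = min(range(len(groups)),
                   key=lambda g: sse_increase(groups[g], unit[2]))
        groups[best].append(unit)
    return groups


# Illustrative data only: proxy could be annual sales, value a
# company-level regression coefficient.
example_units = [("A", 100, 0.90), ("B", 90, 0.20), ("C", 50, 0.85),
                 ("D", 40, 0.25), ("E", 10, 0.95)]
example_groups = greedy_groups(example_units, k=2)
group_names = [[unit[0] for unit in group] for group in example_groups]
```

Because each unit is placed once and never moved, the result depends on the ordering; a search method such as simulated annealing could start from this grouping and try to improve it.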
And you come up with a sub-optimal solution but I'm --
DR. HENGARTNER: Which is pretty good though --
DR. SITTER: But it's pretty good on average.
DR. HENGARTNER: I don't have experience with -- you know that's the Bayesian method. But I have the feeling that if you tried to prove that, you are going to have a lot of local minima.
MR. SEDRANSK: Oh, no, I wasn't even thinking of Bayesian, this was not Bayesian --
DR. HENGARTNER: Oh, that wasn't you -- I was thinking of.
MR. SEDRANSK: I do all kinds of things, it depends on what day it is. I think like a Bayesian, but I --
MS. KIRKENDALL: One of the things that I should say is this: the original system and the estimation methods were done by Jim Knob, who is trying to hide up there, in the active performance in alternative fuel. And he presented a lot of the information to the ASA committee back in the early 90s; I think some of it was when Joe was the chair of the committee. So this is an example of a problem that keeps coming back.
MR. SEDRANSK: I mean essentially things are working well here, and this is sort of a prototype. But there are these issues, there is certainly a desire to cut back sample size, maybe not so much in this one as in 906 and 920. So whether one can make use of this time series information is also an issue; you know, academically it's of interest whether you can make some use of it. One of the problems which we did not allude to is there was a major -- in the time series, you're going to have trouble making a longer time series, because in 2004 there was a major shift: "other" was sent away and replaced by "transportation."
So that was a -- so you've got a problem of different distribution among those end users after that. That's one of the reasons we don't go after 03, then 02 and 03 are -- you know are, are clear cut, so we won't have a long --
MS. KIRKENDALL: Residential may still be okay.
MR. SEDRANSK: What?
MS. KIRKENDALL: Residential may still be okay.
MR. SEDRANSK: Residential will probably be okay.
CHAIRMAN BREIDT: Well, it looks like you have spatial(?) structure too.
MR. SEDRANSK: Yes, do something here.
CHAIRMAN BREIDT: Do something here, especially in the finding the time series and the spatial structure.
MR. SEDRANSK: Yes.
CHAIRMAN BREIDT: Allowing for the fact that over time that their ability changes a lot in states so that you have a --
MR. SEDRANSK: Yes. This looks like a, this is a good problem.
CHAIRMAN BREIDT: Yes.
MR. SEDRANSK: I found a similar one at the Census Bureau also recently, which I now can't remember, carrying those small, small area over time; there's a -- they've been doing it this way, but I detected that they've got some of the same structure problem.
DR. HENGARTNER: Joe, have you thought of looking at this as a geo-temporal process? And I'm now thinking, I know the location of each of the generators, right? I always thought that there is some spatial correlation; instead of just grouping them into regions, let's look at some kind of a diffusion, those must be correlated in some ways. And we also have some barriers like the mountain ridges, whatever, I mean we actually know about the geography.
MR.
SEDRANSK: No, we haven't, because we've looked at really, we're evaluating their methodology, it doesn't -- by the way it's a good idea --
DR. HENGARTNER: I mean you're asking for what goes through my mind --
MR. SEDRANSK: No, no, that's a good -- no this is. I'm down on -- I'm down on geospatial stuff, I have a student who's been working on a problem and I don't want to tell you what troubles he has had. These are binary responses though in a geospatial setting and I can tell you things that don't work, lots of things that don't work.
CHAIRMAN BREIDT: Okay, any other comments?
MS. KIRKENDALL: Okay, you have any brilliant ideas?
MR. SEDRANSK: Email.
MS. KIRKENDALL: Email works.
SPEAKER: I haven't had one of those in a while.
CHAIRMAN BREIDT: So it's now time to move to our breakout sessions. The next session will be downstairs on the post stratification methodology.
(Recess)
MR. HOUGH: Desperate to get started. I was just informed that there -- I guess we'll be chairing this ourselves. It's a scary thought, I know, but -- My name's Rick Hough, I am here from the Census Bureau, I am the survey manager responsible for the Manufacturing Energy Consumption Survey. I am here to talk to you a little today with my colleague, Stacey Cole, who's the branch chief responsible for the R&M branch that has MECS as one of its many responsibilities. We want to talk to you today a little bit about the post stratification methodology that we implemented during the 2002 Manufacturing Energy Consumption Survey.
Before I begin, I'd like to take a quick minute and acknowledge everyone who worked hard on the 2002 MECS. Susan Bucci is with us today, she's my boss, she's the branch chief in charge of the branch where MECS is conducted; members of my staff who were responsible for data analysis, Vicky Haitot, Lacy Loffin, and Eva Snap. I would also like to recognize the methodologists on Stacey's staff, who were responsible for coming up with a lot of the details that you'll see in today's presentation: Jeff Dalzell, Cathy Gregor, John Slanda, who's here with us today in case any technical questions would arise, and Bob Struble.
I'd also like to take a quick moment to acknowledge our members of EIA who worked on the Manufacturing Energy Consumption Survey. Dwight French is with us today. Dwight gave the approval to go forward with this process, came out to Census a few times, listened, gave comments, and then told us to go ahead; Bob Adler, who was the other survey manager I saw, my co-survey manager from EIA, who also worked very hard; and Tom Lorenz, who some of you heard from yesterday, survey analyst from EIA who came out to Census quite a bit to work with our team on data analysis and preparing reports, and he was a very valuable member of our team.
Just an overview of what we're going to talk a little bit about today. I'm going to talk to you a little bit about the background and the goals of the Manufacturing Energy Consumption Survey.
I'm going to give a brief description of the economic census, talk a little bit about some of the aspects of the census that affect the reasons why we post-stratified, and also some of the files that are involved with post-stratification.
Stacey is going to come up and talk a little bit about the sample design, and then he's going to get into the actual post-stratification methodology, and then we have some results of what we implemented to talk about. And then, finally, we have some questions for the panel that we'd like to consider, some questions about the methodology that we implemented for this survey, and if time permits we'd also like to talk a little bit about what we might consider in the future given the time frame of MECS and how it is ultimately related to how the census is conducted.
Okay, the Manufacturing Energy Consumption Survey. Basically our goal is to provide detailed data on energy consumption for the manufacturing sector. We provide a variety of information from individual energy sources, such as electricity, natural gas, selected fuels. We also provide data by industry as well as geography. We report statistics for some of the energy saving technologies that are implemented in manufacturing establishments as well as their ability to switch to alternative fuels in certain situations.
The survey was initiated in 1985; it was conducted every three years until the survey year 1994, and it's been conducted every four years since that point. The economic census, just some general information, provides a detailed portrait of the nation's economy.
We provide data by industry as well as geography; we provide a variety of manipulable formats for data users such as CD-ROMs; we have American FactFinder on our website where the data user can go in and get data and create their own tables.
The census is conducted every five years; it is conducted in years that end in two and seven. It is conducted using the North American Industry Classification System. This is the system that was brought on to replace the old SIC. The 2002 census was the second census conducted using these classification definitions.
Okay, and the census provides a comprehensive update of the classifications that are contained in the Census Bureau's business register. A lot of the companies within the business register do not get Census Bureau questionnaires every year. Therefore, the update path that we have from them to update their classifications is done through the census.
I want to talk a little bit about the 2002 economic census for the manufacturing sector. The Census Bureau does not mail every establishment in the country a questionnaire. We have what's called a non-mailed file; it is defined as single unit manufacturing establishments with less than five employees, and these establishments were not mailed questionnaires. For the year 2002, this file contained approximately 160,000 establishments.
The Census Bureau uses administrative data from other government agencies to estimate for these establishments, and they account for approximately 3 percent of the published totals.
Now, if you have a non-mailed file, then you have a mailed file, and the definition of the mailed file is the ones you don't -- the ones you sent questionnaires to. They are the single unit establishments with more than five employees, and all multi-unit manufacturing establishments were mailed questionnaires.
DR. HENGARTNER: Now how do you have those addresses?
MR. HOUGH: Other government agencies.
DR. SITTER: Homeland Security.
DR. HENGARTNER: Good answer.
MS. DODDS: We do some surveys in the years between the censuses and many of the addresses come from those, and others would come from other government agencies, taxing agencies.
DR. HENGARTNER: IRS.
MR. HOUGH: Okay, the mailed file contained approximately 200,000 manufacturing establishments for the survey year 2002. Okay, the data that's collected during the census from the manufacturing establishments -- we collect operational data on employment, receipts, inventories, costs, assets, purchased services, et cetera. Among the cost data we collect from the establishments are cost of fuels and cost of electricity.
And when Stacey comes up, he's going to refer quite a bit to this cost of energy; that is what we use to define the cost of energy. We take the cost of the fuels reported and the cost of electricity reported by the establishments, and that is how the cost of energy is defined. I'm going to defer now to Stacey and he's going to start off by talking a little bit about the sample design.
SPEAKER: Which one's up?
MR. COLE: Good morning. Basically my area was responsible for going through and developing a sample design for the MEC survey.
The key goal of MECS is to produce detailed aspects of energy consumption at the national level for industries as well as the regional level for industries. The target population is basically the mail panel of the -- every portion of the census: single location establishments with more than five employees and all multi-unit establishments.
The mail file was used, as was just said; every establishment in the mailed file was classified to a specific industry code, and every one was assigned cost of energy data, which came from historical sources. For many cases the cost of energy data came from the 2001 Annual Survey of Manufactures; for other cases it came from the 1997 economic census. So in some cases, it's very, very old.
We used a stratification approach to go through the sample. There were 473 specific manufacturing industries, way too many to cover in detail for MECS, but fortunately energy use is concentrated to a great deal. There are 37 industries that are of high interest to EIA; for the most part these account for almost half of the energy consumed in manufacturing. The balance of the -- the remaining 440 -- 400 and whatever is -- 436, whatever it is -- industries were collapsed to higher levels, so between both groups they cover the entire sector of manufacturing.
And also there was a desire for some state and some sub-national level data, so we collapsed the states to Census Bureau regions. There are 267 industry group by region strata; there was one that doesn't exist, basically an empty set. There are no establishments. The allocation was also a challenge; we have a sample of only 15,000 establishments.
The establishment is the mail unit as well as the collection unit and the sample unit. We have a sliding scale of reliability constraints; basically we think the cells that have the most amount of activity are the cells of the highest interest, so we tried to do a really good job in the cells with a lot of activity.
The smaller cells have fairly high CV constraints. I believe the CV constraints range from 2 percent to about 10 percent. Again the big cells had higher constraints. The selection approach basically is, an independent sample was drawn from each one of the 267 strata; within each stratum, we used a PPS approach. The establishment size does matter; establishments that have a large consumption of energy had a -- a very high probability of selection. Again, the energy data we had at that level came from the year 19 -- 2001 ASM or historical data.
We also imposed a minimum probability of .02; this helps to prevent the cases where we end up with large weights and surprisingly large data, which causes problems for estimation as well as the variances. Estimation is really a four-stage process, I only have three on here, but the first stage is obviously to collect the data, and like most surveys, we do not get all the mailed cases in. There's a small amount of all that -- not a small amount, there's about a 25 percent non-response rate to the survey.
We mailed out 15,000 forms and we got about 12,000 back in. Historically we have done a non-response weight adjustment to MECS and this time we did the same thing.
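A minimal sketch of how selection probabilities proportional to size with a floor of .02 might be computed within one stratum. The transcript does not say how certainties were identified or how the draw itself was made, so the iterative certainty rule, the function name `pps_probabilities`, and the example size measures are assumptions for illustration only.

```python
def pps_probabilities(sizes, n, p_min=0.02):
    """Selection probabilities proportional to size within one stratum.

    sizes: cost-of-energy size measure for each establishment.
    n:     target expected sample size for the stratum.
    p_min: minimum selection probability (.02 in the talk), which
           caps the sampling weight at 1 / p_min = 50.
    Units whose raw probability would reach 1 are taken with
    certainty, and the remaining sample is spread over the rest.
    """
    probs = [0.0] * len(sizes)
    remaining = list(range(len(sizes)))
    n_left = n
    while remaining and n_left > 0:
        total = sum(sizes[i] for i in remaining)
        certain = [i for i in remaining if n_left * sizes[i] >= total]
        if not certain:
            break
        for i in certain:
            probs[i] = 1.0
        remaining = [i for i in remaining if i not in certain]
        n_left -= len(certain)
    total = sum(sizes[i] for i in remaining)
    for i in remaining:
        raw = n_left * sizes[i] / total if total > 0 else 0.0
        probs[i] = min(1.0, max(p_min, raw))
    return probs


# Illustrative size measures only.
example_probs = pps_probabilities([1000, 50, 30, 19, 1], n=2)
example_weights = [1.0 / p for p in example_probs]
```

Raising the smallest probabilities to the floor makes the expected sample size slightly larger than `n`; that is the price of keeping the largest weight at 50.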
The first and third board there, I think, we've done historically in MECS. The middle board is something basically brand new. In the non-response weight adjustment process, we, no, go back a second, we did it by stratum. Within each stratum we broke down into sub-strata: we said within each stratum there are certainties and non-certainties, the response rate varies between the two subgroups, and we wanted to go through and adjust their weights separately. That's also an enhancement for 2002 MECS.
It is basically a standard form of the non-response weight adjustment, where the numerator is basically equal to the weighted data from the entire sample, based on the sample frame, and the denominator was the observed data from respondents, I'm sorry, it's the data attributable to respondents. So if every unit in the cell responded, the adjustment factor is equal to 1. In most cells that didn't happen. So at that point we adjust the sample weights to reflect non-response. The post-stratification, why do we do this? We are concerned about the sample frame.
Basically we are using a 2002 mail file; we are worried about the coverage of the target population, we are worried about the classification of the establishments, and we are worried about the cost of energy data being accurate. The coverage: we know that the mailed file includes records that will ultimately not be in manufacturing; when the mailed census questionnaire comes back, they'll be reclassified into wholesale, retail, or construction, or other areas.
Additionally, there were cases mailed as retail or wholesale for the census that can be reclassified into manufacturing. You know, that's a problem.
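A minimal sketch of the non-response weight adjustment as described: within each stratum, separately for certainties and non-certainties, the factor is the weighted frame data for the whole sample divided by the weighted frame data for respondents. The record layout (dictionary keys such as `frame_energy_cost`) and the function name are hypothetical; the transcript does not give them.

```python
from collections import defaultdict


def nonresponse_factors(sample):
    """Adjustment factor for each (stratum, certainty) cell.

    sample: list of dicts with hypothetical keys
        'stratum', 'certainty' (bool), 'weight',
        'frame_energy_cost', 'responded' (bool).
    factor = sum of weight * frame cost over all sampled units
             / sum of weight * frame cost over respondents.
    A cell where everyone responded gets a factor of exactly 1.
    Cells with no respondents are left out; they would need to be
    collapsed with another cell, which is not shown here.
    """
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for unit in sample:
        cell = (unit["stratum"], unit["certainty"])
        weighted = unit["weight"] * unit["frame_energy_cost"]
        numerator[cell] += weighted
        if unit["responded"]:
            denominator[cell] += weighted
    return {cell: numerator[cell] / denominator[cell]
            for cell in numerator if denominator[cell] > 0}


# Illustrative records only.
example_sample = [
    {"stratum": "S1", "certainty": True, "weight": 1.0,
     "frame_energy_cost": 500.0, "responded": True},
    {"stratum": "S1", "certainty": False, "weight": 2.0,
     "frame_energy_cost": 100.0, "responded": True},
    {"stratum": "S1", "certainty": False, "weight": 2.0,
     "frame_energy_cost": 50.0, "responded": False},
]
example_factors = nonresponse_factors(example_sample)
```

Each respondent's sample weight would then be multiplied by the factor for its own cell.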
In addition, we have cases where the classification within manufacturing will have changed between 1997 and 2002, and this affects the within-strata variability, and the cost of energy data is also a problem because it's quite volatile year to year; in addition, for most of the records, the last estimate of the consumption is five years old. So, we had all those concerns. And that prompted us to go through and examine the idea of post-stratification.
Fortunately, we had the Econ census as a source to go through and do the post-stratification. It provides a comprehensive collection of all factors; it allows us to update the classifications of everybody in the Econ directorate, and also we do collect the cost of energy for all manufacturing establishments. So we were able to go through and basically re-assemble the frame based upon the more current information, which is what we did.
For each stratum, we identified the eligible units in the stratum, we summed their cost of energy to the stratum level, we also developed an estimate of that from the sample cases in that stratum, so we were able to do a comparison between the target population energy data and the sample estimate of that, and here's a graph that shows the comparison between the actual and the estimated total.
For the most part, the estimates are pretty close to the actual data, but there are a handful of cells where the estimate is quite a bit higher than the actual total from the census.
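A minimal sketch of the stratum-level comparison described here: the weighted sample estimate of cost of energy set against the census total for the same stratum. The input layout, the function name, and the 25 percent flagging tolerance are assumptions for illustration; the talk shows a graph and gives no numeric cutoff.

```python
def compare_to_census(census_totals, sample, tolerance=0.25):
    """Compare weighted sample estimates with census stratum totals.

    census_totals: dict mapping stratum -> census cost-of-energy total.
    sample: list of (stratum, weight, census_energy_cost) tuples for
            the sampled establishments (hypothetical layout).
    Returns stratum -> (estimate, actual, ratio, flagged), where
    flagged is True when the ratio differs from 1 by more than
    `tolerance` (an arbitrary illustrative cutoff).
    """
    estimates = {}
    for stratum, weight, energy_cost in sample:
        estimates[stratum] = estimates.get(stratum, 0.0) + weight * energy_cost
    report = {}
    for stratum, actual in census_totals.items():
        estimate = estimates.get(stratum, 0.0)
        ratio = estimate / actual if actual else float("nan")
        report[stratum] = (estimate, actual, ratio,
                           abs(ratio - 1.0) > tolerance)
    return report


# Illustrative numbers only: stratum B has one unit with a large
# weight whose census cost of energy is larger than expected.
example_report = compare_to_census(
    {"A": 1000.0, "B": 500.0},
    [("A", 1.0, 600.0), ("A", 4.0, 100.0),
     ("B", 20.0, 40.0), ("B", 1.0, 100.0)],
)
```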
We looked at those cells and determined that -- that was caused primarily by the big weight and big data problem: we had some cases we thought were quite small, but in reality they ended up being quite large.
So we had a handful of cells where the estimate was quite a bit greater than the actual population total. We were looking for a function that would go back and adjust the sample weights of the sample cases, so that the weight-adjusted estimate would equal the control total. We wanted a function where we didn't adjust the weights of the certainty cases. They are self-representing cases during the selection process and we want them to remain that way during the estimation process.
We also wanted to avoid a problem of lowering weights on non-certainties to the point where they became less than one, which is kind of a hard concept to understand, a weight less than one. So we developed a process where the K is our factor; K is linear, it goes through the point (1, 1), and (1, 1) is the axis, it swings round at that point. In the purple area, we have the estimate as less than the population, and we have to adjust the weights up.
In the blue area, we are saying basically the estimate is greater than the population and we have to adjust the weights downward. That's the graphic display. And here's the actual formula that we used; basically the numerator represents the amount of energy data associated with establishments that are not part of the sample. Basically it's the cap end of the total: the first term is the published total, and the second term is what we observe in the MECS sample. So basically the numerator is what was not in the MECS sample.
And the denominator is basically an estimate of the numerator.
So hopefully, our ratio is close to one, and if it's one, we are really happy. The sample was -- after we get the K for each stratum we go back and we adjust the sample weights further by the K. Just look at the top of the slide there, you can see that the certainty cases, the cases where WI = 1, are never going to be adjusted. They're going to remain -- retain their sample weight of one, and never will it get to miss an adjustment, but it's not a linear, but not proportionate.
In my example I have got myself -- got K equal to .8. The first, with the weight of two, goes down by only 10 percent to 1.8, but the weight of 20 goes down by almost 20 percent, down to 16.2. So basically the establishments with higher sample weights are more impacted by the adjustment than the cases with small sample weights. And I just said that. Okay, actually in truth when we reclassified the population and developed a control total, we had almost 1,000 establishments that switched strata within manufacturing. It's a little higher than we expected, but they have been verified.
The overall impact on the estimate of the energy at the national level was to revise it downward by about 1.8 percent. About half the strata were adjusted up and half went down. So we are pretty pleased it wasn't anything systematically suggesting that we had an upward bias or a downward bias, and for the cells that were adjustable, that were adjusted, the average absolute adjustment was a little under 4 percent, well, 3.7 percent.
So we weren't making major adjustments on most of the cells. There were a handful of cells that went down by 30 or 40 percent.
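A minimal sketch of the K adjustment, reconstructed from the spoken description and the worked example rather than from the slide itself. The numerator is the census control total minus the unweighted energy observed in the sample; the denominator is the sample's estimate of that same quantity, the sum of (w - 1) times energy; and each weight becomes 1 + K(w - 1). This form reproduces the example in the talk (K = .8 takes a weight of 2 to 1.8 and a weight of 20 to 16.2) and leaves a weight of 1 unchanged, but the function names and data are hypothetical.

```python
def post_strat_factor(control_total, weights, energy):
    """K for one stratum.

    numerator:   control total minus what is observed in the sample,
                 i.e. energy of establishments not in the sample.
    denominator: the sample's estimate of that same amount,
                 sum of (w - 1) * x over sampled units.
    """
    observed = sum(energy)
    expansion = sum((w - 1.0) * x for w, x in zip(weights, energy))
    if expansion == 0:
        raise ValueError("all units are certainties; K is undefined")
    return (control_total - observed) / expansion


def adjust_weight(weight, k):
    """Linear adjustment pivoting on the point (1, 1)."""
    return 1.0 + k * (weight - 1.0)


# Illustrative stratum: one certainty and two non-certainties.
example_weights = [1.0, 2.0, 20.0]
example_energy = [500.0, 100.0, 10.0]
example_k = post_strat_factor(842.0, example_weights, example_energy)
example_adjusted = [adjust_weight(w, example_k) for w in example_weights]
example_estimate = sum(w * x for w, x in zip(example_adjusted, example_energy))
```

With these numbers the adjusted estimate equals the control total of 842. Adjusted weights stay at or above one whenever K is non-negative, that is, whenever the control total is at least as large as the energy observed in the sample.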
The coverage adjustment: this has been done in the past; this is not a new activity for MECS. The sample frame is a mail file. It does not include those that were identified after we mailed out the census; they are not part of the mail file, and they are not put in the estimates. In addition there are cases that were mailed as retail or wholesale which were reclassified into manufacturing. Those cases again are not in the sample frame, so they are not included in the estimate right now. That's a problem that we are trying to address.
Here is a frame, a nice big graph and a pretty picture, and those are the sample we picked originally; again, the sample frame was a mail sample. We have cases that became ineligible: these are cases that died, went on ---- or duplicates, cases that were not manufacturing. We didn't know that in advance, so we drew a sample of them. So our sample represents the number of deaths in the actual frame, so these cases can be removed from MECS, and they were.
But the births and the transfers into manufacturing were not in the sample frame, and yet they are still part of the total population. We would like to have them included in the estimates, but we had no sample of those cases.
So we basically went through and identified them all, about 15,000 births and incoming transfers that were a part of the total population. For each one of those cases we did have cost of energy data, and we developed a revised stratum cost of energy control total by simply adding the energy cost associated with these cases to the original control total, and developed a final adjustment where we had the revised control total cost of energy over the original control total.
And then we get a final weight, and we applied that to the establishments in each cell. This is done at the cell level. About 3/4 of the strata were adjusted upward. For the remaining quarter, there were no births or there were no incoming transfers; cases like chemical plants, refineries, and pulp mills are not likely to have incoming transfers. And the adjustment was about 3 percent. So at that point we finished the process. We had gone through adjusting for non-response, post-stratifying to the economic census, and then adjusting for the coverage of the original sample frame, and that's where we are right now. These are the questions we've got: are we in the right ballpark, or are we committing heresy here?
Is there anything that you readily found that we forgot, did we miss a big issue, and are there alternatives that may be more appropriate? And the real one that I'm looking for is, what do we tell the EIA and their users about what we did, so they can figure out how to use the data? That's it. Questions, comments?
MR. HOUGH: You are the chair.
MR. COLE: I'm the chair. Direct all the questions to Rick.
MR. HSEN: Stacey, you astounded them.
MR.
COLE: I'm sure I astounded -- oh yeah, the great minds here, I'm sure I astounded the great minds.
DR. SITTER: You chose post-stratification, but you're really calibrating. I mean, you're using post-stratification in calibrating.
MR. COLE: That's right.
DR. SITTER: Did you look at calibration, then?
MR. COLE: No, we did not look at calibration methods.
DR. SITTER: Some of the calibration methods will allow you to put constraints on them, which is essentially what you have; that is, you have a constrained calibration problem. I think I can probably come up with two or three references. You can take a look at them. I'm not saying that they're better --
MR. COLE: Is that done by a variable, or is it one calibration for the entire stratum? I didn't --
DR. SITTER: I think anything is possible. Just viewing it as a calibration problem may give you some insight as to where you're sitting. You may in fact be doing calibration; I don't know.
MR. COLE: Okay.
DR. SITTER: I can give you a couple of references that I know of.
MR. COLE: Okay, appreciate that.
DR. HENGARTNER: Randy, I don't know that literature. How sensitive are these methods to randomness? I mean, the weights you get there are random variables.
MR. COLE: Yes.
DR. HENGARTNER: Because the frames, well, we estimated them, the frames change into both. Does that have an impact?
DR. SITTER: Of course. What are you really meaning, are they really sensitive to that or something like that?
DR. HENGARTNER: I mean, the analogy is, is it like an errors-in-variables problem or not?
MR. COLE: I wouldn't classify myself as an expert. John, he is my wizard.
MR.
SLANDA: Well, we looked at the weights and the impact on variances --
DR. FEDER: Okay.
MR. SLANDA: And how we did that: we used the total cost of energy, and then we used electricity as the variable from the actual survey data, because we needed something like a bivariate along with that. We did variances before and after the non-response adjustment and after the K adjustment, and about 80 percent of the variances went down. But our big question became, how would we measure bias? That was something we hadn't come up with yet, and we are very interested in knowing how to approach and attack that problem.
DR. FEDER: Actually, you also set the stage for my question, which is, could you do simulations to examine the bias issue? Because by looking at your formula it's not apparent to me that there is no bias here, because of the selection issue, because you are changing the weights in a disproportional way, so obviously Horvitz-Thompson is out of the window. So I would recommend doing some simulation and see. Again, if you get good results, I think I certainly would say that there's a good paper here.
Now I -- then you revise your approach, because I haven't seen something quite like it, but it might make sense. And actually not just bias or variance; I would look at them together, because any calibration method might introduce some bias but reduce the mean squared error, and this, after all, is what we're after. I mean, a little bit of bias is okay, I learned from Randy. I have questions for Randy; although I'm quite a bit older, I mean, his contribution is just obvious, so --
DR.
SITTER: Yeah, a bit intimidating; he already had a PhD in mathematics.
MR. SLANDA: Well, I attempted to actually look into some of that simulation, and I was wondering -- I have an idea I could just bounce off you, to see if it works. I was thinking of using the actual survey data to get a regression, but then also modeling some type of error term around the regression, so that when we impute back, everybody not in the sample would then have much more variability, as opposed to just having the measure of size there. Because if we have just the measure of size and we are doing simulation on that, then we get funny results too.
DR. FEDER: One of the things that you could also do in simulation is pretend that, of those you sampled that responded, 20 percent did not respond; take them out and try to figure out a method that would give you anything close to what you actually have. So --
MR. SLANDA: So label about 20 percent of the population?
DR. FEDER: Yeah, I mean, you simulate non-response by making some -- you know, you play God, you make them non-respondents, and then you see whether you are able to predict them the right way. And try to make it a bit informative; make that non-response depend on some known attributes of those --
DR. HENGARTNER: For example, more non-response among the smaller companies, which is probably what we tend to see in general, so that, you know, the assumption is moderately violated. The same thing is that we had the discussion about your estimate with those weights: are you using the Hájek estimate or the Horvitz-Thompson?
MR. COLE: Horvitz-Thompson.
MR. SLANDA: Horvitz-Thompson estimate.
DR.
HENGARTNER: Okay, there was discussion, and Randy suggested how Hájek might be better for some cases.
DR. FEDER: Randy also suggested to you some calibration methods, and once you calibrate, if you include the population size in your calibration constraints, then you are in fact doing Hájek.
DR. HENGARTNER: Yeah, but then, because your weights are random, you really get the errors-in-variables problem, and then you have to worry about attenuation, which is a feature of random weights.
DR. FEDER: It's a very complex thing. That's why I think if you do something in simulation -- there could be some cancellation of some of these errors. I think it's an interesting approach to attacking the issue, but you know, the units that represent themselves would not change their weight on non-response, but go until --
DR. SITTER: I'll tell you why simulation is probably your best route. I mean, I stated so bluntly that this falls into the class of constrained calibration problems. If you look at the papers that I would give to you, none of them actually prove any consistency results under any reasonable framework as to what a consistency result should be. Certainly they let the sample size go to infinity, but what you really need -- they discuss consistency under a probably more realistic framework, but it hasn't really been proved yet.
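A simulation of the kind recommended here might be set up along the following lines. This is only a sketch under stated assumptions: the population is synthetic (a lognormal cost-of-energy variable x, known for every unit, and a survey variable y standing in for electricity), the design is Poisson sampling with probability proportional to x and certainties where that probability reaches one, and the K adjustment is taken to be w' = 1 + K(w - 1). None of these choices comes from the actual MECS design; they only illustrate how bias and mean squared error of the Horvitz-Thompson and K-adjusted estimators could be compared.

```python
import random


def simulate(n_reps=500, pop_size=400, expected_n=60, seed=12345):
    rng = random.Random(seed)
    # Hypothetical skewed population: x is known for all units (the control
    # variable); y is a survey variable roughly proportional to x.
    x = [rng.lognormvariate(3.0, 1.0) for _ in range(pop_size)]
    y = [0.6 * xi * rng.uniform(0.7, 1.3) for xi in x]
    X, Y = sum(x), sum(y)

    # Inclusion probabilities proportional to size, capped at 1 (certainties).
    pi = [min(1.0, expected_n * xi / X) for xi in x]

    ht, kadj = [], []
    for _ in range(n_reps):
        sample = [i for i in range(pop_size) if rng.random() < pi[i]]
        w = {i: 1.0 / pi[i] for i in sample}
        denom = sum((w[i] - 1.0) * x[i] for i in sample)
        if denom <= 0:
            continue  # only certainty cases drawn; K is undefined
        k = (X - sum(x[i] for i in sample)) / denom
        ht.append(sum(w[i] * y[i] for i in sample))
        kadj.append(sum((1.0 + k * (w[i] - 1.0)) * y[i] for i in sample))

    def summary(estimates):
        bias = sum(estimates) / len(estimates) - Y
        mse = sum((e - Y) ** 2 for e in estimates) / len(estimates)
        return bias / Y, mse ** 0.5 / Y  # relative bias, relative root MSE

    return {"HT": summary(ht), "K-adjusted": summary(kadj)}


results = simulate()
for name, (rel_bias, rel_rmse) in results.items():
    print(f"{name}: relative bias {rel_bias:+.4f}, relative RMSE {rel_rmse:.4f}")
```

Looking at relative bias and relative root MSE side by side follows the suggestion to judge bias and variance together rather than separately. Conclusions would depend heavily on how realistic the synthetic population and the treatment of certainty cases are.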
And it really comes down to that constraint. That is, because of those constraints, you're disproportionately adjusting the weights, and that's really going to play a role. You know, if you just sort of set up an obvious fixed framework as N goes to infinity, you are fine, but you have to imagine a situation where those constraints are also not staying fixed with N, and that I don't think anybody has actually come up with ---- it's not that they can't prove it, it's that you could come up with such a thing but it would be sort of made up. You know, what does it mean realistically to say, as your sample size increases or your number of strata increases, what happens to those constraints?
That's an issue. Like, in your situation your constraints are the certainty versus non-certainty. So if you wanted to set up an asymptotic framework, you have to have some idea of what happens with those certainty versus non-certainty weights as your sample size increases. And that really impacts the proofs, so you are really stuck with simulations.
DR. FEDER: In this case your point is even more important, because we are dealing with a very ---- population, which is also the measure of size, the energy consumption. So I think when you do this disproportionate weight adjustment, it might have an attenuating effect, and I think it is really worthwhile to examine.
MR. HOUGH: I have one other question for you, if we have time. We were able to do this this time because -- MECS is currently conducted every four years. The census is conducted every five. This is the first time that the two surveys fell in the same survey year.
So we were able to pull a sample from a mail file and post-stratify to a final file.
In the future, for example in '06, we will pull the sample from the completed '02, which is what we post-stratified to here. But things like Stacey talked about will happen. We do an Annual Survey of Manufactures, which is a sample of the census, that kind of fills the gaps between census years for manufacturing data. So the question would be, if we consider doing this in the future and the two surveys don't coincide, is there anything we should consider if we think about post-stratifying to another sample of the census?
So we would want to use, say, the Annual Survey of Manufactures as the control total. Is there anything we should consider in doing that? I mean, those actual control totals will now contain sampling error, is one thing; the estimates will themselves contain sampling error, and the ASM itself is not really set up to give point estimates, although they are fairly good.
DR. SITTER: Well, I think you are going to run into, I mean conceptually run into, the same issues that we saw yesterday with the natural gas, essentially. I mean, there is going to be a change in the relationship over time, because you sort of have determined your stratification on the basis of old information, and the relationship is going to change over time.
So looking at it back over time and what's happened, you may see a pattern as to how those things might change. I am speaking incredibly vaguely; your problem is much more complicated than the one that was presented. Well, ultimately that's going to be the problem; it's a lag problem.
MR. HOUGH: Right.
DR.
FEDER: And in '06 the census data would be as old as possible, because in '07 you're going to have another census --
MR. HOUGH: Correct.
DR. FEDER: So, could you analyze the '06 data once you get the '07 census in? At least it will be more up-to-date than the '02 census. Or reanalyze it, and at least assess the impact of the uncertainty, by the way. But I would at least try to analyze the '06 MECS with the '07 census. In our surveys of individuals this is a problem, because we use census data, which is only every 10 years in the -- unlike in Canada, and so we sometimes use a 10-year-old census to calibrate our population estimates. And as you know, there was quite a surprise with the last census in some of the demographics, so clearly we face the same problem. But they do projections for the intercensal years; I don't know if anything like that is relevant here.
DR. HENGARTNER: The problem is of course because the country is changing, right? So -- do we make our post-stratification based on size or something like that? What are you using for the post-stratification?
MR. HOUGH: For the control totals?
DR. HENGARTNER: Yes.
MR. HOUGH: Well, we would rederive the cost of energy from the Annual Survey of Manufactures, the most recent Annual Survey of Manufactures. So we will mail the 2006 MECS at the beginning of '07, and as we process that, the annual survey, I mean the 2006 Annual Survey of Manufactures, will be processed. So by the time we get to where we are ready to publish our estimates, we will have an '06 ASM database to reestablish the control totals.
But my question was that, since we were able to do it this time to the census, which is the universe, which doesn't contain any sampling error and, you know, contains all those smaller establishments.
In the ASM, the smaller establishments are sampled with weights, so they're representative of other smaller establishments, so the control total would contain some level of error. Whether or not we should consider that when we think about doing this again -- you know, we could analyze the data, certainly, and look at it.
DR. FEDER: Can I go back to the original question of the weight adjustment? Looking at the formula, I see one thing. I think someone, Nick, said that it's likely that many small companies won't respond and the larger ones will, but let's assume for the moment that the opposite happened. So many companies that had a weight close to one do not respond. The other companies that had a weight close to one, which are more like them, their weight won't be adjusted much. If you look at the formula, which ones will get adjusted? The ones that had a weight far away from one. So I'm worried about bias here.
And that's why a simulation of the first kind, that I proposed, may not be the best, because maybe, the way the population is, it's not so bad and my concern is not so real. If you take the other kind of validation, by throwing away 20 percent of your sample, or just saying 20, some part of the sample, to see the impact, you will be able to do it, and I really like what Rick suggested, doing it based on the weight.
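The validation exercise mentioned here, setting aside about 20 percent of the sample as fictitious non-respondents chosen by weight, might look roughly like this. It is a sketch only: the sample is made up, the control total is a stand-in computed from the full sample rather than a census figure, the K adjustment is assumed to be w' = 1 + K(w - 1), and no separate non-response adjustment is applied before it. All names are hypothetical.

```python
import random


def k_adjusted_total(x, y, w, control_total):
    """Apply the assumed K adjustment, w' = 1 + K (w - 1), and total y."""
    denom = sum((wi - 1.0) * xi for wi, xi in zip(w, x))
    k = (control_total - sum(x)) / denom
    return sum((1.0 + k * (wi - 1.0)) * yi for wi, yi in zip(w, y))


def drop_by_weight(units, fraction, near_one):
    """Treat a fraction of units as fictitious non-respondents, by weight."""
    # Ascending order drops the weights nearest one first; descending
    # order drops the weights farthest from one first.
    ranked = sorted(units, key=lambda u: u["w"], reverse=not near_one)
    n_drop = int(round(fraction * len(ranked)))
    return ranked[n_drop:]


rng = random.Random(2004)
# Hypothetical sample: weight w, control variable x, survey variable y.
sample = []
for _ in range(50):
    w = rng.choice([1.0, 1.0, 2.0, 5.0, 20.0])
    x = 1000.0 / w * rng.uniform(0.8, 1.2)  # larger units get smaller weights
    sample.append({"w": w, "x": x, "y": 0.5 * x * rng.uniform(0.9, 1.1)})

control_total = sum(u["w"] * u["x"] for u in sample)  # stand-in for a census total
full = sum(u["w"] * u["y"] for u in sample)

for label, near_one in (("weights near one dropped", True),
                        ("weights far from one dropped", False)):
    kept = drop_by_weight(sample, 0.20, near_one)
    est = k_adjusted_total([u["x"] for u in kept], [u["y"] for u in kept],
                           [u["w"] for u in kept], control_total)
    print(f"{label}: estimate / full-sample estimate = {est / full:.3f}")
```

Comparing the two ratios gives a first indication of whether leaving weights near one almost untouched matters when the units that fail to respond are themselves near one.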
Make some non-respondents, in two exercises: one with those that have weights close to one, which would tend to be large companies, self-representing or close to being self-representing, and study the impact; and then another exercise where you make the fictitious non-respondents the ones that have weights far away from one, and see the impact. Because in theory, this estimate in my opinion would be generally biased, just because of the phenomenon -- because it tends to not alter the weight of units that have weights close to one, and those might be the ones that do not respond. So we are not treating them in the same way; that's my concern. It might not be a problem with your population, that's why I --
MR. HOUGH: The reality of the survey is that the coverage rate is about 89 percent, so we are getting those large companies, and the majority of the non-response is absolutely, definitely in the higher-weight and smaller establishments -- so I think if we do this simulation, and we assume that some of the certainty cases are non-response, you are going to see a lot different look to these weights.
MR. COLE: We know who the non