EIA Guidelines for Statistical Graphs

DOE/EIA-0465(98)

Line Graphs

Line graphs are the simplest and most effective format to present data, particularly time series data. They are the format used most frequently in Energy Information Administration (EIA) products to show long-term trends or cyclic variation covering shorter periods of time (i.e., months and quarters). This chapter discusses the design and construction of simple line graphs. It focuses on three areas:

  • Keeping line overlap to a minimum
  • Graphing variables with different scales on the same graph
  • Illustrating seasonal variation

Basic Line Graphs

The usual method to construct a basic line graph is to plot data points representing a total, or average, over a period of time and then to join the successively plotted data points with straight lines representing the change occurring between data points. This is not appropriate when changes occur at a precise point in time and then remain level until the next abrupt change. A "step" chart is a better format for such data. Horizontal lines are drawn through each point, and the end of each line is connected by a vertical line to the next data point. A "step" chart is good for data such as allocations of funds or personnel ceilings. It is also appropriate for prices if the change occurs abruptly.
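The step construction described above can be sketched in plain Python. This is a minimal illustration, not an EIA procedure: the function name `step_path` and the sample values are hypothetical, and it assumes each value holds from its own X position until the next data point.

```python
def step_path(points):
    """Return the vertices of a step line for (x, y) points sorted by x.

    Assumption: each value stays level until the next x, where a
    vertical segment jumps to the new value.
    """
    if not points:
        return []
    path = [points[0]]
    for (_, prev_y), (x, y) in zip(points, points[1:]):
        path.append((x, prev_y))      # horizontal run at the old level
        if y != prev_y:
            path.append((x, y))       # vertical jump to the new level
    return path


# Hypothetical personnel ceilings by fiscal year (illustrative values only).
ceilings = [(1990, 100), (1991, 100), (1992, 120), (1993, 110)]
print(step_path(ceilings))
```

Joining the returned vertices in order with straight segments yields only horizontal and vertical lines, so no gradual change is implied between data points.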

Figure 4 shows that, when the new supply of motor gasoline for each month falls short of demand, the deficit is made up by withdrawal from stocks. When the new supply exceeds demand, the surplus is added to the stock buildup. The differences between the new supply and demand of each month are represented by the shaded areas.
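The relationship between supply, demand, and stock change can be expressed as a simple difference for each period. The sketch below is illustrative only; the function name `stock_changes` and the figures (in hypothetical thousand barrels per day) are assumptions and are not taken from Figure 4.

```python
def stock_changes(supply, demand):
    """Label each period's supply-minus-demand difference.

    A positive difference is a surplus added to stocks; a negative
    difference is a deficit made up by withdrawal from stocks.
    """
    result = []
    for s, d in zip(supply, demand):
        diff = s - d
        if diff > 0:
            label = "Stock Buildup"
        elif diff < 0:
            label = "Stock Withdrawal"
        else:
            label = "No Change"
        result.append((diff, label))
    return result


# Hypothetical monthly values, not actual EIA data.
new_supply = [7100, 7300, 7500]
demand = [7200, 7300, 7400]
for diff, label in stock_changes(new_supply, demand):
    print(diff, label)
```

The sign of each difference determines which of the two shaded areas a month would fall into.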

Guidelines Used for Figure 4:
  1. The title complies with EIA publications standards; it has the necessary information. (The dates of data in the title are optional.) Also, the title is left-justified, with the proper font size (10 point bold for paper-printed publications), and has the word "Figure," followed by the figure number, with a period after the number.
  2. The Y-axis is on the right side of the graph and a zero and break marks have been placed on it. Also, the Y-axis label is parallel to the Y-axis and faces inward. Also note that EIA does not customarily use "0.0" even though other items on the Y-axis may have decimal numbers.
  3. The tick marks on both the horizontal and vertical axes are outside the graph frame.
  4. Line patterns differentiate the data lines.
  5. Line labels identify the data lines. The opposite approach, removing the line labels and using a legend, is also acceptable. A darker blue shade is used to highlight "Stock Buildup" and a light gray to highlight "Stock Withdrawal."
  6. The X-axis labels are horizontal and are placed in-between the tick marks to designate time intervals, and the data points are plotted at the midpoint of each interval.
  7. A grid line (comparatively light against the data lines) is used to mark the end of 1991 and, thus, highlights the seasonal pattern of the data.
  8. The data source citation is indented and in the lower-left corner below the graph. The checklist, "Text, Tables, and Graphs," in the EIA Standards Manual and the section on "Figures" in the EIA Publishing Style Guide outline EIA standards for citing data sources in statistical graphs in EIA products.

Overlapping Lines and Alternative Presentations

Sometimes more data than can be clearly presented are displayed in a line graph and the presentation becomes cluttered. Changes then need to be considered either in the level of data presented (e.g., by aggregating them) or in the graph format. This section illustrates methods to eliminate overlap in line graphs.

Figure 5. Revenue per Kilowatthour for Utilities, Highest State and
Lowest State Averages With U.S. Average for the Residential
Sector, January 1991 Through July 1992

The above figure is an example of a convenient method to summarize a large amount of data. It presents a summary of the complete data series (all 50 States). In the Lowest Average Revenue Rate line, each data point represents the State whose utilities had the lowest average revenue per kilowatthour in that month while, conversely, in the Highest Average Revenue Rate line each data point represents the State whose utilities had the highest average revenue per kilowatthour in that month. Figure 5 presents a clear "picture" that would be difficult to visualize looking at a table with many rows and columns of data.
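The reduction from a full set of State series to a highest line and a lowest line could be computed as in the following sketch. The function name `monthly_extremes`, the placeholder State names, and the rates are hypothetical. The U.S. average line is not computed here, since it would normally come from the published national series rather than from a simple mean of State averages.

```python
def monthly_extremes(rates_by_month):
    """For each month, find the States with the lowest and highest rate.

    rates_by_month maps a month label to a dict of {state: rate}.
    Returns a list of (month, (low_state, low_rate), (high_state, high_rate)).
    """
    summary = []
    for month, rates in rates_by_month.items():
        low_state = min(rates, key=rates.get)
        high_state = max(rates, key=rates.get)
        summary.append(
            (month, (low_state, rates[low_state]), (high_state, rates[high_state]))
        )
    return summary


# Hypothetical cents-per-kilowatthour values for three placeholder States.
rates_by_month = {
    "Jan 1991": {"State A": 4.8, "State B": 7.9, "State C": 10.2},
    "Feb 1991": {"State A": 5.1, "State B": 4.9, "State C": 10.0},
}
for row in monthly_extremes(rates_by_month):
    print(row)
```

Note that the State behind each extreme can change from month to month, as it does for the lowest rate in this made-up data.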

Another alternative for presenting these data would be a statistical map. This is illustrated in Figure 23 (Statistical Maps chapter). In this figure, each State is classified into one of four categories: under 5 cents, 5.1 to 6.0 cents, 6.1 to 7.0 cents, and over 7.0 cents. On the map, a light to dark color coding scheme is used to represent the four categories. The lightest shading represents the under-5-cents category and the darkest shading represents the over-7-cents category.

Line Graphs With Different Scales

In the previous examples, the line graphs were measured in the same units. Sometimes, the purpose is to compare graphs measured in different units. This can only be done by using separate graphs, or by transforming the data to a common scale.

Graphs with multiple scales are difficult to interpret and may mislead unwary readers. In a graph, the line of interest is judged against a background grid indicated by the axes scale marks. Portraying two backgrounds is analogous to a double-exposure photograph and very confusing.

The analyst has several alternatives to presenting data in a graph with multiple scales. The following are examples:

If the timing of changes in consumption and price is the primary interest, but relative magnitudes are not relevant, then two separate graphs, one above the other or side-by-side, could be used. Indicators such as vertical lines or message labels also can be used to emphasize or accentuate features of particular interest.

If the pattern of annual change is of primary interest, then annual percentage change from the previous year in price and consumption can be plotted, as in Figure 6 below. The variables are in comparable units. The reader can observe that from 1979 to 1981, the price increased substantially from the previous year, while consumption decreases were less than 10 percent. The reader can also see that from 1981 to 1985, there were small annual decreases in consumption and a varying pattern of decreases or increases in price. From 1985 to 1986, consumption still barely increased, but there was a sharp (nearly 23 percent) drop in prices.

The same pattern is also observed from 1986 to 1992. Consumption changed little, while prices varied. Prices increased slightly from 1986 to 1988, but rose 10 and 15 percent from 1988 to 1990. From 1990 to 1992, prices declined 1.7 and 0.5 percent.
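The percentage-change transformation used for this kind of graph is straightforward to compute. The sketch below is a minimal illustration; the function name `percent_change_from_previous` and the sample values are hypothetical and are not the data behind Figure 6.

```python
def percent_change_from_previous(values):
    """Return the percent change of each value relative to the one before it.

    The result has one fewer element than the input, because the first
    period has no previous period to compare against.
    """
    return [100.0 * (cur - prev) / prev for prev, cur in zip(values, values[1:])]


# Hypothetical annual series in different units (illustrative values only).
price = [100.0, 125.0, 100.0]        # e.g., cents per gallon
consumption = [8.0, 7.6, 7.6]        # e.g., million barrels per day

print(percent_change_from_previous(price))
print(percent_change_from_previous(consumption))
```

Because both results are in percent, the two series can share a single vertical scale even though the original units differ.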

Figure 6. Change as a Percent of Previous Year for U.S. Motor Gasoline
Retail Price and Consumption, 1979 Through 1992

Figure 7. U.S. Motor Gasoline Retail Price and Consumption as a
Percent of 1987 Levels, 1978 Through 1992

If long-term change rather than year-to-year change is of interest, the magnitude of change in both price and consumption relative to the same baseline is the appropriate graph. In Figure 7, gasoline retail price and consumption are expressed as a percentage of 1987 values. The figure shows clearly that, between 1978 and 1981, prices doubled while consumption dropped about 9 percent. Between 1981 and 1992, prices varied widely while consumption never ranged more than 9 percent below the 1987 level or 1.5 percent above it.
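Expressing each series as a percent of a common base year can be sketched as follows. The function name `index_to_base` and the sample values are hypothetical; only the choice of 1987 as the base year comes from the text.

```python
def index_to_base(series, base_year):
    """Express every value in a {year: value} series as a percent of the base year."""
    base = series[base_year]
    return {year: 100.0 * value / base for year, value in series.items()}


# Hypothetical series in different units (illustrative values only).
price = {1986: 80.0, 1987: 100.0, 1988: 110.0}
consumption = {1986: 6.0, 1987: 8.0, 1988: 9.0}

print(index_to_base(price, 1987))
print(index_to_base(consumption, 1987))
```

Both indexed series equal 100 in the base year, so their lines cross at that point and can be read against one vertical scale.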

Graphs should bring out the features of the data that are of the greatest interest. Multiple scales may hide or distort these features, causing readers to make incorrect deductions about the relative magnitude of changes in the data series and the interrelationship of these changes.

Time Comparisons

Frequently, when a long time series is being compared, the focus of interest is the relationship of current data to past data. Figures 8 and 9 (below) show two methods for graphically comparing current weekly, monthly, or quarterly data with past data. In Figure 8, total monthly stocks of crude oil and petroleum products are presented for 1990 through 1992. The data for the same month are plotted on the same vertical line. The monthly observations for the same year are connected by a line. (Note that the data in Figure 8 are point data; i.e., stocks as of the end of each month and, therefore, are plotted on the X-axis tick marks, not between them. Thus, the connecting lines start at the end of January, the first period plotted for the first year, but shift to December for subsequent years.)

Figure 8. Stocks of Crude Oil and Petroleum Products (Excluding SPR)
U.S. Total, January Through December, 1990 Through 1992

Figure 8 is intended to illustrate whether present data (1992) follow the same patterns as the past data. The graph shows that there are some similar and dissimilar patterns among the 3 years that are presented. There is a rise in stocks during the summer, followed by a decline during the fall and winter for all years. Except for October through December, 1990 stock levels are higher than the 1991 levels which, in turn (except for January through April), are higher than the 1992 levels. The 1991 levels are the most variable. As noted, the January through April stock levels for that year are lower than stock levels for the other two years, but the October through December levels are the highest. Thus, it is not known to what extent the 3 years differ from one another, what is a trend, or what is a seasonal effect.

Figure 9, below, illustrates a method (which is used predominantly in EIA's petroleum supply publications) to compare present data with past data.

Figure 9. U.S. Stocks of Motor Gasoline, January 1991 Through
December 1992


In the above graph, the most recent data, January 1991 through December 1992, are plotted against the "average range" for the past 3 years. (The "average range" is a combination of a seasonal adjustment technique developed at the U.S. Bureau of the Census and the average and variation, i.e., standard deviation, from the most recent 3 years' data.) Two recent years are plotted so that comparison can be readily made between the current year and the year before.
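The exact computation of the "average range" is not given here, so the following is only a simplified sketch of the general idea of a past-range band. It omits the Census seasonal adjustment step entirely and simply takes, for each calendar month, the mean plus and minus one sample standard deviation over the past 3 years. The function names, the one-standard-deviation width, and the data are all assumptions.

```python
from statistics import mean, stdev


def simple_range_band(history):
    """Build a (low, high) band per month from past years.

    history is a list of yearly lists with one value per month.
    Assumption: band = mean +/- one sample standard deviation, with
    no seasonal adjustment applied.
    """
    band = []
    for month_values in zip(*history):
        center = mean(month_values)
        spread = stdev(month_values)
        band.append((center - spread, center + spread))
    return band


def outside_band(current, band):
    """Return the indexes of periods whose current value falls outside the band."""
    return [
        i
        for i, (value, (low, high)) in enumerate(zip(current, band))
        if value < low or value > high
    ]


# Hypothetical stocks (million barrels) for two months over three past years.
history = [[220, 210], [230, 215], [240, 220]]
band = simple_range_band(history)
print(band)
print(outside_band([235, 200], band))
```

Plotting the band as a shaded region with the current series drawn over it makes periods that fall outside the band easy to pick out.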

Historically, stocks of crude oil and petroleum products reach their highest levels in the winter months (i.e., November through March) and their lowest levels in the summer and fall (i.e., June through October). Both the 1991 data and most of the 1992 data shown in Figure 9 conform to this pattern. The 1992 variation from the pattern is in June through August. In June, motor gasoline stocks were approximately 225 million barrels, then declined to 215 million in July, and to near 200 million (the observed minimum) in August.

In summary, the presentation in Figure 9 permits easy identification of "unusual" data, which industry analysts can usually explain. The graph helps support the analysts' discussions and helps publication users focus on the data.

Summary of When and How To Use Line Graphs
  1. The data to be portrayed represent a dependent variable changing over an independent continuous variable such as time (as shown in Figure 4, U.S. Daily Average Motor Gasoline Supply and Demand, January 1991 Through July 1992).
  2. For a line graph to be used effectively, preferably five or more data points are needed for the display. With fewer than five data points, a bar or pie chart is more effective.
  3. If possible, use the same scales when plotting more than one graph on a page. When different scales of the same units are used, e.g., $40 to the inch for one graph and $20 to the inch for another, comparison is difficult. In that case, a footnote should be added to the graphs stating, "Because vertical scales differ, graphs should not be compared."
  4. In general, four data series may be effectively displayed in a line graph, except when overlapping lines could cause confusion. When overlap is minimal, more than four lines often can be effective.
  5. With multiple data series, plotting the range, with the average superimposed, is more effective than overlapping lines.
  6. Only one vertical scale is permissible within a simple plot. This means that variables measured in different units must be transformed to a common scale before plotting.
  7. Comparisons over time may be more effectively plotted against a past range than against the 2 most recent years.
