Economist's View: How Data Revisions Complicate Forecasting and Policy

How Data Revisions Complicate Forecasting and Policy

Evan Koenig of the Dallas Fed looks at how "analysis that neglects data revisions can lead policymakers and forecasters astray" and makes suggestions for how to cope with data that are subject to revision:

Through a Glass, Darkly: How Data Revisions Complicate Monetary Policy, by Evan F. Koenig, In Depth, Federal Reserve Bank of Dallas:

“Now we see through a glass, darkly; but then face to face: now I know in part; but then shall I know even as also I am known.” — 1 Corinthians 13:12

Through a glass, darkly. In his first letter to the Corinthians, Paul of Tarsus writes of our present limited self-knowledge: we see only a dim and distorted image of ourselves. Eventually, though, our true characters will be revealed. Government statistical releases, similarly, initially provide only a dim and distorted view of the economy. As more complete and more accurate data are assembled, our knowledge improves. But policymakers don’t have the luxury of waiting until all is revealed. Meanwhile, there is danger that they will misinterpret what they see.

An example: the elusive “comfort zone.” As an example of the potential importance of data revisions for monetary policy, consider the behavior of inflation in 2003.

The red line in Figure 1 shows the history of personal consumption expenditure (PCE) inflation, excluding food and energy, as it appeared in November of that year. Federal Reserve policymakers had several years earlier selected core PCE inflation as their preferred measure of price change, citing its broad coverage and superior tracking of shifts in household spending patterns. Core PCE inflation had been held to a fairly narrow 1 to 2 percent “comfort zone” for seven years running. Looking ahead, though, there was concern that inflation might experience an “unwelcome fall.” Partly because of this concern, the Federal Open Market Committee (FOMC) voted to cut the target federal funds rate at its June meeting.

Figure 1
The difference revisions can make: the elusive "comfort zone"

Unfortunately, the broad coverage and shifting spending shares that make PCE inflation so attractive on theoretical grounds have a big practical disadvantage: they make it vulnerable to substantial revision. In December 2003, the path of inflation suddenly looked like the blue line in Figure 1, not the red line. It was now apparent that inflation had exceeded 2 percent back in 2001 and—of more pressing concern—had been running at 1 percent or below for four months straight.

Eighteen months later, in June 2005, policy seemed to have stabilized inflation right in the middle of the 1 to 2 percent “comfort zone.” See the orange line in Figure 2. But another month of data brought yet another major revision (the green line in Figure 2). Concerns about deflation in 2003 suddenly seemed overblown, and inflation in 2004 and 2005 was revealed to be not in the middle of the comfort zone after all, but above its upper limit.

Figure 2
The difference revisions can make: the elusive "comfort zone"

Overview. This presentation reviews the different types of data revisions, provides evidence on the reliability of several important economic data series, illustrates how analysis that neglects data revisions can lead policymakers and forecasters astray, and makes suggestions for how to cope with data that are subject to revision.

The Three Main Sources of Revisions
Source #1: New estimates of seasonal patterns. The first main source of data revisions is new estimates of seasonal patterns. Most economic data have a discernible seasonal pattern due to predictable weather and holiday effects. Statistical agencies try to strip out this pattern to make it easier to identify the business-cycle movements that are of concern to policymakers. But seasonal patterns shift over time and have to be re-estimated, which leads to data revisions.

Because it takes at least three years of data to estimate a seasonal pattern, revisions to seasonal factors can extend over several years. On the other hand, seasonal patterns shift slowly enough that the resulting revisions are usually small. This is especially true of revisions to 12-month and 4-quarter growth rates.
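To make the mechanics concrete, here is a minimal sketch in Python of how re-estimating seasonal factors revises already-published history. The crude monthly-mean decomposition below is only a stand-in for the far more sophisticated procedures statistical agencies actually use, and the data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

def seasonal_factors(series, period=12):
    """Estimate additive seasonal factors as each month's mean deviation
    from the overall mean (a crude stand-in for official procedures)."""
    deviations = series - series.mean()
    return np.array([deviations[m::period].mean() for m in range(period)])

def adjust(series, factors, period=12):
    """Remove the estimated seasonal pattern from the raw series."""
    return series - np.resize(factors, len(series))

# Five years of synthetic monthly data: trend + seasonal swing + noise.
months = np.arange(60)
raw = 100 + 0.2 * months + 5 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 1, 60)

# Adjust the first four years using only the data available at the time...
early = adjust(raw[:48], seasonal_factors(raw[:48]))
# ...then re-adjust those same months after a fifth year of data arrives.
late = adjust(raw, seasonal_factors(raw))[:48]

# Re-estimated factors revise every previously published adjusted value,
# but because seasonal patterns shift slowly, the revisions are small.
print("mean absolute revision:", np.abs(late - early).mean())
```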

Source #2: More complete survey responses. The second source of revisions is the arrival of more complete survey responses. As new responses are processed and old responses are updated, government statisticians are able to improve the accuracy of their estimates of what transpired in any particular month.

Series derived from surveys with once-and-for-all monthly deadlines are unaffected by this sort of revision. Unaffected series include the unemployment rate, the Conference Board’s Consumer Confidence Index, the Institute for Supply Management’s manufacturing and non-manufacturing indexes, and the business-conditions indexes compiled by various Federal Reserve Banks. Another example is the Consumer Price Index, which is based on retail prices observed and recorded directly by Labor Department employees. Commodity and financial-asset prices, of course, are also not subject to this type of revision.

For series that are updated to capture late-arriving, more complete data, the government typically issues one or two revisions in the months immediately after the initial release. Other revisions follow later, at regular intervals, as data from annual surveys, censuses, or other sources become available. Revisions due to more complete data are responsible for most of the month-to-month and year-to-year changes in government data.

As an example, consider the sequence of official estimates of the number of nonfarm jobs added in Texas during March 2005. The initial estimate, a 10,600-job gain, was released in April 2005 (Figure 3). It was based on survey results for a sample of firms that collectively account for about 40 percent of nonfarm jobs. A first revision to March jobs growth was released a month later, along with the first estimate of April employment. It reflected corrections to previously received survey responses, as well as late-arriving responses, and showed a slightly smaller job gain. Finally, an annual revision was released in March 2006. It showed an increase twice as large as that previously estimated. Data for each of the other 11 months from October 2004 through September 2005 were revised at the same time. Annual revisions draw on tax reports submitted by employers who are covered under Texas unemployment insurance laws. These covered employers account for about 98 percent of nonfarm jobs, and the new estimates are definitive, apart from updates to seasonal factors.

Figure 3
Example: revisions to March 2005 Texas jobs growth
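As a back-of-the-envelope illustration of the revision sequence above, here is a short Python sketch. Only the initial 10,600-job figure appears in the text; the two later vintages are placeholders chosen to match "slightly smaller" and "roughly twice as large":

```python
# Tracking estimate "vintages" for a single reference month.
# Only the 10,600 initial figure comes from the text; the later
# figures are placeholders consistent with "slightly smaller" and
# "twice as large as that previously estimated."
vintages = {
    "2005-04 initial release": 10_600,
    "2005-05 first revision":  10_300,   # placeholder: slightly smaller
    "2006-03 annual revision": 20_600,   # placeholder: roughly double
}

baseline = vintages["2005-04 initial release"]
for vintage, estimate in vintages.items():
    pct = 100 * (estimate - baseline) / baseline
    print(f"{vintage}: {estimate:,} jobs ({pct:+.0f}% vs. initial)")
```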

Source #3: New methods and definitions, applied retroactively. Finally, revisions occur when new calculation methods or new definitions are applied to old data. Significant revisions of this type are relatively infrequent, and their timing is unpredictable.

A good example is a recent change to the construction of the Conference Board's Composite Leading Index. The red line in Figure 4 displays the history of the leading index as it appeared in June 2005. Note that the index fell nearly every month between April 2000 and the start of the 2001 recession 11 months later. The cumulative decline was 2 percent. But the index fell by an almost identical amount between May 2004 and May 2005, without a recession. The Conference Board concluded that its index was misinterpreting changes in the slope of the yield curve—changes in the difference between long-term and short-term interest rates. So, the index was reformulated.

Figure 4
The difference revisions can make: the Composite Leading Index

The blue line in Figure 4 shows the result. With one stroke, the 2005 recession warning was eliminated. The point is that the seemingly strong record of the leading index is in part the result of changes to the construction of the index that have erased its past failures.
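A stylized sketch of what a retroactive methodology change does. The component weights and the yield-curve treatment below are invented for illustration, not the Conference Board's actual formulas, but they show how recomputing an index under a new rule rewrites its entire history:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 120  # ten years of monthly observations

# Two stand-in components of a composite index: a real-activity
# indicator and the yield-curve slope (long rate minus short rate).
activity = rng.normal(0.1, 1.0, n)
yield_slope = rng.normal(0.5, 0.8, n)

# Old rule (invented): the monthly *change* in the slope feeds the index,
# so a flattening-but-still-positive curve drags the index down.
old_contrib = np.diff(yield_slope, prepend=yield_slope[0])
# New rule (invented): the *level* of the slope feeds the index, so a
# positive slope keeps contributing positively even as it flattens.
new_contrib = yield_slope

old_index = np.cumsum(0.5 * activity + 0.5 * old_contrib)
new_index = np.cumsum(0.5 * activity + 0.5 * new_contrib)

# Applying the new formula retroactively can erase past "warnings":
print("months where the old index fell but the new index rose:",
      int(np.sum((np.diff(old_index) < 0) & (np.diff(new_index) > 0))))
```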

Assessing and Enhancing Government Data
Assessing reliability. Taking into account all of these different types of revisions, just how reliable are early government statistical releases? How close do early releases come to capturing the movements that we see in the data available to us today? Let’s start with manufacturing capacity utilization, which is compiled by Federal Reserve Board staff in Washington, D.C. As shown in the table in Figure 5, 87 percent of the variation in today’s capacity utilization data was captured in the Board’s initial releases. Revisions over the next three months raise the fraction of variation explained only slightly, to 88 percent. Even after two years of revisions, 6 percent of the movements we observe today are left unexplained by the Board’s estimates.

For the unemployment rate, the story is very different. The unemployment data are unrevised except when seasonal factors are updated. Because these updates are small, the initial unemployment-rate estimates capture essentially all of the information that’s in today’s data.

Results for real growth as measured by gross domestic product (GDP), industrial production, and nonfarm jobs, and for inflation as measured by the GDP and PCE price indexes and the CPI, are similar to those for capacity utilization: revisions add little to reliability until one or two years after the initial statistical release. However, revisions to 12-month CPI inflation—driven entirely by changes in seasonal factors—are small.
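The "fraction of variation captured" figures read like something akin to the R-squared from regressing today's data on an earlier vintage; Koenig does not spell out the exact method, so the regression form below is an assumption, and the data are synthetic:

```python
import numpy as np

def fraction_explained(final, vintage):
    """R-squared from regressing today's (final) data on an earlier
    vintage -- one plausible reading of "fraction of variation captured"."""
    slope, intercept = np.polyfit(vintage, final, 1)
    residuals = final - (slope * vintage + intercept)
    return 1 - residuals.var() / final.var()

# Synthetic example: the final series equals the initial release plus
# revision noise, so the initial release explains most but not all of it.
rng = np.random.default_rng(2)
final = rng.normal(0, 1, 200)
initial = final + rng.normal(0, 0.4, 200)   # noisy early vintage
print(f"fraction explained: {fraction_explained(final, initial):.2f}")
```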

The message from Figure 5 is that the most important revisions are those undertaken to incorporate new data from surveys and censuses conducted at a frequency of once per year or less. The revisions in the month or two immediately after the government’s initial releases and revisions due to reestimation of seasonal factors contribute relatively little new information.

Figure 5
How reliable are government data?

Useful supplements to government statistics. Are there useful alternatives or supplements to government data that are not subject to large revisions? Yes, to begin with, there are formal business and consumer surveys like those published by the Institute for Supply Management, various Federal Reserve Banks, the University of Michigan, and the Conference Board. If the results of these surveys are revised at all, it is only from re-estimation of seasonal factors. There are also less-structured surveys like the “go-arounds” that are held at the regional Federal Reserve Bank directors’ meetings, and the calls that Reserve Bank presidents and their staffs make to business contacts in advance of each FOMC meeting. Studies have shown that some of these surveys have information beyond that available from real-time government statistical releases.

A big advantage of surveys of this type is their timeliness. The Institute for Supply Management’s manufacturing index, for example, is published the first business day of each month—about two weeks before the Federal Reserve Board’s index of manufacturing output. A downside is that participants often are not selected scientifically and may not be representative of the general population. Moreover, anecdotal accounts, like those contained in the Beige Book, can be difficult for inexperienced readers to interpret. That’s one reason our in-house regional economists are so important.

Commodity and financial-asset prices provide other useful supplements to government statistical releases. The former have historically provided early signals of emerging inflation pressures and the strength of the manufacturing sector, while quality and maturity spreads based on financial-asset prices are some of our most reliable indicators of overall real growth prospects.

Commodity and financial-asset prices have the advantage that they are available on a daily basis or even minute-by-minute. A problem is that although the indicators themselves are not subject to revision, their interpretation is. For example, as more manufacturing activity has shifted overseas, the correlation between commodity prices and the strength of the U.S. manufacturing sector has declined. Oil-price movements were once mostly driven by changes in world oil supply. Now, shifts in world demand are also important. ...

A Recipe for Trouble: Confusing Revised with First-Release Data
Seriously misleading conclusions and subpar forecasting results are likely when analysts and policymakers treat heavily revised and first-release data as if they are interchangeable. Let’s look at an example from the realm of inflation forecasting.

Lies, damned lies, and the markup. According to Benjamin Disraeli, “There are three kinds of lies: lies, damned lies, and statistics.” One statistic with great potential to mislead is a measure of profitability called “the markup.” It equals the dollar value of the goods and services firms produce, less the cost of materials and supplies, all divided by labor compensation. When the markup exceeds 1, firms’ revenues more than cover their variable costs.

The markup is potentially of interest for several reasons. First, it is the reciprocal of labor’s share of the value added to production by U.S. firms. When you hear someone say that labor’s share of aggregate output or aggregate income is at a near-record low, that’s equivalent to the statement that the markup is at a near-record high. In the same vein, when you hear that real wage growth has been lagging behind labor productivity growth, that’s equivalent to the statement that the markup has been rising. Finally—and of greatest importance for monetary policy—whenever the markup is unusually high, theory predicts that competition between firms should gradually drive it back down. That means that a high markup should act as a restraining influence on future inflation. Alan Greenspan gave prominent attention to the theoretical link between profit margins and future inflation in his July 2004 testimony that accompanied release of the Federal Reserve’s Monetary Policy Report to the Congress.
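A tiny worked example of these definitions, with all dollar figures invented:

```python
# Worked example of the markup definition above (figures invented).
output_value   = 1_000.0   # value of goods and services produced
materials_cost =   400.0   # cost of materials and supplies
labor_comp     =   450.0   # labor compensation

value_added = output_value - materials_cost
markup      = value_added / labor_comp    # 600 / 450 = 1.33 > 1
labor_share = labor_comp / value_added    # 450 / 600 = 0.75

# The markup is the reciprocal of labor's share, as the text notes.
assert abs(markup - 1 / labor_share) < 1e-12
print(f"markup = {markup:.2f}, labor share = {labor_share:.2f}")
```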

The markup and inflation. The strong correlation between the markup and inflation forecast errors made by professional forecasters certainly suggests that the markup deserves policymakers’ attention. In Figure 7, the forecast errors made by the professional forecasters who participate in the Blue Chip survey are measured on the vertical axis. The markup is measured on the horizontal axis. A point is plotted for each year from 1984 through 2002, showing the relationship between the markup at the end of the prior year and the Blue Chip forecast error. The positive slope of the scatter of points means that professional forecasters have systematically overpredicted inflation when the markup is high, and underpredicted inflation when the markup is low. Either professional forecasters are ignoring important information, or there’s something not quite right with this chart.

Figure 7
At first glance, the markup appears to help forecast inflation

What’s “not quite right,” of course, is that the markup estimates available to us today are not the markup estimates that were available to these forecasters. Sure enough, when we replace today’s markup estimates with the first-release estimates available to forecasters in real time, the correlation between the markup and inflation completely disappears (Figure 8). The markup may be useful for understanding inflation after the fact, but it’s useless for predicting inflation.

Figure 8
But the markup is useless in real time

Poor forecasts from confusing current with real-time data. Indeed, the markup is worse than useless for forecasting if you naively assume that the relationship between today’s markup estimates and inflation also describes the relationship between first-release markup estimates and inflation. On its own, the Blue Chip survey successfully anticipates 67 percent of the variation in next year’s inflation. If you conduct an after-the-fact exercise in which you supplement Blue Chip inflation forecasts with today’s markup data, it appears that you can increase predictive power to 79 percent. However, as we’ve discussed, this exercise is artificial, because today’s markup data would not actually have been available in real time.

Unfortunately, the fact that only first-release data are available for actual forecasting all too often doesn’t stop analysts from using revised data to estimate their forecasting equations. In the case of inflation, if you estimate using revised markup data and then forecast by plugging in first-release data as they become available, predictive performance is substantially worse than if you had ignored the markup entirely: only 56 percent of the variation in next year’s inflation is successfully anticipated.

The message is that if you’re going to be forecasting with first-release data, the correct thing to do is to estimate using first-release data.
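This pitfall is easy to reproduce. The sketch below builds a synthetic world in which inflation depends on the revised markup while the first release measures it with heavy noise, then compares the two estimation strategies. The numbers and noise levels are invented, not Koenig's:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300

# Synthetic world: inflation depends on the *revised* markup, but the
# first-release markup measures it very noisily (as in Figure 9).
revised = rng.normal(0, 1, n)
first_release = revised + rng.normal(0, 2.0, n)   # poor early vintage
inflation = 0.8 * revised + rng.normal(0, 0.5, n)

def fit_and_score(x_fit, x_use, y):
    """Fit y on x_fit, predict with x_use, return share of variance explained."""
    slope, intercept = np.polyfit(x_fit, y, 1)
    resid = y - (slope * x_use + intercept)
    return 1 - resid.var() / y.var()

# Naive: estimate on revised data, then plug in first-release data.
print("estimate on revised, forecast with first-release:",
      round(fit_and_score(revised, first_release, inflation), 2))
# Correct: estimate on the same first-release data you will forecast with.
print("estimate and forecast with first-release:",
      round(fit_and_score(first_release, first_release, inflation), 2))
```

Directionally, this matches Koenig's finding: the naive strategy can leave you worse off than ignoring the markup altogether, while estimating on first-release data at least extracts whatever signal the early vintage contains.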

Early estimates of the markup nearly worthless. The consequences from confusing revised with first-release data are especially severe in the case of the markup because real-time markup data are of such poor quality. You can see the poor quality of real-time markup estimates in Figure 9, which shows the latest data in blue and the first-release estimates in red. Over the entire period shown, the first-release data account for only 5 percent of the variation that we see in today’s markup data. Since 1990, they account for only 2 percent.

Figure 9
Early estimates of the markup nearly worthless

Even with the benefit of a year’s worth of revisions—the green line in Figure 10—markup estimates account for no more than 20 percent of today’s markup variation. So, be a little skeptical about claims that labor’s share of output is at a near-record-low level, or that high profit margins are going to restrain inflation, until the data have been through several annual revisions.

Figure 10
Early estimates of the markup nearly worthless

Summary
The main conclusions from this review of data revisions are as follows:

  • Not all revisions are created equal. The contributions of seasonal and month-to-month revisions to the accuracy of government statistical estimates are generally minor. It’s the less-frequent annual, comprehensive, and benchmark revisions that really matter.
  • By supplementing the government’s formal statistical releases with information from other sources, it’s sometimes possible to obtain a more accurate picture of the economy. At the Dallas Fed, we’ve had particular success using unemployment insurance tax records to make early updates to Texas jobs data.
  • That revised data show one variable leading another says next to nothing about whether the first variable is of any practical use in forecasting the second.
  • Finally, forecasting relationships estimated with heavily revised data often perform poorly when applied to the first-release data that are available in real time.
Posted by Mark Thoma on October 5, 2006 at 12:58 AM in Economics, Methodology, Policy