Economist's View: FRB Dallas: The Perils of Using Preliminary Data

Link: Economist's View: FRB Dallas: The Perils of Using Preliminary Data.

Evan Koenig of the Dallas Fed has a nice discussion of how data revisions complicate monetary policy:

Through a Glass, Darkly: How Data Revisions Complicate Monetary Policy, by Evan F. Koenig, Economic Letter, Federal Reserve Bank of Dallas: Over the course of any year, we receive a veritable tidal wave of numbers on the U.S. economy's performance—readings on output, inflation, employment, productivity and so much more. Policymakers, business operators, investors and the general public look to these data to make economic decisions. Unfortunately, some early statistical releases only imperfectly reflect what's happening. As more complete and accurate data come out, the view they provide improves.

Looking at preliminary data, policymakers and others may misinterpret what they see, leading to mistakes that could harm the economy. A better understanding of the nature of the revisions that regularly alter the data should lessen the chances of acting on information that doesn't accurately reflect economic realities.

As an example of the potential importance of data revisions for monetary policy, consider the behavior of personal consumption expenditure (PCE) inflation, excluding food and energy. Core PCE inflation is policymakers' preferred measure of trend price change because of the PCE price index's relatively broad coverage and superior tracking of shifts in household spending patterns.

As of November 2003, government data showed that core PCE inflation had been held to a fairly narrow 1 to 2 percent range for several years running—a range that several Fed policymakers subsequently identified as their inflation "comfort zone" (Chart 1A). Citing worries about a possible unwelcome fall in inflation, the Federal Open Market Committee had voted to stimulate the economy by cutting the target federal funds rate at its June 2003 meeting.

Unfortunately, the broad coverage and shifting spending shares that make PCE inflation so attractive have a big, practical disadvantage: They make the measure vulnerable to substantial revision. By December 2003, new data had significantly altered the path of core PCE inflation. It was now apparent that inflation had exceeded 2 percent back in 2001 and—of more pressing concern—had been running at 1 percent or below for four months.

Cut to June 2005 (Chart 1B). Core PCE inflation appeared to have stabilized at about 1.5 percent. Another month of data, however, brought yet another major revision. Concerns about excessively low inflation in 2003 now seemed possibly exaggerated. Just as important, inflation in much of 2004 and 2005 wasn't in the middle of the comfort zone after all, but above its 2 percent upper limit.

Chart 1: The Shifting Inflation Picture

Main Sources of Revisions

Most data revisions fall into one of three categories.

New estimates of seasonal patterns. Most economic data have a discernible seasonal pattern due to predictable weather and holiday effects. Statistical agencies try to strip out this pattern to make it easier to identify the business-cycle movements of concern to policymakers.

But seasonal patterns shift over time and have to be reestimated, which leads to data revisions. Because it generally takes three years of data to estimate seasonal patterns, revisions due to seasonal factors can extend over several years. On the other hand, seasonal patterns shift slowly enough that resulting revisions are usually small.

More complete survey responses. Many government data series are based on survey responses. As new responses are processed and old responses are corrected, statisticians are able to improve the accuracy of earlier estimates of what transpired in any particular month.

For series updated to capture late-arriving, more complete data, the government typically issues one or two revisions in the months immediately after the initial release. Other revisions follow later, at regular intervals, as data from annual surveys, censuses or other sources become available. Revisions due to more complete data are responsible for most of the month-to-month and year-to-year changes in economic data. ...

Series derived from surveys with once-and-for-all monthly deadlines aren't subject to revisions based on new information. They include the unemployment rate, the Conference Board's Consumer Confidence Index, the Institute for Supply Management's manufacturing and nonmanufacturing indexes, and the business-conditions indexes compiled by various Federal Reserve Banks.

Another example is the Consumer Price Index, which is based on retail prices observed and recorded directly by Labor Department employees. Commodity and financial asset prices, of course, are also not subject to this type of revision.

New methods and definitions, applied retroactively. Finally, revisions occur when new calculation methods or new definitions are applied to old data. Significant revisions of this type are relatively infrequent, and their timing can be irregular.

Chart 3: The Case of the Missing Recession

A recent change to the construction of the Conference Board's Composite Leading Index provides a good example. Looking at the index as it appeared in June 2005, we see that it fell nearly every month between April 2000 and the start of the 2001 recession 11 months later (Chart 3). The cumulative decline was 2 percent. But the index fell by an almost identical amount between May 2004 and May 2005 without a recession.

The Conference Board concluded that its leading index was misinterpreting changes in the slope of the yield curve—changes in the difference between long-term and short-term interest rates. In July, the index was reformulated. With one stroke, the 2005 recession warning was eliminated. The seemingly strong record of the leading index is at least in part an illusion due to changes to its construction that have erased its past failures.

Assessing and Enhancing Data

Taking into account these different types of revisions, just how reliable are early government statistical releases? How close do early releases come to capturing the movements we see in the data available to us today?

Let's start with manufacturing capacity utilization, which is compiled by Federal Reserve Board staff in Washington, D.C. Initial releases capture 87 percent of the variation in today's capacity utilization data (Table 1). Revisions over the next three months raise the fraction of variation explained only slightly—to 88 percent. After two years of revisions, 6 percent of the movements we observe today remain unexplained.
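One way to read a figure such as "87 percent of the variation" is as the R-squared from a simple regression of today's values on the early release. The article does not spell out the calculation behind Table 1, so the Python sketch below is an assumption about the method, and the function name and sample numbers are hypothetical.

```python
from statistics import mean


def share_of_variation_explained(early_release, latest):
    """R-squared from regressing `latest` on `early_release` with an intercept.

    For a one-variable regression this equals the squared correlation
    between the two series. Assumed interpretation, not the article's
    documented method.
    """
    if len(early_release) != len(latest) or len(latest) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mx, my = mean(early_release), mean(latest)
    sxy = sum((x - mx) * (y - my) for x, y in zip(early_release, latest))
    sxx = sum((x - mx) ** 2 for x in early_release)
    syy = sum((y - my) ** 2 for y in latest)
    if sxx == 0 or syy == 0:
        raise ValueError("both series must vary")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical readings: an early release and the same periods as published today.
early = [1.0, 2.0, 3.0, 4.0]
today = [1.0, 3.0, 2.0, 4.0]
r_squared = share_of_variation_explained(early, today)
```

A value near 1 would mean revisions barely changed the picture; a value near 0 would mean the early release said almost nothing about the data as they stand today.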

Table 1: How Reliable Are Government Data?

The effects of revisions on real growth as measured by gross domestic product (GDP), industrial production and nonfarm jobs are similar. Revisions add little to reliability until a year or more after the initial statistical release. The same holds for inflation as measured by the GDP and PCE price indexes.

For the unemployment rate and inflation as measured by the Consumer Price Index, the story is very different. These series are unrevised, except when seasonal factors are updated. Because these updates are small, the initial estimates capture essentially all the information in today's data.

This brief survey suggests that the most important revisions are those undertaken to incorporate new data from surveys and censuses conducted once a year or even less frequently. The revisions in the month or two immediately after the government's initial releases and revisions due to reestimation of seasonal factors contribute relatively little new information.

In addition to government data, useful alternatives or supplements exist that aren't subject to large revisions. To begin with, formal business and consumer surveys are published by the Institute for Supply Management, various regional Federal Reserve Banks and the Conference Board. If the results of these surveys are revised at all, it's only from reestimation of seasonal factors.

There are also less-structured surveys, like the roundtables held at the Federal Reserve Bank directors' meetings and the calls Reserve Bank presidents and their staffs make to business contacts in advance of Federal Open Market Committee meetings. Studies have shown that some of these surveys contain information beyond what's available from real-time government statistical releases.[1]

A big advantage of many nongovernmental surveys is their timeliness. The Institute for Supply Management's manufacturing index, for example, is published the first business day of each month—about two weeks before the Federal Reserve's index of manufacturing output. A drawback is that participants often aren't selected scientifically and may not be representative of the general population. Moreover, anecdotal accounts, like those in the Fed's Beige Book, can be difficult for inexperienced readers to interpret.

Key market prices can also add to our understanding of the economy. Commodity prices have historically provided early signals of emerging inflation pressures and the strength of the manufacturing sector. Quality and maturity spreads based on financial asset prices provide some of our most reliable indicators of overall real growth prospects.

Commodity and financial asset prices have the advantage of being available on a daily basis or even minute by minute. A problem is that although the indicators themselves aren't subject to revision, their interpretation is. For example, as more manufacturing activity has shifted overseas, the correlation between commodity prices and the strength of the U.S. manufacturing sector has declined.[2] Oil price movements were once mostly driven by changes in world supplies. Now, shifts in world demand are increasingly important.[3] ...

A Recipe for Trouble

Seriously misleading conclusions and subpar forecasting results are likely when analysts and policymakers treat heavily revised and first-release data as if they are interchangeable. Let's look at an example from the realm of inflation forecasting.

Lies, damned lies and the markup. British politician Benjamin Disraeli famously remarked that "there are three kinds of lies: lies, damned lies and statistics." One statistic with great potential to mislead is a measure of profitability called the markup. It equals the dollar value of the goods and services firms produce, less the cost of materials and supplies, all divided by labor compensation. When the markup exceeds 1, firms' revenues more than cover variable costs.

The markup is interesting for several reasons. First, it's the reciprocal of labor's share of the value added to production by U.S. firms. When you hear someone say that labor's share of aggregate output or aggregate income is at a near-record low, that's equivalent to the statement that the markup is at a near-record high. In the same vein, when you hear that real wage growth has been lagging behind labor productivity growth, that's equivalent to the statement that the markup has been rising. Finally—and of greatest importance for monetary policy—whenever the markup is unusually high, theory predicts that competition between firms should gradually drive it back down. That means a high markup should act as a restraining influence on future inflation. Former Fed Chairman Alan Greenspan gave prominent attention to this link in his July 2004 testimony before Congress.[5]
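The definition and the two equivalences above can be checked with a few lines of arithmetic. This Python sketch follows the article's verbal definition. The function names and dollar figures are hypothetical, and the productivity identity assumes value added equals price times real output and compensation equals the wage times hours worked.

```python
def markup(output_value, materials_cost, labor_compensation):
    """(Value of output - cost of materials and supplies) / labor compensation."""
    if labor_compensation <= 0:
        raise ValueError("labor compensation must be positive")
    return (output_value - materials_cost) / labor_compensation


def labor_share(output_value, materials_cost, labor_compensation):
    """Labor compensation as a fraction of value added."""
    value_added = output_value - materials_cost
    if value_added <= 0:
        raise ValueError("value added must be positive")
    return labor_compensation / value_added


def markup_from_productivity(real_output_per_hour, real_wage_per_hour):
    """Same ratio rewritten: (P*Y)/(W*H) = (Y/H) / (W/P).

    Assumes value added = P*Y and compensation = W*H, so the markup
    rises whenever productivity grows faster than the real wage.
    """
    if real_wage_per_hour <= 0:
        raise ValueError("real wage must be positive")
    return real_output_per_hour / real_wage_per_hour


# Hypothetical figures, in dollars.
m = markup(1000.0, 400.0, 400.0)        # 600 of value added over 400 of pay
s = labor_share(1000.0, 400.0, 400.0)   # 400 of pay over 600 of value added
```

With these made-up figures the markup is 1.5 and labor's share is about 0.67, and their product is 1, which is the reciprocal relationship the text describes.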

The markup and inflation. Let's compare inflation forecast errors in the Blue Chip survey of professional forecasters with the markup at the end of the prior year. We find that from 1984 through 2002, forecasters systematically overpredicted inflation when the markup was high and underpredicted inflation when the markup was low (Chart 5A). Either the Blue Chip forecasters have been ignoring important information, or there's something not quite right in this relationship.

What's not quite right, of course, is that the markup estimates available to us today aren't the markup estimates that were available to these forecasters. Sure enough, when we replace today's markup estimates with the first-release estimates available in real time, the correlation between the markup and inflation disappears (Chart 5B). The markup is useful for understanding inflation after the fact, but no help in predicting it.[6]

Chart 5: The Markup and Inflation

Poor forecasts from confusing current with real-time data. Indeed, the markup is worse than useless for forecasting if you naively assume the relationship between markup estimates and inflation is the same for first-release data and subsequent revisions. On its own, the Blue Chip survey successfully anticipates 68 percent of the variation in the next year's inflation. If you conduct an after-the-fact exercise in which you supplement Blue Chip inflation forecasts with today's markup data, it appears that you can increase predictive power to 77 percent. However, this exercise is artificial because today's markup data wouldn't have been available in real time.

Unfortunately, the fact that only first-release data are available for actual forecasting all too often doesn't stop analysts from using revised data to estimate their forecasting equations. In the case of inflation, if you estimate using revised markup data and then forecast by substituting first-release data as they become available, predictive performance is substantially worse than if you had ignored the markup entirely. Only 57 percent of the variation in next year's inflation is successfully anticipated.

The message is clear: If you're going to forecast with first-release data, the correct thing to do is to estimate using first-release data.[7]
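The mechanism can be illustrated with a small simulation. The Python sketch below uses synthetic data, not the Blue Chip or markup series, and every parameter in it (noise levels, sample size, the slope of -1) is an assumption chosen for illustration. It compares three forecasters: one who ignores the markup, one who estimates on revised data but forecasts with first-release data, and one who estimates and forecasts with first-release data.

```python
import random
from statistics import mean


def ols(x, y):
    """Intercept and slope from a one-variable least-squares fit."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - slope * mx, slope


def mse(actual, predicted):
    """Mean squared forecast error."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def simulate(n=2000, seed=0):
    """Out-of-sample errors under three estimation choices (synthetic data)."""
    rng = random.Random(seed)
    # Hypothetical "true" (fully revised) markup, standardized.
    revised = [rng.gauss(0.0, 1.0) for _ in range(2 * n)]
    # First release = revised value plus large measurement noise (assumed).
    first = [m + rng.gauss(0.0, 2.0) for m in revised]
    # Variable to forecast falls when the revised markup is high (assumed slope -1).
    target = [-m + rng.gauss(0.0, 0.5) for m in revised]

    train, test = slice(0, n), slice(n, 2 * n)
    a_rev, b_rev = ols(revised[train], target[train])   # estimated on revised data
    a_fst, b_fst = ols(first[train], target[train])     # estimated on first releases
    baseline = mean(target[train])                      # forecast that ignores the markup

    return {
        "ignore_markup": mse(target[test], [baseline] * n),
        # Revised-data coefficients applied to first-release data.
        "mismatched": mse(target[test], [a_rev + b_rev * f for f in first[test]]),
        # First-release coefficients applied to first-release data.
        "matched": mse(target[test], [a_fst + b_fst * f for f in first[test]]),
    }


results = simulate()
```

Under these assumptions, the mismatched forecaster does worse than the one who ignores the markup, because the coefficient fitted to clean data amplifies the noise in first releases. The matched forecaster's coefficient is shrunk toward zero to reflect that noise.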

Early estimates of the markup nearly worthless. The consequences of confusing revised with first-release data are especially severe in this case because real-time markup data are poor quality. From 1983 to 2002, first-release markup estimates accounted for only 5 percent of the variation that we see in today's markup data. Since 1990, they've accounted for only 2 percent. Even with the benefit of a year's worth of revisions, markup estimates account for just 21 percent of today's markup variation.

So don't take too seriously claims that labor's share of output is at a record low or arguments that high profit margins are going to restrain inflation—at least not until the data have been through several annual revisions.

Living with Revisions

Caution is essential in interpreting early government reports because many data series are subject to large after-the-fact revisions. When reading government statistical releases, it's best to keep the following in mind:

  • Seasonal and month-to-month revisions generally have little impact on the accuracy of government statistical estimates. It's the less frequent annual, comprehensive and benchmark revisions that really matter.
  • By supplementing the government's formal statistical releases with information from other sources, it's sometimes possible to obtain a more accurate picture of the economy. At the Dallas Fed, we've had success using unemployment insurance tax records to make early updates to Texas jobs data.
  • Revised data showing one variable leading another say next to nothing about whether the first variable is of any practical use in forecasting the second.
  • Forecasting relationships estimated with heavily revised data are unlikely to perform well when applied to first-release data available in real time.

As an example of possible implications for monetary policy, consider inflation targeting. Data revisions potentially affect both how tightly one can realistically expect to control any particular inflation measure and how strongly policy ought to react to early inflation releases. Attempts to target forecasted inflation will benefit if forecasts are as accurate as possible, which requires that heavily revised and early-release data