Taking the Earth’s Temperature

How the Methods Measure Up

How do we know anything about the Earth’s past climate? Discussions about climate change — its extent, its causes, and what to do about it — often hinge on what we know about our planet’s temperature history. Climate scientists and policymakers routinely talk about the Earth’s “global mean temperature” and compare today’s temperature to a record dating back hundreds of thousands of years. But where does that record come from? And what does it even mean for a single figure to represent the temperature of our entire planet, with its regional diversity and dynamic atmosphere? Scientists have devised ingenious techniques to peer into our planet’s past temperature record, but the picture they give us is a blurry one.

If today you decided that you wanted to know how the climate changes at a certain location, say the base of the Statue of Liberty, you could put a thermometer there and record a measurement at noon every day — or at the beginning of every hour or second, if you want finer resolution. This would ensure that you have a thorough record of fluctuations in temperature at that particular site, from this day forward.

Thanks to scientists (and scientifically minded amateurs), we have such temperature records dating back more than two centuries for some particular sites. But in discussions of global climate change, the figure of interest is not just the temperature at the feet of Lady Liberty, since no single site is wholly representative of the Earth’s complicated climate system, but rather a number representing the Earth’s temperature as a whole. This figure is often cited but rarely explained.

Obtaining a figure that represents any large area, say the temperature of Minnesota, requires measurements from multiple locations. From these separate measurements researchers construct an artificial figure that isn’t meaningful to someone living in a particular place, say Minneapolis, because it smooths over particularities. But that average figure can still be a useful indicator of broad regional trends.

As we increase the number of temperature sensors strewn across the globe, we would intuitively expect a more accurate globally averaged temperature. But how many sensors does it actually take to obtain a global temperature with an acceptably small uncertainty? Statistical studies suggest that a system of 50 to 100 reasonably distributed temperature stations around the globe would be sufficient to reproduce an accurate representation of most localized temperature anomalies and to account for isolated effects such as urban heat islands. Various statistical techniques can then be applied to produce a single figure representing a fairly reliable globally averaged temperature. Much as opinion polls rely on sampling and margins of error, climate scientists make statistical claims about the validity of their global temperature figure, saying that they know with 95 percent certainty that we have enough sensors, and that they are distributed widely enough around the planet, that undetected thermal anomalies would not shift the global average by more than 0.04 degrees Celsius.
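The polling comparison can be made concrete with a minimal sketch. It assumes the station readings are independent and roughly normally distributed, which real networks are not, and it ignores the area weighting and gridding that actual analyses apply. The function name and the eight anomaly values are hypothetical, chosen only for illustration.

```python
import math
import statistics

def mean_with_margin(anomalies, z=1.96):
    """Return (mean, margin), where margin is the half-width of an
    approximate 95 percent interval around the mean.

    Assumes independent, roughly normal readings (a simplification)."""
    n = len(anomalies)
    mean = statistics.fmean(anomalies)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    std_err = statistics.stdev(anomalies) / math.sqrt(n)
    return mean, z * std_err

# Made-up temperature anomalies (degrees Celsius) from eight stations.
station_anomalies = [0.42, 0.55, 0.31, 0.60, 0.47, 0.38, 0.52, 0.44]
mean, margin = mean_with_margin(station_anomalies)
print(f"{mean:.2f} +/- {margin:.2f} degrees C")
```

Under these assumptions the margin shrinks roughly with the square root of the number of stations, which is why adding sensors brings diminishing returns once the network is reasonably well distributed.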

This is the level of accuracy claimed with today’s distribution of sensors and analysis techniques. Such a system of sensors with similar capabilities has been reliably and consistently in place for roughly 150 years, allowing scientists to obtain a value for global temperature based on direct measurements back to the mid-nineteenth century, though with a slightly larger uncertainty of about 0.1 degrees Celsius.

Discussions about global climate change, however, involve claims about the Earth’s temperature going back much further than 150 years. To know anything about global temperature prior to the spread of thermometers, we have to rely on proxy indicators — sources of data that are not direct measurements of temperature but that correlate with temperature changes. There are several proxy techniques used by climate researchers, varying widely in usefulness.

Perhaps the dominant proxy used to understand past climate is tree rings — a practice called dendroclimatology. By measuring the widths and densities of a tree’s rings, scientists can tell roughly how favorable or unfavorable to growth were the conditions of that tree’s environment in past growing seasons. Temperature is one of the important factors determining how well a tree can grow, so in many cases there is a correlation between the width or density of a specific ring and the local temperature during the growing season corresponding to that ring.

Depending on the species, trees of interest to scientists can grow anywhere from hundreds to more than a thousand rings before death. The measurement record can be extended beyond the life of a single tree by correlating overlapping patterns in present trees with wood from dead trees preserved on forest floors or even in old buildings. Using this technique, researchers have been able to develop tree ring records extending beyond 10,000 years in some regions.
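The overlap matching described here can be sketched in a few lines. This is a simplified illustration, not the procedure dendrochronologists actually use, which involves detrending, many samples per site, and more careful statistics. The function names and all ring widths below are hypothetical.

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs)
           * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den if den else 0.0

def best_overlap(old_sample, living_record, min_overlap=5):
    """Compare the last k rings of old_sample with the first k rings of
    living_record for each candidate k; return (k, correlation) for the
    best-matching overlap."""
    best = (0, -1.0)
    max_len = min(len(old_sample), len(living_record))
    for k in range(min_overlap, max_len + 1):
        r = pearson(old_sample[-k:], living_record[:k])
        if r > best[1]:
            best = (k, r)
    return best

# Made-up ring widths in millimeters, oldest ring first.
living_tree = [1.2, 0.8, 1.5, 0.6, 1.1, 1.4, 0.9, 1.3, 0.7, 1.0]
dead_wood = [1.0, 1.3, 0.9, 0.7, 1.21, 0.79, 1.52, 0.61, 1.08, 1.41]

overlap, r = best_overlap(dead_wood, living_tree)
print(f"best overlap: {overlap} rings (correlation {r:.3f})")
```

Once the overlap is fixed, the rings of the dead wood that precede it extend the record further back than the living tree alone.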

But just because we know how well trees have grown in a certain region for the last several millennia does not mean that we necessarily know what the temperature in that region has been — chiefly because many other factors influence a tree’s growth, including sunlight, wind, and precipitation, as well as factors such as local competition for nutrients and the presence or absence of pests. There is no way of knowing with any certainty which of these factors caused observed variations in the appearance of tree rings, so measuring the width or density of those rings without further knowledge of the context does not allow for any real conclusions to be made about past temperature.

Sorting out these confounding variables is one of the chief challenges in the field. Researchers attempt to mitigate the effects of variables other than temperature by looking at sets of trees where, for various reasons, these other factors are expected to have been stable. Sets of trees for which fluctuations in a single variable are thought to dominate the response of ring growth are called limiting stands. In the case of temperature, the limiting stands are generally thought to be near the tree line, the point where a tree is at the limits of its ability to survive, whether due to latitude or elevation.

Dendroclimatologists are further limited to studying trees in certain regions. Trees in the tropics, for instance, generally don’t have a consistently discernible ring structure; the tree rings we take for granted in temperate zones are a result of the cycle of seasonal growth and dormancy. The majority of tree data, then, come from mid- or high-latitude zones of the northern hemisphere.

And even in those zones of interest, tree-ring data can be unreliable. While trees normally grow one ring per year, under severe circumstances, such as a sudden drought or volcanic eruption, tree growth can shut down for a time and then start anew — resulting in multiple rings in a single growing season. The data are further degraded by the fact that it can take a tree years to recover from stresses such as pest infestations or floods, resulting in decreased growth long after the tree’s immediate environmental problems have ceased. As a result of these uncertainties, dendroclimatology has a limited resolution — its data are useful at the level of five or ten years, not at the annual level. The National Academy of Sciences, in its 2006 report on temperature reconstructions of the past two millennia, states that dendroclimatological claims are limited to decadal rather than annual resolution.

One glaring difficulty with the study of tree rings has emerged fairly recently: a discrepancy between what we know about very recent temperature history (thanks to thermometer measurements) and what researchers think tree rings are showing. The growth of many trees appears to respond less to rising temperatures today than it is thought to have done until a few decades ago. This phenomenon, which seems to be more pronounced with certain species of trees, is not well understood, although some scientists have suggested that it could be related to the thinning of the ozone layer or some sort of global dimming. At any rate, because the tree-ring data of the last few decades do not comport with the thermometer measurements, they are intentionally left out of most calibrations (though some recent studies have accounted for the divergence with varying success). If the more recent tree-ring data were included in standard calibrations and reconstructions, past temperatures would look warmer than scientists believe they were. The temperature spike witnessed in the late twentieth century, the source of much worry about global warming, would have a significantly smaller magnitude if temperature were computed solely in this manner.

Of course, there are several sources of proxy data other than tree rings used to reconstruct the Earth’s past temperature, such as samples of ice taken from glaciers, which give scientists data reaching much further back in time than the tree rings. Glaciers accumulate when previous snowfalls are crushed into ice by the weight of more recent snowfalls above. Seasonal cycles of temperature and precipitation lead to discernible annual striations. Researchers can drill down from the surface of the glacier to obtain a core sample, a long record of these bands.

The precipitation that makes up each year’s ice layer is composed of water molecules that hold an important clue to our planetary temperature history. Common compounds found on Earth, such as water, contain different isotopes of oxygen, and the ratio of those isotopes, particularly of the rare oxygen-18 to the abundant oxygen-16, changes with temperature in a predictable fashion. By studying the water molecules in each ice layer, scientists can determine the isotope ratio and thus the temperature at the time that layer was formed.
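Isotope ratios of this kind are conventionally reported in "delta" notation: the relative deviation of a sample's oxygen-18 to oxygen-16 ratio from a reference water, expressed in parts per thousand. The sketch below shows that arithmetic. The reference ratio is the commonly cited value for Vienna Standard Mean Ocean Water; the sample ratios and the calibration slope are made up for illustration, since the real relationship between the delta value and temperature has to be calibrated for each site.

```python
# 18O/16O ratio of the VSMOW reference water (commonly cited value).
VSMOW_RATIO = 0.0020052

def delta_18o(sample_ratio, standard_ratio=VSMOW_RATIO):
    """Return delta-18O in per mil: the relative deviation of the
    sample's 18O/16O ratio from the standard, times 1000."""
    return (sample_ratio / standard_ratio - 1.0) * 1000.0

def temperature_shift(delta_a, delta_b, permil_per_degree):
    """Convert the difference between two delta-18O values into a
    temperature difference, given a calibration slope expressed in
    per mil per degree Celsius (site-specific in practice)."""
    return (delta_b - delta_a) / permil_per_degree

# Hypothetical ice layers: these ratios are invented for illustration.
older_layer = delta_18o(0.0019350)
newer_layer = delta_18o(0.0019390)

# Placeholder slope, NOT a real calibration for any site.
ILLUSTRATIVE_SLOPE = 0.7

shift = temperature_shift(older_layer, newer_layer, ILLUSTRATIVE_SLOPE)
print(f"older layer: {older_layer:.2f} per mil")
print(f"newer layer: {newer_layer:.2f} per mil")
print(f"implied shift: {shift:.2f} degrees C (illustrative only)")
```

Polar ice is depleted in oxygen-18 relative to ocean water, so its delta values are negative; a less negative value in a given layer is read as a warmer period.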

Ice cores don’t just reveal past temperatures; they can also help scientists reconstruct the past concentration of greenhouse gases in the atmosphere. As the snow layers near the top of a glacier are being compressed into ice, they trap tiny bubbles of air. Researchers can measure the concentrations of gases to get a sense of what the Earth’s atmosphere was like at the time those bubbles were trapped. This technique has been especially important in tracking the past correlation between temperature and the atmospheric concentration of carbon dioxide.

A small number of ice samples have come from remote mountain glaciers around the globe, including Mount Kilimanjaro and locations throughout South America and Africa. But for the most part, ice-core researchers have focused their attention on Greenland and Antarctica, both because they are easier to access than mountain glaciers and because of their significant thickness (and hence their significant age). Two of the most impressive cores, the Vostok and EPICA cores, both from Antarctica and more than 3,000 meters long, reveal data going back more than 400,000 and 700,000 years, respectively.

Even these impressive ice cores, however, suffer from a problem inherent to the medium — namely, the measurements become much more uncertain the deeper you look. Near the top of a glacier it is easy to pick out each year’s discrete layer of ice. Deeper down, however, the striations become muddled because of high pressure and glacial flow. This makes dating the layers more difficult, causing the resolution to drop from approximately one year near the top to about 5,000 years beneath a depth of a few hundred meters.

Additional errors arise when studying those trapped air bubbles to track changes in past atmospheric concentrations. It takes a very long time for the layers of snow at the top to be crushed into an impermeable substance; in the meantime, gases freely move between these loosely bound layers and into and out of the atmosphere itself. This results in a smearing effect by which the gases ultimately trapped within an ice layer may differ by several thousand years from when the ice layer began to form. And diffusion rates differ for different gases — lighter molecules move more quickly than heavier molecules at a given temperature — further confounding efforts to reconstruct past atmospheres.

These uncertainties, though significant, do not in any way invalidate the importance of data from ice cores; they merely limit the extent of the claims that can be made. Ice core samples have played, and will doubtless continue to play, an invaluable role in temperature reconstructions of the last several hundred thousand years as well as in our understanding of past correlations between temperature and the atmospheric concentration of carbon dioxide. But unless our techniques and our understanding of the sources of error dramatically improve, this approach will continue to be limited by an uncertainty of several thousand years.

Other temperature proxies are, like tree rings and ice cores, useful but impaired by significant sources of error. If these proxies are used in conjunction, however, some of these shortcomings can be overcome. For example, studies of temperature fluctuations based on coral serve as an important complement to the calculations made from tree rings. While the tree ring records are mostly limited to high latitudes, the corals are mostly limited to equatorial regions. The two techniques can be used together to obtain a more detailed and accurate picture of global climate changes.

Temperature information from coral data is inferred in a way similar to the technique used with ice cores: relying on the ratio of oxygen-18 to oxygen-16. This can provide sea-surface temperatures to within about 0.3 degrees Celsius over its data range, although that range is limited to a few centuries. Also, the ratio of oxygen isotopes is affected not just by temperature but also by salinity, causing additional uncertainty. Given these uncertainties, coral data are mostly useful as a confirmatory tool.

Researchers have also found it surprisingly useful to employ historical and cultural records to place bounds on past temperatures in various regions. These can include detailed records of produce, newspaper articles about local events (such as the famous account of the annual frost festival held atop the frozen Thames in London), and even landscape paintings showing the extent of glacial advancement. While all of these sources are helpful in placing bounds on possible temperature values, they are all highly imprecise and available only for the last few centuries.

Perhaps the oddest technique used by scientists to determine the Earth’s past temperature is that of “thermal boreholes.” Essentially, a thermometer is placed into a narrow hole in the ground to measure temperature as a function of depth. The resulting signature can be used to reconstruct estimates of the surface temperatures of the past at a resolution of multiple decades.

In its report on temperature reconstructions, the National Academy of Sciences explained this technique by comparing it to a metal spoon placed in a cup of hot tea. A spoon has high thermal conductivity, so heat from the tea would quickly travel its length, heating it from end to end. But one can imagine an object that conducts heat much more slowly, one for which it could take an hour for the heat to move from the end submerged in the tea to the tip of the handle. Continuing with this analogy, the temperature at the surface of the Earth is like the temperature of the tea and the slowly conducting spoon is like the Earth.

But the temperature at the surface of the Earth is not constant. This is akin to changing the temperature of the tea over time. When our specially crafted spoon is initially placed in this hot tea, the submerged end will heat up first and the heat will begin to slowly travel the length of the spoon. But if we then cool the tea, the submerged end will take on this new temperature, which will then follow the earlier heat signal down the length of the spoon. The act of measuring the temperature at various points along the spoon is like that of measuring the temperature at various depths in the borehole.

However, these adjacent hot and cold regions will mix, giving scientists only an extremely low-resolution glimpse of past surface temperatures at a few select sites for, at best, the past few hundred years. Boreholes can place very broad bounds on recent local temperatures, but other techniques are much more useful.
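The spoon analogy can be imitated numerically with a minimal sketch of one-dimensional heat conduction, using a simple explicit finite-difference scheme. It runs the problem forward (surface history in, depth profile out), whereas actual borehole studies work backward from the measured profile, which is much harder. The function name, the rock diffusivity of about 31.5 square meters per year (roughly one millionth of a square meter per second), the 500-meter depth, and the surface history are all assumptions chosen for illustration.

```python
def simulate_borehole(surface_history, depth_m=500.0, dx=10.0,
                      diffusivity=31.5, dt=1.0):
    """Explicit finite-difference model of heat conducting downward.

    surface_history: surface temperature anomaly (deg C), one per year.
    diffusivity: m^2 per year (assumed typical rock value).
    Returns the anomaly at each depth node (every dx meters) at the end."""
    r = diffusivity * dt / dx ** 2
    if r > 0.5:
        raise ValueError("time step too large for a stable explicit scheme")
    n = int(depth_m / dx) + 1
    temps = [0.0] * n  # start with no anomaly anywhere
    for surface in surface_history:
        new = temps[:]
        new[0] = surface  # the "tea": surface sets the top boundary
        for i in range(1, n - 1):
            # Each node relaxes toward the average of its neighbors.
            new[i] = temps[i] + r * (temps[i + 1] - 2 * temps[i] + temps[i - 1])
        new[-1] = 0.0  # deep boundary held at zero anomaly
        temps = new
    return temps

# Hypothetical history: one warm century (+1 C), then a century at baseline.
history = [1.0] * 100 + [0.0] * 100
profile = simulate_borehole(history)

peak = max(profile)
peak_depth = profile.index(peak) * 10.0
print(f"surface anomaly now: {profile[0]:.2f} C")
print(f"largest remaining anomaly: {peak:.2f} C at about {peak_depth:.0f} m")
```

In this toy run the surface has returned to baseline, but a weakened warm bulge remains at depth. That bulge is always smaller and broader than the original warm spell, which is the mixing that limits boreholes to such coarse resolution.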

In the end, it is clear that each of these proxies provides useful data — data that suggest that the warming observed in the twentieth century is unusual. But there are important caveats to keep in mind. First, inherent in each proxy technique are sources of uncertainty that limit its usefulness — and this uncertainty becomes more pronounced the further back in time we attempt to peer. Keeping that uncertainty in mind, responsible scientists must be guarded in making claims about the Earth’s past temperature, especially knowing that claims about our planet’s temperature history are connected to policy proposals under discussion.

Moreover, even if proxy techniques provided temperature information with no uncertainty, we would still have an insufficient number of geographically dispersed sources to make claims about past globally averaged temperatures with anything approaching the confidence we have in today’s sensors. It takes dozens of carefully monitored thermometers distributed throughout the world to have an accurate figure for our globally averaged temperature; proxy sources can only provide a comparable resolution for about the past four centuries. Before around A.D. 1600, however, the errors compound so that any calculated measurement becomes suspect. Claims that 1998 was the hottest year in “at least a millennium,” as made in a paper in Geophysical Research Letters by climate researcher Michael E. Mann, or that “the world is now warmer than it’s been for 2,000 years,” as Philip Jones of the University of East Anglia claimed in an interview with BBC News Online, exceed the resolution of the data and are, at best, imprudent.

As both the National Academy of Sciences and the U.N.’s Intergovernmental Panel on Climate Change have stated, the proxy techniques discussed here are sufficient to show with high confidence that there has been warming in the last century that is anomalous relative to what would have been expected based upon the natural variations of the geologically recent past — and human greenhouse-gas emissions are at least partly to blame. That said, the uncertainties of these techniques make them grossly insufficient to provide the basis for some of the more extreme claims that have been made. We have reason to be skeptical of both those who design elaborate hypotheses to explain away global warming and those who would have us panic.