Uncertainty in paleorecords built from foraminifera

PSU Solver applied to a paired record of planktic foraminiferal measurements in the Pigmy Basin.

We have a new paper out in Paleoceanography describing a computational toolkit that I built during my Ph.D. to characterize uncertainty in foraminiferal reconstructions of temperature and salinity. The toolkit, called Paleo Seawater Uncertainty Solver (or PSU Solver), constrains analytical, sampling, calibration, dating, and preservation-based uncertainty in reconstructions that use paired measurements of stable isotopes of oxygen (δ18O; apologies that I cannot make letters superscript in this interface) and magnesium-to-calcium (Mg/Ca) ratios in foraminifera (or forams: a type of plankton that deposits calcium carbonate shells). In the paper, we show how significant this characterization of uncertainty can be when inferring past climate processes from forams.

δ18O has been measured in foram calcite for over half a century to understand past climatic processes. In a closed system, changes in foram-δ18O ought to be governed by thermodynamics. However, in the open ocean, the foram-δ18O composition reflects not just the temperature of the seawater, but also the δ18O content of the seawater in which the foram grew its calcite shell (δ18Osw; where sw is in subscript and stands for 'seawater'). This parameter, δ18Osw, which may vary independently of temperature, can be used as a proxy for salinity changes and, on longer timescales, as a proxy for ice volume (more info here and here). The problem is, if you only measure the δ18O of forams, how do you know which changes are due to temperature and which are due to the δ18Osw of the ocean? In other words, the foram-δ18O signal is a convolution of temperature and salinity/ice-volume signals.

This is where the Mg/Ca paleothermometer comes into the picture: over the last two decades, researchers have shown that Mg/Ca ratios in forams can give quantitative insights into past seawater temperature changes, i.e. foram-based Mg/Ca varies as a function of temperature! This appears to solve the deconvolution problem: if one measures both Mg/Ca and δ18O on co-deposited forams, then one can solve for the δ18Osw and temperature of the seawater where the forams lived. Mathematically,

  • δ18O = g(T, δ18Osw)
  • Mg/Ca = f(T)

So, with both of these measurements on the foram shells, one could tease out the δ18Osw and temperature of the seawater to infer past climate change. This technique has been used successfully to understand a variety of significant climate processes (we list many of them in our paper).
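To make the deconvolution concrete, here is a minimal sketch in Python. It assumes an exponential form for f and a linear form for g; the coefficients (0.38, 0.09, 16.5, 4.8 and the 0.27‰ scale offset) are illustrative values of the kind found in published calibrations, chosen by me for this example, and are not necessarily the ones used in the paper or in PSU Solver. The function names are hypothetical.

```python
import math

# Illustrative calibration constants (assumptions for this sketch only).
MGCA_B = 0.38          # pre-exponential constant, mmol/mol
MGCA_A = 0.09          # exponential constant, per deg C
VSMOW_TO_VPDB = 0.27   # per mil offset between the two reference scales


def temperature_from_mgca(mgca):
    """Invert Mg/Ca = B * exp(A * T) for temperature in deg C."""
    return math.log(mgca / MGCA_B) / MGCA_A


def d18osw_from_pair(d18o_calcite, temperature):
    """Invert T = 16.5 - 4.8 * (d18Oc - (d18Osw - 0.27)) for d18Osw (per mil)."""
    return d18o_calcite + (temperature - 16.5) / 4.8 + VSMOW_TO_VPDB


def solve_pair(mgca, d18o_calcite):
    """Step 1: Mg/Ca gives T. Step 2: T and foram d18O give d18Osw."""
    temperature = temperature_from_mgca(mgca)
    return temperature, d18osw_from_pair(d18o_calcite, temperature)


if __name__ == "__main__":
    # Made-up measurements: Mg/Ca in mmol/mol, d18O in per mil.
    t, sw = solve_pair(mgca=3.6, d18o_calcite=-1.5)
    print(f"T = {t:.2f} deg C, d18Osw = {sw:.2f} per mil")
```

Note that the temperature estimate feeds directly into the δ18Osw estimate, so any error in Mg/Ca-derived temperature is carried into δ18Osw as well.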

However, as more nuanced calibrations (i.e. as the functions f and g above keep getting updated) and inferences are made from these paired measurements, and as climate model-data comparisons increasingly demand more quantitative paleorecords, there is a strong need to quantify the uncertainty in the above deconvolution of temperature and δ18Osw. Transferring analytical (/measurement/instrumental) errors, sampling uncertainty due to the number of foraminifera used in the measurement, calibration uncertainties, preservation effects, and age-model uncertainties into temperature and δ18Osw space is not straightforward for a variety of reasons, one of which is that the Mg/Ca-T relationship is non-linear in forams. Theoretical error propagation exercises predict uncertainties far too large to interpret even glacial-interglacial signals, and they cannot handle complex issues such as the influence of salinity on Mg/Ca variability and the non-stationarity of the relationship between salinity and δ18Osw.
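One common computational route for pushing measurement errors through a non-linear calibration is Monte Carlo resampling. The sketch below is a generic illustration of that idea and not a description of PSU Solver's actual algorithm. It reuses the same illustrative calibration coefficients as assumptions, treats only analytical error (no sampling, calibration, preservation, or age-model terms), and all function names and input values are made up.

```python
import math
import random
import statistics


def monte_carlo_pair(mgca, d18oc, mgca_sd, d18oc_sd, n_draws=5000, seed=0):
    """Propagate Gaussian analytical errors into T and d18Osw by resampling.

    Assumes illustrative calibrations: Mg/Ca = 0.38 * exp(0.09 * T) and
    T = 16.5 - 4.8 * (d18Oc - (d18Osw - 0.27)). Assumes mgca_sd is small
    enough relative to mgca that perturbed Mg/Ca values stay positive.
    """
    rng = random.Random(seed)
    temps, d18osw = [], []
    for _ in range(n_draws):
        mgca_i = rng.gauss(mgca, mgca_sd)      # perturbed Mg/Ca
        d18oc_i = rng.gauss(d18oc, d18oc_sd)   # perturbed foram d18O
        t_i = math.log(mgca_i / 0.38) / 0.09   # non-linear step
        temps.append(t_i)
        d18osw.append(d18oc_i + (t_i - 16.5) / 4.8 + 0.27)
    return temps, d18osw


def summarize(values):
    """Return the median and an approximate 95% interval from the draws."""
    ordered = sorted(values)
    lower = ordered[int(0.025 * len(ordered))]
    upper = ordered[int(0.975 * len(ordered))]
    return statistics.median(ordered), lower, upper


if __name__ == "__main__":
    temps, sw = monte_carlo_pair(3.6, -1.5, mgca_sd=0.1, d18oc_sd=0.08)
    print("T:", summarize(temps))
    print("d18Osw:", summarize(sw))
```

Because each draw carries its temperature error into δ18Osw, the spread in δ18Osw ends up larger than the δ18O analytical error alone, which is the kind of coupling a simple per-variable error bar misses.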

In any case, all these things call for a computational approach to address quantitative uncertainty in foram paleoclimate. This is what PSU Solver does. The code is written in MATLAB and is available here, on MathWorks, or on GitHub. It can produce uncertainty profiles for temperature, δ18Osw, and salinity paleorecords from paired foraminiferal δ18O and Mg/Ca datasets. It can be as customized or as simple as the user wants it to be, with the user dictating all the options they require. The user input can be as simple as time, δ18O, and Mg/Ca, if so desired.

Go ahead, download it, and play around with the code (here are the help files)! It can easily be coupled to more sophisticated algorithms or different sampling schemes. Furthermore, there's an option to couple output from BACON to incorporate radiocarbon-based uncertainty in PSU Solver. I will have a post up shortly that explains how PSU Solver works in detail. Until then, here's the type of plot that you could be producing right now (paleorecord from the Pigmy Basin, Gulf of Mexico). Go try it out! We show that particular excursions in your reconstructions may (or may not) be sensitive to easily changeable options in the PSU Solver algorithm, and hence be an artifact of transposition issues. Do let me know if you have any questions/doubts/criticisms!

P.S. If you are not a super expert in coding (which I am not either), PSU Solver has been written keeping us in mind: it is not meant to intimidate! Try it out...! Thanks to Julie Richey for being patient with the algorithm and also to Chris Maupin for testing PSU Solver.

Sticky Statistics: Getting Started with Stats in the Lab

Courtesy: xkcd

A strong grasp of statistics is an important tool that any analytical laboratory worker should possess. I think it is immensely important to understand the limitations of the process by which any data is measured, and the associated precision and accuracy of the instruments used to measure said data. Apart from analytical constraints, the samples from which data are measured aren't perfect indicators of the true population (true values) and hence, sampling uncertainty must be carefully dealt with as well (e.g. sampling bias).

In most cases, both analytical (or measurement) uncertainty and sampling uncertainty are equally important in influencing the outcome of a hypothesis test. In certain cases, analytical uncertainty may be more pivotal than sampling uncertainty, whereas in others, sampling uncertainty may prove to be more influential to the outcome while testing a hypothesis. Regardless, in all these cases, both analytical and sampling uncertainty must be accounted for when testing (and conceiving) a hypothesis.

Consider a paleoclimate example where we measure stable oxygen isotopes in planktic foraminiferal shells with a mass spectrometer whose precision is 0.08‰ (that's 0.08 parts per 1000), based on known standards. With foraminifera, we take a certain number of shells (say, n) from a discrete depth in a marine sediment core and obtain a single δ18O number for that particular depth interval. This depth interval represents Y years, where Y can represent decades to millennia depending on the sedimentation rate at the site where the core was collected. The lifespan of foraminifera is about a month (Spero, 1998). Therefore the measurement represents the mean of n months in Y years. It does not give you the mean of the continuous δ18O during that time interval (true value). Naturally, as n increases and/or Y decreases, the sampling uncertainty decreases. There may be several additional sampling complications such as the productivity and habitat of the analyzed species' shells that may bias the data to say, summer months (as opposed to a mean annual measurement), or deeper water δ18O (as opposed to sea-surface water) etc. Hence, both foraminiferal sampling uncertainty (first introduced by Schiffelbein and Hills, 1984) along with the analytical uncertainty must be considered while testing a hypothesis (e.g. "mean annual δ18O signal remains constant from age A to age D" - the signal-to-noise ratio invoked by your hypothesis will determine which uncertainty plays a bigger role).
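The effect of n on sampling uncertainty can be explored with a small simulation. The sketch below builds a synthetic monthly δ18O series (a made-up seasonal cycle plus noise; every value is an assumption for illustration), repeatedly picks n months at random to mimic n shells, averages them, and adds the 0.08‰ analytical error from the example above. It ignores seasonal and depth-habitat biases, and the function names are hypothetical.

```python
import math
import random
import statistics


def synthetic_monthly_d18o(years, seasonal_amplitude=0.5, noise_sd=0.1, seed=0):
    """Made-up monthly d18O series: constant mean, seasonal cycle, noise."""
    rng = random.Random(seed)
    series = []
    for month in range(12 * years):
        seasonal = seasonal_amplitude * math.sin(2 * math.pi * month / 12)
        series.append(-1.0 + seasonal + rng.gauss(0, noise_sd))
    return series


def sampling_spread(series, n_shells, analytical_sd=0.08, trials=2000, seed=1):
    """Spread (1 sd) of repeated n-shell measurements of the same interval.

    Each trial picks n_shells random months (one month per shell), averages
    them, and adds one draw of analytical error to the pooled measurement.
    """
    rng = random.Random(seed)
    measured = []
    for _ in range(trials):
        picked = [rng.choice(series) for _ in range(n_shells)]
        measured.append(statistics.mean(picked) + rng.gauss(0, analytical_sd))
    return statistics.stdev(measured)


if __name__ == "__main__":
    series = synthetic_monthly_d18o(years=100)   # Y = 100 years
    for n in (5, 20, 60):
        print(n, "shells:", round(sampling_spread(series, n), 3), "per mil")
```

With few shells the spread is dominated by which months happened to be sampled; with many shells it approaches the analytical precision, which is one way to see which of the two uncertainties matters more for a given hypothesis.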

Here are two recent papers that are great starting points for working with experimental statistics in the laboratory (shoot me an email if you want pdf copies):

  1. Know when your numbers are significant - David Vaux
  2. Importance of being uncertain - Martin Krzywinski and Naomi Altman

Both first authors have backgrounds in biology, a field in which, I am led to believe, heinous statistical crimes are committed on a weekly (journal) basis. Nonetheless, statistical crimes tend to occur in paleoclimatology and the geosciences too (and in a myriad of other fields, I'm sure). The first paper urges experimentalists to use error bars on independent data only:

Simply put, statistics and error bars should be used only for independent data, and not for identical replicates within a single experiment.

What does this mean? Arvind Singh, a friend and co-author at GEOMAR (whom I have to thank for bringing these papers to my attention), and I had an interesting discussion that I think highlights what Vaux is talking about:

Arvind: On the basis of Vaux's article, error bars should be the standard deviation of 'independent' replicates. However, it is difficult (and almost impossible) to do this for my work, e.g., I take 3 replicates from the same Niskin bottle for measuring chlorophyll but then they would be dependent replicates so I cannot have error bars based on those samples. And as per Vaux's statistics, it appears to me that I should've taken replicates from different depths or from different locations, but then those error bars would be based on the variation in chlorophyll due to light, nutrients etc., which is not what I want. So tell me how would I take true replicates of independent samples in such a situation. I've discussed this with a few colleagues of mine who do similar experiments and they also have no clue on this.

Me: I think when Vaux says "Simply put, statistics and error bars should be used only for independent data, and not for identical replicates within a single experiment." - he is largely talking about the experimental, hypothesis-driven, laboratory-based bio. community, where errors such as analytical error may or may not be significant in altering the outcome of the result. In the geo/geobio community at least, we have to quantify how well we think we can measure parameters, especially field-based measurements, whose errors can easily alter the outcome of an experiment. In your case, first, what is the hypothesis you are trying to put forth with the chlorophyll and water samples? Are you simply trying to see how well you can measure it at a certain depth/location such that an error bar may be obtained, which will subsequently be used to test certain hypotheses? If so, I think you are OK in measuring the replicates and obtaining a std. dev. However, even here, what Vaux says applies to your case, because a 'truly independent' measurement would be a chlorophyll measurement on a water sample from another Niskin bottle from the same depth and location. This way, you are removing codependent measurement error/bias which could potentially arise due to sampling from the same bottle. So, in my opinion, putting an error bar to constrain the chlorophyll mean from a particular depth/location can be done using x measurements of water samples from each of n Niskin bottles, where x can be = 1.

While Vaux's article focuses on analytical uncertainty, the second paper details the importance of sampling uncertainty and the central limit theorem. The Krzywinski and Altman article introduced me to the Monty Hall game show problem, which highlights that statistics can be deceptive at first glance!

Always keep in mind that your measurements are estimates, which you should not endow with “an aura of exactitude and finality”. The omnipresence of variability will ensure that each sample will be different.
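Coming back to the Monty Hall problem: if the answer feels wrong, a quick simulation is a good way to convince yourself. This is a minimal sketch of the standard three-door version of the game (the function name and trial count are my own choices), comparing the win rate of always switching with always staying.

```python
import random


def monty_hall_win_rate(switch, trials=20000, seed=0):
    """Estimate the win rate for a contestant who always switches or stays."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)    # door hiding the prize
        choice = rng.randrange(3)   # contestant's first pick
        # Host opens a door that is neither the pick nor the prize. When two
        # doors qualify, taking the first one does not change the win rate
        # for these two fixed strategies.
        opened = next(d for d in range(3) if d != choice and d != prize)
        if switch:
            # Move to the only door that is neither picked nor opened.
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += choice == prize
    return wins / trials


if __name__ == "__main__":
    print("always switch:", monty_hall_win_rate(switch=True))
    print("always stay:  ", monty_hall_win_rate(switch=False))
```

Switching should win about two times in three and staying about one in three, which is not what most people guess at first.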

In closing, another paper that I would highly recommend for beginners is David Streiner's 1996 paper, Maintaining Standards: Differences between the Standard Deviation and Standard Error, and When to Use Each, which has certainly proven handy many times for me!
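As a quick reminder of the distinction in that title: the standard deviation describes the scatter of individual measurements, while the standard error describes how well the mean is constrained and shrinks with the square root of the number of measurements. Here is a minimal sketch with made-up replicate values (the numbers and the function name are purely illustrative):

```python
import math
import statistics


def sd_and_se(values):
    """Return (sample standard deviation, standard error of the mean)."""
    sd = statistics.stdev(values)       # uses the n - 1 denominator
    se = sd / math.sqrt(len(values))    # uncertainty of the mean itself
    return sd, se


if __name__ == "__main__":
    # Made-up replicate d18O measurements, in per mil.
    replicates = [-1.52, -1.47, -1.60, -1.49, -1.55]
    sd, se = sd_and_se(replicates)
    print(f"mean = {statistics.mean(replicates):.3f}")
    print(f"sd = {sd:.3f}, se = {se:.3f}")
```

Keep Vaux's point in mind here: the standard error is only meaningful as an uncertainty on the mean if the values are independent measurements.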