The solar wind is constantly crashing into Earth, alternating between times of turbulent cacophony and moments of serenity.
On a finer scale, we see variations in its density, bulk speed, and intrinsic magnetic field---sometimes in sync, and sometimes not so much. It is clear that certain features of the solar wind are correlated with various activities in Earth's ionosphere and magnetosphere, and as a scientist studying this stuff, I spend my day-to-day figuring out how to transform such correlative knowledge into a story of causation...
Yesterday I forayed into the realm of hydrology---a field where this question has also been taken quite seriously. Although a stranger to hydrological parlance, I found the mathematical landscape (stats, time series, modeling) and philosophical quandaries ("This high correlation cannot be for naught, can it?!") were familiar.
I came across many fascinating papers (see "Further Reading" below)---and while I skimmed them all, I only have time today to ponder about and quote from one of them. The paper I spent the most time with was James W. Kirchner's 2006 paper, "Getting the right answers for the right reasons: Linking measurements, analyses, and models to advance the science of hydrology."
The accuracy of a broken clock illustrates at least part of the article's sentiment: though broken and frozen at 4:40, it accurately tells the time twice a day --- and approximately does so for a couple minutes just before and after ("good enough for most applications!").
Lessons Learned: If the modeler is careful not to use the clock at times departing too far from 4:40, then this broken clock is a great model for the time of day!
The point is, oftentimes misplaced faith is put into mathematical models describing some physical phenomenon. It might be that the model is only good for one type of condition ("late afternoon"), yet it is used to say things about other types of conditions ("early morning").
Sorry, I'm being purposely vague here... A concrete example might be a model that assumes stationarity for a geophysical time series: by its construction, it cannot---without modification---describe or predict the nonstationary features of the actual recorded time series.
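To make that pitfall concrete (a toy of my own, not from the paper): take the simplest possible stationary "model"---a constant mean---calibrate it on the first half of a trending series, and then ask it about the second half. By construction, it cannot represent the trend.

```python
# Toy illustration (not from Kirchner's paper): a stationary "model"
# (a constant mean) applied to a nonstationary (trending) series.

series = [0.1 * t for t in range(100)]  # a plain linear trend

# Calibrate the stationary model on the first half of the record...
calibration = series[:50]
model_mean = sum(calibration) / len(calibration)

# ...then use it to "predict" the second half.
errors = [abs(value - model_mean) for value in series[50:]]

# The error grows steadily as the trend carries the series away from
# the calibration-period mean: the model is fine "near 4:40" and
# progressively worse everywhere else.
print(errors[0], errors[-1])
```

The model is not wrong about the calibration period; it is wrong about everything the calibration period did not contain.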
Over-parameterization of a model was a hot topic in the paper. The fundamental pitfall of over-parameterizing is that too many extraneous degrees of freedom allow one to fit a model to almost any data set, even if the model is wrong or the data are bad. This lets one describe a phenomenon ("get the right answers") using a model that does not represent reality ("for the wrong reasons").
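Here is a toy version of that trap (my example, not the paper's): fit the same noisy linear data with an honest two-parameter line and with a polynomial that interpolates every data point---a stand-in for an over-parameterized model. The flexible model "gets the right answers" perfectly at the calibration points, which tells you nothing about whether it got them for the right reasons.

```python
# Toy illustration (not from the paper): an honest 2-parameter model
# vs. an over-parameterized one fit to the same noisy linear data.
import random

random.seed(0)

xs = [i / 9 for i in range(10)]
ys = [2.0 * x + random.gauss(0, 0.1) for x in xs]  # linear process + noise

def lagrange(xs, ys, x):
    """Interpolating polynomial through every point: a 10-parameter
    'model' that fits the calibration data exactly, noise and all."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def linear_fit(xs, ys):
    """Honest 2-parameter least-squares line: returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

a, b = linear_fit(xs, ys)

# The flexible model "gets the right answers" at every calibration point...
train_err_flexible = max(abs(lagrange(xs, ys, x) - y) for x, y in zip(xs, ys))
train_err_simple = max(abs(a + b * x - y) for x, y in zip(xs, ys))

# ...but ask just outside the calibration range and the interpolant,
# having faithfully fit the noise, tends to swing wildly, while the
# simple model stays near the true value of 2.0 * 1.2 = 2.4.
x_new = 1.2
pred_flexible = lagrange(xs, ys, x_new)
pred_simple = a + b * x_new
```

Zero calibration error from the ten-parameter model is exactly the "right answers for the wrong reasons" symptom: its flexibility, not its structure, did the work.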
"Whatever," says the token engineer, "if it works, it's good enough for me." This view has some merit, provided the model can consistently explain independent data sets or make accurate predictions. The token scientist, however, is not so satisfied: "But why these parameters?" the token scientist asks. "And why is the parameter set of this model not unique? Certainly there is one true storyline going on in reality!"
Kirchner amusingly remarks that being able to tune parameters in this way "violates a basic principle of mathematical modeling, namely that the constants should stay constant while the variables vary."
He argues that over-parameterization is not only a cheap way to the right answers (or worse, the "right" answers), but fundamentally hides from us the true structural and geometric information that is inherent to the phenomenon under study. "Whereas the problem of parameter identification has been emphasized in the hydrologic literature, the more fundamental (and difficult) problem of structural identification has received less attention than it deserves. Likewise, whereas many hydrologists recognize that over-parameterization makes parameter identification problematic, it is less clearly understood that over-parameterization also makes structural identification difficult. Parameter tuning makes models more flexible, and thus makes their behavior less dependent on their structure. This in turn makes validation exercises less effective for diagnosing models' structural problems. By making it easier for models to get the right answer, over-parameterization makes it harder to tell whether they are getting the right answer for the right reason."
Some people might say, "Oh, but what's the harm in a little parameter tuning, Kirch?" to which the Kirch Man says, "Very little parameter tuning is still too much!" He discusses a type of hydrological study in which the data sets contain only enough information to constrain simple models with up to four free parameters, and another study that, despite using detailed data sets, could not constrain a six-parameter model. "Ok," says my reader, "So just use four or five free parameters. Gah, what's the big deal?!" The big deal, Kirchner says, is that many types of hydrological models come chock-full of free parameters---dozens of them!
From what I gather, hydrologists largely use two classes of models: lumped-parameter and spatially-distributed.
A lumped parameter model is like a circuit diagram: in reality you have a voltage applied across a copper wire made up of jazillions of atoms and used to power a light bulb (another jazillion atoms, heat, photons, and the works!), but in the lumped-parameter model you have a 1-dimensional loop that has a few symbols representing macroscopic, emergent features like resistance and a continuous current.
How do you usefully trivialize the existence of a jazillion atoms? Lumped parameters, baby!
Spatially-distributed models seem to be just the opposite, going for tons of geographic detail and often suffering from a proliferation of free parameters due to spatial disaggregation. ("Dis-aggre-wha?" you ask? Well, when you aggregate, you lump things together, so when you disaggregate, you take them apart --- you want the details, not the averages. Spatial disaggregation is also called downscaling: it is the process of mapping information from a coarse spatial scale to a finer one while maintaining consistency with the original dataset. For the time series analysts out there, temporal disaggregation would basically be a form of upsampling --- that is, going down in scale is going up in sampling rate.)
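A toy sketch of what "maintaining consistency with the original dataset" means (my example, not the paper's): spread each coarse value uniformly across the finer scale, so that re-aggregating the fine series recovers the original exactly.

```python
# Toy temporal disaggregation (my example): spread coarse totals
# uniformly across a finer scale, preserving the coarse-scale sums.

def disaggregate_uniform(coarse, ratio):
    """Map each coarse value to `ratio` fine values that sum back to it."""
    return [value / ratio for value in coarse for _ in range(ratio)]

def aggregate(fine, ratio):
    """The inverse lumping: sum consecutive fine values back to coarse."""
    return [sum(fine[i:i + ratio]) for i in range(0, len(fine), ratio)]

daily_rainfall = [12.0, 0.0, 6.0]  # three coarse (daily) totals
hourly = disaggregate_uniform(daily_rainfall, 24)

# Consistency check: the fine series aggregates back to the original.
assert aggregate(hourly, 24) == daily_rainfall
```

Real downscaling schemes distribute the detail far less naively, of course, but they are held to the same round-trip constraint: the fine-scale story must still add up to the coarse-scale facts.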
Lumped-parameter models can obscure the physics on smaller scales, while spatially-distributed models can obscure the broad-stroke physics at best, and tell a completely wrong story at worst. Kirchner suggests a compromise between the two: quasi-lumped, quasi-distributed. Not so simple as to trivialize away all the detail, but not so detailed as to over-parameterize and hide away the structure. More importantly, his mid-scale model must be falsifiable: it must be possible to prove it wrong with available data sets. In other words, if our data sets can only constrain four parameters, then the model should have only four parameters.
Kirchner says, "All hydrological knowledge ultimately comes from observations, experiments, and measurements... Mathematical tools can at best only clarify (and at worst, obscure or distort) the information those data contain." He therefore argues, "In order to know whether we are getting the right answers for the right reasons, we will need to develop reduced-form models with very few free parameters... The collision of theory and data ... will be more scientifically productive if we can develop models that are parametrically efficient." Furthermore, we need to "develop model testing regimens that compare models against data more incisively."
What types of model testing is he talking about? Well, when I have the time, I'll tell you. For now, know that the hydrologic-y technical argument might go something like: "Split-sample tests bad. Differential split-sample tests good!"
Further Philosophically-Heightened Hydrological Reading
- 2014: Hydrology: A Science for Engineers
- 2003: Seibert: Reliability of Model Predictions Outside Calibration Conditions
- 2002: Beven: Towards a Coherent Philosophy for Modeling the Environment
- 1996: Kirchner et al: Testing and Validating Environmental Models
- 1993: Beven: Prophecy, Reality, and Uncertainty in Distributed Hydrological Modeling
- 1992: Beven and Binley: The Future of Distributed Models: Calibration and Uncertainty Prediction
- 1986: Klemes: Operational Testing of Hydrological Simulation Models
- 1986: Klemes: Dilettantism in Hydrology: Transition or Destiny?