Thursday, September 3, 2015

Wave Polarization

[Originally published 2015-09-03]
Waves of the Orthogonal and Collinear Varieties
In a spatially 3D universe, one encounters two major types of waves: transverse and longitudinal. The waves are categorized into these two groups in response to the question: Does the wave vary parallel or perpendicular to its direction of propagation?

Transverse wave.
"Transverse" means to crossover something---and in this case, a "transverse wave" means that the wave fluctuates across ("transverse to") the wave's direction of propagation: the fluctuations take place in the 2D plane perpendicular the the wave's propagation, and so it could also be called an orthogonally-fluctuating wave.

Longitudinal wave.
"Longitudinal" means to extend or move collinearly with something instead of across it. So a longitudinal wave is a wave that varies parallel and anti-parallel (i.e., collinearly!) to the wave's direction of propagation---just compressions and rarefactions along one dimension. If we call transverse waves orthogonally-fluctuating waves, then longitudinal waves might be called collinearly-fluctuating waves.

Explosions, Fight Scenes, and Crazy Car Chases

"I ain't never goin' back to school!"

That was in 2001. I had just graduated high school... It's now 2015. I'll be getting my Ph.D. in a couple months. Clearly Young Kevin didn't know squat about his future.

"Squat?" my timid reader asks, "Are you allowed to say 'squat' on this blog?"

The answer is, "No, never. Replacement profanity is not allowed." However, since I'll likely never read this again after I hit publish, Future Kevin will never know, and what Future Kevin doesn't know, can't hurt Future Kevin! (That's a lie. I'll probably read it twenty five hundred times or so.)

Wednesday, September 2, 2015

Hydrologists be Gettin’ all Epistemological


The solar wind is constantly crashing into the earth, alternating between times of turbulent cacophony and moments of serenity.

On a finer scale, we see variations in its density, bulk speed, and intrinsic magnetic field---sometimes in sync, and sometimes not so much. It is clear that certain features of the solar wind are correlated with various activities in Earth's ionosphere and magnetosphere, and as a scientist studying this stuff, my day-to-day is figuring out how to transform such correlative knowledge into a story of causation...
 
But how?

Yesterday I forayed into the realm of hydrology---a field where this question has also been taken quite seriously. Although a stranger to hydrological parlance, I found the mathematical landscape (stats, time series, modeling) and philosophical quandaries ("This high correlation cannot be for naught, can it?!") were familiar.

I came across many fascinating papers (see "Further Reading" below)---and while I skimmed them all, I only have time today to ponder and quote from one of them. The paper I spent the most time with was James W. Kirchner's 2006 paper, "Getting the right answers for the right reasons: Linking measurements, analyses, and models to advance the science of hydrology."

The accuracy of a broken clock illustrates at least part of the article's sentiment: though broken and frozen at 4:40, it accurately tells the time twice a day --- and approximately does so for a couple minutes just before and after ("good enough for most applications!").

Lessons Learned:  If the modeler is careful not to use the clock at times departing too far from 4:40, then this broken clock is a great model for the time of day!
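
For the numerically inclined, the broken clock can be written down as a "model" in a few lines of Python. This is just a toy sketch: the names clock_error_minutes and usable, and the two-minute tolerance, are invented here for illustration.

```python
MINUTES_ON_DIAL = 12 * 60    # a 12-hour dial repeats every 720 minutes
FROZEN_AT = 4 * 60 + 40      # the clock's one and only "prediction": 4:40

def clock_error_minutes(true_minutes_since_midnight):
    """Smallest gap, in minutes, between the frozen clock and the true time."""
    diff = (true_minutes_since_midnight - FROZEN_AT) % MINUTES_ON_DIAL
    return min(diff, MINUTES_ON_DIAL - diff)

def usable(true_minutes_since_midnight, tolerance=2):
    """True when the frozen clock is within `tolerance` minutes of the truth."""
    return clock_error_minutes(true_minutes_since_midnight) <= tolerance

# The model is exact at 4:40 (a.m. or p.m.), close just before and after,
# and as wrong as a 12-hour dial can be (six hours off) at 10:40.
for hour, minute in [(4, 40), (16, 40), (4, 42), (10, 40)]:
    t = hour * 60 + minute
    print(f"{hour:02d}:{minute:02d}", clock_error_minutes(t), usable(t))
```

The error is zero twice a day and tiny nearby, so a validation run restricted to "late afternoon" would give this model a glowing review.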


The point is, oftentimes misplaced faith is put into mathematical models describing some physical phenomenon. It might be that the model is only good for one type of condition ("late afternoon"), yet it is used to say things about other types of conditions ("early morning").

Sorry, I'm being purposely vague here... A concrete example might be a model that assumes stationarity for a geophysical time series: by its construction, it cannot---without modification---describe or predict the nonstationary features of the actual recorded time series.
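
A tiny illustration of that stationarity point, sketched in Python under very simplified assumptions: the "stationary model" here is nothing more than a constant mean level, and the trending series is invented. Fit the constant to a series with a trend, and the leftover errors are systematically negative early on and positive later, which is exactly the structure the model cannot represent.

```python
from statistics import mean

def fit_constant_mean(series):
    """The simplest stationary 'model': one constant level for all time."""
    return mean(series)

def residual_means_by_half(series):
    """Average model error in the first and second halves of the record."""
    level = fit_constant_mean(series)
    half = len(series) // 2
    first = mean([x - level for x in series[:half]])
    second = mean([x - level for x in series[half:]])
    return first, second

flat = [3.0] * 10                           # genuinely stationary toy series
trending = [0.5 * t for t in range(10)]     # toy series with a steady upward trend

print(residual_means_by_half(flat))         # no systematic error in either half
print(residual_means_by_half(trending))     # under-shoots late, over-shoots early
```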

Over-parameterization of a model was a hot topic in the paper: the fundamental pitfall of over-parameterizing is that too many extraneous degrees of freedom allow one to fit their model to almost any data set, even if that model is wrong or the data is bad. This allows one to describe a phenomenon ("get the right answers") using a model that does not represent the reality ("for the wrong reasons").
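
Here is a minimal Python sketch of that pitfall. It is not from Kirchner's paper: the data are invented (a straight line, y = 2x, plus some fixed "noise"), and the over-parameterized model is a stand-in, namely a polynomial with one free coefficient per data point. The polynomial reproduces the calibration data perfectly and then falls apart one step outside it, while the humble two-parameter line does fine.

```python
def fit_line(xs, ys):
    """Two free parameters: ordinary least-squares slope and intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x

def fit_interpolant(xs, ys):
    """As many free parameters as data points: the Lagrange polynomial."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict

def max_abs_error(model, xs, ys):
    """Worst-case misfit of a model over a set of points."""
    return max(abs(model(x) - y) for x, y in zip(xs, ys))

# Invented calibration data: the "true" process is y = 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.3, -0.4, 0.5, -0.3, 0.4, -0.5]
ys = [2.0 * x + e for x, e in zip(xs, noise)]

line = fit_line(xs, ys)
wiggly = fit_interpolant(xs, ys)

print("calibration misfit:", max_abs_error(line, xs, ys), max_abs_error(wiggly, xs, ys))
print("prediction at x = 6 (truth is 12):", line(6.0), wiggly(6.0))
```

The six-parameter model "wins" on the calibration data by fitting the noise, which is the sense in which the extra degrees of freedom let you get the right answers for the wrong reasons.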

"Whatever," says the token engineer, "If it works, it's good enough for me." This view has some merit given the model is consistently able to explain independent data sets or make accurate predictions, etc. However, the token scientist however is not so satisfied: "But why these parameters?" the token scientist asks. "And why is it that parameter set of this model is not unique? Certainly there is one true storyline going on in reality!"

Kirchner amusingly remarks that being able to tune parameters in this way "violates a basic principle of mathematical modeling, namely that the constants should stay constant while the variables vary."

He argues that over-parameterization is not only a cheap way to the right answers (or worse, the "right" answers), but fundamentally hides from us the true structural and geometric information that is inherent to the phenomenon under study. "Whereas the problem of parameter identification has been emphasized in the hydrologic literature, the more fundamental (and difficult) problem of structural identification has received less attention than it deserves. Likewise, whereas many hydrologists recognize that over-parameterization makes parameter identification problematic, it is less clearly understood that over-parameterization also makes structural identification difficult. Parameter tuning makes models more flexible, and thus makes their behavior less dependent on their structure. This in turn makes validation exercises less effective for diagnosing models' structural problems. By making it easier for models to get the right answer, over-parameterization makes it harder to tell whether they are getting the right answer for the right reason."

Some people might say, "Oh, but what's the harm in a little parameter tuning, Kirch?" to which the Kirch Man says, "Very little parameter tuning is still too much!" He discusses a type of hydrological study in which the data sets contain only enough information to constrain simple models with up to four free parameters, and another study that despite using detailed data sets could not constrain a six-parameter model. "Ok," says my reader, "So just use four or five free parameters. Gah, what's the big deal?!" The big deal, Kirchner says, is that many types of hydrological models come chock full of free parameters---dozens of them!

Hydrologic Models
From what I gather, hydrologists largely use two classes of models: lumped-parameter and spatially-distributed.

A lumped-parameter model is like a circuit diagram: in reality you have a voltage applied across a copper wire made up of jazillions of atoms and used to power a light bulb (another jazillion atoms, heat, photons, and the works!), but in the lumped-parameter model you have a 1-dimensional loop that has a few symbols representing macroscopic, emergent features like resistance and a continuous current.
How do you usefully trivialize the existence of a jazillion atoms?
Lumped parameters, baby!
Lumped-parameter models like circuit diagrams are obviously useful, but they operate on a low-resolution scale.
Spatially-distributed models seem to be just the opposite, going for tons of geographic detail and often suffering from a proliferation of free parameters due to spatial disaggregation. ("Dis-aggra-wha?" you ask? Well, when you aggregate, you lump things together, so when you disaggregate, you take them apart --- you want the details, not the averages. Spatial disaggregation is also called downscaling: it is the process of mapping information from a coarse spatial scale to a finer one while maintaining consistency with the original dataset. For the time series analysts out there, in the context of a time series, temporal disaggregation would basically be a form of upsampling---that is, going down in scale is going up in sampling rate.)
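
As a rough sketch of the temporal version, assuming the crudest possible scheme (split each coarse total evenly across the finer steps), disaggregation and its consistency check might look like this in Python. The function names are hypothetical, and real disaggregation methods typically bring in extra information rather than splitting evenly.

```python
def disaggregate_uniform(coarse_totals, factor):
    """Split each coarse total evenly over `factor` finer time steps."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    fine = []
    for total in coarse_totals:
        fine.extend([total / factor] * factor)
    return fine

def aggregate(fine_values, factor):
    """Sum consecutive groups of `factor` fine steps back into coarse totals."""
    if len(fine_values) % factor:
        raise ValueError("length must be a multiple of factor")
    return [sum(fine_values[i:i + factor])
            for i in range(0, len(fine_values), factor)]

coarse = [12.0, 0.0, 6.0]                  # e.g. three coarse-step totals
fine = disaggregate_uniform(coarse, 3)     # three times the sampling rate
print(fine)
print(aggregate(fine, 3) == coarse)        # consistency with the original data
```

The round trip through aggregate is the "maintaining consistency with the original dataset" part: whatever detail gets invented at the fine scale, it has to add back up to what was actually recorded.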

Anyway!

Lumped-parameter models can obscure the physics on smaller scales, while spatially-distributed models can obscure the broad-stroke physics at best, and tell a completely wrong story at worst. Kirchner suggests a compromise between lumped-parameter models and spatially-distributed models: quasi-lumped, quasi-distributed. Not so simple as to trivialize away all the detail, but not so detailed as to overparameterize and hide away the structure. More importantly, his mid-scale-type model must be falsifiable: it must be able to succumb to available data sets. In other words, if our data sets can only constrain four parameters, then this model should only have four parameters.

Kirchner says, "All hydrological knowledge ultimately comes from observations, experiments, and measurements... Mathematical tools can at best only clarify (and at worst, obscure or distort) the information those data contain." He therefore argues, "In order to know whether we are getting the right answers for the right reasons, we will need to develop reduced-form models with very few free parameters... The collision of theory and data ... will be more scientifically productive if we can develop models that are parametrically efficient." Furthermore, we need to "develop model testing regimens that compare models against data more incisively."

What type of model testing is he talking about? Well, when I have the time, I'll tell you. For now, know that the hydrologic-y technical argument might go like: "Split-sample tests bad. Differential split-sample tests good!"
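
For orientation only, here is a minimal Python sketch of the two tests as they are commonly described in the hydrology literature: a split-sample test calibrates on one stretch of the record and validates on another, while a differential split-sample test deliberately calibrates under one kind of condition (say, wet years) and validates under a different one (dry years). The years, precipitation values, and function names are all invented for illustration, and none of this is taken from Kirchner's paper.

```python
def split_sample(years):
    """Chronological split: calibrate on the first half, validate on the second."""
    ordered = sorted(years)
    half = len(ordered) // 2
    return ordered[:half], ordered[half:]

def differential_split_sample(annual_precip):
    """Split by condition: calibrate on the wetter years, validate on the drier."""
    ranked = sorted(annual_precip, key=annual_precip.get, reverse=True)
    half = len(ranked) // 2
    return sorted(ranked[:half]), sorted(ranked[half:])

# Hypothetical annual precipitation totals (mm), keyed by year.
annual_precip = {2001: 900, 2002: 400, 2003: 850, 2004: 450, 2005: 950, 2006: 380}

print(split_sample(annual_precip))               # both halves mix wet and dry years
print(differential_split_sample(annual_precip))  # wet years vs. dry years
```

The second split is the harsher exam: the model has to perform under conditions it was never tuned on, which is where a broken-clock model gets caught.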

Further Philosophically-Heightened Hydrological Reading
Further Reading on Temporal Disaggregation




Tuesday, February 3, 2015

IDL 8.2: persistent com.exelis.lmgrd issue observed in Console

Quite a few months back, I opened up my MacBook Pro's Console app to investigate a sleep/hibernation issue I was having --- the first thing I noticed, however, was completely unrelated: the IDL license manager daemon (com.exelis.lmgrd) was desperately trying to run every 10 seconds.

If you're running IDL 8.2 on a Mac, you might be having a similar problem. To check, press command+<Space> to bring up the Spotlight Search, type in "Console", and press <Enter>. If this issue is occurring, you will see an annoyingly persistent string of messages in the Console that look like this:
[Date/Time Info] com.apple.xpc.launchd[1] (com.exelis.lmgrd): Service only ran for 0 seconds. Pushing respawn out by 10 seconds.
You should see this whether or not IDL is running, which is a major reason I suspected it was a bug in need of fixing. But I was busy and figured it wasn't a priority issue, so I left it for later...
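
If you'd rather count than scroll, here is a small Python sketch that tallies these respawn messages in log text you've copied out of Console. It assumes the messages follow the wording shown above; the helper name count_respawns, the sample timestamps, and the com.example.other label are made up for illustration.

```python
import re

# Matches the launchd complaint quoted above and captures the service label.
RESPAWN = re.compile(
    r"\((?P<label>[\w.]+)\): Service only ran for (?P<ran>\d+) seconds\. "
    r"Pushing respawn out by (?P<delay>\d+) seconds\."
)

def count_respawns(log_text, label="com.exelis.lmgrd"):
    """Count respawn messages for one service label in a block of log text."""
    return sum(1 for m in RESPAWN.finditer(log_text) if m.group("label") == label)

# Hypothetical sample: the timestamps are invented, the message text is as above.
sample = (
    "Feb  3 10:00:00 com.apple.xpc.launchd[1] (com.exelis.lmgrd): "
    "Service only ran for 0 seconds. Pushing respawn out by 10 seconds.\n"
    "Feb  3 10:00:10 com.apple.xpc.launchd[1] (com.exelis.lmgrd): "
    "Service only ran for 0 seconds. Pushing respawn out by 10 seconds.\n"
    "Feb  3 10:00:12 com.apple.xpc.launchd[1] (com.example.other): "
    "Service only ran for 3 seconds. Pushing respawn out by 7 seconds.\n"
)

print(count_respawns(sample))
```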

If this were a dramatic post fueled by catastrophe, then at this point I'd tell you, "But I was wrong. It was extremely high priority and I should have fixed it while I still had the chance!!!"

But this is not a dramatic post. The issue is more like a zit than a cancer. It just annoyed me that it was happening and seemed like it needed to be addressed...so, without further ado -- The Solution!

The Exelis website addresses this issue very nicely. If you have an Exelis user account, then this page should help you just fine. I have not been able to get an Exelis user account for some reason (I've registered twice now to no avail), so that page was merely a gateway page for me --- a gateway to this page!

If you're like me --- free of an Exelis account --- or probably even if you have an account, just head to your Terminal and type:

sudo /bin/launchctl unload -w /Library/LaunchDaemons/com.exelis.lmgrd.plist

The page gives further advice, which I did not need to adhere to (but you might):
After issuing the "launchctl unload" command(s), reboot the system to halt any remnant "lmgrd" and "idl_lmgrd" processes.
And that should do it! I haven't seen the issue since, and it has been about a month.

Alternatively, you might just upgrade to IDL 8.4. But sometimes that's an issue (e.g., $$$).

==================================================
Extraneous information:
Launchctl is an interface to launchd, which manages processes and daemons running on your system. Launchctl allows you to load and unload jobs. See the launchctl man page (type "man launchctl" in Terminal) to learn more about the unload option and -w flag.