Saturday, November 15, 2008

Do we really know what oil is?

Few months ago everybody was sure the price of oil was going up-up-up!
Now it's very low and sinking... how is that possible?

''the abiogenic oil theory posits that oil is not formed from plants and animals compressed for millions of years in sediment rock. Instead, oil is a primordial substance created before the formation of Earth, and found deep underground.
...
As early as 1995, a New York Times article quoted Dr. K. K. Bissada, a Texaco geochemist: "I think we pump oil out much faster than oil can come in. ... But from a long-term perspective, I believe that hydrocarbons are coming in from great depths and are filling the newer reservoirs at shallower depths.'' Stop and read that again.

Replenishing oil reservoirs makes perfect sense if we think of oil as akin to magma, which comes from deep in the Earth, rather than a substance created from dead ferns compressed for millions of years. Will oil turn out to be a near-infinite resource?
'' [source]

How many things we "know" are in fact based on one (old) theory?
How many alternative explanations have we (collectively) forgotten?

An example from Computer Science.
A couple of years ago I was following a student working on his thesis, and he wanted to implement a compiler in Java for a simple imperative langauge (a sub-set of Java itself actually), and he told me that his goal was to code the parser manually.

I suggested a recursive-descent parser (RDP), but I feared that Java's grammar was too complicated.
More precisely I was sure that it was impossible to build a RDP for that grammar, because I rememeber from my formal languages course that when you have to build a compiler for some compicated grammar with semantic actions you need in general more than a RDP, since you want to parse and evaluate inherited and synthesized attributes at the same time.

The student was very motivated and I decided to help him, so I looked deeper into the matter. It turns out he was right: it is possible to use a simple RDP even for complicated attribute grammars!
The trick (that I discovered came out recently in the XML-parsing community) is to divide the task in 2: first you make a simple, nice and readable RDP parser, and just build the abstract syntax tree (AST) for your program. Then you manipulate the AST to compute all attributes, with some hand-written code (possibly using a visitor design-pattern).
Great! But...

But I still was sure I learned that it was IMPOSSIBLE!
What happened to those theoretical results and theorems? Are they just wrong now, as they were right in the 70es when they were proven?
Actually that is the key element: the results I studied at university in Italy in the 90es are based on theorems from the 70es.
And at that time it was silly to think to parse an entire program into an AST, keep it all in the memory, then evaluate the attributes of the grammar.
So because of the assumption that memory optimization was central, all standard techniques derived from those theorems force you to build complex bottom-up parsers (that cannot be managed manually, and have very low readability).
So the theorems are of course still right, it is just that some of the assumptions they were based on are not relevant anymore.

To me it looks like this is happening more and more, more frequently and in many areas.

No comments:

Post a Comment