Monday, January 28, 2013

Emacs: revival of dynamic scope ?

In my previous posts I wrote on lexical scoping in Emacs Lisp and how it was good for high-level programming. I must admit now: this was somewhat premature excitement.. As it always happens, when more experience comes, people tend to change their minds.

On the picture above you can see that lexically-scoped code performs much worse when there's much "throwing" and "catching".

Poor man's closures

That's what closures in Emacs are. Look at this:

After some time of staring into disassembler's output and trying out miscellaneous examples, you'll see that closed-over environment is made part of the vector of constants of the byte-code function object. Thus, the vector of constants is now being produced on-the-fly by concatenating the closure environment and real constants together, at runtime. See above at commands 14 and 15. The vector is being made by calling to vconcat. Pretty straightforward.

Now, it is quite clear why this piece of code sucks at performance: that's because of making closures a runtime. Dynamic special forms, like catch, unwind-protect and condition-case are forced to turn their bodies into closures to provide appropriate semantics (to support lexical scoping).

Actually, you can see a comment in bytecode.c:

This snippet is pretty much self-explanatory.

Emacs: just a text editor

Honestly, it seems to me now that I've made the same mistake the second time.
Some time ago, I was digging into CL package and discovered that it contained really many flaws and inconsistencies (not to even mention being poorly commented). So I sent a letter to the emacs-dev mailing list, and Steffan Monnier responded to me and said this:

Elisp is not Common-Lisp. CL does try to provide some CL-style functionality, but indeed it has some rough edges in this regard...

This was perfectly true: CL is far from being equal to Common Lisp in semantics. It's just an Emacs library that provides some features from Common Lisp.

Putting aside the flaws of the CL package for now, the main conclusion that I should have drawn from this lesson was this: When you're reasoning about things, take into account more of what they really are than of what you'd wish them to be. :)

Now, the story with lexical scoping. Elisp has been dynamic for a very long time; during this time, lots of modes/packages/libraries/extensions/other stuff was written. Elisp was designed to have dynamic scope from the beginning. Now developers decided to implement lexical scoping in the language; seems they were pursuing the goal of catching up with modern languages and making Elisp more pleasant and powerful to hack in. Apparently, it was tremendously difficult to implement lexical scoping and what it entails in a proper way. So they did what they could, what was practical and reasonable for them to do.

In my opinion, "proper" lexical scoping support includes proper optimization:

  1. Unrolling local functions - the most trivial. 
  2. Avoid making closures for local functions. 
  3. Avoid placing closure environment on the heap for downward functions (those that are passed as arguments only down the stack). 
  4. Reasonably efficient closure implementation. The closure environment allocation cost should be approximately equal to the cost of allocating a necessary chunk of memory on the heap (or using another allocation scheme). 
This is sad, but the current Emacs byte compiler does neither of the above in a proper way (or even does nothing at all for some points).

So, the same rule applies here: Emacs lexical scoping is not mature. Don't compare it with implementation of closures in, say, PLT Scheme of CPython. Emacs is a baby in comparison with these.

What to do then ?

I think Elisp hackers should keep in mind that Emacs is just a text editor. Yes, it is highly customizable, but it is a text editor. It is difficult to accomplish really complicated programming task in Elisp, despite the fact that Elisp is a Lisp dialect. Python or Ruby or Scheme or Common Lisp would all do better.

So we should just use proper tool for the job. Elisp is okay for making scripts running into Emacs and performing some text-manipulation work. Elisp is okay even for relatively large and complicated scripts running into Emacs, but that's it.

There's no point for Emacs users to try to optimize Emacs byte-compiler. First, this is a job for Emacs developers; second, in most cases Emacs doesn't really need those optimizations, 'cause the main source of sluggishness in normal work is a text manipulation, and the care has already been taken to make it run fast enough - by writing in C crucial low-level primitives and algorithms.

So, dynamic scope or lexical scope ?

I think both ways of Elisp programming are viable; you may use new lexical features or may as well refrain from using them. Both decisions have their own strong and weak sides.

As for lexical scoping, it is more convenient for programming. You will most probably feel much less pain trying to do higher-order function stuff in Emacs Lisp if you stick to lexical scoping.

Demerits of lexical scoping are:
  1. Performance pitfalls as described above. Though I'm sure these are not common in typical Elisp programming, so this is not much of a problem; 
  2. Edebug doesn't display values of lexical variables properly (see this); 
  3. Emacs disassembler displays bodies of closures as string constants. This is not the case with dynamically-scoped nested functions. 
As for me, I've decided to return back to old good dynamic Elisp. :) This can be justified by the fact that certain lexical features can be simulated in dynamic Elisp.

Thanks for attention. Good luck !