Hey gbulmer! Great questions that we don't have great answers for...
First off, I can't remember if we updated the bootloader docs with our final-final Maple rev3 restart procedure: we now look for a magic ASCII string ("1EAF") coming in a single USB packet straight after the falling edge of DTR.
What a horrible ugly hack!
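For anyone who wants to picture the check, here is a minimal sketch in C. It is not the actual Maple bootloader code. The names `reset_detector`, `reset_detector_on_dtr` and `reset_detector_on_packet` are made up for illustration, and it assumes the packet has to contain exactly the four magic bytes and nothing else.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RESET_MAGIC     "1EAF"
#define RESET_MAGIC_LEN 4u

/* Hypothetical state for the auto-reset check (illustrative only). */
typedef struct {
    bool dtr;    /* last DTR level reported by the host */
    bool armed;  /* true between a DTR falling edge and the next packet */
} reset_detector;

/* Call whenever the host changes the DTR control line. */
void reset_detector_on_dtr(reset_detector *d, bool dtr_now)
{
    if (d->dtr && !dtr_now) {
        d->armed = true;   /* falling edge: watch the next packet */
    } else if (dtr_now) {
        d->armed = false;  /* DTR raised again: cancel the watch */
    }
    d->dtr = dtr_now;
}

/* Call for every received data packet. Returns true when the device
 * should reset into the bootloader. Only the first packet after a
 * falling edge is considered (assumption: exact 4-byte match). */
bool reset_detector_on_packet(reset_detector *d,
                              const uint8_t *data, size_t len)
{
    bool was_armed = d->armed;
    d->armed = false;
    return was_armed
        && len == RESET_MAGIC_LEN
        && memcmp(data, RESET_MAGIC, RESET_MAGIC_LEN) == 0;
}
```

In real firmware the two functions would be driven from the CDC control-line-state and data-received callbacks, and a `true` result would trigger the actual reset.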
Our original plan seemed the most elegant to me: a CDC device for doing serial comms and a DFU device for flashing new firmware, both of which are the appropriate generic USB class device types, and thus should have been supported on any modern operating system. The DFU code would sit in the background; when upload activity started on that device, the background code would stop CDC, clear the usercode stack, and enter the bootloader routines. However, it turns out that neither Mac OS X nor Windows seems to implement the USB spec completely, especially for these class devices. Ironically (given the history of device drivers on open source operating systems), this scheme, and all the others we tried, worked just fine on Linux. Sigh. If I remember correctly, the particular problem was that Windows doesn't support multi-device USB configurations without some special driver hackery. The most recent magic string hackery was required because Mac OS X doesn't handle control over the RTS/DTR lines correctly from userspace.
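Roughly, the hand-off in that original plan could look like the sketch below. This is only an illustration under assumptions, not code from the Maple firmware: `device_state`, `run_mode` and `on_dfu_class_request` are hypothetical names, the choice of DFU_DETACH and DFU_DNLOAD as the "upload activity" triggers is a guess, and the hardware-specific stack reset and jump are left as a comment. The two request numbers are the standard values from the USB DFU class spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* bRequest values defined by the USB DFU class specification. */
enum { DFU_DETACH = 0, DFU_DNLOAD = 1 };

typedef enum { MODE_USER, MODE_BOOTLOADER } run_mode;

/* Hypothetical device state (illustrative only). */
typedef struct {
    run_mode mode;
    bool cdc_enabled;
} device_state;

/* Called for each class request addressed to the DFU interface while
 * user code is running. Returns true if the request caused a hand-off
 * to the bootloader. */
bool on_dfu_class_request(device_state *s, uint8_t bRequest)
{
    if (s->mode != MODE_USER) {
        return false;  /* already in the bootloader */
    }
    if (bRequest != DFU_DETACH && bRequest != DFU_DNLOAD) {
        return false;  /* not upload activity: keep running user code */
    }
    s->cdc_enabled = false;  /* 1. stop CDC */
    /* 2. real firmware would reset the stack pointer here, discarding
     *    the user program's stack (hardware-specific, omitted) */
    s->mode = MODE_BOOTLOADER;  /* 3. enter the bootloader routines */
    return true;
}
```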
To get back to your question, DFU was a much simpler protocol to implement, and thus the bootloader code size can in theory be much smaller. I think it can also upload code much faster, and in general it is just "the right tool for the job". We also (still) would like to have the option not to compile USB support into usercode, because it takes a lot of memory and the frequent interrupts disrupt timing-dependent code; when this is done, users have to manually enter perpetual bootloader mode to upload new code.
To summarize, you certainly could implement a "bootloader" like the one you describe above, with some caveats: it would have to be compiled into every user program (like ours is now, unfortunately, but just the auto-reset part), the out-of-band signals would be very hard to get right on all 3 platforms, and users would not have the option of implementing USB device types other than CDC (e.g. HID or mass storage). But in the end maybe those restrictions are better than the DFU/CDC headache.
There are other solutions too: I love the simplicity of the mbed's mass storage upload, and the HalfKay protocol implemented by Teensy (which uses generic HID) definitely makes me curious, though its implementation is somewhat controversial (see http://www.pjrc.com/teensy/halfkay_protocol.html and http://fourwalledcubicle.com/blog/archives/617 ). Emulating FTDI is also something I think about a lot: there are popular, stable, well-supported drivers that can be installed (or are bundled) for every platform.
[sorry about the wandering structure of this post, I'm kind of sleep deprived ;) ]