I've found that my program can't afford to throw away 3 ms on every character write, so I got the timings down far enough to meet my "requirements".
Thought this might be useful to other folks, and it might be a good change to go into the next IDE version.
This mod to pulseEnable() in LiquidCrystal.cpp should get your code loop moving faster.
It's quite stable on my messy breadboard.
void LiquidCrystal::pulseEnable(void) {
  digitalWrite(_enable_pin, LOW);
  delayMicroseconds(12);
  //delay(1); // Maple lib
  digitalWrite(_enable_pin, HIGH);
  delayMicroseconds(12); // enable pulse must be >450ns
  //delay(1); // Maple lib
  digitalWrite(_enable_pin, LOW);
  delayMicroseconds(80); // commands need > 37us to settle
  //delay(1); // Maple lib
}
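
For anyone wondering how much this buys you, here is a rough back-of-the-envelope sketch of the delay budget. It only adds up the explicit delays in pulseEnable() and ignores the time the digitalWrite() calls take. The names (kOriginalPulseUs, kModifiedPulseUs, delayPerCharUs) are made up for illustration and are not part of the library, and the "original" figure assumes the stock Maple code used the three commented-out delay(1) calls. The number of enable pulses per character is also an assumption: typically 1 on an 8-bit bus and 2 on a 4-bit bus.

```cpp
#include <cstdint>

// Explicit delays inside one pulseEnable() call, in microseconds.
// Time spent in digitalWrite() itself is not counted.
// Original: assumed to be the three delay(1) calls that are commented out above.
constexpr std::uint32_t kOriginalPulseUs = 3 * 1000;
// Modified: the three delayMicroseconds() calls, 12 + 12 + 80.
constexpr std::uint32_t kModifiedPulseUs = 12 + 12 + 80;

// Delay spent per character write.
// pulsesPerChar is an assumption: 1 for an 8-bit bus, 2 for a 4-bit bus.
constexpr std::uint32_t delayPerCharUs(std::uint32_t pulseUs,
                                       std::uint32_t pulsesPerChar) {
    return pulseUs * pulsesPerChar;
}

// Delay spent writing `chars` characters in a row (hypothetical helper).
constexpr std::uint32_t delayForCharsUs(std::uint32_t pulseUs,
                                        std::uint32_t pulsesPerChar,
                                        std::uint32_t chars) {
    return delayPerCharUs(pulseUs, pulsesPerChar) * chars;
}
```

With one pulse per character, filling a 16x2 display (32 characters) works out to delayForCharsUs(kOriginalPulseUs, 1, 32) = 96000 us of pure delay with the original timings, versus delayForCharsUs(kModifiedPulseUs, 1, 32) = 3328 us with the mod.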