Yes, as gbulmer said, we're trying to avoid the 300ms loop times. Ideally we're hoping for a 20ms loop time.
After some experimentation, it looks like the problem is now tied to SD card writes, not to our calls to micros(). I did several experiments, but here are the four that I thought were interesting.
1. Running the simple code I posted at the top of the thread without the fix to micros() that gbulmer posted results in occasional long loop times.
2. Running the simple code I posted at the top of the thread with the fix to micros() that gbulmer posted results in no long loops.
3. Running our full code with the fix that gbulmer posted results in long loop times.
4. Running our full code with the fix that gbulmer posted, but with our SD card writes commented out, results in no long loops.
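For anyone who wants to repeat these experiments, here is a minimal sketch of one way to count long loops. It is not the code we actually ran. LoopTimer and its fields are hypothetical names, and I'm assuming the timestamp passed to tick() comes from micros() on the board.

```cpp
#include <cstdint>

// Hypothetical helper (not from our real code): records how long each
// pass through loop() takes and counts the passes that exceed a limit.
struct LoopTimer {
    uint32_t limit_us;        // loops longer than this count as "long"
    uint32_t last_us = 0;     // timestamp of the previous tick
    uint32_t worst_us = 0;    // longest loop seen so far
    uint32_t long_loops = 0;  // number of loops over the limit
    bool started = false;

    explicit LoopTimer(uint32_t limit) : limit_us(limit) {}

    // Call once per loop with the current time in microseconds.
    // Returns the duration of the loop that just finished (0 on first call).
    uint32_t tick(uint32_t now_us) {
        if (!started) {
            started = true;
            last_us = now_us;
            return 0;
        }
        // Unsigned subtraction stays correct across a counter wraparound.
        uint32_t dt = now_us - last_us;
        last_us = now_us;
        if (dt > worst_us) worst_us = dt;
        if (dt > limit_us) ++long_loops;
        return dt;
    }
};
```

On the board this would be something like a global LoopTimer timer(20000); with timer.tick(micros()); as the first line of loop(), then printing worst_us and long_loops every so often.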
Based on that, I'm inclined to believe that at least some of the long loops we were seeing originally were due to the bug in micros(), but now I'm convinced that I don't understand the delays that SD cards throw into the mix. I've started researching the options for DMA'ing data to the SD card, or using other card modes.
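One other thing I may try, separate from DMA, is batching the log data in RAM so the card is written in whole 512-byte blocks instead of on every loop. This is only a sketch of the idea and I haven't tested it on our hardware. BlockBuffer, BlockSink and kBlockSize are made-up names, and the sink callback stands in for whatever SD library write call is actually used. It would not remove whatever latency the card itself adds. It would only reduce how many loops contain a write.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed block size: SD cards are normally written in 512-byte sectors.
const size_t kBlockSize = 512;

// Hypothetical callback type. On the board this would wrap the real
// SD write call; here it is just a function pointer plus a context.
typedef void (*BlockSink)(const uint8_t* data, size_t len, void* ctx);

// Collects small records and hands them to the sink one full block at a time.
class BlockBuffer {
public:
    BlockBuffer(BlockSink sink, void* ctx) : sink_(sink), ctx_(ctx), used_(0) {}

    // Copy a record into the buffer; the sink is called only when a
    // full block has accumulated.
    void append(const uint8_t* data, size_t len) {
        while (len > 0) {
            size_t room = kBlockSize - used_;
            size_t n = len < room ? len : room;
            std::memcpy(buf_ + used_, data, n);
            used_ += n;
            data += n;
            len -= n;
            if (used_ == kBlockSize) flush();
        }
    }

    // Send whatever is buffered (possibly a partial block), e.g. at shutdown.
    void flush() {
        if (used_ == 0) return;
        sink_(buf_, used_, ctx_);
        used_ = 0;
    }

    // Number of bytes waiting in the buffer.
    size_t pending() const { return used_; }

private:
    BlockSink sink_;
    void* ctx_;
    size_t used_;
    uint8_t buf_[kBlockSize];
};
```

With 100-byte records, for example, the sink would only be called on roughly every fifth loop, so most loops would not touch the card at all. The loops that do write could still be long.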
Anyway, thanks for all the help, guys!
- Andy