I think that for something like this, an OSC implementation would probably be more usable.
(For newcomers: OSC is a more modern protocol intended as a replacement for MIDI. MIDI commands can still be tunnelled inside OSC, so choosing it doesn't shut out anyone who still wants to use MIDI. http://en.wikipedia.org/wiki/OpenSound_Control )
As a bonus, CrystalSpace supposedly already supports OSC within the Crystal Entity Layer.
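To make the "tunnel MIDI within OSC" point concrete, here is a rough sketch of what such a packet could look like. It assumes the optional "m" (4-byte MIDI message) type tag from the OSC 1.0 spec, which not every OSC library supports. The address "/midi" and the function names are made up for illustration and have nothing to do with Blender or CrystalSpace.

```python
def osc_pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC strings require."""
    padded_len = (len(data) // 4 + 1) * 4
    return data.ljust(padded_len, b"\x00")


def osc_midi_message(address: str, port: int, status: int,
                     data1: int, data2: int) -> bytes:
    """Build one OSC message carrying a single MIDI message.

    Layout: padded address, padded type tag string ",m", then the
    4-byte MIDI argument (port id, status byte, data1, data2).
    """
    midi_arg = bytes([port, status, data1, data2])
    return osc_pad(address.encode("ascii")) + osc_pad(b",m") + midi_arg


# Note On, channel 1, middle C (60), velocity 100, sent to a hypothetical address.
packet = osc_midi_message("/midi", 0, 0x90, 60, 100)
print(packet.hex())
```

The resulting bytes would normally be sent as one UDP datagram. A receiver that knows the "m" tag can hand the last four bytes straight to its existing MIDI handling.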
MIDI timecode (MTC) is pretty inaccurate; frame accuracy
between numerous audio-visual events is very
time-consuming to pull off.
A normalized SMPTE generator/reader included in Blender
would be an appropriate step toward synchronizing (with
sub-frame accuracy) and manipulating various picture and
sound elements. It is almost mandatory in 'real world'
motion picture pre-production, principal photography,
and post-production. Systems being equal, this is a
rock-solid and stable method for audiovisual control.
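
For anyone wondering what the arithmetic behind such a generator/reader might look like, here is a minimal sketch. It only covers non-drop-frame timecode at integer frame rates (24, 25, 30); drop-frame 29.97 would need extra handling. The function names are hypothetical and are not part of any existing Blender API.

```python
def frames_to_timecode(total_frames: int, fps: int = 25) -> str:
    """Convert an absolute frame count to HH:MM:SS:FF (non-drop-frame)."""
    frames = total_frames % fps
    total_seconds = total_frames // fps
    seconds = total_seconds % 60
    minutes = (total_seconds // 60) % 60
    hours = total_seconds // 3600
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def timecode_to_frames(timecode: str, fps: int = 25) -> int:
    """Convert HH:MM:SS:FF (non-drop-frame) back to an absolute frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def time_to_frame_and_subframe(seconds: float, fps: int = 25) -> tuple[int, float]:
    """Split a time in seconds into a whole frame plus a sub-frame fraction."""
    whole, fraction = divmod(seconds * fps, 1)
    return int(whole), fraction


print(frames_to_timecode(90125, fps=25))
print(timecode_to_frames("01:00:05:00", fps=25))
print(time_to_frame_and_subframe(2.5, fps=25))
```

The last function is where the sub-frame accuracy comes in: an audio event at 2.5 seconds lands halfway through frame 62 at 25 fps, which a frame-only timecode cannot express.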