This is the problem of simultaneity between left and right images, and it has two parts. The first is ensuring that left and right images are captured simultaneously (or very close to one another) when the input is a pair of cameras. The second is ensuring that these images are subsequently displayed simultaneously. Both parts are addressed by correlating frames in a movie.
The second part is the more forgiving of the two: frames captured at slightly different times may still look satisfactory when displayed together. Provision should therefore be made for the output end to decide which frames are to be correlated.
Two main techniques for correlating frames are available: identifying correlated left and right frames with a common tag, and correlating the initial frames and subsequently pairing the $n$-th left and right frames together. Determining the exact advantages, disadvantages and suitability of these techniques will require experimentation with the system.
When tagging frames, the most obvious tag to use is systime, the number of seconds since the epoch. Modern systems are able to record this with a suitably high precision (easily supporting 30 frames per second), but relying on such precision would lower portability somewhat. It would also raise the question of how tags are to be embedded into movie files. However, it would allow either the input or output end to decide how close frames must be in order to be correlated. It would further aid peripheral functions, such as having two cameras begin capturing at the same systime, and determining network latencies in the system.
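As an illustration of systime tagging, the following minimal sketch pairs frames whose tags differ by no more than a chosen tolerance. The function name `correlate_by_timestamp`, the greedy matching strategy, and the sample timestamps are assumptions made for this example and are not part of the system described; it also assumes both cameras share a common clock and that each list of tags is sorted in ascending order.

```python
from typing import List, Tuple


def correlate_by_timestamp(
    left: List[float], right: List[float], tolerance: float
) -> List[Tuple[int, int]]:
    """Pair frame indices whose systime tags differ by at most `tolerance` seconds.

    Hypothetical helper: both lists hold seconds since the epoch, sorted
    ascending. Frames with no partner inside the tolerance are skipped.
    """
    pairs = []
    i = j = 0
    while i < len(left) and j < len(right):
        delta = left[i] - right[j]
        if abs(delta) <= tolerance:
            # Close enough to be treated as simultaneous.
            pairs.append((i, j))
            i += 1
            j += 1
        elif delta < 0:
            # Left frame is too early to match this right frame; skip it.
            i += 1
        else:
            # Right frame is too early to match this left frame; skip it.
            j += 1
    return pairs


# Illustrative tags at roughly 30 frames per second; one right frame is missing.
left_tags = [0.000, 0.033, 0.067, 0.100]
right_tags = [0.002, 0.036, 0.101]
pairs = correlate_by_timestamp(left_tags, right_tags, tolerance=0.005)
```

Because the tolerance is a parameter, the same routine could run at either the input or the output end, which is the flexibility the tagging approach is meant to provide.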
Another possibility is to have small heartbeat-style packets sent across the network between capturing cameras, with each packet carrying a unique identifier (for example, a monotonically increasing integer) that is subsequently used to tag the frames. However, this could place an excess strain on the network in question, possibly degrading the performance of the rest of the system, and it is also overly susceptible to varying network speeds and latencies.
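The pairing step of the heartbeat approach might look like the sketch below. It leaves out the network entirely and assumes, purely for illustration, that each camera tags a frame with the identifier of the most recent heartbeat it has received, so several frames may share a tag when heartbeats arrive more slowly than frames are captured. The name `pair_by_tag` and the sample tags are hypothetical.

```python
from typing import List, Tuple


def pair_by_tag(left_tags: List[int], right_tags: List[int]) -> List[Tuple[int, int]]:
    """Pair the first left and first right frame that carry each heartbeat tag.

    Hypothetical helper: tags are heartbeat identifiers recorded at capture
    time. Tags seen on only one side produce no pair.
    """
    # Index of the first right frame carrying each tag.
    first_right = {}
    for j, tag in enumerate(right_tags):
        first_right.setdefault(tag, j)

    pairs = []
    used = set()
    for i, tag in enumerate(left_tags):
        if tag in first_right and tag not in used:
            pairs.append((i, first_right[tag]))
            used.add(tag)
    return pairs


# Illustrative tags: heartbeat 3 never reached the right camera and
# heartbeat 4 never reached the left camera.
tag_pairs = pair_by_tag([1, 2, 2, 3, 5], [1, 1, 2, 4, 5])
```

The sample data shows the weakness noted above: a heartbeat delayed or lost on one side leaves frames without a partner.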
Another option is correlating the first frame, and then assuming that all subsequent frames are correlated (that is, using a ``clapper-board'' technique). This would be suitable, for example, where the frame rate could be set to a fixed value. Also, the initial frames could be correlated by specifying a systime at which they are taken, or by sending a small heartbeat packet across the network, borrowing ideas from earlier solutions.
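A minimal sketch of the clapper-board technique follows, assuming the initial frames are correlated by an agreed systime and that both cameras then run at the same fixed frame rate. The function name `clapper_board_pairs`, the rule of taking the first frame at or after the agreed time, and the sample capture times are assumptions made for this example.

```python
from typing import List, Tuple


def clapper_board_pairs(
    left_times: List[float], right_times: List[float], start_time: float
) -> List[Tuple[int, int]]:
    """Correlate the initial frames, then pair the n-th frames after them.

    Hypothetical helper: the initial frame of each stream is taken to be the
    first one captured at or after `start_time`. Later frames are paired by
    position only; their capture times are not consulted again.
    """

    def first_at_or_after(times: List[float]):
        for index, t in enumerate(times):
            if t >= start_time:
                return index
        return None

    left_start = first_at_or_after(left_times)
    right_start = first_at_or_after(right_times)
    if left_start is None or right_start is None:
        return []

    # Pair as many frames as both streams can supply after their start frames.
    count = min(len(left_times) - left_start, len(right_times) - right_start)
    return [(left_start + n, right_start + n) for n in range(count)]


# Illustrative capture times: the left camera started one frame early.
left_times = [9.90, 10.00, 10.04, 10.08]
right_times = [10.01, 10.05, 10.09, 10.13]
board_pairs = clapper_board_pairs(left_times, right_times, start_time=10.0)
```

Since only the initial frames are checked, any dropped frame or drift in frame rate after the start goes undetected, which is why a fixed frame rate matters for this technique.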