B-frame
1. B-Frames
Introduction
This chapter is an introduction to B-frame handling. If you are familiar with the concepts you can safely skip it.
Video frames fall into three types:
- I-Frame: Intra frame, also called keyframe. It has no reference frame and can be decoded on its own. It can be thought of as a JPEG image.
- P-Frame: Predicted frame. It is deduced from the previous frame (I or P) and cannot be built if the decoder has not decoded the previous frames.
- B-Frame: Bidirectional frame. It is decoded from both the previous and the next I or P frame.
B-frames are interesting for two reasons. First, they offer slightly better prediction. Second, and more importantly, they do not affect the quality of the following frames, so they can be coded at lower quality without degrading the whole sequence.
Since B-frames depend on both past and future pictures, the decoder has to be fed the future I or P frame before it can decode them.
This is where the PTS/DTS logic comes in.
The PTS (Presentation Time Stamp) is the presentation time; it can be thought of as the display frame number. It gives the order in which you will see the decoded frames.
The DTS (Decoding Time Stamp) is the decoding frame number, i.e. the order in which the frames are decoded.
Assume you have a short video like this: I-0 B-1 B-2 P-3
B-1 and B-2 depend on both I-0 and P-3. The corresponding DTS order would be: I-0 P-3 B-1 B-2.
To keep things simple, the file is stored in DTS order.
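The reordering from display (PTS) order to decode (DTS) order can be sketched in a few lines of Python. This is only an illustration of the rule described above, not code from any codec; the function name `decode_order` and the `(type, pts)` tuples are made up for the example.

```python
def decode_order(display):
    """Reorder frames from display (PTS) order to decode (DTS) order.

    Each frame is a (type, pts) tuple. B-frames are held back until the
    next I- or P-frame (their future reference) has been emitted.
    """
    out = []
    pending_b = []
    for frame in display:
        frame_type, _pts = frame
        if frame_type == "B":
            pending_b.append(frame)
        else:
            out.append(frame)       # the reference frame goes first
            out.extend(pending_b)   # then the B-frames that were waiting for it
            pending_b = []
    out.extend(pending_b)           # trailing B-frames with no future reference
    return out


frames = [("I", 0), ("B", 1), ("B", 2), ("P", 3)]
print(decode_order(frames))
# [('I', 0), ('P', 3), ('B', 1), ('B', 2)]
```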
So what?
The problem is that, to keep showing the video in the right order, the codec has to reorder the frames so that they come out in display order and sequentially (i.e. one frame in, one frame out).
The mpeg way (the right way)
The usual way to do this is for the codec to delay its output by 3 frames. That way, it always has the two reference frames it needs to decode the B-frames.
In 0 3 1 2 . .
Out - - - 0 1 2 3 . .
This is perfectly legitimate for a player, as the delay is known when the file is created and is thus compensated for (i.e. the audio stays in sync).
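The In/Out diagram above can be reproduced with a toy model. The sketch below assumes a decoder that accepts one frame per step, outputs nothing during the first `delay` steps, and afterwards outputs the buffered frame with the lowest PTS. Frames are represented by their PTS only, and `delayed_decoder` is a hypothetical name; a real codec is considerably more involved.

```python
import heapq


def delayed_decoder(dts_order, delay=3):
    """Return the decoder output for each step (None means no output yet)."""
    buffered = []   # min-heap of decoded frames, keyed by PTS
    outputs = []
    for step, pts in enumerate(dts_order):
        heapq.heappush(buffered, pts)
        if step < delay:
            outputs.append(None)                     # still filling the delay
        else:
            outputs.append(heapq.heappop(buffered))  # one in, one out
    while buffered:                                  # flush at end of stream
        outputs.append(heapq.heappop(buffered))
    return outputs


print(delayed_decoder([0, 3, 1, 2]))
# [None, None, None, 0, 1, 2, 3]
```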
The Divx (and xvid) way
To make PTS/DTS reordering usable with applications that are not designed to handle such streams, the Divx codec (and xvid in compatibility mode) uses a different trick.
They use a variant of PB frames and pack several frames into one. The application thinks it is only one frame, and the codec handles the rest internally.
Taking the previous example, Divx would create a file like this, where () denotes one frame in the file:
In (0 3 1 2) - - - . . .
Out 0 1 2 3 ....
Null frames are inserted where frames were packed. The codec knows that if it receives a null frame after a pack of frames, it should pop out the next frame from the pack.
From a coder's point of view, this is attractive because it introduces no delay between input and output, and avi files have no PTS/DTS field to hint the decoder/player.
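The packed behaviour can be modelled the same way. The sketch below follows the simplified diagram above rather than the real bitstream layout: a file entry is either a tuple of packed frames (identified by PTS) or `None` for a null frame. The name `packed_decoder` is hypothetical.

```python
from collections import deque


def packed_decoder(entries):
    """Return one output per file entry, in display order."""
    pack = deque()   # frames already decoded from a pack, waiting to be shown
    outputs = []
    for entry in entries:
        if entry is not None:
            pack.extend(sorted(entry))   # unpack and put in display order
        # A packed entry yields its first frame at once; each following
        # null frame pops the next frame out of the pack.
        outputs.append(pack.popleft() if pack else None)
    return outputs


print(packed_decoder([(0, 3, 1, 2), None, None, None]))
# [0, 1, 2, 3]
```

Note that the application sees four entries in and four frames out with no delay, but three of the four entries carry no data of their own.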
So what, part 2?
This behaviour collides with avidemux's aim: to provide frame accuracy.
In the mpeg way, there is a delay between what is fed to the codec and what comes out. That is not acceptable, as you would never know which actual frame you are looking at.
The divx/xvid way is tricky because frames 2, 3 and 4 are seen as null frames, so we cannot cut such a stream with frame accuracy.
How does avidemux do it, then?
Simple: avidemux handles the PTS/DTS logic itself and forces the codec to pop out the frames immediately. The editor part of avidemux knows the DTS/PTS order of the frames and feeds the decoder correctly. You get both frame accuracy and B-frames.
The problem is that divx/xvid hides the frame types by packing them, so the editor cannot deal with that for now.
From avidemux 2.0.24 onward, the packed bitstream is automatically unpacked upon loading, but only for avi/openDML. If the source is an OGM file, first save it as an avi and reload it.
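To illustrate the idea (this is not avidemux's actual code), here is a minimal sketch of how an editor that knows the frame types and the file (DTS) order could work out what to feed the decoder to display a given frame. It assumes that B-frames are never used as references and that each group starts with an I-frame; `frames_to_feed` is a hypothetical helper.

```python
def frames_to_feed(file_order, target_pts):
    """Return the frames to decode, in order, to display frame target_pts.

    file_order is a list of (type, pts) tuples in file (DTS) order.
    """
    # Position of the wanted frame in the file.
    pos = next(i for i, (_, pts) in enumerate(file_order) if pts == target_pts)
    # Walk back to the keyframe that starts the group.
    start = pos
    while start > 0 and file_order[start][0] != "I":
        start -= 1
    # Only I/P frames are needed as references; other B-frames are skipped.
    refs = [f for f in file_order[start:pos] if f[0] != "B"]
    return refs + [file_order[pos]]


stream = [("I", 0), ("P", 3), ("B", 1), ("B", 2)]
print(frames_to_feed(stream, 2))
# [('I', 0), ('P', 3), ('B', 2)]
```

Because the editor picks the frames itself and the decoder returns each one immediately, the frame on screen is always the one that was requested.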