Abstract
In 1997, a project was started to capture, compress, store, and index up to 24 hours of digital TV broadcasts; the work in this report contributes to its implementation. The first chapter introduces the overall project and the motivation behind the particular focus of this work. The second chapter deals with the theory behind digital video compression. The third chapter describes how the program to extract the motion vectors from the MPEG stream was developed, and how it was extended so that the motion from frame to frame can be calculated. Chapter four explains why knowledge of the motion vectors alone is not sufficient to calculate the motion from frame to frame; it identifies the extra information that is needed and shows how all of this information is used to calculate the motion from frame to frame.
Table Of Contents
Abstract ... ii
Table Of Contents ... iii
Table of Figures ... iv
Chapter 1
1. Introduction ... 1
Chapter 2
2. Digital Video Compression ... 3
  2.1 The MPEG-1 bit stream ... 3
    2.1.1 Description of a frame ... 5
    2.1.2 Bit stream order and display order of frames ... 6
    2.1.3 Description of a macroblock ... 7
  2.2 Types of macroblock present in a frame ... 8
    2.2.1 Types of macroblock in an I frame ... 8
    2.2.2 Types of macroblock in a P frame ... 8
    2.2.3 Types of macroblock in a B frame ... 9
  2.3 Motion estimation and compensation ... 9
    2.3.1 Encoding the motion vectors ... 11
  Summary ... 12
Chapter 3
3 Extraction of the motion vectors ... 13
  3.1 Choosing a decoder ... 13
    3.1.1 The Berkeley Decoder ... 13
    3.1.2 The Java Decoder ... 13
  3.2 Description of the source code ... 13
  3.3 Storage of the Motion Vectors ... 15
    3.3.1 Reordering the bit stream order to the display order ... 15
    3.3.2 Storing the motion vectors ... 16
    3.3.3 Operation of program ... 17
    3.3.4 Alterations made to the decoder ... 18
  Summary ... 20
Chapter 4
4 Finding the motion from frame to frame ... 21
  4.1 Considerations that have to be taken into account - Frame level ... 22
  4.2 Considerations that have to be taken into account - Macroblock level ... 23
  Summary ... 24
Conclusion ... 25
References ... 25
Appendix A ... 26
Table of Figures
Figure 2.1 The layered structure of the MPEG bit stream ... 4
Figure 2.2 P frames use only forward prediction ... 5
Figure 2.3 B frames use both forward and backward prediction ... 6
Figure 2.4 A single frame divided up into slices ... 6
Figure 2.5 Only one set of chrominance components is needed for every four luminance components ... 7
Figure 2.6 Structure of a macroblock, and the blocks' numbering convention ... 7
Figure 2.7 A forward predicted motion vector ... 10
Figure 3.1 Converting from bit stream order to display order ... 16
Figure 3.2 Diagram of where the motion vectors for the different frames are stored ... 18
Figure 3.3 Flow chart of the operational program ... 19
Figure 4.1 Motion vectors associated with a moving picture ... 21
Figure 4.2 Realistic version of vectors associated with a moving picture ... 23
Chapter 1

1. Introduction
With the recent arrival of digital TV in America and Great Britain, it is only a short time before its use becomes standard. Recent years have also brought huge advances in:
- Networking - high bandwidth networks not only in the workplace, but reaching many homes also;
- Data storage - today we talk only in Gigabytes;
- Video compression - modern techniques allow compression ratios of up to fifty to one (this topic is discussed in detail in Chapter 2).
The combination of these developments will bring the widespread usage of digital video over the next few years. Following the launch of this new technology will be the launch of many new services. We could see the introduction of the local video server instead of the local video store, where connected residents can select a video from a huge multimedia server. A recording of all television broadcasts for the past week may be stored, allowing subscribers to catch up on any missed viewing. Searching through such large archives will create the need for a navigation tool. There is an ongoing project in DCU at the moment to develop such a tool, of which this project is only a part [1]. When complete, the tool will allow the user to pick a category to search through (sport, drama, action, soap). Clicking on a category will display a list of key frames, each frame representing a program. Clicking on one of these frames will display another list of key frames; using this hierarchical approach, the user can narrow the search down to a single shot of video. One of the challenges of the project is to choose a frame that best represents a clip of film. It has been found that the frame after a sequence of frames with a lot of action is sometimes a good representation for that shot. This is one area where the motion vectors may come in useful. To allow navigation, the material first has to be broken up into elements. For video these elements are shots and scenes.
A shot is defined as the continuous recording of a single camera; a scene is made up of multiple shots, while a television broadcast consists of a collection of scenes. For studio broadcasts (take, for example, the news) it is fairly easy to break the program up, as the boundaries between shots are hard. However, most television programs and films use special techniques to soften the boundaries, which makes them less detectable. There are four different types of boundaries between shots:
- A cut. This is a hard boundary and occurs when there is a complete change of picture over two consecutive frames.
- A fade. There are two types of fade, a fade out and a fade in. A fade out occurs when the picture gradually fades to a dot or black screen, while a fade in occurs when the picture is gradually displayed from a black screen. Both effects occur over a few frames.
- A dissolve. This is the simultaneous occurrence of a fade out and a fade in; the two pictures are superimposed on each other.
- A wipe. This effect is like a virtual line going across the screen, clearing one picture as it brings in another. Again, this occurs over a few frames.
There are several techniques (the pixel based difference method, the colour histogram method, detection of macroblocks, and edge detection [5]) which can reliably detect a cut. However, only edge detection is in any way effective in detecting fades, dissolves and wipes. There is another ongoing project in DCU at the moment that uses edge detection to find shot boundaries. The program takes two consecutive frames, uses special techniques to leave just a black and white outline of any objects in the frames, and then compares the two outlines. If there are a lot of differences between them, it concludes a shot cut has occurred. One of the flaws of this method is that it only allows for relatively small movements of the objects from frame to frame. If something large suddenly moves across the screen, it may be interpreted as a cut. To illustrate where this may happen, take the example where a journalist is giving a TV report from outside a building, and suddenly a bus goes by in the background. The inclusion of the bus in the frame could confuse the program into thinking a cut has occurred. This is another case where motion vectors could come in useful, as a lot of movement in a frame is associated with a lot of motion vectors. These motion vectors can be used to compensate for the movement of the bus. Here is a history of the events that led up to the creation of this project:
- Develop a system to capture, compress, store and index up to 24 hours of TV broadcasts in digital format.
- An eight hour recording of television broadcasts was made in MPEG1 format. The eight hours was broken into twenty minute segments for easier handling.
- A baseline was created by manually going through the entire recording and labelling where each cut, fade, dissolve and wipe occurred. A note of the frame number and the time of each boundary was taken. The results of any program written to find these boundaries can then be compared to the baseline in order to determine its accuracy.
- A program was written using edge detection to find the shot boundaries, but it was found that a lot of motion in a frame caused the program to falsely detect a cut. The use of motion vectors to compensate for the motion should rectify this. It is hoped that the motion vectors can also be used to enhance the program's performance in detecting fades, dissolves and wipes. Another area where the motion vectors may be used is in the choice of key frame for a shot [choose a frame after a lot of action?].
[Figure 2.1 The layered structure of the MPEG bit stream: the sequence layer contains groups of pictures (GOPs), each GOP contains frames, each frame contains slices, each slice contains macroblocks, and each macroblock contains six blocks (Y0, Y1, Y2, Y3, Cb, Cr)]

Table 2.2 Function of each layer of the bit stream [2]

Layer                    | Function
Sequence layer           | One or more groups of pictures
Group of pictures (GOP)  | Random access into the sequence
Picture                  | Primary coding unit
Slice                    | Resynchronisation unit
Macroblock               | Motion compensation unit
Block                    | DCT unit
Firstly, each layer is briefly described, and then a more thorough description of the units in the layers is given.
1. The sequence layer contains general information about the video: the vertical and horizontal size of the frames, the height/width ratio, the picture rate, the VBV buffer size, and the default intra and non-intra quantizer tables.
2. In the group of pictures (GOP) layer, pictures are grouped together to support greater flexibility and efficiency in the encoder/decoder [2].
3. The frame layer (picture layer) is the primary coding unit. It contains information on the picture's position in the display order (pictures do not arrive in the same order as they are displayed), what type of picture it is (intra, predicted or bi-directionally predicted), and the precision and range of any motion vectors present in the frame.
4. The slice layer is important in the handling of errors. If the decoder comes across a corrupted slice, it skips it and goes straight to the start of the next slice.
5. The macroblock layer is the basic coding unit. It is within this unit that the motion vectors are stored; each macroblock may have one or more motion vectors associated with it.
6. The block layer is the smallest coding unit and contains information on the coefficients of the pixels.
[Figure 2.2 P frames use only forward prediction]
Bi-directionally predicted (B-type). These frames are encoded using a past (forward predicted) and a future (backward predicted) I or P frame as references, as illustrated in Figure 2.3 (a B-type frame is never used as a reference).
Figure 2.3 B frames use both forward and backward prediction

Each frame is divided up into arbitrarily sized slices. A slice may contain just one macroblock or all the macroblocks in the frame. As shown in Figure 2.4, a slice is not confined to a single row.

[Figure 2.4 A single frame divided up into slices]
Figure 2.5 Only one set of chrominance components is needed for every four luminance components.

A block is an 8 by 8 pixel area and is the smallest unit in the MPEG stream. It contains the Discrete Cosine Transform (DCT) coefficients of the luminance and chrominance components [3]. Six blocks are needed to make up a macroblock (16 pixels by 16 pixels): four for the luminance components, but only one for each of the two chrominance components due to their compression. Figure 2.6 shows the blocks of a macroblock and their numbering convention.
[Figure 2.6 Structure of a macroblock: the four luminance blocks (0-3) arranged two by two, followed by the Cb and Cr blocks]
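The block arithmetic described above can be sketched in a few lines. This is an illustrative helper, not code from the report's decoder (the class and method names are my own); it maps a frame's pixel dimensions to the macroblock grid, using the same round-up-to-16 convention the decoder applies later, and counts six 8x8 blocks per macroblock (Y0-Y3, Cb, Cr).

```java
// Sketch (hypothetical helper, not from the decoder): compute the
// macroblock grid and total number of 8x8 blocks for a frame.
public class MacroblockLayout {
    static int mbWidth(int width)   { return (width  + 15) / 16; }
    static int mbHeight(int height) { return (height + 15) / 16; }

    static int totalBlocks(int width, int height) {
        // Six 8x8 blocks per 16x16 macroblock: Y0..Y3, Cb, Cr.
        return mbWidth(width) * mbHeight(height) * 6;
    }

    public static void main(String[] args) {
        // A 352x288 (CIF) frame gives a 22x18 macroblock grid.
        System.out.println(mbWidth(352) + "x" + mbHeight(288)
                + " macroblocks, " + totalBlocks(352, 288) + " blocks");
    }
}
```

For a 352x288 frame this gives 22x18 = 396 macroblocks, i.e. 2376 blocks.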
Here is the meaning of the abbreviations used in the tables above:
VLC - variable length code
MF - motion forward
MB - motion backward
pred - predictive
m - motion compensated
c - at least one block in the macroblock is coded and transmitted
d - default quantizer is used
q - quantizer scale is changed
i - interpolated (a combination of forward prediction and backward prediction)
b - backward prediction
f - forward prediction
compared between frames, and instead of encoding the whole macroblock again, the difference between the two macroblocks is encoded and transmitted. Figure 2.7 demonstrates how forward motion compensation is achieved (backward compensation is done in the same way, except a future frame in the display order is used as the reference frame).

[Figure 2.7 A forward predicted motion vector: macroblock x in the P or B frame, with macroblock y and the search area in the I or P reference frame]

Macroblock x is the macroblock we wish to encode; macroblock y is its counterpart in the reference frame. A search is done around y to find the best match for x. This search is limited to a finite area; even if there is a perfectly matching macroblock outside the search area, it will not be used. The displacement between the two macroblocks gives the motion vector associated with x. There are many search algorithms for finding the best matching macroblock. A full search gives the best match but is computationally expensive. Alternatives are the logarithmic search, the one-at-a-time search, the three-step search and the hierarchical search [3]. The choice of search is decided by the encoder, with the usual trade-off between time and accuracy.
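As an illustration of the full search mentioned above, here is a minimal sketch (an assumption of mine, not the report's or any particular encoder's code): it tries every displacement within a search range and keeps the one that minimises the sum of absolute differences (SAD) between the 16x16 macroblock and the reference frame.

```java
// Illustrative full-search block matching over luminance planes
// indexed [row][col]; returns the (right, down) displacement with
// the smallest SAD. Candidates outside the frame are skipped.
public class FullSearch {
    static int[] bestVector(int[][] ref, int[][] cur,
                            int mbRow, int mbCol, int searchRange) {
        int bestSad = Integer.MAX_VALUE, bestDx = 0, bestDy = 0;
        for (int dy = -searchRange; dy <= searchRange; dy++) {
            for (int dx = -searchRange; dx <= searchRange; dx++) {
                int r = mbRow + dy, c = mbCol + dx;
                if (r < 0 || c < 0 || r + 16 > ref.length || c + 16 > ref[0].length)
                    continue; // candidate macroblock falls outside the frame
                int sad = 0;
                for (int i = 0; i < 16; i++)
                    for (int j = 0; j < 16; j++)
                        sad += Math.abs(cur[mbRow + i][mbCol + j] - ref[r + i][c + j]);
                if (sad < bestSad) { bestSad = sad; bestDx = dx; bestDy = dy; }
            }
        }
        return new int[] { bestDx, bestDy }; // (right, down) convention
    }
}
```

The faster searches listed above (logarithmic, three-step, hierarchical) evaluate only a subset of these candidate displacements, trading accuracy for speed.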
The code needed to decode these VLC values is given in the MPEG standard [2].
Table 2.6 Range of motion vectors and their modulus [2]

Forward or backward f-code | Half pel precision | Full pel precision
1                          | -8 to 7.5          | -16 to 15
2                          | -16 to 15.5        | -32 to 31
3                          | -32 to 31.5        | -64 to 63
4                          | -64 to 63.5        | -128 to 127
5                          | -128 to 127.5      | -256 to 255
6                          | -256 to 255.5      | -512 to 511
7                          | -512 to 511.5      | -1024 to 1023
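The pattern in Table 2.6 can be expressed compactly: each increment of the f-code doubles the representable displacement range. The sketch below is my interpretation of the table (not code from the decoder); in half pel precision the same integer range is interpreted in half-pel units, so the bounds in pels are half as large.

```java
// Sketch of the Table 2.6 ranges as a function of the f-code (1..7).
public class FCodeRange {
    // Bounds in full pel precision.
    static int fullPelMin(int fCode) { return -(16 << (fCode - 1)); }
    static int fullPelMax(int fCode) { return (16 << (fCode - 1)) - 1; }

    // The same integer range read as half pels, expressed in pels:
    // e.g. f-code 1 covers -16..15 half pels, i.e. -8 to 7.5 pels.
    static double halfPelMinPels(int fCode) { return fullPelMin(fCode) / 2.0; }
    static double halfPelMaxPels(int fCode) { return fullPelMax(fCode) / 2.0; }
}
```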
Table 2.7 VLC for the differential motion vectors (DMV) [2]

DMV | VLC code              DMV | VLC code
-16 | 0000 0011 001           1 | 010
-15 | 0000 0011 011           2 | 0010
-14 | 0000 0011 101           3 | 0001 0
-13 | 0000 0011 111           4 | 0000 110
-12 | 0000 0100 001           5 | 0000 1010
-11 | 0000 0100 011           6 | 0000 1000
-10 | 0000 0100 11            7 | 0000 0110
 -9 | 0000 0101 01            8 | 0000 0101 10
 -8 | 0000 0101 11            9 | 0000 0101 00
 -7 | 0000 0111              10 | 0000 0100 10
 -6 | 0000 1001              11 | 0000 0100 010
 -5 | 0000 1011              12 | 0000 0100 000
 -4 | 0000 111               13 | 0000 0011 110
 -3 | 0001 1                 14 | 0000 0011 100
 -2 | 0011                   15 | 0000 0011 010
 -1 | 011                    16 | 0000 0011 000
  0 | 1
Summary
In this chapter the MPEG standard was introduced and described: the layered structure of the bit stream was explained, the concept of a motion vector was illustrated, the difference between the bit stream order and the display order of the frames was explained and illustrated, and the different types of macroblock present in I, P and B frames were given. To find the motion from frame to frame, all of these factors have to be considered.
public class MPEG_video implements Runnable {

    MPEG_video() { }

    public void run() {
        mpeg_stream.next_start_code();
        do {
            Parse_sequence_header();
            do {
                Parse_group_of_pictures();
            } while (/* more groups of pictures */);
        } while (/* more sequence headers */);
    }

    private void Parse_sequence_header() { }

    private void Parse_group_of_pictures() {
        do {
            Parse_picture();
        } while (/* more pictures */);
    }

    private void Parse_picture() {
        do {
            Parse_slice();
        } while (/* more slices */);
    }

    private void Parse_slice() {
        do {
            Parse_macroblock();
        } while (/* more macroblocks */);
    }

    private void Parse_Block() { }
}
It is clear how the program first takes in the highest level layer and parses it. The program then extracts the information in a section of that layer and moves down to the next level. This process is repeated for all the layers. The motion vector information is contained in the macroblock layer. Once this information is extracted, it is passed to a method in motion_data called compute_motion_vector. To decode the motion vectors, compute_motion_vector uses another method in motion_data called motion_displacement. The code for these methods is given in Appendix A.
The two components of the vector are right_x and down_x. The conventional direction for the components is right and down; negative components represent left and up respectively. For this project it was decided to use half pixel precision for the vectors (recon_right_x and recon_down_x). The vector may not point at a particular pixel, but it is the true vector for that macroblock. The fact that the vector is not pointing at a pixel should not be an issue. If the motion vectors are used for the selection of the key frame in a shot, there is no need for the vector to point at a pixel. If the vectors are used to compensate for movement in a frame, edge detection (the process that will be using the vectors) blows up an area around each pixel when comparing the two frames [5]. By simply halving the extracted (half pixel precision) vector and using it for any motion compensation, the extra calculations needed to get the vector pointing at a pixel are eliminated. This will enhance the speed of the program. Any inaccuracies in the motion vector will be compensated for by edge detection's expansion of each pixel. Besides, edge detection is not an exact science.
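The halving described above can be sketched as follows (a hypothetical helper, not taken from the decoder): the half-pel bit is dropped with an arithmetic shift, matching the decoder's own right_x = recon_right_x >> 1. Note that >> in Java rounds toward negative infinity, so -7 half pels becomes -4 pels; given that edge detection expands each pixel anyway, this rounding is harmless here.

```java
// Sketch: convert half-pel vector components to full pels by
// dropping the half-pel bit (arithmetic shift, rounds toward -inf).
public class HalfPel {
    static int toFullPel(int reconHalfPel) {
        return reconHalfPel >> 1;
    }

    // Halve both components of a reconstructed (half pel) vector.
    static int[] toFullPelVector(int reconRight, int reconDown) {
        return new int[] { toFullPel(reconRight), toFullPel(reconDown) };
    }
}
```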
[Figure 3.1 Converting from bit stream order to display order: the vectors of each P frame are held in future until that frame's turn in the display order comes]
Array name | Function of array
futureRight, futureDown | Store the motion vectors of a P frame until it is the P frame's turn in the display order
presentForwardRight, presentForwardDown, presentBackwardRight, presentBackwardDown | Store the motion vectors of the present frame in the display order
pastForwardRight, pastForwardDown, pastBackwardRight, pastBackwardDown | Store the motion vectors of the previous frame in the display order
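The hand-over between these arrays can be sketched as below. The decoder's real class is called Array and is not listed in full in the report; this simplified version is my own (the fields and method bodies are assumptions, and only one pair of right/down arrays per frame is kept) but the method names follow those used by the modified decoder: the present frame becomes the past frame before each new picture, and on an I or P frame the stored P-frame vectors take their turn in the display order.

```java
// Simplified sketch of the past/present/future vector bookkeeping.
public class VectorStore {
    final int rows, cols;
    int[][] futureRight, futureDown;
    int[][] presentRight, presentDown;
    int[][] pastRight, pastDown;

    VectorStore(int mbHeight, int mbWidth) {
        rows = mbHeight; cols = mbWidth;
        futureRight  = new int[rows][cols]; futureDown  = new int[rows][cols];
        presentRight = new int[rows][cols]; presentDown = new int[rows][cols];
        pastRight    = new int[rows][cols]; pastDown    = new int[rows][cols];
    }

    // Before each new picture: the present frame becomes the past frame.
    void pastEqualsPresent() {
        for (int r = 0; r < rows; r++) {
            pastRight[r] = presentRight[r].clone();
            pastDown[r]  = presentDown[r].clone();
        }
    }

    // On an I or P frame: the stored P-frame vectors become present.
    void futureEqualsPresent() {
        for (int r = 0; r < rows; r++) {
            presentRight[r] = futureRight[r].clone();
            presentDown[r]  = futureDown[r].clone();
        }
    }

    void resetFuture()  { futureRight  = new int[rows][cols]; futureDown  = new int[rows][cols]; }
    void resetPresent() { presentRight = new int[rows][cols]; presentDown = new int[rows][cols]; }
}
```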
[Figure 3.2 Diagram of where the motion vectors for the different frames are stored: as each frame comes in, its type (I, P or B) determines where its vectors go and whether future is reset]
[Figure 3.3 Flow chart of the operational program: as each frame comes in, present is reset; if the frame is I or P type, the vectors in future are moved into present and future is reset]
To bring in these changes it was decided it would be best to create a new class. This was done for a few reasons:
1. MPEG_video.java is a large file; it seemed unsuitable to make it any bigger.
2. Even though MPEG_video is very large, there is a logical flow to it. The bit stream is decoded from top to bottom; introducing new code would only disturb this natural flow and leave the program difficult to read.
3. At some time in the future the MPEG2 standard may be used instead of the MPEG1 standard that is being used at the moment. However, most of the code developed for this project may still be relevant. Having the code developed in a single class will make the transition from MPEG1 to MPEG2 easier.
Summary
A program has been developed which extracts the motion vectors from the bit stream. These vectors are stored in a fashion that allows the motion from frame to frame to be easily calculated. However, additional information is needed to calculate this motion; the reasons why are explained in the next chapter. Note that the source code for the decoder has not been minimised: the code used to calculate the Inverse Discrete Cosine Transform, and the code used to display the picture, could be deleted.
In Figure 4.1, one arrow style represents a forward vector and the other a backward vector. (Note that a forward vector does not have to point forward, nor a backward vector backward; the names are just the convention for whether the reference frame is in the past (forward) or the future (backward).) The values for the vectors are given below:
In the first frame there are no motion vectors.
Frame 2: forwardRight = 2; forwardDown = -3; (2, -3); backwardRight = -7; backwardDown = 4; (-7, 4)
Frame 3: forward = (4, -6); backward = (-5, 1)
Frame 4: forward = (7, -7); backward = (-2, 0)
Frame 5: forward = (9, -7)

Transition 1
To find the motion in the transition from frame one to frame two, we can only use the forward vector; the backward vector has no reference in the I frame. The motion is just (2, -3).

Transition 2
Here, both the forward and backward vectors can be used, as both forward vectors have the same reference point and both backward vectors have the same reference point.
presentForward - pastForward = forward motion: (4, -6) - (2, -3) = (2, -3)
presentBackward - pastBackward = backward motion: (-5, 1) - (-7, 4) = (2, -3)
To find the total motion, average the two results:
motionRight = (2 + 2)/2 = 2
motionDown = (-3 + -3)/2 = -3
Total motion = (2, -3)
Note that in this example the forward motion always equals the backward motion, but this is not usually the case in real video.

Transition 3
forward: (7, -7) - (4, -6) = (3, -1)
backward: (-2, 0) - (-5, 1) = (3, -1)
Total motion: (3, -1)

Transition 4
Both the forward and backward vectors can be used here: both forward vectors are referenced to the same point, and the B frame's backward vector is referenced to the P frame. The P frame is said to have a zero backward vector.
forward: (9, -7) - (7, -7) = (2, 0)
backward: (0, 0) - (-2, 0) = (2, 0)
Total motion: (2, 0)

The motion for the sequence is: (2, -3), (2, -3), (3, -1), (2, 0)
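The arithmetic worked through above can be sketched as a small helper (a hypothetical class of mine, not the report's program): subtract the past frame's vector from the present frame's, component by component, and average the forward and backward estimates when both exist.

```java
// Sketch of the frame-to-frame motion arithmetic; vectors are
// (right, down) pairs as in the report's convention.
public class TransitionMotion {
    static int[] subtract(int[] present, int[] past) {
        return new int[] { present[0] - past[0], present[1] - past[1] };
    }

    static double[] averageMotion(int[] forward, int[] backward) {
        return new double[] { (forward[0] + backward[0]) / 2.0,
                              (forward[1] + backward[1]) / 2.0 };
    }

    public static void main(String[] args) {
        // Transition 2 from the example above: frames 2 and 3.
        int[] fwd = subtract(new int[]{4, -6}, new int[]{2, -3});  // (2, -3)
        int[] bwd = subtract(new int[]{-5, 1}, new int[]{-7, 4});  // (2, -3)
        double[] motion = averageMotion(fwd, bwd);                 // (2.0, -3.0)
        System.out.println(motion[0] + ", " + motion[1]);
    }
}
```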
Table 4.1 Vector types that can be used in the transition from frame to frame

Past frame | Present frame | Vector types that can be subtracted
I          | B or P        | forward only
I          | I             | none
P          | B or P        | forward only
P          | I             | none
B          | B or P        | forward and backward
B          | I             | backward only
I frame to B or P frame: only the forward motion vectors can be used. The P frame will only have forward vectors, and the B frame's backward vectors can't be used as they have no reference in the I frame.
I frame to I frame: there are no vectors present in either frame.
P frame to P or B frame: none of the backward vectors in the B frame have a reference in the P frame; therefore only forward vectors can be used.
P frame to I frame: the forward vectors in the P frame do not have a reference in the I frame. No motion can be found.
B frame to B or P frame: both forward and backward vectors can be used, as both have the same reference point from frame to frame.
B frame to I frame: only the backward vectors are referenced in the I frame.
In this example the transition from frame 1 to frame 2 can be calculated as before. If the second transition is calculated as before, we get:
forward motion: (4, -6) - (2, -3) = (2, -3)
backward motion: (-5, 1) - (0, 0) = (-5, 1)
Total motion = (-1.5, -1)
This result is incorrect. To get the correct result, only the forward motion can be used. Similarly, only the backward motion is used for the third transition. The motion for the final transition cannot be found
because there is only a backward vector in frame 4 and only a forward vector in frame 5. Only similar types of vector can be subtracted from each other. Below are further rules to complement those established in Table 4.1:
- Only if a similar type of vector (forward, backward or both) is present in both frames can the motion be found.
- A reference frame is said to have all vectors equal to (0, 0).
- If there is a skipped macroblock in the present frame, there is zero motion for that transition.
- If there is a skipped macroblock in the previous frame, the motion for that transition can't be calculated. An exception is if there is also a skipped macroblock in the present frame, in which case the motion will be zero.
- If there is an intra macroblock in either the present or previous frame, the motion for that transition can't be calculated.
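The rules above can be encoded as a small lookup, shown below as a sketch (the class, enum and method names are my assumptions, not taken from the program): one method captures Table 4.1 at the frame level, the other the skipped and intra macroblock cases.

```java
// Sketch of the frame-level and macroblock-level motion rules.
public class MotionRules {
    enum FrameType { I, P, B }
    enum Usable { NONE, FORWARD_ONLY, BACKWARD_ONLY, BOTH }

    // Table 4.1: which vector types may be subtracted for a
    // past -> present transition.
    static Usable usableVectors(FrameType past, FrameType present) {
        if (present == FrameType.I) {
            return past == FrameType.B ? Usable.BACKWARD_ONLY : Usable.NONE;
        }
        // present is P or B
        switch (past) {
            case I:
            case P:  return Usable.FORWARD_ONLY;
            default: return Usable.BOTH; // past is B
        }
    }

    // Macroblock-level rules: returns null when the motion can't be
    // found, (0, 0) when it is defined to be zero, otherwise the
    // already-computed vector difference.
    static int[] macroblockMotion(boolean pastSkipped, boolean presentSkipped,
                                  boolean pastIntra, boolean presentIntra,
                                  int[] difference) {
        if (pastIntra || presentIntra) return null;   // intra: no motion defined
        if (presentSkipped) return new int[]{0, 0};   // skipped now: zero motion
        if (pastSkipped) return null;                 // skipped before: unknown
        return difference;
    }
}
```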
Summary
In this chapter the extra information needed to find the motion from frame to frame was described, and a set of rules was established for finding the motion. Note that this set of rules is not rigid; by keeping track of other information, more vectors can be used. For example, if a record is kept of the vector for the macroblock before a skipped macroblock, the motion in the transition between that skipped macroblock (or the final skipped macroblock in a series of skipped macroblocks) and a non-intra macroblock can also be calculated. However, this would only further complicate the program. As a starting point, the rules created in this chapter should be sufficient; if the program does not perform satisfactorily, this extra motion can be calculated.
Conclusion
This project set out to extract the motion vectors from an MPEG stream. This information was to be used to calculate the motion of all objects from one frame to another. The first step of the project was to choose an MPEG1 decoder to extract and decode the motion vectors. The choice came down to a Java decoder and a C decoder. Two issues had to be taken into account when choosing the decoder: how fast the decoder could run, and how easily it could be modified. The Java decoder was chosen because, although the MPEG bit stream is quite complicated, it is very well structured. Java's superior ability to deal with the complexity of the bit stream in an easy to follow manner outweighed the C decoder's superior processing time. Using the decoder, the motion vectors were extracted and decoded. The decoder was modified to allow the subtraction of all the motion vectors in the present frame (display order) from all the motion vectors in the previous frame (display order). All the modifications were put in a separate class, which means minimal alterations to the decoder's well structured code. The creation of a separate class for all the new code is important because at some time in the future the MPEG2 standard may be used instead of the MPEG1 standard (the standard we are using at the moment); all the relevant code developed for the MPEG1 standard can then be easily taken and used for the MPEG2 standard. On completion of the program, it was realised that finding the motion from one frame to another is not a simple matter of subtracting all the vectors in the present frame from all the vectors in the previous frame: a set of rules has to be followed. The rules were developed in two stages. First, a general set of rules was written that only takes into account what type of frame (I, P or B) the vectors are in. Then, at a lower level, the macroblock types present in the frames were taken into consideration and a comprehensive set of rules was written. These rules give the true motion from frame to frame. The next step in this project is to incorporate the rules into the program. Finally, to enhance the program's performance, some of the decoder's source code can be deleted: the code which deals with decoding the pixel coefficients is irrelevant, and the code used to display the video can be omitted. To conclude, on accomplishing the task presented in this project (extracting the motion vectors from the MPEG stream), it was discovered that more information is needed in order to achieve the ultimate goal of finding the motion of objects from one frame to another. This extra information has been identified, and a description given of how to use it to find the motion from frame to frame.
References
[1] http://www.compapp.dcu.ie/~asmeaton/Video-Proj-summary.html
[2] ISO/IEC 11172-2, Geneva, 1993.
[3] K.R. Rao and J.J. Hwang, Techniques & Standards for Image, Video & Audio Coding, Prentice Hall PTR, New Jersey, 1996.
[4] http://rnvs.informatik.tu-chemnitz.de/~ja/MPEG/MPEG_Play.html
[5] Aidan Totterdell, An Algorithm for Detecting and Classifying Scene Breaks in an MPEG1 Video Bit Stream, Dublin City University, 1998.
Appendix A
Code for the two methods, compute_motion_vector and motion_displacement [4]:

private int motion_displacement(int motion_code, int PMD, int motion_r) {
    int dMD, MD;
    if (x_ward_f == 1 || motion_code == 0) {
        dMD = motion_code;
    } else {
        dMD = 1 + x_ward_f * (Math.abs(motion_code) - 1);
        dMD += motion_r;
        if (motion_code < 0)
            dMD = -dMD;
    }
    MD = PMD + dMD;
    if (MD > max)
        MD -= range;
    else if (MD < min)
        MD += range;
    return MD;
}

public void compute_motion_vector(int motion_horiz_x_code, int motion_verti_x_code,
                                  int motion_horiz_x_r, int motion_verti_x_r) {
    recon_right_x_prev = recon_right_x =
        motion_displacement(motion_horiz_x_code, recon_right_x_prev, motion_horiz_x_r);
    if (Full_pel_x_vector)
        recon_right_x <<= 1;
    recon_down_x_prev = recon_down_x =
        motion_displacement(motion_verti_x_code, recon_down_x_prev, motion_verti_x_r);
    if (Full_pel_x_vector)
        recon_down_x <<= 1;
    right_x = recon_right_x >> 1;
    down_x = recon_down_x >> 1;
    right_half_x = (recon_right_x & 0x1) != 0;
    down_half_x = (recon_down_x & 0x1) != 0;
    right_x_col = recon_right_x >> 2;
    down_x_col = recon_down_x >> 2;
    right_half_x_col = (recon_right_x & 0x2) != 0;
    down_half_x_col = (recon_down_x & 0x2) != 0;
}
MPEG_video
/*This is a skeleton structure of MPEG_video, just to document some of /*the things that have been added in. Once the resolution of the video /*is known Arrays is called and the size of all the arrays can be set. /*If the frame is I or P type the future vectors will become the present /*vectors in display order, if the frame is P type any vectors present /*in the frame are stored in future until its turn in display order comes /*(when another I or P frame comes in) /* When compute_motion_vector is called some added information is /*passed to it, the macroblockes address (row and column), what type of /*frame it is if compute_motion_vector is to calculate forward motion /*vectors, if it is to calculate backward motion vectors the arbitrary /*value 4 (don't confuse this 4 with a D_type frame) is passed, just /*to indicate the vectors are backward. import java.io.InputStream; import java.applet.Applet; public class MPEG_video implements Runnable { private Array VideoArray = new Array(); MPEG_video () { } public void run() { mpeg_stream.next_start_code(); do { Parse_sequence_header(); } do { Parse_group_of_pictures(); } } } private void Parse_sequence_header() { Width = mpeg_stream.get_bits(12); Height = mpeg_stream.get_bits(12); mb_width = (Width + 15) / 16; mb_height = (Height + 15) / 16; VideoArray.setDimensions(mb_height, mb_width); } private void Parse_group_of_pictures() { do { VideoArray.pastEqualsPresent();//Store vectors of previos frame. VideoArray.resetPresent();();// All Vectors are reset for the new frame Parse_picture(); VideoArray.printArray(1);// Optional } } private void Parse_picture () { if (Pic_Type == P_TYPE || Pic_Type == I_TYPE) { VideoArray.futureEqualsPresent(); // Take what is in future and put in present VideoArray.resetFuture(); // Reset future for new values } do { Parse_slice(); */ */ */ */ */ */ */ */ */ */ */ */ */
        }
    }

    private void Parse_slice() {
        do {
            Parse_macroblock();
        }
    }

    private void Parse_macroblock() {
        if (macro_block_motion_forward) {
            Forward.compute_motion_vector(motion_horiz_forw_code, motion_verti_forw_code,
                                          motion_horiz_forw_r, motion_verti_forw_r,
                                          mb_row, mb_column, Pic_Type);
        }
        if (macro_block_motion_backward) {
            // motion vector for backward prediction exists
            b = 4;
            Backward.compute_motion_vector(motion_horiz_back_code, motion_verti_back_code,
                                           motion_horiz_back_r, motion_verti_back_r,
                                           mb_row, mb_column, b);
        }
    }
}
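The comments in the skeleton describe holding an I or P frame's vectors in "future" until the next I or P frame arrives, because B frames that follow it in the bit stream precede it in display order. The small self-contained sketch below simulates that reordering on frame labels only; the class name, method name and frame labels are all hypothetical, chosen just to make the buffering visible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of bit-stream order vs display order:
// an I or P frame is held back until the next I or P frame arrives,
// while B frames are emitted immediately.
public class ReorderDemo {
    public static List<String> displayOrder(String[] bitstreamOrder) {
        List<String> display = new ArrayList<>();
        String future = null;                        // the pending I/P frame
        for (String frame : bitstreamOrder) {
            if (frame.startsWith("I") || frame.startsWith("P")) {
                if (future != null) display.add(future); // its turn in display order
                future = frame;                      // hold until the next I/P frame
            } else {
                display.add(frame);                  // B frames display at once
            }
        }
        if (future != null) display.add(future);     // flush the last held frame
        return display;
    }

    public static void main(String[] args) {
        // Bit-stream order: I1 P4 B2 B3 P7 B5 B6
        String[] bitstream = {"I1", "P4", "B2", "B3", "P7", "B5", "B6"};
        System.out.println(displayOrder(bitstream));
        // Display order: [I1, B2, B3, P4, B5, B6, P7]
    }
}
```

This is the same pattern the skeleton implements with futureEqualsPresent() and resetFuture() in Parse_picture.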
motion_data
/* This is a skeleton of the class motion_data; there is very little added to it. */
/* In the method compute_motion_vector some extra information is passed, as was   */
/* documented in MPEG_video. All this extra information is passed straight to     */
/* Array along with the values of the motion vectors (in half pixels).            */

public class motion_data {

    private Array MotionArray = new Array(); // Create instance of the class Array

    public void init() {
    }

    public void set_pic_data() {
    }

    public void reset_prev() {
    }

    /* The internal method "motion_displacement" computes the difference of the */
    /* actual motion vector with respect to the last motion vector. Refer to    */
    /* ISO 11172-2 to understand the coding of the motion displacement.         */
    private int motion_displacement(int motion_code, int PMD, int motion_r) {
        int dMD, MD;

        if (x_ward_f == 1 || motion_code == 0) {
            dMD = motion_code;
        } else {
            dMD = 1 + x_ward_f * (Math.abs(motion_code) - 1);
            dMD += motion_r;
            if (motion_code < 0)
                dMD = -dMD;
        }
        MD = PMD + dMD;
        if (MD > max)
            MD -= range;
        else if (MD < min)
            MD += range;
        return MD;
    }

    /* The method "compute_motion_vector" computes the motion vector according to the */
    /* values supplied by the "ScanThread". It uses the method "motion_displacement". */
    /* The results are the motion vectors for the luminance and chrominance blocks.   */
    public void compute_motion_vector(int motion_horiz_x_code, int motion_verti_x_code,
                                      int motion_horiz_x_r, int motion_verti_x_r,
                                      int mr, int mc, int chooseArray) {
        recon_right_x_prev = recon_right_x =
            motion_displacement(motion_horiz_x_code, recon_right_x_prev, motion_horiz_x_r);
        if (Full_pel_x_vector)
            recon_right_x <<= 1;
        recon_down_x_prev = recon_down_x =
            motion_displacement(motion_verti_x_code, recon_down_x_prev, motion_verti_x_r);
        if (Full_pel_x_vector)
            recon_down_x <<= 1;

        /* The motion vectors (in half pixels) are sent to Array, along with */
        /* information on which array they are to go into.                   */
        MotionArray.fillArray(mr, mc, recon_right_x, recon_down_x, chooseArray);
    }

    public void get_area() {
    }

    public void copy_area() {
    }

    public void copy_unchanged() {
    }

    public void put_area() {
    }
}
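The shift-and-mask expressions at the end of compute_motion_vector (in the listing on page 26) split a reconstructed half-pel vector into an integer-pel part plus a half-pel flag for luminance, and the same pair at half resolution for chrominance. A small hypothetical illustration, with an invented helper name:

```java
// Hypothetical illustration of the half-pel decomposition performed on
// recon_right_x / recon_down_x after reconstruction.
public class HalfPelDemo {
    // Returns {pel, halfFlag, chromaPel, chromaHalfFlag} for a vector in half pels.
    public static int[] split(int recon) {
        int pel = recon >> 1;                       // integer pel part (luminance)
        int half = recon & 0x1;                     // half-pel flag (luminance)
        int pelCol = recon >> 2;                    // pel part at chrominance scale
        int halfCol = (recon & 0x2) != 0 ? 1 : 0;   // half-pel flag (chrominance)
        return new int[] { pel, half, pelCol, halfCol };
    }

    public static void main(String[] args) {
        // 7 half pels = 3.5 pels of luminance motion:
        // luma pel 3 with half-pel flag set, chroma pel 1 with half-pel flag set.
        int[] parts = split(7);
        System.out.println(parts[0] + " " + parts[1] + " " + parts[2] + " " + parts[3]); // 3 1 1 1
    }
}
```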
Array
/* The class Array is used for the storage of the motion vectors.          */
/* Two instances of the class Array will be created: one in the class      */
/* MPEG_video, called VideoArray. This instance will be used first to      */
/* set the size of the arrays, depending on the resolution of the video    */
/* clip. This instance will also pass information regarding which array    */
/* the motion vectors should be in (past or present).                      */
/* The second instance, created in the class motion_data, is called     */
/* MotionArray. This instance passes the values of the motion vectors   */
/* to the arrays, along with information regarding which array they go  */
/* into (futureRight, futureDown, presentForwardRight,                  */
/* presentForwardDown, presentBackwardRight or presentBackwardDown).    */

class Array {

    public Array() {
    }

    /* All arrays are declared static because we want both instances */
    /* of the class Array to be able to see them.                    */
    static public int[][] futureRight;
    static public int[][] futureDown;
    static public int[][] presentForwardRight;
    static public int[][] presentForwardDown;
    static public int[][] presentBackwardRight;
    static public int[][] presentBackwardDown;
    static public int[][] pastForwardRight;
    static public int[][] pastForwardDown;
    static public int[][] pastBackwardRight;
    static public int[][] pastBackwardDown;
    /* Sets the dimensions of all the arrays */
    public void setDimensions(int mb_h, int mb_w) {
        futureRight = new int[mb_h][mb_w];
        futureDown = new int[mb_h][mb_w];
        presentForwardRight = new int[mb_h][mb_w];
        presentForwardDown = new int[mb_h][mb_w];
        presentBackwardRight = new int[mb_h][mb_w];
        presentBackwardDown = new int[mb_h][mb_w];
        pastForwardRight = new int[mb_h][mb_w];
        pastForwardDown = new int[mb_h][mb_w];
        pastBackwardRight = new int[mb_h][mb_w];
        pastBackwardDown = new int[mb_h][mb_w];
    }

    /* fillArray takes the values from the method compute_motion_vector */
    /* in the class motion_data and puts them in the appropriate array.  */
    /* Note: all values are in half pixels.                              */
    public void fillArray(int mr, int mc, int right, int down, int chooseArray) {
        if (chooseArray == 2) {
            futureRight[mr][mc] = right;
            futureDown[mr][mc] = down;
        }
        if (chooseArray == 3) {
            presentForwardRight[mr][mc] = right;
            presentForwardDown[mr][mc] = down;
        }
        if (chooseArray == 4) {
            presentBackwardRight[mr][mc] = right;
            presentBackwardDown[mr][mc] = down;
        }
    }

    /* This method is only used to print out the values! */
    public void printArray(int printWhich) {
        if (printWhich == 1) {
            for (int j = 0; j < futureDown.length; j++) {
                for (int i = 0; i < futureDown[j].length; i++) {
                    System.out.print("" + pastBackwardRight[j][i] + "\t");
                }
                System.out.print("\n");
            }
            System.out.print("\n");
        }
    }

    /* As each new frame comes in, all values have first to be set to zero */
    public void resetPresent() {
        for (int j = 0; j < futureDown.length; j++) {
            for (int i = 0; i < futureDown[j].length; i++) {
                presentForwardRight[j][i] = 0;
                presentForwardDown[j][i] = 0;
                presentBackwardRight[j][i] = 0;
                presentBackwardDown[j][i] = 0;
            }
        }
    }

    /* When an I or P picture comes in, we have to take all the motion vectors */
    /* that are in future and put them in present.                             */
    public void futureEqualsPresent() {
        for (int j = 0; j < futureDown.length; j++) {
            for (int i = 0; i < futureDown[j].length; i++) {
                presentForwardRight[j][i] = futureRight[j][i];
                presentForwardDown[j][i] = futureDown[j][i];
            }
        }
    }

    /* After all the values are taken out of future, future has to be reset */
    /* before any more values can be put in.                                */
    public void resetFuture() {
        for (int j = 0; j < futureDown.length; j++) {
            for (int i = 0; i < futureDown[j].length; i++) {
                futureRight[j][i] = 0;
                futureDown[j][i] = 0;
            }
        }
    }

    /* We need to store all the motion vectors from the previous frame */
    /* so the net movement from frame to frame can be calculated.      */
    public void pastEqualsPresent() {
        for (int j = 0; j < futureDown.length; j++) {
            for (int i = 0; i < futureDown[j].length; i++) {
                pastForwardRight[j][i] = presentForwardRight[j][i];
                pastForwardDown[j][i] = presentForwardDown[j][i];
                pastBackwardRight[j][i] = presentBackwardRight[j][i];
                pastBackwardDown[j][i] = presentBackwardDown[j][i];
            }
        }
    }
}
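The comment on pastEqualsPresent says the previous frame's vectors are kept so the net movement from frame to frame can be calculated. Chapter 4 explains that the vectors alone are not sufficient for this, so the sketch below is only a naive illustration of one ingredient, not the report's actual method: per-macroblock differencing of the present and past arrays. The class and method names are hypothetical.

```java
// Hypothetical illustration: difference the present and past vector
// arrays (values in half pixels, as stored by fillArray) macroblock by
// macroblock. This is one ingredient of the frame-to-frame calculation,
// not the complete method described in Chapter 4.
public class NetMotionDemo {
    public static int[][] netMotion(int[][] present, int[][] past) {
        int[][] net = new int[present.length][present[0].length];
        for (int r = 0; r < present.length; r++)
            for (int c = 0; c < present[r].length; c++)
                net[r][c] = present[r][c] - past[r][c];  // change since last frame
        return net;
    }

    public static void main(String[] args) {
        // A 2x2-macroblock example of the horizontal ("right") components.
        int[][] past    = {{2, 0}, {4, -2}};
        int[][] present = {{6, 0}, {4, 2}};
        int[][] net = netMotion(present, past);
        System.out.println(net[0][0] + " " + net[0][1] + " "
                         + net[1][0] + " " + net[1][1]);  // 4 0 0 4
    }
}
```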