
Electronic component: VP2611CGGH1R

VP2611
H.261 Encoder
Supersedes June 1996 edition, DS3487 - 4.0
DS3487 - 4.1 December 1998
DESCRIPTION
The VP2611 Video Compression Source Coder forms part
of a chip set used in video conferencing, video telephony and
multimedia applications. It produces data which conforms to
the H.261 standard for video compression with rates between
64K and 2M bits per second. With a 27 MHz clock the device
will accept data produced to full CIF resolution at 30 Hz frame
rates. The pipeline latency through the device is only 3
macroblock periods.
The VP2611 contains all the elements necessary for the
compression algorithm. It incorporates a Motion Vector
Estimator which performs a +/- 7 pixel search. The decision to use
inter or intra frame compression is made by the device, and the
selected data blocks are read from the frame store. New or
difference data is then passed through a Discrete Cosine
Transformer and quantized. Data from the quantizer is also
inverse quantized and passed through an Inverse Discrete
Cosine Transformer. This reconstructed data is then written
to the frame store for use in the next frame period. This frame
store is managed by an internal DRAM controller, and no
external logic is needed.
The input data must be in YUV space, and must also
conform to the six sub blocks per macroblock format defined
by H.261. Any conversion from RGB format is performed by
the VP510 Colour Space Converter. Any reduction in spatial
resolution, down to CIF or QCIF requirements, is done by the
VP520 Three Channel Video Filter.
The quantized data is zig-zag scanned and run length
coded before being output, together with block information
and motion vectors.
FEATURES
- Fully integrated H.261 video encoder
- Up to full CIF resolution and 30 Hz frame rates
- Inputs YUV data in 8 x 8 sub block format
- Outputs run length coded coefficients
- On chip motion vector estimator with +/-7 pixel search window
- Addresses and control generated internally for DRAM frame store
- QFP package
ASSOCIATED PRODUCTS
- VP510 Colour Space Converter
- VP520S CIF/QCIF Converter
- VP2612 Video Multiplexer
- VP2614 Video Demultiplexer
- VP2615 H.261 Decoder
Fig 1 : Typical Video Conferencing Transmission System
[Block diagram not reproduced. Signal chain shown: NTSC/PAL composite
video decoder (R, G, B, video sync) -> VP510 Colour Space Converter ->
VP520 3 Channel Video Filter with 16 x 128K CIF frame store ->
VP2611 Integrated Video Encoder with 16 x 128K CIF frame store ->
VP2612 Video Multiplexer with 32K x 8 TX buffer -> H.261 bit stream,
64kb/s to 2Mb/s. A system controller drives the user interface of the
devices. Signals between the VP520 and the VP2611: FRMIN, REQYUV, Y,
Cr/Cb. Signals between the VP2611 and the VP2612: RLC DATA, FLAGS.]
CCIR601 RESOLUTION
Y 720 x 288, Cr/Cb 360 x 288 (PAL)
Y 720 x 240, Cr/Cb 360 x 240 (NTSC)
CIF RESOLUTION
Y 352 x 288, Cr/Cb 176 x 144
PIN DESCRIPTIONS
YUV7:0
This input bus accepts YUV data one pixel at a
time from the preprocessor, clocked in on the
rising edge of PCLK.
PCLK
This signal is used to strobe in data at the YUV
port and must be derived by dividing SYSCLK
with an integer greater than one.
FRMIN
This input should be pulled high to prepare the
VP2611 to code a new frame. It must be held
high for at least one SYSCLK cycle and then
must be pulled low again before the next frame
begins. The VP2611 will respond to the rising
edge of FRMIN by asserting REQYUV
approximately 184 SYSCLK cycles later.
REQYUV
This output is pulled high to request that YUV
data be input for a new MacroBlock. It is pulled
low again 1871 SYSCLK cycles later. It remains
low during Dummy MacroBlocks and
during the idle period between frames.
DBUS7:0
This output bus serves several functions as
defined by DMODE3:0. In addition to providing
the quantized coefficients and motion vectors,
it is used to output control information.
DMODE3:0
Output flag port for DBUS7:0 bus. The value at
this port identifies the data type appearing on
DBUS7:0 during the same period.
DCLK
This output pulses high for a minimum of 37ns
each time new data is output on DBUS or
DMODE. It can be used as an edge sensitive
strobe signal or a level sensitive "valid" signal.
SW15:0
This bidirectional port is connected to the
frame store.
RAS
Row Address Strobe output for the external
DRAMs.
CAS
Column Address Strobe output for the external
DRAMs.
Fig 2 : Simplified Block Diagram
R/W1
Read/Write control for external DRAM 1.
R/W2
Read/Write control for external DRAM 2.
N/C if 256k DRAMs.
OE1
Output Enable control for external DRAM 1
or ADR8.
OE2
Output Enable control for external DRAM 2.
N/C if 256k DRAMs.
ADR7:0
Address output for the external DRAMs.
CBUS7:0
Bi-directional data bus for use by a Microprocessor.
Data and instructions are clocked on and
off the chip on the rising edge of CSTR.
CSTR
Data strobe for the CBUS port.
CEN
An enabling signal for the CBUS port.
CADR
When high, this signal defines CBUS as a data
bus, and when low as an instruction input.
SYSCLK
System clock, run at 27MHz maximum. The
clock must be high for between 35% and 65%
of each clock cycle. This clock is used for all
internal operations.
RESET
Active low power on reset which must be held
low for at least 2064 cycles.
TCK
Test clock for JTAG.
TMS
Test Mode Select for JTAG.
TDI
Input JTAG test data.
TDO
Output JTAG test data.
TRST
Reset JTAG controller (active low).
NOTE:
"Barred" active low signals do not appear with a bar in the
main body of the text.
[Fig 2 block diagram not reproduced. Blocks shown: YUV block format
input; Motion Vector Estimator (search window in, motion vectors out);
Inter/Intra Decision Processor; Low Pass Filter acting on the predicted
block; forward path SUB -> DCT -> Q -> Zig Zag -> RLC; reconstruction
path IQ -> IDCT -> ADD; Frame Store Interface (data, address, control);
Control Logic supplying Q Step, Force Intra, Force Filter and Block
Info; host data and control bus; output data bus and flags.]
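[Fig 3 plot not reproduced. Horizontal axis: Zero Movement Absolute
Error in Hex. Vertical axis: Minimum Mean Absolute Error in Hex. Both
axes are marked at 40, 80, C0, 100, 140 and 180. The line x = 1.125y
separates the MC On region from the MC Off region. The values 20, 5F
and AB are also marked on the plot.]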
Table 1 : External DRAM timing requirements
SYMBOL   PARAMETER                  MINIMUM          MAXIMUM
t RAC    Access time from RAS       -                105ns or under
t CAC    Access time from CAS       -                25ns or under
t RP     RAS precharge time         50ns or under    -
t CP     CAS precharge time         15ns or under    -
t RAS    RAS pulse width            90ns or under    -
t CAS    CAS pulse width            50ns or under    -
t REF    Time to refresh 256 rows   -                0.25ms or over
N.B. All times are quoted assuming 27MHz operation. For lower clock
frequencies increase the above values proportionately.
Fig 3 : MC Decision Slope
OPERATION OF MAJOR BLOCKS
Motion Vector Estimator
The motion estimator calculates the mean absolute error
( MAE ) for each possible position of the combined luminance
block in a search window from the previous frame. The
combined luminance block consists of 16 x 16 pixels, and in
the search window this is displaced from -7 to +7 vertically,
and from -8 to +7 horizontally. The two lsb's of each pixel are
discarded and the MAE value is contained within 14 bits.
The minimum MAE value, representing the best match
between the previous and current block, is passed to the
motion compensation decision block, together with the
position of this best fit in the search window. The zero
displacement MAE value is also passed to this block, which then
decides whether the best fit is sufficiently better than the zero
displacement fit. It uses the characteristic shown in Figure 3,
where the 14 bit MAE is a Hex value. In the area to the right
of the line all points defined by the two MAE values will cause
motion compensation to be applied. In this case the best fit
MAE value is used by the inter/intra decision processor,
otherwise the zero displacement value is used.
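The search and the sloped part of the decision can be sketched in Python as follows. This is only an illustrative model, not the device's logic: the function names are hypothetical, candidate positions that fall outside the frame are simply skipped (an assumption), and only the x = 1.125y segment of Figure 3 is modelled, since the exact break points of the characteristic are not recoverable from the text. The error is accumulated as a sum of absolute differences over 256 six-bit pixels, which stays below 2^14 and so fits the 14 bit value described above.

```python
def mae_search(cur, prev, top, left):
    """Search prev for the best match to the 16x16 block of cur at (top, left).

    Returns (best_error, (dy, dx), zero_displacement_error).
    """
    height, width = len(prev), len(prev[0])

    def block_error(dy, dx):
        total = 0
        for r in range(16):
            for c in range(16):
                # The two LSBs of each pixel are discarded before comparison.
                a = cur[top + r][left + c] >> 2
                b = prev[top + dy + r][left + dx + c] >> 2
                total += abs(a - b)
        return total

    best_err, best_vec = None, None
    for dy in range(-7, 8):          # -7 .. +7 vertically
        for dx in range(-8, 8):      # -8 .. +7 horizontally
            inside = (0 <= top + dy and top + dy + 16 <= height and
                      0 <= left + dx and left + dx + 16 <= width)
            if not inside:
                continue             # assumption: skip out-of-frame candidates
            err = block_error(dy, dx)
            if best_err is None or err < best_err:
                best_err, best_vec = err, (dy, dx)
    return best_err, best_vec, block_error(0, 0)


def use_motion_compensation(zero_err, best_err):
    """Sloped segment only: MC is applied when zero_err > 1.125 * best_err."""
    return 8 * zero_err > 9 * best_err
```

The comparison is written with integers (8x > 9y) so that the 1.125 slope is evaluated exactly.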
Inter/Intra Decision Processor
The MAE value passed by the motion compensation
decision block is compared to the simplified variance of the
current block. This simplified variance is calculated by
summing the moduli of the differences between each luminance
pixel and the mean luminance value over the whole
macroblock. Eight bit pixels are used, and the variance value is
expressed in 14 bits by discarding the two lsb's from the actual
16 bit result. It can then be directly compared to the 14 bit MAE
value.
If the MAE value is below a user defined threshold inter
mode coding is always selected. The default threshold is 3, on
a scale from 0 to 255 using the 8 msb's from the 14 bit value.
Above this threshold inter mode is only selected if the variance
of the current block is greater than or equal to the MAE value
in use.
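As a rough illustration of the rule just described, the following Python sketch computes the simplified variance and applies the threshold test. The function names are hypothetical, and the use of an integer (truncated) mean is an assumption, since the text does not say how the mean is rounded.

```python
def simplified_variance(luma):
    """Sum of |pixel - mean| over the 256 luminance pixels, two LSBs dropped."""
    mean = sum(luma) // len(luma)      # assumption: truncated integer mean
    return sum(abs(p - mean) for p in luma) >> 2


def choose_inter(mae14, var14, threshold=3):
    """Return True for inter mode, False for intra mode.

    mae14 and var14 are 14 bit values. The threshold is compared against
    the 8 MSBs of the MAE, hence the shift by 6.
    """
    if (mae14 >> 6) < threshold:
        return True                    # small error: always inter
    return var14 >= mae14              # otherwise inter only if variance >= MAE
```

For example, `choose_inter(1000, 500)` selects intra mode, because the MAE is above the threshold and exceeds the variance.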
In order to avoid gradual picture degradation, every 61st
Macroblock input to the VP2611 is coded in intra mode
regardless of the above decision. As 61 is a prime number, this
will ensure that each macroblock will be transmitted in intra
mode at least once in every 61 transmissions. If FIX
MACROBLOCK or SKIP PICTURE is invoked this `Force Intra'
counter will be disabled.
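The refresh property can be checked with a short simulation. The sketch below assumes that the counter runs over real input macroblocks only (dummy macroblocks are not counted) and that it is never disabled. The function name is hypothetical. The argument relies on 61 sharing no factor with the number of macroblocks per frame (396 for CIF, 99 for QCIF).

```python
from math import gcd


def frames_to_refresh_all(mb_per_frame, period=61):
    """Count the frames needed until every macroblock position has been
    forced to intra mode at least once."""
    refreshed = set()
    count = 0
    while len(refreshed) < mb_per_frame:
        count += 1
        if count % period == 0:
            # Position of this macroblock within its frame.
            refreshed.add((count - 1) % mb_per_frame)
    return -(-count // mb_per_frame)   # ceiling division


cif_macroblocks = (352 // 16) * (288 // 16)    # 22 x 18 = 396
qcif_macroblocks = (176 // 16) * (144 // 16)   # 11 x 9 = 99
```

Under these assumptions both picture sizes are fully refreshed after 61 frames, so each position is forced to intra mode once in every 61 transmissions.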
The user may override the internal Inter/Intra decision at
any time using the CBUS control port. A user generated
forced inter mode will override an internally generated `Force
Intra'.
Low Pass Filter
The macroblock selected from the previous frame in
motion compensated inter mode coding will be filtered before
it is subtracted from the current block. This decision can be
overridden externally by the system controller. The Filter uses
a simple [ 1 2 1 ] characteristic in both vertical and horizontal
dimensions, as specified in H.261. On the macroblock boundaries
[ 0 1 0 ] is used.
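A separable filter of this kind can be sketched in Python as follows. This is a minimal model with hypothetical function names. It works on a square block of any size and leaves the edge samples unfiltered, which is what the [ 0 1 0 ] taps amount to. The final division by 16 with halves rounded up is an assumption taken from the H.261 loop filter definition, not from this datasheet.

```python
def filter_1d(samples):
    """Apply [1 2 1] to interior samples and [0 1 0] to the two edge samples.

    Results are scaled by 4; the caller normalises after both passes.
    """
    n = len(samples)
    interior = [samples[i - 1] + 2 * samples[i] + samples[i + 1]
                for i in range(1, n - 1)]
    return [4 * samples[0]] + interior + [4 * samples[-1]]


def loop_filter(block):
    """Filter rows, then columns, then divide by 16 with rounding."""
    rows = [filter_1d(row) for row in block]
    cols = [filter_1d(list(col)) for col in zip(*rows)]
    size = len(block)
    return [[(cols[c][r] + 8) >> 4 for c in range(size)] for r in range(size)]
```

A flat block passes through unchanged, while an isolated interior sample is spread over its 3 x 3 neighbourhood with weights 1-2-1 in each direction.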
Zig Zag Scan
This is essentially an address generator which reorders
the DCT coefficients according to the standard zig-zag scan
pattern. This has the effect of concentrating the significant
coefficients at the beginning of the sub-block, improving the
efficiency of the Run Length Coder.
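The standard zig-zag order can be generated by walking the anti-diagonals of the block in alternating directions. The Python sketch below is illustrative only and the function name is hypothetical. It returns row-major indices in the same 00 to 63 numbering used for the pixel order within a sub-block.

```python
def zigzag_order(n=8):
    """Return the row-major indices of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):                 # s = row + column
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diagonal.reverse()                 # even diagonals run upwards
        order.extend(r * n + c for r, c in diagonal)
    return order


def zigzag_scan(block):
    """Reorder an 8x8 block (list of rows) into a 64 entry zig-zag list."""
    flat = [value for row in block for value in row]
    return [flat[i] for i in zigzag_order(len(block))]
```

The scan starts 0, 1, 8, 16, 9, 2, ... so the low frequency coefficients in the top left corner of the block come first.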
Run Length Coder
Each coefficient output from the zig zag scan is examined.
If it is non-zero, then the Run Length Coding circuit will pass
the coefficient magnitude to the output port along with its zero
count, i.e. the number of zero magnitude coefficients
preceding it within the same 8x8 sub-block.
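The run length step can be modelled in a few lines of Python. This is a sketch with a hypothetical function name. It emits the signed coefficient value together with its zero count, which is an assumption about how the sign accompanies the magnitude. Trailing zeros at the end of the sub-block produce no output.

```python
def run_length_code(scanned):
    """Turn a zig-zag ordered coefficient list into (zero_count, value) pairs."""
    pairs = []
    zero_count = 0
    for coefficient in scanned:
        if coefficient == 0:
            zero_count += 1
        else:
            pairs.append((zero_count, coefficient))
            zero_count = 0             # the count restarts after each output
    return pairs
```

For example, the sequence 5, 0, 0, -3, 1, 0, 0, 0 yields the pairs (0, 5), (2, -3) and (0, 1).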
Inverse Quantize
This circuit replicates the operation of the inverse
quantizer in the decoder. It reconstructs the 12 bit DCT coefficients
from the 8 bit quantized inputs, using the 5 bit quantization
value. This is achieved using the following formulae.
If QUANT is odd :
REC = QUANT*(2*LEVEL+1) : LEVEL > 0
REC = QUANT*(2*LEVEL-1) : LEVEL < 0
If QUANT is even :
REC = QUANT*(2*LEVEL+1)-1 : LEVEL > 0
REC = QUANT*(2*LEVEL-1)+1 : LEVEL < 0
For Intra Coded DC Coefficients :
REC = 8*LEVEL
except if LEVEL=255 when REC=1024
If LEVEL=0 then REC=0 in all cases.
The reconstructed values (REC) are passed through a
Clipping Circuit in case of arithmetic overflow.
Thus, the Inverse Quantizer restores the DCT coefficients
to their original value but with quantisation error.
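The formulae above translate directly into the following Python sketch. The function name is hypothetical, and the clipping range of -2048 to +2047 is an assumption based on the 12 bit coefficient width; the text only states that a clipping circuit is present.

```python
def inverse_quantize(level, quant, intra_dc=False):
    """Reconstruct a DCT coefficient (REC) from LEVEL and QUANT."""
    if level == 0:
        return 0                               # REC = 0 in all cases
    if intra_dc:
        return 1024 if level == 255 else 8 * level
    sign = 1 if level > 0 else -1
    rec = quant * (2 * level + sign)           # QUANT * (2*LEVEL +/- 1)
    if quant % 2 == 0:
        rec -= sign                            # even QUANT: move 1 towards zero
    return max(-2048, min(2047, rec))          # assumed 12 bit clipping range
```

With QUANT = 4, LEVEL = 2 gives 19 and LEVEL = -2 gives -19, so positive and negative levels reconstruct symmetrically.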
Inverse DCT
This circuit replicates the operation of the Inverse Cosine
Transform in the Decoder, and outputs 9 bit signed pixel data
(intra mode) or pixel difference data (inter mode). The IDCT
fully meets the CCITT specification.
Reconstruction Adder
In Inter Mode, the IDCT data is added to the best fit block
from the previous frame store. In Intra mode, the IDCT data is
simply added to zero. After the adder, the sign bit is removed
from the result to give 8 bit pixels. Clipping circuits ensure that
any pixels with values exceeding 255 are clipped to 255, and
any with negative values are clipped to zero (such values are
possible due to quantization noise).
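The adder and its clipping behaviour can be illustrated with a small Python sketch. The function name is hypothetical, and blocks are represented as lists of rows purely for convenience.

```python
def reconstruct(idct_block, predicted_block=None):
    """Add the IDCT output to the prediction and clip the result to 0..255.

    In intra mode predicted_block is None and the IDCT data is added to zero.
    """
    result = []
    for r, row in enumerate(idct_block):
        out_row = []
        for c, value in enumerate(row):
            base = predicted_block[r][c] if predicted_block is not None else 0
            out_row.append(max(0, min(255, base + value)))
        result.append(out_row)
    return result
```

For example, an intra mode IDCT output of -5 is stored as 0 and one of 300 is stored as 255.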
Frame Store Manager
The previous picture is stored in an external CIF DRAM
frame store, which is connected by a glueless interface. The
internal Frame Store Manager controls all read, write, and
refresh operations to these DRAMs. No provision is made to
allow the use of smaller DRAMs if only QCIF operation is
required.
During the coding of each macroblock, columns of the
search window are read from these DRAMs, and finally the
"best fit" macroblock is obtained. At the completion of coding
the fully processed new macroblock is written to the DRAMs,
after it has been decoded again. In this way the frame store
maintains a bit-accurate duplicate of the image seen by the
Decoder (excepting transmission errors).
Several configurations are possible to make the required
128Kx16 store. Two 64K x 16 DRAMs could be employed; in
this case use the default 1M DRAM mode when setting up the
chip. Otherwise, a single 256K x 16 DRAM or four 256K x 4
DRAMs could be used. In these last two cases use OE1 as
ADR8, RW1 as R/W, and do not connect RW2 and OE2. Also,
use the Setup instruction at the CPORT to put the device into
4M DRAM mode.
Table 1 details the critical timing parameters which the
external DRAM must meet with SYSCLK running at 27MHz.
Note that, if used at slower speeds, the requirements on the
DRAM timing are relaxed with the exception of refresh. The
number of refresh cycles the VP2611 produces is directly
proportional to the SYSCLK frequency.
Discrete Cosine Transform
This circuit performs a Discrete Cosine Transform on each
8x8 sub block, whether in inter or intra mode. In intra mode,
eight bit pixel data is used, with a ninth implied sign bit ( all pixel
data is positive ). In inter mode the difference between the
current and best fit previous block is used. This will be a two's
complement number. Twelve bit coefficients are produced by
the DCT, and passed on to the quantizer.
Quantize
This section quantizes the results of the DCT by dividing
the 12 bit output from the DCT with a host supplied value. The
5 bit quantization value supplied corresponds to division of the
12 bit coefficients ( range ±2048 ) by values from 2 to 62, but
in steps of 2. This variable quantization strategy allows the
volume of data generated by the encoder to be adjusted
dynamically, depending on the fullness of the transmission
buffer. For H.261 applications it uses the quantisation value
provided at the control port during the previous Macroblock
period (or at some earlier time). An option is provided which
allows two quantisation values to be used, one for use with
inter coded macroblocks, and the other for use with intra
coded macroblocks.
As specified in H.261, the DC coefficient of an Intra coded
Block is treated differently and the 12 bit value is always
divided by 8.
When the quantization value is small, and the DCT
coefficient is large, there is a danger of overflow in the eight bit
output. To avoid this a clipping circuit is included at the output
of the quantizer, which saturates at the maximum values.
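A simple model of the quantizer for coefficients other than the intra DC term is sketched below in Python. The function name is hypothetical. Two points are assumptions rather than statements from the datasheet: the division truncates towards zero, and the output saturates at +/-127.

```python
def quantize(coefficient, quant):
    """Quantize a 12 bit DCT coefficient with a 5 bit value quant (1..31).

    The divisor is 2 * quant, i.e. 2 to 62 in steps of 2.
    """
    if not 1 <= quant <= 31:
        raise ValueError("quant must be in the range 1..31")
    magnitude = abs(coefficient) // (2 * quant)    # assumed truncation
    level = magnitude if coefficient >= 0 else -magnitude
    return max(-127, min(127, level))              # assumed saturation limits
```

With the smallest quantization value a full scale coefficient of 2047 would give 1023, which the clipping stage reduces to 127.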
then theoretically the average rate need only be 384/1871
times the SYSCLK rate. Note that PCLK must always be
obtained by dividing SYSCLK by an integer greater than one.
When the VP520 CIF/QCIF Converter is supplying the VP2611
with data, it provides a peak PCLK rate equivalent to SYSCLK
divided by two, and an average rate of SYSCLK divided by
four.
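The PCLK rates quoted above can be worked through numerically. The Python sketch below assumes the maximum 27 MHz SYSCLK; the variable names are illustrative only.

```python
SYSCLK_HZ = 27_000_000
BYTES_PER_MACROBLOCK = 6 * 64          # six sub-blocks of 64 pixels = 384
REQYUV_WINDOW_CYCLES = 1871            # SYSCLK cycles for which REQYUV is high

# Lowest average PCLK rate that still delivers 384 bytes within the window.
min_average_pclk_hz = SYSCLK_HZ * BYTES_PER_MACROBLOCK / REQYUV_WINDOW_CYCLES

peak_pclk_hz = SYSCLK_HZ / 2           # maximum peak rate
vp520_average_pclk_hz = SYSCLK_HZ / 4  # average rate supplied by the VP520

print(f"minimum average PCLK: {min_average_pclk_hz / 1e6:.2f} MHz")
print(f"VP520 average PCLK:   {vp520_average_pclk_hz / 1e6:.2f} MHz")
```

At 27 MHz the theoretical minimum average works out at roughly 5.54 MHz, so the 6.75 MHz average supplied by the VP520 leaves some margin.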
The minimum gap between REQYUV going active is
2064 SYSCLK periods. In full CIF mode "dummy" macroblocks
are internally inserted between rows, in order to give the
chip sufficient time to load a new search window. No new YUV
data may be loaded during these dummy macroblocks, and
REQYUV will remain inactive. No dummy macroblocks are
required in QCIF mode. With a 27MHz SYSCLK all macroblocks
will be coded in less than a 30Hz frame rate period, and
there will be a period of inactivity before FRMIN goes active
again. During this period the output bus will remain static at all
ones, and no output strobe ( DCLK ) will be produced.
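The frame time budget can be checked with simple arithmetic. The Python sketch below assumes a 27 MHz SYSCLK, a 30 Hz frame rate and one 2064 cycle period per real macroblock. The number of dummy macroblocks per row is not stated in the text, so only the cycles left over for dummy macroblocks and the idle period are computed.

```python
SYSCLK_HZ = 27_000_000
FRAME_RATE_HZ = 30
MB_PERIOD_CYCLES = 2064                        # minimum gap between REQYUV pulses
CIF_MACROBLOCKS = (352 // 16) * (288 // 16)    # 22 x 18 = 396

cycles_per_frame = SYSCLK_HZ // FRAME_RATE_HZ
coding_cycles = CIF_MACROBLOCKS * MB_PERIOD_CYCLES
spare_cycles = cycles_per_frame - coding_cycles

print(f"cycles per frame period: {cycles_per_frame}")
print(f"cycles for 396 real MBs: {coding_cycles}")
print(f"left for dummies + idle: {spare_cycles}")
```

The 396 real macroblocks take 817,344 of the 900,000 cycles available per frame, leaving 82,656 cycles (about 40 macroblock periods) for dummy macroblocks and the inactive period.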
Fig 5 : Ordering of Pixels
PIXEL ORDER WITHIN SUBBLOCK
00 01 02 03 04 05 06 07
08 09 10 11 12 13 14 15
16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31
32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63
SUBBLOCK ORDER WITHIN MACROBLOCK
Y : 1 2
    3 4
U : 5
V : 6
Fig 4: MacroBlock Pipelining
Fig 6 : Timing at YUV Port
OPERATION OF INTERFACES
Macroblock Delays
The VP2611 has a three macroblock pipeline delay
between pixel inputs and run length coded outputs. This is
illustrated in Figure 4. Whilst the second macroblock is being
input, the best fit macroblock from the previous frame is being
identified and then read from the frame store. At this time any
Control Decisions which are to affect the first macroblock must
be supplied by the host controller. The run length coded
outputs for the first macroblock are not available until the
fourth macroblock is supplied at the input pins.
YUV Input Port
The YUV port accepts pixel data from the preprocessor in
block format as illustrated in Figure 5. Within a complete
system the VP2611 is always the master device, and must be
supplied with macroblock data when it makes a demand. The
order in which pixels are supplied is pre-determined, and must
be strictly maintained. There are 64 pixels per sub-block and
4 luminance and 2 chrominance sub-blocks per macroblock.
The macroblocks themselves are divided into groups of blocks
( GOB's ), and the sequence specified in H.261 must also be
maintained. Note that, since the chrominance resolution is half
the luminance resolution both vertically and horizontally, then
the two chrominance blocks cover the same picture area as
the four luminance blocks.
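The input order for one macroblock can be enumerated as in the Python sketch below. The function name is hypothetical. The 2 x 2 placement of luminance sub-blocks 1 to 4 and the assignment of sub-block 5 to U and 6 to V follow the H.261 macroblock layout, which is assumed to match Figure 5.

```python
def macroblock_input_order():
    """Return the 384 (component, x, y) positions in the order they are input.

    Luminance coordinates are within the 16x16 macroblock, chrominance
    coordinates are within the 8x8 U or V sub-block.
    """
    order = []
    # Sub-blocks 1 to 4: luminance, top-left, top-right, bottom-left, bottom-right.
    for block_row, block_col in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        for r in range(8):
            for c in range(8):
                order.append(("Y", block_col * 8 + c, block_row * 8 + r))
    # Sub-blocks 5 and 6: chrominance, each row-major as in Figure 5.
    for component in ("U", "V"):
        for r in range(8):
            for c in range(8):
                order.append((component, c, r))
    return order
```

The list has 6 x 64 = 384 entries, matching the 384 bytes per macroblock that must be clocked in while REQYUV is active.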
The pre-processor producing macroblock data must
produce a frame start signal ( FRMIN ) when it has a complete
frame of data available. This resets the input controller within
the VP2611, which will then generate sequential GOB and
macroblock numbers for the coded outputs referenced to this
input.
FRMIN must go high for at least one system clock period,
and must go low before the next frame is available. The
VP2611 responds to FRMIN with a request for macroblock
data ( REQYUV ), which occurs approximately 184 SYSCLK
periods after FRMIN. It must then receive a complete
macroblock within 1871 SYSCLK periods, and at the end of this time
REQYUV will go inactive. The VP2611 must be provided with
a PCLK signal to strobe in the data. This must be derived from
SYSCLK, and must only be present when there is valid data
at the input. Data must meet the set up and hold times with
respect to PCLK as specified in Figure 6.
The maximum peak rate for PCLK is the SYSCLK rate
divided by two, but since there are 384 bytes per macroblock
[Fig 4 pipeline chart. Each column is one macroblock period of 2064 cycles.]
YUV Input           MB1     MB2     MB3     MB4
Frame Store Read    DUMMY   MB1     MB2     MB3
Control Decisions   DUMMY   MB1     MB2     MB3
Frame Store Write   DUMMY   DUMMY   DUMMY   MB1
DBUS Output         DUMMY   DUMMY   DUMMY   MB1
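[Fig 6 timing diagram not reproduced. Signals shown: PCLK and YUV7:0.
Values annotated on the diagram: 10ns, 0ns, 20ns, 20ns and SCLK/2.]
N.B. All timings given are MINIMUM values.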