## Electronic component: CS2412

processing continuous data streams with a high data throughput rate of up to 50 Msamples/Sec. This highly integrated, application-specific silicon core is the pipelined version of the CS2411 and is available in both ASIC and FPGA versions that have been handcrafted by Amphion for maximum performance while minimizing power consumption and silicon area.


- Pipelined architecture
- 16-bit complex input/output in two's complement format (32-bit complex word)
- 16-bit twiddle factors generated inside the core
- 18-bit internal accuracy
- Programmable shift down control
- Radix-4 architecture
- Simultaneous loading/downloading supported
- Both input and output in normal order
- No external memory required
- Optimized for both ASIC and FPGA technologies with the same functionality
- Fully synchronous design

Image processing

Atmospheric imaging

Spectral representation

defined below

proportional to Nlog

of multiplications is required; however, more simultaneous data accesses are required, which makes the circuits more complicated. The radix-4 algorithm offers a balance between computational and circuit complexity and is often used in the construction of higher-radix FFT computation units when designing high-performance FFT/IFFT hardware.

in Figure 2) of the CS2412 1024-point FFT/IFFT core. Unless otherwise stated, all signals are active high and bit(0) is the least significant bit.

0: FFT, 1: IFFT

when XBS is active and associated with the 1024-point block indicated by XBS.

block. The remaining N-1 data of the N-point data block are loaded into the core in the following N-1 data clock cycles in the natural order.

when XBS is active and returns to LOW when the last data of the N-point block is loaded into the core. XBS is ignored when it is HIGH.

transformed block is on the output port. The remaining N-1 data of the N-point transform result come out of the core in the following N-1 clock cycles in the natural order.

forward or inverse Fast Fourier Transforms on complex data. Data is loaded into its workspace in normal sequential (natural) order. The transformed data is returned in normal sequential order. It performs the 1024-point FFT/IFFT using the following equations:

signal, X(n) is the complex input data and Y(k) the complex output data. Both the real and imaginary components of input X(n) and output Y(k) are 16-bit numbers in two's complement format.
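The equations referred to above are missing from this copy. For orientation only, the textbook definitions of the N-point forward and inverse transforms, written with the same symbols X(n) and Y(k) and with N = 1024, are given below. They are an assumption, not a transcription of the original: in particular, the core's output also carries a power-of-two scaling set by the shift-down control, which is omitted here.

```latex
% Forward transform (FFT), k = 0, 1, ..., N-1
Y(k) = \sum_{n=0}^{N-1} X(n)\, e^{-j 2\pi n k / N}

% Inverse transform (IFFT), k = 0, 1, ..., N-1
Y(k) = \sum_{n=0}^{N-1} X(n)\, e^{+j 2\pi n k / N}
```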

Msamples/Sec by employing a pipelined architecture with fixed-point arithmetic operations and a pre-scaling strategy to handle possible overflow in computation. The core has 4 bits of unconditional scaling down and 7 bits of controlled scaling down specified by the input signal SDC, giving the user the gain control required in a specific application. The CS2412 core uses the radix-4 decimation-in-frequency (DIF) algorithm to perform the transform. It consists of five radix-4 pipelined stages with reshuffle buffers between stages and is capable of processing a continuous data stream. Both the input and output are in the normal order (the ordinary time order).
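To illustrate the radix-4 DIF decomposition, here is a minimal floating-point sketch in Python. It is not Amphion's implementation: it is recursive rather than pipelined, it ignores the fixed-point scaling, and the function name `fft_radix4_dif` is hypothetical. For a 1024-point input the recursion passes through five radix-4 levels, which corresponds to the five stages described above.

```python
import cmath


def fft_radix4_dif(x):
    """Radix-4 decimation-in-frequency FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    q = n // 4
    a, b, c, d = [], [], [], []
    for i in range(q):
        x0, x1, x2, x3 = x[i], x[i + q], x[i + 2 * q], x[i + 3 * q]
        # Twiddle factors W^i, W^(2i), W^(3i) with W = exp(-j*2*pi/n).
        w1 = cmath.exp(-2j * cmath.pi * i / n)
        w2 = w1 * w1
        w3 = w2 * w1
        # Radix-4 butterfly followed by the twiddle multiplication.
        a.append(x0 + x1 + x2 + x3)
        b.append((x0 - 1j * x1 - x2 + 1j * x3) * w1)
        c.append((x0 - x1 + x2 - x3) * w2)
        d.append((x0 + 1j * x1 - x2 - 1j * x3) * w3)
    # Each branch yields the outputs whose index is congruent to 0..3 mod 4.
    out = [0j] * n
    out[0::4] = fft_radix4_dif(a)
    out[1::4] = fft_radix4_dif(b)
    out[2::4] = fft_radix4_dif(c)
    out[3::4] = fft_radix4_dif(d)
    return out
```

Interleaving the four sub-results puts the output back in natural order, which plays the role that the reshuffle buffers play in a pipelined design.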

signal. However, the scaling down control is applied on a block-by-block basis. The core detects possible overflow during computation and saturates overflow data accordingly.
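The exact saturation rule used by the core is not reproduced in this copy. As a rough sketch, assuming the conventional two's complement clamp, saturating a value to a 16-bit word could look like the following (the helper name `saturate` is hypothetical):

```python
def saturate(value, bits=16):
    """Clamp an integer to the signed two's complement range of `bits` bits."""
    lowest = -(1 << (bits - 1))       # -32768 for 16 bits
    highest = (1 << (bits - 1)) - 1   # +32767 for 16 bits
    return max(lowest, min(highest, value))
```

Values inside the range pass through unchanged; only samples that would otherwise wrap around are pinned to the nearest limit.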

internally. For example, the input data is clocked in using the data clock while the core operates on the 2x clock. The output data is also clocked out on the 2x clock, although it changes only on every second cycle of the 2x clock. When implemented on FPGA devices, the 2x clock is generated by the on-chip PLL of the Apex 20KE device or the DLL of Virtex devices.

is specified by Figure 3. The intermediate data stored in the reshuffle buffers is 16 bits wide (32 bits for complex numbers). The wordlength grows to 18 bits after the radix-4 butterfly. The twiddle multiplier takes the 18-bit butterfly output and the 16-bit twiddle factors, generating a 34-bit product. The product is then scaled and rounded to 16 bits for the next-stage radix-4 operation.
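The 34-bit figure can be checked with a short calculation. The sketch below is illustrative only: it considers a single signed real-by-real multiplication of an 18-bit operand by a 16-bit operand, and the helper names are hypothetical.

```python
def signed_range(bits):
    """Return (min, max) of a two's complement integer of the given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def bits_needed(value):
    """Smallest two's complement width that can hold `value`."""
    if value >= 0:
        return value.bit_length() + 1
    return (-value - 1).bit_length() + 1


butterfly_lo, butterfly_hi = signed_range(18)
twiddle_lo, twiddle_hi = signed_range(16)

# The extreme products occur at the corners of the two operand ranges.
corner_products = [a * b
                   for a in (butterfly_lo, butterfly_hi)
                   for b in (twiddle_lo, twiddle_hi)]
product_width = max(bits_needed(p) for p in corner_products)
```

The widest case is the product of the two most negative operands, 2^17 times 2^15 = 2^32, which needs 34 bits once the sign bit is included.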


form of the output data block.

Rounding

Loading the input data is performed under the control of signal XBS. Signal XBS is asserted when the output signal XBIP is de-asserted. It indicates the first data of the 1024-point data block, and the data is clocked in on the rising clock edge. The remaining 1023 points of data are loaded in the successive 1023 clock cycles in the natural order. When the last data is loaded, signal XBIP returns to LOW. Loading of the next data block can start in the next clock cycle after XBIP returns to LOW.

appears on the output port. The rest of the result data is clocked out continuously in the following 1023 clock cycles. Signal YAV is asserted while the result is being output. Figure 4 illustrates the functional timing of the I/O signals.

radix-4 butterfly followed by a twiddle multiplication. Theoretically, in the worst case the result value may grow by a factor of up to 5.657 in the first stage. This occurs when the four input data to the radix-4 computation have the maximal absolute value and the twiddle angle is

represents a possible wordlength growth of 11 bits. As the output is a 16-bit value and fixed-point arithmetic is employed in the core, it is necessary to be able to scale the result to avoid overflow while still obtaining a good dynamic range.
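The 5.657 factor can be reproduced numerically. The twiddle angle is missing from the sentence above, so rather than assume it, the sketch below scans all 1024 twiddle angles of a 1024-point transform. Inputs are normalized to a full scale of 1 on both the real and imaginary parts. The last line rests on a further assumption, namely that the 11-bit figure refers to a whole-transform gain bounded by 1024 times the square root of 2. All names are hypothetical.

```python
import cmath
import math


def stage_growth(twiddle_angle):
    """Largest real or imaginary magnitude after one butterfly plus twiddle."""
    butterfly_sum = 4 * (1 + 1j)  # four full-scale inputs added together
    rotated = butterfly_sum * cmath.exp(-1j * twiddle_angle)
    return max(abs(rotated.real), abs(rotated.imag))


# Scan every twiddle angle that a 1024-point transform can use.
angles = [2 * math.pi * k / 1024 for k in range(1024)]
worst_growth = max(stage_growth(angle) for angle in angles)

# Assumed bound for the complete transform: N * sqrt(2) with N = 1024.
total_growth_bits = math.ceil(math.log2(1024 * math.sqrt(2)))
```

Under these assumptions `worst_growth` evaluates to about 5.657 (4 times the square root of 2), and `total_growth_bits` evaluates to 11.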

zero bit growth can be allowed. Thus, the megafunction must be capable of right-shifting the internal result by up to 11 bits so that overflow can be avoided. The total of 11 bits of scaling down is distributed across the stages according to Table 2. When SDC is set to its maximal value, there will be no overflow for any input data.

7 bits are applied at the discretion of the user under the control of SDC.

computation accuracy possible for the given word lengths. The core performs a round-to-nearest operation to keep the loss in accuracy minimal. When an intermediate value, for instance the twiddle multiplication result, has to be scaled down, the most significant bit of the portion to be rounded off is added to the word that remains. This is a compromise between true rounding and truncation. Compared with the technique that unconditionally sets the bottom bit to '1', the partial rounding scheme achieves better accuracy and is guaranteed to generate an all-zero output block for an all-zero input block.
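The rounding rule described above can be sketched in a few lines of Python. This is an illustration of the stated rule on plain integers, not the core's actual datapath; the function names are hypothetical, and `force_lsb_shift` models the alternative technique that always sets the bottom bit.

```python
def round_shift(value, shift):
    """Right shift, then add the MSB of the discarded bits to what remains."""
    if shift == 0:
        return value
    kept = value >> shift                       # arithmetic shift (floors)
    discarded_msb = (value >> (shift - 1)) & 1  # top bit of the dropped part
    return kept + discarded_msb


def truncate_shift(value, shift):
    """Plain truncation: drop the low bits without any correction."""
    return value >> shift


def force_lsb_shift(value, shift):
    """Alternative scheme: shift, then unconditionally set the bottom bit."""
    return (value >> shift) | 1
```

For example, `round_shift(11, 2)` gives 3 (11/4 = 2.75) where truncation gives 2, and `round_shift(0, 3)` stays 0 whereas `force_lsb_shift(0, 3)` returns 1, which is why only the rounding scheme preserves an all-zero block.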

the following procedure to saturate output overflow samples:

with respect to the SDC signal. Table 3 presents the output error with respect to the SDC signal.