Wisely employing digital capabilities can result in optimum filter designs with a minimum number of bits. As was seen last month, in some cases, even having an excess amount of processing power may not result in additional filter precision. The final installment of this four-part article series will examine the use of fixed-point filtering and provide a practical example of DSP-based FIR filtering to meet the spectral mask requirements of a Global System for Mobile Communications (GSM) cellular system.

Quantizing the coefficients correctly is not the only thing to keep in mind when implementing an FIR filter with fixed-point arithmetic. Suppose it is necessary to implement this filter using the direct-form structure. Figure 42 shows the structure as a reference for five coefficients. For the example at hand, we have 16-b coefficients, and suppose we need to filter 16-b data that is well scaled in the range. We can generate random data with that characteristic as follows:

q = quantizer(, 'RoundMode', 'round');

xq = randquant (1, 1000, 1);

The format is used for the coefficients for illustration purposes. Since the input is already quantized, an input quantizer or a multiplicand quantizer is not needed:

Hq = qfilt('fir', {b}, ...

'CoefficientFormat', );

set(Hq, 'InputFormat', 'none')

set(Hq, 'MultiplicandFormat', 'none');

For reference, the other parameters are set to default values:

OutputFormat =

ProductFormat =

SumFormat =

but will be temporarily set to 'none' in order to have a reference for comparison:

set(Hq, 'OutputFormat', 'none');

set(Hq, 'ProductFormat', 'none');

set(Hq, 'SumFormat', 'none');

yi = filter(Hq, xq);

Quantity yi represents an "ideal" output or the best output one can hope to compute. Aside from using the 16-b quantized coefficients, all computations are performed with double-precision arithmetic. Quantity yi provides a nice reference signal for comparison purposes.

If the parameters are set back to their default values, it becomes apparent that the product format is not accurate for this case. The multiplication of coefficients with an input sample results in a product. On a DSP processor, two 16-b registers are being multiplied and the result stored in a 32-b product register. The correct setting for the ProductFormat is :

set(Hq, 'OutputFormat', quantizer());

set(Hq, 'ProductFormat', quantizer());

set(Hq, 'SumFormat', quantizer());

yq = filter(Hq, xq);

The qreport(Hq) reporting function is an extremely useful tool to monitor what has happened here. In this case, it reports that no overflows have occurred (see table). To measure how good the output is, the energy of the error and the maximum error are compared:

norm(yi - yq, 2)

ans = 0.00054794884123692

norm(yi - yq, inf)

ans = 3.05137364193797e-005
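The two error measures used above can be tried out without any toolbox objects. The following is a minimal sketch in base MATLAB, not the article's qfilt setup: the helper `to_fixed`, the fraction length `F`, and the sine test signal are all assumptions made for illustration, and no saturation is modeled, so the input must stay inside the representable range.

```matlab
% Sketch: quantize a double-precision signal to a fixed-point grid and
% measure the error with the two-norm and the infinity norm.
% F and the test signal are illustrative assumptions, not values from the text.
F = 15;                                     % fraction bits, LSB weight 2^-F
to_fixed = @(x, rnd) rnd(x * 2^F) / 2^F;    % rnd is @round or @floor; no saturation

n = 0:999;
x = 0.9 * sin(2*pi*0.01234*n);              % stand-in for well-scaled input data

x_round = to_fixed(x, @round);              % round to nearest
x_floor = to_fixed(x, @floor);              % truncate toward minus infinity

err_round = [norm(x - x_round, 2), norm(x - x_round, inf)];
err_floor = [norm(x - x_floor, 2), norm(x - x_floor, inf)];

fprintf('round: two-norm %.3g, inf-norm %.3g\n', err_round);
fprintf('floor: two-norm %.3g, inf-norm %.3g\n', err_floor);
```

With rounding to nearest, the infinity norm cannot exceed half an LSB (2^-16 here); with truncation it stays below one full LSB (2^-15). The two-norm grows with the number of samples, so it is only meaningful when comparing signals of equal length.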

Figure 42 shows that there is a source of error when moving the data from the set of adders (what would be the accumulator in a DSP processor) to the output. The word length is being reduced from 32 b to 16 b. The theoretical power spectrum of the quantization noise at the output of the filter corresponding to the model in Fig. 43 is

$$S_y(\omega) = \left|H_n(e^{j\omega})\right|^2 \sigma_x^2$$

where $H_n(e^{j\omega})$ is the transfer function from the noise source to the output (in this case, simply 1) and $\sigma_x^2$ is the power spectrum of the noise source (in this case, a constant and equal to the variance of the noise):

$$\sigma_x^2 = \frac{2^{-2(b-1)}}{12}$$

where $b$ is the number of bits.

(This formula is approximate because the signal at the accumulator does not cover the entire range and because an analog signal is not being quantized. Rather, the number of bits in an already quantized signal is being reduced.) In this case, the theoretical power spectrum is constant and for 16 b is

$$S_y(\omega) = 10\log_{10}\left(\frac{2^{2(-15)}}{12}\right) = -101.100811159671\ \text{dB}$$
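The noise-floor figure for a 16-b word can be checked with a few lines of base MATLAB. This sketch assumes the usual uniform-quantization model, in which the noise variance is one-twelfth of the squared LSB weight and the LSB weight of a b-bit signed fraction is 2^-(b-1); the variable names are illustrative.

```matlab
% Sketch: theoretical quantization-noise floor for a b-bit signed fraction.
% Assumes the uniform-noise model: variance = (LSB weight)^2 / 12.
b      = 16;                      % word length in bits
lsb    = 2^(-(b-1));              % weight of the least significant bit
sigma2 = lsb^2 / 12;              % noise variance, equals 2^(-2*(b-1))/12
Sy_dB  = 10*log10(sigma2);        % flat power spectrum in dB

fprintf('Noise floor for %d bits: %.4f dB\n', b, Sy_dB);
```

Each additional bit lowers this floor by about 6.02 dB, which gives a quick way to judge whether a measured noise level is plausible for a given word length.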

An estimate of the noise power spectrum can be computed with the "nlm" function:

= nlm(Hq, 512, 100);

Figure 44 shows a plot of the "Pnn" function (in dB) compared to the theoretical power spectrum.

If the quantization noise in Fig. 43 is the only noise in the system, it should be possible to obtain an output that exactly matches yi by setting the output format to be the same as the sum format (one can think of it as the ability to "look inside the accumulator"):

set(Hq, 'OutputFormat', quantizer());

yq = filter(Hq, xq);

norm(yi - yq, 2)

ans = 2.02838467848398e-006

norm(yi - yq, inf)

ans = 7.98609107732773e-008

While the error has clearly been reduced, there is still some left, indicating some roundoff still present in the system. This is confirmed by looking at the power spectrum for the noise using the "nlm" function. Figure 45 shows the plot of the power spectrum. The noise is obviously less than before (about −168 dB), which is consistent with the smaller errors computed here. To find the source of the error, it is simply a matter of looking at the discrepancy between the product format and the sum format.


The sum format is set to so that the three least significant bits from the product register are basically being lost. It may be tempting to make the sum format the same as the product format, but overflows occur left and right:

set(Hq, 'SumFormat', quantizer());

yq = filter(Hq, xq);

Warning: 1944 overflows in QFILT/FILTER.

In general, *k* bits are not always enough to store the result of adding two *k*-bit quantities. Overflow *might* occur, and when adding so many numbers (220 in this example), chances are very high that it will. So it is preferable to live with some roundoff error rather than to overflow (the two-norm of the error is a staggering 2.09011261755715, while the infinity norm is 0.285711827455089).
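The word-growth argument can be illustrated with plain integer arithmetic. The sketch below is an assumption-laden illustration rather than a model of the qfilt object: integer-valued doubles stand in for register contents, the overflow mode is taken to be two's-complement wraparound (a real quantizer may saturate instead), and the helper `wrap` is hypothetical.

```matlab
% Sketch: why k bits cannot always hold the sum of two k-bit numbers.
% Integer-valued doubles stand in for register contents; wraparound is an
% assumed overflow mode chosen for illustration (saturation is the alternative).
k  = 16;                             % word length in bits
lo = -2^(k-1);                       % most negative k-bit value
hi =  2^(k-1) - 1;                   % most positive k-bit value
wrap = @(v) mod(v - lo, 2^k) + lo;   % two's-complement wraparound to k bits

a = 30000;  c = 10000;               % both operands fit in 16 bits
s_exact   = a + c;                   % exact sum exceeds hi, needs 17 bits
s_wrapped = wrap(s_exact);           % what a 16-bit register would hold

% Worst-case number of extra (guard) bits when accumulating n_terms values
n_terms    = 220;                    % number of additions quoted in the text
guard_bits = ceil(log2(n_terms));

fprintf('exact sum %d, wrapped to %d bits: %d\n', s_exact, k, s_wrapped);
fprintf('worst-case guard bits for %d terms: %d\n', n_terms, guard_bits);
```

The guard-bit count is a pessimistic bound, since it assumes every term has full magnitude and the same sign. In practice the observed range of the sum is usually much smaller, which is why measuring it on representative data is worthwhile.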

A trial-and-error procedure can be followed for reducing the sum format to , , etc., until no overflow occurs. However, a better way is to go back to the setting, filter a signal, and look at the report given by "qreport." For this example, qreport shows that the maximum and minimum sum values are 0.527 and −0.5357, respectively. Therefore, a format of will be the optimal setting to minimize quantization noise without overflow:

set(Hq, 'SumFormat', quantizer());

set(Hq, 'OutputFormat', quantizer());

yq = filter(Hq, xq);

norm(yi - yq, 2)

ans = 7.53800283935414e-007

norm(yi - yq, inf)

ans = 2.93366611003876e-008

The improved results can be confirmed by the "nlm" function, which now shows a power spectrum for the noise of −174 dB (Fig. 46).
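Choosing a format from the reported minimum and maximum values can be automated instead of done by trial and error. The following sketch assumes a signed two's-complement format described by a word length W and a fraction length F, covering the range from -2^(W-1-F) to 2^(W-1-F) - 2^-F; the search loop and variable names are illustrative and do not come from the toolbox.

```matlab
% Sketch: largest fraction length F that holds an observed range without
% overflow, for an assumed signed two's-complement word of W bits.
W    = 32;                       % available word length
vmin = -0.5357;                  % minimum sum value reported in the text
vmax =  0.527;                   % maximum sum value reported in the text

F = W + 8;                       % start from an over-optimistic fraction length
while true
    range_lo = -2^(W-1-F);               % most negative representable value
    range_hi =  2^(W-1-F) - 2^(-F);      % most positive representable value
    if vmin >= range_lo && vmax <= range_hi
        break;                           % first (largest) F that fits
    end
    F = F - 1;                           % give one more bit to the integer part
end

fprintf('W = %d bits: largest overflow-free fraction length F = %d\n', W, F);
```

Because each step down in F doubles the representable range, the loop always terminates, and the first F that fits is also the one with the finest resolution. The result is only as trustworthy as the test signal: data with larger peaks than the signal used for the measurement could still overflow.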

The results obtained previously are the best that could be obtained with a 32-b accumulator (typical in some early DSPs). Modern DSP processors provide an accumulator with extended precision, using so-called *guard bits*, with typically 40-b resolution for data word lengths of 16 b. With such an accumulator, better results are possible if the extra bits are used wisely. For instance, the following setting for the sum format will not do:

set(Hq,'SumFormat',quantizer());

set(Hq,'OutputFormat',quantizer());

because no overflow occurred with the setting. Having extra bits does no good (the errors are exactly the same as for the case). However, if the LSB weighting is set the same as for the product format, namely, if the following setting is used:

set(Hq,'SumFormat',quantizer());

set(Hq,'OutputFormat',quantizer());

the errors between "ideal" and actual become exactly zero. Of course, in this example it was not necessary to have a full 40-b accumulator to achieve an output exactly equal to what is considered ideal. Once again, from the report generated with "qreport," it was evident that a setting of for both sum and output would have been sufficient.

In an actual DSP, the output is not of the same width as the accumulator, so realistically it is necessary to set the output format back to either 16 b or 32 b in this example. Assuming 32 b for the output, qreport can once again be used to determine the best possible output setting. In this case, is the best setting because the minimum value reported at the output is −0.5357. The two-norm and infinity-norm of the errors are:

norm(yi - yq, 2)

ans = 6.82098421980174e-009

norm(yi - yq, inf)

ans = 3.49245965480804e-010

which compare favorably with the values 7.53800283935414e−007 and 2.93366611003876e−008, respectively (which were the best that could be done for a 32-b output with a 32-b accumulator).

To show the practical benefits of FIR filter design, two FIR filters will be used for a digital downconverter (DDC) for GSM, based on a quad multistandard DDC chip (model 4016) from Graychip.^{18} A DDC essentially has two parts. The first, which consists of a numerically controlled oscillator (NCO) and a mixer, translates an intermediate-frequency (IF) signal to baseband. The second part is a multistage decimator used to isolate the desired signal. This design example will focus on the second part.

For decimation, the 4016 provides for a multistage approach consisting of three FIR filters. Of the three filters, one is a cascaded integrator-comb (CIC) five-stage decimator (Fig. 47) and two are programmable decimate-by-two FIR filters. The five-stage CIC filter takes the high-rate input signal and decimates it by a programmable factor. The CIC filter is followed by a 21-tap compensation FIR (CFIR) filter that equalizes the "droop" due to the CIC filter and provides further lowpass filtering and decimation by two. The CFIR is followed by a 63-tap programmable FIR (PFIR) filter that provides a final decimation by two.

In a multistage decimator, the simplest (highest-rate) filter appears first, with the complexity of the filters increasing in the subsequent stages. In the 4016, the CIC filter operates without multipliers, providing (coarse) lowpass filtering using adders and delays. Its less-than-ideal magnitude response exhibits passband droop that progressively attenuates in-band signals. The relatively simple CFIR filter (only 21 taps) is designed primarily to compensate for the CIC's droop. The PFIR filter is the most complex, with 63 multiplications per sample, and thus operates at the lowest rate. This is an example of a design using fixed filter order. The CFIR and PFIR are linear-phase filters, with 16-b available word length for their coefficients.


The programmable 4016 is designed for use with different communications standards. For this reason, the decimation factor of the CIC filter can be selected as well as the coefficients for both the CFIR and PFIR filters. For GSM, the following requirements apply:

- input sample rate of 69.333248 MHz
- CIC decimation factor of 64
- CFIR input sample rate of 1.083332 MHz
- PFIR input sample rate of 541.666 kHz
- PFIR output sample rate of 270.833 kHz
- passband width of 80 kHz
- passband ripple of less than 0.1 dB peak to peak.
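The sample rates in this list follow directly from the input rate and the three decimation factors, which a short base MATLAB sketch makes explicit. The variable names are illustrative; the numeric values are the ones listed above.

```matlab
% Sketch: sample rate at each stage of the three-stage decimator.
Fs_in  = 69.333248e6;            % input sample rate in Hz
R_cic  = 64;                     % CIC decimation factor
R_cfir = 2;                      % CFIR decimates by two
R_pfir = 2;                      % PFIR decimates by two

Fs_cfir = Fs_in   / R_cic;       % rate at the CFIR input
Fs_pfir = Fs_cfir / R_cfir;      % rate at the PFIR input
Fs_out  = Fs_pfir / R_pfir;      % final output rate
R_total = R_cic * R_cfir * R_pfir;

fprintf('CFIR input: %.3f kHz\n', Fs_cfir/1e3);
fprintf('PFIR input: %.3f kHz\n', Fs_pfir/1e3);
fprintf('Output:     %.3f kHz (total decimation %d)\n', Fs_out/1e3, R_total);
```

The 80-kHz passband is well below half the final output rate, so the passband itself is not aliased by the last decimation stage.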

The CIC filter has five stages and a decimation factor of 64. Its magnitude can be shown (Fig. 48) by creating a CIC decimation object and using the "fvtool" function:

Hcic = mfilt.cicdecun(64, 1, 5);

fvtool(Hcic)

The filter exhibits a |sin(x)/x|^{5} shape. To compensate for the large DC gain (more than 180 dB), the 4016 provides a power-of-two scaling prior to data entering the filter, in order to avoid overflows. Passband details are shown in Fig. 49, with a droop in magnitude response of about 0.4 dB at 80 kHz, much more than the GSM target specification of 0.1 dB peak to peak allows.
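The DC gain and passband droop of the CIC stage can be estimated from its closed-form magnitude response without the mfilt object. The sketch below assumes a five-stage CIC decimator with decimation factor 64 and a differential delay of one, whose magnitude response is |sin(πfR/Fs)/sin(πf/Fs)|^N; the function handle `cic_mag` is a hypothetical helper.

```matlab
% Sketch: DC gain and passband droop of an N-stage CIC decimator,
% assuming a differential delay of 1 (closed-form magnitude response).
R     = 64;                      % decimation factor
N     = 5;                       % number of stages
Fs_in = 69.333248e6;             % input sample rate in Hz

% Magnitude response for f ~= 0 (at f = 0 the gain is R^N)
cic_mag = @(f) abs(sin(pi*f*R/Fs_in) ./ sin(pi*f/Fs_in)).^N;

dc_gain_dB = 20*log10(R^N);                         % gain at DC
f_edge     = 80e3;                                  % passband edge in Hz
droop_dB   = dc_gain_dB - 20*log10(cic_mag(f_edge));

fprintf('DC gain: %.2f dB\n', dc_gain_dB);
fprintf('Droop at %.0f kHz: %.3f dB\n', f_edge/1e3, droop_dB);
```

Since the DC gain is exactly R^N = 2^30, a shift by a power of two is enough to normalize it, which is consistent with the power-of-two input scaling mentioned above.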

To improve on this performance, the "firceqrip" function was chosen for several reasons. It allows for compensation of responses having the form | sin(x)/x |^{N}. It also permits the filter order to be specified, and allows for a slope in the stopband, which can be used to attenuate spectral replicas of the PFIR filter that follows. The function also makes it possible to constrain peak passband and stopband ripples, and allows passband-edge frequency (rather than the cutoff frequency) to be specified. For this example, since the passband is the interval , it is desirable to compensate for the CIC droop in the passband only.

The filter order is determined by the hardware. For the passband-edge frequency, 80 kHz is selected since it is the final passband of interest. Very small passband ripple is chosen (0.01 dB) in order for the overall ripple to be within specification, keeping in mind that there is still the PFIR filter to follow which will add its own passband ripple. The stopband attenuation is selected as 40 dB with a 60-dB slope to provide adequate attenuation of the PFIR spectral replicas. Because this is a five-stage CIC, the droop is of the form |sin(x)/x|^{N} so 5 is selected as the sinc power for which to compensate. Finally, the sinc frequency factor is chosen as 0.5.
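The choice of 5 for the sinc power and 0.5 for the frequency factor can be sanity-checked numerically. The sketch below assumes that the compensator's target passband shape is 1/sinc(cω)^N, with ω normalized so that 1 corresponds to half the CFIR input rate and sinc(x) = sin(πx)/(πx); this is an idealized target, not the actual 21-tap design, and `sincfun` is defined locally so that no toolbox is needed.

```matlab
% Sketch: how well an ideal inverse-sinc target cancels the CIC droop.
% Assumed target shape: 1/sinc(c*w)^N, w = f/(Fs_cfir/2), sinc(x) = sin(pi*x)/(pi*x).
R       = 64;                    % CIC decimation factor
N       = 5;                     % CIC stages, also the sinc power
c       = 0.5;                   % sinc frequency factor
Fs_cfir = 1.083332e6;            % CFIR input rate in Hz
Fs_in   = R * Fs_cfir;           % CIC input rate in Hz

f = linspace(1e3, 80e3, 80);     % passband frequencies in Hz (f = 0 avoided)
w = f / (Fs_cfir/2);             % normalized frequency at the CFIR rate

sincfun = @(x) sin(pi*x) ./ (pi*x);
comp    = 1 ./ sincfun(c*w).^N;                                 % ideal compensator
cic     = abs(sin(pi*f*R/Fs_in) ./ (R*sin(pi*f/Fs_in))).^N;     % unit-DC-gain CIC

residual_dB  = 20*log10(cic .* comp);        % what is left after compensation
max_resid_dB = max(abs(residual_dB));

fprintf('Largest residual over the passband: %.2e dB\n', max_resid_dB);
```

The residual is not exactly zero because the CIC response has a sine, rather than its argument, in the denominator, but over an 80-kHz passband the difference is far below the 0.01-dB ripple chosen for the CFIR design.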

Figure 53 shows an overlay of the GSM spectral mask requirements^{18} with the combined response of the CIC filter and the CFIR filter. It is evident from the plot that the contribution of these two filters is not sufficient to meet the GSM requirements for either adjacent-band rejection or blocker requirements. The combined filter still has a transition band that is too large, due to the large transition band from the CFIR filter.

The PFIR filter is intended to do the extra work required to meet the GSM specifications. It is a linear-phase FIR filter consisting of 63 taps, but its design is not trivial. If a simple lowpass filter is designed with the "remez" or "gremez" functions:

N = 62;

Fs = 541666;

F = /(Fs/2);

A = ;

W = ;

pfir = gremez(N, F, A, W);

Hpfir = mfilt.firdecim(2, pfir);

the passband ripple requirement will not be met. The weights can be altered to get better passband ripple, but the adjacent-band specifications can't be violated in the process. A setting of W = ; would do the trick, but with significantly less adjacent-band attenuation. A compromise can be achieved by setting up the design as a lowpass filter with two separate stopband regions, each with a different weight used in the optimization:

N = 62;

Fs = 541666;

F = /(Fs/2);

A = ;

W = ;

pfir = gremez(N, F, A, W);

Hpfir = mfilt.firdecim(2, pfir);

Figure 54 shows the quantized PFIR filter. The maximum coefficient is 0.3378, so the format is used again. The reference (nonquantized) filter is also shown, but it is practically indistinguishable from the quantized response. The different attenuation in the two stopbands is due to the different weighting. The passband ripple is kept small in order to not exceed the allowable peak-to-peak specification.

Figure 55 shows the magnitude responses of all three (CIC, CFIR, and PFIR) filters, while Fig. 56 shows the overall response of the combination. As the combination response shows, both the GSM spectral mask requirements and the peak-to-peak ripple requirement of 0.1 dB are easily met. The design could be further tuned to provide a smaller transition width at the expense of larger peak-to-peak ripple and/or less adjacent-band rejection.

REFERENCE

- Graychip, Inc., "GC4016 Multistandard Quad DDC Chip Data Sheet," Revision 1.0, August 27, 2001.