# Realization of Low Power FIR Digital Filter using Modified DA-based Architecture

#### Sheravi Shivaraj Ramu\* and S. Umadevi

School of Electronics Engineering, VIT University, Chennai Campus, Chennai-600127, Tamil Nadu, India; sheravi.shivaraj2013@vit.ac.in, umadevi.s@vit.ac.in

#### Abstract

**Objectives:** Finite Impulse Response (FIR) filters are widely used in image and signal processing applications and account for a large share of total power dissipation. This research work presents a modified Distributed Arithmetic (DA)-based approach for the realization of a low power FIR digital filter. **Methods/Statistical Analysis:** Power reduction in the modified DA-based filter is achieved by turning off the MOS components whenever the input samples of the circuit are zero; that is, the proposed filter disables the adder operation when inputs are zero. In conventional DSP applications, the proportion of zero-valued input samples is observed to be very high. Therefore, a substantial reduction in power consumption can be obtained with the proposed modified DA-based architecture. The proposed DA-based FIR filter is designed using the Xilinx<sup>®</sup> ISE design tool and the Cadence<sup>®</sup> EDA tool. **Findings:** Power reduction is achieved in the modified DA-based FIR filter realization when compared with the existing LUT-less DA-based FIR filter architecture. Simulation results show that the proposed architecture achieves a significant power reduction with only a small percentage increase in area and delay. For a four tap FIR filter, 12.5% power reduction is achieved at the cost of 0.63% extra chip area and 1% increased delay in comparison with the conventional DA-based filter architecture. The proposed filter is therefore expected to achieve even greater power saving for higher order FIR filters and for inputs containing a larger number of zeros. The proposed technique can also be used for reconfigurable architectures where filter coefficients change during runtime. **Conclusion:** A low power modified DA-based FIR digital filter is proposed. The power reduction in the proposed design is due to turning off MOS components and is linearly related to the number of zeros in the input and the order of the filter.

**Keywords:** Distributed Arithmetic (DA), Finite Impulse Response (FIR), Look Up Table (LUT)-based and Look Up Table Less Computing, Low Power, Memory based Computing

## 1. Introduction

Finite Impulse Response (FIR) digital filters are key components in many image processing and signal processing applications. The order of an FIR filter primarily determines the width of its transition band: the sharper the transition between a pass-band and the adjacent stop-band, the higher the required filter order. Many applications in speech processing (adaptive noise cancellation), seismic signal processing (noise elimination), digital communication (channel equalization, frequency channelization), and several other areas of signal processing require large order FIR filters<sup>1,2</sup>. The basic structure of an FIR filter consists of multipliers, adders and delay elements. As the filter order increases, the number of these basic components also increases, which makes higher order filters large. Over the years, many attempts have been made to obtain more efficient filter structures. DA-based, memory-based and LUT-less structures are some of the popular filter structures<sup>3-6</sup>.

Distributed Arithmetic (DA)-based architecture is known for its efficient memory-based realization of FIR filters, in which filter outputs are computed as the inner product of an input-sample vector and a filter-coefficient vector. Memory-based architectures have several advantages over other structures, such as reduced latency and greater potential for high throughput, and because of their lower switching activity they are expected to have lower dynamic power consumption. The number of Multiply-Accumulate (MAC) operations required per filter output increases linearly with the order of the FIR filter, which makes real-time implementation of large order filters a challenging task. Therefore, several attempts have been made, and continue to be made, to design less complex, dedicated VLSI systems for high order filters<sup>7-9</sup>. In the DA-based approach, a Look-Up-Table (LUT) is used to store all possible sums of the filter coefficients, and the LUT size increases exponentially with the order of the filter. Efforts have been made to reduce the memory size in DA-based systems using group distributed techniques and Offset Binary Coding (OBC). A LUT decomposition scheme is presented in<sup>6</sup> for reducing the memory space required for DA-based implementation of FIR filters. However, the reduction in memory size obtained by such decompositions comes with an increase in the number of adders and latches as well as in latency.

\*Author for correspondence

Reconfigurable FIR filters, whose coefficients change dynamically during runtime, play an important role in digital up/down converters<sup>10</sup>, multichannel filters<sup>11</sup> and software defined radio systems<sup>12,13</sup>. A general multiplier-adder based design needs a very large chip area and therefore limits the maximum possible order of the FIR filter in high-throughput applications. The main operations required for a DA-based implementation are a sequence of LUT accesses followed by shift and accumulate operations on the LUT output. The conventional DA-based structure used for the realization of an FIR filter assumes that the impulse response coefficients are constant, and this assumption makes it possible to use ROM-based LUTs. The memory requirement of a DA-based realization increases exponentially with the filter order.

The objective of this research work is to design a novel architecture for a low power FIR filter. The novel architecture reduces power consumption by reducing the switching activity of the adder modules involved in the design. The proposed design is synthesized and results are obtained using the Xilinx<sup>®</sup> ISE design tool and the Cadence<sup>®</sup> EDA tool.

This paper is organized as follows. In Section 2, existing FIR filter structures such as the DA-based FIR filter, the partitioned DA-based FIR filter and the LUT-less FIR filter are discussed. In Section 3, the proposed reconfigurable, modified FIR filter structure is discussed. Results and comparisons are discussed in Section 4 and the conclusion is presented in Section 5.

## 2. Existing FIR Filter Structures

#### 2.1 DA-based FIR Filter Structure

The distributed arithmetic approach<sup>15</sup> is an important technique for computing Sum of Products (SOP) expressions. The equations for DA-based computing can be obtained as follows. The sum of products is given as

$$y = \sum_{n=0}^{N-1} A_n x_n \tag{1}$$

For an unsigned DA system with a word length of $B$ bits, $x_n$ can be represented by

$$x_{n} = \sum_{b=0}^{B-1} 2^{b} x_{n,b}, \quad x_{n,b} \in \{0, 1\} \tag{2}$$
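As an intermediate step (a standard rearrangement, spelled out here for clarity rather than quoted from the source), substituting (2) into (1) and exchanging the order of summation groups the computation by bit position:

$$y = \sum_{n=0}^{N-1} A_n \sum_{b=0}^{B-1} 2^{b} x_{n,b} = \sum_{b=0}^{B-1} 2^{b} \sum_{n=0}^{N-1} A_n x_{n,b}$$

The inner sum depends only on the $N$ bits $x_{0,b}, \ldots, x_{N-1,b}$, so it can take at most $2^N$ distinct values. This is the quantity that a DA implementation precomputes and stores.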

For a signed (two's complement) representation, in which the most significant bit carries the weight $-2^{B-1}$, the DA-based system equation is written as

$$y = -2^{B-1} \sum_{n=0}^{N-1} A_n x_{n,B-1} + \sum_{b=0}^{B-2} 2^b \sum_{n=0}^{N-1} A_n x_{n,b} \tag{3}$$

The inner sums $\sum_{n=0}^{N-1} A_n x_{n,b}$ are precomputed for every possible bit pattern and mapped into a LUT. A $2^N$-word LUT therefore accepts the N-bit vector $x_b = (x_{0,b}, \ldots, x_{N-1,b})$ as its address and returns the corresponding sum. Figure 1 shows the implementation of a four tap DA-based FIR filter.

Serial inputs are converted into parallel bits and stored in registers. The last bits of the registers are combined to form the bit pattern that addresses the data stored in the LUT or ROM (Read Only Memory). At every clock pulse, the previous output is shifted and added to the current LUT output, and this continues until the last clock pulse. Registers at the output side temporarily store the output. Shift registers take the output from the output registers, and the shifted output is added to the current value from the LUT to derive the new output. The data stored in the LUT depend on the memory position: for example, memory position 1010 stores $h_3 + h_1$, memory position 1101 stores $h_3 + h_2 + h_0$, and so on. In this paper, 'h' represents the filter coefficient values.

Figure 1. DA-based FIR Filter Design.
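The LUT addressing and shift-accumulate steps described above can be sketched in software. This is a minimal behavioural model, not the hardware of Figure 1: the function names `build_da_lut` and `da_inner_product`, the example coefficients and the LSB-first processing order are all assumptions made for illustration.

```python
def build_da_lut(h):
    """Precompute the 2^N partial sums of the coefficients.

    Address bit n (bit 0 = LSB) selects coefficient h[n], so address
    0b1010 holds h[3] + h[1], matching the example in the text.
    """
    n_taps = len(h)
    return [sum(h[n] for n in range(n_taps) if (addr >> n) & 1)
            for addr in range(1 << n_taps)]


def da_inner_product(h, x, bits):
    """Bit-serial DA evaluation of sum(h[n] * x[n]) as in equation (3).

    x holds signed integers that fit in `bits`-bit two's complement.
    """
    lut = build_da_lut(h)
    mask = (1 << bits) - 1
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample to form the LUT address.
        addr = 0
        for n, sample in enumerate(x):
            addr |= (((sample & mask) >> b) & 1) << n
        term = lut[addr] << b
        # The sign bit carries the weight -2^(B-1).
        acc += -term if b == bits - 1 else term
    return acc


h = [3, -1, 4, 2]   # hypothetical coefficients h0..h3
x = [5, -3, 0, 7]   # hypothetical input samples
y = da_inner_product(h, x, bits=8)
```

Here `y` equals the direct inner product of `h` and `x`, and no multiplication of a coefficient by a sample is performed: only table lookups, shifts and additions.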

The advantage of the DA-based FIR filter structure is that it avoids the multipliers which, together with adders and delay elements, are the basic components of a general FIR digital filter structure. The elimination of multipliers gives a significant power reduction, which made this structure very popular and led many researchers to focus on DA-based FIR filter structures. The structure also has limitations. Even though multipliers are eliminated, memory blocks are introduced into the structure, and the LUT size grows as the order of the filter increases: an N-tap FIR filter needs a memory block of $2^N$ words. For large order FIR filters the structure therefore becomes bulky. Several attempts have been made to reduce the LUT/ROM size.

#### 2.2 DA-based FIR Filter With ROM Partitioning

The conventional DA-based architecture requires $2^{N}$ memory locations, but this can be reduced by partitioning the ROM<sup>15</sup>. If the order of the filter is too large, the ROM can be partitioned and the partial results added. Suppose an inner product of length $AB$,

$$y = \sum_{n=0}^{AB-1} A_n x_n \tag{4}$$

is to be implemented using a DA-based architecture. Here the subscripted $A_n$ denotes a coefficient, while the plain symbols $A$ and $B$ denote the number of partitions and the length of each partition. The sum can be partitioned into $A$ separate parallel DA LUTs, each addressed by $B$ bits, according to equation (5).

$$y = \sum_{n=0}^{A-1} \sum_{m=0}^{B-1} A_{Bn+m} x_{Bn+m} \tag{5}$$

The concept of partitioned DA is illustrated in Figure 2, where an eight tap FIR filter is split into four slices, which requires three adders for the post-addition process.
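The memory saving from partitioning can be illustrated with a short behavioural sketch. It assumes unsigned samples and uses hypothetical names (`build_lut`, `partitioned_da`) and example values; it models the arithmetic only, not the structure of Figure 2.

```python
def build_lut(coeffs):
    """All partial sums of a coefficient slice, indexed by bit pattern."""
    return [sum(c for n, c in enumerate(coeffs) if (addr >> n) & 1)
            for addr in range(1 << len(coeffs))]


def partitioned_da(h, x, bits, slice_len):
    """Unsigned DA inner product using one small LUT per slice.

    Returns the result and the total number of LUT words used.
    """
    assert len(h) == len(x) and len(h) % slice_len == 0
    slices = [(build_lut(h[i:i + slice_len]), x[i:i + slice_len])
              for i in range(0, len(h), slice_len)]
    acc = 0
    for b in range(bits):
        partial = 0
        for lut, xs in slices:
            addr = 0
            for n, sample in enumerate(xs):
                addr |= ((sample >> b) & 1) << n
            partial += lut[addr]      # post-addition of the slice outputs
        acc += partial << b           # shift and accumulate
    total_words = sum(len(lut) for lut, _ in slices)
    return acc, total_words


h8 = [1, 2, 3, 4, 5, 6, 7, 8]         # hypothetical eight tap coefficients
x8 = [1, 0, 2, 3, 0, 0, 1, 5]         # hypothetical unsigned 4-bit samples
y8, words = partitioned_da(h8, x8, bits=4, slice_len=2)
```

With four slices of two taps each, the four LUTs hold 4 words apiece, 16 words in total, compared with $2^8 = 256$ words for a single unpartitioned LUT. The price is the extra additions needed to combine the four slice outputs.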

#### 2.3 LUT-less DA Architecture

Figure 2. Partitioned DA-based FIR Filter.

In Ref.<sup>14</sup>, a new architecture for high speed, large order FIR digital filters is proposed. In this architecture, multiplexers and adders are used to replace the LUT. Because of the absence of a LUT, the structure is called the LUT-less architecture. The architecture of the filter is shown in Figure 3. The LUT-less design is the best available structure and is used as the reference against which the proposed architecture is compared. Another important advantage of this structure is that it can be used for dynamic structures or reconfigurable filters where the filter coefficients change during runtime. Even though LUT-less structures perform better than LUT-based architectures, they end up with added latency.
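The idea of replacing the LUT with multiplexers and adders can be sketched as follows. This is an illustrative model under assumed names (`lutless_word`) and example coefficients, not the circuit of Ref.<sup>14</sup>: each tap has a 2:1 multiplexer that passes either its coefficient or zero, and an adder stage sums the multiplexer outputs.

```python
def lutless_word(h, addr_bits):
    """Compute on the fly the value a DA LUT would store at this address.

    addr_bits[n] is the current bit of sample n and selects h[n] or 0.
    """
    mux_out = [c if bit else 0 for c, bit in zip(h, addr_bits)]  # one mux per tap
    return sum(mux_out)  # adder stage: executed whatever the bit values are


h = [3, -1, 4, 2]                       # hypothetical coefficients h0..h3
word = lutless_word(h, [0, 1, 0, 1])    # bits of x0..x3, i.e. address 1010
```

Because the coefficients are read when the word is computed rather than baked into a ROM, changing `h` at runtime requires no table rebuild, which is why this style suits reconfigurable filters. Note that the additions are carried out even when the selected inputs are zero.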

## 3. Proposed FIR Filter Structure

#### 3.1 Basic Concept

Figure 3. LUT-less DA-based Architecture for N=4.

In the LUT-less architecture, the addition operation is performed irrespective of whether the select input is zero or one. These addition operations result in added latency. A novel architecture is therefore designed in which, depending on the input condition, the filter coefficients are either added or bypassed. This is achieved by including an additional multiplexer and tri-state buffers, while the multiplexers required before the addition operation are eliminated. The modification thus targets the addition operation: the addition takes place only when it is required, and otherwise the bypass operation takes place. The proposed concept is explained with the circuit diagrams shown in Figure 4 and Figure 5.

In Figure 4, the addition operation is always executed irrespective of the inputs, as noted above. In the proposed circuit, the addition operation is bypassed depending on the values of x[n-3] and x[n-2]. The switching activity of the adder module is presented in Table 1. The addition operation is performed only if both inputs are 'high'; otherwise the adder circuit is turned off, resulting in power saving. If exactly one of the input bits is 'high', the corresponding coefficient bypasses the adder and is passed directly to the output. If both input bits are 'low', the 'low' value, i.e. '0', is transferred. By using this bypassing technique, the addition operation can be avoided in the proposed structure.



**Figure 4.** Conventional Addition Operation in LUT-less architectures.



Figure 5. Modified adder structure.

| Input Pattern (x[n-3], x[n-2]) | Adder Module | Output        |
|--------------------------------|--------------|---------------|
| 00                             | OFF          | 0             |
| 01                             | OFF          | $h_{2}$       |
| 10                             | OFF          | $h_{3}$       |
| 11                             | ON           | $h_{3}+h_{2}$ |

Table 1. Adder outputs for different input patterns

Depending on the input pattern values listed in Table 1, the adder module is turned ON or OFF. For the considered example, the adder module is OFF for three of the four input patterns. If the four patterns are equally likely, the adder is therefore inactive about 75% of the time, which gives a correspondingly large reduction in the power dissipated by the adder.
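The behaviour in Table 1 can be captured in a few lines. This is a behavioural sketch only: the function name `modified_adder` and the coefficient values are hypothetical, and the returned flag simply records whether the adder would be enabled.

```python
def modified_adder(x3_bit, x2_bit, h3, h2):
    """Return (output, adder_on) for one two-input module of Table 1."""
    if x3_bit and x2_bit:
        return h3 + h2, True    # the only pattern that enables the adder
    if x3_bit:
        return h3, False        # h3 bypasses the adder
    if x2_bit:
        return h2, False        # h2 bypasses the adder
    return 0, False             # both bits low: a zero is transferred


h3, h2 = 7, 5                   # hypothetical coefficient values
rows = [(a, b) + modified_adder(a, b, h3, h2)
        for a in (0, 1) for b in (0, 1)]
adder_off = sum(1 for row in rows if not row[3])
```

Enumerating the four patterns gives `adder_off == 3`, the three-out-of-four figure quoted above. How much power this saves in practice depends on how often each pattern actually occurs in the input data.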

#### 3.2 Proposed FIR Digital Filter Design using Modified DA Architecture

The existing multiplexer and adder modules are replaced with the modified adder structure to obtain the proposed FIR digital filter architecture. The design of a four tap FIR filter using the proposed technique is shown in Figure 6.

For an N-tap filter, N/2 multiplexers are needed. The input of an N-tap filter is partitioned into slices of two bits each, one slice per multiplexer. For the four tap filter of Figure 6, the outputs of the two multiplexers are added, and shift and accumulate operations are then performed as in a simple DA-based filter.
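To make the data flow concrete, the following sketch models one output of a four tap filter built from two bypassing modules, and counts how often the first-stage adders would be enabled. It assumes unsigned samples; the names `bypass_add` and `modified_da_output` and all numeric values are illustrative and do not come from the design in Figure 6.

```python
def bypass_add(bit_a, bit_b, h_a, h_b):
    """Two-input module: add only when both bits are 1, else bypass."""
    if bit_a and bit_b:
        return h_a + h_b, 1
    return (h_a if bit_a else h_b if bit_b else 0), 0


def modified_da_output(h, x, bits):
    """One output of a 4-tap filter for unsigned samples x.

    Returns the result and the number of first-stage additions performed.
    """
    assert len(h) == len(x) == 4
    acc = 0
    first_stage_adds = 0
    for b in range(bits):
        xb = [(s >> b) & 1 for s in x]
        low, on_low = bypass_add(xb[1], xb[0], h[1], h[0])
        high, on_high = bypass_add(xb[3], xb[2], h[3], h[2])
        first_stage_adds += on_low + on_high
        acc += (low + high) << b      # combine the slices, shift and accumulate
    return acc, first_stage_adds


h = [3, 1, 4, 2]                                              # hypothetical coefficients
y_sparse, adds_sparse = modified_da_output(h, [5, 0, 0, 7], bits=4)
y_dense, adds_dense = modified_da_output(h, [3, 3, 3, 3], bits=2)
```

For the sparse input, where one sample in each pair is zero, no first-stage addition is ever enabled, whereas a structure that always adds would perform two per bit. For the dense input every bit position enables both adders, so nothing is saved. This matches the observation that the benefit grows with the number of zeros in the input.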

## 4. Simulation Results and Discussion

**Figure 6.** Modified DA-based FIR Filter Structure for N=4.

In this section, performance analysis and comparisons of different FIR digital filters are provided. All the FIR filter structures presented in this paper are designed using the Xilinx<sup>®</sup> ISE design tool and the Cadence<sup>®</sup> EDA tool. The RTL design of the proposed DA-based FIR filter is verified and converted to layout using the Cadence<sup>®</sup> Encounter tool. The simulation flow for the proposed design is shown in Figure 7.

The performance comparisons for the existing architectures and proposed structure are listed in Table 2.



Figure 7. Simulation Flow.

Table 2. Comparison of different filter structures for a four tap FIR filter

| Filter structure             | Area (µm<sup>2</sup>) | Power Dissipation (µW) | Delay (ns) |
|------------------------------|-----------------------|------------------------|------------|
| Simple FIR Filter            | 9790                  | 751.255                | 41.445     |
| LUT-less DA-based FIR Filter | 2075                  | 150.548                | 47.860     |
| Proposed FIR Filter          | 2092                  | 131.721                | 48.328     |





Table 4. Power dissipation for different filter structures

Table 5. Delay comparison for different filter structures



Adder circuits are responsible for a large part of the power dissipation in basic FIR filter structures and DA-based structures. Since adders are bypassed in the proposed FIR filter design, a significant power reduction is achieved. Table 2 shows that, relative to the LUT-less DA-based filter, the proposed four tap FIR filter design reduces power consumption by 12.5% (from 150.548 µW to 131.721 µW) at the cost of an area increase of under 1% (from 2075 µm<sup>2</sup> to 2092 µm<sup>2</sup>) and a delay increase of about 1% (from 47.860 ns to 48.328 ns). These minor effects on area and speed are acceptable given the larger goal of power reduction. The power saving is linearly related to the number of zeros in the input sample values and to the order of the filter. The proposed DA-based FIR filter is therefore advantageous for current DSP applications where higher order filters are required.
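The relative changes can be recomputed directly from the Table 2 entries. The snippet below is a small sanity check, with hypothetical dictionary keys and a hypothetical helper `percent_change`; the numbers are the LUT-less and proposed rows of Table 2.

```python
# Values copied from Table 2 (four tap FIR filter).
table2 = {
    "lut_less": {"area_um2": 2075, "power_uw": 150.548, "delay_ns": 47.860},
    "proposed": {"area_um2": 2092, "power_uw": 131.721, "delay_ns": 48.328},
}


def percent_change(new, old):
    """Signed relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / old


changes = {
    key: percent_change(table2["proposed"][key], table2["lut_less"][key])
    for key in table2["proposed"]
}
```

Evaluated on these entries, the power change is about -12.5%, the delay change is just under +1%, and the area change is about +0.8%.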

## 5. Conclusion

A low power DA-based FIR digital filter is proposed. The proposed filter reduces power consumption by disabling the adder operation when inputs are zero. To validate the proposed design, it is simulated using the Xilinx<sup>®</sup> ISE ISim simulator and then synthesized using the Cadence<sup>®</sup> RTL Compiler, and power consumption and other parameters are evaluated. Simulation results show that the proposed architecture achieves a significant power reduction with only a small percentage increase in area and delay. For a four tap FIR filter, 12.5% power reduction is achieved at the cost of 0.63% extra chip area and 1% increased delay in comparison with conventional filter architectures. The proposed DA-based filter is therefore expected to achieve even greater power saving for higher order FIR filters and for inputs containing a larger number of zeros.

## 6. References

- 1. Xu D, Chiu J. Design of high-order FIR digital filtering and variable gain ranging seismic data acquisition system. Proceedings of IEEE Southeastcon; 1993 Apr 4–7; Charlotte, NC. p. 6.
- 2. Mirchandani G, Zinser Jr RL, Evans JB. A new adaptive noise cancellation scheme in the presence of crosstalk [speech signals]. IEEE Trans Circuits Syst II Analog Digit Signal Process. 1995 Oct; 39(10):681–94.
- 3. Chang T-S, Chen C, Jen C-W. New distributed arithmetic algorithm and its application to IDCT. IEE Proceedings: Circuits Devices Syst. 1999 Aug; 146(4):159–63.
- 4. Choi J-P, Shin S-C, Chung J-G. Efficient ROM size reduction for distributed arithmetic. International Symposium on Circuits and Systems (ISCAS 2000); 2000 May 28–31; Geneva, Switzerland: IEEE. 2:61–4.
- 5. Hwang S, Han S, Kang S, Kim J. New distributed arithmetic algorithm for low-power FIR filter implementation. IEEE Signal Processing Letters. 2004 May; 11(5):463–6.
- 6. Xie J, He J, Tan G. FPGA realization of FIR filters for high-speed and medium-speed by using modified distributed arithmetic architectures. Microelectronics Journal. 2010; 41(6):365–70.
- 7. Kha HH, Tuan HD, Vo B-N, Nguyen TQ. Symmetric orthogonal complex-valued filter bank design by semidefinite programming. IEEE Trans Signal Process. 2007 Sep; 55(9):4405–14.
- 8. Dam HH, Cantoni A, Teo KL, Nordholm S. FIR variable digital filter with signed power-of-two coefficients. IEEE Trans Circuits Syst I Reg Papers. 2007 Jun; 54(6):1348–57.
- 9. Mahesh R, Vinod AP. A new common subexpression elimination algorithm for realizing low-complexity higher order digital filters. IEEE Trans Computer-Aided Des Integr Circuits Syst. 2008 Feb; 27(2):217–29.
- 10. Hatai I, Chakrabarti I, Banerjee S. Reconfigurable architecture of a RRC FIR interpolator for multi-standard digital up converter. Proceedings IEEE 27th IPDPSW. 2013 May; 247–51.
- 11. Ming L, Chao Y. The multiplexed structure of multi-channel FIR filter and its resources evaluation. Proceedings Int Conf CDCIEM; 2012. p. 764–8.
- 12. Hentschel T, Henker M, Fettweis G. The digital front-end of software radio terminals. IEEE Pers Commun Mag. 1999 Aug; 6(4):40–6.
- 13. Chen K-H, Chiueh T-D. A low-power digit-based reconfigurable FIR filter. IEEE Trans Circuits Syst II Exp Briefs. 2006 Aug; 53(8):617–21.
- 14. Eshtawie MAM, Othman M. On-line DA-LUT architecture for high-speed high-order digital FIR filters. Proceedings of the IEEE International Conference on Communication Systems (ICCS); 2006 Nov; Singapore. p. 5.
- 15. Parhi KK. VLSI Digital Signal Processing Systems. Wiley; 1999.