

# Design of High Performance and Efficient Architecture for Lifting Based 2D Discrete Wavelet Transform in VLSI

# Dr. D.Sasikala<sup>1</sup> and K.Dhivya<sup>2</sup>

<sup>1</sup>Professor, Department of ECE, Vivekanandha College of Engineering for Women (Autonomous), Tiruchengode, Tamilnadu, India. <sup>2</sup>PG Scholar, Department of ECE, Vivekanandha College of Engineering for Women (Autonomous), Tiruchengode, Tamilnadu, India.

| Article Received: 27 November 2017 | Article Accepted: 30 January 2018 | Article Published: 04 February 2018 |
|------------------------------------|-----------------------------------|-------------------------------------|

#### ABSTRACT

In this paper, efficient and high performance VLSI architectures for lifting based 1D and 2D discrete wavelet transforms (DWTs) are proposed. The area efficient design performs the whole operation with one processing element, and the delay efficient design performs it with multiple processing elements. In both cases, the processing element consists of a floating point adder and the proposed fused multiply add design. The proposed and existing lifting logic are implemented with 45 nm technology. The results show that the proposed designs attain significant improvement over existing architectures. For example, the 9-point 2-parallel proposed single level 1D-DWT achieves a 33.5% reduction in total cycle delay compared with the direct form. Similarly, the 9-point single PE proposed single level 1D-DWT achieves 59.8% and 75.5% reductions in total area and net power, respectively, over the direct form.

Keywords: DWT, 1D-DWT and VLSI.

#### **1. INTRODUCTION**

Wavelets represent the time and frequency properties of a signal efficiently. The DWT decomposes the input values into a set of low pass and high pass outputs. The low pass outputs are sent as input to the next level of operation recursively. In multimedia applications, the DWT is used to remove redundant information, which reduces the storage requirement of image pixels. The DWT is also used in image processing for multiresolution analysis. Image reconstruction, image fusion, and image coding can be done easily with the DWT. The general architectures of the DWT are convolution based and lifting based. In this paper, we consider only the lifting based DWT architectures.

The lifting based DWT has an advantage over the convolution based DWT. Here, the low pass and high pass filters are replaced by upper and lower triangular matrices, and hence the computational complexity is reduced compared with the convolution based DWT. In an N-point 1D or N × N-point 2D DWT, the maximum number of possible levels is $\log_2 N$. With 2 or 3 decomposition levels, the image quality is good and the degradation is reasonable. Beyond 2 or 3 decomposition levels, the image quality is reduced. Therefore, the number of decomposition levels in the implementation of our proposed designs is limited to 2 or 3. Fig. 1 shows a lifting based DWT with 3 levels. Here, P and U represent the predict and update modules. The input sample values are split into odd and even sample values by the split unit. The output from the update unit is sent to the next stage recursively. The low pass and high pass analysis filters are taken as $\tilde{h}(z)$ and $\tilde{g}(z)$ respectively. The corresponding synthesis filters are $h(z)$ and $g(z)$ respectively. Their polyphase matrices are

$$\widetilde{P(z)} = \begin{bmatrix} \widetilde{h_e(z)} & \widetilde{h_o(z)} \\ \widetilde{g_e(z)} & \widetilde{g_o(z)} \end{bmatrix} \text{ and } P(z) = \begin{bmatrix} h_e(z) & h_o(z) \\ g_e(z) & g_o(z) \end{bmatrix}$$

If (h, g) is a complementary filter pair, then the polyphase matrices can be factored into the following lifting steps:



$$\widetilde{P(z)} = \begin{bmatrix} K & 0\\ 0 & \frac{1}{K} \end{bmatrix} \prod_{i=1}^{m} \begin{bmatrix} 1 & \widetilde{S_i(z)} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0\\ \widetilde{t_i(z)} & 1 \end{bmatrix} \tag{1}$$

$$P(z) = \begin{bmatrix} \frac{1}{K} & 0\\ 0 & K \end{bmatrix} \prod_{i=1}^{m} \begin{bmatrix} 1 & -\widetilde{S_i(z^{-1})} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0\\ -\widetilde{t_i(z^{-1})} & 1 \end{bmatrix} \tag{2}$$

The most efficient factorizations of the polyphase matrix for the (5, 3) and (9, 7) filters [24] are shown in Eqs. (3) and (7) respectively. Here, a, b, c, d, e, f, and K are constants.



Fig. 1. Lifting based DWT with 3 levels.

The 2D lifting based discrete wavelet transform outputs are produced in two steps: (1) the row process and (2) the column process. During the row process, each row of the $N \times N$ input signal matrix is 1D transformed and the results are stored in an $N \times \frac{N}{2}$ buffer. After all N rows of the input signal matrix have been completed in the row process, the transpose of the $N \times \frac{N}{2}$ buffer is taken for the column process. In the column process, each row of the transposed buffer matrix is 1D transformed, and the results are the final 2D-DWT values ($\frac{N}{2} \times \frac{N}{2}$). Fig. 2 shows an example of the lifting based 2D discrete wavelet transform with 2 levels of decomposition.
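The row process, transpose, and column process described above can be sketched in a few lines of Python. This is only an illustration of the data flow and buffer shapes: `lowpass_1d` is a hypothetical stand-in (a pairwise average) for the lifting based 1D-DWT, and the function names are assumptions, not part of the proposed hardware.

```python
def lowpass_1d(row):
    """Stand-in 1D transform (pairwise average), NOT the lifting DWT.

    It halves the row length, which is all this sketch needs.
    """
    return [(row[i] + row[i + 1]) / 2 for i in range(0, len(row) - 1, 2)]


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def dwt2d_lowpass(image, transform_1d=lowpass_1d):
    """Row process, transpose, column process for one level.

    Returns (row_buffer, result) where row_buffer is N x N/2 and
    result is N/2 x N/2 for an N x N input with even N.
    """
    # Row process: every row of the N x N input is 1D transformed.
    row_buffer = [transform_1d(row) for row in image]
    # The transpose of the buffer feeds the column process.
    transposed = transpose(row_buffer)
    # Column process: every row of the transposed buffer is 1D transformed.
    column_out = [transform_1d(row) for row in transposed]
    # Transpose back so the result has the same orientation as the input.
    return row_buffer, transpose(column_out)


if __name__ == "__main__":
    image = [[r * 8 + c for c in range(8)] for r in range(8)]
    buffer, result = dwt2d_lowpass(image)
    print(len(buffer), len(buffer[0]))   # buffer shape
    print(len(result), len(result[0]))   # result shape
```

For an 8 × 8 input the intermediate buffer has 8 rows of 4 values and the result has 4 rows of 4 values. Any 1D transform that halves the row length can be passed as `transform_1d`.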

$$P_{1}(z) = \begin{bmatrix} 1 & a(1+z^{-1}) \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ b(1+z) & 1 \end{bmatrix} \tag{4}$$

$$P_{2}(z) = \begin{bmatrix} 1 & c(1+z^{-1}) \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ d(1+z) & 1 \end{bmatrix} \tag{5}$$

$$P_3(z) = \begin{bmatrix} K & 0\\ 0 & \frac{1}{K} \end{bmatrix} \tag{6}$$

$$\widetilde{P(z)}_{(9,7)} = P_1(z)P_2(z)P_3(z) \tag{7}$$

#### 1.1. Related Works

Most lifting based 2D-DWT architectures use one of the following 1D-DWT architectures: (1) direct form, (2) recursive form, or (3) flipping form. All of these architectures use a set of processing elements (PEs), and each PE consists of two adders and one multiplier. The (9, 7) direct form [11] and recursive form [12] architectures are the traditional designs for the (9, 7) DWT, where 4 and 2 PEs are used respectively. The folded recursive [12] lifting based DWT uses half of the direct form (9, 7) hardware, so it takes more cycles than the direct form (9, 7) DWT [11] to complete the operation. In the (5, 3) direct form [10] lifting based DWT, two PEs are connected in series. In the flipping based DWT [19], the scaling constants (multipliers) are inverted to reduce the critical path delay of the lifting structure. The drawback of all these lifting based DWTs is a worst path delay that includes two adders and one multiplier. In the proposed designs, one floating point adder is combined with the floating point multiplier to form a fused multiply add design. Therefore, the worst path delay of the proposed techniques is less than that of the existing designs. Another drawback of the above 1D-DWTs is that they require more cycles. In this paper, 2-parallel DWT architectures are proposed, where two PEs are connected in parallel instead of in series in the direct form of the (5, 3) DWT and the recursive form of the (9, 7) DWT. Therefore, the number of cycles required by the proposed 2-parallel DWT architectures is much lower than that of the existing designs. An SCLA (spatial combinative lifting algorithm) based DWT has also been proposed. Here, line buffers are added to the four PEs of the direct form, where the line buffers act as a transpose buffer to perform the column process. The four PEs are used for both row and column processes through a hardware reuse strategy.

A similar reuse strategy based 2D-DWT has been proposed, where the direct form (9, 7) DWT with four PEs is used for both row and column processes. In the flipping based 2D-DWT, two flipping based 1D-DWTs (row and column processors) are used. Each processor contains four PEs, each with two adders and one multiplier. A similar strategy uses two DWTs with four PEs dedicated to the row and column processes. The drawback of all the above 2D-DWTs is that they require more cycles, which is resolved in our proposed N-parallel architectures. More recently, multiple input and multiple output (parallel) lifting based architectures have been proposed, where the transpose buffer is not used. In the high speed and fast 2D-DWT architectures, two and one 1D-DWTs are used in the column process respectively. In both cases, the row process contains two 1D-DWTs. Therefore, the fast architecture needs less area than the high speed design but more cycles. Here, the drawback is the critical path, which contains two add-shift based multipliers and four adders and is therefore much longer than that of the proposed designs. In the P-block parallel 2D-DWT, $P/2$ processing units are used in each of the row and column processes, each with four PEs. It therefore requires 4P multipliers and 8P adders in parallel to complete the DWT operation. In the design of Tian et al., M recursive architecture based PEs are used in the row process and M direct form PEs are used in the column process, which requires 6M multipliers and 12M adders in parallel. In all the aforementioned transpose buffer free 2D-DWT parallel designs, the main drawback is a critical path delay that includes two adders and one multiplier, which is reduced in the proposed designs.

#### 1.2. Contribution of This Paper

The major objective of this work is to optimize the delay, area, and power requirements of the lifting based 2D-DWT architecture. In general, any one of these parameters can be optimized only at the cost of the others. The proposed logic for the area efficient lifting based DWT performs the whole operation with one processing element. Similarly, the proposed logic for the delay efficient lifting based DWT performs the whole operation with multiple processing elements in parallel. In both cases, the processing element consists of one floating point adder and one proposed fused multiply add design. The worst path delay of the proposed techniques is less than that of existing techniques, because the existing techniques require two adders and one multiplier, whereas the proposed designs combine one floating point adder with the floating point multiplier into a fused multiply add design. Section 2 presents the proposed architectures, Section 3 discusses the design modelling, implementation, and results, Section 4 analyses the FPGA implementation, and Section 5 states the conclusion.

#### 2. THE PROPOSED LIFTING BASED DISCRETE WAVELET TRANSFORM ARCHITECTURE

Fig. 3 shows the proposed lifting based single PE (area efficient) N × N-point 2D-DWT architecture. In this area efficient DWT, only one processing element (PE) is used to compute all the values in the row process. Existing systems use a larger number of processing elements, so the area of the proposed single PE DWT is less than that of existing systems, at the cost of a larger number of cycles. Fig. 4(a) shows the structure of the conventional processing element, which consists of two floating point adders and one floating point multiplier. Here, the inputs are x, y, $B_i$, and $C_i$. Fig. 4(b) shows the proposed processing element, which contains one floating point adder and one proposed FMA. The difference between the floating point multiply accumulation (MAC) and the fused multiply add (FMA) is shown in Eqs. (8) and (9) respectively. Here, $A_i$ and $B_i$ are the present inputs to be multiplied, $F_i$ is the present MAC/FMA result, $F_{i-1}$ is the previous MAC result, and $C_i$ is the third input of the FMA. In the FMA, a new number is added to the present product, whereas in the MAC, the previous MAC result is added to the present product.



Fig. 2. 2D-lifting based discrete wavelet transform with decomposition of two levels.



Fig. 3. Proposed lifting based single PE N  $\times$  N-point 2D-DWT architecture.




Fig. 4. Processing element (PE) used in lifting based DWT (a) conventional (b) proposed.


$$F_i = (A_i \times B_i) + F_{i-1} \tag{8}$$

$$F_i = (A_i \times B_i) + C_i \tag{9}$$
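The difference between Eqs. (8) and (9) can be made concrete with a small Python sketch. It models only the data flow, not the single rounding step of a hardware FMA. The `pe` function is an assumed reading of the proposed processing element (adder output used as $A_i$), and all function names are hypothetical.

```python
def mac_sequence(a_values, b_values, f_init=0.0):
    """Eq. (8): each result adds the present product to the previous result."""
    results = []
    f_prev = f_init
    for a, b in zip(a_values, b_values):
        f_prev = a * b + f_prev
        results.append(f_prev)
    return results


def fma_sequence(a_values, b_values, c_values):
    """Eq. (9): each result adds an independent third input to the product."""
    return [a * b + c for a, b, c in zip(a_values, b_values, c_values)]


def pe(x, y, b, c):
    """Assumed model of the proposed PE: one adder followed by one FMA.

    The adder forms A = x + y, then the FMA computes A * b + c.
    """
    a = x + y
    return a * b + c


if __name__ == "__main__":
    print(mac_sequence([1, 2, 3], [4, 5, 6]))
    print(fma_sequence([1, 2, 3], [4, 5, 6], [10, 20, 30]))
    # One lifting predict step: odd sample plus constant times neighbour sum.
    print(pe(1.0, 3.0, -0.5, 2.0))
```

The MAC carries state from one result to the next, while the FMA takes a fresh third operand each time, which is what a lifting step needs.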



Fig. 5. Storage buffer architecture

Fig. 5 shows the $8 \times 4$ storage buffer architecture, where the control line $en_i = 1$ stores new data and $en_i = 0$ holds the previous data. For example, if the data from the 1D-DWT of the row process has to be stored in the first row of the transpose buffer, then $en_0 = 1$ and $en_1 = \dots = en_7 = 0$. This storage buffer is made up of 2-to-1 multiplexers and registers. The shaded boxes represent the hardware registers. This storage (transpose or temporal) buffer architecture is used in the proposed single PE (5, 3) and (9, 7) 2D-DWTs. The temporal buffer stores the output from the column process of the previous level, which is used as the input to the next level.
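The enable controlled behaviour of the storage buffer can be modelled in software as follows. This is a behavioural sketch under the assumption that each register row is fed by 2-to-1 multiplexers choosing between the new data and the row's own previous contents. The class and method names are hypothetical.

```python
class StorageBuffer:
    """Behavioural model of a rows x cols register buffer with row enables."""

    def __init__(self, rows=8, cols=4):
        self.rows = rows
        self.cols = cols
        self.regs = [[0.0] * cols for _ in range(rows)]

    def clock(self, data, enables):
        """One clock edge: row i loads `data` if enables[i] is 1, else holds."""
        if len(data) != self.cols or len(enables) != self.rows:
            raise ValueError("data or enable width does not match the buffer")
        for i, en in enumerate(enables):
            # 2-to-1 multiplexer: select new data (en = 1) or old value (en = 0).
            if en:
                self.regs[i] = list(data)


if __name__ == "__main__":
    buf = StorageBuffer()
    # en0 = 1, en1 ... en7 = 0: store into the first row only.
    buf.clock([1.0, 2.0, 3.0, 4.0], [1, 0, 0, 0, 0, 0, 0, 0])
    buf.clock([5.0, 6.0, 7.0, 8.0], [0, 1, 0, 0, 0, 0, 0, 0])
    print(buf.regs[0], buf.regs[1])
```

Rows whose enable is 0 keep their earlier contents across clock edges, which is how the buffer accumulates one 1D-DWT result per row.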



Fig. 6. The signal flow graph for lifting based 9-point (5, 3) 1D-DWT with 3-levels.

#### 2.1. Lifting Based (5, 3) DWT

The signal flow graph [24] for the lifting based (5, 3) DWT with 3 levels is shown in Fig. 6. Here, HP and LP are the high pass and low pass outputs respectively. The initial set of inputs is $x_0, x_1, \dots, x_8$. The high pass outputs of the first level are $y_1$, $y_3$, $y_5$, and $y_7$. The low pass outputs of the first level are $y_0$, $y_2$, $y_4$, $y_6$, and $y_8$. The high pass outputs of the second level are $yy_1$ and $yy_3$, and the corresponding low pass outputs are $yy_0$, $yy_2$, and $yy_4$. Here, both constants are treated as floating point values. The low pass outputs are sent to the input of the next level recursively. The low pass and high pass outputs of each level are computed by the processing element. Throughout this document, the symbol X represents a do not care condition in the tables.
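The flow graph described above can be sketched in Python as a software model. The constants -1/2 and 1/4 are the commonly used (5, 3) lifting values and are an assumption here, since the paper does not list them. The symmetric border extension and the function names are likewise assumptions.

```python
def lift_53(x, a=-0.5, b=0.25):
    """One level of (5, 3) lifting on a list of at least 2 samples.

    Returns (low_pass, high_pass). a and b are assumed constants.
    """
    n = len(x)

    def mirror(i):
        # Whole-sample symmetric extension at both borders (assumption).
        if i < 0:
            return -i
        if i >= n:
            return 2 * (n - 1) - i
        return i

    y = list(x)
    # Predict stage: each odd sample becomes a high pass output.
    for i in range(1, n, 2):
        y[i] = x[i] + a * (x[mirror(i - 1)] + x[mirror(i + 1)])
    # Update stage: each even sample becomes a low pass output.
    for i in range(0, n, 2):
        y[i] = x[i] + b * (y[mirror(i - 1)] + y[mirror(i + 1)])
    return y[0::2], y[1::2]


def dwt_53(x, levels):
    """Recursive decomposition: the low pass output feeds the next level."""
    low = list(x)
    high_bands = []
    for _ in range(levels):
        low, high = lift_53(low)
        high_bands.append(high)
    return low, high_bands


if __name__ == "__main__":
    low, bands = dwt_53([1, 2, 3, 4, 5, 6, 7, 8, 9], levels=3)
    print([len(band) for band in bands], len(low))
```

With 9 inputs the first level yields 4 high pass and 5 low pass values, and the second level yields 2 and 3, matching the output counts listed above.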

According to the proposed area efficient technique, only one processing element is used. In the (5, 3) DWT, each level contains two stages (high pass and low pass). The number of cycles required to produce the high pass outputs, $N_{(5,3)}^{HP}$, in the proposed N-point single PE (5, 3) 1D-DWT is given by Eq. (10):

$$N_{(5,3)}^{HP} = \frac{N}{2} \tag{10}$$

The number of cycles required to produce the low pass outputs, $N_{(5,3)}^{LP}$, in the proposed single PE N-point (5, 3) 1D-DWT is given by Eq. (11):

$$N_{(5,3)}^{LP} = \frac{N}{2} \tag{11}$$

Therefore, the number of cycles required to complete a single level, $N_{(5,3),single}^{pro}$, using the proposed single PE N-point (5, 3) 1D-DWT is given by Eq. (12). In other words, $N_{(5,3),single}^{pro}$ is the total number of terms produced in a single level N-point (5, 3) 1D-DWT.

$$N_{(5,3),single}^{pro} = N \tag{12}$$

In the direct form of the N-point (5, 3) 1D-DWT [24], two PEs are used. During the first clock cycle, only the first PE is busy; from the second cycle onwards, both PEs are busy. Therefore, from the second cycle onwards, two terms are produced during each cycle of the single level N-point (5, 3) 1D-DWT. So, the maximum number of cycles required to complete the single level N-point (5, 3) 1D-DWT using the direct form, $N_{(5,3),single}^{direct}$, is given by Eq. (13):

$$N_{(5,3),single}^{direct} = 1 + \frac{N-1}{2} = \frac{N+1}{2} \tag{13}$$
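The cycle counts in Eqs. (10) to (13) can be checked numerically with the short sketch below. The floor and ceiling rounding for odd N is an assumption, chosen to match the 4 high pass and 5 low pass outputs of the 9-point example. The 2D helper assumes $N \times N$ cycles for the row process and half as many (rounded up) for the column process, as in the $13 \times 13$ example of Section 3. Function names are hypothetical.

```python
from math import ceil


def cycles_single_pe_53(n):
    """Eqs. (10)-(12): one term per cycle, so N cycles for N outputs."""
    high_pass = n // 2          # assumed floor for odd N
    low_pass = n - high_pass    # assumed ceiling for odd N
    return high_pass + low_pass


def cycles_direct_53(n):
    """Eq. (13): one cycle to fill the pipeline, then two terms per cycle."""
    return 1 + ceil((n - 1) / 2)


def cycles_2d_single_pe_53(n, levels):
    """Accumulate row and column cycles over the decomposition levels."""
    total = 0
    for _ in range(levels):
        row_cycles = n * n
        column_cycles = ceil(n * n / 2)
        total += row_cycles + column_cycles
        n = ceil(n / 2)  # size of the low pass output fed to the next level
    return total


if __name__ == "__main__":
    print(cycles_single_pe_53(9), cycles_direct_53(9))
    print(cycles_2d_single_pe_53(13, levels=2))
```

For N = 9 the single PE design needs 9 cycles against 5 for the direct form, which is the trade of cycles for area described in this section.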

#### 2.2. Lifting Based (9, 7) DWT

In the signal flow graph for the lifting based (9, 7) DWT with 3 levels, the main difference from the (5, 3) DWT is the number of multiplication constants and scaling factors. Here, there are 4 multiplication constants: a, b, c, and d.
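As an illustration of the four constants and the scaling factor K, the following Python sketch applies the lifting steps of Eqs. (4) to (6) to a 1D signal. The numeric values are the commonly quoted (9, 7) lifting constants and are an assumption, since the paper does not state them. The assignment of a and c to the odd samples, the symmetric border extension, and the function names are also assumptions.

```python
# Commonly quoted (9, 7) lifting constants (assumed, not from the paper).
A = -1.586134342
B = -0.05298011854
C = 0.8829110762
D = 0.4435068522
K = 1.149604398


def _lift_step(y, start, coeff):
    """Add coeff * (left + right neighbour) to every second sample in place."""
    n = len(y)

    def mirror(i):
        # Whole-sample symmetric extension at both borders (assumption).
        if i < 0:
            return -i
        if i >= n:
            return 2 * (n - 1) - i
        return i

    for i in range(start, n, 2):
        y[i] += coeff * (y[mirror(i - 1)] + y[mirror(i + 1)])


def lift_97(x):
    """One level of (9, 7) lifting; returns (low_pass, high_pass)."""
    y = [float(v) for v in x]
    _lift_step(y, 1, A)  # first lifting step on odd samples
    _lift_step(y, 0, B)  # first lifting step on even samples
    _lift_step(y, 1, C)  # second lifting step on odd samples
    _lift_step(y, 0, D)  # second lifting step on even samples
    # Scaling as in Eq. (6): low pass times K, high pass times 1/K.
    low = [K * v for v in y[0::2]]
    high = [v / K for v in y[1::2]]
    return low, high


if __name__ == "__main__":
    low, high = lift_97([3.0] * 9)
    print(low)
    print(high)
```

For a constant input the high pass outputs come out numerically close to zero, which is a quick sanity check on the constants.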



Fig. 7. Proposed single PE recursive architecture of 3-levels lifting based 9-point (9, 7) 1D-DWT.

In the storage (temporal) buffer architecture used in the proposed parallel 2D-DWTs, the control line $en_i = 1$ stores new data and $en_i = 0$ holds the previous data. For example, if the data from the column process has to be stored in the first column of the buffer, then $en_0 = en_1 = \dots = en_4 = 1$. This buffer is made up of 2-to-1 multiplexers and registers. The shaded boxes represent the hardware registers.



Fig. 8. Overall architecture of proposed 1D lifting based DWT.



#### 3. DESIGN MODELING, IMPLEMENTATION, AND RESULTS

The existing 2D lifting based DWTs are modeled in Verilog HDL. These Verilog HDL models are simulated and synthesized using the Cadence 6.1 ASIC design tool with 45 nm technology at an operating voltage of 0.88 V to find the worst path delay, area, power, and EOP. The proposed 2D lifting based DWTs are modeled in Quartus II software. The energy per operation (EOP), or power delay product (PDP), is the average energy consumed per switching event, and is found by multiplying the worst path delay by the sum of the dynamic (switching) and leakage (static) power. Fig. 8 shows the overall proposed architecture of the lifting based 1D-DWT. The proposed architecture performs a particular 1D/2D lifting based DWT by applying the proper select lines to its multiplexers. Here, the select lines of the multiplexers are stored in a look up table with the corresponding address line. The address is initially 0 and is incremented by one during each clock cycle. The proper select lines Sel[Address] are obtained during each clock cycle to perform the required 1D lifting based DWT. Tables 6 and 7 show the theoretical comparison of lifting based N-point 1D-DWT and N × N-point 2D-DWT architectures.
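The look up table control described above, where Sel[Address] is read while the address counts up from 0, can be modelled as below. The table contents are purely hypothetical select words, and the class name is an assumption. The sketch only shows the addressing scheme.

```python
class SelectLut:
    """Look up table of multiplexer select words addressed by a counter."""

    def __init__(self, table):
        self.table = list(table)
        self.address = 0  # the address starts at 0

    def clock(self):
        """Return Sel[Address] for this cycle, then increment the address."""
        select = self.table[self.address]
        self.address += 1
        return select


if __name__ == "__main__":
    # Hypothetical 2-bit select words for three consecutive cycles.
    lut = SelectLut([0b00, 0b01, 0b10])
    print([lut.clock() for _ in range(3)])
```

Changing the transform performed by the architecture then amounts to loading a different table, with no change to the datapath.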

Tables 8, 9, and 10 show the performance analysis of various 1D/2D-DWT architectures. The 13-point 1-level proposed single PE (5, 3) 1D-DWT requires 13 cycles, as shown in Table 3. Therefore, the $13 \times 13$-point proposed single PE (5, 3) 2-level 2D-DWT requires $13 \times 13 = 169$ cycles in the row process of the first level and $\lceil 13 \times 13 / 2 \rceil = 85$ cycles in the column process of the first level. Here, a $7 \times 7$-point low pass output is produced after the first level and is sent as the input to the second level. In the second level, the row and column processes of the proposed single PE (5, 3) 2D-DWT require $7 \times 7 = 49$ and $\lceil 7 \times 7 / 2 \rceil = 25$ cycles respectively, because the 7-point proposed single PE (5, 3) 1D-DWT requires 7 cycles. Therefore, the proposed single PE (5, 3) 2-level $13 \times 13$-point 2D-DWT requires (169 + 85) + (49 + 25) = 328 cycles. In Table 7, the number of cycles is given for a single level 2D-DWT. Therefore, N = 13 for the (5, 3) and N = 9 for the (9, 7) 2D-DWTs during the first level, and N = 7 for the (5, 3) and N = 5 for the (9, 7) 2D-DWTs during the second level. There is a small tolerance between the theoretical number of cycles and the actual number of cycles.

The mean square error (MSE) is defined as

$$MSE = \sigma^2 = \frac{1}{W} \sum_{n=1}^{W} (x_n - y_n)^2$$

where $x_n$ is the input sequence, $y_n$ is the output sequence, and W is the length of the data sequence. The peak signal-to-noise ratio (PSNR) measures the size of the error relative to the peak value $x_{peak}$ of the signal (for 8-bit pixels, $x_{peak}$ equals 255) and is given by:

$$PSNR = 10 \log_{10} \frac{x_{peak}^2}{\sigma^2}$$
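The two quality measures can be computed directly from their definitions. The sketch below assumes the averaging is over the sequence length W and that the peak defaults to 255 for 8-bit pixels. The function names are hypothetical.

```python
import math


def mse(x, y):
    """Mean square error between two equal-length sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def psnr(x, y, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the sequences match."""
    error = mse(x, y)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


if __name__ == "__main__":
    original = [0, 128, 255, 64]
    reconstructed = [0, 127, 255, 66]
    print(mse(original, reconstructed))
    print(psnr(original, reconstructed))
```

A perfect reconstruction has zero MSE, so the PSNR is reported as infinity rather than dividing by zero.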

In floating point arithmetic, the following three bits are kept beyond the least significant bit of the mantissa for rounding: (1) the guard bit (G), (2) the round bit (R), and (3) the sticky bit (S). These are used to perform round up, round down, and round to nearest even.
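How the guard, round, and sticky bits drive round to nearest even can be shown on an integer mantissa. This is a minimal sketch of the usual rule, not the rounding logic of the proposed FMA. It assumes an unsigned mantissa carrying at least three extra low-order bits and does not handle the renormalisation needed when rounding overflows the mantissa width.

```python
def round_nearest_even(value, dropped_bits):
    """Round an unsigned mantissa, discarding `dropped_bits` low bits.

    Returns (rounded_mantissa, (guard, round_bit, sticky)).
    """
    if dropped_bits < 3:
        raise ValueError("need at least guard, round and sticky positions")
    kept = value >> dropped_bits
    # Guard is the first dropped bit, round is the second.
    guard = (value >> (dropped_bits - 1)) & 1
    round_bit = (value >> (dropped_bits - 2)) & 1
    # Sticky is the OR of every remaining lower bit.
    sticky = int((value & ((1 << (dropped_bits - 2)) - 1)) != 0)
    lsb = kept & 1
    # Round up when above half, or exactly half with an odd kept mantissa.
    if guard and (round_bit or sticky or lsb):
        kept += 1
    return kept, (guard, round_bit, sticky)


if __name__ == "__main__":
    for word in (0b1011100, 0b1010100, 0b1010101, 0b1010011):
        print(bin(word), round_nearest_even(word, 3))
```

The first two inputs are exact ties: the odd mantissa rounds up and the even one stays, which is the "even" case named above.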



#### 4. ANALYSIS OF 2D-DWT ARCHITECTURE

The whole 2D-DWT block is implemented in Quartus II software. The multiplier based DWT is implemented on FPGA devices built in different nm technologies, and the power and time are analyzed for each device and tabulated in Table 1.

| Platform    | Device        | Power (mW)    | Time (ns)    |
|-------------|---------------|---------------|--------------|
| Cyclone II  | EP2C15AF484C6 | 131.13        | 69.383       |
| Cyclone III | EP3C1U484C6   | 105.44        | 55.075       |
| Stratix II  | EP2S15F484C3  | 361.09        | 45.398       |
| Stratix III | EP3SL50F484C2 | 442.58        | 35.068       |

Table 1. Analysis of the 2D-DWT architecture on different platforms.

#### 4.1. Simulation Waveform of Lifting Based 2D-DWT

The simulation waveform of the lifting based 2D-DWT architecture is shown in Fig. 9.

[Figure: functional simulation waveform (signals: clk, rst, a, b, c, d, odd, even, highpass, lowpass)]

Fig. 9. Simulation waveform of the lifting based 2D-DWT.

#### **5. CONCLUSION**

In this paper, high performance VLSI architectures for lifting based 2D-DWTs are proposed. The proposed logic for the area efficient lifting based DWT performs the whole operation with one processing element, and the proposed logic for the delay efficient lifting based DWT performs it with multiple processing elements in parallel. In both cases, the processing element consists of one floating point adder and one proposed fused multiply add design. The proposed and existing 2D lifting based DWTs are implemented with 45 nm technology. The results show that the proposed designs achieve significant improvement over existing architectures. For example, the 9-point 2-parallel proposed (9, 7) single level 2D-DWT achieves a 33.5% reduction in total cycle delay compared with the direct form. Similarly, the 9-point single PE proposed (9, 7) single level 2D-DWT achieves 59.8% and 75.5% reductions in total area and net power, respectively, over the direct form.



#### REFERENCES

[1] K. Andra, C. Chakrabarti, T. Acharya, A VLSI architecture for lifting-based forward and inverse wavelet transform, IEEE Trans. Signal Process. 50 (4) (2002) 966–977.

[2] M. Antonini, M. Barlaud, P. Mathieu, I. Daubechies, Image coding using wavelet transform, IEEE Trans. Image Process. 1 (2) (1992) 205–220.

[3] Y. Choi, J.-Y. Koo, N.-Y. Lee, Image reconstruction using the wavelet transform for positron emission tomography, IEEE Trans. Med. Imaging 20 (11) (2001) 1188–1193.

[4] I. Daubechies, The wavelet transform, time-frequency localization and signal analysis, IEEE Trans. Inf. Theory 36 (5) (1990) 961–1005.

[5] R.A. DeVore, B. Jawerth, B.J. Lucier, Image compression through wavelet transform coding, IEEE Trans. Inf. Theory 38 (2) (1989) 719–746.

[6] S.G. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Trans. Pattern Anal. Mach. Intell. 11 (7) (1989) 674–693.

[7] J. Nunez, X. Otazu, O. Fors, A. Prades, V. Pala, R. Arbiol, Multiresolution-based image fusion with additive wavelet decomposition, IEEE Trans. Geosci. Remote Sens. 37 (3) (1994) 1204–1211.

[8] G. Shi, W. Liu, L. Zhang, F. Li, An efficient folded architecture for lifting-based discrete wavelet transform, IEEE Trans. Circuits Syst. II 56 (4) (2009) 290–294.

[9] P.P. Vaidyanathan, Quadrature mirror filter banks, M-band extensions and perfect-reconstruction techniques, IEEE ASSP Mag. (Jul. 1987) 4–20.

[10] M. Vetterli, Wavelets and filter banks for discrete time signal processing, in: R. Coifman et al. (Eds.), Wavelets and their Applications, Jones and Bartlett, 1991.