Design of Low Power and High Speed Digital IIR Filter in 45nm with Optimized CSA for Digital Signal Processing Applications

In this paper, a design methodology to implement low-power and high-speed 2nd order recursive digital Infinite Impulse Response (IIR) filter has been proposed. Since IIR filters suffer from a large number of constant multiplications, the proposed method replaces the constant multiplications by using addition/subtraction and shift operations. The proposed new 6T adder cell is used as the Carry-Save Adder (CSA) to implement addition/subtraction operations in the design of recursive section IIR filter to reduce the propagation delay. Furthermore, high-level algorithms designed for the optimization of the number of CSA blocks are used to reduce the complexity of the IIR filter. The DSCH3 tool is used to generate the schematic of the proposed 6T CSA based shift-adds architecture design and it is analyzed by using Microwind CAD tool to synthesize low-complexity and high-speed IIR filters. The proposed design outperforms in terms of power, propagation delay, area and throughput when compared with MUX-12T, MCIT-7T based CSA adder filter design. It is observed from the experimental results that the proposed 6T based design method can find better IIR filter designs in terms of power and delay than those obtained by using efficient general multipliers.

Comparative Study of Ant Colony and Genetic Algorithms for VLSI Circuit Partitioning

This paper presents a comparative study of Ant Colony and Genetic Algorithms for VLSI circuit bi-partitioning. Ant colony optimization is an optimization method based on behaviour of social insects [27] whereas Genetic algorithm is an evolutionary optimization technique based on Darwinian Theory of natural evolution and its concept of survival of the fittest [19]. Both the methods are stochastic in nature and have been successfully applied to solve many Non Polynomial hard problems. Results obtained show that Genetic algorithms out perform Ant Colony optimization technique when tested on the VLSI circuit bi-partitioning problem.

Spacecraft Neural Network Control System Design using FPGA

Designing and implementing intelligent systems has become a crucial factor for the innovation and development of better products of space technologies. A neural network is a parallel system, capable of resolving paradigms that linear computing cannot. Field programmable gate array (FPGA) is a digital device that owns reprogrammable properties and robust flexibility. For the neural network based instrument prototype in real time application, conventional specific VLSI neural chip design suffers the limitation in time and cost. With low precision artificial neural network design, FPGAs have higher speed and smaller size for real time application than the VLSI and DSP chips. So, many researchers have made great efforts on the realization of neural network (NN) using FPGA technique. In this paper, an introduction of ANN and FPGA technique are briefly shown. Also, Hardware Description Language (VHDL) code has been proposed to implement ANNs as well as to present simulation results with floating point arithmetic. Synthesis results for ANN controller are developed using Precision RTL. Proposed VHDL implementation creates a flexible, fast method and high degree of parallelism for implementing ANN. The implementation of multi-layer NN using lookup table LUT reduces the resource utilization for implementation and time for execution.

Estimation of Attenuation and Phase Delay in Driving Voltage Waveform of a Digital-Noiseless, Ultra-High-Speed Image Sensor

Since 2004, we have been developing an in-situ storage image sensor (ISIS) that captures more than 100 consecutive images at a frame rate of 10 Mfps with ultra-high sensitivity as well as the video camera for use with this ISIS. Currently, basic research is continuing in an attempt to increase the frame rate up to 100 Mfps and above. In order to suppress electro-magnetic noise at such high frequency, a digital-noiseless imaging transfer scheme has been developed utilizing solely sinusoidal driving voltages. This paper presents highly efficient-yet-accurate expressions to estimate attenuation as well as phase delay of driving voltages through RC networks of an ultra-high-speed image sensor. Elmore metric for a fundamental RC chain is employed as the first-order approximation. By application of dimensional analysis to SPICE data, we found a simple expression that significantly improves the accuracy of the approximation. Similarly, another simple closed-form model to estimate phase delay through fundamental RC networks is also obtained. Estimation error of both expressions is much less than previous works, only less 2% for most of the cases . The framework of this analysis can be extended to address similar issues of other VLSI structures.

Evaluation of Fuzzy ARTMAP with DBSCAN in VLSI Application

The various applications of VLSI circuits in highperformance computing, telecommunications, and consumer electronics has been expanding progressively, and at a very hasty pace. This paper describes a new model for partitioning a circuit using DBSCAN and fuzzy ARTMAP neural network. The first step is concerned with feature extraction, where we had make use DBSCAN algorithm. The second step is the classification and is composed of a fuzzy ARTMAP neural network. The performance of both approaches is compared using benchmark data provided by MCNC standard cell placement benchmark netlists. Analysis of the investigational results proved that the fuzzy ARTMAP with DBSCAN model achieves greater performance then only fuzzy ARTMAP in recognizing sub-circuits with lowest amount of interconnections between them The recognition rate using fuzzy ARTMAP with DBSCAN is 97.7% compared to only fuzzy ARTMAP.

Efficient Power-Delay Product Modulo 2n+1 Adder Design

As embedded and portable systems were emerged power consumption of circuits had been major challenge. On the other hand latency as determines frequency of circuits is also vital task. Therefore, trade off between both of them will be desirable. Modulo 2n+1 adders are important part of the residue number system (RNS) based arithmetic units with the interesting moduli set (2n-1,2n, 2n+1). In this manuscript we have introduced novel binary representation to the design of modulo 2n+1 adder. VLSI realization of proposed architecture under 180 nm full static CMOS technology reveals its superiority in terms of area, power consumption and power-delay product (PDP) against several peer existing structures.

BDD Package Based on Boolean NOR Operation

Binary Decision Diagrams (BDDs) are useful data structures for symbolic Boolean manipulations. BDDs are used in many tasks in VLSI/CAD, such as equivalence checking, property checking, logic synthesis, and false paths. In this paper we describe a new approach for the realization of a BDD package. To perform manipulations of Boolean functions, the proposed approach does not depend on the recursive synthesis operation of the IF-Then-Else (ITE). Instead of using the ITE operation, the basic synthesis algorithm is done using Boolean NOR operation.

Comparative Analysis of Transient-Fault Tolerant Schemes for Network on Chips

Network on a chip (NoC) has been proposed as a viable solution to counter the inefficiency of buses in the current VLSI on-chip interconnects. However, as the silicon chip accommodates more transistors, the probability of transient faults is increasing, making fault tolerance a key concern in scaling chips. In packet based communication on a chip, transient failures can corrupt the data packet and hence, undermine the accuracy of data communication. In this paper, we present a comparative analysis of transient fault tolerant techniques including end-to-end, node-by-node, and stochastic communication based on flooding principle.

A Novel Low Power, High Speed 14 Transistor CMOS Full Adder Cell with 50% Improvement in Threshold Loss Problem

Full adders are important components in applications such as digital signal processors (DSP) architectures and microprocessors. In addition to its main task, which is adding two numbers, it participates in many other useful operations such as subtraction, multiplication, division,, address calculation,..etc. In most of these systems the adder lies in the critical path that determines the overall speed of the system. So enhancing the performance of the 1-bit full adder cell (the building block of the adder) is a significant goal.Demands for the low power VLSI have been pushing the development of aggressive design methodologies to reduce the power consumption drastically. To meet the growing demand, we propose a new low power adder cell by sacrificing the MOS Transistor count that reduces the serious threshold loss problem, considerably increases the speed and decreases the power when compared to the static energy recovery full (SERF) adder. So a new improved 14T CMOS l-bit full adder cell is presented in this paper. Results show 50% improvement in threshold loss problem, 45% improvement in speed and considerable power consumption over the SERF adder and other different types of adders with comparable performance.

Estimation of Attenuation and Phase Delay in Driving Voltage Waveform of an Ultra-High-Speed Image Sensor by Dimensional Analysis

We present an explicit expression to estimate driving voltage attenuation through RC networks representation of an ultrahigh- speed image sensor. Elmore delay metric for a fundamental RC chain is employed as the first-order approximation. By application of dimensional analysis to SPICE simulation data, we found a simple expression that significantly improves the accuracy of the approximation. Estimation error of the resultant expression for uniform RC networks is less than 2%. Similarly, another simple closed-form model to estimate 50 % delay through fundamental RC networks is also derived with sufficient accuracy. The framework of this analysis can be extended to address delay or attenuation issues of other VLSI structures.

A Survey of Various Algorithms for Vlsi Physical Design

Electronic Systems are the core of everyday lives. They form an integral part in financial networks, mass transit, telephone systems, power plants and personal computers. Electronic systems are increasingly based on complex VLSI (Very Large Scale Integration) integrated circuits. Initial electronic design automation is concerned with the design and production of VLSI systems. The next important step in creating a VLSI circuit is Physical Design. The input to the physical design is a logical representation of the system under design. The output of this step is the layout of a physical package that optimally or near optimally realizes the logical representation. Physical design problems are combinatorial in nature and of large problem sizes. Darwin observed that, as variations are introduced into a population with each new generation, the less-fit individuals tend to extinct in the competition of basic necessities. This survival of fittest principle leads to evolution in species. The objective of the Genetic Algorithms (GA) is to find an optimal solution to a problem .Since GA-s are heuristic procedures that can function as optimizers, they are not guaranteed to find the optimum, but are able to find acceptable solutions for a wide range of problems. This survey paper aims at a study on Efficient Algorithms for VLSI Physical design and observes the common traits of the superior contributions.

Mapping Complex, Large – Scale Spiking Networks on Neural VLSI

Traditionally, VLSI implementations of spiking neural nets have featured large neuron counts for fixed computations or small exploratory, configurable nets. This paper presents the system architecture of a large configurable neural net system employing a dedicated mapping algorithm for projecting the targeted biology-analog nets and dynamics onto the hardware with its attendant constraints.

A Novel Multiple Valued Logic OHRNS Modulo rn Adder Circuit

Residue Number System (RNS) is a modular representation and is proved to be an instrumental tool in many digital signal processing (DSP) applications which require high-speed computations. RNS is an integer and non weighted number system; it can support parallel, carry-free, high-speed and low power arithmetic. A very interesting correspondence exists between the concepts of Multiple Valued Logic (MVL) and Residue Number Arithmetic. If the number of levels used to represent MVL signals is chosen to be consistent with the moduli which create the finite rings in the RNS, MVL becomes a very natural representation for the RNS. There are two concerns related to the application of this Number System: reaching the most possible speed and the largest dynamic range. There is a conflict when one wants to resolve both these problem. That is augmenting the dynamic range results in reducing the speed in the same time. For achieving the most performance a method is considere named “One-Hot Residue Number System" in this implementation the propagation is only equal to one transistor delay. The problem with this method is the huge increase in the number of transistors they are increased in order m2 . In real application this is practically impossible. In this paper combining the Multiple Valued Logic and One-Hot Residue Number System we represent a new method to resolve both of these two problems. In this paper we represent a novel design of an OHRNS-based adder circuit. This circuit is useable for Multiple Valued Logic moduli, in comparison to other RNS design; this circuit has considerably improved the number of transistors and power consumption.

Bode Stability Analysis for Single Wall Carbon Nanotube Interconnects Used in 3D-VLSI Circuits

Bode stability analysis based on transmission line modeling (TLM) for single wall carbon nanotube (SWCNT) interconnects used in 3D-VLSI circuits is investigated for the first time. In this analysis, the dependence of the degree of relative stability for SWCNT interconnects on the geometry of each tube has been acquired. It is shown that, increasing the length and diameter of each tube, SWCNT interconnects become more stable.

An Efficient VLSI Design Approach to Reduce Static Power using Variable Body Biasing

In CMOS integrated circuit design there is a trade-off between static power consumption and technology scaling. Recently, the power density has increased due to combination of higher clock speeds, greater functional integration, and smaller process geometries. As a result static power consumption is becoming more dominant. This is a challenge for the circuit designers. However, the designers do have a few methods which they can use to reduce this static power consumption. But all of these methods have some drawbacks. In order to achieve lower static power consumption, one has to sacrifice design area and circuit performance. In this paper, we propose a new method to reduce static power in the CMOS VLSI circuit using Variable Body Biasing technique without being penalized in area requirement and circuit performance.

Learning Monte Carlo Data for Circuit Path Length

This paper analyzes the patterns of the Monte Carlo data for a large number of variables and minterms, in order to characterize the circuit path length behavior. We propose models that are determined by training process of shortest path length derived from a wide range of binary decision diagram (BDD) simulations. The creation of the model was done use of feed forward neural network (NN) modeling methodology. Experimental results for ISCAS benchmark circuits show an RMS error of 0.102 for the shortest path length complexity estimation predicted by the NN model (NNM). Use of such a model can help reduce the time complexity of very large scale integrated (VLSI) circuitries and related computer-aided design (CAD) tools that use BDDs.

High Performance VLSI Architecture of 2D Discrete Wavelet Transform with Scalable Lattice Structure

In this paper, we propose a fully-utilized, block-based 2D DWT (discrete wavelet transform) architecture, which consists of four 1D DWT filters with two-channel QMF lattice structure. The proposed architecture requires about 2MN-3N registers to save the intermediate results for higher level decomposition, where M and N stand for the filter length and the row width of the image respectively. Furthermore, the proposed 2D DWT processes in horizontal and vertical directions simultaneously without an idle period, so that it computes the DWT for an N×N image in a period of N2(1-2-2J)/3. Compared to the existing approaches, the proposed architecture shows 100% of hardware utilization and high throughput rates. To mitigate the long critical path delay due to the cascaded lattices, we can apply the pipeline technique with four stages, while retaining 100% of hardware utilization. The proposed architecture can be applied in real-time video signal processing.

Explicit Delay and Power Estimation Method for CMOS Inverter Driving on-Chip RLC Interconnect Load

The resistive-inductive-capacitive behavior of long interconnects which are driven by CMOS gates are presented in this paper. The analysis is based on the ¤Ç-model of a RLC load and is developed for submicron devices. Accurate and analytical expressions for the output load voltage, the propagation delay and the short circuit power dissipation have been proposed after solving a system of differential equations which accurately describe the behavior of the circuit. The effect of coupling capacitance between input and output and the short circuit current on these performance parameters are also incorporated in the proposed model. The estimated proposed delay and short circuit power dissipation are in very good agreement with the SPICE simulation with average relative error less than 6%.

A Novel VLSI Architecture for Image Compression Model Using Low power Discrete Cosine Transform

In Image processing the Image compression can improve the performance of the digital systems by reducing the cost and time in image storage and transmission without significant reduction of the Image quality. This paper describes hardware architecture of low complexity Discrete Cosine Transform (DCT) architecture for image compression[6]. In this DCT architecture, common computations are identified and shared to remove redundant computations in DCT matrix operation. Vector processing is a method used for implementation of DCT. This reduction in computational complexity of 2D DCT reduces power consumption. The 2D DCT is performed on 8x8 matrix using two 1-Dimensional Discrete cosine transform blocks and a transposition memory [7]. Inverse discrete cosine transform (IDCT) is performed to obtain the image matrix and reconstruct the original image. The proposed image compression algorithm is comprehended using MATLAB code. The VLSI design of the architecture is implemented Using Verilog HDL. The proposed hardware architecture for image compression employing DCT was synthesized using RTL complier and it was mapped using 180nm standard cells. . The Simulation is done using Modelsim. The simulation results from MATLAB and Verilog HDL are compared. Detailed analysis for power and area was done using RTL compiler from CADENCE. Power consumption of DCT core is reduced to 1.027mW with minimum area[1].

Interconnect Analysis of a Novel Multiplexer Based Full-Adder Cell for Power and Propagation Delay Optimizations

The proposed multiplexer-based novel 1-bit full adder cell is schematized by using DSCH2 and its layout is generated by using microwind VLSI CAD tool. The adder cell layout interconnect analysis is performed by using BSIM4 layout analyzer. The adder circuit is compared with other six existing adder circuits for parametric analysis. The proposed adder cell gives better performance than the other existing six adder circuits in terms of power, propagation delay and PDP. The proposed adder circuit is further analyzed for interconnect analysis, which gives better performance than other adder circuits in terms of layout thickness, width and height.