Dynamic Power Reduction in Sequential Circuits Using Look Ahead Clock Gating Technique

In this paper, a novel Linear Feedback Shift Register (LFSR) with Look Ahead Clock Gating (LACG) technique is presented to reduce the power consumption in modern processors and System-on-Chip. Clock gating is a predominant technique used to reduce unwanted switching of clock signals. Several clock gating techniques to reduce the dynamic power have been developed, of which LACG is predominant. LACG computes the clock enabling signals of each flip-flop (FF) one cycle ahead of time, based on the present cycle data of the flip-flops on which it depends. It overcomes the timing problems in the existing clock gating methods like datadriven clock gating and Auto-Gated flip-flops (AGFF) by allotting a full clock cycle for the determination of the clock enabling signals. Further to reduce the power consumption in LACG technique, FFs can be grouped so that they share a common clock enabling signal. Simulation results show that the novel grouped LFSR with LACG achieves 15.03% power savings than conventional LFSR with LACG and 44.87% than data-driven clock gating.

A High Level Implementation of a High Performance Data Transfer Interface for NoC

The distribution of a single global clock across a chip has become the major design bottleneck for high performance VLSI systems owing to the power dissipation, process variability and multicycle cross-chip signaling. A Network-on-Chip (NoC) architecture partitioned into several synchronous blocks has become a promising approach for attaining fine-grain power management at the system level. In a NoC architecture the communication between the blocks is handled asynchronously. To interface these blocks on a chip operating at different frequencies, an asynchronous FIFO interface is inevitable. However, these asynchronous FIFOs are not required if adjacent blocks belong to the same clock domain. In this paper, we have designed and analyzed a 16-bit asynchronous micropipelined FIFO of depth four, with the awareness of place and route on an FPGA device. We have used a commercially available Spartan 3 device and designed a high speed implementation of the asynchronous 4-phase micropipeline. The asynchronous FIFO implemented on the FPGA device shows 76 Mb/s throughput and a handshake cycle of 109 ns for write and 101.3 ns for read at the simulation under the worst case operating conditions (voltage = 0.95V) on a working chip at the room temperature.

Two Kinds of Self-Oscillating Circuits Mechanically Demonstrated

This study introduces two types of self-oscillating circuits that are frequently found in power electronics applications. Special effort is made to relate the circuits to the analogous mechanical systems of some important scientific inventions: Galileo’s pendulum clock and Coulomb’s friction model. A little touch of related history and philosophy of science will hopefully encourage curiosity, advance the understanding of self-oscillating systems and satisfy the aspiration of some students for scientific literacy. Finally, the two self-oscillating circuits are applied to design a simple class-D audio amplifier.

A Novel Approach to Asynchronous State Machine Modeling on Multisim for Avoiding Function Hazards

The aim of this study was to design and simulate a particular type of Asynchronous State Machine (ASM), namely a ‘traffic light controller’ (TLC), operated at a frequency of 0.5 Hz. The design task involved two main stages: firstly, designing a 4-bit binary counter using J-K flip flops as the timing signal and, subsequently, attaining the digital logic by deploying ASM design process. The TLC was designed such that it showed a sequence of three different colours, i.e. red, yellow and green, corresponding to set thresholds by deploying the least number of AND, OR and NOT gates possible. The software Multisim was deployed to design such circuit and simulate it for circuit troubleshooting in order for it to display the output sequence of the three different colours on the traffic light in the correct order. A clock signal, an asynchronous 4- bit binary counter that was designed through the use of J-K flip flops along with an ASM were used to complete this sequence, which was programmed to be repeated indefinitely. Eventually, the circuit was debugged and optimized, thus displaying the correct waveforms of the three outputs through the logic analyser. However, hazards occurred when the frequency was increased to 10 MHz. This was attributed to delays in the feedback being too high.

Design and Development of an Efficient and Cost-Effective Microcontroller-Based Irrigation Control System to Enhance Food Security

The development of the agricultural sector in Ghana has been reliant on the use of irrigation systems to ensure food security. However, the manual operation of these systems has not facilitated their maximum efficiency due to human limitations. This paper seeks to address this problem by designing and implementing an efficient, cost effective automated system which monitors and controls the water flow of irrigation through communication with an authorized operator via text messages. The automatic control component of the system is timer based with an Atmega32 microcontroller and a real time clock from the SM5100B cellular module. For monitoring purposes, the system sends periodic notification of the system on the performance of duty via SMS to the authorized person(s). Moreover, the GSM based Irrigation Monitoring and Control System saves time and labour and reduces cost of operating irrigation systems by saving electricity usage and conserving water. Field tests conducted have proven its operational efficiency and ease of assessment of farm irrigation equipment due to its costeffectiveness and data logging capabilities.

Design of an Efficient Retimed CIC Compensation Filter

Unwanted side effects because of spectral aliasing and spectral imaging during signal processing would be the major concern over the sampling rate alteration. Multirate-multistage implementation of digital filter could come about a large computational saving than single rate filter suitable for sample rate conversion. This implementation can further improve through high-level architectural transformation in circuit level. Reallocating registers and  relocating flip-flops across logic gates through retiming certainly a prominent sequential transformation technology, that optimize hardware circuits to achieve faster clocking speed without affecting the functionality. In this paper, we proposed an efficient compensated cascade Integrator comb (CIC) decimation filter structure that analyze the consequence of filter order variation which has a retimed FIR filter being compensator while using the cutset retiming technique and achieved an improvement in the passband droop by 14% to 39%, in computation time by 38.04%, 25.78%, 12.21%, 6.69% and 4.44% and reduction in path delay by 62.27%, 72%, 86.63%, 91.56% and 94.42% of 3, 6, 8, 12 and 24 order filter respectively than the non-retimed CIC compensation filter.

Frequent and Systematic Timing Enhancement of Congestion Window in Typical Transmission Control Protocol

Transmission Control Protocol (TCP) among the wired and wireless networks, it still has a practical problem; where the congestion control mechanism does not permit the data stream to get complete bandwidth over the existing network links. To solve this problem, many TCP protocols have been introduced with high speed performance. Therefore, an enhanced congestion window (cwnd) for the congestion control mechanism is proposed in this article to improve the performance of TCP by increasing the number of cycles of the new window to improve the transmitted packet number. The proposed algorithm used a new mechanism based on the available bandwidth of the connection to detect the capacity of network path in order to improve the regular clocking of congestion avoidance mechanism. The work in this paper based on using Network Simulator 2 (NS-2) to simulate the proposed algorithm.

Design and Optimization of Parity Generator and Parity Checker Based On Quantum-dot Cellular Automata

Quantum-dot Cellular Automata (QCA) is one of the most substitute emerging nanotechnologies for electronic circuits, because of lower power consumption, higher speed and smaller size in comparison with CMOS technology. The basic devices, a Quantum-dot cell can be used to implement logic gates and wires. As it is the fundamental building block on nanotechnology circuits. By applying XOR gate the hardware requirements for a QCA circuit can be decrease and circuits can be simpler in terms of level, delay and cell count. This article present a modest approach for implementing novel optimized XOR gate, which can be applied to design many variants of complex QCA circuits. Proposed XOR gate is simple in structure and powerful in terms of implementing any digital circuits. In order to verify the functionality of the proposed design some complex implementation of parity generator and parity checker circuits are proposed and simulating by QCA Designer tool and compare with some most recent design. Simulation results and physical relations confirm its usefulness in implementing every digital circuit.

Design and Analysis of a Low Power High Speed 1 Bit Full Adder Cell Based On TSPC Logic with Multi-Threshold CMOS

An adder is one of the most integral component of a digital system like a digital signal processor or a microprocessor. Being an extremely computationally intensive part of a system, the optimization for speed and power consumption of the adder is of prime importance. In this paper we have designed a 1 bit full adder cell based on dynamic TSPC logic to achieve high speed operation. A high threshold voltage sleep transistor is used to reduce the static power dissipation in standby mode. The circuit is designed and simulated in TSPICE using TSMC 180nm CMOS process. Average power consumption, delay and power-delay product is measured which showed considerable improvement in performance over the existing full adder designs.

Design and Implementation of Quantum Cellular Automata Based Novel Adder Circuits

The most important mathematical operation for any computing system is addition. An efficient adder can be of greater assistance in designing of any arithmetic circuits. Quantum-dot Cellular Automata (QCA) is a promising nanotechnology to create electronic circuits for computing devices and suitable candidate for next generation of computing systems. The article presents a modest approach to implement a novel XOR gate. The gate is simple in structure and powerful in terms of implementing digital circuits. By applying the XOR gate, the hardware requirement for a QCA circuit can be decrease and circuits can be simpler in level, clock phase and cell count. In order to verify the functionality of the proposed device some implementation of Half Adder (HA) and Full Adder (FA) is checked by means of computer simulations using QCA-Designer tool. Simulation results and physical relations confirm its usefulness in implementing every digital circuit.

A Multi Cordic Architecture on FPGA Platform

Coordinate Rotation Digital Computer (CORDIC) is a unique digital computing unit intended for the computation of mathematical operations and functions. This paper presents A multi CORDIC processor that integrates different CORDIC architectures on a single FPGA chip and allows the user to select the CORDIC architecture to proceed with based on what he wants to calculate and his needs. Synthesis show that radix 2 CORDIC has the lowest clock delay, radix 8 CORDIC has the highest LUT usage and lowest register usage while Hybrid Radix 4 CORDIC had the highest clock delay.

Evaluation of Features Extraction Algorithms for a Real-Time Isolated Word Recognition System

Paper presents an comparative evaluation of features extraction algorithm for a real-time isolated word recognition system based on FPGA. The Mel-frequency cepstral, linear frequency cepstral, linear predictive and their cepstral coefficients were implemented in hardware/software design. The proposed system was investigated in speaker dependent mode for 100 different Lithuanian words. The robustness of features extraction algorithms was tested recognizing the speech records at different signal to noise rates. The experiments on clean records show highest accuracy for Mel-frequency cepstral and linear frequency cepstral coefficients. For records with 15 dB signal to noise rate the linear predictive cepstral coefficients gives best result. The hard and soft part of the system is clocked on 50 MHz and 100 MHz accordingly. For the classification purpose the pipelined dynamic time warping core was implemented. The proposed word recognition system satisfy the real-time requirements and is suitable for applications in embedded systems.

An Energy Efficient Digital Baseband for Batteryless Remote Control

In this paper, an energy efficient digital baseband circuit for piezoelectric (PE) harvester powered batteryless remote control system is presented. Pulse mode PE harvester, which provides short duration of energy, is adopted to replace conventional chemical battery in wireless remote controller. The transmitter digital baseband repeats the control command transmission once the digital circuit is initiated by the power-on-reset. A power efficient data frame format is proposed to maximize the transmission repetition time. By using the proposed frame format and receiver clock and data recovery method, the receiver baseband is able to decode the command even when the received data has 20% error. The proposed transmitter and receiver baseband are implemented using FPGA and simulation results are presented.

14-Bit 1MS/s Cyclic-Pipelined ADC

This paper presents a 14-bit cyclic-pipelined Analog to digital converter (ADC) running at 1 MS/s. The architecture is based on a 1.5-bit per stage structure utilizing digital correction for each stage. The ADC consists of two 1.5-bit stages, one shift register delay line, and digital error correction logic. Inside each 1.5-bit stage, there is one gain-boosting op-amp and two comparators. The ADC was implemented in 0.18µm CMOS process and the design has an area of approximately 0.2 mm2. The ADC has a differential input range of 1.2 Vpp. The circuit has an average power consumption of 3.5mA with 10MHz sampling clocks. The post-layout simulations of the design satisfy 12-bit SNDR with a full-scale sinusoid input.

Modified Buck Boost Circuit for Linear and Non-Linear Piezoelectric Energy Harvesting

Plenty researches have reported techniques to harvest energy from piezoelectric transducer. In the earlier years, the researches mainly report linear energy harvesting techniques whereby interface circuitry is designed to have input impedance that match with the impedance of the piezoelectric transducer. In recent years non-linear techniques become more popular. The non-linear technique employs voltage waveform manipulation to boost the available-for-extraction energy at the time of energy transfer.  The fact that non-linear energy extraction provides larger available-for-extraction energy doesn’t mean the linear energy extraction is completely obsolete. In some scenarios, such as where initial power is not available, linear energy extraction is still preferred. A modified Buck Boost circuit which is capable of harvesting piezoelectric energy using both linear and non-linear techniques is reported in this paper. Efficiency of at least 64% can be achieved using this circuit. For linear extraction, the modified Buck Boost circuit is controlled using a fix frequency and duty cycle clock. A voltage sensor and a pulse generator are added as the controller for the non-linear extraction technique. 

The Performance of the Character-Access on the Checking Phase in String Searching Algorithms

A new algorithm called Character-Comparison to Character-Access (CCCA) is developed to test the effect of both: 1) converting character-comparison and number-comparison into character-access and 2) the starting point of checking on the performance of the checking operation in string searching. An experiment is performed; the results are compared with five algorithms, namely, Naive, BM, Inf_Suf_Pref, Raita, and Circle. With the CCCA algorithm, the results suggest that the evaluation criteria of the average number of comparisons are improved up to 74.0%. Furthermore, the results suggest that the clock time required by the other algorithms is improved in range from 28% to 68% by the new CCCA algorithm

Design and Implementation of a Microcontroller Based LCD Screen Digital Stop Watch

The stop watch is used to measure the time required for a certain event. This is different from normal clocks in many ways, one of which is the accuracy of time. The stop watch requires much more accuracy than the normal clocks. In this paper, an ATmega8535 microcontroller was used to control the stop watch, by which perfect accuracy can be ensured. For compiling the C code and for loading the compiled .hex file into the microcontroller, AVR studio and PonyProg were used respectively. The stop watch is also different from traditional stop watches, as it contains two different timing modes namely 'Split timing' and 'Lap timing'.

Hardware Prototyping of an Efficient Encryption Engine

An approach to develop the FPGA of a flexible key RSA encryption engine that can be used as a standard device in the secured communication system is presented. The VHDL modeling of this RSA encryption engine has the unique characteristics of supporting multiple key sizes, thus can easily be fit into the systems that require different levels of security. A simple nested loop addition and subtraction have been used in order to implement the RSA operation. This has made the processing time faster and used comparatively smaller amount of space in the FPGA. The hardware design is targeted on Altera STRATIX II device and determined that the flexible key RSA encryption engine can be best suited in the device named EP2S30F484C3. The RSA encryption implementation has made use of 13,779 units of logic elements and achieved a clock frequency of 17.77MHz. It has been verified that this RSA encryption engine can perform 32-bit, 256-bit and 1024-bit encryption operation in less than 41.585us, 531.515us and 790.61us respectively.

FPGA Based Longitudinal and Lateral Controller Implementation for a Small UAV

This paper presents implementation of attitude controller for a small UAV using field programmable gate array (FPGA). Due to the small size constrain a miniature more compact and computationally extensive; autopilot platform is needed for such systems. More over UAV autopilot has to deal with extremely adverse situations in the shortest possible time, while accomplishing its mission. FPGAs in the recent past have rendered themselves as fast, parallel, real time, processing devices in a compact size. This work utilizes this fact and implements different attitude controllers for a small UAV in FPGA, using its parallel processing capabilities. Attitude controller is designed in MATLAB/Simulink environment. The discrete version of this controller is implemented using pipelining followed by retiming, to reduce the critical path and thereby clock period of the controller datapath. Pipelined, retimed, parallel PID controller implementation is done using rapidprototyping and testing efficient development tool of “system generator", which has been developed by Xilinx for FPGA implementation. The improved timing performance enables the controller to react abruptly to any changes made to the attitudes of UAV.

Low Power and Less Area Architecture for Integer Motion Estimation

Full search block matching algorithm is widely used for hardware implementation of motion estimators in video compression algorithms. In this paper we are proposing a new architecture, which consists of a 2D parallel processing unit and a 1D unit both working in parallel. The proposed architecture reduces both data access power and computational power which are the main causes of power consumption in integer motion estimation. It also completes the operations with nearly the same number of clock cycles as compared to a 2D systolic array architecture. In this work sum of absolute difference (SAD)-the most repeated operation in block matching, is calculated in two steps. The first step is to calculate the SAD for alternate rows by a 2D parallel unit. If the SAD calculated by the parallel unit is less than the stored minimum SAD, the SAD of the remaining rows is calculated by the 1D unit. Early termination, which stops avoidable computations has been achieved with the help of alternate rows method proposed in this paper and by finding a low initial SAD value based on motion vector prediction. Data reuse has been applied to the reference blocks in the same search area which significantly reduced the memory access.