Complementary Energy Path Adiabatic Logic based Full Adder Circuit

In this paper, we present the design and experimental evaluation of complementary energy path adiabatic logic (CEPAL) based 1 bit full adder circuit. A simulative investigation on the proposed full adder has been done using VIRTUOSO SPECTRE simulator of cadence in 0.18μm UMC technology and its performance has been compared with the conventional CMOS full adder circuit. The CEPAL based full adder circuit exhibits the energy saving of 70% to the conventional CMOS full adder circuit, at 100 MHz frequency and 1.8V operating voltage.

A Temporal Synchronization Model for Heterogeneous Data in Distributed Systems

Multimedia distributed systems deal with heterogeneous data, such as texts, images, graphics, video and audio. The specification of temporal relations among different data types and distributed sources is an open research area. This paper proposes a fully distributed synchronization model to be used in multimedia systems. One original aspect of the model is that it avoids the use of a common reference (e.g. wall clock and shared memory). To achieve this, all possible multimedia temporal relations are specified according to their causal dependencies.

A Power Reduction Technique for Built-In-Self Testing Using Modified Linear Feedback Shift Register

A linear feedback shift register (LFSR) is proposed which targets to reduce the power consumption from within. It reduces the power consumption during testing of a Circuit Under Test (CUT) at two stages. At first stage, Control Logic (CL) makes the clocks of the switching units of the register inactive for a time period when output from them is going to be same as previous one and thus reducing unnecessary switching of the flip-flops. And at second stage, the LFSR reorders the test vectors by interchanging the bit with its next and closest neighbor bit. It keeps fault coverage capacity of the vectors unchanged but reduces the Total Hamming Distance (THD) so that there is reduction in power while shifting operation.

VLSI Design of 2-D Discrete Wavelet Transform for Area-Efficient and High-Speed Image Computing

This paper presents a VLSI design approach of a highspeed and real-time 2-D Discrete Wavelet Transform computing. The proposed architecture, based on new and fast convolution approach, reduces the hardware complexity in addition to reduce the critical path to the multiplier delay. Furthermore, an advanced twodimensional (2-D) discrete wavelet transform (DWT) implementation, with an efficient memory area, is designed to produce one output in every clock cycle. As a result, a very highspeed is attained. The system is verified, using JPEG2000 coefficients filters, on Xilinx Virtex-II Field Programmable Gate Array (FPGA) device without accessing any external memory. The resulting computing rate is up to 270 M samples/s and the (9,7) 2-D wavelet filter uses only 18 kb of memory (16 kb of first-in-first-out memory) with 256×256 image size. In this way, the developed design requests reduced memory and provide very high-speed processing as well as high PSNR quality.

A Pipelined FSBM Hardware Architecture for HTDV-H.26x

In MPEG and H.26x standards, to eliminate the temporal redundancy we use motion estimation. Given that the motion estimation stage is very complex in terms of computational effort, a hardware implementation on a re-configurable circuit is crucial for the requirements of different real time multimedia applications. In this paper, we present hardware architecture for motion estimation based on "Full Search Block Matching" (FSBM) algorithm. This architecture presents minimum latency, maximum throughput, full utilization of hardware resources such as embedded memory blocks, and combining both pipelining and parallel processing techniques. Our design is described in VHDL language, verified by simulation and implemented in a Stratix II EP2S130F1020C4 FPGA circuit. The experiment result show that the optimum operating clock frequency of the proposed design is 89MHz which achieves 160M pixels/sec.

Real-Time Digital Oscilloscope Implementation in 90nm CMOS Technology FPGA

This paper describes the design of a real-time audiorange digital oscilloscope and its implementation in 90nm CMOS FPGA platform. The design consists of sample and hold circuits, A/D conversion, audio and video processing, on-chip RAM, clock generation and control logic. The design of internal blocks and modules in 90nm devices in an FPGA is elaborated. Also the key features and their implementation algorithms are presented. Finally, the timing waveforms and simulation results are put forward.

64 bit Computer Architectures for Space Applications – A study

The more recent satellite projects/programs makes extensive usage of real – time embedded systems. 16 bit processors which meet the Mil-Std-1750 standard architecture have been used in on-board systems. Most of the Space Applications have been written in ADA. From a futuristic point of view, 32 bit/ 64 bit processors are needed in the area of spacecraft computing and therefore an effort is desirable in the study and survey of 64 bit architectures for space applications. This will also result in significant technology development in terms of VLSI and software tools for ADA (as the legacy code is in ADA). There are several basic requirements for a special processor for this purpose. They include Radiation Hardened (RadHard) devices, very low power dissipation, compatibility with existing operational systems, scalable architectures for higher computational needs, reliability, higher memory and I/O bandwidth, predictability, realtime operating system and manufacturability of such processors. Further on, these may include selection of FPGA devices, selection of EDA tool chains, design flow, partitioning of the design, pin count, performance evaluation, timing analysis etc. This project deals with a brief study of 32 and 64 bit processors readily available in the market and designing/ fabricating a 64 bit RISC processor named RISC MicroProcessor with added functionalities of an extended double precision floating point unit and a 32 bit signal processing unit acting as co-processors. In this paper, we emphasize the ease and importance of using Open Core (OpenSparc T1 Verilog RTL) and Open “Source" EDA tools such as Icarus to develop FPGA based prototypes quickly. Commercial tools such as Xilinx ISE for Synthesis are also used when appropriate.

Parallel-computing Approach for FFT Implementation on Digital Signal Processor (DSP)

An efficient parallel form in digital signal processor can improve the algorithm performance. The butterfly structure is an important role in fast Fourier transform (FFT), because its symmetry form is suitable for hardware implementation. Although it can perform a symmetric structure, the performance will be reduced under the data-dependent flow characteristic. Even though recent research which call as novel memory reference reduction methods (NMRRM) for FFT focus on reduce memory reference in twiddle factor, the data-dependent property still exists. In this paper, we propose a parallel-computing approach for FFT implementation on digital signal processor (DSP) which is based on data-independent property and still hold the property of low-memory reference. The proposed method combines final two steps in NMRRM FFT to perform a novel data-independent structure, besides it is very suitable for multi-operation-unit digital signal processor and dual-core system. We have applied the proposed method of radix-2 FFT algorithm in low memory reference on TI TMSC320C64x DSP. Experimental results show the method can reduce 33.8% clock cycles comparing with the NMRRM FFT implementation and keep the low-memory reference property.

Development of Circulating Support Environment of Multilingual Medical Communication using Parallel Texts for Foreign Patients

The need for multilingual communication in Japan has increased due to an increase in the number of foreigners in the country. When people communicate in their nonnative language, the differences in language prevent mutual understanding among the communicating individuals. In the medical field, communication between the hospital staff and patients is a serious problem. Currently, medical translators accompany patients to medical care facilities, and the demand for medical translators is increasing. However, medical translators cannot necessarily provide support, especially in cases in which round-the-clock support is required or in case of emergencies. The medical field has high expectations from information technology. Hence, a system that supports accurate multilingual communication is required. Despite recent advances in machine translation technology, it is very difficult to obtain highly accurate translations. We have developed a support system called M3 for multilingual medical reception. M3 provides support functions that aid foreign patients in the following respects: conversation, questionnaires, reception procedures, and hospital navigation; it also has a Q&A function. Users can operate M3 using a touch screen and receive text-based support. In addition, M3 uses accurate translation tools called parallel texts to facilitate reliable communication through conversations between the hospital staff and the patients. However, if there is no parallel text that expresses what users want to communicate, the users cannot communicate. In this study, we have developed a circulating support environment for multilingual medical communication using parallel texts. The proposed environment can circulate necessary parallel texts through the following procedure: (1) a user provides feedback about the necessary parallel texts, following which (2) these parallel texts are created and evaluated.

Fast and Efficient On-Chip Interconnection Modeling for High Speed VLSI Systems

Timing driven physical design, synthesis, and optimization tools need efficient closed-form delay models for estimating the delay associated with each net in an integrated circuit (IC) design. The total number of nets in a modern IC design has increased dramatically and exceeded millions. Therefore efficient modeling of interconnection is needed for high speed IC-s. This paper presents closed–form expressions for RC and RLC interconnection trees in current mode signaling, which can be implemented in VLSI design tool. These analytical model expressions can be used for accurate calculation of delay after the design clock tree has been laid out and the design is fully routed. Evaluation of these analytical models is several orders of magnitude faster than simulation using SPICE.

A New Hardware Implementation of Manchester Line Decoder

In this paper, we present a simple circuit for Manchester decoding and without using any complicated or programmable devices. This circuit can decode 90kbps of transmitted encoded data; however, greater than this transmission rate can be decoded if high speed devices were used. We also present a new method for extracting the embedded clock from Manchester data in order to use it for serial-to-parallel conversion. All of our experimental measurements have been done using simulation.

Detecting and Locating Wormhole Attacks in Wireless Sensor Networks Using Beacon Nodes

This paper focuses on wormhole attacks detection in wireless sensor networks. The wormhole attack is particularly challenging to deal with since the adversary does not need to compromise any nodes and can use laptops or other wireless devices to send the packets on a low latency channel. This paper introduces an easy and effective method to detect and locate the wormholes: Since beacon nodes are assumed to know their coordinates, the straight line distance between each pair of them can be calculated and then compared with the corresponding hop distance, which in this paper equals hop counts × node-s transmission range R. Dramatic difference may emerge because of an existing wormhole. Our detection mechanism is based on this. The approximate location of the wormhole can also be derived in further steps based on this information. To the best of our knowledge, our method is much easier than other wormhole detecting schemes which also use beacon nodes, and to those have special requirements on each nodes (e.g., GPS receivers or tightly synchronized clocks or directional antennas), ours is more economical. Simulation results show that the algorithm is successful in detecting and locating wormholes when the density of beacon nodes reaches 0.008 per m2.

Discrete Polyphase Matched Filtering-based Soft Timing Estimation for Mobile Wireless Systems

In this paper we present a soft timing phase estimation (STPE) method for wireless mobile receivers operating in low signal to noise ratios (SNRs). Discrete Polyphase Matched (DPM) filters, a Log-maximum a posterior probability (MAP) and/or a Soft-output Viterbi algorithm (SOVA) are combined to derive a new timing recovery (TR) scheme. We apply this scheme to wireless cellular communication system model that comprises of a raised cosine filter (RCF), a bit-interleaved turbo-coded multi-level modulation (BITMM) scheme and the channel is assumed to be memory-less. Furthermore, no clock signals are transmitted to the receiver contrary to the classical data aided (DA) models. This new model ensures that both the bandwidth and power of the communication system is conserved. However, the computational complexity of ideal turbo synchronization is increased by 50%. Several simulation tests on bit error rate (BER) and block error rate (BLER) versus low SNR reveal that the proposed iterative soft timing recovery (ISTR) scheme outperforms the conventional schemes.

A 3.125Gb/s Clock and Data Recovery Circuit Using 1/4-Rate Technique

This paper describes the design and fabrication of a clock and data recovery circuit (CDR). We propose a new clock and data recovery which is based on a 1/4-rate frequency detector (QRFD). The proposed frequency detector helps reduce the VCO frequency and is thus advantageous for high speed application. The proposed frequency detector can achieve low jitter operation and extend the pull-in range without using the reference clock. The proposed CDR was implemented using a 1/4-rate bang-bang type phase detector (PD) and a ring voltage controlled oscillator (VCO). The CDR circuit has been fabricated in a standard 0.18 CMOS technology. It occupies an active area of 1 x 1 and consumes 90 mW from a single 1.8V supply.

Power Reduction by Automatic Monitoring and Control System in Active Mode

This paper describes a novel monitoring scheme to minimize total active power in digital circuits depend on the demand frequency, by adjusting automatically both supply voltage and threshold voltages based on circuit operating conditions such as temperature, process variations, and desirable frequency. The delay monitoring results, will be control and apply so as to be maintained at the minimum value at which the chip is able to operate for a given clock frequency. Design details of power monitor are examined using simulation framework in 32nm BTPM model CMOS process. Experimental results show the overhead of proposed circuit in terms of its power consumption is about 40 μW for 32nm technology; moreover the results show that our proposed circuit design is not far sensitive to the temperature variations and also process variations. Besides, uses the simple blocks which offer good sensitivity, high speed, the continuously feedback loop. This design provides up to 40% reduction in power consumption in active mode.

Performance Analysis of Digital Signal Processors Using SMV Benchmark

Unlike general-purpose processors, digital signal processors (DSP processors) are strongly application-dependent. To meet the needs for diverse applications, a wide variety of DSP processors based on different architectures ranging from the traditional to VLIW have been introduced to the market over the years. The functionality, performance, and cost of these processors vary over a wide range. In order to select a processor that meets the design criteria for an application, processor performance is usually the major concern for digital signal processing (DSP) application developers. Performance data are also essential for the designers of DSP processors to improve their design. Consequently, several DSP performance benchmarks have been proposed over the past decade or so. However, none of these benchmarks seem to have included recent new DSP applications. In this paper, we use a new benchmark that we recently developed to compare the performance of popular DSP processors from Texas Instruments and StarCore. The new benchmark is based on the Selectable Mode Vocoder (SMV), a speech-coding program from the recent third generation (3G) wireless voice applications. All benchmark kernels are compiled by the compilers of the respective DSP processors and run on their simulators. Weighted arithmetic mean of clock cycles and arithmetic mean of code size are used to compare the performance of five DSP processors. In addition, we studied how the performance of a processor is affected by code structure, features of processor architecture and optimization of compiler. The extensive experimental data gathered, analyzed, and presented in this paper should be helpful for DSP processor and compiler designers to meet their specific design goals.

Jitter Transfer in High Speed Data Links

Phase locked loops for data links operating at 10 Gb/s or faster are low phase noise devices designed to operate with a low jitter reference clock. Characterization of their jitter transfer function is difficult because the intrinsic noise of the device is comparable to the random noise level in the reference clock signal. A linear model is proposed to account for the intrinsic noise of a PLL. The intrinsic noise data of a PLL for 10 Gb/s links is presented. The jitter transfer function of a PLL in a test chip for 12.8 Gb/s data links was determined in experiments using the 400 MHz reference clock as the source of simultaneous excitations over a wide range of frequency. The result shows that the PLL jitter transfer function can be approximated by a second order linear model.

Construction of Attitude Reference Benchmark for Test of Star Sensor Based on Precise Timing

To satisfy the need of outfield tests of star sensors, a method is put forward to construct the reference attitude benchmark. Firstly, its basic principle is introduced; Then, all the separate conversion matrixes are deduced, which include: the conversion matrix responsible for the transformation from the Earth Centered Inertial frame i to the Earth-centered Earth-fixed frame w according to the time of an atomic clock, the conversion matrix from frame w to the geographic frame t, and the matrix from frame t to the platform frame p, so the attitude matrix of the benchmark platform relative to the frame i can be obtained using all the three matrixes as the multiplicative factors; Next, the attitude matrix of the star sensor relative to frame i is got when the mounting matrix from frame p to the star sensor frame s is calibrated, and the reference attitude angles for star sensor outfield tests can be calculated from the transformation from frame i to frame s; Finally, the computer program is finished to solve the reference attitudes, and the error curves are drawn about the three axis attitude angles whose absolute maximum error is just 0.25ÔÇ│. The analysis on each loop and the final simulating results manifest that the method by precise timing to acquire the absolute reference attitude is feasible for star sensor outfield tests.

A Virtual Learning Environment for Deaf Children: Design and Evaluation

The object of this research is the design and evaluation of an immersive Virtual Learning Environment (VLE) for deaf children. Recently we have developed a prototype immersive VR game to teach sign language mathematics to deaf students age K- 4 [1] [2]. In this paper we describe a significant extension of the prototype application. The extension includes: (1) user-centered design and implementation of two additional interactive environments (a clock store and a bakery), and (2) user-centered evaluation including development of user tasks, expert panel-based evaluation, and formative evaluation. This paper is one of the few to focus on the importance of user-centered, iterative design in VR application development, and to describe a structured evaluation method.

A Single-Phase Register File with Complementary Pass-Transistor Adiabatic Logic

This paper introduces an adiabatic register file based on two-phase CPAL (Complementary Pass-Transistor Adiabatic Logic circuits) with power-gating scheme, which can operate on a single-phase power clock. A 32×32 single-phase adiabatic register file with power-gating scheme has been implemented with TSMC 0.18μm CMOS technology. All the circuits except for the storage cells employ two-phase CPAL circuits, and the storage cell is based on the conventional memory one. The two-phase non-overlap power-clock generator with power-gating scheme is used to supply the proposed adiabatic register file. Full-custom layouts are drawn. The energy and functional simulations have been performed using the net-list extracted from their layouts. Compared with the traditional static CMOS register file, HSPICE simulations show that the proposed adiabatic register file can work very well, and it attains about 73% energy savings at 100 MHz.