Abstract: In this paper, we present the design and experimental
evaluation of a 1-bit full adder circuit based on complementary
energy path adiabatic logic (CEPAL). The proposed full adder was
simulated using the Cadence Virtuoso Spectre simulator in 0.18 μm
UMC technology, and its performance was compared with that of a
conventional CMOS full adder circuit. The CEPAL-based full adder
achieves an energy saving of 70% relative to the conventional CMOS
full adder at a frequency of 100 MHz and an operating voltage of
1.8 V.
Abstract: Multimedia distributed systems deal with heterogeneous
data, such as texts, images, graphics, video and audio. The specification
of temporal relations among different data types and distributed
sources is an open research area. This paper proposes a fully
distributed synchronization model to be used in multimedia systems.
One original aspect of the model is that it avoids the use of a common
reference (e.g. wall clock and shared memory). To achieve this, all
possible multimedia temporal relations are specified according to
their causal dependencies.
Abstract: A linear feedback shift register (LFSR) is proposed that reduces power consumption from within the LFSR itself. It reduces the power consumed while testing a Circuit Under Test (CUT) in two stages. In the first stage,
Control Logic (CL) disables the clocks of the register's switching
units for the period in which their output would be the same as the
previous one, thereby avoiding unnecessary switching of the
flip-flops. In the second stage, the LFSR reorders the test vectors
by interchanging each bit with its next and closest neighbor bit.
This keeps the fault coverage of the vectors unchanged but reduces
the Total Hamming Distance (THD), so that less power is consumed
during shift operations.
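The Total Hamming Distance mentioned in the abstract above can be illustrated with a minimal sketch. It does not reproduce the proposed control logic or bit-interchange scheme; it only generates vectors from a plain Fibonacci LFSR and sums the bit flips between consecutive vectors. The function names, the tap positions, and the shift direction are assumptions chosen for illustration.

```python
def lfsr_vectors(seed, taps, width, count):
    """Return `count` successive states of a simple Fibonacci LFSR.

    `taps` are bit positions (0 = LSB) XORed together to form the
    feedback bit. These taps are illustrative, not from the paper.
    """
    mask = (1 << width) - 1
    state = seed & mask
    vectors = []
    for _ in range(count):
        vectors.append(state)
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        # Shift left by one and insert the feedback bit at the LSB.
        state = ((state << 1) | feedback) & mask
    return vectors


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def total_hamming_distance(vectors):
    """Sum of Hamming distances between consecutive test vectors."""
    return sum(hamming(a, b) for a, b in zip(vectors, vectors[1:]))


if __name__ == "__main__":
    vecs = lfsr_vectors(seed=0b0001, taps=(3, 2), width=4, count=8)
    print([format(v, "04b") for v in vecs])
    print("THD:", total_hamming_distance(vecs))
```

A lower THD for the same set of vectors means fewer flip-flop transitions while the vectors are shifted in, which is the quantity the reordering stage aims to reduce.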
Abstract: This paper presents a VLSI design approach for high-speed,
real-time 2-D Discrete Wavelet Transform computation. The
proposed architecture, based on a new and fast convolution approach,
reduces the hardware complexity and shortens the critical
path to the multiplier delay. Furthermore, an advanced two-dimensional
(2-D) discrete wavelet transform (DWT)
implementation with an efficient memory area is designed to
produce one output in every clock cycle. As a result, a very high
speed is attained. The system is verified, using JPEG2000
filter coefficients, on a Xilinx Virtex-II Field Programmable Gate
Array (FPGA) device without accessing any external memory. The
resulting computing rate is up to 270 M samples/s, and the (9,7) 2-D
wavelet filter uses only 18 kb of memory (16 kb of first-in-first-out
memory) for a 256×256 image. The developed design therefore
requires less memory and provides very high-speed processing as
well as high PSNR quality.
Abstract: The MPEG and H.26x standards use motion estimation to
eliminate temporal redundancy. Because the
motion estimation stage is computationally very demanding,
a hardware implementation on a reconfigurable circuit is
crucial for meeting the requirements of different real-time multimedia
applications. In this paper, we present a hardware architecture for
motion estimation based on the "Full Search Block Matching" (FSBM)
algorithm. This architecture offers minimum latency, maximum
throughput, and full utilization of hardware resources such as embedded
memory blocks, and it combines pipelining and parallel
processing techniques. Our design is described in VHDL,
verified by simulation, and implemented in a Stratix II
EP2S130F1020C4 FPGA. The experimental results show that the
optimum operating clock frequency of the proposed design is 89 MHz,
which achieves 160 M pixels/s.
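As a software illustration of the Full Search Block Matching algorithm named in the abstract above, the following minimal sketch tests every displacement in a square search window and keeps the one with the lowest cost. The sum of absolute differences (SAD) cost, the function names, and the frame layout (lists of rows) are assumptions; the abstract does not state which matching criterion the hardware uses.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur`
    at (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Exhaustively test every displacement in [-p, p] x [-p, p].

    Returns (best_cost, dx, dy). Candidates that fall outside the
    reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

Every candidate position is independent of the others, which is why the algorithm lends itself to the pipelined and parallel hardware organization described in the abstract.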
Abstract: This paper describes the design of a real-time audio-range
digital oscilloscope and its implementation on a 90 nm CMOS
FPGA platform. The design consists of sample-and-hold circuits,
A/D conversion, audio and video processing, on-chip RAM, clock
generation and control logic. The design of internal blocks and
modules in 90nm devices in an FPGA is elaborated. Also the key
features and their implementation algorithms are presented.
Finally, the timing waveforms and simulation results are put
forward.
Abstract: Recent satellite projects and programs make
extensive use of real-time embedded systems. 16-bit processors
that meet the Mil-Std-1750 standard architecture have been used in
on-board systems. Most space applications have been written
in Ada. Looking ahead, 32-bit and 64-bit processors are
needed in the area of spacecraft computing, and therefore a
study and survey of 64-bit architectures for space
applications is desirable. This will also result in significant technology
development in terms of VLSI and software tools for Ada (as the
legacy code is in Ada).
There are several basic requirements for a special processor for
this purpose. They include Radiation Hardened (RadHard) devices,
very low power dissipation, compatibility with existing operational
systems, scalable architectures for higher computational needs,
reliability, higher memory and I/O bandwidth, predictability, a real-time
operating system, and manufacturability of such processors.
Further considerations include the selection of FPGA devices, the selection
of EDA tool chains, design flow, partitioning of the design, pin
count, performance evaluation, and timing analysis.
This project comprises a brief study of the 32-bit and 64-bit processors
readily available on the market and the design and fabrication of a 64-bit
RISC processor, named RISC MicroProcessor, with the added
functionality of an extended double-precision floating-point unit
and a 32-bit signal processing unit acting as co-processors. In this
paper, we emphasize the ease and importance of using an open core
(OpenSparc T1 Verilog RTL) and open-source EDA tools such as
Icarus to develop FPGA-based prototypes quickly. Commercial tools
such as Xilinx ISE for synthesis are also used when appropriate.
Abstract: An efficient parallel form can improve the performance of an algorithm on a digital signal processor. The butterfly structure plays an important role in the fast Fourier transform (FFT) because its symmetric form is suitable for hardware implementation. Although the structure is symmetric, its performance is reduced by the data-dependent flow of the computation. Recent research on novel memory reference reduction methods (NMRRM) for the FFT focuses on reducing memory references to twiddle factors, but the data-dependent property still exists. In this paper, we propose a parallel-computing approach for FFT implementation on a digital signal processor (DSP) that is based on a data-independent property and still retains the low-memory-reference property. The proposed method combines the final two steps of the NMRRM FFT to form a novel data-independent structure, and it is well suited to DSPs with multiple operation units and to dual-core systems. We have applied the proposed method to the radix-2 low-memory-reference FFT algorithm on a TI TMS320C64x DSP. Experimental results show that the method reduces clock cycles by 33.8% compared with the NMRRM FFT implementation while keeping the low-memory-reference property.
Abstract: The need for multilingual communication in Japan has
increased due to an increase in the number of foreigners in the
country. When people communicate in their nonnative language,
the differences in language prevent mutual understanding among
the communicating individuals. In the medical field, communication
between the hospital staff and patients is a serious problem. Currently,
medical translators accompany patients to medical care facilities, and
the demand for medical translators is increasing. However, medical
translators cannot necessarily provide support, especially in cases in
which round-the-clock support is required or in case of emergencies.
The medical field has high expectations of information technology.
Hence, a system that supports accurate multilingual communication is
required. Despite recent advances in machine translation technology,
it is very difficult to obtain highly accurate translations. We have
developed a support system called M3 for multilingual medical
reception. M3 provides support functions that aid foreign patients in
the following respects: conversation, questionnaires, reception procedures,
and hospital navigation; it also has a Q&A function. Users
can operate M3 using a touch screen and receive text-based support.
In addition, M3 uses accurate translation tools called parallel texts
to facilitate reliable communication through conversations between
the hospital staff and the patients. However, if there is no parallel
text that expresses what users want to communicate, the users cannot
communicate. In this study, we have developed a circulating support
environment for multilingual medical communication using parallel
texts. The proposed environment can circulate necessary parallel texts
through the following procedure: (1) a user provides feedback about
the necessary parallel texts, following which (2) these parallel texts
are created and evaluated.
Abstract: Timing-driven physical design, synthesis, and
optimization tools need efficient closed-form delay models for
estimating the delay associated with each net in an integrated circuit
(IC) design. The total number of nets in a modern IC design has
increased dramatically and now runs into the millions. Therefore, efficient
modeling of interconnection is needed for high-speed ICs. This
paper presents closed-form expressions for RC and RLC
interconnection trees in current-mode signaling, which can be
implemented in VLSI design tools. These analytical
expressions can be used for accurate calculation of delay after the
design's clock tree has been laid out and the design is fully routed.
Evaluation of these analytical models is several orders of magnitude
faster than simulation using SPICE.
Abstract: In this paper, we present a simple circuit for
Manchester decoding that does not use any complicated or
programmable devices. This circuit can decode transmitted
encoded data at 90 kbps; higher transmission rates can be
decoded if high-speed devices are used. We also present a new
method for extracting the embedded clock from Manchester data in
order to use it for serial-to-parallel conversion. All of our
measurements were obtained by simulation.
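The decoding rule behind a Manchester decoder can be shown in software with a minimal sketch. It is not a model of the circuit presented in the abstract above: it assumes the input has already been sampled twice per bit period and aligned to bit boundaries, so clock extraction is not covered. The function name and the `ieee` flag are assumptions. With `ieee=True` the IEEE 802.3 convention is used (a low-to-high mid-bit transition is a 1); with `ieee=False` the opposite (G. E. Thomas) convention is used.

```python
def manchester_decode(half_bits, ieee=True):
    """Decode a list of half-bit samples (0/1), two samples per data bit.

    Every valid Manchester symbol has a transition in the middle of the
    bit period, so the two halves must differ.
    """
    if len(half_bits) % 2 != 0:
        raise ValueError("expected an even number of half-bit samples")
    bits = []
    for i in range(0, len(half_bits), 2):
        first, second = half_bits[i], half_bits[i + 1]
        if first == second:
            raise ValueError(f"no mid-bit transition at symbol {i // 2}")
        # IEEE 802.3: low->high is 1, so the data bit equals the second half.
        # Thomas convention: high->low is 1, so it equals the first half.
        bits.append(second if ieee else first)
    return bits


if __name__ == "__main__":
    print(manchester_decode([0, 1, 1, 0, 0, 1]))
```

Because each symbol carries a guaranteed transition, a receiver can recover the bit clock from the data stream itself, which is the property the clock-extraction method in the abstract relies on.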
Abstract: This paper focuses on the detection of wormhole attacks in wireless sensor networks. The wormhole attack is particularly challenging to deal with because the adversary does not need to compromise any nodes and can use laptops or other wireless devices to send packets over a low-latency channel. This paper introduces an easy and effective method to detect and locate wormholes. Since beacon nodes are assumed to know their coordinates, the straight-line distance between each pair of them can be calculated and then compared with the corresponding hop distance, which in this paper equals the hop count × the node's transmission range R. A dramatic difference between the two may indicate an existing wormhole, and our detection mechanism is based on this observation. The approximate location of the wormhole can also be derived in further steps from this information. To the best of our knowledge, our method is much simpler than other wormhole detection schemes that also use beacon nodes, and it is more economical than schemes that impose special requirements on each node (e.g., GPS receivers, tightly synchronized clocks, or directional antennas). Simulation results show that the algorithm is successful in detecting and locating wormholes when the density of beacon nodes reaches 0.008 per m².
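The comparison between straight-line distance and hop distance described in the abstract above can be sketched as follows. The decision rule is an assumption made for illustration: a pair of beacons is flagged when hop count × R is smaller than their straight-line distance, since each hop can cover at most R and such a route would be physically impossible without a shortcut. The abstract does not state the exact threshold the authors use, and the function and parameter names here are hypothetical.

```python
import math


def suspicious_pairs(beacons, hop_counts, tx_range):
    """Flag beacon pairs whose hop distance is shorter than is physically possible.

    beacons:    dict mapping beacon id -> (x, y) coordinates
    hop_counts: dict mapping (id_a, id_b) -> observed hop count
    tx_range:   node transmission range R, in the same unit as the coordinates
    """
    flagged = []
    for (a, b), hops in hop_counts.items():
        straight = math.dist(beacons[a], beacons[b])
        hop_distance = hops * tx_range
        # Without a wormhole, hops * R must be at least the straight-line distance.
        if hop_distance < straight:
            flagged.append((a, b, straight, hop_distance))
    return flagged


if __name__ == "__main__":
    beacons = {"A": (0, 0), "B": (100, 0), "C": (0, 30)}
    hops = {("A", "B"): 3, ("A", "C"): 4}
    print(suspicious_pairs(beacons, hops, tx_range=10))
```

In the example, beacons A and B are 100 units apart but appear only 3 hops (30 units) away from each other, so the pair is flagged, while A and C are consistent with normal routing.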
Abstract: In this paper we present a soft timing phase estimation (STPE) method for wireless mobile receivers operating at low signal-to-noise ratios (SNRs). Discrete Polyphase Matched (DPM) filters and a Log-maximum a posteriori probability (MAP) and/or a Soft-output Viterbi algorithm (SOVA) are combined to derive a new timing recovery (TR) scheme. We apply this scheme to a wireless cellular communication system model that comprises a raised cosine filter (RCF) and a bit-interleaved turbo-coded multi-level modulation (BITMM) scheme; the channel is assumed to be memoryless. Furthermore, no clock signals are transmitted to the receiver, contrary to the classical data-aided (DA) models. This new model ensures that both the bandwidth and the power of the communication system are conserved. However, the computational complexity of ideal turbo synchronization is increased by 50%. Several simulation tests of bit error rate (BER) and block error rate (BLER) versus low SNR reveal that the proposed iterative soft timing recovery (ISTR) scheme outperforms the conventional schemes.
Abstract: This paper describes the design and fabrication of a clock and data recovery (CDR) circuit. We propose a new CDR circuit based on a 1/4-rate frequency detector (QRFD). The proposed frequency detector helps reduce the VCO frequency and is thus advantageous for high-speed applications. It can achieve low-jitter operation and extend the pull-in range without using a reference clock. The proposed CDR was implemented using a 1/4-rate bang-bang phase detector (PD) and a ring voltage-controlled oscillator (VCO). The CDR circuit has been fabricated in a standard 0.18 μm CMOS technology. It occupies an active area of 1 x 1 and consumes 90 mW from a single 1.8 V supply.
Abstract: This paper describes a novel monitoring scheme that
minimizes total active power in digital circuits according to the demanded
frequency by automatically adjusting both the supply voltage and the
threshold voltages based on circuit operating conditions such as
temperature, process variations, and desired frequency. The delay
monitoring results are used to control these voltages so that they are maintained at
the minimum value at which the chip is able to operate for a given
clock frequency. Design details of the power monitor are examined using
a simulation framework in a 32nm BTPM model CMOS process.
Experimental results show that the power overhead of the proposed circuit
is about 40 μW in 32nm technology;
moreover, the results show that the proposed circuit design is not very
sensitive to temperature variations or process variations.
In addition, it uses simple blocks that offer good sensitivity, high
speed, and a continuous feedback loop. This design provides up to a
40% reduction in power consumption in active mode.
Abstract: Unlike general-purpose processors, digital signal
processors (DSP processors) are strongly application-dependent. To
meet the needs for diverse applications, a wide variety of DSP
processors based on different architectures ranging from the
traditional to VLIW have been introduced to the market over the
years. The functionality, performance, and cost of these processors
vary over a wide range. When selecting a processor that meets the
design criteria for an application, digital signal processing (DSP)
application developers are usually most concerned with processor
performance. Performance data are also essential for the designers of
DSP processors to improve their design. Consequently, several DSP
performance benchmarks have been proposed over the past decade or
so. However, none of these benchmarks seem to have included recent
new DSP applications.
In this paper, we use a new benchmark that we recently developed
to compare the performance of popular DSP processors from Texas
Instruments and StarCore. The new benchmark is based on the
Selectable Mode Vocoder (SMV), a speech-coding program from the
recent third generation (3G) wireless voice applications. All
benchmark kernels are compiled by the compilers of the respective
DSP processors and run on their simulators. Weighted arithmetic
mean of clock cycles and arithmetic mean of code size are used to
compare the performance of five DSP processors.
In addition, we studied how the performance of a processor is
affected by code structure, features of the processor architecture, and
compiler optimization. The extensive experimental data gathered,
analyzed, and presented in this paper should be helpful for DSP
processor and compiler designers to meet their specific design goals.
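The two summary metrics named in this abstract, a weighted arithmetic mean of clock cycles and an arithmetic mean of code size, can be computed as in the minimal sketch below. The kernel names, the weights, and the numbers in the usage example are hypothetical placeholders; the abstract does not give the weighting scheme or any per-kernel data.

```python
def weighted_mean_cycles(cycles, weights):
    """Weighted arithmetic mean of per-kernel clock cycle counts.

    cycles and weights are dicts keyed by kernel name.
    """
    if set(cycles) != set(weights):
        raise ValueError("cycles and weights must cover the same kernels")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(cycles[k] * weights[k] for k in cycles) / total_weight


def mean_code_size(sizes):
    """Plain arithmetic mean of per-kernel code sizes."""
    if not sizes:
        raise ValueError("no kernels given")
    return sum(sizes.values()) / len(sizes)


if __name__ == "__main__":
    # Hypothetical kernels and numbers, for illustration only.
    cycles = {"kernel_a": 100, "kernel_b": 300}
    weights = {"kernel_a": 3, "kernel_b": 1}
    print(weighted_mean_cycles(cycles, weights))
    print(mean_code_size({"kernel_a": 10, "kernel_b": 30}))
```

Weighting the cycle counts lets kernels that dominate the run time of the application count for more in the final score than kernels that are rarely executed.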
Abstract: Phase locked loops for data links operating at 10 Gb/s
or faster are low phase noise devices designed to operate with a low
jitter reference clock. Characterization of their jitter transfer function
is difficult because the intrinsic noise of the device is comparable to
the random noise level in the reference clock signal. A linear model
is proposed to account for the intrinsic noise of a PLL. The intrinsic
noise data of a PLL for 10 Gb/s links is presented. The jitter transfer
function of a PLL in a test chip for 12.8 Gb/s data links was
determined in experiments using the 400 MHz reference clock as the
source of simultaneous excitations over a wide range of frequency.
The result shows that the PLL jitter transfer function can be
approximated by a second order linear model.
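To make the phrase "second order linear model" concrete, the sketch below evaluates the magnitude of one common textbook form of a second-order PLL jitter transfer function, H(s) = (2ζω_n s + ω_n²) / (s² + 2ζω_n s + ω_n²). This particular form, the function name, and the parameter values in the usage example are assumptions; the abstract does not give the model's equation, its natural frequency, or its damping factor.

```python
import math


def jitter_transfer_magnitude(f_hz, fn_hz, zeta):
    """|H(j*2*pi*f)| for the assumed second-order model
    H(s) = (2*zeta*wn*s + wn**2) / (s**2 + 2*zeta*wn*s + wn**2).

    f_hz:  jitter frequency in Hz
    fn_hz: natural frequency of the loop in Hz (assumed parameter)
    zeta:  damping factor (assumed parameter)
    """
    wn = 2.0 * math.pi * fn_hz
    s = 1j * 2.0 * math.pi * f_hz
    numerator = 2.0 * zeta * wn * s + wn ** 2
    denominator = s ** 2 + 2.0 * zeta * wn * s + wn ** 2
    return abs(numerator / denominator)


if __name__ == "__main__":
    # Illustrative values only: 1 MHz natural frequency, damping 0.7.
    for f in (1e4, 1e5, 1e6, 1e7, 1e8):
        print(f, jitter_transfer_magnitude(f, fn_hz=1e6, zeta=0.7))
```

In this form, low-frequency reference jitter passes through with unity gain, jitter near the natural frequency can be slightly amplified, and high-frequency jitter is attenuated.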
Abstract: To satisfy the need for outfield tests of star sensors, a
method is put forward to construct the reference attitude benchmark.
First, its basic principle is introduced. Then, all the separate
conversion matrices are derived: the conversion
matrix for the transformation from the Earth-Centered
Inertial frame i to the Earth-centered Earth-fixed frame w according to
the time of an atomic clock, the conversion matrix from frame w to the
geographic frame t, and the matrix from frame t to the platform frame
p. The attitude matrix of the benchmark platform relative to
frame i is then obtained as the product of these three matrices.
Next, the attitude matrix of the star sensor
relative to frame i is obtained once the mounting matrix from frame p to the
star sensor frame s is calibrated, and the reference attitude angles for
star sensor outfield tests can be calculated from the transformation
from frame i to frame s. Finally, a computer program is written to
solve for the reference attitudes, and error curves are drawn for the
three-axis attitude angles, whose maximum absolute error is just 0.25″.
The analysis of each loop and the final simulation results show that
the method of acquiring the absolute reference attitude by precise timing
is feasible for star sensor outfield tests.
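The chain of frame transformations described in this abstract can be written compactly. The notation is an assumption introduced here for illustration: C_a^b denotes the matrix that transforms a vector from frame a to frame b, so successive transformations multiply from the right to the left.

```latex
\begin{aligned}
C_i^{p} &= C_t^{p}\, C_w^{t}\, C_i^{w}, \\
C_i^{s} &= C_p^{s}\, C_i^{p} = C_p^{s}\, C_t^{p}\, C_w^{t}\, C_i^{w}.
\end{aligned}
```

Here C_i^w depends on the atomic clock time, C_p^s is the calibrated mounting matrix, and the reference attitude angles are extracted from C_i^s.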
Abstract: The object of this research is the design and
evaluation of an immersive Virtual Learning Environment (VLE) for
deaf children. We recently developed a prototype immersive
VR game to teach sign language mathematics to deaf students in grades
K-4 [1] [2]. In this paper we describe a significant extension of the
prototype application. The extension includes: (1) user-centered
design and implementation of two additional interactive
environments (a clock store and a bakery), and (2) user-centered
evaluation including development of user tasks, expert panel-based
evaluation, and formative evaluation. This paper is one of the few to
focus on the importance of user-centered, iterative design in VR
application development, and to describe a structured evaluation
method.
Abstract: This paper introduces an adiabatic register file based
on two-phase CPAL (Complementary Pass-Transistor Adiabatic
Logic) circuits with a power-gating scheme, which can operate on a
single-phase power clock. A 32×32 single-phase adiabatic register file
with the power-gating scheme has been implemented in TSMC 0.18μm
CMOS technology. All the circuits except the storage cells employ
two-phase CPAL circuits, and the storage cells are based on a
conventional memory cell. A two-phase non-overlap power-clock
generator with the power-gating scheme is used to supply the proposed
adiabatic register file. Full-custom layouts are drawn. The energy and
functional simulations have been performed using the netlist
extracted from the layouts. HSPICE simulations show that the proposed
adiabatic register file operates correctly and attains about 73%
energy savings at 100 MHz compared with a traditional static
CMOS register file.