Abstract: Speedups from mapping four real-life DSP
applications on an embedded system-on-chip that couples coarsegrained
reconfigurable logic with an instruction-set processor are
presented. The reconfigurable logic is realized by a 2-Dimensional
Array of Processing Elements. A design flow for improving
application-s performance is proposed. Critical software parts, called
kernels, are accelerated on the Coarse-Grained Reconfigurable
Array. The kernels are detected by profiling the source code. For
mapping the detected kernels on the reconfigurable logic a prioritybased
mapping algorithm has been developed. Two 4x4 array
architectures, which differ in their interconnection structure among
the Processing Elements, are considered. The experiments for eight
different instances of a generic system show that important overall
application speedups have been reported for the four applications.
The performance improvements range from 1.86 to 3.67, with an
average value of 2.53, compared with an all-software execution.
These speedups are quite close to the maximum theoretical speedups
imposed by Amdahl-s law.
Abstract: A fully on-chip low drop-out (LDO) voltage regulator with 100pF output load capacitor is presented. A novel frequency compensation scheme using current buffer is adopted to realize single dominant pole within the unit gain frequency of the regulation loop, the phase margin (PM) is at least 50 degree under the full range of the load current, and the power supply rejection (PSR) character is improved compared with conventional Miller compensation. Besides, the differentiator provides a high speed path during the load current transient. Implemented in 0.18μm CMOS technology, the LDO voltage regulator provides 100mA load current with a stable 1.8V output voltage consuming 80μA quiescent current.
Abstract: As the development of digital technology is increasing,
Digital cinema is getting more spread.
However, content copy and attack against the digital cinema becomes
a serious problem. To solve the above security problem, we propose
“Additional Watermarking" for digital cinema delivery system. With
this proposed “Additional watermarking" method, we protect content
copyrights at encoder and user side information at decoder. It realizes
the traceability of the watermark embedded at encoder.
The watermark is embedded into the random-selected frames using
Hash function. Using it, the embedding position is distributed by Hash
Function so that third parties do not break off the watermarking
algorithm.
Finally, our experimental results show that proposed method is much
better than the convenient watermarking techniques in terms of
robustness, image quality and its simple but unbreakable algorithm.
Abstract: Local Linear Neuro-Fuzzy Models (LLNFM) like other neuro- fuzzy systems are adaptive networks and provide robust learning capabilities and are widely utilized in various applications such as pattern recognition, system identification, image processing and prediction. Local linear model tree (LOLIMOT) is a type of Takagi-Sugeno-Kang neuro fuzzy algorithm which has proven its efficiency compared with other neuro fuzzy networks in learning the nonlinear systems and pattern recognition. In this paper, a dedicated reconfigurable and parallel processing hardware for LOLIMOT algorithm and its applications are presented. This hardware realizes on-chip learning which gives it the capability to work as a standalone device in a system. The synthesis results on FPGA platforms show its potential to improve the speed at least 250 of times faster than software implemented algorithms.
Abstract: Electron multiplying charge coupled devices (EMCCDs) have revolutionized the world of low light imaging by introducing on-chip multiplication gain based on the impact ionization effect in the silicon. They combine the sub-electron readout noise with high frame rates. Signal-to-noise Ratio (SNR) is an important performance parameter for low-light-level imaging systems. This work investigates the SNR performance of an EMCCD operated in Non-inverted Mode (NIMO) and Inverted Mode (IMO). The theory of noise characteristics and operation modes is presented. The results show that the SNR of is determined by dark current and clock induced charge at high gain level. The optimum SNR performance is provided by an EMCCD operated in NIMO in short exposure and strong cooling applications. In contrast, an IMO EMCCD is preferable.
Abstract: In this paper, 3X3 routing nodes are proposed to
provide speedup and parallel processing capability in Data Vortex
network architectures. The new design not only significantly
improves network throughput and latency, but also eliminates the
need for distributive traffic control mechanism originally embedded
among nodes and the need for nodal buffering. The cost effectiveness
is studied by a comparison study with the previously proposed 2-
input buffered networks, and considerable performance enhancement
can be achieved with similar or lower cost of hardware. Unlike
previous implementation, the network leaves small probability of
contention, therefore, the packet drop rate must be kept low for such
implementation to be feasible and attractive, and it can be achieved
with proper choice of operation conditions.
Abstract: A new genetic algorithm, termed the 'optimum individual monogenetic genetic algorithm' (OIMGA), is presented whose properties have been deliberately designed to be well suited to hardware implementation. Specific design criteria were to ensure fast access to the individuals in the population, to keep the required silicon area for hardware implementation to a minimum and to incorporate flexibility in the structure for the targeting of a range of applications. The first two criteria are met by retaining only the current optimum individual, thereby guaranteeing a small memory requirement that can easily be stored in fast on-chip memory. Also, OIMGA can be easily reconfigured to allow the investigation of problems that normally warrant either large GA populations or individuals many genes in length. Local convergence is achieved in OIMGA by retaining elite individuals, while population diversity is ensured by continually searching for the best individuals in fresh regions of the search space. The results given in this paper demonstrate that both the performance of OIMGA and its convergence time are superior to those of a range of existing hardware GA implementations.
Abstract: In this paper, the problem of reducing switching
activity in on-chip buses at the stage of high-level synthesis is
considered, and a high-level low power bus binding based on dynamic
bit reordering is proposed. Whereas conventional methods use a fixed
bit ordering between variables within a bus, the proposed method
switches a bit ordering dynamically to obtain a switching activity
reduction. As a result, the proposed method finds a binding solution
with a smaller value of total switching activity (TSA). Experimental
result shows that the proposed method obtains a binding solution
having 12.0-34.9% smaller TSA compared with the conventional
methods.
Abstract: An efficient architecture for low jitter All Digital
Phase Locked Loop (ADPLL) suitable for high speed SoC
applications is presented in this paper. The ADPLL is designed using
standard cells and described by Hardware Description Language
(HDL). The ADPLL implemented in a 90 nm CMOS process can
operate from 10 to 200 MHz and achieve worst case frequency
acquisition in 14 reference clock cycles. The simulation result shows
that PLL has cycle to cycle jitter of 164 ps and period jitter of 100 ps
at 100MHz. Since the digitally controlled oscillator (DCO) can
achieve both high resolution and wide frequency range, it can meet
the demands of system-level integration. The proposed ADPLL can
easily be ported to different processes in a short time. Thus, it can
reduce the design time and design complexity of the ADPLL, making
it very suitable for System-on-Chip (SoC) applications.
Abstract: PDMS (Polydimethylsiloxane) polymer is a suitable material for biological and MEMS (Microelectromechanical systems) designers, because of its biocompatibility, transparency and high resistance under plasma treatment. PDMS round channel is always been of great interest due to its ability to confine the liquid with membrane type micro valves. In this paper we are presenting a very simple way to form round shapemicrofluidic channel, which is based on reflow of positive photoresist AZ® 40 XT. With this method, it is possible to obtain channel of different height simply by varying the spin coating parameters of photoresist.
Abstract: Fast delay estimation methods, as opposed to
simulation techniques, are needed for incremental performance
driven layout synthesis. On-chip inductive effects are becoming
predominant in deep submicron interconnects due to increasing clock
speed and circuit complexity. Inductance causes noise in signal
waveforms, which can adversely affect the performance of the circuit
and signal integrity. Several approaches have been put forward which
consider the inductance for on-chip interconnect modelling. But for
even much higher frequency, of the order of few GHz, the shunt
dielectric lossy component has become comparable to that of other
electrical parameters for high speed VLSI design. In order to cope up
with this effect, on-chip interconnect has to be modelled as
distributed RLCG line. Elmore delay based methods, although
efficient, cannot accurately estimate the delay for RLCG interconnect
line. In this paper, an accurate analytical delay model has been
derived, based on first and second moments of RLCG
interconnection lines. The proposed model considers both the effect
of inductance and conductance matrices. We have performed the
simulation in 0.18μm technology node and an error of as low as less
as 5% has been achieved with the proposed model when compared to
SPICE. The importance of the conductance matrices in interconnect
modelling has also been discussed and it is shown that if G is
neglected for interconnect line modelling, then it will result an delay
error of as high as 6% when compared to SPICE.