Abstract: The success of an electronic system in a System-on- Chip is highly dependent on the efficiency of its interconnection network, which is constructed from routers and channels (the routers move data across the channels between nodes). Since neither classical bus based nor point to point architectures can provide scalable solutions and satisfy the tight power and performance requirements of future applications, the Network-on-Chip (NoC) approach has recently been proposed as a promising solution. Indeed, in contrast to the traditional solutions, the NoC approach can provide large bandwidth with moderate area overhead. The selected topology of the components interconnects plays prime rule in the performance of NoC architecture as well as routing and switching techniques that can be used. In this paper, we present two generic NoC architectures that can be customized to the specific communication needs of an application in order to reduce the area with minimal degradation of the latency of the system. An experimental study is performed to compare these structures with basic NoC topologies represented by 2D mesh, Butterfly-Fat Tree (BFT) and SPIN. It is shown that Cluster mesh (CMesh) and MinRoot schemes achieves significant improvements in network latency and energy consumption with only negligible area overhead and complexity over existing architectures. In fact, in the case of basic NoC topologies, CMesh and MinRoot schemes provides substantial savings in area as well, because they requires fewer routers. The simulation results show that CMesh and MinRoot networks outperforms MESH, BFT and SPIN in main performance metrics.
Abstract: The general purpose processors that are used in
embedded systems must support constraints like execution time,
power consumption, code size and so on. On the other hand an
Application Specific Instruction-set Processor (ASIP) has advantages
in terms of power consumption, performance and flexibility. In this
paper, a 16-bit Application Specific Instruction-set processor for the
sensor data transfer is proposed. The designed processor architecture
consists of on-chip transmitter and receiver modules along with the
processing and controlling units to enable the data transmission and
reception on a single die. The data transfer is accomplished with less
number of instructions as compared with the general purpose
processor. The ASIP core operates at a maximum clock frequency of
1.132GHz with a delay of 0.883ns and consumes 569.63mW power
at an operating voltage of 1.2V. The ASIP is implemented in Verilog
HDL using the Xilinx platform on Virtex4.
Abstract: A synchronous network-on-chip using wormhole packet switching
and supporting guaranteed-completion best-effort with low-priority (LP)
and high-priority (HP) wormhole packet delivery service is presented in
this paper. Both our proposed LP and HP message services deliver a good
quality of service in term of lossless packet completion and in-order message
data delivery. However, the LP message service does not guarantee minimal
completion bound. The HP packets will absolutely use 100% bandwidth of
their reserved links if the HP packets are injected from the source node with
maximum injection. Hence, the service are suitable for small size messages
(less than hundred bytes). Otherwise the other HP and LP messages, which
require also the links, will experience relatively high latency depending on the
size of the HP message. The LP packets are routed using a minimal adaptive
routing, while the HP packets are routed using a non-minimal adaptive routing
algorithm. Therefore, an additional 3-bit field, identifying the packet type,
is introduced in their packet headers to classify and to determine the type
of service committed to the packet. Our NoC prototypes have been also
synthesized using a 180-nm CMOS standard-cell technology to evaluate the
cost of implementing the combination of both services.
Abstract: Independent spanning trees (ISTs) provide a number of advantages in data broadcasting. One can cite the use in fault tolerance network protocols for distributed computing and bandwidth. However, the problem of constructing multiple ISTs is considered hard for arbitrary graphs. In this paper we present an efficient algorithm to construct ISTs on hypercubes that requires minimum resources to be performed.
Abstract: As far as the latest technological improvements are concerned, digital systems more become popular than the past. Despite this growing demand to the digital systems, content copy and attack against the digital cinema contents becomes a serious problem. To solve the above security problem, we propose “traceable watermarking using Hash functions for digital cinema system. Digital Cinema is a great application for traceable watermarking since it uses watermarking technology during content play as well as content transmission. The watermark is embedded into the randomly selected movie frames using CRC-32 techniques. CRC-32 is a Hash function. Using it, the embedding position is distributed by Hash Function so that any party cannot break off the watermarking or will not be able to change. Finally, our experimental results show that proposed DWT watermarking method using CRC-32 is much better than the convenient watermarking techniques in terms of robustness, image quality and its simple but unbreakable algorithm.
Abstract: Network on a chip (NoC) has been proposed as a viable solution to counter the inefficiency of buses in the current VLSI on-chip interconnects. However, as the silicon chip accommodates more transistors, the probability of transient faults is increasing, making fault tolerance a key concern in scaling chips. In packet based communication on a chip, transient failures can corrupt the data packet and hence, undermine the accuracy of data communication. In this paper, we present a comparative analysis of transient fault tolerant techniques including end-to-end, node-by-node, and stochastic communication based on flooding principle.
Abstract: We prove detailed analysis of a waveguide-based Schottky barrier photodetector (SBPD) where a thin silicide film is put on the top of a silicon-on-insulator (SOI) channel waveguide to absorb light propagating along the waveguide. Taking both the confinement factor of light absorption and the wall scanning induced gain of the photoexcited carriers into account, an optimized silicide thickness is extracted to maximize the effective gain, thereby the responsivity. For typical lengths of the thin silicide film (10-20 Ðçm), the optimized thickness is estimated to be in the range of 1-2 nm, and only about 50-80% light power is absorbed to reach the maximum responsivity. Resonant waveguide-based SBPDs are proposed, which consist of a microloop, microdisc, or microring waveguide structure to allow light multiply propagating along the circular Si waveguide beneath the thin silicide film. Simulation results suggest that such resonant waveguide-based SBPDs have much higher repsonsivity at the resonant wavelengths as compared to the straight waveguidebased detectors. Some experimental results about Si waveguide-based SBPD are also reported.
Abstract: This paper presents design trade-off and performance impacts of
the amount of pipeline phase of control path signals in a wormhole-switched
network-on-chip (NoC). The numbers of the pipeline phase of the control
path vary between two- and one-cycle pipeline phase. The control paths
consist of the routing request paths for output selection and the arbitration
paths for input selection. Data communications between on-chip routers are
implemented synchronously and for quality of service, the inter-router data
transports are controlled by using a link-level congestion control to avoid
lose of data because of an overflow. The trade-off between the area (logic
cell area) and the performance (bandwidth gain) of two proposed NoC router
microarchitectures are presented in this paper. The performance evaluation is
made by using a traffic scenario with different number of workloads under
2D mesh NoC topology using a static routing algorithm. By using a 130-nm
CMOS standard-cell technology, our NoC routers can be clocked at 1 GHz,
resulting in a high speed network link and high router bandwidth capacity
of about 320 Gbit/s. Based on our experiments, the amount of control path
pipeline stages gives more significant impact on the NoC performance than
the impact on the logic area of the NoC router.
Abstract: In this paper, we address the problem of reducing the
switching activity (SA) in on-chip buses through the use of a bus
binding technique in high-level synthesis. While many binding
techniques to reduce the SA exist, we present yet another technique for
further reducing the switching activity. Our proposed method
combines bus binding and data sequence reordering to explore a wider
solution space. The problem is formulated as a multiple traveling
salesman problem and solved using simulated annealing technique.
The experimental results revealed that a binding solution obtained
with the proposed method reduces 5.6-27.2% (18.0% on average) and
2.6-12.7% (6.8% on average) of the switching activity when compared
with conventional binding-only and hybrid binding-encoding
methods, respectively.
Abstract: The resistive-inductive-capacitive behavior of long
interconnects which are driven by CMOS gates are presented in this
paper. The analysis is based on the ¤Ç-model of a RLC load and is
developed for submicron devices. Accurate and analytical
expressions for the output load voltage, the propagation delay and the
short circuit power dissipation have been proposed after solving a
system of differential equations which accurately describe the
behavior of the circuit. The effect of coupling capacitance between
input and output and the short circuit current on these performance
parameters are also incorporated in the proposed model. The
estimated proposed delay and short circuit power dissipation are in
very good agreement with the SPICE simulation with average
relative error less than 6%.
Abstract: Every day human life experiences new equipments
more automatic and with more abilities. So the need for faster
processors doesn-t seem to finish. Despite new architectures and
higher frequencies, a single processor is not adequate for many
applications. Parallel processing and networks are previous solutions
for this problem. The new solution to put a network of resources on a
chip is called NOC (network on a chip). The more usual topology for
NOC is mesh topology. There are several routing algorithms suitable
for this topology such as XY, fully adaptive, etc. In this paper we
have suggested a new algorithm named Intermittent X, Y (IX/Y). We
have developed the new algorithm in simulation environment to
compare delay and power consumption with elders' algorithms.
Abstract: The practical implementation of audio-video coupled speech recognition systems is mainly limited by the hardware complexity to integrate two radically different information capturing devices with good temporal synchronisation. In this paper, we propose a solution based on a smart CMOS image sensor in order to simplify the hardware integration difficulties. By using on-chip image processing, this smart sensor can calculate in real time the X/Y projections of the captured image. This on-chip projection reduces considerably the volume of the output data. This data-volume reduction permits a transmission of the condensed visual information via the same audio channel by using a stereophonic input available on most of the standard computation devices such as PC, PDA and mobile phones. A prototype called VMIKE (Visio-Microphone) has been designed and realised by using standard 0.35um CMOS technology. A preliminary experiment gives encouraged results. Its efficiency will be further investigated in a large variety of applications such as biometrics, speech recognition in noisy environments, and vocal control for military or disabled persons, etc.
Abstract: Although silicon photonic devices provide a significantly larger bandwidth and dissipate a substantially less power than the electronic devices, they suffer from a large size due to the fundamental diffraction limit and the weak optical response of Si. A potential solution is to exploit Si plasmonics, which may not only miniaturize the photonic device far beyond the diffraction limit, but also enhance the optical response in Si due to the electromagnetic field confinement. In this paper, we discuss and summarize the recently developed metal-insulator-Si-insulator-metal nanoplasmonic waveguide as well as various passive and active plasmonic components based on this waveguide, including coupler, bend, power splitter, ring resonator, MZI, modulator, detector, etc. All these plasmonic components are CMOS compatible and could be integrated with electronic and conventional dielectric photonic devices on the same SOI chip. More potential plasmonic devices as well as plasmonic nanocircuits with complex functionalities are also addressed.
Abstract: This paper describes the design of a real-time audiorange
digital oscilloscope and its implementation in 90nm CMOS
FPGA platform. The design consists of sample and hold circuits,
A/D conversion, audio and video processing, on-chip RAM, clock
generation and control logic. The design of internal blocks and
modules in 90nm devices in an FPGA is elaborated. Also the key
features and their implementation algorithms are presented.
Finally, the timing waveforms and simulation results are put
forward.
Abstract: This work proposes an accurate crosstalk noise estimation method in the presence of multiple RLC lines for the use in design automation tools. This method correctly models the loading effects of non switching aggressors and aggressor tree branches using resistive shielding effect and realistic exponential input waveforms. Noise peak and width expressions have been derived. The results obtained are at good agreement with SPICE results. Results show that average error for noise peak is 4.7% and for the width is 6.15% while allowing a very fast analysis.
Abstract: A new concept for long-term reagent storage for Labon- a-Chip (LoC) devices is described. Here we present a polymer multilayer stack with integrated stick packs for long-term storage of several liquid reagents, which are necessary for many diagnostic applications. Stick packs are widely used in packaging industry for storing solids and liquids for long time. The storage concept fulfills two main requirements: First, a long-term storage of reagents in stick packs without significant losses and interaction with surroundings, second, on demand releasing of liquids, which is realized by pushing a membrane against the stick pack through pneumatic pressure. This concept enables long-term on-chip storage of liquid reagents at room temperature and allows an easy implementation in different LoC devices.
Abstract: With the increasing number of on-chip components and the critical requirement for processing power, Chip Multiprocessor (CMP) has gained wide acceptance in both academia and industry during the last decade. However, the conventional bus-based onchip communication schemes suffer from very high communication delay and low scalability in large scale systems. Network-on-Chip (NoC) has been proposed to solve the bottleneck of parallel onchip communications by applying different network topologies which separate the communication phase from the computation phase. Observing that the memory bandwidth of the communication between on-chip components and off-chip memory has become a critical problem even in NoC based systems, in this paper, we propose a novel 3D NoC with on-chip Dynamic Random Access Memory (DRAM) in which different layers are dedicated to different functionalities such as processors, cache or memory. Results show that, by using our proposed architecture, average link utilization has reduced by 10.25% for SPLASH-2 workloads. Our proposed design costs 1.12% less execution cycles than the traditional design on average.
Abstract: With the advent of digital cinema and digital
broadcasting, copyright protection of video data has been one of the
most important issues.
We present a novel method of watermarking for video image data
based on the hardware and digital wavelet transform techniques and
name it as “traceable watermarking" because the watermarked data is
constructed before the transmission process and traced after it has been
received by an authorized user.
In our method, we embed the watermark to the lowest part of each
image frame in decoded video by using a hardware LSI.
Digital Cinema is an important application for traceable
watermarking since digital cinema system makes use of watermarking
technology during content encoding, encryption, transmission,
decoding and all the intermediate process to be done in digital cinema
systems. The watermark is embedded into the randomly selected
movie frames using hash functions.
Embedded watermark information can be extracted from the
decoded video data. For that, there is no need to access original movie
data. Our experimental results show that proposed traceable
watermarking method for digital cinema system is much better than the
convenient watermarking techniques in terms of robustness, image
quality, speed, simplicity and robust structure.
Abstract: Timing driven physical design, synthesis, and
optimization tools need efficient closed-form delay models for
estimating the delay associated with each net in an integrated circuit
(IC) design. The total number of nets in a modern IC design has
increased dramatically and exceeded millions. Therefore efficient
modeling of interconnection is needed for high speed IC-s. This
paper presents closed–form expressions for RC and RLC
interconnection trees in current mode signaling, which can be
implemented in VLSI design tool. These analytical model
expressions can be used for accurate calculation of delay after the
design clock tree has been laid out and the design is fully routed.
Evaluation of these analytical models is several orders of magnitude
faster than simulation using SPICE.