Abstract: The need for micromechanical inertial sensors is increasing
in future electronic stability control (ESC) and other positioning,
navigation and guidance systems. Due to the rising density of
sensors in automotive and consumer devices the goal is not only to get
high performance, robustness and smaller package sizes, but also to
optimize the energy management of the overall sensor system. This
paper presents an evaluation concept for a surface micromachined
yaw rate sensor. Within this evaluation concept an energy-efficient
operation of the drive mode of the yaw rate sensor is enabled. The
presented system concept can be realized within a power management
subsystem.
Abstract: In the MPEG and H.26x standards, motion estimation is
used to eliminate temporal redundancy. Because the motion
estimation stage is computationally very demanding, a hardware
implementation on a reconfigurable circuit is crucial for meeting
the requirements of different real-time multimedia applications.
In this paper, we present a hardware architecture for motion
estimation based on the "Full Search Block Matching" (FSBM)
algorithm. The architecture offers minimum latency, maximum
throughput and full utilization of hardware resources such as
embedded memory blocks, and it combines pipelining with parallel
processing. Our design is described in VHDL, verified by
simulation and implemented on a Stratix II EP2S130F1020C4 FPGA.
The experimental results show that the optimum operating clock
frequency of the proposed design is 89 MHz, which achieves a
throughput of 160 Mpixels/s.
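As a software reference for what a full-search block-matching engine computes, the following minimal sketch evaluates the sum of absolute differences (SAD) for every candidate displacement in a search window and keeps the best one. The function names, the block size `n`, the search range `p` and the synthetic test frames are assumptions made for illustration; they do not describe the pipelined VHDL architecture of the paper.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Test every displacement in [-p, p] x [-p, p] and return
    (best_dx, best_dy, best_sad). Candidates outside `ref` are skipped."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            inside_x = 0 <= bx + dx and bx + dx + n <= w
            inside_y = 0 <= by + dy and by + dy + n <= h
            if not (inside_x and inside_y):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Synthetic 16 x 16 reference frame (hypothetical test data).
ref = [[(7 * x + 13 * y + x * y) % 256 for x in range(16)] for y in range(16)]
# Current frame: pixel (x, y) is a copy of reference pixel (x + 2, y - 1),
# so the true motion vector of any interior block is (2, -1).
cur = [[ref[(y - 1) % 16][(x + 2) % 16] for x in range(16)] for y in range(16)]

mv = full_search(cur, ref, bx=4, by=4, n=4, p=3)
print(mv)  # expected: (2, -1, 0)
```

Every candidate SAD is independent of the others, which is the property that hardware designs of this kind can exploit through parallel processing elements and pipelining.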
Abstract: This paper describes the design of a real-time audio-range
digital oscilloscope and its implementation on a 90 nm CMOS
FPGA platform. The design consists of sample and hold circuits,
A/D conversion, audio and video processing, on-chip RAM, clock
generation and control logic. The design of internal blocks and
modules in 90nm devices in an FPGA is elaborated. Also the key
features and their implementation algorithms are presented.
Finally, the timing waveforms and simulation results are put
forward.
Abstract: Recent satellite projects and programs make
extensive use of real-time embedded systems. 16-bit processors
that meet the Mil-Std-1750 standard architecture have been used in
on-board systems. Most space applications have been written
in ADA. Looking ahead, 32-bit and 64-bit processors are
needed in the area of spacecraft computing, and therefore a study
and survey of 64-bit architectures for space applications is
desirable. This will also result in significant technology
development in terms of VLSI and software tools for ADA (as the
legacy code is in ADA).
There are several basic requirements for a special processor for
this purpose. They include Radiation Hardened (RadHard) devices,
very low power dissipation, compatibility with existing operational
systems, scalable architectures for higher computational needs,
reliability, higher memory and I/O bandwidth, predictability, a
real-time operating system and manufacturability of such processors.
Further considerations may include selection of FPGA devices, selection
of EDA tool chains, design flow, partitioning of the design, pin
count, performance evaluation, timing analysis etc.
This project deals with a brief study of 32-bit and 64-bit processors
readily available on the market and with designing and fabricating a
64-bit RISC processor named RISC MicroProcessor, with the added
functionality of an extended double-precision floating-point unit
and a 32-bit signal processing unit acting as co-processors. In this
paper, we emphasize the ease and importance of using an open core
(OpenSPARC T1 Verilog RTL) and open-source EDA tools such as
Icarus to develop FPGA-based prototypes quickly. Commercial tools
such as Xilinx ISE for synthesis are also used when appropriate.
Abstract: Face detection and recognition have many applications
in a variety of fields such as security systems, videoconferencing and
identification. Face classification is currently implemented in
software. A hardware implementation allows real-time processing,
but has a higher cost and a longer time-to-market.
The objective of this work is to implement a classifier based on
a multi-layer perceptron (MLP) neural network for face detection.
The MLP is used to classify face and non-face patterns. The system is
first described in C on a P4 (2.4 GHz) to extract the weight
values. A hardware implementation is then achieved using a
VHDL-based methodology. We target a Xilinx FPGA as the
implementation platform.
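The classification step that such a design moves into hardware is the MLP forward pass with fixed, previously extracted weights. The sketch below shows that pass for a network with one hidden layer. The layer sizes, the sigmoid activation, the 0.5 decision threshold and all weight values are assumptions chosen only so that the example runs; they are not taken from the paper.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer MLP with a single output neuron."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def is_face(x, params, threshold=0.5):
    """Classify a feature vector as face (True) or non-face (False)."""
    return mlp_forward(x, *params) >= threshold


# Hypothetical weights standing in for values extracted by offline training.
params = (
    [[4.0, -4.0], [-4.0, 4.0]],  # hidden-layer weights (2 neurons, 2 inputs)
    [0.0, 0.0],                  # hidden-layer biases
    [6.0, -6.0],                 # output-layer weights
    0.0,                         # output-layer bias
)

print(is_face([1.0, 0.0], params), is_face([0.0, 1.0], params))
```

Because the weights are constants at inference time, the multiply-accumulate operations map naturally onto fixed hardware multipliers and adders.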
Abstract: In this paper we present an acquisition and treatment system designed for semi-analog gamma cameras. It consists of a nuclear medical Image Acquisition, Treatment and Display (IATD) chain that ensures the acquisition and treatment of the signals (resulting from the gamma-camera detection head) and the construction of the scintigraphic image in real time. This chain is composed of an analog treatment board and a digital treatment board. We describe the designed systems and the digital treatment algorithms, in which we have improved the performance and the flexibility. The digital treatment algorithms are implemented in a specific reprogrammable circuit, an FPGA (Field Programmable Gate Array).
interface for semi-analog cameras of Sopha Medical Vision (SMVi), taking the SOPHY DS7 as an example. The developed system consists of an Image Acquisition, Treatment and Display (IATD) chain ensuring the acquisition and the treatment of the signals resulting from the detection head (DH). The developed chain is formed by an analog treatment board and a digital treatment board designed around a DSP [2]. In this paper we have presented the architecture of a new version of our IATD chain, in which the treatment algorithms are integrated and executed on an FPGA.
Abstract: The advent of multi-million gate Field Programmable
Gate Arrays (FPGAs) with hardware support for multiplication opens
an opportunity to recreate a significant portion of the front end of a
human cochlea using this technology. In this paper we describe the
implementation of the cochlear filter and show that it is entirely
suited to a single-device XC3S500 FPGA implementation. The filter
gave a good fit to real-time data with efficient hardware usage.
Abstract: Model Predictive Control (MPC) is an established control
technique in a wide range of process industries. The reason for
this success is its ability to handle multivariable systems and systems
having input, output or state constraints. Nevertheless, compared with
the PID controller, the implementation of MPC in miniaturized
devices like Field Programmable Gate Arrays (FPGAs) and microcontrollers
has historically been very small scale due to its implementation
complexity and its computation time requirements. At the same
time, such embedded technologies have become an enabler for future
manufacturing enterprises as well as a transformer of organizations
and markets. In this work, we take advantage of recent advances
in this area to deploy one of the most studied and applied
control techniques in industrial engineering. We
propose an efficient firmware for the implementation of constrained
MPC on the STM32 microcontroller using the interior point
method. A performance study shows good execution speed
and low computational burden. These results encourage the development
of predictive control algorithms to be programmed in industrial standard
processes. A PID anti-windup controller was also implemented on
the STM32 in order to make a performance comparison with the
MPC. The main features of the proposed constrained MPC framework
are illustrated through two examples.
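The baseline controller mentioned above can be illustrated with a minimal discrete PID that clamps its output and uses conditional integration as the anti-windup scheme. The abstract does not say which anti-windup method was used, so conditional integration is an assumption here, as are the class name `AntiWindupPID`, the gains, the actuator limits and the first-order plant used for the demonstration.

```python
class AntiWindupPID:
    """Discrete PID with output clamping and conditional integration."""

    def __init__(self, kp, ki, kd, u_min, u_max, dt):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.u_min, self.u_max, self.dt = u_min, u_max, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        candidate = self.integral + error * self.dt
        u_unsat = self.kp * error + self.ki * candidate + self.kd * derivative
        u = min(self.u_max, max(self.u_min, u_unsat))
        # Accept the new integral only if the output is not saturated, or if
        # the error is pulling the output back toward the allowed range.
        leaving_high = u_unsat > self.u_max and error < 0
        leaving_low = u_unsat < self.u_min and error > 0
        if u == u_unsat or leaving_high or leaving_low:
            self.integral = candidate
        return u


# Hypothetical first-order plant dy/dt = (u - y) / tau, integrated with Euler.
dt, tau = 0.01, 1.0
pid = AntiWindupPID(kp=2.0, ki=1.0, kd=0.0, u_min=0.0, u_max=1.0, dt=dt)
y, outputs = 0.0, []
for _ in range(3000):
    u = pid.update(0.8, y)
    outputs.append(u)
    y += dt * (u - y) / tau

print(round(y, 3), min(outputs), max(outputs))
```

While the actuator is saturated the integrator is frozen, so the controller leaves saturation as soon as the proportional term allows it instead of first unwinding a large accumulated integral.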
Abstract: Falling has been one of the major concerns and threats
to the independence of the elderly in their daily lives. With the
worldwide significant growth of the aging population, it is essential
to have a promising solution of fall detection which is able to operate
at high accuracy in real-time and supports large scale implementation
using multiple cameras. A Field Programmable Gate Array (FPGA) is a
highly promising tool to be used as a hardware accelerator in many
emerging embedded vision-based systems. Thus, it is the main
objective of this paper to present an FPGA-based solution of visual
based fall detection to meet stringent real-time requirements with
high accuracy. The hardware architecture of visual based fall
detection which utilizes the pixel locality to reduce memory accesses
is proposed. By exploiting the parallel and pipeline architecture of
FPGA, our hardware implementation of visual based fall detection
using FPGA is able to achieve a performance of 60 fps for a series of
video analytical functions at VGA resolution (640x480). The results
of this work show that FPGAs have great potential and impact in
enabling large-scale vision systems in the future healthcare industry
due to its flexibility and scalability.
Abstract: Encryption and decryption in RSA are done by modular exponentiation, which is achieved by repeated modular multiplication. Hence the efficiency of modular multiplication directly determines the efficiency of the RSA cryptosystem. This paper designs a Modified Montgomery Modular Multiplication in which the addition of operands is computed by a 4:2 compressor. The basic logic operations in the addition are partitioned over two iterations such that parallel computations are performed. This reduces the critical path delay of the proposed Montgomery design. The proposed design and RSA are implemented on Virtex 2 and Virtex 5 FPGAs. The two factors, partitioning and parallelism, have improved the frequency and throughput of the proposed design.
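To make the relationship between modular exponentiation and Montgomery multiplication concrete, here is a minimal software sketch of textbook Montgomery reduction driven by a square-and-multiply loop. It does not model the 4:2 compressor or the two-iteration partitioning of the proposed hardware. The function names and the small RSA key (n = 3233, e = 17, d = 2753) are illustrative assumptions, and `pow(n, -1, r)` requires Python 3.8 or later.

```python
def montgomery_setup(n, k):
    """For odd n < R = 2**k, return n_prime with n * n_prime = -1 (mod R)."""
    r = 1 << k
    return (-pow(n, -1, r)) % r


def mont_mul(a, b, n, k, n_prime):
    """Return a * b * R^-1 mod n for 0 <= a, b < n, using only
    multiplications, a mask and a shift (no division by n)."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> k  # exact: t + m*n is a multiple of R
    return u - n if u >= n else u


def mont_pow(base, exp, n):
    """Compute base**exp mod n by repeated Montgomery multiplication."""
    if n < 3 or n % 2 == 0:
        raise ValueError("modulus must be odd and greater than 1")
    k = n.bit_length()
    n_prime = montgomery_setup(n, k)
    r_mod_n = (1 << k) % n
    x = (base * r_mod_n) % n  # operand in Montgomery form
    acc = r_mod_n             # the value 1 in Montgomery form
    for bit in bin(exp)[2:]:  # left-to-right square and multiply
        acc = mont_mul(acc, acc, n, k, n_prime)
        if bit == "1":
            acc = mont_mul(acc, x, n, k, n_prime)
    return mont_mul(acc, 1, n, k, n_prime)  # leave Montgomery form


# Toy RSA round trip with a textbook-sized key (far too small for real use).
n, e, d = 3233, 17, 2753
cipher = mont_pow(65, e, n)
print(cipher, mont_pow(cipher, d, n))
```

Each loop iteration performs one or two Montgomery multiplications, which is why the delay of a single multiplication dominates the overall RSA throughput.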
Abstract: Modular multiplication is the basic operation
in most public key cryptosystems, such as RSA, DSA, ECC,
and DH key exchange. Unfortunately, very large operands
(on the order of 1024 or 2048 bits) must be used to provide
sufficient security strength. The use of such big numbers
dramatically slows down the whole cipher system, especially
when running on embedded processors.
So far, customized hardware accelerators - developed on
FPGAs or ASICs - were the best choice for accelerating
modular multiplication in embedded environments. On the
other hand, many algorithms have been developed to speed
up such operations. Examples are the Montgomery modular
multiplication and the interleaved modular multiplication
algorithms. Combining customized hardware with
an efficient algorithm is expected to provide a much faster
cipher system.
This paper introduces an enhanced architecture for computing
the modular multiplication of two large numbers X
and Y modulo a given modulus M. The proposed design is
compared with three previous architectures that rely on
carry-save adders and look-up tables, which must
be loaded with a set of pre-computed values. Our proposed
architecture uses the same carry save addition, but replaces
both look up tables and pre-computations with an enhanced
version of sign detection techniques. The proposed architecture
supports higher frequencies than other architectures.
It also has a better overall absolute time for a single operation.
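For reference, the textbook interleaved modular multiplication mentioned above can be sketched in a few lines: the product is built bit by bit from the most significant bit of X, and the partial result is reduced after every step so that it never grows beyond a few multiples of M. This is only the basic algorithm; the carry-save additions and the sign detection technique of the proposed architecture are not modeled, and the function name is an assumption.

```python
def interleaved_modmul(x, y, m):
    """Return (x * y) mod m for 0 <= x, y < m using shift, add and at most
    two conditional subtractions per bit of x."""
    if not (0 <= x < m and 0 <= y < m):
        raise ValueError("operands must satisfy 0 <= x, y < m")
    p = 0
    for i in reversed(range(x.bit_length())):
        p <<= 1          # p < 2m
        if (x >> i) & 1:
            p += y       # p < 3m
        if p >= m:
            p -= m
        if p >= m:
            p -= m       # now 0 <= p < m again
    return p


print(interleaved_modmul(1234, 5678, 9973) == (1234 * 5678) % 9973)
```

The two comparisons against M in every iteration are the costly part in hardware, which is why architectures of this kind look for cheaper ways to decide when a subtraction is needed.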
Abstract: In this paper, an analysis is presented, which
demonstrates the effect pre-logic factoring could have on an
automated combinational logic synthesis process succeeding it. The
impact of pre-logic factoring for some arbitrary combinatorial
circuits synthesized within an FPGA-based logic design environment
has been analyzed previously. This paper explores a similar effect,
but with the non-regenerative logic synthesized using elements of a
commercial standard cell library. On an overall basis, the results
obtained pertaining to the analysis on a variety of MCNC/IWLS
combinational logic benchmark circuits indicate that pre-logic
factoring has the potential to facilitate simultaneous power, delay and
area optimized synthesis solutions in many cases.
Abstract: METIS is the Multi Element Telescope for Imaging
and Spectroscopy, a coronagraph aboard the European Space
Agency's Solar Orbiter mission aimed at the observation of the solar
corona via both VIS and UV/EUV narrow-band imaging and
spectroscopy. METIS, with its multi-wavelength capabilities, will
study in detail the physical processes responsible for the heating of
the corona and the origin and properties of the slow and fast solar wind.
METIS electronics will collect and process scientific data thanks to
its detector proximity electronics, the digital front-end subsystem
electronics and the MPPU, the Main Power and Processing Unit,
hosting a space-qualified processor, memories and some rad-hard
FPGAs acting as digital controllers. This paper reports on the overall
METIS electronics architecture and data processing capabilities
conceived to address all the scientific issues as a trade-off solution
between requirements and allocated resources, just before the
Preliminary Design Review, an ESA milestone, in April 2012.
Abstract: Local Linear Neuro-Fuzzy Models (LLNFM), like other neuro-fuzzy systems, are adaptive networks that provide robust learning capabilities and are widely used in applications such as pattern recognition, system identification, image processing and prediction. The local linear model tree (LOLIMOT) is a type of Takagi-Sugeno-Kang neuro-fuzzy algorithm that has proven its efficiency compared with other neuro-fuzzy networks in learning nonlinear systems and in pattern recognition. In this paper, dedicated reconfigurable and parallel processing hardware for the LOLIMOT algorithm and its applications are presented. This hardware realizes on-chip learning, which gives it the capability to work as a standalone device in a system. The synthesis results on FPGA platforms show its potential to run at least 250 times faster than software implementations of the algorithm.
Abstract: Modern applications realized on FPGAs exhibit high connectivity demands. In this paper we study the routing constraints of Virtex devices and propose a systematic methodology for designing a novel general-purpose interconnection network targeting reconfigurable architectures. This network consists of multiple segment wires and SB patterns, appropriately selected and assigned across the device. The goal of the proposed methodology is to maximize the hardware utilization of the fabricated routing resources. The derived interconnection scheme is integrated in a Virtex-style FPGA. This device is characterized by both high performance and low energy requirements; for this reason, the design criterion that guided our architecture selections was the minimal Energy×Delay Product (EDP). The methodology is fully supported by three new software tools, which belong to the MEANDER Design Framework. Using a typical set of MCNC benchmarks, an extensive comparison study in terms of several critical parameters demonstrates the effectiveness of the derived interconnection network. More specifically, we achieve an average Energy×Delay Product reduction of 63%, a performance increase of 26%, a reduction in leakage power of 21% and a reduction in total energy consumption of 11%, at the expense of a 20% increase in channel width.
Abstract: Streaming applications usually consist of processes that
run in parallel or in series and incrementally transform a stream of
input data. It is a design challenge to break such an application into
distinguishable blocks and then to map them onto independent hardware
processing elements. This requires a generic controller that
automatically maps such a stream of data onto independent processing
elements without any dependencies or manual considerations. In
this paper, a Kahn Process Network (KPN) for such streaming
applications is designed and developed for mapping onto an
MPSoC. It is designed around a generic C-based
compiler that takes the mapping specifications as an input
from the user, automates these design constraints and
automatically generates optimized synthesized RTL code for the
specified application.
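The defining properties of a Kahn Process Network are that processes communicate only through unbounded FIFO channels and that a read blocks until a token is available, which makes the result independent of scheduling. The sketch below models this in software with threads and queues. The three processes (`source`, `scale`, `accumulate`), the `END` marker used to shut the network down and the `run_network` helper are all hypothetical and are unrelated to the compiler or MPSoC mapping described in the paper.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream token


def source(out, values):
    """Emit every input value, then signal the end of the stream."""
    for v in values:
        out.put(v)
    out.put(END)


def scale(inp, out, factor):
    """Multiply every token by a constant factor."""
    while True:
        v = inp.get()  # blocking read, as required by KPN semantics
        if v is END:
            out.put(END)
            return
        out.put(v * factor)


def accumulate(inp, out):
    """Emit the running sum of the tokens seen so far."""
    total = 0
    while True:
        v = inp.get()
        if v is END:
            out.put(END)
            return
        total += v
        out.put(total)


def run_network(values, factor=3):
    """Wire source -> scale -> accumulate and collect the output stream."""
    a, b, c = queue.Queue(), queue.Queue(), queue.Queue()  # unbounded FIFOs
    procs = [
        threading.Thread(target=source, args=(a, values)),
        threading.Thread(target=scale, args=(a, b, factor)),
        threading.Thread(target=accumulate, args=(b, c)),
    ]
    for p in procs:
        p.start()
    results = []
    while True:
        v = c.get()
        if v is END:
            break
        results.append(v)
    for p in procs:
        p.join()
    return results


print(run_network([1, 2, 3, 4]))  # expected: [3, 9, 18, 30]
```

Because each process touches only its own channels, every process could in principle be placed on a separate processing element without changing the output stream.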
Abstract: This article describes the design of an 8-bit asynchronous
microcontroller simulation model in VHDL. The model is created in
the ISE Foundation design tool and simulated in the ModelSim tool. This
model is a simple application example of asynchronous systems
designed in synchronous design tools. The design process of creating
an asynchronous system with a 4-phase bundled-data protocol and with
matching delays is described in the article. The model is described at
the gate level of abstraction.
The simulation waveform of the functional construction is the
result of this article. The described construction covers only the
simulation model. The next step would be to create a synthesizable
model for an FPGA.
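The event order of a 4-phase (return-to-zero) bundled-data handshake can be written down as a purely sequential software model: the sender places data on the bus, raises the request, the receiver latches the data and raises the acknowledge, and both signals then return to zero. The sketch below records that order. It is a behavioral illustration only, with hypothetical names, and it stands in for the matching delay simply by always driving the data before the request.

```python
def four_phase_transfers(values):
    """Send each value over a simulated bundled-data channel and return
    (received, trace), where trace lists the signal events in order."""
    req, ack = 0, 0
    data_bus = None
    received, trace = [], []

    for v in values:
        # The channel must be idle before a new transfer starts.
        assert (req, ack) == (0, 0)

        # Phase 1: the data is made valid first, then the request rises.
        # In hardware the matching delay on the request wire guarantees
        # this ordering at the receiver.
        data_bus = v
        trace.append(("data", v))
        req = 1
        trace.append(("req", 1))

        # Phase 2: the receiver latches the data and acknowledges.
        received.append(data_bus)
        ack = 1
        trace.append(("ack", 1))

        # Phase 3: the sender withdraws the request.
        req = 0
        trace.append(("req", 0))

        # Phase 4: the receiver withdraws the acknowledge.
        ack = 0
        trace.append(("ack", 0))

    return received, trace


received, trace = four_phase_transfers([0x3C, 0xA5])
print(received)
print(trace[:5])
```

Each transfer costs four signal transitions, which is the price paid for the simplicity of level-sensitive control logic compared with a 2-phase protocol.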
Abstract: In this paper, we propose a novel concept of relative
distance measurement using stereo vision technology and discuss
its implementation on an FPGA-based real-time image processor. We
capture two images using two CCD cameras and compare them.
Disparity is calculated for each pixel using a real-time dense disparity
calculation algorithm. This algorithm is based on the concept of an
indexed histogram for matching. Since disparity is inversely
proportional to distance (proved later), we can thus obtain the relative
distances of objects in front of the camera. The output is displayed on
a TV screen in the form of a depth image (optionally using pseudo
colors). The system works in real time at the full PAL frame rate (720
x 576 active pixels @ 25 fps).
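The inverse relation between disparity and distance follows from the standard rectified pinhole stereo model, Z = f * B / d, where f is the focal length in pixels, B the camera baseline and d the disparity in pixels. The sketch below applies that formula and also maps a disparity map to gray levels so that nearer objects appear brighter. The rectified-camera assumption, the function names and the numeric values (800 px focal length, 100 mm baseline) are illustrative and not taken from the paper.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_mm):
    """Distance in millimetres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero means infinitely far)")
    return focal_px * baseline_mm / disparity_px


def disparity_to_gray(disparity_map, max_disparity):
    """Scale disparities to 0..255 so that near objects are bright."""
    return [
        [255 * min(d, max_disparity) // max_disparity for d in row]
        for row in disparity_map
    ]


# Hypothetical camera parameters: 800 px focal length, 100 mm baseline.
near = depth_from_disparity(80, focal_px=800, baseline_mm=100)
far = depth_from_disparity(40, focal_px=800, baseline_mm=100)
print(near, far)  # halving the disparity doubles the distance
print(disparity_to_gray([[0, 40, 80]], max_disparity=80))
```

For a relative depth image the absolute values of f and B are not needed, since they only scale all distances by the same constant.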
Abstract: The work reported in this paper proposes
Swarm-Array computing, a novel technique inspired by swarm
robotics, and built on the foundations of autonomic and parallel
computing. The approach aims to apply autonomic computing
constructs to parallel computing systems and in effect achieve the
self-ware objectives that describe self-managing systems. The
constitution of swarm-array computing comprising four constituents,
namely the computing system, the problem/task, the swarm and the
landscape is considered. Approaches that bind these constituents
together are proposed. Space applications employing FPGAs are
identified as a potential area for applying swarm-array computing for
building reliable systems. The feasibility of a proposed approach is
validated on the SeSAm multi-agent simulator and landscapes are
generated using the MATLAB toolkit.
Abstract: A new and highly efficient architecture for elliptic curve scalar point multiplication which is optimized for a binary field recommended by NIST and is well-suited for elliptic curve cryptographic (ECC) applications is presented. To achieve the maximum architectural and timing improvements we have reorganized and reordered the critical path of the Lopez-Dahab scalar point multiplication architecture such that logic structures are implemented in parallel and operations in the critical path are diverted to noncritical paths. With G=41, the proposed design is capable of performing a field multiplication over the extension field with degree 163 in 11.92 s with the maximum achievable frequency of 251 MHz on Xilinx Virtex-4 (XC4VLX200) while 22% of the chip area is occupied, where G is the digit size of the underlying digit-serial finite field multiplier.