Performance Evaluation of a Neural Network based General Purpose Space Vector Modulator

Space Vector Modulation (SVM) is an optimum Pulse Width Modulation (PWM) technique for an inverter used in a variable frequency drive applications. It is computationally rigorous and hence limits the inverter switching frequency. Increase in switching frequency can be achieved using Neural Network (NN) based SVM, implemented on application specific chips. This paper proposes a neural network based SVM technique for a Voltage Source Inverter (VSI). The network proposed is independent of switching frequency. Different architectures are investigated keeping the total number of neurons constant. The performance of the inverter is compared for various switching frequencies for different architectures of NN based SVM. From the results obtained, the network with minimum resource and appropriate word length is identified. The bit precision required for this application is identified. The network with 8-bit precision is implemented in the IC XCV 400 and the results are presented. The performance of NN based general purpose SVM with higher bit precision is discussed.

Enhanced Traveling Salesman Problem Solving by Genetic Algorithm Technique (TSPGA)

The well known NP-complete problem of the Traveling Salesman Problem (TSP) is coded in genetic form. A software system is proposed to determine the optimum route for a Traveling Salesman Problem using Genetic Algorithm technique. The system starts from a matrix of the calculated Euclidean distances between the cities to be visited by the traveling salesman and a randomly chosen city order as the initial population. Then new generations are then created repeatedly until the proper path is reached upon reaching a stopping criterion. This search is guided by a solution evaluation function.

Compact Binary Tree Representation of Logic Function with Enhanced Throughput

An effective approach for realizing the binary tree structure, representing a combinational logic functionality with enhanced throughput, is discussed in this paper. The optimization in maximum operating frequency was achieved through delay minimization, which in turn was possible by means of reducing the depth of the binary network. The proposed synthesis methodology has been validated by experimentation with FPGA as the target technology. Though our proposal is technology independent, yet the heuristic enables better optimization in throughput even after technology mapping for such Boolean functionality; whose reduced CNF form is associated with a lesser literal cost than its reduced DNF form at the Boolean equation level. For cases otherwise, our method converges to similar results as that of [12]. The practical results obtained for a variety of case studies demonstrate an improvement in the maximum throughput rate for Spartan IIE (XC2S50E-7FT256) and Spartan 3 (XC3S50-4PQ144) FPGA logic families by 10.49% and 13.68% respectively. With respect to the LUTs and IOBUFs required for physical implementation of the requisite non-regenerative logic functionality, the proposed method enabled savings to the tune of 44.35% and 44.67% respectively, over the existing efficient method available in literature [12].

VLSI Design of 2-D Discrete Wavelet Transform for Area-Efficient and High-Speed Image Computing

This paper presents a VLSI design approach of a highspeed and real-time 2-D Discrete Wavelet Transform computing. The proposed architecture, based on new and fast convolution approach, reduces the hardware complexity in addition to reduce the critical path to the multiplier delay. Furthermore, an advanced twodimensional (2-D) discrete wavelet transform (DWT) implementation, with an efficient memory area, is designed to produce one output in every clock cycle. As a result, a very highspeed is attained. The system is verified, using JPEG2000 coefficients filters, on Xilinx Virtex-II Field Programmable Gate Array (FPGA) device without accessing any external memory. The resulting computing rate is up to 270 M samples/s and the (9,7) 2-D wavelet filter uses only 18 kb of memory (16 kb of first-in-first-out memory) with 256×256 image size. In this way, the developed design requests reduced memory and provide very high-speed processing as well as high PSNR quality.

Energy-Efficient Sensing Concept for a Micromachined Yaw Rate Sensor

The need for micromechanical inertial sensors is increasing in future electronic stability control (ESC) and other positioning, navigation and guidance systems. Due to the rising density of sensors in automotive and consumer devices the goal is not only to get high performance, robustness and smaller package sizes, but also to optimize the energy management of the overall sensor system. This paper presents an evaluation concept for a surface micromachined yaw rate sensor. Within this evaluation concept an energy-efficient operation of the drive mode of the yaw rate sensor is enabled. The presented system concept can be realized within a power management subsystem.

A Pipelined FSBM Hardware Architecture for HTDV-H.26x

In MPEG and H.26x standards, to eliminate the temporal redundancy we use motion estimation. Given that the motion estimation stage is very complex in terms of computational effort, a hardware implementation on a re-configurable circuit is crucial for the requirements of different real time multimedia applications. In this paper, we present hardware architecture for motion estimation based on "Full Search Block Matching" (FSBM) algorithm. This architecture presents minimum latency, maximum throughput, full utilization of hardware resources such as embedded memory blocks, and combining both pipelining and parallel processing techniques. Our design is described in VHDL language, verified by simulation and implemented in a Stratix II EP2S130F1020C4 FPGA circuit. The experiment result show that the optimum operating clock frequency of the proposed design is 89MHz which achieves 160M pixels/sec.

Real-Time Digital Oscilloscope Implementation in 90nm CMOS Technology FPGA

This paper describes the design of a real-time audiorange digital oscilloscope and its implementation in 90nm CMOS FPGA platform. The design consists of sample and hold circuits, A/D conversion, audio and video processing, on-chip RAM, clock generation and control logic. The design of internal blocks and modules in 90nm devices in an FPGA is elaborated. Also the key features and their implementation algorithms are presented. Finally, the timing waveforms and simulation results are put forward.

64 bit Computer Architectures for Space Applications – A study

The more recent satellite projects/programs makes extensive usage of real – time embedded systems. 16 bit processors which meet the Mil-Std-1750 standard architecture have been used in on-board systems. Most of the Space Applications have been written in ADA. From a futuristic point of view, 32 bit/ 64 bit processors are needed in the area of spacecraft computing and therefore an effort is desirable in the study and survey of 64 bit architectures for space applications. This will also result in significant technology development in terms of VLSI and software tools for ADA (as the legacy code is in ADA). There are several basic requirements for a special processor for this purpose. They include Radiation Hardened (RadHard) devices, very low power dissipation, compatibility with existing operational systems, scalable architectures for higher computational needs, reliability, higher memory and I/O bandwidth, predictability, realtime operating system and manufacturability of such processors. Further on, these may include selection of FPGA devices, selection of EDA tool chains, design flow, partitioning of the design, pin count, performance evaluation, timing analysis etc. This project deals with a brief study of 32 and 64 bit processors readily available in the market and designing/ fabricating a 64 bit RISC processor named RISC MicroProcessor with added functionalities of an extended double precision floating point unit and a 32 bit signal processing unit acting as co-processors. In this paper, we emphasize the ease and importance of using Open Core (OpenSparc T1 Verilog RTL) and Open “Source" EDA tools such as Icarus to develop FPGA based prototypes quickly. Commercial tools such as Xilinx ISE for Synthesis are also used when appropriate.

Design of a Neural Networks Classifier for Face Detection

Face detection and recognition has many applications in a variety of fields such as security system, videoconferencing and identification. Face classification is currently implemented in software. A hardware implementation allows real-time processing, but has higher cost and time to-market. The objective of this work is to implement a classifier based on neural networks MLP (Multi-layer Perceptron) for face detection. The MLP is used to classify face and non-face patterns. The systm is described using C language on a P4 (2.4 Ghz) to extract weight values. Then a Hardware implementation is achieved using VHDL based Methodology. We target Xilinx FPGA as the implementation support.

Nuclear Medical Image Treatment System Based On FPGA in Real Time

We present in this paper an acquisition and treatment system designed for semi-analog Gamma-camera. It consists of a nuclear medical Image Acquisition, Treatment and Display chain(IATD) ensuring the acquisition, the treatment of the signals(resulting from the Gamma-camera detection head) and the scintigraphic image construction in real time. This chain is composed by an analog treatment board and a digital treatment board. We describe the designed systems and the digital treatment algorithms in which we have improved the performance and the flexibility. The digital treatment algorithms are implemented in a specific reprogrammable circuit FPGA (Field Programmable Gate Array).interface for semi-analog cameras of Sopha Medical Vision(SMVi) by taking as example SOPHY DS7. The developed system consists of an Image Acquisition, Treatment and Display (IATD) ensuring the acquisition and the treatment of the signals resulting from the DH. The developed chain is formed by a treatment analog board and a digital treatment board designed around a DSP [2]. In this paper we have presented the architecture of a new version of our chain IATD in which the integration of the treatment algorithms is executed on an FPGA (Field Programmable Gate Array)

Evaluation of Horizontal Seismic Hazard of Naghan, Iran

This paper presents probabilistic horizontal seismic hazard assessment of Naghan, Iran. It displays the probabilistic estimate of Peak Ground Horizontal Acceleration (PGHA) for the return period of 475, 950 and 2475 years. The output of the probabilistic seismic hazard analysis is based on peak ground acceleration (PGA), which is the most common criterion in designing of buildings. A catalogue of seismic events that includes both historical and instrumental events was developed and covers the period from 840 to 2009. The seismic sources that affect the hazard in Naghan were identified within the radius of 200 km and the recurrence relationships of these sources were generated by Kijko and Sellevoll. Finally Peak Ground Horizontal Acceleration (PGHA) has been prepared to indicate the earthquake hazard of Naghan for different hazard levels by using SEISRISK III software.

Digital Filter for Cochlear Implant Implemented on a Field- Programmable Gate Array

The advent of multi-million gate Field Programmable Gate Arrays (FPGAs) with hardware support for multiplication opens an opportunity to recreate a significant portion of the front end of a human cochlea using this technology. In this paper we describe the implementation of the cochlear filter and show that it is entirely suited to a single device XC3S500 FPGA implementation .The filter gave a good fit to real time data with efficiency of hardware usage.

A Microcontroller Implementation of Constrained Model Predictive Control

Model Predictive Control (MPC) is an established control technique in a wide range of process industries. The reason for this success is its ability to handle multivariable systems and systems having input, output or state constraints. Neverthless comparing to PID controller, the implementation of the MPC in miniaturized devices like Field Programmable Gate Arrays (FPGA) and microcontrollers has historically been very small scale due to its complexity in implementation and its computation time requirement. At the same time, such embedded technologies have become an enabler for future manufacturing enterprisers as well as a transformer of organizations and markets. In this work, we take advantage of these recent advances in this area in the deployment of one of the most studied and applied control technique in the industrial engineering. In this paper, we propose an efficient firmware for the implementation of constrained MPC in the performed STM32 microcontroller using interior point method. Indeed, performances study shows good execution speed and low computational burden. These results encourage to develop predictive control algorithms to be programmed in industrial standard processes. The PID anti windup controller was also implemented in the STM32 in order to make a performance comparison with the MPC. The main features of the proposed constrained MPC framework are illustrated through two examples.

An FPGA Implementation of Intelligent Visual Based Fall Detection

Falling has been one of the major concerns and threats to the independence of the elderly in their daily lives. With the worldwide significant growth of the aging population, it is essential to have a promising solution of fall detection which is able to operate at high accuracy in real-time and supports large scale implementation using multiple cameras. Field Programmable Gate Array (FPGA) is a highly promising tool to be used as a hardware accelerator in many emerging embedded vision based system. Thus, it is the main objective of this paper to present an FPGA-based solution of visual based fall detection to meet stringent real-time requirements with high accuracy. The hardware architecture of visual based fall detection which utilizes the pixel locality to reduce memory accesses is proposed. By exploiting the parallel and pipeline architecture of FPGA, our hardware implementation of visual based fall detection using FGPA is able to achieve a performance of 60fps for a series of video analytical functions at VGA resolutions (640x480). The results of this work show that FPGA has great potentials and impacts in enabling large scale vision system in the future healthcare industry due to its flexibility and scalability.

Modified Montgomery for RSA Cryptosystem

Encryption and decryption in RSA are done by modular exponentiation which is achieved by repeated modular multiplication. Hence efficiency of modular multiplication directly determines the efficiency of RSA cryptosystem. This paper designs a Modified Montgomery Modular Multiplication in which addition of operands is computed by 4:2 compressor. The basic logic operations in addition are partitioned over two iterations such that parallel computations are performed. This reduces the critical path delay of proposed Montgomery design. The proposed design and RSA are implemented on Virtex 2 and Virtex 5 FPGAs. The two factors partitioning and parallelism have improved the frequency and throughput of proposed design.

Design of a CMOS Highly Linear Front-end IC with Auto Gain Controller for a Magnetic Field Transceiver

This paper describes a low-voltage and low-power channel selection analog front end with continuous-time low pass filters and highly linear programmable gain amplifier (PGA). The filters were realized as balanced Gm-C biquadratic filters to achieve a low current consumption. High linearity and a constant wide bandwidth are achieved by using a new transconductance (Gm) cell. The PGA has a voltage gain varying from 0 to 65dB, while maintaining a constant bandwidth. A filter tuning circuit that requires an accurate time base but no external components is presented. With a 1-Vrms differential input and output, the filter achieves -85dB THD and a 78dB signal-to-noise ratio. Both the filter and PGA were implemented in a 0.18um 1P6M n-well CMOS process. They consume 3.2mW from a 1.8V power supply and occupy an area of 0.19mm2.

An Efficient Architecture for Interleaved Modular Multiplication

Modular multiplication is the basic operation in most public key cryptosystems, such as RSA, DSA, ECC, and DH key exchange. Unfortunately, very large operands (in order of 1024 or 2048 bits) must be used to provide sufficient security strength. The use of such big numbers dramatically slows down the whole cipher system, especially when running on embedded processors. So far, customized hardware accelerators - developed on FPGAs or ASICs - were the best choice for accelerating modular multiplication in embedded environments. On the other hand, many algorithms have been developed to speed up such operations. Examples are the Montgomery modular multiplication and the interleaved modular multiplication algorithms. Combining both customized hardware with an efficient algorithm is expected to provide a much faster cipher system. This paper introduces an enhanced architecture for computing the modular multiplication of two large numbers X and Y modulo a given modulus M. The proposed design is compared with three previous architectures depending on carry save adders and look up tables. Look up tables should be loaded with a set of pre-computed values. Our proposed architecture uses the same carry save addition, but replaces both look up tables and pre-computations with an enhanced version of sign detection techniques. The proposed architecture supports higher frequencies than other architectures. It also has a better overall absolute time for a single operation.

Analysis of Effect of Pre-Logic Factoring on Cell Based Combinatorial Logic Synthesis

In this paper, an analysis is presented, which demonstrates the effect pre-logic factoring could have on an automated combinational logic synthesis process succeeding it. The impact of pre-logic factoring for some arbitrary combinatorial circuits synthesized within a FPGA based logic design environment has been analyzed previously. This paper explores a similar effect, but with the non-regenerative logic synthesized using elements of a commercial standard cell library. On an overall basis, the results obtained pertaining to the analysis on a variety of MCNC/IWLS combinational logic benchmark circuits indicate that pre-logic factoring has the potential to facilitate simultaneous power, delay and area optimized synthesis solutions in many cases.

The Data Processing Electronics of the METIS Coronagraph aboard the ESA Solar Orbiter Mission

METIS is the Multi Element Telescope for Imaging and Spectroscopy, a Coronagraph aboard the European Space Agency-s Solar Orbiter Mission aimed at the observation of the solar corona via both VIS and UV/EUV narrow-band imaging and spectroscopy. METIS, with its multi-wavelength capabilities, will study in detail the physical processes responsible for the corona heating and the origin and properties of the slow and fast solar wind. METIS electronics will collect and process scientific data thanks to its detectors proximity electronics, the digital front-end subsystem electronics and the MPPU, the Main Power and Processing Unit, hosting a space-qualified processor, memories and some rad-hard FPGAs acting as digital controllers.This paper reports on the overall METIS electronics architecture and data processing capabilities conceived to address all the scientific issues as a trade-off solution between requirements and allocated resources, just before the Preliminary Design Review as an ESA milestone in April 2012.

Local Linear Model Tree (LOLIMOT) Reconfigurable Parallel Hardware

Local Linear Neuro-Fuzzy Models (LLNFM) like other neuro- fuzzy systems are adaptive networks and provide robust learning capabilities and are widely utilized in various applications such as pattern recognition, system identification, image processing and prediction. Local linear model tree (LOLIMOT) is a type of Takagi-Sugeno-Kang neuro fuzzy algorithm which has proven its efficiency compared with other neuro fuzzy networks in learning the nonlinear systems and pattern recognition. In this paper, a dedicated reconfigurable and parallel processing hardware for LOLIMOT algorithm and its applications are presented. This hardware realizes on-chip learning which gives it the capability to work as a standalone device in a system. The synthesis results on FPGA platforms show its potential to improve the speed at least 250 of times faster than software implemented algorithms.