A Processor with Dynamically Reconfigurable Circuit for Floating-Point Arithmetic

This paper describes about dynamic reconfiguration to miniaturize arithmetic circuits in general-purpose processor. Dynamic reconfiguration is a technique to realize required functions by changing hardware construction during operation. The proposed arithmetic circuit performs floating-point arithmetic which is frequently used in science and technology. The data format is floating-point based on IEEE754. The proposed circuit is designed using VHDL, and verified the correct operation by simulations and experiments.

A Message Passing Implementation of a New Parallel Arrangement Algorithm

This paper describes a new algorithm of arrangement in parallel, based on Odd-Even Mergesort, called division and concurrent mixes. The main idea of the algorithm is to achieve that each processor uses a sequential algorithm for ordering a part of the vector, and after that, for making the processors work in pairs in order to mix two of these sections ordered in a greater one, also ordered; after several iterations, the vector will be completely ordered. The paper describes the implementation of the new algorithm on a Message Passing environment (such as MPI). Besides, it compares the obtained experimental results with the quicksort sequential algorithm and with the parallel implementations (also on MPI) of the algorithms quicksort and bitonic sort. The comparison has been realized in an 8 processors cluster under GNU/Linux which is running on a unique PC processor.

A New High Speed Neural Model for Fast Character Recognition Using Cross Correlation and Matrix Decomposition

Neural processors have shown good results for detecting a certain character in a given input matrix. In this paper, a new idead to speed up the operation of neural processors for character detection is presented. Such processors are designed based on cross correlation in the frequency domain between the input matrix and the weights of neural networks. This approach is developed to reduce the computation steps required by these faster neural networks for the searching process. The principle of divide and conquer strategy is applied through image decomposition. Each image is divided into small in size sub-images and then each one is tested separately by using a single faster neural processor. Furthermore, faster character detection is obtained by using parallel processing techniques to test the resulting sub-images at the same time using the same number of faster neural networks. In contrast to using only faster neural processors, the speed up ratio is increased with the size of the input image when using faster neural processors and image decomposition. Moreover, the problem of local subimage normalization in the frequency domain is solved. The effect of image normalization on the speed up ratio of character detection is discussed. Simulation results show that local subimage normalization through weight normalization is faster than subimage normalization in the spatial domain. The overall speed up ratio of the detection process is increased as the normalization of weights is done off line.

An Embedded System for Artificial Intelligence Applications

Conventional approaches in the implementation of logic programming applications on embedded systems are solely of software nature. As a consequence, a compiler is needed that transforms the initial declarative logic program to its equivalent procedural one, to be programmed to the microprocessor. This approach increases the complexity of the final implementation and reduces the overall system's performance. On the contrary, presenting hardware implementations which are only capable of supporting logic programs prevents their use in applications where logic programs need to be intertwined with traditional procedural ones, for a specific application. We exploit HW/SW codesign methods to present a microprocessor, capable of supporting hybrid applications using both programming approaches. We take advantage of the close relationship between attribute grammar (AG) evaluation and knowledge engineering methods to present a programmable hardware parser that performs logic derivations and combine it with an extension of a conventional RISC microprocessor that performs the unification process to report the success or failure of those derivations. The extended RISC microprocessor is still capable of executing conventional procedural programs, thus hybrid applications can be implemented. The presented implementation is programmable, supports the execution of hybrid applications, increases the performance of logic derivations (experimental analysis yields an approximate 1000% increase in performance) and reduces the complexity of the final implemented code. The proposed hardware design is supported by a proposed extended C-language called C-AG.

Intelligent Home: SMS Based Home Security System with Immediate Feedback

A low cost Short Message System (SMS) based Home security system equipped with motion, smoke, temperature, humidity and light sensors has been studied and tested. The sensors are controlled by a microprocessor PIC 18F4520 through the SMS having password protection code for the secure operation. The user is able to switch light and the appliances and get instant feedback. Also in cases of emergencies such as fire or robbery the system will send alert message to occupant and relevant civil authorities. The operation of the home security has been tested on Vodafone- Fiji network and Digicel Fiji Network for emergency and feedback responses for 25 samples. The experiment showed that it takes about 8-10s for the security system to respond in case of emergency. It takes about 18-22s for the occupant to switch and monitor lights and appliances and then get feedback depending upon the network traffic.

Scale Time Offset Robust Modulation (STORM) in a Code Division Multiaccess Environment

Scale Time Offset Robust Modulation (STORM) [1]– [3] is a high bandwidth waveform design that adds time-scale to embedded reference modulations using only time-delay [4]. In an environment where each user has a specific delay and scale, identification of the user with the highest signal power and that user-s phase is facilitated by the STORM processor. Both of these parameters are required in an efficient multiuser detection algorithm. In this paper, the STORM modulation approach is evaluated with a direct sequence spread quadrature phase shift keying (DS-QPSK) system. A misconception of the STORM time scale modulation is that a fine temporal resolution is required at the receiver. STORM will be applied to a QPSK code division multiaccess (CDMA) system by modifying the spreading codes. Specifically, the in-phase code will use a typical spreading code, and the quadrature code will use a time-delayed and time-scaled version of the in-phase code. Subsequently, the same temporal resolution in the receiver is required before and after the application of STORM. In this paper, the bit error performance of STORM in a synchronous CDMA system is evaluated and compared to theory, and the bit error performance of STORM incorporated in a single user WCDMA downlink is presented to demonstrate the applicability of STORM in a modern communication system.

Enhancing Cache Performance Based on Improved Average Access Time

A high performance computer includes a fast processor and millions bytes of memory. During the data processing, huge amount of information are shuffled between the memory and processor. Because of its small size and its effectiveness speed, cache has become a common feature of high performance computers. Enhancing cache performance proved to be essential in the speed up of cache-based computers. Most enhancement approaches can be classified as either software based or hardware controlled. The performance of the cache is quantified in terms of hit ratio or miss ratio. In this paper, we are optimizing the cache performance based on enhancing the cache hit ratio. The optimum cache performance is obtained by focusing on the cache hardware modification in the way to make a quick rejection to the missed line's tags from the hit-or miss comparison stage, and thus a low hit time for the wanted line in the cache is achieved. In the proposed technique which we called Even- Odd Tabulation (EOT), the cache lines come from the main memory into cache are classified in two types; even line's tags and odd line's tags depending on their Least Significant Bit (LSB). This division is exploited by EOT technique to reject the miss match line's tags in very low time compared to the time spent by the main comparator in the cache, giving an optimum hitting time for the wanted cache line. The high performance of EOT technique against the familiar mapping technique FAM is shown in the simulated results.

Design and Control of DC-DC Converter for the Military Application Fuel Cell

This paper presents a 24 watts SEPIC converter design and control using microprocessor. SEPIC converter has advantages of a wide input range and miniaturization caused by the low stress at elements. There is also an advantage that the input and output are isolated in MOSFET-off state. This paper presents the PID control through the SEPIC converter transfer function using a DSP and the protective circuit for fuel cell from the over-current and inverse-voltage by using the characteristic of SEPIC converter. Then it derives them through the experiments.

Numerical Study of Airfoils Aerodynamic Performance in Heavy Rain Environment

Heavy rainfall greatly affects the aerodynamic performance of the aircraft. There are many accidents of aircraft caused by aerodynamic efficiency degradation by heavy rain. In this Paper we have studied the heavy rain effects on the aerodynamic efficiency of cambered NACA 64-210 and symmetric NACA 0012 airfoils. Our results show significant increase in drag and decrease in lift. We used preprocessing software gridgen for creation of geometry and mesh, used fluent as solver and techplot as postprocessor. Discrete phase modeling called DPM is used to model the rain particles using two phase flow approach. The rain particles are assumed to be inert. Both airfoils showed significant decrease in lift and increase in drag in simulated rain environment. The most significant difference between these two airfoils was the NACA 64-210 more sensitivity than NACA 0012 to liquid water content (LWC). We believe that the results showed in this paper will be useful for the designer of the commercial aircrafts and UAVs, and will be helpful for training of the pilots to control the airplanes in heavy rain.

3D Network-on-Chip with on-Chip DRAM: An Empirical Analysis for Future Chip Multiprocessor

With the increasing number of on-chip components and the critical requirement for processing power, Chip Multiprocessor (CMP) has gained wide acceptance in both academia and industry during the last decade. However, the conventional bus-based onchip communication schemes suffer from very high communication delay and low scalability in large scale systems. Network-on-Chip (NoC) has been proposed to solve the bottleneck of parallel onchip communications by applying different network topologies which separate the communication phase from the computation phase. Observing that the memory bandwidth of the communication between on-chip components and off-chip memory has become a critical problem even in NoC based systems, in this paper, we propose a novel 3D NoC with on-chip Dynamic Random Access Memory (DRAM) in which different layers are dedicated to different functionalities such as processors, cache or memory. Results show that, by using our proposed architecture, average link utilization has reduced by 10.25% for SPLASH-2 workloads. Our proposed design costs 1.12% less execution cycles than the traditional design on average.

A Hyper-Domain Image Watermarking Method based on Macro Edge Block and Wavelet Transform for Digital Signal Processor

In order to protect original data, watermarking is first consideration direction for digital information copyright. In addition, to achieve high quality image, the algorithm maybe can not run on embedded system because the computation is very complexity. However, almost nowadays algorithms need to build on consumer production because integrator circuit has a huge progress and cheap price. In this paper, we propose a novel algorithm which efficient inserts watermarking on digital image and very easy to implement on digital signal processor. In further, we select a general and cheap digital signal processor which is made by analog device company to fit consumer application. The experimental results show that the image quality by watermarking insertion can achieve 46 dB can be accepted in human vision and can real-time execute on digital signal processor.

A Simplified Adaptive Decision Feedback Equalization Technique for π/4-DQPSK Signals

We present a simplified equalization technique for a π/4 differential quadrature phase shift keying ( π/4 -DQPSK) modulated signal in a multipath fading environment. The proposed equalizer is realized as a fractionally spaced adaptive decision feedback equalizer (FS-ADFE), employing exponential step-size least mean square (LMS) algorithm as the adaptation technique. The main advantage of the scheme stems from the usage of exponential step-size LMS algorithm in the equalizer, which achieves similar convergence behavior as that of a recursive least squares (RLS) algorithm with significantly reduced computational complexity. To investigate the finite-precision performance of the proposed equalizer along with the π/4 -DQPSK modem, the entire system is evaluated on a 16-bit fixed point digital signal processor (DSP) environment. The proposed scheme is found to be attractive even for those cases where equalization is to be performed within a restricted number of training samples.

FPGA based Relative Distance Measurement using Stereo Vision Technology

In this paper, we propose a novel concept of relative distance measurement using Stereo Vision Technology and discuss its implementation on a FPGA based real-time image processor. We capture two images using two CCD cameras and compare them. Disparity is calculated for each pixel using a real time dense disparity calculation algorithm. This algorithm is based on the concept of indexed histogram for matching. Disparity being inversely proportional to distance (Proved Later), we can thus get the relative distances of objects in front of the camera. The output is displayed on a TV screen in the form of a depth image (optionally using pseudo colors). This system works in real time on a full PAL frame rate (720 x 576 active pixels @ 25 fps).

Performance Enhancement of Motion Estimation Using SSE2 Technology

Motion estimation is the most computationally intensive part in video processing. Many fast motion estimation algorithms have been proposed to decrease the computational complexity by reducing the number of candidate motion vectors. However, these studies are for fast search algorithms themselves while almost image and video compressions are operated with software based. Therefore, the timing constraints for running these motion estimation algorithms not only challenge for the video codec but also overwhelm for some of processors. In this paper, the performance of motion estimation is enhanced by using Intel's Streaming SIMD Extension 2 (SSE2) technology with Intel Pentium 4 processor.