A Novel Implementation of Application Specific Instruction-set Processor (ASIP) using Verilog

General-purpose processors used in embedded systems must meet constraints such as execution time, power consumption, and code size. An Application Specific Instruction-set Processor (ASIP), on the other hand, offers advantages in power consumption, performance, and flexibility. In this paper, a 16-bit ASIP for sensor data transfer is proposed. The processor architecture consists of on-chip transmitter and receiver modules along with processing and control units, enabling data transmission and reception on a single die. Data transfer is accomplished with fewer instructions than on a general-purpose processor. The ASIP core operates at a maximum clock frequency of 1.132 GHz with a delay of 0.883 ns and consumes 569.63 mW at an operating voltage of 1.2 V. The ASIP is implemented in Verilog HDL using the Xilinx platform on Virtex-4.

Interfacing C and TMS320C6713 Assembly Language (Part-I)

This paper describes the interfacing of C with TMS320C6713 assembly language, which is crucially important for many real-time applications. For comparison, the interfacing of C with the assembly language of a conventional microprocessor, the MC68000, is also presented. Note that the way the C compiler passes arguments between functions in the TMS320C6713-based environment is entirely different from the way it does so for a conventional microprocessor such as the MC68000. It is therefore very important for a user of a TMS320C6713-based system to understand and follow the register conventions when interfacing C with a TMS320C6713 assembly language subroutine. Note also that in some cases (examples 6-9) the endian mode of the board must be taken into consideration. In this paper, one method is presented in great detail. Other methods will be presented in the future.

A Processor with Dynamically Reconfigurable Circuit for Floating-Point Arithmetic

This paper describes the use of dynamic reconfiguration to miniaturize the arithmetic circuits of a general-purpose processor. Dynamic reconfiguration is a technique that realizes the required functions by changing the hardware configuration during operation. The proposed arithmetic circuit performs floating-point arithmetic, which is frequently used in science and technology. The data format is floating point based on IEEE 754. The proposed circuit is designed in VHDL, and its correct operation is verified by simulations and experiments.

The Decentralized Nonlinear Controller of Robot Manipulator with External Load Compensation

This paper describes a newly designed decentralized nonlinear control strategy for a robot manipulator. The strategy is developed from nonlinear state feedback theory and the decentralized concept to address drawbacks of previous work, namely complicated intelligent control and sensors of low cost-effectiveness. The control methodology is derived in the sense of the Lyapunov theorem, so the stability of the control system is guaranteed. The decentralized algorithm does not require angle or velocity information from other joints. Each joint controller is implemented on a digital processor located near its actuator, which makes good dynamics and modularity achievable. Computer simulations have been conducted to validate the effectiveness of the proposed control scheme under possible uncertainties and different reference trajectories. The merit of the proposed control system is shown by comparison with a classical control system.

A Message Passing Implementation of a New Parallel Arrangement Algorithm

This paper describes a new parallel sorting (arrangement) algorithm, based on Odd-Even Mergesort, called division and concurrent mixes. The main idea is that each processor first uses a sequential algorithm to sort one part of the vector; the processors then work in pairs, each pair merging two sorted sections into a larger sorted one. After several iterations, the vector is completely sorted. The paper describes the implementation of the new algorithm in a message-passing environment (such as MPI). It also compares the experimental results with the sequential quicksort algorithm and with parallel implementations (also on MPI) of quicksort and bitonic sort. The comparison was carried out on an 8-processor cluster under GNU/Linux running on a single PC processor.
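
The local-sort-then-pairwise-merge idea can be sketched on a single machine as follows. This is a minimal illustration, not the paper's implementation: it uses no MPI, the function names `split_evenly` and `division_and_merge_sort` are hypothetical, and the pairing pattern (adjacent sections merged in a tree) is an assumption, since the abstract does not specify which processors pair up in each iteration.

```python
from heapq import merge


def split_evenly(data, parts):
    """Split data into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for rank in range(parts):
        size = base + (1 if rank < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks


def division_and_merge_sort(data, processors=8):
    """Simulate the scheme: each 'processor' sorts locally, then pairs merge."""
    # Phase 1: every simulated processor sorts its own chunk sequentially.
    sections = [sorted(chunk) for chunk in split_evenly(list(data), processors)]
    # Phase 2 (assumed pairing): neighbours merge in pairs until one section remains.
    while len(sections) > 1:
        merged = []
        for i in range(0, len(sections), 2):
            if i + 1 < len(sections):
                merged.append(list(merge(sections[i], sections[i + 1])))
            else:
                merged.append(sections[i])  # odd section waits for the next round
        sections = merged
    return sections[0]
```

With 8 simulated processors the merge phase takes three rounds (8 sections, then 4, then 2, then 1). In a real message-passing version each merge would require one processor of the pair to send its section to the other.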

Feature-Based Machining using Macro

This paper presents ongoing research on the implementation of feature-based machining via macro programming. Repetitive machining features such as holes, slots, and pockets can readily be encapsulated in macros. Each macro contains the methods for machining the shape defined by the feature. The macro programming technique comprises a main program and subprograms. The main program allows the user to select several subprograms that contain features and to define their important parameters. With macros, complex machining routines can be implemented easily and no post processor is required. A case study was carried out on machining a part comprising planar face, hole, and pocket features using the macro programming technique. It is envisaged that the technique can be extended to other feature-based machining fields such as the newly developed STEP-NC domain.

Solar Thermal Aquaculture System Controller Based on Artificial Neural Network

Temperature is one of the principal factors affecting an aquaculture system. It can cause stress and mortality, or it can provide a superior environment for growth and reproduction. This paper presents the control of pond water temperature using an artificial intelligence technique. Water temperature is a very important parameter for shrimp growth: the required temperature for optimal growth is 34 °C, and if the temperature rises to 38 °C the shrimp die, so it is important to control it. A solar thermal water heating system is designed to supply an aquaculture pond in Mersa Matruh, Egypt, with the required hot water. Neural networks are massively parallel processors that can learn patterns through training. Because of this feature, they are often well suited to modeling complex, nonlinear processes such as those commonly found in heating systems. An artificial neural network is proposed to control the water temperature because artificial intelligence (AI) techniques are becoming useful alternatives to conventional techniques and have been used to solve complicated practical problems. Moreover, this paper introduces a complete mathematical model and a MATLAB Simulink model of the aquaculture system. The simulation results indicate that the control unit succeeds in keeping the water temperature constant at the desired value by controlling the hot water flow rate.

The Splitting Upwind Schemes for Spectral Action Balance Equation

The spectral action balance equation is used to simulate short-crested wind-generated waves in shallow water areas such as coastal regions and inland waters. The equation involves two spatial dimensions, wave direction, and wave frequency, and can be solved by the finite difference method. When this equation, with its dominating convection term, is discretized using central differences, stability problems occur if the grid spacing is chosen too coarse. In this paper, we introduce splitting upwind schemes to avoid these stability problems and prove that they are consistent with the upwind scheme with the same accuracy. The splitting upwind schemes split the wave spectral action balance equation into four one-dimensional problems, each of which yields independent tridiagonal linear systems. These smaller systems can be solved simultaneously by direct or iterative methods, which is very fast on a multi-processor computer.
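
The idea of splitting a multidimensional upwind update into independent one-dimensional problems can be illustrated with a small sketch. It rests on several simplifying assumptions that are not from the paper: it uses two dimensions instead of four, an explicit first-order upwind update instead of the tridiagonal systems mentioned above, constant Courant numbers, and periodic boundaries. The function names are hypothetical.

```python
def upwind_sweep_1d(values, courant):
    """One explicit first-order upwind step on a periodic 1D line.

    courant = c * dt / dx; its sign selects which neighbour is upwind.
    """
    n = len(values)
    if courant >= 0:
        # Flow to the right: difference against the left neighbour.
        # Index -1 wraps to the last cell, giving the periodic boundary.
        return [values[i] - courant * (values[i] - values[i - 1]) for i in range(n)]
    # Flow to the left: difference against the right neighbour.
    return [values[i] - courant * (values[(i + 1) % n] - values[i]) for i in range(n)]


def split_step_2d(grid, courant_x, courant_y):
    """Advance a 2D field by one step as two sets of independent 1D sweeps."""
    # Sweep along x: every row is its own 1D problem.
    after_x = [upwind_sweep_1d(row, courant_x) for row in grid]
    # Sweep along y: every column is its own 1D problem.
    columns = [upwind_sweep_1d(list(col), courant_y) for col in zip(*after_x)]
    # Transpose back to row-major layout.
    return [list(row) for row in zip(*columns)]
```

Because the rows (and then the columns) do not depend on one another within a sweep, each list comprehension could be distributed over several processors, which is the property the abstract exploits.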

Sub-Image Detection Using Fast Neural Processors and Image Decomposition

In this paper, an approach to reduce the computation steps required by fast neural networks for the searching process is presented. The divide and conquer principle is applied through image decomposition. Each image is divided into small sub-images, and each one is then tested separately using a fast neural network. Fast neural networks operate by applying cross correlation in the frequency domain between the input image and the weights of the hidden neurons. Compared to conventional and fast neural networks, experimental results show that a speed-up ratio is achieved when applying this technique to locate human faces automatically in cluttered scenes. Furthermore, faster face detection is obtained by using parallel processing techniques to test the resulting sub-images at the same time using the same number of fast neural networks. In contrast to using only fast neural networks, the speed-up ratio increases with the size of the input image when using fast neural networks and image decomposition.

Design of Multi-disease Diagnosis Processor using Hypernetworks Technique

In this paper, we propose a disease diagnosis hardware architecture using the Hypernetworks technique. It can be used to diagnose three different diseases (SPECT Heart, Leukemia, Prostate cancer). Generally, different diseases each require a dedicated diagnosis hardware model. Exploiting the similarities among the diagnosis processors for the three diseases, we design a single diagnosis processor that can diagnose all three. The proposed architecture, which combines three processors into one, reduces hardware size without decreasing accuracy.

No-Reference Image Quality Assessment using Blur and Noise

Image quality assessment traditionally needs the original image as a reference. Conventional assessment methods such as Mean Square Error (MSE) or Peak Signal to Noise Ratio (PSNR) cannot be applied when there is no reference. In this paper, we present a new No-Reference (NR) assessment of image quality using blur and noise. Recent camera applications provide high-quality images with the help of a digital Image Signal Processor (ISP). Since images taken by high-performance digital cameras have few blocking and ringing artifacts, we focus only on blur and noise for predicting the objective image quality. The experimental results show that the proposed assessment method correlates highly with the subjective Difference Mean Opinion Score (DMOS). Furthermore, the proposed method has a very low computational load in the spatial domain and extracts characteristics similar to those of human perceptual assessment.
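
To see why the conventional metrics depend on a reference, the standard definitions of MSE and PSNR can be written out directly. This is a minimal sketch of those two full-reference metrics only, not of the proposed no-reference method; images are assumed to be equally sized lists of rows of grayscale values, and the default peak of 255 assumes 8-bit pixels.

```python
import math


def mean_square_error(reference, test):
    """Average squared pixel difference between two equally sized images."""
    total, count = 0, 0
    for ref_row, test_row in zip(reference, test):
        for ref_px, test_px in zip(ref_row, test_row):
            total += (ref_px - test_px) ** 2
            count += 1
    return total / count


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = mean_square_error(reference, test)
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10 * math.log10(peak ** 2 / mse)
```

Both functions take the reference image as an argument, so neither can be evaluated when only the degraded image is available.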

Evaluation of Sensitometric Properties of Radiographic Films at Different Processing Solutions

The aim of this study was to compare the sensitometric properties of commonly used radiographic films processed with chemical solutions in hospitals with different workloads. The effect of different processing conditions on the densities induced on radiologic films was investigated. Two readily available double-emulsion films, Fuji and Kodak, were exposed with an 11-step wedge and processed with Champion and CPAC processing solutions. The films were provided in both high- and low-workload centers. Our findings show that the speed and contrast of the Kodak film-screen in both workloads (high and low) are higher than those of the Fuji film-screen for both processing solutions. However, there were significant differences in film contrast for both workloads when the CPAC solution was used (p=0.000 and 0.028). The results showed that the base plus fog density of Kodak film was lower than that of Fuji. In general, the Champion processing solution produced more speed and contrast for the investigated films under the different conditions, and there was a significant difference at the 95% confidence level between the two processing solutions (p=0.01). The low base plus fog density of Kodak films provides more visibility and accuracy, and their higher contrast allows lower exposure factors to be used to obtain better quality in the resulting radiographs. In this study we found an economic advantage when Champion solution and Kodak film are used, together with a lower patient dose. Thus, in a radiologic facility, any change in film processor/processing cycle or chemistry should be carefully investigated before radiological procedures are performed on patients.

Architecture Based on Dynamic Graphs for the Dynamic Reconfiguration of Farms of Computers

In recent years, computers have increased their computing capacity, and the networks that interconnect them have improved to reach today's high data transfer rates. Programs that try to take advantage of these new technologies cannot be written using traditional programming techniques, since most algorithms were designed to be executed on a single processor in a non-concurrent form, rather than concurrently on a set of processors working and communicating through a network. This paper presents the ongoing development of a new system for the reconfiguration of groups of computers, taking these new technologies into account.

Lattice Boltzmann Simulation of Binary Mixture Diffusion Using Modern Graphics Processors

A highly optimized implementation of binary mixture diffusion with no initial bulk velocity on graphics processors is presented. The lattice Boltzmann model is employed for simulating the binary diffusion of oxygen and nitrogen into each other with different initial concentration distributions. Simulations have been performed using the latest proposed lattice Boltzmann model that satisfies both the indifferentiability principle and the H-theorem for multi-component gas mixtures. Contemporary numerical optimization techniques such as memory alignment and increasing the multiprocessor occupancy are exploited along with some novel optimization strategies to enhance the computational performance on graphics processors using the C for CUDA programming language. A speedup of more than two orders of magnitude over single-core processors is achieved on a variety of Graphics Processing Unit (GPU) devices ranging from conventional graphics cards to advanced, high-end GPUs, while the numerical results are in excellent agreement with the available analytical and numerical data in the literature.

Multiple Job Shop-Scheduling using Hybrid Heuristic Algorithm

In this paper, multi-processor job shop scheduling problems are solved by a heuristic algorithm based on a hybrid of priority dispatching rules guided by an ant colony optimization algorithm. The objective is to minimize the makespan, i.e. the total completion time, and the simultaneous presence of various kinds of pheromones is allowed. Using a suitable hybrid of priority dispatching rules improves the process of finding the best solution. The ant colony optimization algorithm not only improves the ability of the proposed algorithm but also decreases the total working time, because setup times are reduced and the working production line is modified so that similar work follows the same production lines. Another advantage of this algorithm is that similar (not identical) machines can be considered, so these machines can process a job with different processing and setup times. To evaluate the algorithm in light of this capability, a number of test problems are solved and the associated results are analyzed. The results show a significant decrease in throughput time. They also show that the algorithm is able to recognize the bottleneck machine and to schedule jobs efficiently.

A New High Speed Neural Model for Fast Character Recognition Using Cross Correlation and Matrix Decomposition

Neural processors have shown good results for detecting a certain character in a given input matrix. In this paper, a new idea to speed up the operation of neural processors for character detection is presented. Such processors are designed based on cross correlation in the frequency domain between the input matrix and the weights of neural networks. This approach is developed to reduce the computation steps required by these faster neural networks for the searching process. The divide and conquer principle is applied through image decomposition. Each image is divided into small sub-images, and each one is then tested separately using a single faster neural processor. Furthermore, faster character detection is obtained by using parallel processing techniques to test the resulting sub-images at the same time using the same number of faster neural networks. In contrast to using only faster neural processors, the speed-up ratio increases with the size of the input image when using faster neural processors and image decomposition. Moreover, the problem of local sub-image normalization in the frequency domain is solved. The effect of image normalization on the speed-up ratio of character detection is discussed. Simulation results show that local sub-image normalization through weight normalization is faster than sub-image normalization in the spatial domain. The overall speed-up ratio of the detection process increases because the normalization of weights is done offline.

Model Transformation with a Visual Control Flow Language

Graph rewriting-based visual model processing is a widely used technique for model transformation. Visual model transformations often need to follow an algorithm that requires a strict control over the execution sequence of the transformation steps. Therefore, in Visual Model Processors (VMPs) the execution order of the transformation steps is crucial. This paper presents the visual control flow support of Visual Modeling and Transformation System (VMTS), which facilitates composing complex model transformations of simple transformation steps and executing them. The VMTS Visual Control Flow Language (VCFL) uses stereotyped activity diagrams to specify control flow structures and OCL constraints to choose between different control flow branches. This paper introduces VCFL, discusses its termination properties and provides an algorithm to support the termination analysis of VCFL transformations.

An Images Monitoring System based on Multi-Format Streaming Grid Architecture

This paper proposes a novel multi-format stream grid architecture for a real-time image monitoring system. The system, based on a three-tier architecture, includes a stream receiving unit, a stream processor unit, and a presentation unit. It is a distributed, loosely coupled architecture. The benefit is that the number of required servers can be adjusted depending on the load of the image monitoring system. The stream receiving unit supports multiple capture source devices and multi-format stream compression encoders. The stream processor unit includes three modules: a stream clipping module, an image processing module, and an image management module. The presentation unit can display image data on several different platforms. We verified the proposed grid architecture with an actual image monitoring test. We used a fast image matching method with adjustable parameters for different monitoring situations. A background subtraction method is also implemented in the system. Experimental results showed that the proposed architecture is robust, adaptive, and powerful for image monitoring.

Semi-Lagrangian Method for Advection Equation on GPU in Unstructured R3 Mesh for Fluid Dynamics Application

Numerical integration of the initial-boundary value problem for the advection equation in ℜ³ is considered. The method used is a conditionally stable semi-Lagrangian advection scheme with high-order interpolation on an unstructured mesh. To increase the integration time step, the BFECC method with a TVD limiter correction is used. The method is adapted to a parallel graphics processing unit environment using NVIDIA CUDA and applied in a Navier-Stokes solver. It is shown that the calculation on an NVIDIA GeForce 8800 GPU is 184 times faster than on one processor (AMD X2 4800+ CPU). The method is extended to an incompressible fluid dynamics solver. Flow over a cylinder in the 3D case is compared with experimental data.
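
The two building blocks named above, semi-Lagrangian advection and BFECC (back and forth error compensation and correction), can be sketched in one dimension. This is a simplified illustration under assumptions that differ from the paper: a uniform periodic 1D grid instead of an unstructured 3D mesh, linear instead of high-order interpolation, constant velocity, no TVD limiter, and plain Python instead of CUDA. The function names are hypothetical.

```python
import math


def semi_lagrangian_step(phi, velocity, dt, dx):
    """Trace each node back along the flow and interpolate linearly (periodic grid)."""
    n = len(phi)
    result = []
    for i in range(n):
        departure = i - velocity * dt / dx   # departure point in index units
        left = math.floor(departure)         # grid node just left of that point
        weight = departure - left            # fractional distance from the left node
        result.append((1 - weight) * phi[left % n] + weight * phi[(left + 1) % n])
    return result


def bfecc_step(phi, velocity, dt, dx):
    """One BFECC step built on the semi-Lagrangian step (no limiter applied)."""
    forward = semi_lagrangian_step(phi, velocity, dt, dx)        # advect forward
    backward = semi_lagrangian_step(forward, -velocity, dt, dx)  # advect back again
    # Half the round-trip discrepancy estimates the error of a single step.
    corrected = [p + 0.5 * (p - b) for p, b in zip(phi, backward)]
    return semi_lagrangian_step(corrected, velocity, dt, dx)     # final advection
```

Each node's update reads only the old field, so the loop over nodes is independent per node, which is what makes this kind of scheme a natural fit for GPU execution.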

2D and 3D Finite Element Method Packages of CEMTool for Engineering PDE Problems

CEMTool is a command-style design and analysis package for scientific and technological algorithms and a matrix-based computation language. In this paper, we present new 2D and 3D finite element method (FEM) packages for CEMTool. We discuss the detailed structure and important features of the pre-processor, solver, and post-processor of the CEMTool 2D and 3D FEM packages. In contrast to the existing MATLAB PDE Toolbox, the proposed FEM packages can deal with combinations of the reserved words. The mesh can also be controlled very effectively. With the introduction of a new mesh generation algorithm and a fast solving technique, our FEM packages can guarantee a shorter computational time than the MATLAB PDE Toolbox. Consequently, the new FEM packages overcome some disadvantages and limitations of the existing MATLAB PDE Toolbox.