Parallel Vector Processing Using Multi Level Orbital DATA

Many applications use vector operations, applying a single instruction to multiple data elements that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth, which reduces the performance gain of vector processing. We present a memory system that makes all of its contents available to processors in time, so that processors need not access the memory: each location is forced to become available to all processors at a specific time. The data move in different orbits and become available to other processors in higher orbits at different times. We use this memory to apply parallel vector operations to data streams at the first orbit level. Data processed in the first level move to the upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation that deals with the serial code limitations inherent in all parallel applications, interleaved with the lower-level vector operations.
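The two-level flow described above can be illustrated with a small software toy. This is only a sketch under stated assumptions, not the hardware design of the paper: the names `orbit` and `run` are hypothetical, each orbit is modeled as a rotating buffer that exposes one element per tick, and the upper-level operation is assumed to be a simple reduction.

```python
from collections import deque

def orbit(data):
    """Yield (tick, element): the element passing the access point at each tick."""
    ring = deque(data)
    for tick in range(len(ring)):
        yield tick, ring[0]
        ring.rotate(-1)  # advance the orbit by one slot

def run(streams, vector_op, reduce_op, initial):
    """Level-1 processors apply vector_op per stream; a level-2 processor reduces.

    Hypothetical model: one level-1 processor per stream, all sharing the same
    clock, and one upper-orbit processor that receives results one at a time.
    """
    orbits = [orbit(s) for s in streams]
    acc = initial
    processed = [[] for _ in streams]
    for arrivals in zip(*orbits):            # one tick: each orbit exposes one element
        for p, (_tick, value) in enumerate(arrivals):
            out = vector_op(value)           # level-1 work, conceptually parallel
            processed[p].append(out)
            acc = reduce_op(acc, out)        # level-2 work, one element at a time
    return processed, acc

processed, total = run([[1, 2, 3], [4, 5, 6]],
                       lambda x: 2 * x,      # element-wise vector operation
                       lambda a, b: a + b,   # serial reduction in the upper orbit
                       0)
```

In this toy the processors never issue a memory request; they only consume whatever element the orbit presents at the current tick, which is the property the abstract emphasizes.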

A Fast Neural Algorithm for Serial Code Detection in a Stream of Sequential Data

In recent years, fast neural networks for object and face detection have been introduced, based on cross correlation in the frequency domain between the input matrix and the hidden weights of the neural networks. In our previous papers [3,4], fast neural networks for certain code detection were introduced. It was proved in [10] that, for fast neural networks to give the same correct results as conventional neural networks, both the weights of the neural networks and the input matrix must be symmetric. This condition made those fast neural networks slower than conventional neural networks. Another symmetric form for the input matrix was introduced in [1-9] to speed up the operation of these fast neural networks. Here, corrections to the cross correlation equations (given in [13,15,16]) that compensate for the symmetry condition are presented. After these corrections, it is proved mathematically that the number of computation steps required by fast neural networks is less than that needed by classical neural networks. Furthermore, there is no need to convert the input data into symmetric form. Moreover, this new idea is applied to increase the speed of neural networks when processing complex values. Simulation results obtained with MATLAB after these corrections confirm the theoretical computations.
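As a point of reference, cross correlation in the frequency domain can be sketched with the standard correlation theorem. The example below is a minimal one-dimensional illustration using NumPy, not the corrected equations of the paper; the function names `xcorr_direct` and `xcorr_fft` are hypothetical. It shows that, when the frequency-domain product uses the complex conjugate of the transformed weights, the result matches the sliding dot product without any symmetry requirement on the input or the weights.

```python
import numpy as np

def xcorr_direct(x, w):
    """Sliding dot product of weights w over input x (valid positions only)."""
    n = len(w)
    return np.array([np.dot(x[k:k + n], w) for k in range(len(x) - n + 1)])

def xcorr_fft(x, w):
    """Same result via the correlation theorem: IFFT(FFT(x) * conj(FFT(w)))."""
    N = len(x)
    X = np.fft.fft(x)
    W = np.fft.fft(w, n=N)                 # zero-pad the weights to the input length
    c = np.fft.ifft(X * np.conj(W)).real   # real inputs, so keep the real part
    return c[:N - len(w) + 1]              # drop positions affected by wrap-around

rng = np.random.default_rng(0)
x = rng.standard_normal(64)                # arbitrary, non-symmetric input
w = rng.standard_normal(8)                 # arbitrary, non-symmetric weights
assert np.allclose(xcorr_direct(x, w), xcorr_fft(x, w))
```

Omitting the conjugate would compute a convolution instead, which agrees with the correlation only for symmetric weights.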

PeliGRIFF: A Parallel DEM-DLM/FD Method for DNS of Particulate Flows with Collisions

An original Direct Numerical Simulation (DNS) method to tackle the problem of particulate flows at moderate to high concentration and finite Reynolds number is presented. Our method is built on the framework established by Glowinski and his coworkers [1], in the sense that we use their Distributed Lagrange Multiplier/Fictitious Domain (DLM/FD) formulation and their operator-splitting idea, but it differs in the treatment of particle collisions. The novelty of our contribution lies in replacing the simple artificial repulsive-force-based collision model usually employed in the literature with an efficient Discrete Element Method (DEM) granular solver. The use of our DEM solver enables us to consider particles of arbitrary shape (at least convex) and to account for actual contacts, in the sense that particles actually touch each other, in contrast with the simple repulsive-force-based collision model. We recently upgraded our serial code, GRIFF [2], to full MPI capabilities. Our new code, PeliGRIFF, is developed under the framework of the full MPI open source platform PELICANS [3]. The new MPI capabilities of PeliGRIFF open new perspectives in the study of particulate flows and significantly increase the number of particles that can be considered in a full DNS approach: O(100000) in 2D and O(10000) in 3D. Results on the 2D/3D sedimentation/fluidization of isometric polygonal/polyhedral particles with collisions are presented.
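The distinction between a repulsive-force collision model and actual contact can be made concrete with two disks. The sketch below is illustrative only: the quadratic force law, the safety distance `rho`, the `stiffness` parameter, and all function names are assumptions chosen for the example, not the formulas of [1] or of PeliGRIFF, and the DEM solver itself is not sketched.

```python
import math

def gap(c1, r1, c2, r2):
    """Surface-to-surface distance between two disks (negative if overlapping)."""
    return math.dist(c1, c2) - (r1 + r2)

def repulsive_force_magnitude(g, rho, stiffness):
    """Assumed short-range model: zero beyond rho, quadratic growth inside it."""
    if g >= rho:
        return 0.0
    return stiffness * ((rho - g) / rho) ** 2

def in_contact(g, tol=1e-12):
    """Actual contact: the surfaces touch (or overlap within a tolerance)."""
    return g <= tol

# Two unit disks whose surfaces are 0.05 apart (hypothetical values).
g = gap((0.0, 0.0), 1.0, (2.05, 0.0), 1.0)
force = repulsive_force_magnitude(g, rho=0.1, stiffness=1.0)
touching = in_contact(g)
```

With these values the repulsive model already pushes the disks apart while `touching` is still false, so the particles never actually meet; a contact-based treatment acts only once the gap closes.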