Abstract: In this paper, we evaluate the choice of suitable
quantization characteristics for both the decoder messages and the
received samples in Low Density Parity Check (LDPC) coded
systems using M-QAM (Quadrature Amplitude Modulation)
schemes. The analysis involves the demapper block that provides
initial likelihood values for the decoder, by relating its quantization
strategy of the decoder. A mapping strategy refers to the grouping of
bits within a codeword, where each m-bit group is used to select a
2m-ary signal in accordance with the signal labels. Further we
evaluate the system with mapping strategies like Consecutive-Bit
(CB) and Bit-Reliability (BR). A new demapper version, based on
approximate expressions, is also presented to yield a low complexity
hardware implementation.
Abstract: This paper presents a novel genetic algorithm, termed
the Optimum Individual Monogenetic Algorithm (OIMGA) and
describes its hardware implementation. As the monogenetic strategy
retains only the optimum individual, the memory requirement is
dramatically reduced and no crossover circuitry is needed, thereby
ensuring the requisite silicon area is kept to a minimum.
Consequently, depending on application requirements, OIMGA
allows the investigation of solutions that warrant either larger GA
populations or individuals of greater length. The results given in this
paper demonstrate that both the performance of OIMGA and its
convergence time are superior to those of existing hardware GA
implementations. Local convergence is achieved in OIMGA by
retaining elite individuals, while population diversity is ensured by
continually searching for the best individuals in fresh regions of the
search space.
Abstract: Higher-order Statistics (HOS), also known as
cumulants, cross moments and their frequency domain counterparts,
known as poly spectra have emerged as a powerful signal processing
tool for the synthesis and analysis of signals and systems. Algorithms
used for the computation of cross moments are computationally
intensive and require high computational speed for real-time
applications. For efficiency and high speed, it is often advantageous
to realize computation intensive algorithms in hardware. A promising
solution that combines high flexibility together with the speed of a
traditional hardware is Field Programmable Gate Array (FPGA). In
this paper, we present FPGA-based parallel architecture for the
computation of third-order cross moments. The proposed design is
coded in Very High Speed Integrated Circuit (VHSIC) Hardware
Description Language (VHDL) and functionally verified by
implementing it on Xilinx Spartan-3 XC3S2000FG900-4 FPGA.
Implementation results are presented and it shows that the proposed
design can operate at a maximum frequency of 86.618 MHz.
Abstract: An efficient parallel form in digital signal processor can improve the algorithm performance. The butterfly structure is an important role in fast Fourier transform (FFT), because its symmetry form is suitable for hardware implementation. Although it can perform a symmetric structure, the performance will be reduced under the data-dependent flow characteristic. Even though recent research which call as novel memory reference reduction methods (NMRRM) for FFT focus on reduce memory reference in twiddle factor, the data-dependent property still exists. In this paper, we propose a parallel-computing approach for FFT implementation on digital signal processor (DSP) which is based on data-independent property and still hold the property of low-memory reference. The proposed method combines final two steps in NMRRM FFT to perform a novel data-independent structure, besides it is very suitable for multi-operation-unit digital signal processor and dual-core system. We have applied the proposed method of radix-2 FFT algorithm in low memory reference on TI TMSC320C64x DSP. Experimental results show the method can reduce 33.8% clock cycles comparing with the NMRRM FFT implementation and keep the low-memory reference property.
Abstract: Falling has been one of the major concerns and threats
to the independence of the elderly in their daily lives. With the
worldwide significant growth of the aging population, it is essential
to have a promising solution of fall detection which is able to operate
at high accuracy in real-time and supports large scale implementation
using multiple cameras. Field Programmable Gate Array (FPGA) is a
highly promising tool to be used as a hardware accelerator in many
emerging embedded vision based system. Thus, it is the main
objective of this paper to present an FPGA-based solution of visual
based fall detection to meet stringent real-time requirements with
high accuracy. The hardware architecture of visual based fall
detection which utilizes the pixel locality to reduce memory accesses
is proposed. By exploiting the parallel and pipeline architecture of
FPGA, our hardware implementation of visual based fall detection
using FGPA is able to achieve a performance of 60fps for a series of
video analytical functions at VGA resolutions (640x480). The results
of this work show that FPGA has great potentials and impacts in
enabling large scale vision system in the future healthcare industry
due to its flexibility and scalability.
Abstract: In this paper, low end Digital Signal Processors (DSPs)
are applied to accelerate integer neural networks. The use of DSPs
to accelerate neural networks has been a topic of study for some
time, and has demonstrated significant performance improvements.
Recently, work has been done on integer only neural networks, which
greatly reduces hardware requirements, and thus allows for cheaper
hardware implementation. DSPs with Arithmetic Logic Units (ALUs)
that support floating or fixed point arithmetic are generally more
expensive than their integer only counterparts due to increased circuit
complexity. However if the need for floating or fixed point math
operation can be removed, then simpler, lower cost DSPs can be
used. To achieve this, an integer only neural network is created in
this paper, which is then accelerated by using DSP instructions to
improve performance.
Abstract: A new genetic algorithm, termed the 'optimum individual monogenetic genetic algorithm' (OIMGA), is presented whose properties have been deliberately designed to be well suited to hardware implementation. Specific design criteria were to ensure fast access to the individuals in the population, to keep the required silicon area for hardware implementation to a minimum and to incorporate flexibility in the structure for the targeting of a range of applications. The first two criteria are met by retaining only the current optimum individual, thereby guaranteeing a small memory requirement that can easily be stored in fast on-chip memory. Also, OIMGA can be easily reconfigured to allow the investigation of problems that normally warrant either large GA populations or individuals many genes in length. Local convergence is achieved in OIMGA by retaining elite individuals, while population diversity is ensured by continually searching for the best individuals in fresh regions of the search space. The results given in this paper demonstrate that both the performance of OIMGA and its convergence time are superior to those of a range of existing hardware GA implementations.