A Survey of Field Programmable Gate Array-Based Convolutional Neural Network Accelerators

With the rapid development of deep learning, neural network and deep learning algorithms play a significant role in various practical applications. Due to the high accuracy and good performance, Convolutional Neural Networks (CNNs) especially have become a research hot spot in the past few years. However, the size of the networks becomes increasingly large scale due to the demands of the practical applications, which poses a significant challenge to construct a high-performance implementation of deep learning neural networks. Meanwhile, many of these application scenarios also have strict requirements on the performance and low-power consumption of hardware devices. Therefore, it is particularly critical to choose a moderate computing platform for hardware acceleration of CNNs. This article aimed to survey the recent advance in Field Programmable Gate Array (FPGA)-based acceleration of CNNs. Various designs and implementations of the accelerator based on FPGA under different devices and network models are overviewed, and the versions of Graphic Processing Units (GPUs), Application Specific Integrated Circuits (ASICs) and Digital Signal Processors (DSPs) are compared to present our own critical analysis and comments. Finally, we give a discussion on different perspectives of these acceleration and optimization methods on FPGA platforms to further explore the opportunities and challenges for future research. More helpfully, we give a prospect for future development of the FPGA-based accelerator.

Digital Encoder Based Power Frequency Deviation Measurement

In this paper, a simple method is presented for measurement of power frequency deviations. A phase locked loop (PLL) is used to multiply the signal under test by a factor of 100. The number of pulses in this pulse train signal is counted over a stable known period, using decade driving assemblies (DDAs) and flip-flops. These signals are combined using logic gates and then passed through decade counters to give a unique combination of pulses or levels, which are further encoded. These pulses are equally suitable for both control applications and display units. The experimental circuit developed gives a resolution of 1 Hz within the measurement period of 20 ms. The proposed circuit is also simulated in Verilog Hardware Description Language (VHDL) and implemented using Field Programing Gate Arrays (FPGAs). A Mixed signal Oscilloscope (MSO) is used to observe the results of FPGA implementation. These results are compared with the results of the proposed circuit of discrete components. The proposed system is useful for frequency deviation measurement and control in power systems.

Implementation of Edge Detection Based on Autofluorescence Endoscopic Image of Field Programmable Gate Array

Autofluorescence Imaging (AFI) is a technology for detecting early carcinogenesis of the gastrointestinal tract in recent years. Compared with traditional white light endoscopy (WLE), this technology greatly improves the detection accuracy of early carcinogenesis, because the colors of normal tissues are different from cancerous tissues. Thus, edge detection can distinguish them in grayscale images. In this paper, based on the traditional Sobel edge detection method, optimization has been performed on this method which considers the environment of the gastrointestinal, including adaptive threshold and morphological processing. All of the processes are implemented on our self-designed system based on the image sensor OV6930 and Field Programmable Gate Array (FPGA), The system can capture the gastrointestinal image taken by the lens in real time and detect edges. The final experiments verified the feasibility of our system and the effectiveness and accuracy of the edge detection algorithm.

FPGA Implementation of the BB84 Protocol

The development of a quantum key distribution (QKD) system on a field-programmable gate array (FPGA) platform is the subject of this paper. A quantum cryptographic protocol is designed based on the properties of quantum information and the characteristics of FPGAs. The proposed protocol performs key extraction, reconciliation, error correction, and privacy amplification tasks to generate a perfectly secret final key. We modeled the presence of the spy in our system with a strategy to reveal some of the exchanged information without being noticed. Using an FPGA card with a 100 MHz clock frequency, we have demonstrated the evolution of the error rate as well as the amounts of mutual information (between the two interlocutors and that of the spy) passing from one step to another in the key generation process.

Comparative Study of Equivalent Linear and Non-Linear Ground Response Analysis for Rapar District of Kutch, India

Earthquakes are considered to be the most destructive rapid-onset disasters human beings are exposed to. The amount of loss it brings in is sufficient to take careful considerations for designing of structures and facilities. Seismic Hazard Analysis is one such tool which can be used for earthquake resistant design. Ground Response Analysis is one of the most crucial and decisive steps for seismic hazard analysis. Rapar district of Kutch, Gujarat falls in Zone 5 of earthquake zone map of India and thus has high seismicity because of which it is selected for analysis. In total 8 bore-log data were studied at different locations in and around Rapar district. Different soil engineering properties were analyzed and relevant empirical correlations were used to calculate maximum shear modulus (Gmax) and shear wave velocity (Vs) for the soil layers. The soil was modeled using Pressure-Dependent Modified Kodner Zelasko (MKZ) model and the reference curve used for fitting was Seed and Idriss (1970) for sand and Darendeli (2001) for clay. Both Equivalent linear (EL), as well as Non-linear (NL) ground response analysis, has been carried out with Masing Hysteretic Re/Unloading formulation for comparison. Commercially available DEEPSOIL v. 7.0 software is used for this analysis. In this study an attempt is made to quantify ground response regarding generated acceleration time-history at top of the soil column, Response spectra calculation at 5 % damping and Fourier amplitude spectrum calculation. Moreover, the variation of Peak Ground Acceleration (PGA), Maximum Displacement, Maximum Strain (in %), Maximum Stress Ratio, Mobilized Shear Stress with depth is also calculated. From the study, PGA values estimated in rocky strata are nearly same as bedrock motion and marginal amplification is observed in sandy silt and silty clays by both analyses. The NL analysis gives conservative results of maximum displacement as compared to EL analysis. Maximum strain predicted by both studies is very close to each other. And overall NL analysis is more efficient and realistic because it follows the actual hyperbolic stress-strain relationship, considers stiffness degradation and mobilizes stresses generated due to pore water pressure.

Seismic Behavior of a Jumbo Container Crane in the Low Seismicity Zone Using Time-History Analyses

Jumbo container crane is an important part of port structures that needs to be designed properly, even when the port locates in low seismicity zone such as in Korea. In this paper, 30 artificial ground motions derived from the elastic response spectra of Korean Building Code (2005) are used for time history analysis. It is found that the uplift might not occur in this analysis when the crane locates in the low seismic zone. Therefore, a selection of a pinned or a gap element for base supporting has not much effect on the determination of the total base shear. The relationships between the total base shear and peak ground acceleration (PGA) and the relationships between the portal drift and the PGA are proposed in this study.

FPGA Implementation of Adaptive Clock Recovery for TDMoIP Systems

Circuit switched networks widely used until the end of the 20th century have been transformed into packages switched networks. Time Division Multiplexing over Internet Protocol (TDMoIP) is a system that enables Time Division Multiplexing (TDM) traffic to be carried over packet switched networks (PSN). In TDMoIP systems, devices that send TDM data to the PSN and receive it from the network must operate with the same clock frequency. In this study, it was aimed to implement clock synchronization process in Field Programmable Gate Array (FPGA) chips using time information attached to the packages received from PSN. The designed hardware is verified using the datasets obtained for the different carrier types and comparing the results with the software model. Field tests are also performed by using the real time TDMoIP system.

The DAQ Debugger for iFDAQ of the COMPASS Experiment

In general, state-of-the-art Data Acquisition Systems (DAQ) in high energy physics experiments must satisfy high requirements in terms of reliability, efficiency and data rate capability. This paper presents the development and deployment of a debugging tool named DAQ Debugger for the intelligent, FPGA-based Data Acquisition System (iFDAQ) of the COMPASS experiment at CERN. Utilizing a hardware event builder, the iFDAQ is designed to be able to readout data at the average maximum rate of 1.5 GB/s of the experiment. In complex softwares, such as the iFDAQ, having thousands of lines of code, the debugging process is absolutely essential to reveal all software issues. Unfortunately, conventional debugging of the iFDAQ is not possible during the real data taking. The DAQ Debugger is a tool for identifying a problem, isolating the source of the problem, and then either correcting the problem or determining a way to work around it. It provides the layer for an easy integration to any process and has no impact on the process performance. Based on handling of system signals, the DAQ Debugger represents an alternative to conventional debuggers provided by most integrated development environments. Whenever problem occurs, it generates reports containing all necessary information important for a deeper investigation and analysis. The DAQ Debugger was fully incorporated to all processes in the iFDAQ during the run 2016. It helped to reveal remaining software issues and improved significantly the stability of the system in comparison with the previous run. In the paper, we present the DAQ Debugger from several insights and discuss it in a detailed way.

Effect of Fill Material Density under Structures on Ground Motion Characteristics Due to Earthquake

Due to limited areas and excessive cost of land for projects, backfilling process has become necessary. Also, backfilling will be done to overcome the un-leveling depths or raising levels of site construction, especially near the sea region. Therefore, backfilling soil materials used under the foundation of structures should be investigated regarding its effect on ground motion characteristics, especially at regions subjected to earthquakes. In this research, 60-meter thickness of sandy fill material was used above a fixed 240-meter of natural clayey soil underlying by rock formation to predict the modified ground motion characteristics effect at the foundation level. Comparison between the effect of using three different situations of fill material compaction on the recorded earthquake is studied, i.e. peak ground acceleration, time history, and spectra acceleration values. The three different densities of the compacted fill material used in the study were very loose, medium dense and very dense sand deposits, respectively. Shake computer program was used to perform this study. Strong earthquake records, with Peak Ground Acceleration (PGA) of 0.35 g, were used in the analysis. It was found that, higher compaction of fill material thickness has a significant effect on eliminating the earthquake ground motion properties at surface layer of fill material, near foundation level. It is recommended to consider the fill material characteristics in the design of foundations subjected to seismic motions. Future studies should be analyzed for different fill and natural soil deposits for different seismic conditions.

The Communication Library DIALOG for iFDAQ of the COMPASS Experiment

Modern experiments in high energy physics impose great demands on the reliability, the efficiency, and the data rate of Data Acquisition Systems (DAQ). This contribution focuses on the development and deployment of the new communication library DIALOG for the intelligent, FPGA-based Data Acquisition System (iFDAQ) of the COMPASS experiment at CERN. The iFDAQ utilizing a hardware event builder is designed to be able to readout data at the maximum rate of the experiment. The DIALOG library is a communication system both for distributed and mixed environments, it provides a network transparent inter-process communication layer. Using the high-performance and modern C++ framework Qt and its Qt Network API, the DIALOG library presents an alternative to the previously used DIM library. The DIALOG library was fully incorporated to all processes in the iFDAQ during the run 2016. From the software point of view, it might be considered as a significant improvement of iFDAQ in comparison with the previous run. To extend the possibilities of debugging, the online monitoring of communication among processes via DIALOG GUI is a desirable feature. In the paper, we present the DIALOG library from several insights and discuss it in a detailed way. Moreover, the efficiency measurement and comparison with the DIM library with respect to the iFDAQ requirements is provided.

CPU Architecture Based on Static Hardware Scheduler Engine and Multiple Pipeline Registers

The development of CPUs and of real-time systems based on them made it possible to use time at increasingly low resolutions. Together with the scheduling methods and algorithms, time organizing has been improved so as to respond positively to the need for optimization and to the way in which the CPU is used. This presentation contains both a detailed theoretical description and the results obtained from research on improving the performances of the nMPRA (Multi Pipeline Register Architecture) processor by implementing specific functions in hardware. The proposed CPU architecture has been developed, simulated and validated by using the FPGA Virtex-7 circuit, via a SoC project. Although the nMPRA processor hardware structure with five pipeline stages is very complex, the present paper presents and analyzes the tests dedicated to the implementation of the CPU and of the memory on-chip for instructions and data. In order to practically implement and test the entire SoC project, various tests have been performed. These tests have been performed in order to verify the drivers for peripherals and the boot module named Bootloader.

Field-Programmable Gate Array Based Tester for Protective Relay

The reliability of the power grid depends on the successful operation of thousands of protective relays. The failure of one relay to operate as intended may lead the entire power grid to blackout. In fact, major power system failures during transient disturbances may be caused by unnecessary protective relay tripping rather than by the failure of a relay to operate. Adequate relay testing provides a first defense against false trips of the relay and hence improves power grid stability and prevents catastrophic bulk power system failures. The goal of this research project is to design and enhance the relay tester using a technology such as Field Programmable Gate Array (FPGA) card NI 7851. A PC based tester framework has been developed using Simulink power system model for generating signals under different conditions (faults or transient disturbances) and LabVIEW for developing the graphical user interface and configuring the FPGA. Besides, the interface system has been developed for outputting and amplifying the signals without distortion. These signals should be like the generated ones by the real power system and large enough for testing the relay’s functionality. The signals generated that have been displayed on the scope are satisfactory. Furthermore, the proposed testing system can be used for improving the performance of protective relay.

Selection of Intensity Measure in Probabilistic Seismic Risk Assessment of a Turkish Railway Bridge

Fragility curve is an effective common used tool to determine the earthquake performance of structural and nonstructural components. Also, it is used to determine the nonlinear behavior of bridges. There are many historical bridges in the Turkish railway network; the earthquake performances of these bridges are needed to be investigated. To derive fragility curve Intensity measures (IMs) and Engineering demand parameters (EDP) are needed to be determined. And the relation between IMs and EDP are needed to be derived. In this study, a typical simply supported steel girder riveted railway bridge is studied. Fragility curves of this bridge are derived by two parameters lognormal distribution. Time history analyses are done for selected 60 real earthquake data to determine the relation between IMs and EDP. Moreover, efficiency, practicality, and sufficiency of three different IMs are discussed. PGA, Sa(0.2s) and Sa(1s), the most common used IMs parameters for fragility curve in the literature, are taken into consideration in terms of efficiency, practicality and sufficiency.

Family Functionality in Mexican Children with Congenital and Non-Congenital Deafness

A total of 100 primary caregivers (mothers, fathers, grandparents) with at least one child or grandchild with a diagnosis of congenital bilateral profound deafness were assessed in order to evaluate the functionality of families with a deaf member, who was evaluated by specialists in audiology, molecular biology, genetics and psychology. After confirmation of the clinical diagnosis, DNA from the patients and parents were analyzed in search of the 35delG deletion of the GJB2 gene to determine who possessed the mutation. All primary caregivers were provided psychological support, regardless of whether or not they had the mutation, and prior and subsequent, the family APGAR test was applied. All parents, grandparents were informed of the results of the genetic analysis during the psychological intervention. The family APGAR, after psychological and genetic counseling, showed that 14% perceived their families as functional, 62% moderately functional and 24% dysfunctional. This shows the importance of psychological support in family functionality that has a direct impact on the quality of life of these families.

Seismic Fragility for Sliding Failure of Weir Structure Considering the Process of Concrete Aging

This study investigated the change of weir structure performances when durability of concrete, which is the main material of weir structure, decreased due to their aging by mean of seismic fragility analysis. In the analysis, it was assumed that the elastic modulus of concrete was reduced by 10% in order to account for their aged deterioration. Additionally, the analysis of seismic fragility was based on Monte Carlo Simulation method combined with a 2D nonlinear finite element in ABAQUS platform with the consideration of deterioration of concrete. Finally, the comparison of seismic fragility of model pre- and post-deterioration was made to study the performance of weir. Results show that the probability of failure in moderate damage for deteriorated model was found to be larger than pre-deterioration model when peak ground acceleration (PGA) passed 0.4 g.

Improving the Performances of the nMPRA Architecture by Implementing Specific Functions in Hardware

Minimizing the response time to asynchronous events in a real-time system is an important factor in increasing the speed of response and an interesting concept in designing equipment fast enough for the most demanding applications. The present article will present the results regarding the validation of the nMPRA (Multi Pipeline Register Architecture) architecture using the FPGA Virtex-7 circuit. The nMPRA concept is a hardware processor with the scheduler implemented at the processor level; this is done without affecting a possible bus communication, as is the case with the other CPU solutions. The implementation of static or dynamic scheduling operations in hardware and the improvement of handling interrupts and events by the real-time executive described in the present article represent a key solution for eliminating the overhead of the operating system functions. The nMPRA processor is capable of executing a preemptive scheduling, using various algorithms without a software scheduler. Therefore, we have also presented various scheduling methods and algorithms used in scheduling the real-time tasks.

Analysis of Lightweight Register Hardware Threat

In this paper, we present a design methodology of lightweight register transfer level (RTL) hardware threat implemented based on a MAX II FPGA platform. The dynamic power consumed by the toggling of the various bit of registers as well as the dynamic power consumed per unit of logic circuits were analyzed. The hardware threat was designed taking advantage of the differences in dynamic power consumed per unit of logic circuits to hide the transfer information. The experiment result shows that the register hardware threat was successfully implemented by using different dynamic power consumed per unit of logic circuits to hide the key information of DES encryption module. It needs more than 100000 sample curves to reduce the background noise by comparing the sample space when it completely meets the time alignment requirement. In additional, an external trigger signal is playing a very important role to detect the hardware threat in this experiment.

Seismic Soil-Pile Interaction Considering Nonlinear Soil Column Behavior in Saturated and Dry Soil Conditions

This paper investigates seismic soil-pile interaction using the Beam on Nonlinear Winkler Foundation (BNWF) approach. Three soil types are considered to cover all the possible responses, as well as nonlinear site response analysis using finite element method in OpenSees platform. Excitations at each elevation that are output of the site response analysis are used as the input excitation to the soil pile system implementing multi-support excitation method. Spectral intensities of acceleration show that the extent of the response in sand is more severe than that of clay, in addition, increasing the PGA of ground strong motion will affect the sandy soil more, in comparison with clayey medium, which is an indicator of the sensitivity of soil-pile systems in sandy soil.

Run-Time Customisation of Soft-Core CPUs on Field Programmable Gate Array

The use of customised soft-core processors in which instructions can be integrated into a system in application hardware is increasing in the Field Programmable Gate Array (FPGA) field. Specifically, the partial run-time reconfiguration of FPGAs in specialised processors for a particular domain can be very beneficial. In this report, the design and implementation for the customisation of a soft-core MIPS processor using an FPGA and partial reconfiguration (PR) of FPGA technology will be addressed to achieve efficient resource use. This can be achieved using a PR design flow that helps the design fit into a smaller device. Moreover, the impact of static power consumption could be reduced due to runtime reconfiguration. This will be done by configurable custom instructions implemented in the hardware as an extension on the MIPS CPU. The aim of this project is to investigate the PR of FPGAs for run-time adaptations of the instruction set of a soft-core CPU, including the integration of custom instructions and the exploration of the potential to use the MultiBoot feature available in Xilinx FPGAs to carry out the PR process. The system will be evaluated and tested on a Nexus 3 development board featuring a Xilinx Spartran-6 FPGA. The system will be able to load reconfigurable custom instructions dynamically into user programs with the help of the trap handler when the custom instruction is called by the MIPS CPU. The results of this experiment demonstrate that custom instructions in hardware can speed up a certain function and many instructions can be saved when compared to a software implementation of the same function. Implementing custom instructions in hardware is perfectly possible and worth exploring.

Design of Local Interconnect Network Controller for Automotive Applications

Local interconnect network (LIN) is a communication protocol that combines sensors, actuators, and processors to a functional module in automotive applications. In this paper, a LIN ver. 2.2A controller was designed in Verilog hardware description language (Verilog HDL) and implemented in field-programmable gate array (FPGA). Its operation was verified by making full-scale LIN network with the presented FPGA-implemented LIN controller, commercial LIN transceivers, and commercial processors. When described in Verilog HDL and synthesized in 0.18 μm technology, its gate size was about 2,300 gates.