Evaluating the Impact of Replacement Policies on the Cache Performance and Energy Consumption in Different Multicore Embedded Systems

The cache has an important role in the reduction of access delay between a processor and memory in high-performance embedded systems. In these systems, the energy consumption is one of the most important concerns, and it will become more important with smaller processor feature sizes and higher frequencies. Meanwhile, the cache system dissipates a significant portion of energy compared to the other components of a processor. There are some elements that can affect the energy consumption of the cache such as replacement policy and degree of associativity. Due to these points, it can be inferred that selecting an appropriate configuration for the cache is a crucial part of designing a system. In this paper, we investigate the effect of different cache replacement policies on both cache’s performance and energy consumption. Furthermore, the impact of different Instruction Set Architectures (ISAs) on cache’s performance and energy consumption has been investigated.

Verification and Proposal of Information Processing Model Using EEG-Based Brain Activity Monitoring

Human beings perform a task by perceiving information from outside, recognizing them, and responding them. There have been various attempts to analyze and understand internal processes behind the reaction to a given stimulus by conducting psychological experiments and analysis from multiple perspectives. Among these, we focused on Model Human Processor (MHP). However, it was built based on psychological experiments and thus the relation with brain activity was unclear so far. To verify the validity of the MHP and propose our model from a viewpoint of neuroscience, EEG (Electroencephalography) measurements are performed during experiments in this study. More specifically, first, experiments were conducted where Latin alphabet characters were used as visual stimuli. In addition to response time, ERPs (event-related potentials) such as N100 and P300 were measured by using EEG. By comparing cycle time predicted by the MHP and latency of ERPs, it was found that N100, related to perception of stimuli, appeared at the end of the perceptual processor. Furthermore, by conducting an additional experiment, it was revealed that P300, related to decision making, appeared during the response decision process, not at the end. Second, by experiments using Japanese Hiragana characters, i.e. Japan's own phonetic symbols, those findings were confirmed. Finally, Japanese Kanji characters were used as more complicated visual stimuli. A Kanji character usually has several readings and several meanings. Despite the difference, a reading-related task and a meaning-related task exhibited similar results, meaning that they involved similar information processing processes of the brain. Based on those results, our model was proposed which reflects response time and ERP latency. It consists of three processors: the perception processor from an input of a stimulus to appearance of N100, the cognitive processor from N100 to P300, and the decision-action processor from P300 to response. Using our model, an application system which reflects brain activity can be established.

Efficiency Enhancement of Photovoltaic Panels Using an Optimised Air Cooled Heat Sink

Solar panels that use photovoltaic (PV) cells are popular for converting solar radiation into electricity. One of the major problems impacting the performance of PV panels is the overheating caused by excessive solar radiation and high ambient temperatures, which degrades the efficiency of the PV panels remarkably. To overcome this issue, an aluminum heat sink was used to dissipate unwanted heat from PV cells. The dimensions of the heat sink were determined considering the optimal fin spacing that fulfils hot climatic conditions. In this study, the effects of cooling on the efficiency and power output of a PV panel were studied experimentally. Two PV modules were used: one without and one with a heat sink. The experiments ran for 11 hours from 6:00 a.m. to 5:30 p.m. where temperature readings in the rear and front of both PV modules were recorded at an interval of 15 minutes using sensors and an Arduino microprocessor. Results are recorded for both panels simultaneously for analysis, temperate comparison, and for power and efficiency calculations. A maximum increase in the solar to electrical conversion efficiency of 35% and almost 55% in the power output were achieved with the use of a heat sink, while temperatures at the front and back of the panel were reduced by 9% and 11%, respectively.

Design and Motion Control of a Two-Wheel Inverted Pendulum Robot

Two-wheel inverted pendulum robot (TWIPR) is designed with two-hub DC motors for human riding and motion control evaluation. In order to measure the tilt angle and angular velocity of the inverted pendulum robot, accelerometer and gyroscope sensors are chosen. The mobile robot’s moving position and velocity were estimated based on DC motor built in hall sensors. The control kernel of this electric mobile robot is designed with embedded Arduino Nano microprocessor. A handle bar was designed to work as steering mechanism. The intelligent model-free fuzzy sliding mode control (FSMC) was employed as the main control algorithm for this mobile robot motion monitoring with different control purpose adjustment. The intelligent controllers were designed for balance control, and moving speed control purposes of this robot under different operation conditions and the control performance were evaluated based on experimental results.

Three-Stage Mining Metals Supply Chain Coordination and Product Quality Improvement with Revenue Sharing Contract

One of the main concerns of miners is to increase the quality level of their products because the mining metals price depends on their quality level; however, increasing the quality level of these products has different costs at different levels of the supply chain. These costs usually increase after extractor level. This paper studies the coordination issue of a decentralized three-level supply chain with one supplier (extractor), one mineral processor and one manufacturer in which the increasing product quality level cost at the processor level is higher than the supplier and at the level of the manufacturer is more than the processor. We identify the optimal product quality level for each supply chain member by designing a revenue sharing contract. Finally, numerical examples show that the designed contract not only increases the final product quality level but also provides a win-win condition for all supply chain members and increases the whole supply chain profit.

Encapsulation of Satureja khuzestanica Essential Oil in Chitosan Nanoparticles with Enhanced Antifungal Activity

During the recent years the six-fold growth of cancer in Iran has led the production of healthy products to become a challenge in the food industry. Due to the young population in the country, the consumption of fast foods is growing. The chemical cancer-causing preservatives are used to produce these products more than the standard; so using an appropriate alternative seems to be important. On the one hand, the plant essential oils show the high antimicrobial potential against pathogenic and spoilage microorganisms and on the other hand they are highly volatile and decomposed under the processing conditions. The study aims to produce the loaded chitosan nanoparticles with different concentrations of savory essential oil to improve the anti-microbial property and increase the resistance of essential oil to oxygen and heat. The encapsulation efficiency was obtained in the range of 32.07% to 39.93% and the particle size distribution of the samples was observed in the range of 159 to 210 nm. The range of Zeta potential was obtained between -11.9 to -23.1 mV. The essential oil loaded in chitosan showed stronger antifungal activity against Rhizopus stolonifer. The results showed that the antioxidant property is directly related to the concentration of loaded essential oil so that the antioxidant property increases by increasing the concentration of essential oil. In general, it seems that the savory essential oil loaded in chitosan particles can be used as a food processor.

A Case Study of Limited Dynamic Voltage Frequency Scaling in Low-Power Processors

Power management techniques are necessary to save power in the microprocessor. By changing the frequency and/or operating voltage of processor, DVFS can control power consumption. In this paper, we perform a case study to find optimal power state transition for DVFS. We propose the equation to find the optimal ratio between executions of states while taking into account the deadline of processing time and the power state transition delay overhead. The experiment is performed on the Cortex-M4 processor, and average 6.5% power saving is observed when DVFS is applied under the deadline condition.

Single Event Transient Tolerance Analysis in 8051 Microprocessor Using Scan Chain

As semi-conductor manufacturing technology evolves; the single event transient problem becomes more significant issue. Single event transient has a critical impact on both combinational and sequential logic circuits, so it is important to evaluate the soft error tolerance of the circuits at the design stage. In this paper, we present a soft error detecting simulation using scan chain. The simulation model generates a single event transient randomly in the circuit, and detects the soft error during the execution of the test patterns. We verified this model by inserting a scan chain in an 8051 microprocessor using 65 nm CMOS technology. While the test patterns generated by ATPG program are passing through the scan chain, we insert a single event transient and detect the number of soft errors per sub-module. The experiments show that the soft error rates per cell area of the SFR module is 277% larger than other modules.

Design of Local Interconnect Network Controller for Automotive Applications

Local interconnect network (LIN) is a communication protocol that combines sensors, actuators, and processors to a functional module in automotive applications. In this paper, a LIN ver. 2.2A controller was designed in Verilog hardware description language (Verilog HDL) and implemented in field-programmable gate array (FPGA). Its operation was verified by making full-scale LIN network with the presented FPGA-implemented LIN controller, commercial LIN transceivers, and commercial processors. When described in Verilog HDL and synthesized in 0.18 μm technology, its gate size was about 2,300 gates.

Thermal Property of Multi-Walled-Carbon-Nanotube Reinforced Epoxy Composites

In this study, epoxy composite specimens reinforced with multi-walled carbon nanotube filler were fabricated using shear mixer and ultra-sonication processor. The mechanical and thermal properties of the fabricated specimens were measured and evaluated. From the electron microscope images and the results from the measurements of tensile strengths, the specimens having 0.6 wt% nanotube content show better dispersion and higher strength than those of the other specimens. The Young’s moduli of the specimens increased as the contents of the nanotube filler in the matrix were increased. The specimen having a 0.6 wt% nanotube filler content showed higher thermal conductivity than that of the other specimens. While, in the measurement of thermal expansion, specimens having 0.4 and 0.6 wt% filler contents showed a lower value of thermal expansion than that of the other specimens. On the basis of the measured and evaluated properties of the composites, we believe that the simple and time-saving fabrication process used in this study was sufficient to obtain improved properties of the specimens.

Design and Analysis of a Low Power High Speed 1 Bit Full Adder Cell Based On TSPC Logic with Multi-Threshold CMOS

An adder is one of the most integral component of a digital system like a digital signal processor or a microprocessor. Being an extremely computationally intensive part of a system, the optimization for speed and power consumption of the adder is of prime importance. In this paper we have designed a 1 bit full adder cell based on dynamic TSPC logic to achieve high speed operation. A high threshold voltage sleep transistor is used to reduce the static power dissipation in standby mode. The circuit is designed and simulated in TSPICE using TSMC 180nm CMOS process. Average power consumption, delay and power-delay product is measured which showed considerable improvement in performance over the existing full adder designs.

A Multi Cordic Architecture on FPGA Platform

Coordinate Rotation Digital Computer (CORDIC) is a unique digital computing unit intended for the computation of mathematical operations and functions. This paper presents A multi CORDIC processor that integrates different CORDIC architectures on a single FPGA chip and allows the user to select the CORDIC architecture to proceed with based on what he wants to calculate and his needs. Synthesis show that radix 2 CORDIC has the lowest clock delay, radix 8 CORDIC has the highest LUT usage and lowest register usage while Hybrid Radix 4 CORDIC had the highest clock delay.

Digital Automatic Gain Control Integrated on WLAN Platform

In this work we present a solution for DAGC (Digital Automatic Gain Control) in WLAN receivers compatible to IEEE 802.11a/g standard. Those standards define communication in 5/2.4 GHz band using Orthogonal Frequency Division Multiplexing OFDM modulation scheme. WLAN Transceiver that we have used enables gain control over Low Noise Amplifier (LNA) and a Variable Gain Amplifier (VGA). The control over those signals is performed in our digital baseband processor using dedicated hardware block DAGC. DAGC in this process is used to automatically control the VGA and LNA in order to achieve better signal-to-noise ratio, decrease FER (Frame Error Rate) and hold the average power of the baseband signal close to the desired set point. DAGC function in baseband processor is done in few steps: measuring power levels of baseband samples of an RF signal,accumulating the differences between the measured power level and actual gain setting, adjusting a gain factor of the accumulation, and applying the adjusted gain factor the baseband values. Based on the measurement results of RSSI signal dependence to input power we have concluded that this digital AGC can be implemented applying the simple linearization of the RSSI. This solution is very simple but also effective and reduces complexity and power consumption of the DAGC. This DAGC is implemented and tested both in FPGA and in ASIC as a part of our WLAN baseband processor. Finally, we have integrated this circuit in a compact WLAN PCMCIA board based on MAC and baseband ASIC chips designed from us.

Diagnosing Dangerous Arrhythmia of Patients by Automatic Detecting of QRS Complexes in ECG

In this paper, an automatic detecting algorithm for QRS complex detecting was applied for analyzing ECG recordings and five criteria for dangerous arrhythmia diagnosing are applied for a protocol type of automatic arrhythmia diagnosing system. The automatic detecting algorithm applied in this paper detected the distribution of QRS complexes in ECG recordings and related information, such as heart rate and RR interval. In this investigation, twenty sampled ECG recordings of patients with different pathologic conditions were collected for off-line analysis. A combinative application of four digital filters for bettering ECG signals and promoting detecting rate for QRS complex was proposed as pre-processing. Both of hardware filters and digital filters were applied to eliminate different types of noises mixed with ECG recordings. Then, an automatic detecting algorithm of QRS complex was applied for verifying the distribution of QRS complex. Finally, the quantitative clinic criteria for diagnosing arrhythmia were programmed in a practical application for automatic arrhythmia diagnosing as a post-processor. The results of diagnoses by automatic dangerous arrhythmia diagnosing were compared with the results of off-line diagnoses by experienced clinic physicians. The results of comparison showed the application of automatic dangerous arrhythmia diagnosis performed a matching rate of 95% compared with an experienced physician-s diagnoses.

QSI Dynamical Fetch Policy for SMT

A Simultaneous Multithreading (SMT) Processor is capable of executing instructions from multiple threads in the same cycle. SMT in fact was introduced as a powerful architecture to superscalar to increase the throughput of the processor. Simultaneous Multithreading is a technique that permits multiple instructions from multiple independent applications or threads to compete limited resources each cycle. While the fetch unit has been identified as one of the major bottlenecks of SMT architecture, several fetch schemes were proposed by prior works to enhance the fetching efficiency and overall performance. In this paper, we propose a novel fetch policy called queue situation identifier (QSI) which counts some kind of long latency instructions of each thread each cycle then properly selects which threads to fetch next cycle. Simulation results show that in best case our fetch policy can achieve 30% on speedup and also can reduce the data cache level 1 miss rate.

Parallel Direct Integration Variable Step Block Method for Solving Large System of Higher Order Ordinary Differential Equations

The aim of this paper is to investigate the performance of the developed two point block method designed for two processors for solving directly non stiff large systems of higher order ordinary differential equations (ODEs). The method calculates the numerical solution at two points simultaneously and produces two new equally spaced solution values within a block and it is possible to assign the computational tasks at each time step to a single processor. The algorithm of the method was developed in C language and the parallel computation was done on a parallel shared memory environment. Numerical results are given to compare the efficiency of the developed method to the sequential timing. For large problems, the parallel implementation produced 1.95 speed-up and 98% efficiency for the two processors.

Optimization of SAD Algorithm on VLIW DSP

SAD (Sum of Absolute Difference) algorithm is heavily used in motion estimation which is computationally highly demanding process in motion picture encoding. To enhance the performance of motion picture encoding on a VLIW processor, an efficient implementation of SAD algorithm on the VLIW processor is essential. SAD algorithm is programmed as a nested loop with a conditional branch. In VLIW processors, loop is usually optimized by software pipelining, but researches on optimal scheduling of software pipelining for nested loops, especially nested loops with conditional branches are rare. In this paper, we propose an optimal scheduling and implementation of SAD algorithm with conditional branch on a VLIW DSP processor. The proposed optimal scheduling first transforms the nested loop with conditional branch into a single loop with conditional branch with consideration of full utilization of ILP capability of the VLIW processor and realization of earlier escape from the loop. Next, the proposed optimal scheduling applies a modulo scheduling technique developed for single loop. Based on this optimal scheduling strategy, optimal implementation of SAD algorithm on TMS320C67x, a VLIW DSP is presented. Through experiments on TMS320C6713 DSK, it is shown that H.263 encoder with the proposed SAD implementation performs better than other H.263 encoder with other SAD implementations, and that the code size of the optimal SAD implementation is small enough to be appropriate for embedded environments.

Sensorless Commutation Control of Switched Reluctance Motor

This paper addresses control of commutation of switched reluctance (SR) motor without the use of a physical position detector. Rotor position detection schemes for SR motor based on magnetisation characteristics of the motor use normal excitation or applied current /voltage pulses. The resulting schemes are referred to as passive or active methods respectively. The research effort is in realizing an economical sensorless SR rotor position detector that is accurate, reliable and robust to suit a particular application. An effective and reliable means of generating commutation signals of an SR motor based on inductance profile of its stator windings determined using active probing technique is presented. The scheme has been validated online using a 4-phase 8/6 SR motor and an 8-bit processor.

Performance Evaluation of a Limited Round-Robin System

Performance of a limited Round-Robin (RR) rule is studied in order to clarify the characteristics of a realistic sharing model of a processor. Under the limited RR rule, the processor allocates to each request a fixed amount of time, called a quantum, in a fixed order. The sum of the requests being allocated these quanta is kept below a fixed value. Arriving requests that cannot be allocated quanta because of such a restriction are queued or rejected. Practical performance measures, such as the relationship between the mean sojourn time, the mean number of requests, or the loss probability and the quantum size are evaluated via simulation. In the evaluation, the requested service time of an arriving request is converted into a quantum number. One of these quanta is included in an RR cycle, which means a series of quanta allocated to each request in a fixed order. The service time of the arriving request can be evaluated using the number of RR cycles required to complete the service, the number of requests receiving service, and the quantum size. Then an increase or decrease in the number of quanta that are necessary before service is completed is reevaluated at the arrival or departure of other requests. Tracking these events and calculations enables us to analyze the performance of our limited RR rule. In particular, we obtain the most suitable quantum size, which minimizes the mean sojourn time, for the case in which the switching time for each quantum is considered.

A Novel Implementation of Application Specific Instruction-set Processor (ASIP) using Verilog

The general purpose processors that are used in embedded systems must support constraints like execution time, power consumption, code size and so on. On the other hand an Application Specific Instruction-set Processor (ASIP) has advantages in terms of power consumption, performance and flexibility. In this paper, a 16-bit Application Specific Instruction-set processor for the sensor data transfer is proposed. The designed processor architecture consists of on-chip transmitter and receiver modules along with the processing and controlling units to enable the data transmission and reception on a single die. The data transfer is accomplished with less number of instructions as compared with the general purpose processor. The ASIP core operates at a maximum clock frequency of 1.132GHz with a delay of 0.883ns and consumes 569.63mW power at an operating voltage of 1.2V. The ASIP is implemented in Verilog HDL using the Xilinx platform on Virtex4.