Enhanced Planar Pattern Tracking for an Outdoor Augmented Reality System

In this paper, a scalable augmented reality framework for handheld devices is presented. The framework is built on a server-client data communication structure, in which the search for tracking targets among a database of images is performed on the server side while pixel-wise 3D tracking is performed on the client side, which, in this case, is a handheld mobile device. Image search on the server side adopts a residual-enhanced image descriptor representation that makes the framework scalable. The tracking algorithm on the client side is based on a gravity-aligned feature descriptor, which takes advantage of a sensor-equipped mobile device, and an optimized intensity-based image alignment approach that ensures the accuracy of 3D tracking. Automatic content streaming is achieved by using a key-frame selection algorithm, client working phase monitoring and standardized rules for content communication between the server and client. A recognition accuracy test performed on a standard dataset shows that the method adopted in the presented framework outperforms the Bag-of-Words (BoW) method that has been used in some previous systems. An experimental test conducted on a set of video sequences indicates real-time performance of the tracking system, with a frame rate of 15-30 frames per second. The presented framework is shown to be functional in practical situations with a demonstration application on a campus walk-around.

High Level Synthesis of Canny Edge Detection Algorithm on Zynq Platform

Real-time image and video processing is in demand in many computer vision applications, e.g. video surveillance, traffic management and medical imaging. Processing such video requires high computational power, so the optimal solution is a collaboration between the CPU and hardware accelerators. In this paper, a Canny edge detection hardware accelerator is proposed. Edge detection is one of the basic building blocks of video and image processing applications and is a common block in the pre-processing phase of an image and video processing pipeline. The presented approach offloads the Canny edge detection algorithm from the processing system (PS) to the programmable logic (PL), taking advantage of the High Level Synthesis (HLS) tool flow to accelerate the implementation on the Zynq platform. The resulting implementation enables up to a 100x performance improvement through hardware acceleration. CPU utilization drops and the frame rate rises to 60 fps for a 1080p full HD input video stream.

Indicator of Small Calcification Detection in Ultrasonography using Decorrelation of Forward Scattered Waves

To improve the ability to detect small calcifications using ultrasonography (US), we propose a novel indicator of calcifications in an ultrasound B-mode image that does not decrease the frame rate. Since the waveform of an ultrasound pulse changes at a calcification position, decorrelation of adjacent scan lines occurs behind a calcification. Therefore, we employ the decorrelation of adjacent scan lines as an indicator of a calcification. The proposed indicator depicted wires 0.05 mm in diameter at 2 cm depth, which were hardly detected in ultrasound B-mode images, with a sensitivity of 86.7% and a specificity of 100%. This study shows the potential of the proposed indicator to bring the calcification size detectable with a US device close to that of an X-ray imager, implying the possibility that a US device will become a convenient, safe, and principal clinical tool for breast cancer screening.
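
As a minimal sketch of using adjacent scan-line decorrelation as an indicator, the Python below computes one minus the normalized correlation coefficient between neighboring scan lines in short depth windows. The function name `decorrelation_map`, the window length, and the choice to work on a plain 2D array of samples are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def decorrelation_map(lines, window=32):
    """Return 1 - rho between adjacent scan lines, per depth window.

    lines  : 2D array, shape (depth_samples, scan_lines) (assumed layout)
    window : number of depth samples per window (assumed value)
    """
    lines = np.asarray(lines, dtype=float)
    n_samples, n_lines = lines.shape
    n_win = n_samples // window
    out = np.zeros((n_win, n_lines - 1))
    for w in range(n_win):
        seg = lines[w * window:(w + 1) * window, :]
        seg = seg - seg.mean(axis=0, keepdims=True)   # remove the local mean
        a, b = seg[:, :-1], seg[:, 1:]                # each line and its right neighbor
        num = (a * b).sum(axis=0)
        den = np.sqrt((a * a).sum(axis=0) * (b * b).sum(axis=0))
        # Windows with no signal energy are treated as fully correlated.
        rho = np.divide(num, den, out=np.ones_like(num), where=den > 0)
        out[w] = 1.0 - rho
    return out
```

Values near 0 mean that neighboring lines look alike in that window; larger values mark windows where the waveform differs between neighbors, which is the behavior the abstract describes behind a calcification.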

Multi-Element Synthetic Transmit Aperture Method in Medical Ultrasound Imaging

The paper presents the multi-element synthetic transmit aperture (MSTA) method, with a small number of elements transmitting and all elements receiving, in medical ultrasound imaging. Compared to other methods, MSTA allows the system frame rate to be increased and provides the best compromise between penetration depth and lateral resolution. In the experiments, a 128-element linear transducer array with 0.3 mm pitch, excited by a burst pulse of 125 ns duration, was used. A comparison of 2D ultrasound images of a tissue-mimicking phantom obtained using the STA and the MSTA methods is presented to demonstrate the benefits of the second approach. The results were obtained using the synthetic aperture (SA) algorithm with correction of the transmit and receive signals based on a single-element directivity function.

Synthetic Transmit Aperture Method in Medical Ultrasonic Imaging

The work describes the use of a synthetic transmit aperture (STA), with a single element transmitting and all elements receiving, in medical ultrasound imaging. The STA technique is a novel alternative to today's commercial systems, in which an image is acquired sequentially, one image line at a time, which puts a strict limit on the frame rate and the amount of data needed for high image quality. STA imaging acquires data simultaneously from all directions over a number of emissions, from which the full image can be reconstructed. In the experiments, a 32-element linear transducer array with 0.48 mm inter-element spacing was used. A single-element transmit aperture was used to generate a spherical wave covering the full image region. 2D ultrasound images of a wire phantom obtained using the STA method and the commercial ultrasound scanner Antares are presented to demonstrate the benefits of synthetic aperture (SA) imaging.

A Vehicular Visual Tracking System Incorporating Global Positioning System

Surveillance systems are widely used in traffic monitoring, and the deployment of cameras is moving toward a ubiquitous camera (UbiCam) environment. In our previous study, a novel service called GPS-VT was first proposed, incorporating global positioning system (GPS) and visual tracking techniques for the UbiCam environment. The first prototype is called GODTA (GPS-based Moving Object Detection and Tracking Approach). A moving person carrying a GPS-enabled mobile device can be tracked, according to his real-time GPS coordinates, when he enters the field of view (FOV) of a camera. In this paper, the GPS-VT service is applied to the tracking of vehicles. A vehicle moves much faster than a person, so the time it takes to pass through the FOV is much shorter. In addition, the GPS coordinates are updated once per second, which is asynchronous with the frame rate of the real-time image, and this asynchrony is worsened by network transmission delay. These factors are the main challenges in providing the GPS-VT service for a vehicle. To overcome their influence, a back-propagation neural network (BPNN) is used to predict the likely lane before the vehicle enters the FOV of a camera. Then, a template matching technique is used for visual tracking of the target vehicle. The experimental results show that the target vehicle can be located and tracked successfully. The location success rate of the implemented prototype is higher than that of the previous GODTA.

Estimation of Attenuation and Phase Delay in Driving Voltage Waveform of a Digital-Noiseless, Ultra-High-Speed Image Sensor

Since 2004, we have been developing an in-situ storage image sensor (ISIS) that captures more than 100 consecutive images at a frame rate of 10 Mfps with ultra-high sensitivity, as well as a video camera for use with this ISIS. Currently, basic research is continuing in an attempt to increase the frame rate to 100 Mfps and above. To suppress electromagnetic noise at such high frequencies, a digital-noiseless imaging transfer scheme has been developed that uses solely sinusoidal driving voltages. This paper presents efficient yet accurate expressions to estimate the attenuation and phase delay of driving voltages through the RC networks of an ultra-high-speed image sensor. The Elmore metric for a fundamental RC chain is employed as the first-order approximation. By applying dimensional analysis to SPICE data, we found a simple expression that significantly improves the accuracy of the approximation. Similarly, another simple closed-form model to estimate the phase delay through fundamental RC networks is obtained. The estimation error of both expressions is much smaller than in previous works, less than 2% in most cases. The framework of this analysis can be extended to address similar issues in other VLSI structures.
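
To make the first-order step concrete, the sketch below computes the Elmore delay at the far end of an RC chain and then, as a loudly stated assumption, treats the chain as a single-pole low-pass filter with that delay as its time constant to estimate the attenuation and phase delay of a sinusoidal drive. The improved expressions derived in the paper are not reproduced here, and the names `elmore_delay` and `single_pole_response` are hypothetical.

```python
import math

def elmore_delay(resistances, capacitances):
    """Elmore delay at the last node of a series RC chain.

    Node i is reached through resistances[i] and carries capacitances[i].
    Each resistor is weighted by all the capacitance downstream of it.
    """
    total = 0.0
    downstream_c = sum(capacitances)
    for r, c in zip(resistances, capacitances):
        total += r * downstream_c
        downstream_c -= c
    return total

def single_pole_response(tau, freq_hz):
    """Gain and phase (radians) of an assumed single-pole model 1 / (1 + j*w*tau)."""
    w_tau = 2.0 * math.pi * freq_hz * tau
    gain = 1.0 / math.sqrt(1.0 + w_tau ** 2)
    phase = -math.atan(w_tau)
    return gain, phase
```

For a uniform chain of N identical sections, `elmore_delay` reduces to R*C*N*(N+1)/2, which is a quick sanity check on the bookkeeping.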

Practical Issues for Real-Time Video Tracking

In this paper we present an algorithm that enables object tracking close to real time in Full HD videos. The frame rate (FR) of a video stream is considered to be between 5 and 30 frames per second, so real-time track building is achieved if the algorithm can process 5 or more frames per second. The principal idea is to use fast algorithms during preprocessing to obtain the key points and then track them. The cost of matching points during assignment depends heavily on the number of points. Because of this, we have to limit the number of points, keeping the most informative of them.

Modeling of Statistically Multiplexed Non Uniform Activity VBR Video

This paper reports the feasibility of the ARMA model to describe a bursty video source transmitting over an AAL5 ATM link (VBR traffic). The traffic represents the activity of the action movie "Lethal Weapon 3" transmitted over the ATM network using the Fore System AVA-200 ATM video codec with a peak rate of 100 Mbps and a frame rate of 25. The model parameters were estimated for a single video source and for independently multiplexed video sources. It was found that the ARMA(2, 4) model is well suited to the real data in terms of average rate traffic profile, probability density function, autocorrelation function, burstiness measure, and the pole-zero distribution of the filter model.
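
For readers unfamiliar with the model order, an ARMA(2, 4) process combines two autoregressive terms with four moving-average terms. The sketch below generates such a series from a white-noise input. The coefficient values and the mean rate are hypothetical placeholders, not the parameters estimated in the paper, and `simulate_arma` is an illustrative name.

```python
import numpy as np

def simulate_arma(ar, ma, noise):
    """x[t] = sum_i ar[i-1]*x[t-i] + noise[t] + sum_j ma[j-1]*noise[t-j]."""
    noise = np.asarray(noise, dtype=float)
    x = np.zeros_like(noise)
    for t in range(len(noise)):
        value = noise[t]
        for i, a in enumerate(ar, start=1):      # autoregressive part
            if t - i >= 0:
                value += a * x[t - i]
        for j, b in enumerate(ma, start=1):      # moving-average part
            if t - j >= 0:
                value += b * noise[t - j]
        x[t] = value
    return x

# Hypothetical ARMA(2, 4) coefficients chosen only to give a stable series.
ar_coeffs = [0.6, 0.2]
ma_coeffs = [0.4, 0.3, 0.2, 0.1]
rng = np.random.default_rng(0)
rate_trace = 10.0 + simulate_arma(ar_coeffs, ma_coeffs, rng.normal(size=1000))
```

Statistics such as the autocorrelation function of `rate_trace` could then be compared against a measured trace, which is the kind of comparison the abstract lists.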

Bridging the Gap Between CBR and VBR for H264 Standard

This paper provides a flexible way of controlling the variable bit rate (VBR) of compressed digital video, applicable to the new H264 video compression standard. The entire video sequence is assessed in advance and the quantisation level is then set such that the bit rate (and thus the frame rate) remains within predetermined limits compatible with the bandwidth of the transmission system and the capabilities of the remote end, while at the same time providing constant quality similar to VBR encoding. A process for avoiding buffer starvation by selectively eliminating frames from the encoded output at times when the frame rate is slow (large number of bits per frame) is also described. Finally, the problem of buffer overflow is solved by selectively eliminating frames from the received input to the decoder. The decoder detects the omission of frames and resynchronizes the transmission by monitoring time stamps and repeating frames if necessary.

Content and Resources based Mobile and Wireless Video Transcoding

Delivering streaming video over wireless links is an important component of many interactive multimedia applications running on personal wireless handset devices. Such devices have to be inexpensive, compact, and lightweight, but wireless channels have a high bit error rate and limited bandwidth. Packet delay variation due to network congestion and the high bit error rate greatly degrade the quality of video at the handheld device. Mobile access to multimedia content therefore requires video transcoding functionality at the edge of the mobile network for interworking with heterogeneous networks and services, and a robust and efficient transcoding scheme should be deployed in the mobile multimedia transport network to guarantee the quality of service (QoS) delivered to the mobile user. This paper examines the challenges and limitations faced by video transcoding schemes in mobile multimedia transport networks. A mobile and wireless video transcoding scheme based on handheld resources, network conditions and content is then proposed to provide high-QoS applications. The experimental results demonstrate exceptional performance. Extensive experiments, designed to verify the robustness of the proposed approach, have been conducted, and results for various video clips with different bit rates and frame rates are provided.

FPGA Implement of a Vision Based Lane Departure Warning System

Vision-based solutions in intelligent vehicle applications often need large memory to handle video streams and image processing, which increases the complexity of hardware and software. In this paper, we present an FPGA implementation of a vision-based lane departure warning system. From each video frame, the line gradient is estimated and the lane marks are found. By analyzing the position of the lane marks, departure of the vehicle is detected in time. This idea has been implemented in a Xilinx Spartan6 FPGA. The lane departure warning system uses 39% of the logic resources and none of the memory of the device. The average availability is 92.5%. The frame rate is more than 30 frames per second (fps).

GPU-Based Volume Rendering for Medical Imagery

We present a method for fast volume rendering using graphics hardware (GPU). To our knowledge, it is the first implementation on the GPU. Based on the Shear-Warp algorithm, our GPU-based method provides real-time frame rates and outperforms the CPU-based implementation. When the number of slices is not sufficient, we add in-between slices computed by interpolation, which improves the quality of the rendered images. We have also implemented the ray marching algorithm on the GPU. The results generated by the three algorithms (CPU-based and GPU-based Shear-Warp, GPU-based Ray Marching) for two test models show that the ray marching algorithm outperforms the shear-warp methods in terms of speed-up and image quality.

Optimum Signal-to-noise Ratio Performance of Electron Multiplying Charge Coupled Devices

Electron multiplying charge coupled devices (EMCCDs) have revolutionized the world of low-light imaging by introducing on-chip multiplication gain based on the impact ionization effect in silicon. They combine sub-electron readout noise with high frame rates. Signal-to-noise ratio (SNR) is an important performance parameter for low-light-level imaging systems. This work investigates the SNR performance of an EMCCD operated in Non-inverted Mode (NIMO) and Inverted Mode (IMO). The theory of noise characteristics and operation modes is presented. The results show that the SNR is determined by dark current and clock-induced charge at high gain levels. The optimum SNR performance is provided by an EMCCD operated in NIMO in short-exposure and strong-cooling applications. In contrast, an IMO EMCCD is preferable.

FPGA based Relative Distance Measurement using Stereo Vision Technology

In this paper, we propose a novel concept of relative distance measurement using stereo vision technology and discuss its implementation on an FPGA-based real-time image processor. We capture two images using two CCD cameras and compare them. Disparity is calculated for each pixel using a real-time dense disparity calculation algorithm, which is based on the concept of an indexed histogram for matching. Since disparity is inversely proportional to distance (proved later), we can obtain the relative distances of objects in front of the cameras. The output is displayed on a TV screen in the form of a depth image (optionally using pseudo colors). The system works in real time at the full PAL frame rate (720 x 576 active pixels @ 25 fps).
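
The inverse relation between disparity and distance can be illustrated with the standard rectified-stereo formula Z = f * B / d, where f is the focal length in pixels, B the camera baseline and d the disparity in pixels. The sketch below applies it to a disparity map. The function name and the example focal length and baseline are assumptions for illustration and do not come from the paper.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Convert disparity (pixels) to depth (meters) for a rectified stereo pair.

    Pixels with zero or negative disparity have no valid match and are
    reported as infinitely far away.
    """
    d = np.asarray(disparity_px, dtype=float)
    depth = np.full(d.shape, np.inf)
    valid = d > 0
    depth[valid] = focal_px * baseline_m / d[valid]
    return depth

# Hypothetical camera parameters: 500 px focal length, 0.1 m baseline.
example_depth = depth_from_disparity([10.0, 20.0, 0.0], focal_px=500.0, baseline_m=0.1)
```

Doubling the disparity halves the estimated distance, so even without calibrated values of f and B the disparity map alone orders objects by relative distance.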

Retrospective Synthetic Focusing with Correlation Weighting for Very High Frame Rate Ultrasound

The need for high frame-rate imaging has been triggered by new applications of ultrasound imaging to transient elastography and real-time 3D ultrasound. Plane wave excitation (PWE) is one of the methods to achieve very high frame-rate imaging, since an image can be formed with a single insonification. However, due to the lack of transmit focusing, the image quality with PWE is lower than that obtained with conventional focused transmission. To solve this problem, we propose a filter-retrieved transmit focusing (FRF) technique combined with cross-correlation weighting (FRF+CC weighting) for high frame-rate imaging with PWE. A retrospective focusing filter is designed to simultaneously minimize the predefined sidelobe energy associated with a single PWE and the filter energy related to the signal-to-noise ratio (SNR). This filter attempts to maintain the mainlobe signals and to reduce the sidelobe ones, which gives similar mainlobe signals but different sidelobes between the original PWE and the FRF baseband data. The normalized cross-correlation coefficient at zero lag is calculated to quantify the degree of similarity at each imaging point and is used as a weighting matrix on the FRF baseband data to further suppress sidelobes, thus improving the filter-retrieved focusing quality.
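
The weighting step can be sketched as follows: for each imaging point, a normalized cross-correlation coefficient at zero lag is computed between the original PWE baseband data and the FRF baseband data, and the result multiplies the FRF data. The local axial window used to estimate the coefficient, the use of the coefficient's magnitude, and the function names are assumptions made for this illustration; the abstract does not specify them.

```python
import numpy as np

def _moving_sum(x, window):
    """Sliding sum along the depth axis (axis 0), same output shape as x."""
    kernel = np.ones(window)
    return np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="same"), 0, x)

def cc_weights(pwe, frf, window=9):
    """Magnitude of the zero-lag normalized cross-correlation per point."""
    pwe = np.asarray(pwe, dtype=complex)
    frf = np.asarray(frf, dtype=complex)
    cross = _moving_sum(pwe * np.conj(frf), window)
    energy_pwe = _moving_sum(np.abs(pwe) ** 2, window)
    energy_frf = _moving_sum(np.abs(frf) ** 2, window)
    den = np.sqrt(energy_pwe * energy_frf)
    # Points with no signal energy receive zero weight.
    return np.divide(np.abs(cross), den, out=np.zeros_like(den), where=den > 0)

def apply_cc_weighting(pwe, frf, window=9):
    """Weight the FRF baseband data by its local similarity to the PWE data."""
    return cc_weights(pwe, frf, window) * np.asarray(frf, dtype=complex)
```

Where the two data sets agree (mainlobe), the weight is close to 1 and the signal is kept; where they differ (sidelobes), the weight is smaller and the signal is attenuated.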

FPGA Implementation of a Vision-Based Blind Spot Warning System

Vision-based intelligent vehicle applications often require large amounts of memory to handle video streaming and image processing, which in turn increases the complexity of hardware and software. This paper presents an FPGA implementation of a vision-based blind spot warning system. From the video frames, the information in the blind spot area is reduced to one-dimensional information. Analysis of the estimated image entropy allows an object to be detected in time. This idea has been implemented in the XtremeDSP video starter kit. The blind spot warning system uses only 13% of its logic resources and 95k bits of block memory, and its frame rate is over 30 frames per second (fps).
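
As a rough illustration of the two ideas named above, the sketch below reduces a grayscale region of interest to a one-dimensional profile and estimates its Shannon entropy from a histogram. The column-mean projection, the 8-bit value range, the bin count and both function names are assumptions made for this example; the abstract does not say how the paper builds its one-dimensional signal or estimates entropy.

```python
import numpy as np

def column_profile(gray_roi):
    """Collapse a 2D grayscale region into a 1D profile (assumed: column means)."""
    return np.asarray(gray_roi, dtype=float).mean(axis=0)

def shannon_entropy(values, bins=256, value_range=(0, 256)):
    """Shannon entropy in bits of the histogram of `values`."""
    counts, _ = np.histogram(values, bins=bins, range=value_range)
    total = counts.sum()
    if total == 0:
        return 0.0  # nothing fell inside the histogram range
    p = counts[counts > 0] / total
    return float(-(p * np.log2(p)).sum())
```

A uniform road surface yields a profile with low entropy, while an object entering the region spreads the histogram and raises it, so a threshold on the entropy (to be tuned on real data) could serve as a simple detector.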

Transmit Sub-aperture Optimization in MSTA Ultrasound Imaging Method

The paper presents the optimization problem for the multi-element synthetic transmit aperture (MSTA) method in ultrasound imaging applications. The optimal transmit aperture size is chosen as a trade-off between lateral resolution, penetration depth and frame rate. Results of the analysis obtained by a developed optimization algorithm are presented. Maximum penetration depth and the best lateral resolution at given depths are chosen as the optimization criteria. The optimization algorithm was tested using synthetic aperture data of point reflectors simulated with the Field II program for Matlab® for the case of a 5 MHz 128-element linear transducer array with 0.48 mm pitch. Visualizations of experimentally obtained synthetic aperture data of a tissue-mimicking phantom and of in vitro measurements of beef liver are also shown. The data were obtained using the SonixTOUCH Research system equipped with a linear 4 MHz 128-element transducer with 0.3 mm element pitch, 0.28 mm element width and 70% fractional bandwidth, which was excited by a one-sine-cycle pulse burst at the transducer's center frequency.