Abstract: In this paper, a scalable augmented reality framework for handheld devices is presented. The framework is built on a server-client data communication structure, in which the search for tracking targets in a database of images is performed on the server side while pixel-wise 3D tracking is performed on the client side, in this case a handheld mobile device. Image search on the server side adopts a residual-enhanced image descriptor representation that gives the framework its scalability. The tracking algorithm on the client side is based on a gravity-aligned feature descriptor, which takes advantage of the sensors of the mobile device, and an optimized intensity-based image alignment approach that ensures the accuracy of 3D tracking. Automatic content streaming is achieved through a key-frame selection algorithm, monitoring of the client's working phase, and standardized rules for content communication between server and client. A recognition accuracy test on a standard dataset shows that the method adopted in the presented framework outperforms the Bag-of-Words (BoW) method used in several previous systems. Experiments on a set of video sequences indicate real-time performance of the tracking system, with a frame rate of 15-30 frames per second. The framework is shown to be functional in practical situations with a demonstration application on a campus walk-around.
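The residual-enhanced descriptor representation used for server-side image search is in the spirit of VLAD-style residual aggregation; a minimal sketch under that assumption (the codebook of centroids is taken as given, and this is not the paper's exact formulation):

```python
import numpy as np

def residual_aggregate(descriptors, centroids):
    """VLAD-style aggregation: assign each local descriptor to its
    nearest centroid, accumulate descriptor-minus-centroid residuals
    per centroid, and L2-normalize the concatenated result."""
    k, d = centroids.shape
    v = np.zeros((k, d))
    dists = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    for i, c in enumerate(np.argmin(dists, axis=1)):
        v[c] += descriptors[i] - centroids[c]
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v
```

Comparing the normalized vectors with a dot product then gives a scalable nearest-image search over the database.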
Abstract: Real-time image and video processing is required in
many computer vision applications, e.g. video surveillance, traffic
management and medical imaging. These video applications demand
high computational power, so a practical solution is the collaboration
of the CPU with hardware accelerators. In this paper, a Canny edge
detection hardware accelerator is proposed. Edge detection is one of
the basic building blocks of video and image processing applications
and a common block in the pre-processing phase of image and video
processing pipelines. Our approach offloads the Canny edge
detection algorithm from the processing system (PS) to the
programmable logic (PL), taking advantage of the High Level
Synthesis (HLS) tool flow to accelerate the implementation on the
Zynq platform. The resulting implementation enables up to a 100x
performance improvement through hardware acceleration: CPU
utilization drops and the frame rate reaches 60 fps for a 1080p full
HD input video stream.
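For reference, the algorithm being offloaded can be sketched in software. A minimal NumPy/SciPy Canny (Gaussian smoothing, Sobel gradients, non-maximum suppression, double-threshold hysteresis); thresholds here are relative to the maximum gradient, an assumption of this sketch, not the HLS implementation:

```python
import numpy as np
from scipy import ndimage

def canny(img, sigma=1.0, low=0.1, high=0.3):
    """Minimal Canny edge detector; low/high are fractions of the
    maximum gradient magnitude."""
    smoothed = ndimage.gaussian_filter(img.astype(float), sigma)
    gx = ndimage.sobel(smoothed, axis=1)
    gy = ndimage.sobel(smoothed, axis=0)
    mag = np.hypot(gx, gy)
    if mag.max() > 0:
        mag = mag / mag.max()
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180
    # Non-maximum suppression along the quantized gradient direction.
    nms = np.zeros_like(mag)
    H, W = mag.shape
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            a = angle[i, j]
            if a < 22.5 or a >= 157.5:        # horizontal gradient
                n1, n2 = mag[i, j - 1], mag[i, j + 1]
            elif a < 67.5:                     # 45 degrees
                n1, n2 = mag[i - 1, j + 1], mag[i + 1, j - 1]
            elif a < 112.5:                    # vertical gradient
                n1, n2 = mag[i - 1, j], mag[i + 1, j]
            else:                              # 135 degrees
                n1, n2 = mag[i - 1, j - 1], mag[i + 1, j + 1]
            if mag[i, j] >= n1 and mag[i, j] >= n2:
                nms[i, j] = mag[i, j]
    # Hysteresis: keep weak pixels connected to a strong pixel.
    strong = nms >= high
    weak = nms >= low
    labels, _ = ndimage.label(weak)
    keep = np.unique(labels[strong])
    return np.isin(labels, keep[keep > 0])
```

The nested pixel loops are exactly the part that maps well to a streaming PL pipeline, which is why offloading this stage pays off.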
Abstract: To improve the ability to detect small calcifications
using ultrasonography (US), we propose a novel indicator of
calcifications in an ultrasound B-mode image that requires no
decrease in frame rate. Since the waveform of an ultrasound
pulse changes at a calcification, decorrelation of adjacent scan
lines occurs behind the calcification. We therefore employ the
decorrelation of adjacent scan lines as an indicator of a
calcification. The proposed indicator depicted wires 0.05 mm in
diameter at 2 cm depth, which are hardly detectable in ultrasound
B-mode images, with a sensitivity of 86.7% and a specificity of
100%. This study shows the potential of the proposed indicator
to bring the detectable calcification size of a US device close to
that of an X-ray imager, implying that a US device may become
a convenient, safe, and principal clinical tool for the screening
of breast cancer.
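The decorrelation indicator can be sketched as a windowed correlation between adjacent RF scan lines; a simplified illustration (the window length and the processing chain are assumptions of this sketch, not the authors' implementation):

```python
import numpy as np

def decorrelation_map(rf, win=16):
    """1 - normalized correlation between adjacent scan lines of an RF
    frame (rows = depth samples, columns = scan lines), computed in a
    sliding axial window; high values flag decorrelated regions, e.g.
    behind a calcification."""
    n_samples, n_lines = rf.shape
    half = win // 2
    out = np.zeros((n_samples, n_lines - 1))
    for j in range(n_lines - 1):
        a, b = rf[:, j].astype(float), rf[:, j + 1].astype(float)
        for i in range(half, n_samples - half):
            wa = a[i - half:i + half] - a[i - half:i + half].mean()
            wb = b[i - half:i + half] - b[i - half:i + half].mean()
            denom = np.sqrt((wa ** 2).sum() * (wb ** 2).sum())
            if denom > 0:
                out[i, j] = 1.0 - (wa * wb).sum() / denom
    return out
```

Since the map is computed from the same RF data used for B-mode imaging, no extra transmissions are needed, which is why the frame rate is preserved.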
Abstract: The paper presents the multi-element synthetic
transmit aperture (MSTA) method, with a small number of elements
transmitting and all elements receiving, in medical ultrasound
imaging. Compared to other methods, MSTA increases the system
frame rate and provides the best compromise between penetration
depth and lateral resolution.
In the experiments, a 128-element linear transducer array with
0.3 mm pitch excited by a burst pulse of 125 ns duration was used.
A comparison of 2D ultrasound images of a tissue-mimicking
phantom obtained using the STA and MSTA methods is presented
to demonstrate the benefits of the latter approach. The results
were obtained using an SA algorithm with transmit and receive
signal correction based on a single-element directivity function.
Abstract: The work describes the use of a synthetic transmit
aperture (STA) with a single element transmitting and all elements
receiving in medical ultrasound imaging. The STA technique is a
novel alternative to today's commercial systems, in which an image
is acquired sequentially one image line at a time, placing a strict
limit on the frame rate and on the amount of data needed for high
image quality. STA imaging acquires data simultaneously from all
directions over a number of emissions, and the full image can then
be reconstructed.
In the experiments, a 32-element linear transducer array with
0.48 mm inter-element spacing was used. A single-element transmit
aperture was used to generate a spherical wave covering the full
image region. 2D ultrasound images of a wire phantom obtained
using STA and a commercial Antares ultrasound scanner are
presented to demonstrate the benefits of SA imaging.
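The reconstruction at the heart of synthetic aperture imaging is delay-and-sum over all transmit/receive element pairs; a bare-bones sketch for a single image point (idealized delta-pulse RF data, nearest-neighbour sampling, no apodization; the geometry and sampling parameters are illustrative assumptions):

```python
import numpy as np

def sta_das(rf, x_elem, fs, c, xi, zi):
    """Delay-and-sum synthetic-aperture value at image point (xi, zi).

    rf[tx, rx, :] holds the echo recorded by element rx after a
    spherical-wave emission from element tx; for each pair, the sample
    at the round-trip time of flight is summed."""
    n_tx, n_rx, n_samp = rf.shape
    val = 0.0
    for tx in range(n_tx):
        d_tx = np.hypot(xi - x_elem[tx], zi)       # emitter -> point
        for rx in range(n_rx):
            d_rx = np.hypot(xi - x_elem[rx], zi)   # point -> receiver
            idx = int(round((d_tx + d_rx) / c * fs))
            if 0 <= idx < n_samp:
                val += rf[tx, rx, idx]
    return val
```

Because all tx/rx pairs contribute coherently only at the true scatterer position, the sum focuses synthetically in both transmit and receive, which is what frees the frame rate from line-by-line acquisition.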
Abstract: Surveillance systems are widely used in traffic
monitoring, and the deployment of cameras is moving toward a
ubiquitous camera (UbiCam) environment. In our previous study, a
novel service called GPS-VT was first proposed, incorporating
global positioning system (GPS) and visual tracking techniques for
the UbiCam environment. The first prototype was called GODTA
(GPS-based Moving Object Detection and Tracking Approach): a
moving person carrying a GPS-enabled mobile device can be
tracked once he enters the field-of-view (FOV) of a camera,
according to his real-time GPS coordinate. In this paper, the
GPS-VT service is applied to the tracking of vehicles. A vehicle
moves much faster than a person, so the time it spends passing
through the FOV is much shorter. Besides, the GPS coordinate is
updated only once per second, asynchronously with the frame rate
of the real-time image, and this asynchrony is worsened by network
transmission delay. These factors are the main challenges in
providing the GPS-VT service for vehicles. To overcome them, a
back-propagation neural network (BPNN) is used to predict the
likely lane before the vehicle enters the FOV of a camera, and a
template matching technique is then used for visual tracking of the
target vehicle. The experimental results show that the target
vehicle can be located and tracked successfully, and the location
success rate of the implemented prototype is higher than that of
the previous GODTA.
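The visual-tracking stage relies on template matching; a minimal sum-of-squared-differences (SSD) matcher, a generic sketch rather than the paper's exact implementation:

```python
import numpy as np

def match_template(frame, templ):
    """Exhaustive sum-of-squared-differences template matching;
    returns the top-left corner (row, col) of the best match."""
    H, W = frame.shape
    h, w = templ.shape
    best, pos = np.inf, (0, 0)
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            ssd = ((frame[i:i + h, j:j + w] - templ) ** 2).sum()
            if ssd < best:
                best, pos = ssd, (i, j)
    return pos
```

In practice the BPNN lane prediction narrows the search window, so the exhaustive scan above only runs over a small region of the frame.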
Abstract: Since 2004, we have been developing an in-situ storage image sensor (ISIS) that captures more than 100 consecutive images at a frame rate of 10 Mfps with ultra-high sensitivity, as well as the video camera for use with this ISIS. Currently, basic research is continuing in an attempt to increase the frame rate to 100 Mfps and above. In order to suppress electro-magnetic noise at such high frequencies, a digital-noiseless imaging transfer scheme has been developed that utilizes solely sinusoidal driving voltages. This paper presents highly efficient yet accurate expressions to estimate the attenuation as well as the phase delay of driving voltages through the RC networks of an ultra-high-speed image sensor. The Elmore metric for a fundamental RC chain is employed as the first-order approximation. By applying dimensional analysis to SPICE data, we found a simple expression that significantly improves the accuracy of the approximation. Similarly, another simple closed-form model to estimate phase delay through fundamental RC networks is also obtained. The estimation error of both expressions is much smaller than in previous works, less than 2% in most cases. The framework of this analysis can be extended to address similar issues in other VLSI structures.
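The first-order Elmore approximation used above has a simple closed form for an RC ladder: the delay to node k sums each capacitance weighted by the resistance shared between the driver-to-capacitance and driver-to-k paths. A sketch with illustrative component values:

```python
def elmore_delays(R, C):
    """Elmore delay at every node of an RC ladder driven at node 0.

    R[i] is the resistance between node i and node i+1; C[i] is the
    capacitance at node i+1.  tau_k = sum_j C_j * R_shared(j, k),
    where R_shared for a ladder is the cumulative resistance up to
    node min(j, k)."""
    n = len(R)
    rcum = [0.0]                      # cumulative resistance to each node
    for r in R:
        rcum.append(rcum[-1] + r)
    return [sum(C[j - 1] * rcum[min(j, k)] for j in range(1, n + 1))
            for k in range(1, n + 1)]
```

The paper's contribution is the correction on top of this first-order estimate, fitted by dimensional analysis against SPICE data.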
Abstract: In this paper we present an algorithm that allows
object tracking close to real time in Full HD videos. The frame
rate (FR) of a video stream is considered to be between 5 and 30
frames per second, and real-time track building is achieved if the
algorithm can process 5 or more frames per second. The principal
idea is to use fast algorithms in the preprocessing step to obtain
key points and then track them. The point-matching procedure
during assignment depends strongly on the number of points, so
we limit the number of points, keeping the most informative ones.
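Limiting the point set to the most informative points can be done greedily: take points in decreasing order of detector response and suppress weaker points that fall too close to an already-kept one. A sketch (this particular selection criterion is an assumption, not necessarily the authors' measure of informativeness):

```python
import numpy as np

def select_informative(points, responses, max_points, min_dist):
    """Keep up to max_points keypoints, strongest response first,
    discarding any point closer than min_dist to a kept one."""
    order = np.argsort(-responses)
    kept = []
    for idx in order:
        p = points[idx]
        if all(np.hypot(p[0] - points[k][0], p[1] - points[k][1]) >= min_dist
               for k in kept):
            kept.append(idx)
            if len(kept) == max_points:
                break
    return kept
```

Because assignment cost grows quickly with the number of points, capping the set this way keeps the matching step within the per-frame time budget.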
Abstract: This paper reports the feasibility of the ARMA model
to describe a bursty video source transmitting over an AAL5 ATM
link (VBR traffic). The traffic represents the activity of the action
movie "Lethal Weapon 3" transmitted over the ATM network using
the FORE Systems AVA-200 ATM video codec with a peak rate of
100 Mbps and a frame rate of 25 frames per second. The model
parameters were estimated for a
single video source and independently multiplexed video sources. It
was found that the model ARMA (2, 4) is well-suited for the real data
in terms of average rate traffic profile, probability density function,
autocorrelation function, burstiness measure, and the pole-zero
distribution of the filter model.
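Fitting such a linear time-series model to a bitrate trace can be illustrated with a least-squares autoregressive fit; this is an AR(p) sketch only, simpler than the paper's ARMA(2, 4), which also carries a moving-average part:

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares fit of an AR(p) model:
    x[t] = a[0]*x[t-1] + ... + a[p-1]*x[t-p] + e[t]."""
    X = np.column_stack([x[p - i - 1:len(x) - i - 1] for i in range(p)])
    y = x[p:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a
```

The fitted coefficients can then be checked against the trace in the same terms the paper uses: average rate, autocorrelation, and burstiness of the synthesized process versus the real data.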
Abstract: This paper provides a flexible way of controlling the
variable bit rate (VBR) of compressed digital video, applicable to
the new H.264 video compression standard. The entire video
sequence is assessed in advance and the quantisation level is then set
such that the bit rate (and thus the frame rate) remains within
predetermined limits compatible with the bandwidth of the
transmission system and the capabilities of the remote end, while at
the same time providing constant quality similar to VBR encoding.
A process for avoiding buffer starvation by selectively eliminating
frames from the encoded output at times when the frame rate is slow
(large number of bits per frame) will also be described. Finally, the
problem of buffer overflow will be solved by selectively eliminating
frames from the received input to the decoder. The decoder detects
the omission of the frames and resynchronizes the transmission by
monitoring time stamps and repeating frames if necessary.
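The selective frame elimination can be illustrated with a toy leaky-bucket schedule: the channel delivers a fixed bit budget per frame interval, and a frame is dropped when sending it would exceed the accumulated budget. This is a hypothetical simplification of the paper's scheme, not its actual control law:

```python
def schedule_frames(frame_bits, rate_per_interval, init_level=0):
    """Return indices of frames that fit the channel budget; oversized
    frames are dropped to avoid starving the decoder buffer."""
    level = init_level
    sent = []
    for i, bits in enumerate(frame_bits):
        level += rate_per_interval     # bit budget delivered this interval
        if bits <= level:
            sent.append(i)             # frame fits: transmit it
            level -= bits
        # else: frame is selectively eliminated
    return sent
```

On the decoder side, time stamps reveal which frames were eliminated, so playback resynchronizes by repeating the previous frame, as described above.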
Abstract: Delivering streaming video over wireless is an
important component of many interactive multimedia applications
running on personal wireless handset devices. Such personal devices
have to be inexpensive, compact, and lightweight. But wireless
channels have a high channel bit error rate and limited bandwidth.
Delay variation of packets due to network congestion and the high bit
error rate greatly degrades the quality of video at the handheld
device. Mobile access to multimedia content therefore requires
video transcoding functionality at the edge of the mobile network for
interworking with heterogeneous networks and services. To
guarantee the quality of service (QoS) delivered to the mobile user, a
robust and efficient transcoding scheme should be deployed in the
mobile multimedia transport network. This paper examines the
challenges and limitations that video transcoding schemes in such a
network face. A mobile and wireless video transcoding scheme based
on handheld resources, network conditions and content is then
proposed to provide high-QoS applications. Exceptional performance
is demonstrated in the experimental results; the experiments were
designed to verify and
prove the robustness of the proposed approach. Extensive
experiments have been conducted, and the results of various video
clips with different bit rate and frame rate have been provided.
Abstract: Vision-based solutions in intelligent vehicle applications often need large amounts of memory to handle video streaming and image processing, which increases the complexity of hardware and software. In this paper, we present an FPGA implementation of a vision-based lane departure warning system. From video frames, line gradients are estimated and the lane marks are found. By analyzing the positions of the lane marks, departure of the vehicle is detected in time. This idea has been implemented on a Xilinx Spartan-6 FPGA. The lane departure warning system uses 39% of the logic resources and none of the memory of the device. The average availability is 92.5%, and the frame rate is more than 30 frames per second (fps).
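The gradient-based lane-mark detection in one scan row can be sketched as finding rising/falling intensity-edge pairs and checking the nearest mark's position against the vehicle center; the thresholds and geometry below are illustrative assumptions, not the FPGA design:

```python
import numpy as np

def lane_departure(row, center, margin, grad_thresh=20.0, max_width=20):
    """Detect lane marks in one image row as rising/falling gradient
    pairs of plausible width, then flag departure when the nearest
    mark drifts within `margin` pixels of the vehicle center line."""
    g = np.diff(row.astype(float))
    rising = np.where(g > grad_thresh)[0]
    falling = np.where(g < -grad_thresh)[0]
    marks = [(r + f) / 2 for r in rising for f in falling
             if 0 < f - r < max_width]
    if not marks:
        return False, marks
    nearest = min(marks, key=lambda m: abs(m - center))
    return bool(abs(nearest - center) < margin), marks
```

Per-row gradient tests like this need no frame buffer, which is consistent with an implementation that uses no block memory.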
Abstract: We present a method for fast volume rendering using
graphics hardware (GPU). To our knowledge, it is the first
implementation of the Shear-Warp algorithm on the GPU. Our
GPU-based method provides real-time frame rates and outperforms
the CPU-based implementation. When the number of slices is not
sufficient, we add in-between slices computed by interpolation,
which improves the quality of the rendered images. We have also
implemented the ray marching algorithm on the GPU. The results
generated by the three algorithms (CPU-based and GPU-based
Shear-Warp, GPU-based ray marching) for two test models show
that the ray marching algorithm outperforms the shear-warp
methods in terms of both speed-up and image quality.
Abstract: Electron multiplying charge coupled devices (EMCCDs) have revolutionized the world of low light imaging by introducing on-chip multiplication gain based on the impact ionization effect in silicon. They combine sub-electron readout noise with high frame rates. Signal-to-noise ratio (SNR) is an important performance parameter for low-light-level imaging systems. This work investigates the SNR performance of an EMCCD operated in non-inverted mode (NIMO) and inverted mode (IMO). The theory of the noise characteristics and operation modes is presented. The results show that the SNR is determined by dark current and clock-induced charge at high gain levels. The optimum SNR performance is provided by an EMCCD operated in NIMO in short-exposure and strong-cooling applications; in contrast, an EMCCD operated in IMO is preferable otherwise.
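The trade-off above follows from the standard EMCCD noise model: shot-like sources (signal, dark current, clock-induced charge) are inflated by the excess noise factor F, which approaches √2 at high multiplication gain, while read noise is divided by the gain. A sketch with illustrative numbers (electron units assumed):

```python
import math

def emccd_snr(signal_e, dark_e, cic_e, read_noise_e, gain,
              F=math.sqrt(2)):
    """EMCCD SNR per pixel: shot-type terms carry the excess noise
    factor F (high-gain limit ~ sqrt(2)); read noise is suppressed
    by the multiplication gain."""
    noise = math.sqrt(F ** 2 * (signal_e + dark_e + cic_e)
                      + (read_noise_e / gain) ** 2)
    return signal_e / noise
```

At high gain the read-noise term vanishes, so dark current and clock-induced charge become the limiting terms, which is exactly the regime the NIMO/IMO comparison addresses.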
Abstract: In this paper, we propose a novel concept of relative
distance measurement using stereo vision technology and discuss
its implementation on an FPGA-based real-time image processor.
We capture two images using two CCD cameras and compare them.
Disparity is calculated for each pixel using a real-time dense
disparity calculation algorithm based on the concept of an indexed
histogram for matching. Since disparity is inversely proportional
to distance (proved later), we can thus obtain the relative distances
of objects in front of the camera. The output is displayed on a TV
screen in the form of a depth image (optionally using pseudo
colors). The system works in real time at full PAL frame rate (720
x 576 active pixels @ 25 fps).
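Dense disparity estimation can be illustrated with a simple sum-of-absolute-differences block matcher; this is a generic stand-in for the paper's indexed-histogram matching, and the relation distance ∝ 1/disparity then maps the result to relative depth:

```python
import numpy as np

def disparity_sad(left, right, max_disp, win=3):
    """Dense disparity by SAD block matching on a rectified pair:
    for each left pixel, find the horizontal shift d (0..max_disp)
    into the right image with the lowest window cost."""
    H, W = left.shape
    half = win // 2
    disp = np.zeros((H, W), dtype=int)
    for i in range(half, H - half):
        for j in range(half + max_disp, W - half):
            patch = left[i - half:i + half + 1, j - half:j + half + 1]
            costs = [np.abs(patch - right[i - half:i + half + 1,
                                         j - d - half:j - d + half + 1]).sum()
                     for d in range(max_disp + 1)]
            disp[i, j] = int(np.argmin(costs))
    return disp
```

Each pixel's cost loop is independent, which is why this class of algorithm maps naturally onto an FPGA pipeline at full PAL rate.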
Abstract: The need for high frame-rate imaging has been triggered by the new applications of ultrasound imaging to transient elastography and real-time 3D ultrasound. Using plane wave excitation (PWE) is one way to achieve very high frame-rate imaging, since an image can be formed from a single insonification. However, due to the lack of transmit focusing, the image quality with PWE is lower than with conventional focused transmission. To solve this problem, we propose a filter-retrieved transmit focusing (FRF) technique combined with cross-correlation weighting (FRF+CC weighting) for high frame-rate imaging with PWE. A retrospective focusing filter is designed to simultaneously minimize the predefined sidelobe energy associated with single PWE and the filter energy related to the signal-to-noise ratio (SNR). This filter attempts to maintain the mainlobe signals and to reduce the sidelobe ones, which yields similar mainlobe signals and different sidelobes between the original PWE and the FRF baseband data. The normalized cross-correlation coefficient at zero lag is calculated to quantify the degree of similarity at each imaging point and used as a weighting matrix on the FRF baseband data to further suppress sidelobes, thus improving the filter-retrieved focusing quality.
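The zero-lag normalized cross-correlation weighting can be sketched with moving-average filters over the two data sets, clipping to [0, 1] so that anticorrelated (sidelobe-dominated) regions are fully suppressed; the window size and the clipping are illustrative simplifications of the paper's weighting matrix:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def ncc_weight(a, b, win=5):
    """Zero-lag normalized cross-correlation of two (zero-mean)
    beamformed data sets in a local window, clipped to [0, 1] for
    use as a multiplicative sidelobe-suppression weight."""
    num = uniform_filter(a * b, win)
    den = np.sqrt(uniform_filter(a * a, win) * uniform_filter(b * b, win))
    w = num / np.maximum(den, 1e-12)
    return np.clip(w, 0.0, 1.0)
```

Applying `w` to the FRF data leaves mainlobe regions (where the two data sets agree) untouched while attenuating sidelobe regions where they differ.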
Abstract: Vision-based intelligent vehicle applications often require large amounts of memory to handle video streaming and image processing, which in turn increases the complexity of hardware and software. This paper presents an FPGA implementation of a vision-based blind spot warning system. Using video frames, the blind spot area is converted into one-dimensional information, and analysis of the estimated entropy of the image allows an object to be detected in time. This idea has been implemented in the XtremeDSP video starter kit. The blind spot warning system uses only 13% of the device's logic resources and 95k bits of block memory, and its frame rate is over 30 frames per second (fps).
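The entropy analysis can be sketched as the Shannon entropy of the gray-level histogram of the monitored region: an empty road patch is nearly flat and scores low, while an entering object raises the entropy. The bin count and any decision threshold are assumptions of this sketch:

```python
import numpy as np

def region_entropy(gray, bins=32):
    """Shannon entropy (bits) of the gray-level histogram of an
    8-bit image region; a flat patch scores 0, textured or occupied
    patches score higher."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())
```

A histogram plus a log lookup needs only a few counters per frame, which fits the small logic and memory footprint reported above.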
Abstract: The paper presents the optimization problem for the
multi-element synthetic transmit aperture (MSTA) method in
ultrasound imaging applications. The optimal choice of the transmit
aperture size is performed as a trade-off between the lateral
resolution, penetration depth and frame rate. Results of the analysis
obtained by a developed optimization algorithm are presented, with
maximum penetration depth and best lateral resolution at given
depths chosen as the optimization criteria. The optimization
algorithm was tested using synthetic aperture data of point
reflectors simulated by the Field II program for Matlab® for the
case of a 5 MHz 128-element linear transducer array with 0.48 mm
pitch. The visualization of experimentally obtained synthetic
aperture data of a tissue-mimicking phantom and in vitro
measurements of beef liver are also shown. The data were obtained
using the SonixTOUCH Research system equipped with a linear
4 MHz 128-element transducer with 0.3 mm element pitch, 0.28 mm
element width and 70% fractional bandwidth, excited by a one-cycle
sine burst at the transducer's center frequency.