Abstract: Existing research on object pose estimation estimates the position and angle of an object by storing a 3D model of the target object on a computer in advance and matching the observation against that model. In this research, by contrast, we have created a module that is much simpler, smaller in scale, and faster in operation. Our 6D pose estimation model consists of two different networks: a classification network and a regression network. From a single RGB image, the trained model estimates the class of the object in the image, the coordinates of the object, and its rotation angle in 3D space. In addition, we compared the estimation accuracy for each camera position, i.e., the angle from which the object was captured. The highest accuracy was recorded at a camera position of 75°, where the classification accuracy was about 87.3% and the regression accuracy was about 98.9%.
Abstract: Estimating the 6D pose of objects is a core step in robot bin-picking tasks. The difficulty is that, in real applications, various objects are usually stacked randomly with heavy occlusion. In this work, we propose a method that regresses 6D poses by predicting three points for each object in the 3D point cloud through deep learning. To resolve the pose ambiguity caused by symmetry, we propose a labeling method that helps the network converge better. Based on the predicted pose, an iterative method is employed for pose optimization. In real-world experiments, our method outperforms the classical approach in both precision and recall.
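As an illustration of why three points per object are enough to fix a 6D pose, the following minimal sketch recovers a rotation and a translation from three non-collinear points. It assumes that the three predicted points correspond to three known reference points in the object's own frame; the abstract does not state this, and the function names here are hypothetical. The sketch covers only the closed-form step, not the network, the symmetry labeling, or the iterative refinement.

```python
import math

def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def _unit(a):
    n = math.sqrt(sum(c * c for c in a))
    if n == 0.0:
        raise ValueError("points are coincident or collinear")
    return [c / n for c in a]

def frame_from_points(p0, p1, p2):
    """Right-handed orthonormal frame (list of axis vectors x, y, z)
    built from three non-collinear points."""
    x = _unit(_sub(p1, p0))
    z = _unit(_cross(x, _sub(p2, p0)))
    y = _cross(z, x)
    return [x, y, z]

def pose_from_three_points(model_pts, observed_pts):
    """Return (R, t) such that observed = R @ model + t.

    Assumption: model_pts are the reference points in the object frame and
    observed_pts are the corresponding predicted points in the scene.
    """
    fm = frame_from_points(*model_pts)
    fo = frame_from_points(*observed_pts)
    # R = Fo * Fm^T, where the frame axes are the columns of Fo and Fm.
    R = [[sum(fo[k][i] * fm[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    rp0 = [sum(R[i][j] * model_pts[0][j] for j in range(3)) for i in range(3)]
    t = _sub(observed_pts[0], rp0)
    return R, t
```

Because the frame is built from the first point and two directions, noise in the predicted points is not averaged out; a least-squares fit over more points would be the usual choice when more correspondences are available.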
Abstract: Human pose estimation and tracking aim to accurately identify and locate the positions of human joints in video. This computer vision task is of great significance for human motion recognition, behavior understanding, and scene analysis. There has been remarkable progress in human pose estimation in recent years. However, more research is needed on human pose tracking, especially online tracking. In this paper, a framework called PoseSRPN is proposed for online single-person pose estimation and tracking. We attach a pose estimation branch to a Siamese network to incorporate Single-person Pose Tracking (SPT) and Visual Object Tracking (VOT) into one framework. The pose estimation branch has a simple network structure that replaces the complex upsampling and convolution structure with deconvolution. By augmenting the loss of the fully convolutional Siamese network with the pose estimation task, pose estimation and tracking can be trained in one stage. Once trained, PoseSRPN relies only on a single bounding-box initialization to produce human joint locations. The experimental results show that, while maintaining good pose estimation accuracy on the COCO and PoseTrack datasets, the proposed method achieves a speed of 59 frames/s, which is superior to other pose tracking frameworks.
Abstract: Falls are one of the major causes of injury and death
among elderly people aged 65 and above. A support system to
identify such abnormal activities has become extremely
important as the ageing population grows. Pose estimation
is a challenging task, and it becomes even more
challenging when it is performed on the unusual
poses that may occur during a fall. The location of the body provides a
clue to where the person is at the time of the fall. This paper presents
a vision-based tracking strategy in which the available joints are grouped
into three feature points according to the section of the body in which
they are located. The three feature points, derived from different
joint combinations, represent the upper (head) region,
the mid-region (torso), and the lower (leg) region. Tracking is always
challenging when motion is involved. Hence, the idea is to locate
these body regions in every frame and to use them as the tracking
strategy. Grouping the joints helps to achieve a stable
region for tracking. The location of the body parts provides crucial
information for distinguishing normal activities from falls.
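The grouping of joints into three feature points can be sketched as follows. This is only an illustration under stated assumptions: the joint names follow a COCO-style naming that the abstract does not specify, the assignment of joints to regions is hypothetical, and the centroid is used as the feature point because the abstract does not say how the joints are combined.

```python
# Hypothetical assignment of joints to body regions (an assumption).
REGIONS = {
    "upper": ["nose", "left_eye", "right_eye", "left_ear", "right_ear"],
    "mid": ["left_shoulder", "right_shoulder", "left_hip", "right_hip"],
    "lower": ["left_knee", "right_knee", "left_ankle", "right_ankle"],
}

def region_feature_points(joints):
    """Collapse the detected joints into one feature point per body region.

    joints: dict mapping joint name -> (x, y) for the joints detected in
            the current frame; missing joints are simply absent.
    Returns a dict mapping region name -> (x, y) centroid, or None when no
    joint of that region was detected.
    """
    points = {}
    for region, names in REGIONS.items():
        found = [joints[n] for n in names if n in joints]
        if found:
            points[region] = (
                sum(p[0] for p in found) / len(found),
                sum(p[1] for p in found) / len(found),
            )
        else:
            points[region] = None
    return points
```

Averaging only the joints that were actually detected is what keeps the region point usable when some joints are lost during a fall, at the cost of the point shifting slightly as joints appear and disappear.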
Abstract: Human motion capture has become one of the major
areas of interest in the field of computer vision. Major
application areas that have been evolving rapidly include
advanced human interfaces, virtual reality, and security/surveillance
systems. This study provides a brief overview of the techniques and
applications of markerless human motion capture, which
deals with analyzing human motion in the form of mathematical
formulations. The major contribution of this research is that it
classifies computer-vision-based human motion capture techniques
according to a taxonomy and breaks them down into four
systematically different categories: tracking, initialization, pose
estimation, and recognition. Detailed descriptions of the tracking and
pose estimation techniques, and of the relationships between them, are
given. The subcategories of each process are further
described. The various hypotheses used by researchers in
this domain are surveyed, and the evolution of these techniques is
explained. The survey concludes that most
researchers have focused on using mathematical body models for
markerless motion capture.
Abstract: In this paper, a simple terrain evaluation method for
a hexapod robot is introduced. The method is based on evaluating the
feet coordinates when all feet are on the ground. The differences between
the feet coordinates make local terrain evaluation possible. Terrain
evaluation is necessary for selecting the right gait and/or correcting the
body position. To evaluate terrain roughness, three planes are constructed:
two are defined by the coordinates of opposite feet, and the third
coincides with the robot body plane. The leaning angle of the body plane
is evaluated by measuring the gravity force with a three-axis accelerometer.
The terrain roughness evaluation is based on estimating the angles
between the normal vectors of these planes. The aim of this work is to
present a simple method for an embedded robot controller that makes it
possible to find the best settings for further movement.
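The plane-based roughness measure can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes that each foot plane is defined by three foot positions expressed in the body frame, that the body plane normal is the body z-axis, and that the accelerometer is read while the robot is standing still. All function names are hypothetical.

```python
import math

def plane_normal(p1, p2, p3):
    """Unit normal of the plane through three points, oriented so z >= 0."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        raise ValueError("points are collinear; no unique plane")
    n = [c / length for c in n]
    return [-c for c in n] if n[2] < 0 else n

def angle_between_deg(n1, n2):
    """Angle in degrees between two unit normal vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    dot = max(-1.0, min(1.0, dot))  # guard acos against rounding error
    return math.degrees(math.acos(dot))

def body_lean_deg(accel):
    """Lean of the body plane from a static three-axis accelerometer reading.

    Assumption: accel is measured in the body frame, so a level body gives
    a reading along the body z-axis only.
    """
    length = math.sqrt(sum(c * c for c in accel))
    if length == 0.0:
        raise ValueError("accelerometer reading is zero")
    return math.degrees(math.acos(min(1.0, abs(accel[2]) / length)))
```

A small angle between the two foot-plane normals and the body normal `[0, 0, 1]` would indicate locally flat ground, while a large angle between the two foot planes would indicate rough terrain; the thresholds for gait selection are not given in the abstract and are left out here.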
Abstract: Human pose estimation can be performed using Active Shape Models. Existing techniques that apply Active Shape Models to the human body, for example in human detection, primarily model the silhouette of the body. Such techniques cannot accurately estimate poses involving the arms and legs, because the silhouette represents the body only as an irregular rounded shape. To solve this problem, we model the human body as a stick figure, or "skeleton". The skeleton model can account for the various shapes of human poses. To obtain effective estimation results, we applied background subtraction and a modified version of the matching algorithm of the original Active Shape Models in the fitting process. The model was built from images of 600 human bodies and has 17 landmark points, which indicate body joints and key features of the human pose. The maximum number of iterations in the fitting process was 30, and the execution time was less than 0.03 s.
Abstract: This paper proposes a balance control scheme for a biped robot to trace an arbitrary path using image information. While moving, the scheme estimates the zero moment point (ZMP) of the biped robot at the next step using a Kalman filter and renders an appropriately balanced pose of the robot. The ZMP can be calculated from the robot's pose, which is measured from the image of a reference object acquired by a CCD camera on the robot's head. To simplify the kinematic model, the coordinate systems of the individual joints of each leg are aligned, and the robot motion is approximated as an inverted pendulum so that simple linear dynamics, the 3D-LIPM (3D Linear Inverted Pendulum Mode), can be applied. The efficiency of the proposed algorithm has been demonstrated by experiments performed on an unknown trajectory.
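The two ingredients named in this abstract can be illustrated with a short sketch. The first function is the standard ZMP relation of the linear inverted pendulum for one horizontal axis. The second part is a one-dimensional constant-velocity Kalman filter that predicts the ZMP one step ahead. The state model, the noise parameters `q` and `r`, and all names are assumptions made for illustration; the abstract does not describe the filter design actually used.

```python
def zmp_lipm(com, com_acc, com_height, g=9.81):
    """ZMP along one horizontal axis under the linear inverted pendulum:
    p = x - (z_c / g) * x_ddot, with constant centre-of-mass height z_c."""
    return com - (com_height / g) * com_acc

class ZmpKalman1D:
    """Constant-velocity Kalman filter on a scalar ZMP measurement.

    State: [position p, velocity v]. q and r are assumed process and
    measurement noise variances, chosen here only for illustration.
    """

    def __init__(self, dt, q=1e-4, r=1e-2):
        self.dt, self.q, self.r = dt, q, r
        self.p, self.v = 0.0, 0.0
        self.P = [[1.0, 0.0], [0.0, 1.0]]

    def predict(self):
        """Propagate the state one step and return the predicted ZMP."""
        dt = self.dt
        (a, b), (c, d) = self.P
        self.p += dt * self.v
        # P = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.p

    def update(self, z):
        """Correct the state with a new ZMP measurement z."""
        (a, b), (c, d) = self.P
        s = a + self.r              # innovation variance (H = [1, 0])
        k0, k1 = a / s, c / s       # Kalman gain
        y = z - self.p              # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
```

In use, `predict()` would be called once per control step to obtain the expected ZMP, followed by `update()` with the ZMP computed from the measured pose. The x and y axes can each be handled by a separate filter instance under the LIPM, since its horizontal dynamics are decoupled.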
Abstract: Detection, feature extraction, and pose estimation of
people in images and video are made challenging by the variability of
human appearance, the complexity of natural scenes, and the high
dimensionality of articulated body models. They have also been an important
field in image, signal, and vision computing in recent years. In this
paper, a system that distinguishes four types of people in 2D images is
proposed and tested. The system extracts the size and body type of each
person (tall fat, short fat, tall thin, or short thin) from the image.
Whether a person is fat or thin is determined from the human body region
that has been extracted from the image. The system also extracts the
dimensions of the human body, such as length and width, and shows them in
the output.