James
Little
"Maps, Places, and
Worlds for Robots"
(Computer
Science, University of British Columbia, Vancouver, BC, Canada)
Vision is a powerful sense that permits a robot to look around itself
and gather information both about the immediate present and the near
future. The future arrives through the more distant physical space
through which the robot can move, and through the possible actions and events
that may arise. To know the future, a robot needs to parse the world
with the aid of its models and experience. Lasers and other active
sensors have proven their ability to provide accurate geometric
information. But context and the meaning of the space surrounding the
robot, the objects and the actions they permit are only accessible with
the more complete sensory input of vision. Vision as a sensor is
computationally demanding, but resources have improved to make it
practical. Moreover there has been a convergence of the interests of
roboticists and vision scientists – both want to explore and act in the
world. Many of us have accepted that we must learn the patterns of data
using machine learning, but we must also integrate categorical
descriptions of our world, prototypical information that no individual
robot is yet capable of learning. Vision provides the anchoring for
concepts. I will discuss recent advances and trends linking vision and
robotics through spatial descriptions and the connections with objects,
actions, and meaning.
|
Zoran
Zivkovic and Ben Krose
"Part Based People
Detection on a Mobile Robot"
(University of Amsterdam, The Netherlands)
We design a robust people detection module for a mobile
robot inspired by the latest results on the part-based representations
from the computer vision area. The approach is based on the
probabilistic combination of fast human body part detectors. The
representation is robust to partial occlusions, part detector false
alarms and missed detections of body parts. Furthermore, we show how to
use the fact that people walk on a known floor plane to detect
them more reliably and efficiently. Finally, we show how our framework
can be used to combine information from different sensors.
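The floor-plane constraint described above can be sketched with a simple pinhole model (this is an illustration of the idea, not the authors' actual formulation; camera height, focal length, and person height below are hypothetical values): knowing the camera's height above the floor, the image row of a person's feet fixes their depth, and hence the expected image height of the person, so the detector only needs to search a narrow scale band at each row.

```python
def expected_person_height_px(foot_row, f, cam_height, person_height=1.7, cy=240):
    """Expected image height (pixels) of a person whose feet project to
    image row `foot_row`, for a level pinhole camera at height `cam_height`
    metres above a flat floor. `f` is the focal length in pixels and `cy`
    the principal-point row."""
    # A floor point only projects below the horizon (the principal-point row)
    if foot_row <= cy:
        return None
    # Depth of the foot point on the floor plane
    depth = f * cam_height / (foot_row - cy)
    # Projected height of the person standing at that depth
    return f * person_height / depth
```

In practice this turns an exhaustive scale search into a one-dimensional sweep over foot rows, which is both faster and rejects detections at implausible scales.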
|
C. Doignon, F. Nageotte, B. Maurin and A. Krupa
"Model-based
3-D Pose Estimation and Feature Tracking for Robot Assisted Surgery
with Medical Imaging"
(Louis
Pasteur University of Strasbourg, ENSPS, Illkirch, France,
Cerebellum Automation Company Chavanod, France and
IRISA - INRIA Rennes, France)
In this paper we address the problem of pose
estimation based on multiple geometrical features for monocular
endoscopic vision with laparoscopes and for stereotaxy with CT
scanners. Partial and full pose estimation (6 dofs) are considered with
applications to minimally invasive surgery. At the University of
Strasbourg, we have been developing a set of techniques for assisting
surgeons in navigating and manipulating the three-dimensional space
within the human body. In order to develop such systems, a variety of
challenging visual tracking and registration problems with
pre-operative and/or intra-operative images must be solved. This paper
integrates several issues where computational vision can play a role.
Depth recovery (of the tip of a surgical instrument w.r.t. living
tissue), the Plücker coordinates (4 dofs) of a markerless cylindrical
instrument, the 6 dofs of a needle-holder with a heterogeneous set of
features and stereotaxy are the examples we describe. Projective
invariants with perspective projection, quadrics of revolution and
stereotactic markers are features which are useful to achieve the
registration with uncalibrated or calibrated devices. Visual
servoing-based tracking methods have been developed for image-guided
robotic systems, for assisting surgeons in laparoscopic surgery and in
interventional radiology. Real-time endoscopic vision and single-slice
stereotactic registration have been proposed to retrieve
out-of-field-of-view instruments, to position a needle and to
compensate for small displacements, such as those due to patient breathing or
other small disturbances which may occur during an image-guided surgical
procedure.
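The Plücker coordinates mentioned above (4 dofs for the axis of a cylindrical instrument) admit a compact construction; the following is a generic sketch of the representation itself, not the authors' estimation method:

```python
import numpy as np

def plucker_line(p, q):
    """Plücker coordinates (u, m) of the 3-D line through points p and q:
    unit direction u and moment m = p x u. The moment is the same for any
    point chosen on the line, and u . m = 0 always holds, so the pair
    carries the line's 4 independent degrees of freedom."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    u = q - p
    u /= np.linalg.norm(u)
    m = np.cross(p, u)
    return u, m
```

The bilinear constraint u . m = 0 is what reduces the six numbers to four degrees of freedom, matching the partial pose of a markerless cylinder.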
|
David Hogg
"Reasoning
and Vision"
(School of Computing,
University of Leeds, UK )
Representations and mechanisms of conceptual
reasoning have traditionally been at the heart of research on
artificial intelligence, and generally (but not always) absent from
approaches to computer vision dealing with natural images and video.
The talk will examine recent work that is attempting to integrate
conceptual reasoning with computer vision in solving problems in
tracking, object recognition, behaviour analysis, and human-machine
interaction. Two important issues are the forms of representation used
and the role of machine learning, contributing to the development of
adaptive systems.
|
Ruben Smits, Duccio Fioravanti, Tinne De Laet,
Benedetto Allotta, Herman Bruyninckx and Joris De Schutter
"Image-Based Visual Servoing
with Extra Task Related Constraints in a General Framework for
Sensor-Based Robot Systems"
(Department of Mechanical
Engineering, Katholieke Universiteit Leuven, Belgium and
Department of Energetics “Sergio Stecco”, Universita degli Studi di
Firenze, Italy)
This paper reformulates
image-based visual servoing (IBVS) as a constraint-based robot task, in
order to integrate it seamlessly with other task constraints, in image
space, in Cartesian space, in the joint space of the robot, or in the
“image space” of any other sensor (e.g. force, distance). In this way
data from the different sensors are fused. The integration takes place via the
specification of generic “feature coordinates”, defined in the
different task spaces. Control loops are closed around the feature
coordinate setpoints, in each of these task spaces, and instantaneously
combined into setpoints for a velocity controlled robot that executes
the task. The paper describes real world experimental results for
image-based visual tracking with extra Cartesian constraints. During
the workshop, many more examples will be given, with constraints in all
different task spaces.
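For readers unfamiliar with IBVS, the classical unconstrained control law that such constraint-based formulations generalize can be sketched as follows; the point-feature interaction matrix is the standard one, but the gain and feature values are illustrative only:

```python
import numpy as np

def point_interaction_matrix(x, y, Z):
    """Standard interaction matrix of a normalized image point (x, y)
    observed at depth Z, relating feature velocity to the 6-dof camera
    velocity (v_x, v_y, v_z, w_x, w_y, w_z)."""
    return np.array([
        [-1.0/Z, 0.0, x/Z, x*y, -(1.0 + x*x), y],
        [0.0, -1.0/Z, y/Z, 1.0 + y*y, -x*y, -x],
    ])

def ibvs_velocity(s, s_star, L, lam=0.5):
    """Classical IBVS law: camera velocity v = -lambda * pinv(L) @ (s - s*),
    driving the feature error (s - s*) exponentially to zero."""
    e = np.asarray(s, float) - np.asarray(s_star, float)
    return -lam * np.linalg.pinv(L) @ e
```

Closing the loop around feature-coordinate setpoints in several task spaces at once, as the paper does, amounts to stacking such constraints before solving for the robot velocity.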
|
Daniel Aarno,
Johan Sommerfeld, Danica Kragic, Nicolas Pugeault, Sinan Kalkan,
Florentin Wörgötter, Dirk Kraft, Norbert Krüger
"Early Reactive
Grasping with Second Order 3D Feature Relations"
(Royal Institute of
Technology, Sweden,
University of Edinburgh, UK,
University of Göttingen, Germany,
Syddansk University and Aalborg University, Denmark)
One of the main challenges in the field of robotics is to
make robots ubiquitous. To intelligently interact with the world, such
robots need to understand the environment and situations around them
and react appropriately; they need
context-awareness. But how can robots be equipped with the capabilities of
gathering and interpreting the necessary information for novel tasks
through interaction with the environment, given only some minimal
knowledge in advance? This has been a long-term question and one of the
main drives in the field of cognitive system development.
The main idea behind the work presented in this paper is that the robot
should, like a human infant, learn about objects by interacting with
them, forming representations of the objects and their categories that
are grounded in its embodiment. For this purpose, we study an early
learning of object grasping process where the agent, based on a set of
innate reflexes and knowledge about its embodiment. We stress out that
this is not the work on grasping, it is a system that interacts with
the environment based on relations of 3D visual features generated
trough a stereo vision system. We show how geometry, appearance and
spatial relations between the features can guide early reactive
grasping which can later on be used in a more purposive manner when
interacting with the environment.
|
Dov
Katz and Oliver Brock
"Interactive
Perception: Closing the Gap Between Action and Perception"
(Robotics and
Biology Laboratory, Department of Computer Science, University of
Massachusetts Amherst)
We introduce Interactive Perception as a new perceptual paradigm for
autonomous robotics in unstructured environments. Interactive
perception augments the process of perception with physical
interactions, thus integrating robotics and computer vision. By
integrating interactions into the perceptual process, it is possible to
manipulate the environment so as to uncover information relevant for
the robust and reliable execution of a task. Examples of such
interactions include the removal of obstructions or object
repositioning to improve lighting conditions. More importantly,
forceful interaction can uncover perceptual information that would
otherwise be imperceptible. In this paper, we begin to explore the
potential of the interactive perception paradigm. We present an
interactive perceptual primitive that extracts kinematic models from
objects in the environment. Many objects in everyday environments, such
as doors, drawers, and hand tools, contain inherent kinematic degrees
of freedom. Knowledge of these degrees of freedom is required to use
the objects in their intended manner. We demonstrate how a robot is
capable of extracting a kinematic model from a variety of tools, using
very simple algorithms. We then show how the robot can use the
resulting kinematic model to operate the tool. The simplicity of these
algorithms and their effectiveness in our experiments indicate that
Interactive Perception is a promising perceptual paradigm for
autonomous robotics.
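As an illustration of the idea, not the authors' algorithm, the kinematic degree of freedom of a tracked rigid part can be classified from its observed relative motion between two time steps (tolerances below are hypothetical):

```python
import numpy as np

def classify_joint(R, t, angle_tol=1e-3, trans_tol=1e-6):
    """Toy classifier: given the relative rigid motion (R, t) of a tracked
    part while the robot interacts with it, label its degree of freedom.
    A noticeable rotation suggests a revolute joint (e.g. a door); pure
    translation suggests a prismatic one (e.g. a drawer)."""
    # Rotation angle from the trace of R (clipped for numerical safety)
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if angle > angle_tol:
        return "revolute"
    if np.linalg.norm(t) > trans_tol:
        return "prismatic"
    return "fixed"
```

A full kinematic model would additionally recover the joint axis, but even this coarse label already tells the robot how the object should be operated.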
|
Jiri
Matas and Jan Sochman
"Wald’s
Sequential Analysis for Time-constrained Vision Problems"
(Center for Machine Perception, Dept. of Cybernetics, Faculty of
Elec. Eng.
Czech Technical University in Prague, Prague, Czech Rep.)
In detection and matching problems in
computer vision, both classification errors and time to decision
characterize the quality of an algorithmic solution. We show how to
formalize such problems in the framework of sequential decision-making
and derive quasi-optimal time-constrained solutions for three vision
problems. The methodology is applied to face and interest point
detection and to the RANSAC robust estimator. Error rates of the
proposed face detection algorithm are comparable to state-of-the-art
methods. In the interest point application, the output of the
Hessian-Laplace detector is approximated by a sequential WaldBoost
classifier which is about five times faster than the original with
comparable repeatability. A sequential strategy based on Wald’s SPRT
for evaluation of model quality in RANSAC leads to significant speed-up
in geometric matching problems.
|
Gabe
Sibley, Larry Matthies and Gaurav Sukhatme
"Constant Time
Sliding Window Filter SLAM as a Basis for Metric
Visual
Perception"
(Jet Propulsion Laboratory, California Institute of
Technology, Pasadena California and
Robotic and Embedded Systems Laboratory, University of Southern
California, Los Angeles, California)
This paper describes a Sliding Window Filter (SWF) that is an on-line
constant-time approximation to the feature-based 6-degree-of-freedom
full Batch Least Squares Simultaneous Localization and Mapping (SLAM)
problem. The ultimate goal is to develop a filter that can quickly and
optimally fuse all data from a sequence of (stereo) images into a
single underlying statistically accurate and precise spatial
representation. Such a capability is highly desirable for mobile
robots, though it is a computationally intense dense data-fusion
problem. The SWF is useful in this context because it can scale from
exhaustive batch solutions to fast incremental solutions. For instance,
if the window encompasses all time, the solution is algebraically
equivalent to full SLAM; if only one time step is maintained, the
solution is algebraically equivalent to the Extended Kalman Filter SLAM
solution; if robot poses and environment landmarks are slowly
marginalized out over time such that the state vector ceases to grow,
then the filter becomes constant time, like Visual Odometry.
Interestingly, the SWF enables other properties, such as continuous
submapping, lazy data association, undelayed or delayed landmark
initialization, and incremental robust estimation. We test the
algorithm in simulations using stereo vision exteroceptive sensors and
inertial measurement proprioceptive sensors. Initial experiments show
qualitatively that the SWF approaches the performance of the optimal
batch estimator, even for small windows on the order of 5-10 frames.
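The slow marginalization of old poses and landmarks that keeps the SWF constant-time is, at its core, a Schur complement on the normal equations of the least-squares problem; a generic sketch (not the authors' implementation, and with a toy dense system):

```python
import numpy as np

def marginalize(H, b, keep, drop):
    """Marginalize the states indexed by `drop` out of a Gauss-Newton
    system H x = b via the Schur complement, leaving an equivalent prior
    on the `keep` states. The marginal solution for the kept states
    equals the corresponding block of the full solution."""
    Hkk = H[np.ix_(keep, keep)]
    Hkd = H[np.ix_(keep, drop)]
    Hdd = H[np.ix_(drop, drop)]
    Hdd_inv = np.linalg.inv(Hdd)
    H_marg = Hkk - Hkd @ Hdd_inv @ Hkd.T
    b_marg = b[keep] - Hkd @ Hdd_inv @ b[drop]
    return H_marg, b_marg
```

Repeatedly applying this to the oldest pose as new frames arrive is what bounds the state vector and makes the filter behave like Visual Odometry at one extreme and full SLAM at the other.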
|
Simon Lacroix, Thomas Lemaire and Cyrille
Berger
"More
Vision for
SLAM"
(LAAS-CNRS, Toulouse, France)
Much progress has been made on SLAM so far, and various
visual SLAM approaches have proven their effectiveness in realistic
scenarios. However, there are many other improvements that vision can
bring to SLAM, particularly for the mapping and data
association functionalities. This paper sketches some of these
improvements, on the basis of recent work in the literature and of
on-going work.
|
A. C. Murillo, J. J. Guerrero and C. Sagues
"Topological
and Metric
Robot Localization through Computer Vision Techniques"
(DIIS - I3A, University
of Zaragoza, Spain)
Vision based robotics
applications have been widely studied in recent years. However, there
is still a certain distance between these and the pure computer vision
methods, although there are many issues of common interest in computer
vision and robotics. For example, object recognition and scene
recognition are closely related, which makes object recognition methods
quite suitable for robot topological localization, e.g. room
recognition. Another important issue in computer vision, the
structure-from-motion (SFM) problem, is similar to the Simultaneous Localization
and Mapping problem. This work builds on previous work in which computer
vision techniques are applied for robot self-localization: a vision
based method applied for room recognition and an approach to obtain
metric localization from SFM algorithms for bearing-only data. Several
experiments are shown for both kinds of localization, room
identification and metric localization, using different image features
and data sets of conventional and omnidirectional cameras.
|
Darius
Burschka
"Towards Robust Vision-Based Navigation Systems"
(Lab for Robotics and Real-Time Systems, Department of Informatics,
Technische Universität München, Germany)
We present our work in the field of vision-based navigation from video
sequences and discuss our current challenges in this field. We discuss
the advantages and disadvantages of the different navigation techniques.
Our challenge is to develop a low-cost navigation system based on video
cameras that can be scaled depending on the required accuracies and
available resources. A minimal configuration consists of a single video
camera, which can be extended with inertial units, laser systems and
additional cameras. We include experimental validations of the proposed
system. The algorithm does not have any of the range limitations present
in similar, sampling-based algorithms.
|
T. Asfour, K. Welke, A. Ude, P. Azad, J. Hoeft and R. Dillmann
"Perceiving Objects and Movements to Generate
Actions on a Humanoid Robot"
(University of Karlsruhe, Germany and Jozef Stefan Institute, Slovenia)
Imitation learning has been suggested as a promising way to teach
humanoid robots. In this paper we present a new humanoid active head
which features human-like characteristics in motion and response and
mimics the human visual system. We present algorithms that can be
applied
to perceive objects and movements, which form the basis for learning
actions on the humanoid. For action representation we use an
HMM-based approach to reproduce the observed movements and build an
action library. Hidden Markov Models (HMM) are used to represent
movements demonstrated to a robot multiple times. They are trained
with the characteristic features (key points) of each demonstration.
We propose strategies for adapting movements to the given
situation and for interpolating between movements stored in a
movement library.
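The HMM scoring step underlying such an action library, picking the model that best explains an observed key-point sequence, can be sketched with the standard scaled forward algorithm (discrete observation symbols here for brevity; the parameter values are made up):

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM) for a discrete
    observation sequence `obs`. pi: initial state probabilities,
    A: state transition matrix, B[s, o]: probability of emitting
    symbol o in state s."""
    alpha = pi * B[:, obs[0]]
    log_like = np.log(alpha.sum())
    alpha = alpha / alpha.sum()          # rescale to avoid underflow
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        s = alpha.sum()
        log_like += np.log(s)            # accumulate the log of each scale
        alpha = alpha / s
    return log_like
```

Recognition then reduces to evaluating a demonstrated sequence under every HMM in the library and selecting the model with the highest log-likelihood.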
|