World & interaction models
Robots are controlled by computers that run a variety of algorithms to detect what the world looks like and to calculate an appropriate next action for the robot. In many cases, the decision to perform a certain action is based on the information in a model, i.e., a mathematical representation of all knowledge that the robot has gathered about its environment and itself, and about how an action by the robot will change the world. The robot can obtain this knowledge by processing data from its sensors, or the knowledge is available because it has been explicitly programmed by the humans who built the robot. For a robot operating autonomously in an unstructured environment, it is essential that, besides models of itself, its controller has access to, and/or is able to generate, models of that environment (so-called “maps”), and of its interaction with that environment.
The requirements on the content of a model depend on the capabilities and the task of the robot. So, there exists no such thing as “the” map of the world.
In the simplest case, the robot needs only geometric models, representing the locations and shapes of objects in the 2D or 3D world, to various levels of accuracy. For example, an occupied/non-occupied encoding of the environment suffices when the task is navigation without collisions, while the shape of objects becomes important if the robot must grasp these objects.
A geometric world model can be limited to the (relatively small) workspace of a robot manipulator arm, or as extended as the navigation range of a mobile robot. In addition, some robots live only in a planar “2D” world, while others must cope with all positions and orientations in a spatial “3D” world. The two major mathematical representations of world maps are:
- Grids. The world is divided into a number of squares (in 2D), cubes (in 3D), or hypercubes (in higher-dimensional spaces, such as the 6D space of positions and orientations), and each cell is given a description of how “occupied” it is. The advantage of grids is their simplicity; their disadvantage is that the number of cells grows exponentially with the dimension of the world, so they scale badly.
Quadtrees and octrees are special cases of grids: they are tree structures whose top-level node covers the whole space, and which subdivide the space into four (2D) or eight (3D) cells at each next level in the tree. The advantage is that the subdivision can stop as soon as a required spatial resolution is reached, and this resolution need not be the same everywhere.
- Geometric data structures. All objects in the world are approximated by a combination of primitive shapes, such as points, lines, ellipsoids, prisms, etc. Each primitive has a number of position parameters (where the primitive is located in the world) and shape parameters (what the specific dimensions of the primitive are). The advantage is that no effort is spent in representing empty space; the disadvantage is that each geometric primitive requires its own specific algorithm to map sensor data into the appropriate representation of the primitive.
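The difference between a uniform grid and a quadtree can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the names make_grid, QuadNode, build_quadtree and count_leaves are hypothetical, the grid is assumed to be square with a side length that is a power of two, and occupancy is reduced to a Boolean per cell.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def make_grid(size, obstacles):
    """Uniform occupancy grid: grid[y][x] is True when the cell is occupied."""
    grid = [[False] * size for _ in range(size)]
    for (x, y) in obstacles:
        grid[y][x] = True
    return grid


@dataclass
class QuadNode:
    x: int                            # lower-left cell index of the region
    y: int
    size: int                         # side length of the region, in cells
    occupied: Optional[bool] = None   # None means "mixed": see children
    children: List["QuadNode"] = field(default_factory=list)


def build_quadtree(grid, x=0, y=0, size=None):
    """Subdivide a region only while it mixes free and occupied cells."""
    if size is None:
        size = len(grid)  # assumption: square grid, side is a power of two
    cells = [grid[y + dy][x + dx] for dy in range(size) for dx in range(size)]
    if all(cells) or not any(cells):
        # Uniform region: one leaf is enough, whatever its size.
        return QuadNode(x, y, size, occupied=cells[0])
    half = size // 2
    node = QuadNode(x, y, size)
    for oy in (0, half):
        for ox in (0, half):
            node.children.append(build_quadtree(grid, x + ox, y + oy, half))
    return node


def count_leaves(node):
    """Number of leaf regions, i.e. the storage cost of the tree."""
    if not node.children:
        return 1
    return sum(count_leaves(child) for child in node.children)


grid = make_grid(8, obstacles=[(0, 0)])
tree = build_quadtree(grid)
print(count_leaves(tree), "leaves versus", 8 * 8, "grid cells")
```

For this 8 by 8 world with a single occupied cell, the tree needs 10 leaves where the uniform grid stores 64 cells, because fine resolution is only spent around the obstacle.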
For navigation tasks, it is often relevant to use topological maps too. A topological map encodes only connectivity, without attaching metric distances to objects in the world. Well-known examples of useful topological maps are train and metro route maps, and driving directions.
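A topological map is naturally stored as a graph. The following sketch assumes a made-up metro network (the station names and the METRO dictionary are hypothetical) and uses a plain breadth-first search to find a route with the fewest hops; no distances are involved anywhere.

```python
from collections import deque

# Hypothetical metro map: stations are nodes, direct connections are edges.
METRO = {
    "Central": ["Museum", "Harbour"],
    "Museum": ["Central", "University"],
    "Harbour": ["Central"],
    "University": ["Museum"],
}


def route(graph, start, goal):
    """Breadth-first search: return a route with the fewest hops, or None."""
    previous = {start: None}   # also serves as the set of visited nodes
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the predecessors to rebuild the route.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


print(route(METRO, "Harbour", "University"))
```

The result is a sequence of places to pass through, which is the kind of information driving directions give, rather than a metric path.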
When the world is not static, the dynamics of the objects that move in the world could be relevant for the robot controller.
When the robot engages in physical contacts with its environment, the robot controller requires information about the interaction dynamics, i.e., models of the expected compliance, friction, impact effects, … Such a dynamic model should allow the controller to predict the future or to understand the past over a “large” time interval, i.e., it is a mathematical representation of the time evolution of the relationship between the robot actions and the corresponding environment responses. The time scale of this prediction can be “short” (e.g., natural frequencies of the physical interaction; the future motion of humans moving in the neighbourhood of the robot; or the hysteresis effects of friction or deformations), or “long” (e.g., linking the force, feed rate and number of passes during a robotic deburring task with the quality of the finished product). The models can be lumped (i.e., modelled by a set of interconnected “masses”, “dampers” and “springs”) or continuously distributed. Lumped parameter models typically have a mathematical representation in the form of ordinary differential equations (ODEs), while continuously distributed models require partial differential equations (PDEs), which are much more difficult to solve.
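As an illustration of a lumped-parameter interaction model, the sketch below treats a contact as a single mass, damper and spring, i.e. the ODE m*x'' + c*x' + k*x = f, and integrates it with a semi-implicit Euler scheme. The function name simulate_contact and all numerical values are assumptions chosen for the example, not parameters of any real robot or environment.

```python
def simulate_contact(mass, damping, stiffness, force, dt=1e-3, steps=5000):
    """Integrate m*x'' + c*x' + k*x = f from rest; return the displacements.

    Uses semi-implicit (symplectic) Euler: the velocity is updated first,
    and the new velocity is then used to update the position.
    """
    x, v = 0.0, 0.0
    history = []
    for _ in range(steps):
        a = (force - damping * v - stiffness * x) / mass
        v += a * dt
        x += v * dt
        history.append(x)
    return history


# Assumed example values (SI units): 1 kg, 20 N s/m, 100 N/m, 10 N push.
xs = simulate_contact(mass=1.0, damping=20.0, stiffness=100.0, force=10.0)
print(f"displacement after 5 s: {xs[-1]:.4f} m")
```

With these values the system is critically damped, and the displacement settles at the static deflection force / stiffness = 0.1 m, which is what the controller could use as a prediction of the environment response to a constant push.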
When humans are present in the “environment”, and are physically interacting with the robot, the models should also include knowledge about how humans want to interact physically with robots, how they expect the robots to behave, and how the robot should communicate with humans. This information cannot be found in traditional physics or mathematics, but requires a decent amount of psychological knowledge.
When using a model inside the robot controller, the control developers must decide on the following modelling aspects:
- Model parameters. The model represents information in the form of mathematical relationships between parameters. The developers must decide which parameters to include in the modelled relationships, what their numerical limits are, etc.
- Model structure. A robot controller must decide at each moment in time which of the available models it will effectively use, and how it will interconnect partial models. Typically, making a model consists of selecting the “optimal” combination of available modelling primitives. For example, a map of a building could be constructed by selecting line segments of the appropriate length and with appropriate angles.
- Model semantics. The same mathematical relationships between parameters could be interpreted very differently in the context of different controllers. So, every robotic system must somewhere define the meaning of the model primitives and their parameters.
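The three aspects above can be sketched together on the building-map example. In the code below, the Parameter class declares a name, a unit and numerical limits (parameters and semantics), and wall_corners chains line-segment primitives into a map (structure). All names, units and limits are hypothetical choices made for this illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Parameter:
    name: str
    unit: str      # part of the semantics: what the number means
    lower: float   # numerical limits chosen by the developers
    upper: float

    def validate(self, value):
        if not (self.lower <= value <= self.upper):
            raise ValueError(
                f"{self.name}={value} {self.unit} outside "
                f"[{self.lower}, {self.upper}]"
            )
        return value


# Assumed limits: walls up to 100 m, headings in the map frame in radians.
LENGTH = Parameter("length", "m", 0.0, 100.0)
ANGLE = Parameter("angle", "rad", -math.pi, math.pi)


def wall_corners(segments, start=(0.0, 0.0)):
    """Chain (length, angle) line segments into a list of corner points.

    Each angle is the absolute heading of the segment in the map frame.
    """
    x, y = start
    corners = [(x, y)]
    for length, angle in segments:
        LENGTH.validate(length)
        ANGLE.validate(angle)
        x += length * math.cos(angle)
        y += length * math.sin(angle)
        corners.append((x, y))
    return corners


# A 4 m by 3 m rectangular room built from four wall segments.
room = wall_corners(
    [(4.0, 0.0), (3.0, math.pi / 2), (4.0, math.pi), (3.0, -math.pi / 2)]
)
print(room)
```

Interpreting the angle as a heading relative to the previous segment instead of the map frame would give a different building from the same numbers, which is why the meaning of each parameter has to be fixed somewhere in the system.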
Various robotics research “fields” (for example, the “fuzzy” and “Bayesian” approaches) have widely varying interpretations of the above discussion on models. As mentioned before, the WEBook separates the presentation of various models from the motivation for, and discussion about, their use for specific applications.