1. A Layered Architecture

Building complete systems that intelligently connect sensors to actuators is a challenging endeavor. We believe much of the difficulty stems from the "traditional" artificial intelligence approach of breaking a control system into a number of monolithic functional slices (see Figure 1a). There is typically some perception system, which is then followed by a modelling component. The output of this then feeds a general-purpose activity planner whose directives are carried out by an execution monitoring stage. The problem with having just one perception component is that it has to be complete enough to provide all the information that might be needed by succeeding stages. Also, such large, comprehensive systems are notoriously slow - not a good feature for a real-time robot. Building even one such system is itself a large undertaking that can occupy teams of researchers for many years. Because these efforts are typically carried out in isolation from each other, there is no guarantee that they will have compatible interfaces in the end. The perceptual system may segment the world according to edges and texture patches, whereas the planner might want to deal with individuated "objects". The planner might also presume that properties, such as the type of material composing an object, will be provided even though they may not be derivable from the available sensor data.

Figure 1 - Two types of decomposition. a. The traditional chain of highly competent functional subsystems. b. Brooks's collection of parallel special-purpose control paths.

The Subsumption Architecture devised by Brooks at MIT provides an alternative to this type of system. Instead of having a single chain of general-purpose functional slices, there are a number of parallel control paths, each with its own perception, modelling, planning, and execution components (see Figure 1b). These paths are typically associated with some task, such as following walls or grasping objects. The advantage of such a system is that each component only has to be competent enough to support the function delegated to its layer of control. Such special-purpose modules are much easier to build than their more complete counterparts, and typically run quite rapidly. In addition, the competence of a Subsumption Architecture system can be naturally expanded by simply adding more control paths to the existing system. The Subsumption Architecture has been successfully used in this manner on a number of robots.

Partial Representations

What the Subsumption Architecture does not specify is how to decompose a task into a number of separate behaviors. In practice, perception is by far the hardest problem in mobile robotics. With the limited sensory suite commonly available, it is difficult to extract semantically meaningful information from the world. Yet extracting such information in a timely manner is essential for the creature to behave intelligently. Thus, the form and content of those sensory representations which are easily computable from the data at hand are key factors in determining what sort of task-directed subsystems can be constructed.

In robotics, the standard approach has been to first try to reconstruct an accurate internal representation of the world, a sort of diorama. The system then measures and compares various aspects of this representation to make control decisions. However, building and using such a representation is fraught with difficulties. Sensors must be properly calibrated and their readings transformed into a global coordinate frame; then the information from different sensory modalities must be combined and integrated over time. Interpreting the raw data to yield compatible types of information from different kinds of sensors is difficult. Maintaining the consistency of the representation across large spatial distances or over extended intervals of time is problematic. Accurate models also imply accurate control. Thus, some precise and properly compensated motion controller must be used to carefully execute the steps of the high-level plan while also handling any contingencies that may arise.

Figure 2 - Animals seem to use incomplete models for many activities. Baby seagulls respond just as well to the mockup on the right as they do to their own parent (left). The critical features are that the object must be pointed and have a red spot.

As a contrast to the usual robotics approach, let us examine some work from the field of ethology, the study of animal behavior. Much effort has been devoted to finding the "releasing" stimulus for a particular behavioral pattern. Through carefully controlled studies, researchers have been able to determine exactly which features of a situation an animal is paying attention to. Typically, creatures do not have very detailed models of the objects they interact with. For instance, when baby seagulls detect the arrival of one of the parents, they raise their heads, open their mouths, and start squeaking in a plea for food. The baby birds do not recognize their parents as individuals, nor are they good at distinguishing seagulls from other animals or even inanimate objects. The birds respond just as well to a simple mockup as to the real parent (Figure 2). The important condition seems to be the presence of a pointed object with a red spot near its tip. In their natural environment, this model works fine because the real parents are the only objects which fit the bill. The same sort of minimal representation has been discovered for many other animals as well. This suggests that we might be able to build reasonably competent mobile robots without investing a large amount of effort into building detailed representations.

The particular type of partial representation we have adopted is mostly local and mostly stateless. Our representations are essentially "snapshots" of certain key types of situations. In this respect, our work can be considered an instance of the "matched filter" approach that seems to pervade insect nervous systems. The idea is to recognize the class of environmental conditions which call for either a particular action or for a transition between modes of operation. The resulting situation-action rules (which we call "behaviors") are then arranged into prioritized sets to control the robot. Since we use mostly "reflex-like" direct responses, the modelling, planning, and execution components of our control paths are almost non-existent.
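
To make the "snapshot" idea concrete, here is a minimal sketch of a stateless situation test: a single set of range readings is compared against a stored template. It is a plain tolerance comparison, offered only as a loose illustration and not as a signal-processing matched filter or as the recognizers used on the actual robots. The three-sensor layout, the CORRIDOR template, and the tolerance value are all hypothetical.

```python
from typing import Sequence


def matches(snapshot: Sequence[float], template: Sequence[float], tol: float) -> bool:
    """Return True when every reading lies within tol of the template.

    The test uses only the current snapshot: no history and no global
    coordinates, which is what "mostly local and mostly stateless" suggests.
    """
    return len(snapshot) == len(template) and all(
        abs(s - t) <= tol for s, t in zip(snapshot, template)
    )


# Hypothetical template: (left, front, right) range readings in metres
# for "robot is in a corridor" (walls close on both sides, open ahead).
CORRIDOR = (0.5, 3.0, 0.5)

if __name__ == "__main__":
    print(matches((0.6, 2.8, 0.4), CORRIDOR, tol=0.3))
    print(matches((0.6, 1.0, 0.4), CORRIDOR, tol=0.3))
```

A detector of this kind would serve as the "situation" half of a situation-action rule; the action half is whatever primitive response the designer pairs with it.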

Structuring Principles

If we have only minimal planning in each behavior, how can the robot's individual reflexes be coordinated to achieve some goal? We start by enumerating a sequence of local environmental configurations that would be experienced as the robot performs a specific task. We then devise a configuration of sensors such that each situation generates a uniquely recognizable signature. Finally, we build simple interpretation routines that link the detection of each situation to an appropriate primitive action or simple control law. Obviously, given our coarse world modelling, some of these stimulus-response pairs might be simultaneously activated. To overcome this, we impose a priority ordering on the whole set of reflexes. More specific rules generally take precedence over vaguer suggestions or default behaviors. Similarly, behaviors likely to be encountered only in the later phases of some task are ranked more highly than the behaviors used in the earlier stages of the task. Finally, tactical behaviors, which need to respond quickly as events change, take precedence over strategic behaviors embodying the robot's longer-term policy.
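
The priority ordering described above can be sketched as a simple arbiter that picks the highest-ranked behavior whose situation is currently detected. This is an illustrative sketch only: the Behavior class, the sensor names, the thresholds, and the three example behaviors are assumptions, not taken from any particular robot.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Sensors = Dict[str, float]  # hypothetical sensor snapshot: name -> reading


@dataclass
class Behavior:
    name: str
    priority: int                       # larger number takes precedence
    applies: Callable[[Sensors], bool]  # situation detector
    action: str                         # label of the primitive action


def arbitrate(behaviors: List[Behavior], sensors: Sensors) -> Optional[str]:
    """Return the action of the highest-priority behavior that is triggered."""
    active = [b for b in behaviors if b.applies(sensors)]
    if not active:
        return None
    return max(active, key=lambda b: b.priority).action


# Hypothetical rule set: a vague default, a more specific rule, and a
# fast tactical reflex that outranks both.
BEHAVIORS = [
    Behavior("wander", 0, lambda s: True, "forward"),
    Behavior("follow_wall", 1, lambda s: s["side_range"] < 1.0, "track_wall"),
    Behavior("avoid", 2, lambda s: s["front_range"] < 0.3, "turn_away"),
]

if __name__ == "__main__":
    print(arbitrate(BEHAVIORS, {"side_range": 0.5, "front_range": 0.2}))
```

In the last call both follow_wall and avoid are triggered at once, and the tactical avoid behavior wins because it carries the higher priority.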

Strangely enough, the priority structure we impose can also play a representational role. In many cases, the elements of a situation recognized by some behavior are not unique; there may be several different environmental configurations with the same signature. Like a baby seagull, our robots rely on a sense of normalcy, expecting some statistical regularity in their experiences. It is the job of previous behaviors in a sequence to constrain circumstances to such an extent that this assumption is seldom violated. Sometimes the behavior itself may help further test its own triggering condition by slowly changing the relation of the robot to the external world. In such a case, the robot proceeds as if the situation is the desired one until some anomaly is detected. This detection is the responsibility of a number of other behaviors which look for counter-indicative aspects of the current situation. If such non-nominal features are found, these behaviors usurp control and help the robot recover from its mistake or redirect its attention elsewhere.

Still, how can a collection of such mostly local and mostly stateless routines avoid getting stuck in local minima or infinite loops? Our solution is to have the robot continuously monitor a number of global progress indicators derived from incoming sensory data. In our systems, there are typically a number of behaviors which check to see whether the robot is no longer advancing toward the goal or has failed to make substantial headway over an extended interval of time. When such problems occur, these special-purpose routines kick in to unwedge or reinitialize the robot.
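
One way such a progress check might look is sketched below. It assumes a single hypothetical indicator, a number that shrinks as the robot nears its goal, and flags the robot as stuck when that number has not improved by a minimum amount over a fixed window of readings. The class name, window length, and threshold are illustrative choices.

```python
from collections import deque


class ProgressMonitor:
    """Flag a stall when the indicator improves too little over a window."""

    def __init__(self, window: int, min_gain: float) -> None:
        self.window = window
        self.min_gain = min_gain
        self.history = deque(maxlen=window)  # keeps only the latest readings

    def update(self, distance_to_goal: float) -> bool:
        """Record one reading; return True when the robot appears stuck."""
        self.history.append(distance_to_goal)
        if len(self.history) < self.window:
            return False  # not enough evidence yet
        gain = self.history[0] - self.history[-1]
        return gain < self.min_gain


if __name__ == "__main__":
    monitor = ProgressMonitor(window=3, min_gain=0.5)
    for reading in [10, 9, 8, 8, 8]:
        print(reading, monitor.update(reading))
```

A True result would be the trigger for one of the recovery routines; the monitor itself holds only a short history, so it adds very little state to the system.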

Using the above guidelines, behaviors can be chained together to generate "trajectories" for the robot. Typically, the robot starts by moving in some default direction to get out of its initially ambiguous situation. Eventually, it gets to a place where there is enough information for some more specific behavior to take over. This behavior then retains control until it has changed the current situation enough to make it recognizable to one of the follow-up behaviors. This repeated transfer of control continues until the robot reaches its goal state or fails to meet some global progress measure. At this point, a different group of control behaviors is usually selected by setting some longer-term state bit. The major advantage of specifying the robot's trajectory as a set of response rules in this way is that we avoid precommitting to a particular order of events. Thus, the robot can handle "unexpected" occurrences and can take advantage of any "short-cuts" that may appear.
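
The chaining of behaviors can be illustrated with a toy one-dimensional world. In this sketch the robot sits at an integer position, a hypothetical beacon marks the goal, and three prioritized rules (stop, approach, cruise) react only to the current distance reading. The world, the step sizes, and the rule names are all invented for illustration.

```python
from typing import List


def run(start: int, goal: int = 10, max_steps: int = 50) -> List[str]:
    """Simulate the rule set and return the sequence of behaviors that fired."""
    x = start
    trace: List[str] = []
    for _ in range(max_steps):
        dist = goal - x               # the only "sensor" reading
        if dist == 0:                 # highest priority: goal situation
            trace.append("stop")
            break
        elif 0 < dist <= 3:           # specific: beacon is within range
            trace.append("approach")
            x += 1
        else:                         # default: head in a fixed direction
            trace.append("cruise")
            x += 2
    return trace


if __name__ == "__main__":
    print(run(0))   # starts far away, so the default behavior fires first
    print(run(8))   # starts near the beacon and skips the default entirely
```

No sequence is stored anywhere in the rules. Starting from position 0 the default behavior runs until the beacon comes into range, whereas starting from position 8 the robot goes straight to the approach rule, which is the kind of "short-cut" a fixed plan would not exploit.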


Copyright © 1992, Johuco Ltd., All Rights Reserved