1994 VR Conference Proceedings

Go to previous article 
Go to next article 
Return to the 1994 VR Table of Contents 

Constructing the Virtual Assistive Agent

By: Philip Barry
George Mason University
Fairfax, Va. 22030
& Synetics Corp.
Vienna, VA. 22181


John Dockery
George Mason, C3I Center
Fairfax, VA. 22030
& Defense Information Systems Agency
Falls Church, VA. 22041
E-Mail: Phil_Barry@synetics.com 


In previous work, directed at examining the theoretical developments necessary to make virtual reality (VR) a pathway to barrier free access, the concept of a smart virtual reality emerged. To describe both internal/input and external/output aspects, the terms Intelligent Virtual Reality (IVR) and Intelligent Assistive Technology (IAT) were coined. Central to the use of IVR for the disabled is the creation of a suitable task emulating VR using what we have called Intent Amplification (IA). IAT focuses on the input problem of the limited signaling capability of disabled persons. Essentially IA bootstraps a suitable VR by iteratively interpreting metaphorical or iconic inputs, and is described in earlier work. This contribution addresses the output side of the problem. Given successful task completion in the VR world, how is the task to be carried out in the external reality? Previously, external robotic agents, each with a repertoire of skills, and each of which could negotiate among themselves, was proposed. The concept is expanded upon in this paper. We consider the creation of a Virtual Assistive Agent (VAA), which is a logical construct comprised of skills borrowed from individual agents. The construct is dynamic as well. VAAs are fashioned along traditional object-oriented lines in that the mechanics of task constructions are encapsulated. Therefore, only the lower level agent's association with the aforesaid tasks is relevant. In this architecture, the user is viewed as an agent so combinations of human and robotic skills become possible. The key contribution in this paper is the introduction, and exploration, of established group theoretic methods to build the VAAs on demand. A hierarchical construction methodology for the VAAs with genetic algorithms as an optimization scheme is suggested to combat the combinatorial explosion inherent in the proposed approach. An example is presented to illustrate these concepts.


We will use this introduction to present a very, very condensed history of our efforts to date; otherwise discussion of virtual agents might not make sense. In a series of papers the theoretical promise of using virtual reality (VR) as a means of providing barrier free access to persons with disabilities was pursued by Dockery and Littman. [1,2,3] Key to that research was the introduction of what they call Intelligent Virtual Reality (IVR), which combined principles of VR with artificial intelligence (AI). In order to move the theoretical development more into practice the design environment was expanded to include the barrier free IVR and the barrier strewn reality. In a paper by the Barrys, Dockery and Littman, the concept of Intelligent Assistive Technology (IAT) was introduced.[4] Central to the use of IAT is the further concept of Intent Amplification (IA). IA essentially permits the user to produce an IVR tailored to his needs via a bootstrap process that begins with the communication of an imprecisely stated intention to carry out a task. Even assuming the successful representation of the task environment, together with successful (but still virtual) task completion, there is still an essential issue unaddressed. How to make it happen in the real world?

After completion of a task successfully in the IVR, a message is assumed to be generated instructing someone or something to execute the task in reality. The design issue is then how one can build a system that dynamically interprets these task instructions emanating from the IVR. Previously, external robotic agents, each with a repertoire of skills, was suggested. However, except in the most limited of circumstances, this approach is confining. In this paper we address this limitation by introducing an intermediate layer of information agents designed to provide an interface between the robotic agents and the IVR. These agents can work together to interpret the user input, as carried out in the virtual world, by constructing complex task oriented behaviors for collections of real world, robotic agents. We have dubbed systems of agents working together as Virtual Assistive Agents (VAAs). Key to this concept is that the virtual assistive agents are constructed on the fly and exist only as long as they are useful. VAAs are logical constructs, similar to a blueprint for complex system behaviors. If one reflects momentarily on this, VAAs can include people both user and caregiver, and servo-mechanisms.

Dealing With Agents

As we have defined them, agents posses a diverse store of implementing capabilities. Capabilities have been defined as those actions that can implement a transformation. In this paper we borrow from group theory to model thesetransformations; and provide a straightforward mathematical framework to define how capabilities can be combined for more complex tasks. Incidentally, group theory has been used extensively in physics and chemistry to model transformations.

As the numbers of capabilities rises, the possible ways to combine them grows much faster. In fact, without extensive a priori domain knowledge, trying to determine which agents can be combined to form beneficial complex behaviors becomes computationally intractable due to the combinatorial explosion. Barry and Littman have previously proposed the use of genetic algorithms (GAs) for the optimization of coordination schemes between agents.[5] GAs are so named because they are patterned after the way in which genes evolve through evolution and natural selection. We now propose to use GAs to conduct a search through the many and diverse combinations of possible capabilities. Call this set of combinations, the capability space. GAs then provide a method for achieving our goal of building VAAs. Moreover, we can even foresee developing templates for the construction of VAAs.

Genetic algorithms themselves are an active area of research because of their potential for solving such complex problems as production scheduling on a shop floor. [6] Genetic algorithms have a nice feature. They do not require an understanding of the relationships and dependencies of problem variables. All they do require is an encoding of the problem, and the development of an evaluation (or "fitness") function. Since we are examining a combinatorially complex space with unclear interactions between every capability, genetic algorithms seemed a good choice. Intrinsic in this approach is the assumption that complex behaviors can be built from fundamental behaviors in a hierarchical fashion. With this in mind it seems possible to develop simpler VAAs; and then to combine them together for complex behaviors.

This remainder of this paper will first expand on the brief introduction to our concept of agents. There follows a word on group theory which is in turn followed by a quick primer on genetic algorithms. We will close by constructing a sample VAA using the proposed tools.

The Compleat Agent

Basically, an agent is any logical construct that can accomplish tasks. Within the IVR framework agents can be associated with input tasks, output tasks, or internal IVR tasks. We focus on a particular type of agent on the output side. This agent class is a specialist in creating a VAA from task requirement information. Because they are going to be members of a mathematical structure called a group, our agents must have some special features; or some would say, restrictions.

To any agent A, let us associate a group of capabilities C, with elements ci. Let us assert that each capability can be undone; in other words an inverse ci-1 exists for each ci. Let us further assert that any two actions, when combined, will result in an action that is specified independently somewhere else. Essentially this states that the universe of agents's capabilities is conceptually closed. This is a vital constraint, the importance of which will be discussed directly.

Agents use capabilities to achieve goals. Agents may possess more than one capability, and there are numerous agents in the VR universe. Therefore, which agent, and which capability, is chosen at a given time must be determined by our system according to the perceived goals of the user as expressed in the information space describing the current state of the universe. The information space is dynamic as are the goals of the users. Therefore, the IVR based system must employ a real time decision making capability to activate agents and capabilities. We have not tied the reasoning structure in this model to a given AI paradigm. The key element for that circumstance is that the IVR can reason to implement the goals of the user. For example, distributed decision making via intelligent agents is certainly an option. (The next section may be bypassed on a first read.)

A Sound Byte On Group Theory

In the IVR, capabilities are actions which transform the space in some way. A mathematical concept which can represents transformations is the group. Basically a group (G, *) is a collection (set (G)) of things endowed with primitive, but paradoxically powerful, mathematical properties (*). Formally speaking (G, *) is a set G together with a binary operation *satisfying the following four axioms [See for example 7]:

A1. [Closure] If a and b are in the group then so is the result of a*b for all a, b elements of G.

A2. [Associativity] (a*b) *c = a*(b*c) for all a, b,c elements of G.

A3.[Identity] There is a member of G called e such that e * a = a* e = a for all elements of G.

A4. [Inverse] Each element a in G has an inverse element a-1 in G such that a-1*a = a * a-1 = e.

Comment on A1: As we stated earlier, we demand that the IVR universe be closed. That is to say that not only are all the individual capabilities present, but all their binary combination as well. This property is formalized in axiom 1 above. For VR axiom 1 may not be unrealistic; but we are still pondering its impact on real task construction.

Comment on A2: As long as the overall order of the capabilities being exercised is consistent, the outcome will be the same regardless of the individual combinatorial sequence. The implication for computational implementation is that we can combine elements of a complex behavior in any order that is convenient as long as we maintain the overall sequence. The actual execution of the combined capability will then not violate temporal order. That is, we grasp and lift before carrying. Notice that item two did not say the elements were commutative, only associative. If a * b = b * a the group is a special construct called an Abelian group. Physically this would mean that the ordering of the capabilities would have no effect on the overall effect. An example of this would be to turn the TV to channel 3 and raise the volume two notches; if the order is reversed the end result is the same.

Comment on A3: The existence of an identity means that there is some capability which results in no action--a place holder as it were. For example, once a switch is "on", pressing it repeatedly won't make it more "on".

Comment on A4: Each capability has an inverse which can undo the action of the former. For example, consider the action of a toggle switch. Where we run into trouble on this one is with things like "unpouring" concrete. Fine for VR, but tricky in the real world! (The next few lines are for the mathematically inclined only.)

We need one other bit of group theory having to do with subgroups because we think of each isolated set of related capabilities as part of some larger group. (Our mission in building complex behaviors is to populate that unknown larger group only for those entries of interest to the immediate task at hand.) For the record! For a given group (G,*), we can define a non empty subgroup (H,*) by the following axioms:

B1. [Closure] If a and b in H, then the results of a*b are also elements of H.

B2. [Inverse] For all elements a in H, there exists a-1 in H.

(Note that if (H,*) is a subgroup of (G,*), then (H,*) itself is also a group.)

We needed the foregoing because we need Lagrange's theorem for computational purposes. This is in turn because the theorem sets numerical limits on the count of possible subgroups.

[Expressed below as o.] Formally speaking: If G is a finite group, and H is a subgroup of G, the o(H) is a divisor of o(G).

[The subgroups may, or may not, exist.] Further, if G is Albelian, than there exist subgroups with orders corresponding to prime number factors of o(G). [These subgroups will exist.]

We now turn to the issue of constructing a VAA.

Virtual Assistive Agents--The Formalism

Suppose two agents each have individual skill sets as described by their capability groups, C1 and C2. If a problem described in a given task state space P necessitates using the capabilities from both groups, we need to form a combined capabilities group K . Since K is a composite action, let K be composed of elements of the groups C1 and C2 and their combinations. Thus, c1, c2 are elements of K as are composite actions (c1*c2). If we impose associativity, and assert the existence of an identity and inverse, we can form a top-level description of K as a group. Clearly we can extend the process to include a list of capability sets Ci , i=1,n. This would create a K', which could be a very large group indeed.

We think that groups provide a natural mechanism for the representation of how capabilities from different agents can be combined. This is the mathematical basis for the VAA. In this case, the K is the VAA and will exist as a logical construct as long as it is convenient. The actual capabilities remain with the agent. The VAA provides a road map as to how to activate them to achieve the desired goal.

As VAAs become more complicated, it would be useful to have a schema that will provide for their decomposition. Practically speaking this means breaking off pieces for less complicated tasks or using pieces to create new VAAs. A generic framework from group theory is provided that describes how proper subgroups can be created for such a decomposition. For this we use Lagrange's theorem previously touched upon.

If K is a finite group and C1 and C2 are subgroups, by Lagrange's theorem, the o(C1 ) and the o(C2) divide o(K), where o(Ci ) is defined as the number of elements in Ci. In other words, a proper subgroup of the VAA cannot consist of a number of elements that is not divisible into the number of elements in the VAA. Since K is finite, the number of proper subgroups is finite. Further, if Hi are finite subgroups consisting of lower level actions, the number of composite actions in K is finite.

If K is a finite Abelian group and the prime p divides o(K), then K contains an element of order p and hence a subgroup of order p. This is a partial converse of Lagrange and states the existence of subgroups of p. This is an important result. For example, if a group contains four elements and is Abelian, we know that there are at least subgroups with 1 and 2 elements. Consequently, a subgroup decomposition with three elements is prohibited.

Albelian groups would make life computationally much easier. The question becomes whether or not we can still explore realistic situations. If we assume an Abelian group for the VAA, we have made an important physical statement, it being that the order of the group members does not matter; i.e., actions can be reversed in order and the outcome is the same. Although this shrinks our universe of possible capability combinations, there remains enough to proceed to experimentation. For this paper we constructed an example in which Abelian and non-Abelian cases are mixed. For evaluation of that we turn to GAs.

Genetic Algorithms--A Quick Primer

The IVR space needs an algorithmic approach to discover construction paradigms for VAAs. This approach must not demand an extensive understanding of the interrelationships between the problem variables; must be robust; and must be computationally tractable to facilitate near-real and real time performance. In this paper, we suggest the use of genetic algorithms.

Genetic algorithms are based on the mechanics of natural selection and genetics. A genetic algorithm consists of a population of binary strings that is optimized with respect to a fitness function. The problem is first encoded as a binary string. A population of the strings is generated at random and then evaluated and scored by the fitness function. Three operations are then employed to optimize the population: reproduction, crossover, and mutation. Reproduction copies individual strings according to their objective fitness function. In other words the strings with higher fitness functions will get copied and the strings with lower fitness functions will get deleted. The threshold as to which members of the population can reproduce is usually determine by a weighted monte carlo method.

Crossover selects two strings from the newly reproduced pool and mates them. Mating consists of selecting a position k on the strings A and B of length l. The new string A will consists the elements of 0 ? k from A, and elements of B from k l. Similarly, B will now consist of elements 0 ? k from B and elements of A from k ? l. This process ensures new members of the population are continually introduced so that other parts of the problem space are explored.

Mutation is the random flipping of a member of the string from a one to a zero or vice versa. As discussed in [8], "mutation is needed because, even though reproduction and crossover effectively search and recombine extant notions, occasionally they may become overzealous and lose some potentially useful genetic material." Mutation ensures that the GA does not get caught in a conceptual valley in the problem space.

The development of the fitness function to evaluate members of the population is the key challenge in the use of genetic algorithms to develop VAAs. However, recall that the primary type of agents in the VAA is informational. Therefore, we suggest that the fitness function can be simulated by implementation of the capabilities in the string; and then comparing the closeness of the result of the simulation to the desired result. If the results of the initiation of each capability on the state space can be predicted, updating of relevant state space variables is all that is necessary to evaluate the effectiveness of the proposed course of action.

Virtual Assistive Agents In Action--An Example

To illustrate how group theory and genetic algorithms can be used to construct Virtual Assistive Agents, imagine a scenario where, through either a helmet mounted display (HMD) or goggles or glasses, a users' view of reality is augmented with a toolbar, similar to those found in MS WindowsTM. The purpose of the toolbar is to provide the capability to pay checks, modulate the temperature in the house, watch TV and dial the telephone. The IVR in this case provides the interface to the physical implementation required to actually execute these tasks. This augmented reality is shown in figure 1, which simulates the user looking into an office with the toolbar in the lower left hand corner.

Suppose the user now wants to dial a fellow IVR user and expound upon the merits of this new technology. He can select the phone icon (either by pointing, using eyegaze technology, etc.) and a window will pop up in his field of vision as shown in figure 2. The user then can select the list, move up and down the list, and select a person or company to dial. By using a simple lookup table and proven technology for the computer to dial the phone, the user can be connected to whomever he desires. When he is done, the phone can be hung up and he will be disconnected. Using a similar approach, the user can be provided with the capability to enhance his reality by watching TV, paying checks, or modulating the room temperature. The TV case is illustrated in figure 3.

It is not unreasonable to now ask what makes the previous example into a simple IVR. It can also be asked what role do agents play in the execution of the tasks. To address the first issue, this is an IVR instead of a VR because the environment can use various means of intent amplification to infer what the user wants and to execute the tasks. For example, instead of requiring the user to select the check icon, select the list, increment the list, select the payee and pay the check every single time he wants to pay a check, the environment should begin to learn. Thus, when intent is signaled that the user wants to pay a check IA takes over. The icon will be selected, the window opened, a specific member of the list chosen, and the indication made that the check should be paid. For a user with a disability this becomes vital.

Figure 1: Augmented Reality with Toolbar

Figure 2: Phone Window

Figure 3: Watching TV

As we have discussed, agents have capabilities. Clearly, we would not want to assign an agent for each task in the IVR example, but would like to have agents with elemental capabilities that we can combine to form more complex behaviors. If we postulate the existence of agents that have the capabilities to accomplish tasks in the example environment, how can these agents organize themselves into behaviors that will be beneficial to the user? A practical approach of group representation of agent capabilities with genetic algorithms as a search methodology will now be examined.

Groups In The Example

To define the groups in our example, we first need to "lay down the law" with five basic rules of behavior in our IVR. These are as follows:

  1. To do any operation (e.g. increment lists, dial phones, pay checks, etc.) the capability must be selected.
  2. To accomplish any function within a window, that window must be selected and opened. For example, the phone cannot be dialed when the user has selected the TV window.
  3. List operations are generic and act upon the local list (e.g., if the list is incremented in the phone window the phone list not the check list will advance. However, the agents creating the transformation are the same.)
  4. Closing a window deactivates all its capabilities.
  5. Multiple windows can be opened. Control resides with the selected window.

Operators have groups of capabilities that can cause transformations. In our example, we have identified the several agents that can cause generic transformations. These agents are selection, window and list. In our definition of groups we mentioned that any two members combined with a binary operator must result in another member of the group. Therefore, for each group we can develop a table which shows these interactions. The selection operator has four capabilities: Select TV, Select Check, Select Phone, and Select Temperature. These selection capabilities will either select the icon if the window is closed; or the window if the window is open. The binary operator refers to choosing both capabilities in order; e.g., Select TV * Select Check means Select TV and then Select the Check.

See accompanying Table, showing Selection Group, List Group, and Volume Group matrices.

Optimizing The Population

Now that we have defined the operators and the rules of combinations (groups), we can evolve desirous behaviors for the IVR using genetic algorithms. The first step is to identify how the population strings will be constructed. We elect a simple encoding by assigning a unique identifier to each capability within the group. The population strings are then sequential executions of these instructions. The progress of the executions is kept track in a matrix of state space variables. The fitness of a given string is assigned by executing the string and comparing the outcome of state space variables to the goal.

For example: Let's assign the following task together with an arbitrary task number in (): increment list (1), decrement list (2), select list (3) and deselect list (4). Let us further assume that our immediate goal is to increment list #1. We assign a fitness of 2 for selecting the list, and 10 for incrementing the list #1. We randomly populate our GA, run several generations with it and get the resultant population of task numbers.

(1 - 2 - 3 - 4), (3 - 2 - 2 - 2), and (3 - 1 - 4 - 3)

The members of the three populations will then be assigned fitness of 0, 2, and 10 respectively. Our system has generated the desired behavior of incrementing the list #1 without explicit instructions as to how to do it.


Experiments were run to evaluate the utility of this approach, and the results were very encouraging. The software used is an off-the-shelf product called "MicroGA/Galapagos". The basic test bed was defined with 80 moves (genes), a population of 90 chromosomes (sample behaviors) and a mutation rate of 1%. The population was quickly optimized to exact answers for simple behaviors such as making a phone call. Matches were achieved in less than 90 iterations. However, even for more complex behaviors encouraging results were also achieved. Figure 4 shows that an 80% of optimal behavior was achieved for four sample complex behaviors. Behavior 1, labeled 12K1, refers to a behavior of making a phone call to list number 5, paying a check for list number 5, and watching TV at a volume setting of 5 on channel 5. In less than 900 iterations of the GA the optimal value of an 80% fitness function was achieved. (This experiment was run to 12.000 iterations, but no further improvement was found.)

Similarly, behavior 2, labeled 12k2, refers to a behavior of making a phone call to list number 3, paying a check for list number 3, and watching TV at a volume setting of 3 on channel 3. As would be expected, since less searching was involved, this experiment reached its optimal value of 80% at less than 100 iterations. No improvement was found for the additional 11,900 iterations run. Behavior 3 (6K1) was a simulation of making a phone call to list number 3, paying a check for list number 3, and watching TV at a volume setting of 2 on channel 3. It reached the optimal value in less than 500 iterations. Behavior 4 (6K2) was defined as making a phone call to list number 6, paying a check for list number 1, and watching TV at a volume setting of 1 on channel 1.

Figure 4: Results of Complex Behavior Search

We comment on these results in general terms. Firstly is that the parameters of the GA were not optimized. Secondly, these results represent a very preliminary search of the combinatorial space. Further, the fitness function was not optimized either; and moreover, used several step functions, i.e. the fitness function did not provide a smooth surface for the GA to search. It is likely that a continuous evaluation function would provide better results for this and more complex problems.

Summary And Endnotes

We have introduced the concept of Virtual Assistive Agents (VAAs) constructed from Logical Agents in the Intelligent Virtual Reality. Group theoretic concepts have been used for the knowledge representation of the effects of the combining various agent capabilities. We have proposed searching the combinatorial space of agent capabilities with genetic algorithms with the goal of forming VAAs. This approach was demonstrated in an example, and preliminary results were presented.

More work needs to be done, especially in the optimization of the GA, and the generation of the fitness function in a non-a priori fashion, to demonstrate the scalability of this approach for the construction of VAAs in IVR space. However, from our preliminary results the employment of group theory and GAs appears to be a viable path for the construction of VAAs.


  1. Dockery, J. and Littman, D., "Intelligent Virtual Reality as a Universal Interface for the Handicapped", in Proceedings of a Conference on Interface to Real and Virtual Worlds, Montpelier, France, March 23-27, 1992, available through Informatique, EC2, 269, rue de la Garene, 92024, Nanterre, Cedex, France.
  2. Dockery, J. and Littman, D., "Intelligent Virtual Reality in the Setting of Fuzzy Sets", in Proceedings of the NAFIPS [North American Fuzzy Information Processing Society] Conference, Puerta Vallarta, Mexico, December 14-17, 1992.
  3. Dockery, J. and Littman, D., "Fuzzy Logic Implementation of Intent Amplification in Virtual Reality", in Proceedings of FUZZIEEE II, San Francisco, CA March 28-31, 1993.
  4. Barry, P., Dockery J., Littman, D. and Barry, M., "Intelligent Assistive Technologies" in Proceedings of the Conference of Virtual Reality and Persons with Disabilities, San Francisco, CA, June 17-18, 1993, CSUN, Center for Disabilities, Northridge, CA.
  5. Barry, P. and Littman, D., "Exploring Coordination Schemes in Design Spaces", Unpublished, January 1993.
  6. Parunak, Van D., "GAÆs and Production Scheduling", in Genetic Algorithms Digest, March 31,1994.
  7. Herstein, I.N. (1975). Topics in Algebra. Xerox College Publishing, Lexington, MA, pp. 26 - 29, 40.
  8. Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning. Reading: Addison-Wesley Publishing Company, Inc., pp. 1-23.

Go to previous article 
Go to next article 
Return to the 1994 VR Table of Contents 
Return to the Table of Proceedings 

Reprinted with author(s) permission. Author(s) retain copyright.