Keywords: Interface Agents, Interactive Graphical System, User Adaptation, Multimodal Input, Open Input.
Therefore, techniques have to be explored and realized which enhance and simplify the interaction with a virtual environment in such a way that users are relieved from technical detail and can concentrate on their primary tasks.
In the last decade, interface agents have become prominent as a new paradigm for the design of more intelligent user interfaces [2, 6]. By mediating the relationship between the technical system and the user, they allow more human-like forms of communication and can thus add comfort to human-computer interaction [3, 5, 7]. Maes [5] has realized personal assistants, e.g., for electronic mail handling and electronic news filtering, which accumulate knowledge about the tasks and habits of their users in order to act on their behalf. In the VIENA project [7], we consider the manipulation of objects in a virtual office by means of simple natural language input. A multiagent interface system acts as a mediator that translates abstract verbal user commands into quantitative, technical commands which are used to update the visualized scene.
My Ph.D. thesis research contributes to the work in the VIENA project. To further enhance the interaction with a virtual environment, I investigate three main aspects: adaptation to user preferences, multimodal input, and open and underspecified input. I use agent-based techniques to develop my solutions. The next three sections describe these aspects in more detail.
In our approach, we consider a system of multiple interface agents which adapts to user preferences by learning from direct feedback, without explicit acquisition of user data. Avoiding explicit user modeling seems a desirable goal because explicit user models have drawn criticism with respect to the privacy of user data [6]. The core idea of our approach to implicit user adaptation is that agents of the same type but with slightly different functionality, corresponding to possible variations of users' preferences, organize themselves to meet the preferences of the individual user. On receiving positive or negative feedback from the user, agents increase or decrease their degree of self-confidence, so that successful agents become dominant in the ongoing session.
From the system-internal point of view, the adaptation process is achieved by a form of reinforcement learning [1]. Learning is realized in such a way that the system takes actions that maximize the reinforcement signals received from the environment. In our approach, this means that users' instructions (or corrections, respectively) represent reinforcement signals which are interpreted and encoded by the interface agency in the form of credit values. Each agent stores a credit value corresponding to its quality ("strength") at discrete periods of time. Learning is achieved by adjusting agents' credits according to the users' feedback and by assigning the task in question to those eligible agents which have maximal credit.
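The following sketch illustrates this credit mechanism in Python. The class Agent, the fixed reward step, and the reference-frame variants are my own illustrative assumptions, not the VIENA implementation.

```python
# Minimal sketch of credit-based agent selection and reinforcement.
# Names and the fixed reward/penalty step are illustrative assumptions.

class Agent:
    def __init__(self, name, reference_frame, credit=1.0):
        self.name = name
        self.reference_frame = reference_frame  # preference variant this agent covers
        self.credit = credit                    # current "strength" of the agent

    def eligible(self, task):
        # In the real system, eligibility depends on the task at hand;
        # here every agent is assumed to be eligible.
        return True


def select_agent(agents, task):
    """Assign the task to the eligible agent with maximal credit."""
    candidates = [a for a in agents if a.eligible(task)]
    return max(candidates, key=lambda a: a.credit)


def reinforce(agent, positive, step=0.1):
    """Encode user feedback (instruction or correction) as a credit update."""
    agent.credit += step if positive else -step
    agent.credit = max(agent.credit, 0.0)  # keep credits non-negative


# Usage: two competing agents interpreting "left of the desk" differently.
agents = [Agent("A1", "speaker-centered"), Agent("A2", "object-centered")]
chosen = select_agent(agents, task="place object")
reinforce(chosen, positive=False)    # the user corrects the result
reinforce(agents[1], positive=True)  # the alternative interpretation succeeds
```

After a few such feedback steps, the agent whose reference frame matches the user's habits accumulates the highest credit and is selected by default.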
A prototype version of the adaptation method described above has been implemented and tested for the case of users' preferences for different spatial reference frames. Using simple heuristics, the system adapts to varying user preferences for such reference frames. For more detailed information, see [4].
Whereas several multimodal systems realized so far concentrate on methods for the generation and presentation of multimodal output, we focus on the integration of multimodal input. To communicate instructions to the graphical system, natural language input and simple hand gestures indicating a direction can be used.
The problem of integrating the information from these two modalities into one multimodal input is to be solved by a multimodal input agency. This agency consists of several mode-specific input agents, i.e., a speech listener agent and a gesture listener agent, a global input data structure, and a coordinator input agent. The listener agents are responsible for receiving and analyzing the sensor data and for sending them to the coordinator input agent, which stores all incoming data in the global input data structure.
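A structural sketch of this agency might look as follows; the class names and the shape of the global input data structure are my own assumptions for illustration.

```python
# Structural sketch of the multimodal input agency; class names and the
# shape of the global input data structure are assumptions for illustration.

from dataclasses import dataclass, field
from typing import List


@dataclass
class InputEvent:
    modality: str      # "speech" or "gesture"
    content: str       # e.g. recognized phrase or indicated direction
    time_cycle: int    # cycle in which the listener perceived the input


@dataclass
class GlobalInputStore:
    events: List[InputEvent] = field(default_factory=list)

    def add(self, event: InputEvent):
        self.events.append(event)


class ListenerAgent:
    """Mode-specific agent that receives and analyzes raw sensor data."""

    def __init__(self, modality, coordinator):
        self.modality = modality
        self.coordinator = coordinator

    def perceive(self, raw_data, time_cycle):
        # Analysis of the sensor data is omitted; the result is forwarded.
        self.coordinator.receive(InputEvent(self.modality, raw_data, time_cycle))


class CoordinatorInputAgent:
    """Stores all incoming data in the global input data structure."""

    def __init__(self, store: GlobalInputStore):
        self.store = store

    def receive(self, event: InputEvent):
        self.store.add(event)


# Wiring: one speech listener, one gesture listener, one coordinator.
store = GlobalInputStore()
coordinator = CoordinatorInputAgent(store)
speech_listener = ListenerAgent("speech", coordinator)
gesture_listener = ListenerAgent("gesture", coordinator)
```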
To integrate the gestural and verbal inputs, the coordinator input agent has to decide which gesture belongs to which verbal input. In our approach, we want to achieve this synchronization by processing in time cycles, an approach motivated by temporal control mechanisms in humans. A gesture and a verbal input are interpreted as belonging together if they are perceived by the listener agents within the same time cycle. In this way, intuitive interaction modalities can be used simultaneously.
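Continuing the structural sketch above, cycle-based integration could be expressed as follows; the grouping function is my own illustration of the idea, not the project's actual algorithm.

```python
# Sketch of cycle-based integration: inputs perceived in the same time
# cycle are treated as belonging to one multimodal instruction.

from collections import defaultdict


def integrate_by_cycle(store: GlobalInputStore):
    """Group speech and gesture events that share a time cycle."""
    cycles = defaultdict(list)
    for event in store.events:
        cycles[event.time_cycle].append(event)
    return dict(cycles)


# Usage: "put the chair there" plus a pointing gesture, both in cycle 3.
speech_listener.perceive("put the chair there", time_cycle=3)
gesture_listener.perceive("direction: right", time_cycle=3)
instructions = integrate_by_cycle(store)
# instructions[3] now holds both events and can be interpreted together.
```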
In our setting, I want to use a combination of time-oriented and event-oriented techniques within the listener agents to decide when the processing of instructions can begin. In addition, the agents should use knowledge obtained from previous interactions to determine missing information.
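As a rough illustration of how such a combination might look, the following sketch (reusing the InputEvent type from above) starts processing either when both modalities have arrived or when a timeout of a few cycles expires, and falls back on the last perceived direction if the gesture is missing. The timeout value, the completeness test, and the history mechanism are assumptions of mine, not decisions already made in the thesis work.

```python
# Sketch combining event-oriented and time-oriented triggers, plus a simple
# interaction history for filling in missing information. All names and
# parameter choices are illustrative assumptions.

class IntegrationController:
    def __init__(self, timeout_cycles=2):
        self.timeout_cycles = timeout_cycles
        self.history = {}      # e.g. last perceived direction
        self.pending = []      # events of the instruction being assembled
        self.started_at = None

    def on_event(self, event, current_cycle):
        """Event-oriented trigger: start processing once the input is complete."""
        if self.started_at is None:
            self.started_at = current_cycle
        self.pending.append(event)
        if self._complete():
            return self._finish()
        return None

    def on_tick(self, current_cycle):
        """Time-oriented trigger: start processing after the timeout expires."""
        if (self.started_at is not None
                and current_cycle - self.started_at >= self.timeout_cycles):
            return self._finish()
        return None

    def _complete(self):
        modalities = {e.modality for e in self.pending}
        return {"speech", "gesture"} <= modalities

    def _finish(self):
        instruction = list(self.pending)
        gestures = [e for e in instruction if e.modality == "gesture"]
        if gestures:
            # Remember the direction for later, underspecified instructions.
            self.history["direction"] = gestures[0].content
        elif "direction" in self.history:
            # Fill in the missing gesture from a previous interaction.
            instruction.append(InputEvent("gesture",
                                          self.history["direction"],
                                          self.started_at))
        self.pending, self.started_at = [], None
        return instruction
```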