Go to previous article.
Go to next article.
Go to Table of Contents for 1993 Virtual Reality Conference
William Putnam and R. Benjamin Knapp
Department of Electrical Engineering
San Jose State University
One Washington Square
San Jose CA 95192-0084
Biological signals offer a largely under-exploited resource for communication between humans and machines. Specifically, the electromyogram (EMG) will be examined regarding its application in such a scenario. The EMG signal results from electrical activity generated by human musculature. Two classes of control signals will be derived from the EMG signal. The first class of signal will be of a continuous nature, proportional to muscular exertion. This is suited to applications where control of a continuous or relative variable is desired. An example would be using this signal to control speaker volume in a multimedia environment. The second class of control signals is derived to distinguish between different gestures of the user. This is accomplished using pattern recognition techniques. Gesture recognition can be utilized in an interface scenario to enable the user to make discrete choices by executing different gestures. The specific gestures used in such a system will be dependent upon the individual, and can be designed to best utilize the individual abilities of the user. A working system will be presented in which both classes of EMG control signals are utilized in the context of electronic music and computer control.
Biological signals provide an intrinsically natural set of control signals. It is felt that future human interfaces will rely heavily upon biological signals, and the EMG will be an important aspect of any such system. The EMG, as well as other biological signals, provides a natural "language" for humans to employ in their communication with machines. This redefines the roles which have been developed unconsciously in contemporary interfaces. Presently, humans are met with the challenge of learning the language of the machines in order to communicate. Typical interfaces, be they graphical or mechanical, all fit into this category. In the new paradigm the burden will be shifted to the machines: humans communicate naturally, while the machines are met with the task of learning the meanings of, and interpreting, the biological signals which serve as the direct connection between the human and the machine. These concepts serve as a philosophical framework under which the work described herein was undertaken.
The EMG signal can be used to generate control signals both continuous and discrete in nature. Continuously variable control signals are derived from a proportional estimate of muscular exertion. Discrete control parameters are derived by recognizing the specific movement or gesture executed by the user. Gesture recognition can be utilized in an interface situation to enable the user to make discrete choices by execution of different gestures. Gesture recognition is accomplished by analysis of the EMG signal using pattern recognition techniques. The specific gestures used in such a system are dependent upon the individual, and can be designed to best utilize the individual abilities of the user.
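A proportional estimate of muscular exertion can be sketched as follows. The paper does not specify the estimator it uses, so this example assumes a common choice: full-wave rectification followed by a moving average. The function name `exertion_estimate` and the `window` parameter are hypothetical, and the input data is synthetic.

```python
from collections import deque

def exertion_estimate(samples, window=4):
    """Moving average of the rectified EMG signal.

    This is an assumed estimator for illustration, not the algorithm
    used in the paper.
    """
    recent = deque(maxlen=window)  # holds the last `window` rectified samples
    estimates = []
    for x in samples:
        recent.append(abs(x))      # full-wave rectification
        estimates.append(sum(recent) / len(recent))
    return estimates

# Synthetic signal: rest, followed by a burst of activity.
emg = [0.0, 0.0, 1.0, -1.0, 1.0, -1.0]
print(exertion_estimate(emg, window=2))
```

A longer window gives a smoother but slower-responding estimate, which matters when the estimate drives a control in real time.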
II. Gesture Recognition
Gesture recognition is dependent upon the fact that different gestures occur as a result of different modes of muscle contraction. The detected EMG signal is dependent upon which motor units in the vicinity of the electrodes have been recruited for a movement. The recruitment order is stable for a given movement but varies when the type of movement is changed. This results in an EMG signal whose characteristics change for different gestures. This is the basis for any effort at gesture recognition.
Signal processing techniques are used to extract "features" from the EMG time series. These features provide an abstraction of the underlying physical situation occurring during the execution of a gesture. This is known as pre-processing of the EMG signal. The results from this process are then analyzed using pattern recognition techniques.
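The pre-processing step can be illustrated with a short sketch. The paper does not list the features it extracts; the three used here (mean absolute value, root-mean-square amplitude, and zero-crossing count) are assumptions chosen because they are simple time-domain descriptors. The function name `extract_features` is hypothetical.

```python
import math

def extract_features(window):
    """Return (mean absolute value, RMS, zero-crossing count) for one
    non-empty window of EMG samples. The feature set is an assumption."""
    n = len(window)
    mav = sum(abs(x) for x in window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    # A zero crossing is a sign change between consecutive samples.
    zero_crossings = sum(1 for a, b in zip(window, window[1:]) if a * b < 0)
    return mav, rms, zero_crossings

# Synthetic window that alternates in sign on every sample.
print(extract_features([1.0, -1.0, 1.0, -1.0]))
```

Each window of the time series is thereby reduced to a short feature vector, which is what the pattern recognition stage receives.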
In the context of this study, both statistical and neural approaches were employed for pattern recognition. Statistical pattern recognition is concerned with detecting statistical similarities between the training sets and the new data. Neural pattern recognition utilizes parallel computing architectures motivated by the structure of biological neural networks to classify the data. Both approaches depend upon a training phase where sample data is used to train the classifier. For the case of gesture recognition, the user executes the specific gestures to be learned several times, while indicating to the computer which of the gestures is being executed. This is known as a supervised learning procedure. Once trained, the classifier can then generalize its training results to new data.
The techniques described above served as the basis for a two-class gesture recognition system designed to detect the direction of motion of a subject's arm. Surface electrodes were placed in the vicinity of both the triceps and biceps muscles of the upper arm of the user. Data from flexion and extension of the arm were collected and used to train both the neural networks and the statistical classifier structures mentioned previously. Results indicated that data from the triceps was not needed to correctly classify the flexion motion versus the extension motion. A classification accuracy of approximately 95% was achieved using data originating from the biceps only. Despite this, it is felt that a system utilizing both biceps and triceps data is warranted to accommodate users with disabilities who are unable to perform tasks as clearly defined as those studied at the present time.
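The supervised training procedure can be sketched with a deliberately simple statistical classifier. The paper does not name the statistical classifier or the network architecture it used, so a nearest-centroid rule is substituted here purely for illustration. The function names and the feature values are hypothetical, and the training data is synthetic.

```python
def train_centroids(examples):
    """Average the labeled feature vectors of each class.

    examples: list of (feature_vector, label) pairs supplied by the user
    during the supervised training phase. Returns {label: centroid}.
    """
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }

def classify(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Synthetic two-class training set (values are made up for the example).
training = [
    ([0.9, 0.2], "flexion"),
    ([1.1, 0.4], "flexion"),
    ([0.2, 0.8], "extension"),
    ([0.4, 1.0], "extension"),
]
centroids = train_centroids(training)
print(classify(centroids, [1.0, 0.3]))
```

New data is classified by proximity to what was seen in training, which is the sense in which the trained classifier generalizes.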
A real-time system for control of musical performance was implemented using a muscular exertion estimation algorithm developed for the Biomuse. The Biomuse is a system for acquisition and processing of biological signals. The Biomuse accommodates eight channels of electrical input. These can be either EMG, EEG (brain waves), or EOG (electric activity from the eyes) signals. In this example only EMG signals were used. The Biomuse communicates to the outside world using either a standard RS-232 serial link or a musical instrument digital interface (MIDI). The MIDI capability allows the Biomuse to communicate with electronic musical devices.
In a typical musical performance situation, commands are sent to the Biomuse from a PC host computer. In this system the Biomuse is used to estimate the muscular exertion of multiple channels of EMG data. The estimates of the muscular exertion are calculated and sent over the MIDI interface to an Apple Macintosh computer running the software package MAX, by Opcode Systems. MAX is a real-time graphical computing language, tailored to MIDI applications. MAX allows the performer to map the muscular exertion data sent by the Biomuse to any other musically relevant parameters, such as pitch, volume, etc. The results of this mapping are then sent to electronic musical instruments which also communicate via MIDI.
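In the system described, this mapping is performed in MAX. As a rough textual illustration only, the sketch below shows in Python how an exertion estimate might be scaled to a MIDI Control Change message. The function name, the `max_exertion` calibration value, and the choice of controller 7 (channel volume) are assumptions and are not taken from the paper.

```python
def exertion_to_midi_cc(exertion, max_exertion, controller=7, channel=0):
    """Scale an exertion estimate to a 3-byte MIDI Control Change message.

    max_exertion is a hypothetical per-user calibration constant.
    Controller 7 is MIDI channel volume.
    """
    if max_exertion <= 0:
        raise ValueError("max_exertion must be positive")
    ratio = min(max(exertion / max_exertion, 0.0), 1.0)  # clamp to [0, 1]
    value = round(ratio * 127)                           # MIDI data bytes are 7-bit
    status = 0xB0 | (channel & 0x0F)                     # Control Change status byte
    return bytes([status, controller & 0x7F, value])

# Full exertion on channel 0 maps to the maximum volume value.
print(exertion_to_midi_cc(1.0, 1.0).hex())
```

Clamping keeps the data byte within the 7-bit range even when the user exceeds the calibrated maximum.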
The system described above has been implemented and used in two performances composed by Atau Tanaka from the Center for Computer Research in Music and Acoustics (CCRMA) at Stanford. The first presentation was at a concert sponsored by CCRMA entitled "Digital Music Under the Stars." The second performance was at the International Computer Music Conference (ICMC) 1992, at San Jose State University.
Graphical user interfaces
A combination of the proportional estimation and gesture recognition techniques discussed previously was used to implement several graphical user interface (GUI) objects. Specifically, a continuous slider control and a continuous knob control were implemented using the MAX programming language.
As mentioned previously, a gesture recognition system allows the user to choose from among a discrete set of choices. The proportional estimate of muscular exertion allows the user to control continuously varying parameters. These two modalities of communication complement each other in a human-computer interface situation. An example where both are used is the case of the slider control mentioned previously. In this example, a specific gesture could be executed to select from one of several sliders or controls displayed on a computer screen. Once selected, the muscular exertion estimate could then be used to adjust the specific value represented by the position of the slider.
In the example presented herein, the direction of motion of the slider was determined by the direction of motion of the user's arm. Once the direction was determined, the slider was moved in the appropriate direction. The speed of motion was determined by the level of exertion of the user.
Training for this gesture recognition scheme was accomplished on a non-real-time basis. Specifically, several examples of each motion were recorded. Proportional exertion estimates were generated, and the data was then subjected to thresholding to determine when a gesture was being executed. The data representing valid gestures was used to train a neural network classifier structure. Once trained, the neural network structure operated in real time.
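The thresholding step can be sketched as follows. The paper gives no threshold value or segmentation rule, so this example assumes the simplest one: a gesture is considered active for as long as the exertion estimate stays above a fixed threshold. The function name and the numbers are hypothetical.

```python
def find_gesture_segments(exertion, threshold):
    """Return (start, end) index pairs, end exclusive, for each run of
    samples whose exertion estimate exceeds the threshold."""
    segments = []
    start = None
    for i, level in enumerate(exertion):
        if level > threshold and start is None:
            start = i                      # gesture onset
        elif level <= threshold and start is not None:
            segments.append((start, i))    # gesture offset
            start = None
    if start is not None:                  # gesture still active at end of data
        segments.append((start, len(exertion)))
    return segments

# Synthetic exertion estimates containing two bursts.
print(find_gesture_segments([0.1, 0.6, 0.8, 0.2, 0.1, 0.7], threshold=0.5))
```

Only the samples inside the returned segments would be passed on as training examples, so that rest periods are not presented to the classifier as gestures.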
A positive output from the neural network indicated an upward arm motion, and thus resulted in the slider moving in an upward direction.
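The slider behaviour can be sketched as a single update step. The paper states only that a positive network output moves the slider upward and that speed follows exertion; treating a negative output as downward motion, and the `gain`, `lo`, and `hi` parameters, are assumptions made for this illustration. The function name is hypothetical.

```python
def update_slider(position, classifier_output, exertion,
                  gain=1.0, lo=0.0, hi=100.0):
    """Advance the slider by one step.

    Direction comes from the sign of the classifier output (negative is
    assumed to mean downward); step size is proportional to exertion.
    """
    if classifier_output > 0:
        direction = 1
    elif classifier_output < 0:
        direction = -1
    else:
        direction = 0
    new_position = position + direction * gain * exertion
    return min(max(new_position, lo), hi)  # keep the slider within its range

# Upward gesture with moderate exertion moves the slider up by two units.
print(update_slider(50.0, classifier_output=0.8, exertion=2.0))
```

Calling this once per incoming exertion estimate gives a slider whose speed tracks how hard the user is contracting the muscle.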
The same principles used for the slider control can be applied to other graphical computing objects. Examples of these include list boxes and pull-down menus, which are common to typical windowing computing environments. EMG control of these elements could be integrated into standard computing environments, thus providing a method of interfacing to existing commercial software.
Two classes of control parameters were discussed. Proportional estimation of muscular exertion provides an interface tool which can be used to control objects of a continuous nature. Recognition of a user's gestures allows the user to make discrete choices. An application was presented which integrated these two parameters into a functional human interface. Application of the techniques described here is not limited to the case of the slider element. Other GUI elements common to windowing computing environments, such as pull-down menus and list boxes, have been implemented in a similar manner. Many other standard elements typical of the graphical computing environments common today could benefit from direct control using the EMG. Inclusion of a robust learning structure, such as the neural networks employed here, makes it possible for the system to adapt to the varying abilities of its users, thus making this technology particularly suited to disabled users.