1993 VR Conference Proceedings



A Proposal For A Three-Dimensional Visual Programming Language Based On Sign Language Constructs

John Lenarcic
Department of Software Development
Monash University, Victoria, Australia

Abstract 

Current implementations of visual programming languages predominantly focus on two-dimensional (2-D) representations of command sets and data structures. Use of three-dimensional (3-D) constructs in visual programming is still a largely unexplored issue. The prospects of employing visual programming constructs adapted from the sign languages of the deaf are discussed. The benefits of this proposed strategy in terms of educating potential computer programmers who are deaf are also noted.


Introduction

Ever since the genesis of conventional high-level computer languages with imperative characteristics, the methodology of programming has generally required the specification of text-based sequential instructions that a computer can then understand and execute in order to solve a particular problem. Green (1982) defines a program, in broad terms, as a succinct description of a temporal process - not necessarily computer-based - in which form transcends content. The key concern is clarification of the contingencies within a process and their interrelationships.

For many, the development of programming skills is both a time-consuming and frustrating pursuit. Indeed, the goal of attaining fluency in a computer language is arguably as demanding as that of acquiring competency in a natural language foreign to one's own native tongue! The new paradigm of visual programming, combined with program visualization, has the potential to act as the nucleus of computer systems that encourage novice programmers to learn via exploration and discovery.

Visual programming encompasses a variety of software development techniques that employ visual constructs (such as icons or diagrams) in the computer-based creative procedure itself. Program visualization, on the other hand, is the application of visual representations (such as static graphics, still images or animation sequences) to illustrate the form of programs or data. Myers (1989) carefully distinguishes between these two seemingly related computing ideas.

The systematic use of visual expressions to convey meaning constitutes a visual language that would have many benefits in a programming environment. Shu (1989) notes that software development using visual representations as building blocks is a convenient concept because pictorial forms can succinctly transmit meaning, are useful as a mnemonic aid and have the potential to transcend international language barriers.

Visual programming systems offer software engineering environments that enable both visual programming and program visualization to be undertaken (Chang, 1990). Their popularity in recent years (at least in research circles) can be attributed to the falling cost of graphics-related hardware and software. This has made it feasible to use visual expressions as a means of communicating with computers and encouraged the development of graphics-based applications to teach programming (usually in the guise of multimedia).


Teaching Computer Programming to the Deaf

The field of computing has the potential to offer deaf individuals intellectually stimulating career opportunities that are more accessible than other possible options (McLeod, 1981; Ross, 1982). This has become especially true with the advent of human-computer interfaces that provide a dynamic, visual environment by exploiting an amalgam of the desktop metaphor and windowing. The projected growth of multimedia personal computers in the workplace would also seem to be a boon for the hearing-impaired, particularly if the visual rather than the audio capabilities of such systems were enhanced.

The concept of end-user computing has resulted in non-programmers being empowered with menu- or icon-driven software tools that can encourage them to be application developers. For more complex software development tasks, requiring greater creativity, reliance on some form of programming language is still essential. However, the problem of how to effectively teach deaf people the principles of software engineering, as noted in Ross (1982), still has not been solved.

The deaf contingent in most university computing classes is a very small minority - or nonexistent. The dynamic explanations of programming concepts provided by tutors or lecturers in such classes are lost, as all the information is usually filtered to these deaf individuals via static notes written by hearing lecture transcribers. The only alternatives would seem to be textbooks (static information once again) or one-on-one individual instruction (by the rare combination of a computing teacher with sign language fluency).

Since many of the deaf are adept at communicating via a natural language that is truly visual in nature - namely one of the many sign language dialects - a more intuitive visual programming system specifically geared for them could provide a solution as a teaching aid. The majority of such systems are based on 2-D visual languages, whereas sign languages are gesture-based and 3-D in form. It would seem, then, that a visual programming system with a gesture-based visual language would be ideal.


Gestures in the Human-Computer Interface

A gesture can be defined as any motion of the human body that possesses underlying semantic content. Sturman (1991) offers a narrower definition by stating that gestures consist of motion of the fingers or hands, whereas specific static instances of finger or hand positions are termed postures. The significance of human gestures in the overall spectrum of communication has often been understated (Morris et al., 1979). The many sign language dialects of the deaf are examples of true natural language systems based entirely on gestures (Klima and Bellugi, 1979). In the case of hearing individuals, social intercourse via spoken languages is heavily augmented by body actions, postures, movements and other forms of physical expression.
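
To make Sturman's posture/gesture distinction concrete, the following Python sketch (purely illustrative; the field names and value ranges are assumptions, not drawn from any cited system) represents a posture as a single static hand configuration and a gesture as a time-ordered sequence of such configurations.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Posture:
    """A static hand configuration at a single instant: one flex value per
    finger joint, plus the hand's position and orientation in space."""
    joint_flex: Tuple[float, ...]             # assumed scale: 0.0 (straight) to 1.0 (fully bent)
    position: Tuple[float, float, float]      # hand position in 3-D space
    orientation: Tuple[float, float, float]   # roll, pitch, yaw

@dataclass
class Gesture:
    """Following Sturman's distinction, a gesture is motion: a time-stamped
    sequence of postures rather than a single snapshot."""
    samples: List[Tuple[float, Posture]]      # (time in seconds, posture) pairs

    def duration(self) -> float:
        """Elapsed time between the first and last sampled posture."""
        return self.samples[-1][0] - self.samples[0][0] if self.samples else 0.0
```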

Gestures can function as non-verbal signals acting in conjunction with spoken language to establish a parallel communication channel for the exchange of feedback and synchronization cues without interrupting the flow of a conversation. In fact, McNeill (1992) goes so far as to argue that gestures are an integral part of natural language for hearing individuals as well, on a par with words, phrases and sentences. According to his hypothesis, gestures and natural language are one system, with the function of the latter being a display construct for inner thoughts dealing with world-event knowledge. An intriguing aspect of his research involved a study of gesture usage in the thought processes of mathematicians.

Communication in the human-computer interface is generally dependent on user knowledge of highly restricted subsets of written natural language, such as programming languages. Kurtenbach and Hulteen (1989) note that a computer's inability to interpret gestures restricts the expressive range of interactions possible. Gesture input has the potential to improve the efficiency of certain interface tasks by empowering the user with a greater degree of functionality in computer-mediated activities (e.g. Bolt, 1980; Papper and Gigante, 1993). However, as the latter authors note, very few existing computer systems use gesture interaction as the dominant form of input. Where gestures are employed, they generally form a simple and incomplete subset of a system's main command set, or are used as an adjunct to other forms of input. For example, Hauptmann and McAvinney (1993) describe efforts to develop a hybrid computer system combining the utility of gestures and speech to manipulate graphic objects.

Gestures have an important role to play in virtual environments. "Virtual environments" is the less sensational name given to a family of technologies more popularly called "Virtual Reality" or VR (Rheingold, 1991). Such hardware/software environments normally provide users with real-time 3-D graphics, combined with a visual display system that creates the illusion of total immersion in a synthetic world subject to direct manipulation. To achieve the sensation of direct engagement with these virtual worlds, users are appropriately outfitted with stereoscopic head-mounted video displays and electronic glove-based peripheral devices. These gloves enable rudimentary interaction with virtual worlds via gesture input. Advances in 3-D visual display units could eventually lead to VR systems that do not rely on head-mounted displays.

Fisher (1991) describes a virtual environment in which gesture input (using a glove device) was implemented in the form of commands based on a subset of American Sign Language, designed for users who are presumably hearing individuals. Research has also been undertaken to develop a computer-based framework for hand-sensing technology that can eventually translate sign languages of the deaf to audible speech (Kramer and Leifer, 1987; Fels, 1990; Vamplew and Adams, 1992).

Another category of VR technology that may be of use for gesture recognition systems is "artificial reality" (Krueger, 1991). Artificial reality can be described as an interactive computing environment that permits unencumbered, full-body, multi-sensory participation in simulated events. In other words, artificial reality systems produce computer-generated graphic "worlds" into which people can appear to enter from different places to interact with each other or with static/dynamic graphic objects. Virtual worlds can be viewed on video display units. Such environments do not rely on glove-based input devices but detect gestures via visual sensors, meaning that problems invariably arise with regard to multi-finger tracking/identification, or the occlusion of fingers.

Sturman (1991) defines the study of the full and direct use of the human hand in human-computer interaction as "whole-hand input". As a field of study in its own right, whole-hand input currently has no theoretical foundation and a dearth of practical research to draw inspiration from. Designing future computer-based controls that exploit whole-hand input requires further knowledge about the range of tasks amenable to the actions of the hand in virtual environments; a 3-D visual programming system that exploits whole-hand input would be one such possible application.


The Case for a Three-Dimensional Visual Programming Language

The rising popularity of WIMP (Windows, Icons, Menus, Pointing) graphical interfaces for personal computers is one clear indication of the impact that visual metaphors have in human-computer communication. Even with such public acceptance, there are still some sections of the academic community that are highly skeptical of the current advances in visual programming and program visualization (Brooks, 1987; Dijkstra, 1989). Their disparaging remarks include claims that conventional software is very difficult to visualize and that existing visual representations used in experimental systems are often highly idiosyncratic, making them difficult to understand and manipulate for the casual user. Harel (1992), however, strongly encourages researchers in this field to cast aside such emotive criticisms and press on with projects that transform computer system modeling into a predominantly visual process, emphasizing realistic, motion-based, 3-D imagery in particular.

Without specifically referring to the notion of VR, Glinert (1987) raises the issue of applying 3-D metaphors to improve the design of visual programming languages that are currently restricted to 2-D environments. Examples are offered from mathematics and physics where seemingly intractable problems have been solved by reformulating them in 3-D scenarios. Could this tactic then be a partial panacea for programming problems? An extra dimension can allow users to view more information simultaneously in the programming process (i.e. parallel perception). Representations of information hierarchies and interconnections may also be improved in 3-D environments. Also, the problem of screen clutter in 2-D visual programming systems could be reduced by a shift to 3-D environments.
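
As one hypothetical illustration of how the extra dimension might be exploited (a minimal sketch, not a description of any existing system), the elements of a program could be laid out so that each level of an information hierarchy occupies its own plane along the depth axis, leaving the screen plane free for the elements of a single level.

```python
from typing import Dict, List, Tuple

def layout_hierarchy(children: Dict[str, List[str]], root: str,
                     level_spacing: float = 2.0) -> Dict[str, Tuple[float, float, float]]:
    """Assign each node a 3-D position: siblings are spread across the x axis,
    successive hierarchy levels are pushed back along z (depth)."""
    positions: Dict[str, Tuple[float, float, float]] = {}
    frontier, depth = [root], 0
    while frontier:
        for i, node in enumerate(frontier):
            positions[node] = (float(i), 0.0, depth * level_spacing)
        frontier = [c for n in frontier for c in children.get(n, [])]
        depth += 1
    return positions

# A small (invented) module hierarchy, viewed level by level in depth.
print(layout_hierarchy({"main": ["input", "solve"], "solve": ["step"]}, "main"))
```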

If one can view software design as being analogous to architectural design, with its obvious 3-D foundations (Borenstein, 1991), then a 3-D visual programming language is a natural choice as a design tool. The rationale is that most real-world functional tasks require manipulation or modification of multi-dimensional objects within a natural environment. A rare example of an actual 3-D visual programming language is featured in Najork and Kaplan (1992), who describe an experimental system called CUBE that is designed for eventual use in a VR environment. Based on the logic programming paradigm, CUBE programs are data flow oriented and, as the name suggests, they are represented by linking graphical blocks together.
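
The sketch below conveys the general flavour of such a block-and-link representation; it is a hypothetical construction and does not reproduce CUBE's actual data model. Each block exposes named input and output ports, links connect ports of different blocks, and values flow along the links when the program is evaluated.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

@dataclass
class Block:
    """A graphical block with named input and output ports and a
    function computing output values from input values."""
    name: str
    inputs: List[str]
    outputs: List[str]
    compute: Callable[[Dict[str, float]], Dict[str, float]]

@dataclass
class Program:
    """Blocks plus links; a link carries the value of one block's
    output port to another block's input port."""
    blocks: List[Block] = field(default_factory=list)
    links: List[Tuple[Tuple[str, str], Tuple[str, str]]] = field(default_factory=list)

    def run(self, sources: Dict[Tuple[str, str], float]) -> Dict[Tuple[str, str], float]:
        """Naive single pass: assumes self.blocks is already listed in
        data-flow (topological) order."""
        values = dict(sources)               # keyed by (block name, port name)
        for block in self.blocks:
            ins = {p: values[(block.name, p)] for p in block.inputs}
            for port, val in block.compute(ins).items():
                values[(block.name, port)] = val
                for src, dst in self.links:  # propagate along matching links
                    if src == (block.name, port):
                        values[dst] = val
        return values

# Example: double(x) feeding add_one(y), evaluated with x = 3.
double = Block("double", ["x"], ["out"], lambda d: {"out": 2 * d["x"]})
add_one = Block("add_one", ["y"], ["out"], lambda d: {"out": d["y"] + 1})
prog = Program([double, add_one], [(("double", "out"), ("add_one", "y"))])
print(prog.run({("double", "x"): 3.0})[("add_one", "out")])   # -> 7.0
```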

Standard text-based programming languages can be considered to be highly constrained forms of written language (invariably based on spoken English) dedicated to the representation of computer algorithms. Consequently, it would seem that visual programming languages should be adapted from existing visual languages used for natural discourse. The many dialects of sign languages for the deaf are examples of distinct natural languages that have developed solely in a gestural-visual modality and are 3-D in form (e.g. Australian Sign Language or Auslan). Over the centuries, the English language has evolved by borrowing words - and concepts in general - from other languages in order to enrich its expressive power. (Where would science be without terminologies adapted from Ancient Greek and Latin?) In an analogous vein, sign languages - such as Auslan - have the potential to improve the graphical vocabularies of visual programming languages. Ideally, it would probably be useful to exploit any remaining vestiges of sign language iconicity in the design of a 3-D visual programming language (Mandel, 1977; Deuchar, 1990).

The development of a 3-D visual programming language based on sign language constructs would initially involve the creation of a written form of sign language, possibly one that has a greater isomorphism with English orthography - a challenging goal in itself (Stokoe, 1987). However, being 3-D in form and dynamic in nature, sign language in its written guise would ideally be represented by animated, interactive 3-D computer graphics that are ideographic in nature.

Ideographic languages are supposedly a holistic written form by which ideas themselves are conveyed. In linguistic terms, pure ideographic languages have no special relation to any particular spoken language (as opposed to written languages that are phonetically based). Sperling (1978) speculates that a written sign language combining the best features of phonetic and ideographic languages is possible (albeit fiendishly difficult to craft) and that such a code would be both parsable and ideographic. These characteristics would make it a highly suitable foundation for a visual programming language.
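
Purely as a thought experiment (the structure and the sign glosses below are invented for illustration, not taken from Auslan or from the cited work), a written sign might be recorded as a small structured token of handshape, location and movement, which a conventional parser could then consume exactly as it would a textual keyword.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SignToken:
    """A hypothetical 'written' form of a sign: enough structure to be
    ideographic (naming a concept directly) yet machine-parsable."""
    handshape: str   # e.g. "flat", "fist", "point"
    location: str    # e.g. "neutral-space", "chest"
    movement: str    # e.g. "circle", "tap", "arc"

# A hypothetical lexicon mapping sign tokens onto programming keywords.
LEXICON = {
    SignToken("point", "neutral-space", "arc"): "IF",
    SignToken("fist", "neutral-space", "circle"): "REPEAT",
}

def translate(tokens):
    """Map a sequence of sign tokens to conventional keywords, leaving
    unknown signs marked for the programmer to define."""
    return [LEXICON.get(t, "<undefined-sign>") for t in tokens]
```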


Conclusion

The successful development of any 3-D visual programming language based on sign language constructs will rest upon advances in gesture recognition systems, which currently focus only on identifying hand actions. Interpersonal communication in sign languages takes place using a cluster of co-ordinated symbols, such as hand gestures and postures, body postures and movements, and facial expressions. It might be advantageous for shorthand forms of existing sign languages to be developed for use with computer-based interpretation systems. Research investigating the minimum features of the visual signal that are necessary for adequate sign language perception is featured in Fischer and Tartter (1983).

At the simplest level, symbolic interpretation of hand actions using VR technology (such as glove sensors) involves analysis of static postures. More complicated interpretation schemes that differentiate hand movements require interrelated analyses of temporal and spatial data. Full comprehension of a sign language via a computer-based natural language understanding system would be as intricate a process as that faced in similar systems based on voice recognition - perhaps even more so. Consequently, at this stage, unrestricted sign language as the foundation of a 3-D visual programming language would be as ambitious a prospect as using written English as a text-based programming language.
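
At that simplest level, static posture recognition from glove sensors can be approximated by template matching; the sketch below (illustrative only, with invented calibration values) classifies a vector of finger-flex readings against stored posture templates by nearest Euclidean distance.

```python
import math
from typing import Dict, Sequence

# Hypothetical posture templates: normalised flex values (0 = straight,
# 1 = fully bent) for five fingers, thumb first.
TEMPLATES: Dict[str, Sequence[float]] = {
    "flat-hand": (0.0, 0.0, 0.0, 0.0, 0.0),
    "fist":      (0.9, 1.0, 1.0, 1.0, 1.0),
    "point":     (0.8, 0.0, 1.0, 1.0, 1.0),
}

def classify_posture(reading: Sequence[float]) -> str:
    """Return the name of the stored template closest to the sensor reading."""
    def distance(a: Sequence[float], b: Sequence[float]) -> float:
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(TEMPLATES, key=lambda name: distance(TEMPLATES[name], reading))

# A noisy reading is assigned to the nearest template.
print(classify_posture((0.7, 0.1, 0.9, 1.0, 0.9)))  # -> "point"
```

Differentiating full hand movements would then require extending this kind of matching with the temporal and spatial analyses noted above.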

A visual programming research initiative focusing on a subset of sign language constructs as 3-D commands would seem to be the best approach. Incidentally, one of the obvious benefits of using sign language constructs in the human-computer interface would be the opportunity to exploit two-handed input, something that is currently alien to conventional computer users but commonplace to video game users (Buxton, 1990).

On a financial note, it is disappointing that the astronomically high cost of VR component technology in general is preventing many talented researchers from tackling the problems discussed in this paper. Many funding bodies are inherently conservative and still view the field of VR as too exotic a risk, especially in these recession-afflicted times. The fruits of such a research effort could very well have a great impact on all computer programmers - both the hearing and the deaf - if given the chance.


References



Reprinted with author(s) permission. Author(s) retain copyright.