2000 Conference Proceedings


A Universal Logging Format for Augmentative Communication

Gregory W. Lesher, Ph.D.
Enkidu Research, Inc.
24 Howard Avenue
Lockport, NY 14094
Phone: 716-433-0608
Fax: 716-433-6164
Email: lesher@enkidu.net

Bryan J. Moulton
Enkidu Research, Inc.
Email: moulton@enkidu.net

Gerard Rinkus, Ph.D.
Enkidu Research, Inc.
Email: rinkus@enkidu.net

D. Jeffery Higginbotham, Ph.D.
Department of Communicative Disorders and Sciences
122 Cary Hall, 3435 Main Street, University at Buffalo
Buffalo, NY 14214-3005
Phone: 716-829-2797 ext. 635
Email: cdsjeff@acsu.buffalo.edu

Over the past few years, technical and technological advances in augmentative communication have outstripped our ability to assess the impact of these advances on the actual act of communication. This is due in part to the lack of a consistent and reliable method to measure long-term communicative efficacy. It has been extremely difficult for researchers, clinicians, and manufacturers to perform the kind of quantitative empirical studies that are an essential counterpart to theoretical advances and qualitative evaluations. Without a disciplined quantitative analysis, it is hard to identify and correct problems in a communication interface. Although customized data logging and analysis tools have been developed for specific investigations (Horstmann Koester & Levine, 1994, 1996; Lesher, Moulton, & Higginbotham, 1998), this inefficient case-by-case approach is impractical for most of the AAC community.

The future success of technical advances in AAC will depend increasingly on complex analyses of user-machine interactions. A comprehensive and universal format for the automatic logging of communication would make such analyses possible. Improvements in human-machine interactions will require detailed and reliable data collection procedures that can be accomplished on all devices (Miller, Demasco, & Elkins, 1990). We are therefore proposing a new standard for data logging in augmentative communication. Such a format will allow researchers and clinicians to maximize communication rate through an analysis of error types and machine latency patterns. Similarly, it will facilitate comparisons between different AAC approaches by quantifying cross-interface variations in production efficiency.

The recording of AAC data has received considerable attention in the last year. Hill and Romich (1999a, 1999b; Romich & Hill, 1999) have been active proponents of a new standard format for the automatic recording of communication data. They have developed a Language Activity Monitor (LAM) that collects character output data from the serial port of dedicated communication devices. This data can later be uploaded to a computer for analysis. The LAM will allow clinicians and researchers to collect an unprecedented amount of data from augmented communicators using dedicated hardware devices such as Prentke Romich's Liberator or a Dynavox system. However, Hill and Romich (1999b) have also proposed making the LAM's data format a logfile standard for all communication devices. We believe that this format is neither flexible nor powerful enough to serve as a general-purpose standard. Computer-based AAC devices offer a much broader range of logging possibilities than the dedicated systems for which the LAM was designed.

Since the LAM records data from the serial port, all it can store is a time stamp (generated by the LAM itself) and the characters output by the communication device - this is the only data that AAC devices generally make available through the serial port. For many augmentative paradigms, however, the character output is the result of a series of intermediate steps. For example, in a Minspeak environment a sequence of symbols must be selected before there is any message production. Similarly, in an interface utilizing a page-based hierarchy there may be several page navigation commands prior to message production.

In addition to skipping over intermediate message production steps, the LAM format does not provide explicit information about the source of the text output. A word appearing in the LAM file might have been produced by a Minspeak sequence, a single-key word selection, a dynamic word list selection, or an abbreviation expansion. Higginbotham, Lesher, and Moulton (1999) have identified several types of AAC investigations that would be impossible without more detailed logging information than the LAM format can provide.

Under the auspices of the Rehabilitation Engineering Research Center on Communication Enhancement (the AAC-RERC, sponsored by the National Institute on Disability and Rehabilitation Research), we are defining a general-purpose logging standard for augmentative communication. Since the LAM represents the only automated recording method for dedicated communication devices, it is imperative that its storage format be incorporated as a subset of the proposed logfile standard. Additionally, we are constructing a software tool for the analysis of logfiles complying with the proposed standard. When completed, this tool will be freely distributed via the Internet.


The definition of a universal format for AAC logging is complicated by the fact that the resulting logfiles will not have a single, specific use. Academic researchers, clinicians, educators, manufacturers, and end-users will utilize logfiles for different purposes and will therefore have widely varying data logging requirements. One possible solution to this quandary is to record every parameter that could conceivably be interesting. Besides being extremely inefficient, such an effort is certain to fail - there are simply too many variables of interest in augmentative communication to comprehensively catalog them all.

To meet the varied demands of the AAC community, we propose a flexible logfile format that is powerful enough to support the most common data collection requirements while also providing an extendable framework for customized logging needs. The logfile is structured such that only those parameters appropriate to a particular situation (communication paradigm, AAC device, specific user, etc.) need be recorded. A file header specifies exactly what information will appear in the individual logfile entries, as well as how this information will be formatted.

The proposed logfile consists of three basic parts: a header, a body, and an optional analysis section. Each is described below.

In addition, comments (preceded by a #) and blank lines may be positioned anywhere within the logfile. There are no size constraints on any part of the logfile. The file is currently limited to ASCII characters, although if there is significant interest the format may be extended to support Unicode (two-byte) characters.

The header contains a formalized description of each field that appears in the individual logfile entries. An entry may consist of an arbitrary number (and ordering) of fields. The header might specify, for example, that each entry consists of a timestamp, followed by an indication of what kind of action triggered the selection, followed by the text output associated with the selection. In the body of the logfile, these parameters would appear separated by spaces or tabs within each entry. Besides specifying the order and type of the entry fields, additional field-specific details can be defined in the header. For example, the resolution of the timestamp can be established.

Optionally, the header may be completely omitted. In this case, individual entries must consist of a timestamp followed by a text output (delimited by quotes). Since this is exactly the structure of a LAM record, this format is consistent with our proposed format. We are also investigating the possibility of allowing free-form entries from which the structure of the entries can be inferred without requiring explicit header information. If a header is present in the logfile, its end is indicated by a marker sequence ($$$).
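As an illustration of the headerless case, the following Python sketch reads one entry consisting of a timestamp followed by a quoted text output. It is only a sketch under stated assumptions: the timestamp layout (HH:MM:SS with an optional fractional part) is taken from the example logfile in this paper rather than from the complete specification, and the names `ENTRY_RE` and `parse_minimal_entry` are hypothetical.

```python
import re

# Assumed entry shape: HH:MM:SS (optional fractional seconds), whitespace,
# then the text output delimited by double quotes.
ENTRY_RE = re.compile(r'^\s*(\d{1,2}):(\d{2}):(\d{2}(?:\.\d+)?)\s+"(.*)"\s*$')


def parse_minimal_entry(line):
    """Return (seconds_since_midnight, text), or None for a comment or blank line."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None  # comments and blank lines may appear anywhere in the logfile
    match = ENTRY_RE.match(line)
    if match is None:
        raise ValueError(f"not a timestamp/text entry: {line!r}")
    hours, minutes, seconds, text = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds), text
```

Trailing spaces inside the quotes are preserved, since they are part of the text output.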

The fields that compose each logfile entry quantify unique aspects of the selection process that produced that entry. For many studies, the text output may be the only aspect of interest. For other purposes, however, information such as the selection method or the source of the output may be important. We are in the process of identifying a set of fundamental parameters that can be used to quantify the communication process. A few instructive examples are provided below.

The number of entries in the body of the logfile is limited only by the memory available to store the file. The end of the body is indicated by another marker sequence ($$$).

Following the body of the logfile, a system may optionally record some statistics on the logging session. There is no specific format for the data in this analysis section, nor is there any limitation on the type of information that can be provided. The nature of the measures recorded depends wholly upon the device manufacturer. For example, our IMPACT software can be configured to record the total number of characters and words logged during a session, as well as estimates of communication rate and keystroke efficiency.

A very brief logfile example is provided below. This example was recorded using a QWERTY keyboard supplemented by a five-word prediction list accessed through the function keys (F1 through F5). Besides providing a timestamp and output information, this logfile records the source action and type of each selection, as well as information about the current context (useful for analyzing the effectiveness of word prediction).

    TIME             Absolute time
    OUTPUT           Text output
    TYPE             Type of selected element
    ACTION           Selection action
    CONTEXT          Local context
    Time: 12:10:39 09/29/1999
    $$$ End Header (and begin Body)

    12:10:41.0  ""      Shift      key_shift  ""
    12:10:42.7  "The "  List       key_f1     ""
    12:10:43.8  "b"     Character  key_b      "The "
    12:10:45.4  "est "  List       key_f3     "The b"
    12:10:46.5  "t"     Character  key_t      "The best "
    12:10:47.8  "h"     Character  key_h      "The best t"
    12:10:49.2  "i"     Character  key_i      "The best th"
    12:10:50.9  "ng "   List       key_f2     "The best thi"
    $$$ End Body (and begin Analysis)

    Time: 12:10:53 09/29/1999
    Output: "The best thing "
    Characters: 15
    Words: 3
    Characters/word: 5.00 (4.00)
    Keystrokes/character: 0.47

In developing the logfile format, we are actively seeking feedback from persons in the AAC community. A complete specification of the proposed format can be found at http://www.enkidu.net/logfile.html, along with instructions on how to suggest enhancements. The feedback period will continue until September 30, 2000, at which time the format will be fixed.


Many of the measures commonly used in augmentative communication cannot be easily derived using generic statistical analysis programs. For example, keystroke savings cannot be computed without additional information about the baseline keystroke count. While it would be possible to write programs to compute most measures using commercially available statistical packages, a dedicated program for computing AAC-specific statistics would facilitate logfile analysis. The existence of such a program would also provide additional incentive for manufacturers to adopt the proposed logfile format.

We are developing a statistical analysis program that will provide a fast and convenient means to analyze logfile data. The Augmentative Communication Quantitative Analysis (ACQUA) package is being written for Microsoft Windows. Once completed, this program will be made freely available to the AAC community. Besides providing AAC-related statistics, ACQUA will allow operators to filter and reformat logfiles for export to popular commercial analysis packages such as SPSS. The program will also serve as a logfile viewing tool, allowing the operator to browse through recorded data.

In defining a set of statistics and performance measures to be incorporated in ACQUA, we are identifying those in common AAC usage. These include measures of language usage (for example, average sentence length, average word length, raw number of sentences, and vocabulary distribution), derived measures of communication efficiency (for example, keystrokes per character and communication rate), and device-specific usage measures (for example, frequency of selection for specific keys). A comprehensive list of ACQUA statistics can be found at http://www.enkidu.net/acqua.html. As with the logfile format, we are actively seeking feedback from members of the AAC community regarding which statistical measurements should be included in ACQUA.
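To make a few of these measures concrete, the Python sketch below computes character and word counts, characters per word, keystrokes per character, and keystroke savings from the text output of each selection. It is not ACQUA's or IMPACT's implementation, and the function name `session_stats` is hypothetical. Two assumptions are made: a selection counts as a keystroke only if it produces output, and the default keystroke baseline is one keystroke per character (letter-by-letter entry). Keystroke savings is taken here as one minus the ratio of actual to baseline keystrokes.

```python
def session_stats(outputs, baseline_keystrokes=None):
    """Compute simple session measures from per-selection text outputs."""
    text = "".join(outputs)
    words = text.split()
    characters = len(text)
    # Assumption: selections with empty output (e.g. Shift) are not counted.
    keystrokes = sum(1 for out in outputs if out)
    if baseline_keystrokes is None:
        # Assumption: baseline is one keystroke per character produced.
        baseline_keystrokes = characters
    return {
        "characters": characters,
        "words": len(words),
        "characters_per_word": characters / len(words) if words else 0.0,
        # Same ratio with whitespace excluded from the character count.
        "letters_per_word": (sum(len(w) for w in words) / len(words)
                             if words else 0.0),
        "keystrokes_per_character": (keystrokes / characters
                                     if characters else 0.0),
        "keystroke_savings": (1.0 - keystrokes / baseline_keystrokes
                              if baseline_keystrokes else 0.0),
    }


# Outputs of the eight selections in the example logfile.
example = ["", "The ", "b", "est ", "t", "h", "i", "ng "]
stats = session_stats(example)
```

Under these assumptions the example logfile gives 15 characters, 3 words, 5.00 characters per word (4.00 without spaces), and 0.47 keystrokes per character, which agrees with the figures in its analysis section. Whether those figures were originally computed this way is not stated.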

Since a logfile may consist of many days' worth of communication data, ACQUA can be configured to analyze only specific sections of the data. The span of this data window can be defined in terms of the following parameters:

ACQUA can also be utilized to analyze a series of consecutive (or overlapping) data windows, providing a sliding estimate of the specified measures. This approach could be used, for example, to plot and analyze how communication rate changes with time. Such windowing can provide more specific information about the effectiveness of augmentative communication than can global (non-windowed) analysis. For example, a windowed measure of communication rate might reveal specific contexts in which an interface is particularly effective (or ineffective).
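The idea of a sliding estimate can be sketched in Python as follows. This is an illustration only, not ACQUA's algorithm: the windows are assumed to be defined by time, the rate is measured in characters per minute, and the name `sliding_char_rate` and its parameters are hypothetical.

```python
def sliding_char_rate(entries, window_s, step_s):
    """Return (window_start, characters_per_minute) for overlapping windows.

    entries: list of (time_in_seconds, output_text) pairs sorted by time.
    window_s: window length in seconds; step_s: offset between window starts.
    """
    if window_s <= 0 or step_s <= 0:
        raise ValueError("window_s and step_s must be positive")
    if not entries:
        return []
    first, last = entries[0][0], entries[-1][0]
    results = []
    k = 0
    while first + k * step_s <= last:
        start = first + k * step_s
        # Count characters produced in the half-open interval [start, start + window_s).
        chars = sum(len(text) for t, text in entries
                    if start <= t < start + window_s)
        results.append((start, chars * 60.0 / window_s))
        k += 1
    return results
```

When `step_s` is smaller than `window_s` the windows overlap, and when the two are equal the windows are consecutive. Plotting the second element of each pair against the first would show how the rate changes over a session.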


We have defined the framework of a universal format for the continuous logging of augmentative communication. This format is flexible and powerful enough to satisfy the needs of researchers, clinicians, care-givers, and end-users. At the same time, it provides compatibility with simpler formats such as that used by the LAM. When combined with the ACQUA package for logfile analysis, this standard promises to open new possibilities for the quantitative assessment of AAC technologies. These empirical studies will in turn serve to guide future advancements and to enhance the communication experience for users of current technologies.



The authors wish to acknowledge support from the National Institute on Disability and Rehabilitation Research of the U.S. Department of Education under grant #H133E980026. The opinions expressed are those of the authors and do not necessarily reflect those of the supporting agency.


Reprinted with author(s) permission. Author(s) retain copyright.