Restructuring Textual Information for Online Retrieval

IBM Research Report RC 11278
July 1985

Larry Koved

Master's Thesis
Department of Computer Science
University of Maryland
College Park, Maryland

Abstract

Two experiments were conducted to evaluate two styles of online documents. The first experiment compared paper manuals to online manuals using two different database structuring techniques - a sequential (linear) structure and a tree structure. People using the paper manuals were faster at solving problems than the people using the computer manuals. No differences were found between the linear and tree structures, or in accuracy of problem solutions. In a subjective evaluation of user preferences, the computer manuals were rated as better and more organized than the paper manuals.

The second experiment compared two methods of retrieving online information that allowed the reader to specify the attributes needed to guide the information retrieval process. The first manual recorded the attributes entered by the reader via menus, and material in the manuals not relevant to the current search was pruned from the search space. The second manual did not record the menu selections, and the readers had to enter the attributes several times in order to complete the task. The manual that recorded the attributes allowed the readers to work over twice as fast and was preferred over the other manual.

A theoretical foundation is presented for the underlying online documentation used in the experiments. The user's traversal through the database is presented as a graph search process, using a production system.

The results of the experiments and their theoretical foundations are evaluated in terms of the impact they might have on future online document storage and retrieval systems.


Table of contents


Abstract
Introduction
Thesis organization
Prior work
Information structuring models
Search graph
A production system approach
Manuals
The maintenance environment
The user interface
Introduction to the experiments
Experiment 1: Mode and Structure of Presentation
Experiment 2: Online Imperative style manuals
Conclusions
Appendix A: Questionnaire for experiment 1
Appendix B: Questionnaire for experiment 2
Appendix C: Experiment 1 Paper-Tree manual
Appendix D: Experiment 2 manuals (samples)

Footnotes
References

Introduction

With the rapid proliferation of computers, there is an increasing need to be able to easily learn to use and obtain information from them. Yet, few novel methods have been developed to enable people to use computers for storing documents and easily accessing their contents on demand. The problem of retrieving information from documents is not limited to computers. This problem dates from the use of paper to record information. The traditional aids for improving access to information are methods such as indexing and place marks.

Printed documents have existed since Gutenberg invented movable type in the 15th century, and hand-written documents on papyrus and parchment since the ancient times. Successive refinements in aids for designing and using printed material have been developed during the past five centuries. We easily access sections of a book or document by turning the pages, and locate specific topics by turning to the table of contents or index. Inserting a book mark preserves the location of a particular item of interest. To enhance the understanding of information, books include graphics, diagrams, and artwork. The physical presentation of the printed material, such as fonts, color and format of presentation, helps convey information to the reader. Through experience and research, those factors that enhance the design of printed material have been refined.

Many of the aids available for printed material have not been made available for online documents. The problems are both hardware and software. The most difficult problem to overcome is the limitation of the hardware used for online retrieval and display of documents. Until recently, video display terminals (VDT) were not widely available. Unfortunately, most VDT's currently in use display limited quantities of information - typically 24 rows by 80 columns. When compared to an 8 1/2 x 11 inch page of paper, which has 66 single spaced lines, the VDT is clearly at a disadvantage. In addition, most VDT's are limited to a single font, where the sharpness and resolution of the characters are limited to 5 x 7 or 7 x 9 dot matrix fonts on phosphor displays. Experiments comparing comprehension and reading rates of material presented on VDT and paper have shown that the latter is better (Wright, & Lickorish, 1983; Gould, & Grischkowsky, 1984; Hansen, Doring, & Whitlock, 1978). Reading rates from VDT's are as much as 22% slower than from paper (Gould, & Grischkowsky, 1983). Another disadvantage of the VDT is that it cannot be held, as a piece of paper can be, at any desired angle or position. If possible, the software that controls the VDT must compensate for these disadvantages. But that is not sufficient. Online systems are qualitatively different from paper and have to do what paper does not do. The capabilities of the computer must be used to their best advantage.

From the software point of view, the aids we have for printed material have not generally been provided in online systems. Many of the basic aids for organization of online information are already available as a result of text processing software. Text processing programs will generate a table of contents, a list of figures, and an index. Other aids, such as book marks, dog ears, and margin notes, are usually not available online. Most of these deficiencies are not based on technical restrictions, per se, because the techniques exist to provide them.

The semantics of using online documents is not the same as using documents printed on paper. Even the simple act of turning a page is different. Turning a page in an online document may require different cognitive processes and physical actions. An online document presentation system frequently requires the entering of commands or the pressing of a function key. To turn to a particular section, other commands must be entered, possibly with a page number. However, finding a chapter in a book can be done by rapidly flipping the pages until a chapter title is located. Given the hardware characteristics of the VDT and the responsiveness of most software systems, it is not desirable to search in the same way for a chapter title page in online documents.

Conversely, there are several characteristics of a computer that give it an advantage over paper. First, a computer can be used to rapidly search for keywords in many documents more economically than can be done manually. Computers can also hold very large volumes of information in relatively little space and at low cost. With the proliferation of computer networks, and the very low cost of portable magnetic media (e.g., floppy disks, cartridge tapes), it is now possible to update online documents almost instantaneously. Rapid updating is not possible with printed material because printed updates take time to prepare and distribute. The problem remains of how to use the perceived advantages of the computer to make online documentation as good as, or better than, its paper counterpart.

Thesis organization

This thesis is divided into 12 chapters:

Prior work

Many systems have been developed to provide online information, including several systems for online document creation and retrieval. Each of these systems is suited for a specific audience. The uses of these systems include online document retrieval and display, online writing and literature retrieval, historical databases for museum environments, and online help systems.

Interactive viewing of existing documents

Viewing documents already stored electronically can be especially useful. The technology exists to take documents and transform them into a form that allows the user to view the overall structure, and browse through the contents online (Witten, & Bramwell, 1985). However, the drawback to this method of information retrieval, when applied to maintenance manuals and other online systems, is the strict hierarchical ordering inherent to paper documents. The hierarchy is needed in order to create an orderly presentation on paper, but is not required or necessarily desirable when information is to be viewed online. Improved cross referencing would require that the document be modified for its online presentation, and the hierarchical organization restriction must be relaxed. The other major disadvantage is that the information being displayed on the VDT screen was originally designed for paper. However, the VDT displays less information than a piece of paper. Thus, the user is forced to scroll the information in order to view an entire section of the document.

Hypertext

Hypertext is a method of organizing information, where pieces of information are connected via links into a network database (Moran, & Halasz, 1985; Trigg, 1984; Nelson, 1981). The length of stored information can be a few words or a complete literary work. A hypertext system is not limited to textual information. It can include graphics, video, and audio recordings.

The concept of linking pieces of information together to create a whole is one of the most appealing aspects of hypertext systems. The database links can be grouped together to form predefined paths that guide the user through the network, presenting a particular view of the database. Through the path mechanism, different versions of the same information can be presented. For example, the author may want to view several different organizations of a paper being written. Creating different paths for the same material will allow the author to see what the paper would look like when using the different organizations.

The hypertext approach allows users to incorporate sections of other people's works into their own. If a writer wants to cite another work, or quote a paragraph from another document, the writer simply inserts a link in the text that points to the other document. There is no need to copy or replicate any information in the database.

ZOG

The ZOG system is very similar to hypertext (Robertson, McCracken, & Newell, 1981). The underlying structure is a network of subnets containing information and menu frames. Just as in hypertext, these frames are connected together by links in the database. Links extend from menu selection frames to other menu or information frames.

A unique feature of ZOG is that it may be used as a front-end for other programming systems. Interactions with ZOG can result in programs being executed on the user's behalf. In turn, these programs can feed results to ZOG which presents the information to the user. If the user's program needs information (e.g., terminal input), ZOG can provide assistance in locating and constructing the correct response from menu inputs. The output from the user's program can be interpreted by ZOG, resulting in the selection of a new frame to be displayed.

The feedback between user programs and ZOG is very attractive for the diagnostic front-end aspects of maintenance systems. A diagnostic machine with a database of information can be attached to the system being tested. The diagnostic machine queries the target system for specific status information. The results are fed back for analysis. Next, the front-end automatically retrieves the appropriate information for the user from the database. An "expert system" (Nau, 1983) can manipulate the results from the target system before passing the information back to the front-end. There are a number of different possible ways in which these and other techniques can be combined.

TIES

The Interactive Encyclopedia System (TIES) allows museum visitors to learn about European history by accessing a historical database (Ewing, Mehrabanzad, Sheck, Ostroff, & Shneiderman, 1985; Shneiderman, & Ostroff, 1984). In the museum environment, emphasis is placed on keeping the user interface simple since TIES is to be used by many different types of people, including novices. The objective is that TIES should take less than a minute to learn.


Figure: TIES menu example
This example is an embedded menu from a database created as part of a student project (Hsu, & Powell, 1984). The embedded menu items are highlighted, and are shown here in bold type. The cursor is an inverse video bar that points to the current menu selection. Selecting NEXT PAGE allows the user to turn to the next page. END is used to quit TIES.
INTRODUCTION: STAMP UNION                            PAGE 1 OF 2
The Adele H. Stamp Union, formerly known as the Student Union, is the cultural and social center for the University. The Union provides a variety of services to the faculty, staff, and students. A plethora of restaurants are available providing a wide choice of atmosphere and a variety of menus. The Union is also center for entertainment. Several shops and many special services are available too. Union programs include concerts, exhibitions, and craft classes. The Union is open all week from 7 A.M. to 1 A.M. Monday through Friday, and until 2 A.M. on weekends.
NEXT PAGE       (Select option then press RETURN)           END

TIES has three active keys - move cursor left, move cursor right, and select menu item.1 Each of these commands is a single key stroke. Instead of extracting menu items and displaying them as an explicit menu (Koved, 1984), the items are highlighted directly in the text (see TIES menu example). This method of displaying text with embedded menu items has become known as touchtext (Koved, & Shneiderman, 1985). The initial screen contains an article with phrases that are highlighted. The user positions the cursor on one of the highlighted phrases. Pressing the select key causes a new article to be retrieved.

The top line of the screen contains information relevant to the current article. On the left there is an "Article Title" and on the right is a page number, such as "Page 1 of 4". If there are two or more pages of text for an article, then commands (an explicit menu) are shown at the bottom of the screen to allow page turning. These commands include "Next Page," "Previous Page," and "Return to ." Those menu items that are not applicable are not shown. For example, "Previous Page" is not shown when the first page of an article is being displayed.

The TIES system presents a navigation strategy that is similar to the hypertext and ZOG systems. The user is presented with the view that the database is organized as a unified network of related articles. The user starts at an introductory article and then proceeds to make menu selections that retrieve new articles. At each node in the network, except at the introduction, the user can request that the system return to the previously viewed article.

The TIES concept provides the basis of an online maintenance manual system because of its simplicity. People can easily access information by touching the screen, using a mouse, or pressing cursor keys. The database traversal strategies are simple and flexible. Little training is required to learn and use TIES. These concepts are desirable in construction of large online maintenance manuals.

Current online help and manuals

Most online help and documentation systems are relatively unsophisticated. The primary access mechanism is through keyword lookup, and access into a simple single level database of documents. For example, when a user needs information on the correct syntax or semantics of a command, the user enters "HELP <command_name>". While the specific syntax varies, this is the primary method of obtaining online information from many current systems such as TOPS-20 (DEC, 1980), UNIX (UCB, 1981), and VM/CMS (IBM, 1981).

Other online help systems have hierarchically organized help and documentation, enabling the user to access various levels of detail of a command or topic (Posner, Hill, Miller, Gottheil, & Davis, 1983; DEC, 1980). These systems are usually keyword driven (e.g., HELP <command> <argument> ...). Other systems are more sophisticated, such as Lotus 1-2-3 (Posner et al., 1983), which uses a menu approach to accessing the information database.

These systems are general purpose, and mostly use static databases. When the user tries to get help information, the text presented to the user is the same every time, regardless of the context or task being performed.2

Information structuring models

The underlying structure of a database is a very important consideration when designing and constructing an online information retrieval system. Various structures allow different types of navigation and dictate how the information must be organized. The linear traversal of a set of nodes and cyclic graph are opposite extremes of structuring information. The linear traversal, turning page by page, is similar to reading from paper but lacks flexibility. A cyclic graph by itself lacks the structure needed for organizing information. Documents do exhibit structure, and are typically organized as trees or lattices.

Trees

Trees organize information in a hierarchical manner appropriate for many applications. Videotex and other similar systems (Tetzlaff, 1984; Raymond, 1984; Ferrarini, 1980) use trees as the basic structure for databases. Each menu in the system provides access to subtrees. The user starts from a main menu (the root) and traverses down the edges of the tree until the desired information is retrieved.

Each node in the tree (except leaf nodes) is a decision point. Each selection the user makes becomes information carried down through the tree. The information known at node N, at depth D, is all of the prior D-1 decisions made to reach node N. Since the structure is a tree, the path to reach node N is unique. A tree is a useful structure when designing an Imperative style manual,3 since a node in the tree cannot be reached unless all of the prior nodes in the path have been visited. The tree structure and its traversal strategy ensure that all of the prerequisite information has been viewed.
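The unique-path property can be sketched in a few lines of Python. This is only an illustration, not the data structure used in the experiments; the class name TreeNode, the function decisions_to, and the menu labels are all hypothetical. The root is taken to be at depth 1, so a node at depth D is reached by D-1 decisions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TreeNode:
    """One menu or text node in a strictly hierarchical manual (hypothetical)."""
    name: str
    parent: Optional["TreeNode"] = None
    children: Dict[str, "TreeNode"] = field(default_factory=dict)

    def add_child(self, choice: str, name: str) -> "TreeNode":
        """Attach a child reached by the menu selection `choice`."""
        child = TreeNode(name=name, parent=self)
        self.children[choice] = child
        return child


def decisions_to(node: TreeNode) -> List[str]:
    """Recover the menu selections on the single path from the root to `node`."""
    path: List[str] = []
    while node.parent is not None:
        # Find the label of the edge leading from the parent to this node.
        choice = next(c for c, n in node.parent.children.items() if n is node)
        path.append(choice)
        node = node.parent
    return list(reversed(path))


# Illustrative data only; these labels do not come from the thesis.
root = TreeNode("Main menu")
printer = root.add_child("printer", "Printer problems")
jam = printer.add_child("paper jam", "Clearing a paper jam")
print(decisions_to(jam))
```

Because every node has exactly one parent, the list of decisions can be reconstructed from the node alone, which is what lets a tree guarantee that the prerequisites were seen.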


Figure: Tree structure with cross-links

Information is not always organized as a strict hierarchy. Designers of Videotex systems allow cross-linking between subtrees in the database. The links avoid replicating information in the database. Links serve as a mechanism to allow the user to traverse several different paths to reach a given piece of information (see Tree structure with cross-links). If an incorrect decision is made early in the search through the database, the user will not find the desired information unless the database contains cross-links or replicated data. The use of links allows mistakes to be corrected at lower levels of the tree.

The addition of links eliminates one of the strengths of the tree by providing multiple paths to a node in the database. With links, it is impossible to determine the exact path (the decisions) taken to reach a node in the database. That is, if an Imperative style of manual uses a tree structure with links, it is impossible to know if all of the prerequisites have been met.
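A small Python sketch makes the loss of path uniqueness concrete. The graphs TREE and LINKED and the function paths_to are hypothetical illustrations, and the sketch assumes the graph is acyclic.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]


def paths_to(graph: Graph, start: str, goal: str) -> List[List[str]]:
    """Enumerate every path from `start` to `goal` (assumes an acyclic graph)."""
    if start == goal:
        return [[start]]
    found: List[List[str]] = []
    for successor in graph.get(start, []):
        for tail in paths_to(graph, successor, goal):
            found.append([start] + tail)
    return found


# A strict tree: exactly one path reaches "printer".
TREE: Graph = {
    "root": ["hardware", "software"],
    "hardware": ["printer"],
    "software": ["drivers"],
}

# The same database with one cross-link from "software" to "printer".
LINKED: Graph = {**TREE, "software": ["drivers", "printer"]}

print(len(paths_to(TREE, "root", "printer")))    # one path
print(len(paths_to(LINKED, "root", "printer")))  # two paths
```

Once the second path exists, arriving at "printer" no longer says whether the reader passed through "hardware", so a prerequisite stored there may have been skipped.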

Lattices

A directed graph is the foundation of hypertext systems because of its flexible structure. Unlike the structure of trees, a directed graph, either acyclic or cyclic, allows the designer to determine the number of roots in the database.

Parallel paths are frequently found in manuals when presenting alternative methods for accomplishing a task. Alternatives are presented at decision nodes. Different parallel paths can be followed from the decision nodes until the paths merge at a later node in the network. For example, the installation procedures of a new piece of hardware may be accomplished in several ways depending upon the type or model of the hardware. The instructions for the different hardware proceed along independent paths that merge later in the installation procedure.


Figure: Lattice
A lattice with decision point D and merge point M. The lattice is a subgraph of a network of nodes.

The database structure of this parallelism can be viewed as a lattice.4 The lattice is a subgraph of the database network (see Lattice). Node D of the lattice is a decision point. All nodes between D and M, but not including them, contain information about the decision made at D, just as in a tree structure. Lattices within lattices are permitted in the database (see Lattice within lattice). As in the tree, the information regarding the decisions made determines whether or not prerequisites have been met.


Figure: Lattice within lattice
A lattice can be a subgraph of a larger lattice.

Lattices help structure information in a database. The combining of an unstructured network of nodes into a network of lattices has some of the strengths of a tree, but eliminates the redundancy and hierarchy problems. However, major drawbacks are inherent in the lattice structure. Parallel paths of lattices merge.5 At the node where the paths merge, the information about the original decision is lost.6 So, information about the decision made at node D is lost when the user reaches node M. In general, when a node N is reached from several non-intersecting paths (see Merge of many paths), there is no way of determining which path was traversed to reach it.


Figure: Merge of many paths
Disjoint paths may merge at a single node. Without a decision database, the exact path traversed to reach node N would be unknown.

Lattices with a decision database

To prevent losing information when using lattices, the decisions can be recorded in a decision database.7 Thus, when the paths merge at a later node, the information about the decision is not lost since it is retained in the decision database. The information remains in the decision database until it is subsequently modified by another decision point. Nodes further along in the search path use the decision database to determine the specific path traversed. This technique dramatically reduces the number of nodes needed in an information or document database since a bushy tree is no longer required.
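The idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the node names D, A1, B1, and M, the identifier "model", and the function traverse are hypothetical, and the decision database is modeled as a plain dictionary.

```python
from typing import Dict, List, Tuple

# Hypothetical lattice: D is the decision point, M is the merge point.
EDGES: Dict[str, Dict[str, str]] = {
    "D": {"model A": "A1", "model B": "B1"},
    "A1": {"continue": "M"},
    "B1": {"continue": "M"},
    "M": {},
}

# Which identifier, if any, a node's menu sets in the decision database.
SETS: Dict[str, str] = {"D": "model"}


def traverse(start: str, choices: List[str]) -> Tuple[str, Dict[str, str]]:
    """Follow `choices` from `start`, recording decisions along the way."""
    decisions: Dict[str, str] = {}
    node = start
    for choice in choices:
        if node in SETS:
            # A later decision point using the same identifier would overwrite it.
            decisions[SETS[node]] = choice
        node = EDGES[node][choice]
    return node, decisions


node_a, db_a = traverse("D", ["model A", "continue"])
node_b, db_b = traverse("D", ["model B", "continue"])
print(node_a, db_a)
print(node_b, db_b)
```

Both traversals end at the same merge node, but the recorded decision still distinguishes them, so text shown at M and beyond can depend on the choice made at D without duplicating the nodes that follow.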

Search graph

User search space

A user searches the database of an online manual for desired information. The set of linked hypertext nodes forms a graph. While looking for desired information, the user is performing a graph search to reach one or more goals in the search space.

The search space can be extremely large for a commercially available database such as The Source or a Videotex system. The search space must be pruned to reduce it to a manageable size. Several different approaches to pruning the search space exist. The simplest technique presents the user with a set of menus and asks the user to traverse an edge in the search graph. If the database is stored as a tree, then each selection has the potential for rapidly reducing the search space. In a cyclic graph network, however, a single menu selection may not appreciably reduce the search space. Therefore, other techniques are needed to prune the search space.

Two types of pruning

Two new methods exist for pruning the search space in online information retrieval systems. The first prunes the number of accessible nodes during the search. The second method reduces the volume of information displayed to the user on the screen.

Pruning subtrees

When using a lattice structure with a decision database, the decisions made along the path to the current node are recorded. When the information retrieval system fetches the current node from the database, it uses predefined rules8 to determine whether or not paths should be made accessible to the user. If one or more paths are determined to be not applicable, then the system prunes those paths by not allowing the user to traverse them. For example, by not displaying a menu selection, the user is denied access to the subpath; hence, the graph has been pruned (see Pruned menu selections).


Figure: Pruned menu selections

The example shown above shows benefits for a company that makes a distinction between its salaried and hourly employees. Example 1 shows the type of information that is made available to the salaried employees. They have more benefits than the hourly workers who only have access to the information shown in Example 2. The pruning process eliminated two of the menu items and relabeled the menu selections.

The two figures to the right of the menus show the unpruned and the pruned graphs that represent the two menus.
Example 1: 

A - Official holidays
B - Personal days off
C - Sick days
D - Paid vacation days
Example 2: 

A - Official holidays
B - Sick days
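The two menus above can be reproduced with a short Python sketch. It assumes a decision database modeled as a dictionary with a hypothetical identifier "employee_type"; the rule functions and the name pruned_menu are illustrative and are not taken from the system described here.

```python
from typing import Callable, Dict, List, Tuple

DecisionDB = Dict[str, str]
Rule = Callable[[DecisionDB], bool]


def always(_db: DecisionDB) -> bool:
    """Rule for items shown to every reader."""
    return True


def salaried_only(db: DecisionDB) -> bool:
    """Rule for items shown only after 'salaried' was recorded."""
    return db.get("employee_type") == "salaried"


# Each menu item carries the rule that decides whether it survives pruning.
MENU: List[Tuple[str, Rule]] = [
    ("Official holidays", always),
    ("Personal days off", salaried_only),
    ("Sick days", always),
    ("Paid vacation days", salaried_only),
]


def pruned_menu(db: DecisionDB) -> List[str]:
    """Drop items whose rule fails, then relabel the survivors A, B, C, ..."""
    visible = [label for label, rule in MENU if rule(db)]
    return [f"{chr(ord('A') + i)} - {label}" for i, label in enumerate(visible)]


print("\n".join(pruned_menu({"employee_type": "salaried"})))
print("\n".join(pruned_menu({"employee_type": "hourly"})))
```

Relabeling after pruning is what turns "C - Sick days" in Example 1 into "B - Sick days" in Example 2.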


Pruning text


Figure: Sentence tree


The second type of pruning occurs while the text is being formatted for display. The complete text of a database node is a network of words, phrases, and sentences linked together in a serial fashion (see the Sentence tree figure). Using predefined rules, as used in pruning subtrees of a graph, the text formatter modifies the displayed information based upon the current state of the decision database. The pruning takes place when the formatter determines that a particular piece of text is not relevant and should not be shown.

Database node reduction

Text pruning reduces the number of physically distinct nodes stored in a document database. For example, a decision node may create several parallel paths through a document. The parallel paths are often only slight variations of one another. Text pruning collapses the multiple paths into a single path by making the appropriate text substitutions (see the Text substitution figure).



Figure: Text substitution

The text is substituted to dynamically create new nodes in the database. "Knob" and "handle" are interchangeable, and "counter" was added to the second sentence.
_____________________________
Turn the knob clockwise.

Turn the handle counter clockwise.
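The substitution shown above can be sketched in Python. This is an illustration under assumptions: the stored node is modeled as a list of fragments, the identifiers "control" and "direction" and their values are hypothetical, and render stands in for the text formatter.

```python
from typing import Dict, List, Tuple, Union

DecisionDB = Dict[str, str]
# A fragment is literal text, or (identifier, {recorded value: text to show}).
Fragment = Union[str, Tuple[str, Dict[str, str]]]

# One stored node that covers both sentences in the figure.
SENTENCE: List[Fragment] = [
    "Turn the ",
    ("control", {"knob": "knob", "handle": "handle"}),
    " ",
    ("direction", {"cw": "", "ccw": "counter "}),
    "clockwise.",
]


def render(fragments: List[Fragment], db: DecisionDB) -> str:
    """Format a node, substituting or pruning text from the decision database."""
    parts: List[str] = []
    for fragment in fragments:
        if isinstance(fragment, str):
            parts.append(fragment)
        else:
            identifier, alternatives = fragment
            # Text with no matching alternative is pruned (rendered as empty).
            parts.append(alternatives.get(db.get(identifier, ""), ""))
    return "".join(parts)


print(render(SENTENCE, {"control": "knob", "direction": "cw"}))
print(render(SENTENCE, {"control": "handle", "direction": "ccw"}))
```

A single stored node thus yields both sentences, so the two parallel paths need not be stored as separate nodes.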


A production system approach

Following the structure of many artificial intelligence (AI) systems, the design of the online documentation system presented in this thesis is divided into three parts: control structure, rules, and database (Nilsson, 1980). The difference between an AI application and the online documentation systems is that the latter are directed by the user rather than a programmed heuristic. The user determines when to backtrack, whether or not acceptable goals have been reached, or when a search has failed and should be terminated.

Control structure

The control structure is a combination of the online maintenance manual program manipulating the database through the use of rules, and through interaction with a person using the system. The program is given a start node in the network where the search is begun. The rules indicate the successor nodes in the graph. The user selects a successor node for expansion by making a menu selection. At any point in the search, the user can initiate backtracking by returning to previously viewed sections of the manual.9
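A user-directed control loop of this kind might be sketched as follows. The class ManualSession, its method names, and the sample links are hypothetical; the sketch uses simple link rules only and leaves every search decision, including backtracking, to the caller standing in for the user.

```python
from typing import Dict, List

# Simple rules: node -> {menu selection: successor node}. Illustrative data only.
Links = Dict[str, Dict[str, str]]


class ManualSession:
    """Tracks one reader's position in the manual and the path taken so far."""

    def __init__(self, links: Links, start: str) -> None:
        self.links = links
        self.history: List[str] = [start]

    @property
    def current(self) -> str:
        return self.history[-1]

    def successors(self) -> List[str]:
        """Menu selections that the rules make available at the current node."""
        return sorted(self.links.get(self.current, {}))

    def select(self, choice: str) -> None:
        """Expand the successor node chosen by the user."""
        self.history.append(self.links[self.current][choice])

    def backtrack(self) -> None:
        """Return to the previously viewed node; no effect at the start node."""
        if len(self.history) > 1:
            self.history.pop()


LINKS: Links = {
    "intro": {"disk": "disk-faults", "tape": "tape-faults"},
    "disk-faults": {"replace": "replace-disk"},
}

session = ManualSession(LINKS, "intro")
session.select("disk")
session.select("replace")
session.backtrack()
print(session.current, session.successors())
```

The program only reports successors and moves between nodes; deciding which successor to expand, and when to give up, stays with the person using the manual.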

The user determines the success or failure of the search. If the user fails to find the desired information, the search is terminated. Sometimes multiple subgoals are required to solve a given problem. Similar to AND/OR trees (Nilsson, 1980), each subgoal may require searching one or more subpaths in the database.

People have various strategies (heuristics) for searching. The search process may not be optimal or result in reaching the goal due to the pruning of subgraphs the user perceives to be unpromising. The problem of pruning subgraphs is related to the difficulty of indexing material for document databases (Blair, & Maron, 1984; Furnas, Landauer, Gomez, & Dumais, 1983; Landauer, Dumais, Gomez, & Furnas, 1982; Thomas, & Carroll, 1981). When the database systems are used by people with different backgrounds, synonymy and polysemy make indexing of information difficult.

Rules

Simple rules are links tying the documents into a network. Complex rules include links with semantics or information used for graph pruning. In the online maintenance manual systems used in the present research, two different styles of rules were developed. The first system uses simple rules for linking documents together. The second system uses the rules to link the documents together as well as antecedents for graph pruning (Nau, 1983). The information in the decision database is used by the antecedent rules in the pruning process.

Database

Databases are the stored documents that are usually divided into subsections. Each of these subsections is linked into a digraph. Most systems use static databases. However, techniques exist to dynamically generate new nodes in the graph on demand as was previously described in section 4.2.3 on database node reduction.

Substitution

Each of the three parts of the production system described can be interchanged with an equivalent component. Different users can use the same system and programs are interchangeable with other programs that use the same databases and rules. Substituting the rules modifies the underlying graph structure of the database as well as the pruning behavior, and can be used for developing alternate presentations for different audiences. The database can be substituted without replacing the rules as was done to develop two sets of structurally identical problems for the second experiment of the present research. The ability to substitute any of the three parts of the system provides flexibility and extensibility.

An expert system

The databases and rules are designed by experts in a specific field. These two components form the basis of an expert system (Nau, 1983), albeit a simple one. The unique aspect of the expert system described here is that the user is an integral part who directs the search process as well as provides inputs. At each node in the document database, the control structure selects the antecedent rules and values in the decision database to perform pruning of the search space.

Text processing technique

For simplicity, it may be desirable to merge the rules and the database into a single file. Using text processing languages that model document structure, such as GML (IBM, 1980) and Scribe (Reid, 1980), the rules can be written as text processing commands embedded in the text of the database. Text processors can be used for printing the database as a document, or the file can be processed by another transducer to produce the database and rules as previously described. The use of text processing languages has the advantage of creating documents with existing tools, and allowing existing documents to be easily transformed into the database and rules format.


Figure: Text processing tags for pruning

In the first example, the text following the 'text' tag is included if the rule (the lisp function) returns a non-nil response. In the second example, the rule and text are similar to the first example, but the identifier in the database called 'quantity' is set to the value '1' if the user selects the menu item 'One'.

:tag-text
:rule ( = 'draft' version )
:text This is text to be included in the draft.
:end-tag

:menu-item
:rule ( = 'left' handed )
:text One
:set identifier=quantity value=1
:end-tag

The relationship to pruning

The production system controls the type of pruning taking place by making text substitutions. There are two types of pruning taking place -- the first is to prune text to be shown on the screen, and the second is to prune menu selection items. In the figure above, the text is preceded by text processing tags. These tags identify the function that determines whether or not the text or menu item is to be pruned. If the function returns nil, then the characters following the tag are pruned. Otherwise, the text is formatted and displayed on the screen. If the text being formatted is a menu item, then the text processing tags indicate that it is selectable and optionally define an identifier and a value. When the user selects a menu item, the production system assigns the value to the identifier. These identifiers are used by the functions defined in the tags to determine whether or not pruning is to be performed.
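The behavior described above can be approximated with a small Python interpreter for the two tagged blocks in the figure. This is a sketch under assumptions, not the transducer used in the thesis: it handles only rules of the form ( = 'literal' identifier ), it treats a failed comparison as the nil case, and the function names parse_blocks, rule_holds, visible, and select are hypothetical.

```python
import re
from typing import Dict, List, Optional

# The two tagged blocks from the figure, reproduced as input text.
SOURCE = """
:tag-text
:rule ( = 'draft' version )
:text This is text to be included in the draft.
:end-tag

:menu-item
:rule ( = 'left' handed )
:text One
:set identifier=quantity value=1
:end-tag
"""

RULE = re.compile(r"\(\s*=\s*'([^']*)'\s+(\w+)\s*\)")
SET = re.compile(r"identifier=(\w+)\s+value=(\S+)")


def parse_blocks(source: str) -> List[Dict[str, str]]:
    """Split the source into blocks delimited by an opening tag and :end-tag."""
    blocks: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    for raw in source.splitlines():
        line = raw.strip()
        if line in (":tag-text", ":menu-item"):
            current = {"kind": line[1:]}
        elif line == ":end-tag" and current is not None:
            blocks.append(current)
            current = None
        elif current is not None and line.startswith(":"):
            key, _, value = line[1:].partition(" ")
            current[key] = value.strip()
    return blocks


def rule_holds(rule: str, db: Dict[str, str]) -> bool:
    """Evaluate an equality rule against the decision database."""
    match = RULE.fullmatch(rule.strip())
    if match is None:
        raise ValueError(f"unsupported rule: {rule!r}")
    literal, identifier = match.groups()
    return db.get(identifier) == literal


def visible(blocks: List[Dict[str, str]], db: Dict[str, str]) -> List[Dict[str, str]]:
    """Keep only the blocks whose rule succeeds; the rest are pruned."""
    return [block for block in blocks if rule_holds(block["rule"], db)]


def select(block: Dict[str, str], db: Dict[str, str]) -> None:
    """Apply a menu item's :set clause, if any, to the decision database."""
    match = SET.fullmatch(block.get("set", ""))
    if match is not None:
        db[match.group(1)] = match.group(2)


db = {"version": "draft", "handed": "left"}
shown = visible(parse_blocks(SOURCE), db)
select(shown[1], db)
print([block["text"] for block in shown], db)
```

Values are kept as strings here; a fuller system would need typed values and a real rule language.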

Manuals

Styles of manuals for maintenance

Listed below is a description of typical manual styles:

Current online manuals

Current online manuals can be substantially improved. Most manuals use static databases whose content is displayed on demand. These manuals are usually General Information manuals or Narrative with references (e.g., UNIX man command, VM/CMS Help command). A common organizing technique for online manuals is to subdivide the manual and allow access via keywords. The user must know what keywords to use to access the database. The information presented to the user is usually not based upon the user's current context. Since the manual content is static, the reader must infer from the descriptions presented what is important, and how to apply the information. The content of the manual is usually syntactic and semantic descriptions, without examples showing how the information presented is to be used. The static nature of most online documentation systems does not readily assist in solving the user's problem. They merely present what is possible, but not what to do to reach a specific goal.

The maintenance environment

Modes of search

In general, there are two basic modes in which a person searches for information.

Problem solving

The problem solving in the maintenance environment is constrained. The complexity and sophistication of mechanical and electrical devices is considerable. Only a limited number of personnel are usually available to service all of the machines, such as mainframe or mini computers. If each mechanical or electrical device in these machines had to be repaired rather than replaced with a new or similar unit, the cost would be staggering in terms of manpower. In addition, an insufficient number of technicians would be available to perform the repairs. In some companies, a person may be required to service and repair several different types of complex equipment.

For these reasons, many companies have chosen to design their equipment so that it is composed of "black boxes." The technician is required to perform diagnostics to determine which of several "black boxes" has failed, and replace that component. The replacement of a modular component reduces the amount of time to train service personnel, and also reduces the amount of time to perform a particular maintenance task.

The user interface

A very simple user interface was needed for conducting experiments involving online manuals. An objective of this interface was to minimize the time to learn to use the system and find the information of interest. Minimizing the number of errors the user committed while searching through the online manual was desired. Time devoted to learning the system was also minimized due to the time limitations for conducting experiments. These criteria led to the adoption of an interface similar to TIES (Ewing, et al., 1985; Shneiderman, & Ostroff, 1984).

Screen layout


Figure: OnLine Maintenance Manual screen and key layout

The screen is divided into three regions. The top is for status information (e.g., title and page numbers), the central region for text, and the bottom for page turning commands. The function keys are shown on the right.

In the OnLine Maintenance Manual (OLMM), two adjacent display screens were used to simultaneously display text and graphics.

One display screen was divided into three regions (see the figure above). Each region was visually separated by thin double lines. The top region provided orientation guides, listing the current section title of the manual and the page number. The bottom region contained an explicit menu of navigation commands. If a section of the manual had two or more pages of text, then commands at the bottom of the screen allowed page turning. These commands included "Next Page," "Previous Page," "Previous Section," and "First Page." The "Previous Section" command allowed the user to return to the previously viewed section of the manual. When the first page of a manual was displayed, the "Previous Page" menu item was not available. Pressing the "Page Number" function key erased the information at the bottom of the screen and displayed the question: "Page number?" Once the user entered and selected the page number, the bottom line returned to its former state.

The central portion of the screen displayed up to ten lines of textual material. Using double spacing alleviated difficulty with reading densely packed material on a small screen (Kolers, Duchnicky, & Ferguson, 1981). A 70 character text body was chosen, rather than the full 80 character width of the display, to allow space for left and right margins.
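As an illustration of the layout constraints just described, the following sketch breaks a section's text into screens of at most ten lines, each at most 70 characters wide. It is a sketch only, not the OLMM implementation: the function name paginate is hypothetical, and Python's standard textwrap module is assumed as the line-filling method. Double spacing is treated as a display concern and is not modeled.

```python
import textwrap
from typing import List

LINE_WIDTH = 70      # characters per text line, as stated in the text
LINES_PER_PAGE = 10  # text lines per screen, as stated in the text


def paginate(section_text: str) -> List[List[str]]:
    """Fill the text into lines of at most LINE_WIDTH characters,
    then group the lines into pages of at most LINES_PER_PAGE lines."""
    lines = textwrap.wrap(section_text, width=LINE_WIDTH)
    return [lines[i:i + LINES_PER_PAGE]
            for i in range(0, len(lines), LINES_PER_PAGE)]
```

A section that fills more than one page under this scheme is the case in which the page turning commands are needed.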

The cursor was an inverse video region on the screen. The cursor could point only to the menu items. These included the embedded items in the text and the navigation commands at the bottom of the screen.

The second display, an IBM Personal Computer Color Display, presented graphics. The graphics were limited to line drawings, filled rectangular regions, and textual captions or legends. The graphics were in high resolution mode.

Input syntax

The input syntax was restricted to five function keys. The design of the user interface deliberately limited the choice of actions to maintain the simplicity of the user interface.

The five active keys were labeled "Move Bar," "Select," "First Page," "Page Number," and "End." Each of these commands was a single key stroke, using either the function keys on the left side (first experiment), or the numeric keypad on the right side of the keyboard (second experiment). The function keys were overlaid with appropriate labels (e.g., "Move Bar").

System semantics

OLMM presented the user with a simple strategy for accessing information. The user viewed the database as an acyclic tree. In practice, however, the database may be a cyclic network of nodes.

Each time the user made an embedded menu selection, a new section of the manual was retrieved from the database. To make a selection, the user positioned the cursor on an underlined phrase, which then became highlighted in inverse video, and pressed the select key. The new information appeared on the screen. The selection process was both repeatable and reversible.

Each manual section was either an interior node or leaf of the tree. If the node was an interior node, then it contained additional embedded menu items. Leaf nodes of the tree did not contain embedded menu items.

Once an embedded menu selection was made, the current location in the database was pushed onto a stack. At each node in the tree (except at the root), the user could request that the system move one level back toward the root by selecting "Previous Section". To locate the previous section, the system popped the top entry off the stack and retrieved the corresponding database entry.
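The stack discipline described above can be sketched as follows. This is a minimal illustration and not the OLMM code: the class name Navigator and the section names other than "Power Up" are hypothetical.

```python
from typing import Dict, List


class Navigator:
    """Tracks the current section and a stack of sections to return to."""

    def __init__(self, sections: Dict[str, List[str]], root: str) -> None:
        # Maps each section name to the sections reachable from its embedded menus.
        self.sections = sections
        self.current = root
        self.history: List[str] = []  # the stack of previously viewed sections

    def select(self, target: str) -> str:
        """Follow an embedded menu item: push the current location, then move."""
        if target not in self.sections[self.current]:
            raise ValueError(f"{target!r} is not a menu item in {self.current!r}")
        self.history.append(self.current)
        self.current = target
        return self.current

    def previous_section(self) -> str:
        """Pop the most recent location and return to it."""
        if not self.history:
            raise IndexError("Previous Section is unavailable at the root")
        self.current = self.history.pop()
        return self.current


# Hypothetical three-section manual; "Power Up" is reachable from two places.
sections = {
    "Contents": ["Installation", "Power Up"],
    "Installation": ["Power Up"],
    "Power Up": [],
}
nav = Navigator(sections, root="Contents")
nav.select("Installation")
nav.select("Power Up")
back_once = nav.previous_section()   # "Installation"
back_twice = nav.previous_section()  # "Contents"
```

Because the return path comes from the stack rather than from the database links, "Previous Section" returns to the section the user actually came from, even when a section such as "Power Up" can be reached from several places.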

If a section had more than ten lines of text to display, the commands for page turning were shown at the bottom of the screen. The user could point to any of the page turning commands with the cursor and select the desired one.

Each node, or section of a manual, had an associated graphic description. Whenever a section of a manual was retrieved from the database, the associated graphic description was also retrieved and the figure was shown on the second display. The need for a second display screen for the graphics was due to both space limitations of the first display and hardware capabilities.

Introduction to the experiments

Two experiments investigated people's performance (time to perform a specific task) while using different styles of manuals. The subjects were placed in a situation similar to a maintenance environment where they were required to use a manual to solve hardware maintenance problems. The subjects were required to locate information and perform actions according to the directions in the manuals.

The manuals provided a directed search strategy. Once the subjects located the appropriate starting location, the manuals explicitly instructed them as to what needed to be done to solve the problems. Each of the experiments used a different style of manual -- Narrative style for the first experiment, and Imperative style for the second experiment.

Experiment 1: Mode and Structure of Presentation

Overview

Manuals written in the Narrative style have two primary characteristics. The first is that they refer the reader to other sections of the manual. References are given as page numbers, section numbers, or topic titles. When the reference uses a page number, the search is straightforward -- just turn to the correct page. However, not all page numbers follow a sequential scheme. Some books use a chapter-page number format, such as 5-145 to refer to page 145 of chapter 5. This form of page numbering increases the search time since both the chapter and the page numbers must be located.

Searching for a section number is analogous to locating a word in a dictionary. The reference words located at the top of the page in a dictionary serve as an index. Searching for a section by topic title is more complex. Either the table of contents or the index must be consulted to determine the page number, and then the reader must turn to the correct page.

In general, each of these search operations can be tedious and time consuming. Online systems using these reference techniques can require more time than paper manuals to locate the correct section because of complex input syntax, slow speeds of the computer output devices, or poor indexing and access strategies.

The other primary characteristic of Narrative manuals is that the reader must keep track of the locations in the text where reading of the narrative was suspended in order to read a reference. With paper manuals, the reader may insert place marks, such as hands or fingers, into the manual. However, online place holders are not generally available. When using online manuals, the reader must remember the location where reading was suspended in order to resume at the proper location. If a multiple reference chain leads the reader from one topic to another, the page numbers and the correct sequence that returns the reader to the original topic may be forgotten. As Borenstein (1985) has noted, many online help systems lose or destroy the current context. Thus, the user is burdened with remembering and restoring the previous context.

Hypotheses

The first experiment attempted to answer the following questions:

  1. Which mode of presentation (online or paper) is faster for retrieving information, and by how much?

    Overall, the subjects using paper manuals, both linear and tree, were expected to read and solve problems faster than the subjects using the online versions. These expectations were based on the poorer performance of people reading from VDT screens than from paper (Mills, & Weldon, 1984; Wright, & Lickorish, 1983; Gould, & Grischkowsky, 1984; Hansen, Doring, & Whitlock, 1978), and the greater familiarity the subjects have with using paper manuals.

    The linear presentation was hypothesized to be faster than the tree version for the paper manuals. The linear presentation was similar to the standard method of organizing information on paper, thus it was more familiar to the readers.

    In the computer versions, the number of pages viewed by the subjects for each trial was recorded. The hypothesis was that subjects with the tree version would view fewer pages than subjects with the linear version. That is, for the tree version, only the pages necessary to perform the experimental task needed to be viewed. In contrast, when using the linear version, the readers were expected to turn the pages sequentially to find the correct location in the manual.

  2. Does the presentation structure (linear vs. tree) affect the search time? Is there an interaction between the presentation structure and the mode of presentation?

    For the computer versions, the subjects using the tree structured manual were expected to be faster than those using the linear version. The tree structure contained mechanisms to automatically locate the correct information. Also, the implementation of the tree structured manual returned the reader to the correct page after a reference had been viewed. However, in the linear presentation, the reader had to remember page numbers in order to return to the correct narrative pages. Forgetting the page number from which they had come was expected to result in impaired performance for the subjects in the linear presentation.

  3. Will the mode or structure of the presentation affect the number of errors made?

    Fewer errors were expected in the most familiar situation (paper-linear) as compared to the other situations that were less familiar.

Method

Experimental design

The experiment was a 2 x 2 x 12 design with mode of presentation (paper vs. computer), structure of presentation (linear vs. tree), and trials (1-12) as factors. Both mode and structure of information were between-subject factors, and trials was a within-subject factor. The four experimental conditions were labeled computer-tree, computer-linear, paper-tree, and paper-linear (see the Experiment 1 - Design figure).


Figure: Experiment 1 - Design

The four experimental conditions, mode of presentation by structure of presentation.

                     Structure of Presentation
Mode of
Presentation     Tree         Linear
               /-----------.------------\
               |           |            |
     Computer  | Computer- | Computer-  |
               | Tree      | Linear     |
               |           |            |
               |-----------+------------|
               |           |            |
     Paper     | Paper-    | Paper-     |
               | Tree      | Linear     |
               |           |            |
               \-----------.------------/


The dependent measures were the time to solve the problems and the number of errors made across the trials. In the computer conditions, the number of pages viewed per trial was an additional dependent measure. A questionnaire completed by each subject at the conclusion of the experiment provided additional between-subject statistical measures.

Subjects

Fifty-six undergraduate students recruited from the University of Maryland student body participated in the first experiment. Data from forty subjects were used in the analysis. The remaining sixteen subjects were not included because they either took part in the pilot testing (four students) or did not complete the twelve trials. The subjects not included were evenly distributed among the experimental conditions. Each subject was paid $5 for participating in the experiment.

Materials

Manuals

A simulated electronic intercom maintenance manual was written for the experiment. Four versions were developed, one for each of the experimental conditions. Each version contained the same text and graphics (diagrams); however, the organization and the number of pages in each version differed. A page in the computer versions was considered to be a single display screen of text with an accompanying screen of figures.

The pages in the computer-tree manual were organized hierarchically, giving the appearance of being a tree. Access to information in the manual was through embedded menus. The user interface provided a subset of the functions and features previously described. The three function keys provided, "Move Bar," "Select," and "First Page," were located on the left-hand side of the keyboard (see the figure below). The reader could move the cursor to an underlined phrase and then select it in order to display a detailed section on that topic.


Figure: Experiment 1 - Computer-tree screen and function keys


Previous Section was a selectable menu item that appeared on every page except for the first page. This menu item allowed the user to turn back to the previously viewed section of the manual. The Previous Section function was analogous to backward page turning.

Although there were only 30 unique pages in the manual, the hypertext organization gave the reader the impression that there were 52 pages. The readers were able to select some sections from more than one place in the manual. For example, the "Power Up" section could be reached from three different places in the manual even though the article existed only once in the database. The pages were not numbered in this version since the manual was logically organized as a lattice.

The pages in the computer-linear manual were organized in a sequential, or book, format. Access to the pages of the manual was through page-turning commands, similar to many current online manuals.

The user interface for this manual was a subset of the functions and features previously described. The three function keys used were "Move Bar," "Select," and "Page Number." Pressing the "Page Number" key allowed the user to enter the number of any page in the manual. By pressing the select key, the page was retrieved and displayed. The page-turning commands Previous Page, Next Page, and First Page could be pointed to with the highlighted cursor; the cursor could not move to the underlined phrases in the text. The underlined phrases were followed by a page number (a reference), in parentheses, indicating where the corresponding information was located (see the figure below). The page numbers always referred to pages later in the manual (forward references). Page numbering was shown in the upper right-hand corner of the screen.


Figure: Experiment 1 - Computer-linear screen and function keys


The pages were organized as though an in-order traversal of the tree version had been performed. The in-order traversal yielded 52 pages (30 unique, 22 duplicates) for the computer-linear version.
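The way a traversal of a structure with shared sections produces duplicate pages can be illustrated with a small sketch. It assumes a depth-first walk that emits a section before the sections it references, which is consistent with the forward-only page references described above but may not be exactly the traversal the author used. The function name linearize and all section names other than "Power Up" are hypothetical, and the structure is assumed to be acyclic.

```python
from typing import Dict, List


def linearize(sections: Dict[str, List[str]], root: str) -> List[str]:
    """Walk the structure depth-first, emitting a section every time it is
    reached, so a section reachable from several places appears several times."""
    pages: List[str] = []

    def visit(name: str) -> None:
        pages.append(name)            # the section becomes the next page
        for child in sections[name]:  # then the sections it refers to, in order
            visit(child)

    visit(root)
    return pages


# Hypothetical four-section manual in which "Power Up" is shared.
sections = {
    "Contents": ["Installation", "Testing"],
    "Installation": ["Power Up"],
    "Testing": ["Power Up"],
    "Power Up": [],
}
pages = linearize(sections, "Contents")
unique = len(set(pages))
duplicates = len(pages) - unique
```

In this small example the walk yields five pages from four unique sections, one of them a duplicate. The same mechanism accounts for the 52 pages obtained from 30 unique pages in the experimental manual.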

The paper-linear manual was a paper version of the computer-linear manual. The contents of the text and graphics had been replicated on paper in the same format as they appeared in the computer screens. The bottom section containing computer-specific commands was omitted in the paper-linear manual.

The paper-tree manual was similar to the paper-linear version except that all duplicate pages were removed. That is, there were 30 pages in this manual. All introductory pages of sections were at the beginning of the manual and all detailed information sections referred to were at the back of the manual. The page numbers that followed the underlined phrases were adjusted correspondingly.

In the paper conditions, the pages were placed in a binder with the graphics displayed on the page facing the text page. The difference between the two paper versions was in the numbering and organization of the pages, which reflected their different structures.

Questionnaire

A questionnaire regarding the experiment was completed by the subjects after they completed the experimental tasks (see Appendix A). The first nine questions were a subjective evaluation of the experiment, and the last question asked how the manual could be improved.

Problem sets

The experimental problems were to use the manuals to determine the correct settings for two sets of eight dip switches. The dip switches were soldered to a prototyping card along with IC chips and resistors to appear realistic. The manuals referred to this card as the "controller card." The twelve problems had different combinations of on and off switch settings to be set according to three parameters:

  1. The serial number on the card.
  2. The number of "cards in the system unit."
  3. The number of "extensions in the system unit."

The serial number was one of two numbers printed on labels attached to the prototyping card. The serial number was different for each of the trials. The other two parameters were given to the subjects on an index card at the beginning of each trial.

Equipment

Subjects who were tested on the computer manuals used an IBM PC/XT microcomputer with two adjacent display screens. An IBM Personal Computer (monochrome) Display was on the right, and an IBM Personal Computer Color Display was on the left. Text was displayed on the monochrome display, and graphics were displayed in a single color (yellow-gold) against a black background on the color display.

All subjects were given a stylus that could be used to set the dip switches.

Procedure

Practice

Initially, the subjects were given a practice problem to familiarize them with the experimental task and the type of manual they would be using. Four short manuals were developed for the practice problem, corresponding to the four experimental conditions. Each manual included general instructions on manual usage, as well as information needed to complete the practice problem. The practice problem consisted of following manual instructions for setting eight dip switches that were soldered to a prototyping card.

Experimental procedure

The subjects were randomly assigned to one of the four experimental conditions. Specific instructions were read to the subject at the beginning of each session. Each subject was informed of the task to be performed through the use of a practice trial. The subjects in the computer conditions were told the functions of the keys they could use to traverse the online manual. The subjects in the paper conditions and the computer-linear condition were told how the page numbers after the underlined words in the text could be used to locate information in the manual.

Once the subjects had completed a practice problem they were given additional instructions describing the experimental task. They were told to set the sixteen switches on the controller card based on information in the manual, on the controller card, and on the index card. For each trial, the subject was given a controller card that had a different randomly generated serial number on it, an index card containing information on the "number of cards in the system unit," and the "number of extensions in the system unit." Each subject received the same randomized set of twelve problems in the same order.

The subjects were informed that the time for each trial was being recorded. The subjects in both computer and paper conditions were timed using the computer's internal clock. The subjects in the computer conditions were timed from the start of the trial until they pressed the "End" key to indicate they were finished. The subjects in the paper conditions were timed by the experimenter by pressing the appropriate keys on the computer. Timing began the moment the subjects indicated they were ready, and ended when they said they were finished. The subjects' identification number, trial (problem) number, and time for each trial were recorded on the computer. The subjects' responses (the switch settings) were recorded by the experimenter. The subjects were not given feedback about their performance during the experiment.

After the subjects finished the twelve trials, they completed the subjective evaluation questionnaire.

Results


Figure: Experiment 1 - Mean total times

The mean total time in seconds; mode by structure by trial, for mode and structure.
           Computer       Paper           Mode         Structure
Trial   Tree  Linear   Tree  Linear  Computer Paper   Tree  Linear

  1    576.6  679.6   611.4  554.4     628.1  582.9  594.0  617.0
  2    203.9  211.1   165.4  160.1     207.5  162.8  184.7  185.6
  3    117.2  123.7   109.1  110.1     120.5  109.6  113.2  116.9
  4    113.2  102.8    73.1   92.9     108.0   83.0   93.2   88.0
  5    103.0   97.9    85.7   81.8     100.0   83.8   94.4   89.9
  6     91.4   88.8    65.0   91.7      90.1   78.4   78.2   90.3
  7     87.8   82.7    55.2   68.5      85.3   61.9   64.2   75.6
  8     73.2   61.2    41.4   44.0      67.2   42.7   57.3   52.6
  9     68.8   56.8    42.2   60.1      62.8   51.2   55.5   58.5
 10     64.5   54.5    41.8   52.6      59.5   47.2   53.2   53.6
 11     61.6   59.0    45.8   60.9      60.3   53.4   53.7   60.0
 12     59.9   58.0    45.7   43.7      59.0   44.7   52.8   50.9
mean   135.1  139.7   115.2  118.4     137.4  116.8  125.2  129.1

ANOVA (Reciprocal of time per trial)
F(1,36)  =10.42, p<.01  for computer vs. paper
F(11,396)=74.60, p<.001 for trials

The mean total time in seconds per trial for each of the experimental conditions is shown in the figure above. A 2 x 2 x 12 analysis of variance was performed on the reciprocals of the total time per subject per trial, with mode, structure, and trials as factors. A significant effect of mode (paper vs. computer) was found (F(1,36)=10.42, p<.01). The mean time in seconds was 116.8 for setting switches for the paper conditions, and 137.4 for the computer conditions. Hence, the subjects in the paper conditions were faster than the subjects in the computer conditions. There were no significant interactions with the structure (linear vs. tree).
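The degrees of freedom in the reported F ratios can be checked against the design with a short sketch. It assumes the standard formulas for a mixed design with between-subject groups and one within-subject factor; the function name mixed_design_df is hypothetical, and the figure of twenty subjects in the computer conditions is an assumption (half of the forty analyzed), not a number stated in the text.

```python
from typing import NamedTuple


class DegreesOfFreedom(NamedTuple):
    between_error: int  # error df for between-subject effects
    trials_effect: int  # df for the within-subject trials factor
    within_error: int   # error df for the trials effect


def mixed_design_df(subjects: int, groups: int, trials: int) -> DegreesOfFreedom:
    """Degrees of freedom for a design with `groups` between-subject cells
    and `trials` repeated measurements per subject."""
    between_error = subjects - groups
    trials_effect = trials - 1
    within_error = between_error * trials_effect
    return DegreesOfFreedom(between_error, trials_effect, within_error)


# Forty subjects in four between-subject cells, twelve trials each.
time_analysis = mixed_design_df(subjects=40, groups=4, trials=12)

# Assumed: twenty subjects in the two computer cells (pages-viewed analysis).
pages_analysis = mixed_design_df(subjects=20, groups=2, trials=12)
```

The first call gives 36, 11, and 396, matching F(1,36) for mode and F(11,396) for trials. The second gives 18, 11, and 198, matching the degrees of freedom reported for the pages-viewed analysis of the computer conditions.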

There was little difference in performance between the linear and tree structures (see the figure above). The mean time in seconds to set the switches across all trials was 129.1 in the linear conditions, and 125.2 in the tree conditions. A 2 x 12 analysis of variance between the linear and tree conditions, by trial, did not yield a significant difference across trials.


Figure: Experiment 1 - Mean total times graph

The mean total time in seconds; by mode by structure by trial.

There was a significant main effect of trials (F(11,396)=74.60, p<.001) (see the figure above). The subjects were generally becoming faster at solving the problems across each successive trial, regardless of the mode or structure.


Figure: Experiment 1 - Mean number of pages viewed

The mean number of pages viewed in the tree and linear structures for the computer manuals.
          Computer
Trial   Tree  Linear

  1     53.1   43.1
  2     23.8   17.3
  3     15.8   12.0
  4     15.9   10.5
  5     16.0    9.0
  6     15.1    9.2
  7     15.0    9.0
  8     13.1    8.5
  9     13.5    7.0
 10     12.8    6.9
 11     13.4    7.5
 12     12.5    7.4
mean    18.3   12.3
ANOVA
F(1,18) = 4.91, p<.04  tree vs. linear
F(11,198)=36.21, p<.001 trials

In the computer conditions, the number of pages viewed was recorded (see the figure above). A 2 x 12 analysis of variance with structure and trials as factors indicated a significant difference between linear and tree structures (F(1,18)=4.91, p<.04). The overall mean number of total pages viewed was 12.3 in the linear condition, and 18.3 in the tree condition. There was also a significant main effect of trials (F(11,198)=36.21, p<.001), indicating that the subjects were viewing fewer pages across each successive trial. The interaction between structure and trials was not significant.


Figure: Experiment 1 - Mean number of errors

The mean number of errors by condition. Errors for each subject were summed across all trials.
   Computer       Paper              Mode         Structure
Tree  Linear   Tree  Linear     Computer Paper   Tree  Linear
 3.8    3.5     6.8    4.0         3.7    5.4     5.3    3.8

For each subject, the total number of errors in switch setting combinations was calculated. A switch combination was defined as a group of switches that could be set from one of the graphic displays of the manual. For each trial, there were four such combinations with the number of switches per combination ranging from 1 to 8 switches. Thus, the total number of switch setting errors possible per subject ranged from 0 to 48. A 2 x 2 analysis of variance was performed on these data, with mode of presentation and structure of information as factors (see the figure above). No significant differences were found in the number of errors among the experimental conditions.

The subjective evaluations provided an indication of preference for the computer manuals over the paper manuals. One question was eliminated because the scale indicating preference, relative to the other questions, was erroneously reversed. Results from the remaining eight questions indicating preference were summed, and a 2 x 2 analysis of variance with mode and structure as factors indicated a significant difference between paper and computer conditions (F(1,36)=5.08, p<.03) (see the figure below).


Figure: Experiment 1 - Questionnaire results

                           Computer       Paper           Mode         Structure
Question                Tree  Linear   Tree  Linear  Computer Paper   Tree  Linear
overall                 16.8   17.6    24.8   20.6     17.2    22.7
3 (good-bad)             2.5    2.1     3.4    2.9      2.3     3.2
  (1=good, 7=bad)
7 (organization)         2.0    2.2     4.4    2.7      2.1     3.6    3.2   2.5
  (1=well organized, 7=poorly organized)

ANOVA
F(1,36)=5.08, p<.03  overall computer vs. paper (for all questions)
F(1,36)=4.94, p<.033 question 3 computer vs. paper
F(1,36)=9.69, p<.004 question 7 computer vs. paper
F(1,36)=4.16, p<.049 question 7 interaction of mode and structure

Significant differences were found for two of the questions in a subsequent analysis of the subjective evaluations. Question three (good - bad manual) showed a significant difference between the computer and paper conditions (F(1,36)=4.94, p<.033). The mean was 2.3 for the computer condition, and 3.2 for the paper condition on a scale of 1 to 7. This indicated a preference for the computer manuals over the paper manuals regardless of the structure. Question seven (well organized - poorly organized) also showed a significant difference between computer and paper (F(1,36)=9.69, p<.004), and a significant interaction between mode and structure (F(1,36)=4.16, p<.05). The computer manuals were rated as better organized than the paper manuals (2.1 versus 3.6). In the computer conditions, the tree structure manual was rated as better organized than the linear structure manual (2.0 versus 2.2), whereas in the paper conditions the tree structured manual was rated worse than the linear structured manual (4.4 versus 2.7).

Discussion

There was a significant difference in time taken to solve the problems between the computer and paper conditions. Based upon a mean of 116.8 seconds for the paper conditions and 137.4 seconds for the computer conditions, the paper conditions took about 15% less time. This result confirms other studies that indicated reading from paper is faster than from a VDT (Wright, & Lickorish, 1983; Gould, & Grischkowsky, 1983; Hansen, Doring, & Whitlock, 1978). The subjects in the computer conditions were also burdened with learning traversal procedures and strategies to use the online manuals. The traversal process for the computer-linear condition (entering page numbers) and the computer-tree condition (moving the touchtext cursor) may have slowed down the subjects.

There was no indication that touchtext helped to reduce the search time. Subjects in the computer conditions performed more slowly than those who had the paper versions of the manual. Also, an analysis of the two computer conditions did not reveal a difference between the linear and tree versions -- the mean search times for both conditions were almost the same. However, the subjects in the computer-tree condition viewed 48.8% more pages than in the computer-linear condition. The tree structure of the manual may have slowed the progress of subjects in the computer-tree condition. In each trial, the subproblems -- setting the four groups of switches -- forced the subject to traverse down a predefined path through the manual. Once a subproblem was completed, the only way to begin the next subproblem was to retrace the path back to the starting point of the subproblem. No aids were provided to allow the subject to skip a section, or turn directly to a specific part of the manual.
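The two percentages quoted in this discussion can be reproduced from the reported means. The sketch below shows the arithmetic only; the function names are hypothetical, and the choice of base for each percentage is inferred from the figures in the text.

```python
def percent_less_time(slower: float, faster: float) -> float:
    """Time saved by the faster condition, as a percentage of the slower time."""
    return 100.0 * (slower - faster) / slower


def percent_more(larger: float, smaller: float) -> float:
    """Excess of the larger value, as a percentage of the smaller value."""
    return 100.0 * (larger - smaller) / smaller


# Mean seconds per trial: 137.4 (computer) and 116.8 (paper).
paper_saving = percent_less_time(137.4, 116.8)

# Mean pages viewed per trial: 18.3 (computer-tree) and 12.3 (computer-linear).
extra_pages = percent_more(18.3, 12.3)
```

Rounded to one decimal place, paper_saving is 15.0 and extra_pages is 48.8. The time comparison uses the computer mean as its base; taken relative to the paper mean instead, the same 20.6 second gap is about 17.6%.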

A traversal strategy was not forced upon the subjects in the computer-linear condition. The subjects learned the layout of the manuals very rapidly (see Experiment 1 - Mean total times graph). As the subjects worked the successive problems, they began to skip directly to the pages of interest rather than following the underlying tree structure of the manual. It had not been anticipated that the subjects would memorize as much of the manual structure as they evidently did. Memorization could account for the significant difference between the number of pages read in the two conditions. It also perhaps explains why subjects in the computer-tree condition did not perform better than the computer-linear condition subjects. The computer-tree subjects may have memorized the layout of the manual, but no mechanisms were available to allow them to use that knowledge to skip directly to the most relevant sections of the manual (nodes of the database).

There are several indications that the subjects may have been memorizing the contents and layout of the manuals. First, in all of the conditions and across all trials, the time to solve problems was steadily decreasing. Also, as measured in the computer conditions, the subjects were viewing fewer pages during successive trials. Memorization would explain both the decreasing time to solve problems and the declining number of pages viewed to solve the problems.

Memorization is also suspected to have played a major role in the paper conditions, thus allowing those subjects to rapidly work each of the problems. Borenstein (1985) found similar behavior in his study of online help information. Those people familiar with a particular structure and content of manuals (e.g., the UNIX man and key commands) were able to locate information and solve problems faster than people without the same knowledge. Further study is required to determine what role memorization and learning play in the use of manuals.

The subjective evaluations may provide some insight into people's expectations of computers. Although the manuals were the same in the computer and paper conditions, the subjects preferred the computer manuals. They felt that the computer manuals were better, on a scale of "good" to "bad," than the paper manuals. They also believed that computer manuals were better organized than the paper manuals. People may have lower expectations for online information than for printed material. When they are presented with the same material both online and in printed form, they rate the online version higher because their expectations are exceeded.

Another possible explanation for the preference for the computer manuals is that in the paper conditions all of the manual pages were in front of the subjects so they could get a better overall view of the organization of the manual as a whole. The organization may not have been very appealing, particularly in the paper-tree condition where page references required both forward and backward page turning in the manual.

The physical presence of the paper manuals may have also permitted different search and learning strategies that were not possible with the computer manuals. In the computer conditions, the subjects may not have been able to get the same overall perspective on the organization of the manual because they could only see one page of text and graphics at a time. This would have required them to rely more heavily upon memory to understand the structure and organization of the manuals. The differences in search strategies may have deprived the subjects in the computer conditions of the opportunity to learn as much about the structure and organization of the manual. The level of knowledge of organization and layout of the manual as well as access strategies may have influenced their opinions.

The interaction between mode and structure may also be due to knowledge of the structure of the manual and the access strategies available. The computer-tree manual subjects felt the manual was better organized than the computer-linear manual subjects, while the opposite was true in the paper conditions. In fact, in the paper-tree condition the manual was rated much poorer, indicating that the subjects did not feel the organization and layout was as good as in the paper-linear condition. In the paper-tree condition, the references (page number in parentheses) were to places both forward and backward from the current page. This organization required the reader in some situations to turn forward in the manual to locate information, and at other times to turn back to earlier sections of the manual. In the paper-linear manual, all of the references were to pages later in the manual. The subjects became aware of the direction in which they had to turn the pages to locate information. Perhaps in the paper-tree condition the subjects did not like to search for pages by turning in both directions. In the computer-tree condition, the subjects were not necessarily aware of the structure of the manual because it was hidden from view, but in the computer-linear condition they were aware of the structure because they used page-turning commands.

Two common themes were found in the subjective evaluations. The most common criticism of the manual was that too many pages had to be read in order to solve the problems. The other criticism was that the subjects wanted to get to the "right" information more easily. These criticisms are closely related. In the Narrative style, the reader is required to follow several references (e.g., page numbers) to locate directions for setting the switches. The subjects most likely would prefer a manual resembling an Imperative style. The Imperative style takes the reader, in a minimum number of steps, directly to the information needed to solve the problems. The second experiment was designed to investigate manuals written in the Imperative style.

Experiment 2: Online Imperative style manuals

Overview

The Imperative style of writing manuals was chosen for examination in the second experiment. This style of manual has the following characteristics:

The most appropriate structure for organizing a manual based upon the above description is a tree. However, a tree becomes very bushy quickly. Many times procedures are identical, yet they lie on different paths in the tree. To save space, the manual writer consolidates these identical procedures into a single location in the manual. Once they are consolidated, the manual is no longer organized as a tree. The information about the original decision that created the separate paths through the tree is lost. If the lost information is needed later, the question must be asked again.

When working with paper manuals the decision information is lost when the procedures are consolidated. Information need not be lost when the manual is stored in a computer. The online manual software can record the decision and retrieve the information at a later point in the search through the manual.

This experiment compares two ways of designing online imperative style manuals. The first manual, called unpruned, is a version where decision information is lost when procedures are consolidated (merged) into a single page. The second manual, called pruned, records the decisions. The information is used at several points in the manual to automatically prune the paths, offering readers a substantially more compact and relevant presentation.
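
The difference between the two designs can be illustrated with a minimal sketch. It assumes a hypothetical manual in which each page needs the value of one attribute before it can present its instructions. The attribute names, the `STEPS` sequence, and the `traverse` function are illustrative assumptions and are not taken from the software used in the experiment.

```python
from typing import Callable, Dict, List

# Hypothetical sequence of pages; each entry names the attribute that the
# page needs before it can present its instructions (illustrative only).
STEPS: List[str] = ["display", "memory", "display", "drives", "memory"]


def traverse(steps: List[str], ask: Callable[[str], str], record: bool) -> int:
    """Walk the pages in order and return the number of questions asked.

    record=True models the pruned manual: an answer is stored the first
    time it is given and reused afterwards.
    record=False models the unpruned manual: the question is asked again
    every time the attribute is needed.
    """
    answers: Dict[str, str] = {}
    questions = 0
    for attribute in steps:
        if record and attribute in answers:
            continue  # decision already recorded; no menu is shown
        answers[attribute] = ask(attribute)
        questions += 1
    return questions


def fixed_reader(attribute: str) -> str:
    """Stand-in for the reader's menu selection."""
    return "option 1"


pruned_questions = traverse(STEPS, fixed_reader, record=True)
unpruned_questions = traverse(STEPS, fixed_reader, record=False)
print(pruned_questions, unpruned_questions)
```

In this sketch the recording reader answers each distinct attribute once, while the non-recording reader answers every time an attribute recurs.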

Hypotheses

The experiment attempted to answer the following questions:
  1. Which manual, pruned or unpruned, allows the subject to view fewer pages?
  2. Will either manual reduce the number of pages the subject rereads while solving the problems?
  3. Which of the two manuals allows the subject to solve problems faster, and by how much?
  4. When trying to solve a problem, which manual minimizes the number of errors?
  5. Do the subjects have a preference for either manual?
  6. Is one manual perceived to be easier to learn? Easier to use?
  7. Is it easier to avoid getting lost in one of the manuals?
  8. Do the subjects feel that one of the manuals is easier to read than the other?

The pruned version of the imperative style manual was expected to require fewer page views than the unpruned version of the same manual. The subject must view at least eleven pages in the pruned manuals to solve a problem. In the unpruned manuals, the minimum number of pages to solve a problem was sixteen.

The mean time per page in the pruned condition was expected to be less than in the unpruned condition. Questions in the unpruned manual were more complex than in the comparable pruned manuals -- decision information is lost and must be obtained from the user each time it is needed. Since the questions were less complex in the pruned conditions, less text was presented on the pages where questions were asked than in the unpruned conditions.

The number of pages that would be reread was predicted to be higher in the unpruned condition because the subjects would be less confident of the decisions they made. The questions to be answered were more complex in the unpruned manual, so the subjects would turn back more frequently to verify their decisions.

The number of errors was hypothesized to be higher in the unpruned condition. The complexity of the unpruned manual was expected to result in more errors being made at the decision points, leading to the selection of the wrong path through the manual. Following the incorrect path would result in incorrectly setting the switches.

The subjective evaluation of the two manual styles was expected to be in favor of the pruned manuals. The relative simplicity of questions in the pruned manual, compared to the unpruned manual, was expected to be the major influence. Thus, it was predicted that the subjects would perceive the pruned manual to be easier to learn and use, easier to read, and less likely to cause the subjects to get lost while using it.

Method

Experimental design

The experiment was a 2 x 2 x 2 x 4 design with technique (pruned vs. unpruned), text (computer manual vs. airplane manual), order of presentation (pruned first vs. unpruned first), and trials (1-4) as factors. Order of presentation was a between-subjects factor. The technique, text, and trials were all within-subjects factors. Subjects were given a total of eight problems, four using each technique (pruned and unpruned). The order of presentation, pruned condition first or unpruned condition first, was counterbalanced between subjects.


Figure: Experiment 2 - Design

The four experimental conditions were order of text presentation by sequence of technique. The four conditions were counterbalanced by order.


                         /------------.-----------\
                        /            /           /|
               Second  /            /           / |
                      /            /           /  |
      Order          /------------+-----------/   |
                    /            /           /|   /
           First   /            /           / |  /|
                  /            /           /  | / |
                 /------------.------------\  |/  |
                 |            |            |  /   |
        Computer |            |            | /|  /
                 |            |            |/ | / 
Text             |------------+------------|  |/  
                 |            |            |  /  
        Airplane |            |            | /
                 |            |            |/
                 \------------.------------/
                     Pruned     Unpruned

                        Technique

There were four experimental conditions counterbalancing the order of technique and text presentation (see Experiment 2 - Design). Two sets of text were written for both techniques, for a total of four texts. Subjects were randomly assigned to one of the four cells corresponding to the order in which the technique was presented and the order in which the texts were viewed.
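
As a rough illustration of the counterbalancing, the four cells and the resulting sequence of eight trials can be enumerated as follows. This is a sketch, not the procedure used to assign subjects; the names `CELLS`, `other`, and `schedule` are hypothetical. It assumes, as in the procedure, that the second half of a session uses the technique and text not used in the first half.

```python
from itertools import product
from typing import List, Tuple

TECHNIQUES = ("pruned", "unpruned")
TEXTS = ("computer", "airplane")

# The four cells: which technique and which text a subject sees first.
CELLS = list(product(TECHNIQUES, TEXTS))


def other(value: str, options: Tuple[str, str]) -> str:
    """Return the member of a two-element tuple that is not `value`."""
    return options[1] if value == options[0] else options[0]


def schedule(first_technique: str, first_text: str,
             trials_per_half: int = 4) -> List[Tuple[str, str, int]]:
    """List (technique, text, trial) for all trials of one subject."""
    second_half = (other(first_technique, TECHNIQUES), other(first_text, TEXTS))
    halves = [(first_technique, first_text), second_half]
    return [(technique, text, trial)
            for technique, text in halves
            for trial in range(1, trials_per_half + 1)]


for cell in CELLS:
    print(cell, "->", schedule(*cell)[4][:2])
```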

The dependent measures were the time to solve each of the problems in the eight trials, the number of pages viewed during each trial, and the total number of switch setting errors summed across trials by technique. A questionnaire completed by each subject after the experiment provided subjective measures comparing the two techniques.

Subjects

Twenty-three undergraduate students recruited from the University of Maryland participated in the second experiment. Data from twenty subjects, 9 male and 11 female, were used in the analysis. The remaining three subjects were not included because they either did not complete the eight trials or did not follow the directions. Each of the subjects was either paid $5 for participation in the experiment, or received credit for a psychology course.

Materials

Manuals


Figure: Experiment 2 - Function keys


The online manuals for this experiment had the same user interface for text presentation as the computer-tree manual in the first experiment. However, graphic information was not shown on the color display unit in this experiment. Based upon feedback and observation in a pilot study, the function keys for page turning (e.g., "Move Bar" key, etc.) were moved to the right-hand side of the keyboard (see Experiment 2 - Function keys). The relocated function keys were easier for right-handed subjects to reach because of the physical location of the keyboard, computer, and display units. The new location of the function keys was not found to be a problem for the one left-handed subject.

Two manual texts were written, one for "computer installation," and one for "aircraft flight control card installation." Both of these texts, computer and airplane, were written for both techniques. An attempt was made to keep the information content and wording as consistent as possible in all of the manuals.

The computer and airplane manuals were written to be structurally identical. The two computer manuals were written first. These were then duplicated, and the information for the airplane manuals was substituted to create the second set of texts. The same terminology and phrasing was used in the manuals whenever possible.

Questionnaire

A questionnaire regarding the experiment was completed by the subjects after they finished the experiment (see Appendix B). One question asked the subjects to indicate a preference for one of the two manuals. The first two pages contained seven pairs of questions for comparing the two techniques. The third page of the questionnaire asked for written comments about the manuals and the experiment.

Problem sets

The experimental problems were to use the manuals to determine the correct settings for two sets of eight dip switches. The dip switches were soldered to a prototyping card along with IC chips and resistors to appear realistic. The computer manuals referred to this card as the "system board," while the airplane manuals referred to the card as the "controller card." The eight problems had different combinations of on and off switch settings based upon five parameters. In the computer manual, the parameters were:

  1. Serial number of the computer.
  2. Memory size.
  3. Display type.
  4. Number of floppy disk drives.
  5. Number of fixed disk drives.

For the airplane manual, the parameters were:

  1. Aircraft type.
  2. Number of fuel tanks.
  3. Number of engines.
  4. Number of automatic flight controls.
  5. Number of feedback controls.

An index card with information on the five parameters was given to the subject at the beginning of each trial.

Eight problems were randomly generated for the computer and airplane manuals. The corresponding problems for each of the manuals were designed to require the subject to search either manual in the same sequence. The parameters supplied on the index card would require the subject to make the same menu selections, regardless of the manual text (computer or airplane). This was possible since the two manuals were structurally identical.
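
One way to picture how a single problem can serve both manuals is to store it as a tuple of menu choices, one per parameter, and to label it with either manual's parameter names. The sketch below is illustrative only: the number of options per parameter (three), the seed, and the function names are assumptions, and the original problem generator is not described at this level of detail.

```python
import random
from typing import Dict, List, Tuple

# Parameter names as listed for the two manuals.
COMPUTER_PARAMETERS = [
    "Serial number of the computer", "Memory size", "Display type",
    "Number of floppy disk drives", "Number of fixed disk drives",
]
AIRPLANE_PARAMETERS = [
    "Aircraft type", "Number of fuel tanks", "Number of engines",
    "Number of automatic flight controls", "Number of feedback controls",
]


def generate_problems(count: int, options_per_parameter: int,
                      seed: int) -> List[Tuple[int, ...]]:
    """Draw `count` problems; each is one menu choice per parameter."""
    rng = random.Random(seed)  # seeded so the same set can be reproduced
    return [tuple(rng.randrange(options_per_parameter)
                  for _ in COMPUTER_PARAMETERS)
            for _ in range(count)]


def index_card(problem: Tuple[int, ...], names: List[str]) -> Dict[str, int]:
    """Attach one manual's parameter names to a problem's menu choices."""
    return dict(zip(names, problem))


problems = generate_problems(count=8, options_per_parameter=3, seed=1)
computer_card = index_card(problems[0], COMPUTER_PARAMETERS)
airplane_card = index_card(problems[0], AIRPLANE_PARAMETERS)
```

Because both cards are built from the same tuple, the sequence of menu selections is identical regardless of which text is used.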

Equipment

Subjects were tested on an IBM PC/XT microcomputer with an IBM Personal Computer (monochrome) Display. All subjects were given a stylus that could be used to set the dip switches.

Procedure

Practice

Before each half of the experiment, the subjects were given two practice problems similar to those in the experiment. The practice problems were to familiarize them with using the manual. Two short manuals were developed for the practice problems, corresponding to the pruned and unpruned conditions. Each manual included general instructions on manual usage, as well as information needed to complete the practice problems. The practice problems consisted of following manual instructions to set eight dip switches soldered to a prototyping card. After each practice problem was completed, the subject was informed of any mistakes in setting the switches and the location in the manual where the mistake was made.

Experimental procedure

The subjects were randomly assigned to one of the four experimental conditions (text x order). Specific instructions were read to the subject at the beginning of each session. Each subject was familiarized with the task to be performed through the use of two practice trials. The subjects were informed of the functions of the keys they could use to traverse the online manual.

Once the subjects completed the practice problems, they were given additional instructions describing the experimental task. They were told to set the sixteen switches on the prototyping card based upon information in the manual and on the index card containing the five parameters. Each subject received the same randomized set of eight problems in the same order.

The subjects were requested to work as quickly and accurately as possible. The subjects were timed, using the computer's internal clock, from the start of the trial until they pressed the "End" key to indicate they were finished. The subjects' identification number, trial (problem) number, and time for each trial were recorded by the computer. The subjects' responses (the switch settings) were recorded by the experimenter. The subjects were informed of errors in setting the switches and the location in the manual where the error was made after each trial.

After the first four trials were completed, the subject was given practice trials using a manual with the technique and text not used in the first half of the experiment. Thus, a subject who had the pruned-computer condition first would have the unpruned-airplane condition second. The procedure for the last four trials was the same as for the first four.

After the subjects completed the eight trials, they filled out the questionnaire.

Results


Figure: Experiment 2 - Mean total time

Mean total time in seconds, by trial, for each technique.
             Pruned First      Unpruned First
Trial     Computer Airplane   Computer Airplane

Pruned
Manuals

  1         167.6   241.8       114.2   125.2
  2         136.4   164.2       111.0    92.2
  3          90.2   102.0        80.6    65.4
  4         107.0   116.0        85.8    69.2
mean        125.3   156.0        97.9    88.0

Unpruned
Manuals

  1         252.0   386.0       318.6   265.0
  2         212.4   316.8       272.0   214.8
  3         156.6   220.8       217.6   183.8
  4         185.4   275.2       190.2   163.8
mean        201.6   299.7       249.6   206.9

mean by     163.5   227.9       173.8   147.4
condition

pruned mean   = 116.8
unpruned mean = 239.4

ANOVA
F(3,48)= 74.18, p<.001 for trials
F(3,48)=  4.62, p<.006 for trials x order
F(1,16)=  5.49, p<.032 for order x text
F(1,16)=178.31, p<.001 for technique
F(1,16)=  7.45, p<.015 for technique x order x text
F(3,48)=  3.89, p<.015 for technique x trial
F(3,48)=  3.87, p<.015 for technique x trial x order


Figure: Experiment 2 - Mean total time by technique

Mean total time in seconds; technique by trial by order of presentation.

         Pruned First          Unpruned First
Trial    Pruned Unpruned       Pruned Unpruned

  1      204.7  319.0          119.7  291.8
  2      150.3  264.6          101.6  243.4
  3       96.1  188.7           73.0  200.7
  4      111.5  230.3           77.5  177.0


Figure: Experiment 2 - Mean total time graphs


A 2 x 2 x 2 x 4 analysis of variance using technique, text, order, and trials as factors revealed a significant difference in the total time to solve the problems between the pruned and unpruned conditions (F(1,16)=178.31, p<.001) (see Experiment 2 - Mean total time). The mean time across trials was 116.8 seconds in the pruned condition, and 239.4 seconds in the unpruned condition. Thus, the pruned condition required only 48.8% as much time as the unpruned condition. A significant difference was found for trials, indicating that the subjects were getting faster as they worked successive trials (F(3,48)=74.18, p<.001). Also, significant differences were found in interactions for technique x order x text, trial x order, technique x trial, and technique x trial x order (see Experiment 2 - Mean total time graphs). The differences due to both text and order are attributed to the unpruned-computer manual for subjects who had the pruned condition first. The unpruned-computer manual repeatedly asked the subjects for the numeric range of the "serial number" parameter. In contrast, the unpruned-airplane manual did not require determining numeric ranges for the corresponding parameter. More time was required in the unpruned-computer trials because of the need to determine the numeric range several times to solve a problem. In general, the mean times in the unpruned-computer trials were all greater than for the unpruned-airplane trials, most likely for the same reason. No significant main effects were found for text or order alone.
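
The overall means and the percentage quoted above can be checked from the four condition means reported for each technique in the mean total time figure. The following is a small arithmetic sketch; the variable names are illustrative.

```python
# Condition means in seconds, copied from the "mean" rows of the figure
# (pruned-first computer, pruned-first airplane,
#  unpruned-first computer, unpruned-first airplane).
pruned_condition_means = [125.3, 156.0, 97.9, 88.0]
unpruned_condition_means = [201.6, 299.7, 249.6, 206.9]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


pruned_mean = mean(pruned_condition_means)
unpruned_mean = mean(unpruned_condition_means)

# Time in the pruned condition as a percentage of the unpruned time.
percent_of_unpruned = 100 * pruned_mean / unpruned_mean

# The same comparison expressed as a speed-up factor.
speedup = unpruned_mean / pruned_mean

print(round(pruned_mean, 1), round(percent_of_unpruned, 1), round(speedup, 2))
```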


Figure: Experiment 2 - Mean number of pages viewed

The mean number of pages viewed in each of the conditions; technique by text by trial.
             Pruned First      Unpruned First
Trial      Computer Airplane   Computer Airplane
Pruned
Manuals

  1           11.6    14.0       11.8    12.2
  2           12.6    14.0       11.2    11.0
  3           12.0    11.2       11.2    11.0
  4           12.2    11.4       11.2    11.8
mean          12.1    12.7       11.4    11.5


Unpruned
Manuals

  1           16.4    21.0       16.0    16.0
  2           17.2    19.0       16.8    16.4
  3           16.0    19.6       16.8    18.0
  4           17.2    17.2       16.4    16.4
mean          16.7    19.2       16.5    16.7

Pruned first mean   = 15.2
Unpruned first mean = 14.0
Pruned mean   = 11.9
Unpruned mean = 17.3

ANOVA
F(1,16)=  4.80, p<.04  for order of presentation
F(1,16)=189.64, p<.001 for technique (pruned vs. unpruned)


Figure: Experiment 2 - Mean number of pages viewed graph


A 2 x 2 x 2 x 4 analysis of variance using technique, text, order, and trials as factors was performed on the number of pages viewed. A significant difference was found between techniques for the number of pages viewed (F(1,16)=189.64, p<.001) (see Experiment 2 - Mean number of pages viewed). Fewer pages were viewed in the pruned condition. This predicted result is not surprising because the minimum number of pages to be read for solving a problem in the pruned condition was 11 pages, but 16 pages in the unpruned condition. In addition, an order effect was found between the pruned first and unpruned first condition (F(1,16)=4.80, p<.044) (see Experiment 2 - Mean number of pages viewed graph). Those subjects who received a pruned manual first viewed more pages across all trials than those who received an unpruned manual first. This order effect is most likely due to the difficulty the subjects had with the numeric ranges in the unpruned-computer manual, as was previously described.


Figure: Experiment 2 - Mean time per page

The mean time per page in seconds; order by text by trial.
             Pruned First      Unpruned First
Trial     Computer Airplane   Computer Airplane

Pruned
Manuals

  1          14.4    17.6         9.7    10.1
  2          11.1    12.4         9.9     8.4
  3           7.5     9.1         7.2     5.9
  4           8.7    10.1         7.7     6.0
mean         10.4    12.3         8.6     7.6

Unpruned
Manuals

  1          15.4    19.0        19.9    16.6
  2          12.6    16.5        16.3    13.1
  3           9.8    10.9        13.0    10.5
  4          10.8    15.9        11.7    10.0
mean         12.1    15.6        15.2    12.5

mean by      11.3    13.9        11.9    10.1
condition

pruned mean   =  9.7
unpruned mean = 13.9

ANOVA
F(1,16)=159.59, p<.001 for technique
F(1,16)= 25.02, p<.001 for technique by order
F(1,16)=  5.99, p<.026 for technique by order by text
F(3,48)=106.29, p<.001 for trials
F(3,48)=  4.58, p<.007 for trials by order
F(3,48)=  7.76, p<.001 for technique by trials by order


Figure: Experiment 2 - Mean Time Per Page Graphs


A significant difference was found between techniques for the mean time per page (F(1,16)=159.59, p<.001) (see Experiment 2 - Mean time per page). The questions in the unpruned manuals were more complex, requiring more reading and thought. As a result, each page in the unpruned manuals took 43.3% more time to read than in the pruned manuals (13.9 vs. 9.7 seconds); equivalently, pages in the pruned manuals took 30.2% less time. A significant difference was found across trials, indicating that subjects improved their performance during successive trials (F(3,48)=106.29, p<.001). Other significant results were technique x order, technique x order x text, trials x order, and technique x trial x order (see Experiment 2 - Mean Time Per Page Graphs). These results were attributed to the difficulty the subjects had with the numeric ranges in the unpruned-computer manual.
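
The size of the difference in time per page depends on which condition is taken as the baseline. A short sketch using the two overall means from the mean time per page figure (variable names are illustrative):

```python
# Overall mean time per page in seconds, from the figure.
pruned_seconds_per_page = 9.7
unpruned_seconds_per_page = 13.9

difference = unpruned_seconds_per_page - pruned_seconds_per_page

# Relative to the unpruned manuals: how much less time a pruned page took.
percent_reduction = 100 * difference / unpruned_seconds_per_page

# Relative to the pruned manuals: how much more time an unpruned page took.
percent_increase = 100 * difference / pruned_seconds_per_page

print(round(percent_reduction, 1), round(percent_increase, 1))
```

The same 4.2-second difference therefore reads as a reduction of about 30% or an increase of about 43%, depending on the baseline.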


Figure: Experiment 2 - Mean pages viewed exceeding the minimum

The mean number of pages viewed exceeding the minimum required to complete a problem.
             Pruned First      Unpruned First
Trial     Computer Airplane   Computer Airplane

Pruned
Manuals

  1           0.2     3.0         0.8     1.2
  2           1.6     3.0         0.2     0.0
  3           1.0     0.2         0.2     0.0
  4           1.2     0.4         0.2     0.8
mean          1.0     1.7         0.4     0.5

Unpruned
Manuals

  1           0.4     5.0         0.0     0.0
  2           1.2     3.0         0.8     0.4
  3           0.0     3.6         0.8     2.0
  4           1.2     1.2         0.4     0.4
mean          0.7     3.2         0.5     0.7

mean by       0.9     2.4         0.4     0.6
condition

ANOVA
F(1,16)=4.54, p<.049 for order of presentation

A measure of uncertainty and confusion is the number of pages reread, referred to as the number of pages exceeding the minimum. As mentioned before, the minimum number of pages was 11 in the pruned condition and 16 in the unpruned condition. The number of pages exceeding the minimum was calculated for each subject by trial, and a 2 x 2 x 2 x 4 analysis of variance on all factors was performed. An effect was found for order of presentation (F(1,16)=4.54, p<.05) (see Experiment 2 - Mean pages viewed exceeding the minimum). Those subjects who had a pruned manual first had more difficulty with the unpruned manual. In particular, those people who had the unpruned-computer manual second performed worse than those subjects who had the unpruned-computer manual first. No main effects were found for technique or trials.
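
The measure can be sketched as a simple subtraction of the technique's minimum from the number of pages viewed. The example below assumes that plain subtraction and applies it to cell means rather than to individual subjects; the function name is hypothetical. The input values are the trial 1 means for the unpruned manuals from the pages-viewed figure.

```python
# Minimum number of pages needed to solve a problem, by technique.
MINIMUM_PAGES = {"pruned": 11, "unpruned": 16}


def pages_over_minimum(pages_viewed: float, technique: str) -> float:
    """Pages viewed beyond the minimum, rounded to one decimal place."""
    return round(pages_viewed - MINIMUM_PAGES[technique], 1)


# Trial 1, unpruned manuals: pruned-first computer, pruned-first airplane,
# unpruned-first computer, unpruned-first airplane.
unpruned_trial_1 = [16.4, 21.0, 16.0, 16.0]
excess = [pages_over_minimum(pages, "unpruned") for pages in unpruned_trial_1]
print(excess)
```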

The questionnaire completed by all of the subjects indicated a preference for the pruned manuals. All but three of the subjects (85%) preferred the pruned manual. One of the subjects did not indicate a preference. The other two subjects, who preferred the unpruned manual, described it as more challenging than the pruned manual. It gave them greater satisfaction since it required them to think about the menu choices. In the pruned manuals, repetitive but simple question answering was found to be tedious, although the pruned manuals were rated significantly easier to use.


Figure: Experiment 2 - Questionnaire results

Mean response across all subjects by question, on a scale of 1-11.
                           Technique
Question          Pruned  Unpruned  DF  Error    F      P
Ease of learning   9.95     8.10     1    16   42.12  .001
  (Hard - Easy)

Ease of use        9.85     7.60     1    16   34.32  .001
  (Hard - Easy)

Got lost          10.55     8.80     1    16   21.49  .001
  (Frequently -
  Infrequently)

Find right info.  10.05     8.00     1    16   28.74  .001
  (Hard - Easy)

Understand info.  10.20     8.20     1    16   33.68  .001
  (Hard - Easy)

A 2 x 2 x 2 x 7 analysis of variance using technique, text, order and questions as factors was performed on the remaining questions comparing the two manuals. A significant main effect for technique was found in five of the questions (see Experiment 2 - Questionnaire results). Subjects believed the pruned manuals were easier to learn, use, find information on the screen, and understand information on the screen. They also believed that they got lost less frequently in the pruned manual than in the unpruned manual. No significant results were found for the level of detail or organization of information questions.

After performing a 2 x 2 analysis of variance using technique and order as factors, no significant results were found for the number of errors in setting the switches.

Discussion


Figure: Experiment 2 - Percentage difference between techniques

Average time per problem by trial and technique, and the percentage of time to complete the pruned problems as compared to the unpruned problems (pruned/unpruned * 100).
           Technique
Trial  Pruned  Unpruned     %

  1     162.2   305.4     53.1
  2     126.0   254.0     49.6
  3      84.6   194.7     43.5
  4      94.5   203.7     46.4
mean    116.8   239.4     48.8


As expected, the subjects performed better with the pruned manuals than they did with the unpruned manuals. An interesting result is that with the pruned manuals they completed the problems twice as fast, both overall and by trial (see Experiment 2 - Percentage difference between techniques). On average, the time spent viewing each page was also reduced by 30.2%, reflecting the reduction in time to make the correct menu selection.

The interaction of technique with order in the total time to solve the problems indicated the difficulty the subjects had with determining numeric ranges to answer the questions. Another indication of the same problem is that the subjects needed to view more pages exceeding the minimum needed to solve a problem than for any of the other manuals. In the computer manuals, the numeric ranges required the subject to use reasoning to make the menu selection, whereas the airplane manual only required simple pattern matching. During the experiment two subjects, one in the unpruned and one in the pruned conditions, asked about the numeric ranges because they were confused.

Studies have shown that people have difficulty in translating natural logic into formal logic (Braine, 1978; Thomas, 1976). To make menu selections, the subjects had to determine the question being asked, and then translate it into formal logic. For the pruned manuals, the questions and their logic representations were simple. But the menus in the unpruned manuals required the subjects to translate complex English sentences into logic statements, and then make a correct menu selection. The difficulty of translating the questions into logic was mentioned by one of the subjects after the experiment. Determining the question and its translation into logic took longer in the unpruned conditions.

Complex questions can result in errors for several reasons. The natural logic to formal logic translation is difficult, and the reader may misinterpret the intended meaning of the question. The other problem with complex questions is that the writer may incorrectly formulate the question. Care was taken when designing the questions for the experiment, in an effort to reduce the number of errors committed due to misinterpretation. Also, the subjects used the same manual for four trials and were told where in the manual they made mistakes after each trial. The design of the manual and the feedback on the errors committed contributed to the lack of a significant difference between techniques for the number of switch setting errors made.

The subjects did prefer the pruned manuals over the unpruned manuals. A common theme in the written responses was that the unpruned manuals were "cluttered," or had too much information on a page. They also indicated that the pruned manuals had too little information on a page. Some people complained that there were too many pages in the pruned manual for the actual work performed (setting the switches).

An interesting questionnaire result was that the subjects felt that they got lost more frequently in the unpruned manuals. Their impressions were supported by both the time to solve the problems, particularly in the unpruned-computer manual, and the number of pages viewed that exceeded the minimum. The pruned technique was useful in minimizing their perception of getting lost and in improving their overall performance.

One common response about the experiment was that the subjects would have liked immediate feedback as to whether or not they had set the switches correctly. One person indicated that a picture of the correct switch settings would have helped.

Conclusions

Summary

The experiments provide insight into the ways people search for information in documents, both on paper and online. The first experiment confirmed that people perform tasks faster when reading from paper than when working from computer screens. People learned more of the structure and organization of the manuals than had been anticipated. Their knowledge of the structure helped them to access the relevant information rapidly to solve the problems.

The tree structure used online may have been too restrictive a structure for information searching. Novel data structuring, such as graphs containing lattices, is a promising method of organizing the information. The lattice approach allows a linear traversal of the database for structured tasks. Access to the correct information is gained through simple question answering, as was done in the second experiment. The dramatic performance improvement obtained with the pruning technique demonstrates the potential of this approach.

The pruning technique is useful for simplifying the user interface. As the unpruned condition showed, complex interactions between different input parameters may make menu selection decisions unnecessarily difficult. The complex decisions become an obstacle in the user's path. The simpler interface of the pruned technique is also perceived to be easier to use, learn, and understand than the interface without pruning.

An interesting result of the first experiment was that people rated the computer manuals as better than the paper manuals even though the content was identical. The preference may be due to people having different expectations of material printed on paper versus material presented on a computer screen. Computers may be perceived as difficult to use and understand. If the material presented is of acceptable quality, thus surprising the reader, then it will be rated higher. The interaction between structure and mode of presentation and the preference for computer manuals over the paper manuals may suggest that people have a different understanding of the organization and structure of the manuals. They may also have different preferences for access techniques depending upon the mode of presentation. Techniques that are satisfactory when using computers may not be as acceptable when working with printed material. Further research is needed to understand why these differences were found between the presentations on paper and computer.

The types of aids needed when searching through databases require further research. The results of the computer-tree condition indicated that alternative traversal strategies are probably required to allow people to directly access the parts of the manual that interest them. A table of contents or index is a likely candidate. An ability to return directly to an arbitrary section previously visited may save time when the traversal path is long. Orientation aids are needed to prevent people from getting lost in the manual. A simple approach would be to list all of the sections in the current path. The effectiveness of each of these aids must be assessed through experimentation.
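The two orientation aids suggested above, listing the sections in the current path and returning directly to a previously visited section, can be illustrated with a stack of section titles. The following is only a minimal sketch under stated assumptions: the class name `Trail`, its methods, and the section titles are hypothetical and are not taken from the software used in the experiments.

```python
class Trail:
    """Stack of section titles on the reader's current path (hypothetical)."""

    def __init__(self, root):
        self.path = [root]

    def visit(self, title):
        """Descend into a new section by pushing it onto the path."""
        self.path.append(title)

    def back_to(self, title):
        """Return directly to a section on the path, discarding deeper ones."""
        index = self.path.index(title)  # raises ValueError if not on the path
        del self.path[index + 1:]
        return self.path[-1]

    def breadcrumb(self):
        """List every section in the current path as an orientation aid."""
        return " > ".join(self.path)


trail = Trail("Contents")
trail.visit("Power supply")
trail.visit("Switch settings")
print(trail.breadcrumb())  # Contents > Power supply > Switch settings
trail.back_to("Contents")  # one step instead of two single-level returns
```

Whether a displayed path of this kind actually reduces the feeling of being lost would still have to be established experimentally.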

Future possibilities

The use of a graph structure with lattices allows large databases to be created in an incremental fashion. While the databases created for the present experiments were for maintenance manuals, the same techniques can be applied to online help and documentation for application programs and systems. Using a technique similar to ZOG, which acts as a front-end to application software, information from the application program can be used to provide the current state of the system. The relevant information is then retrieved from a documentation database. The pruning techniques can be used to retrieve and display only the information pertinent to the user's current state. The documentation is customized as it is retrieved and displayed to the user, and is thus tailored to the user's specific needs. The documentation guides the user through the application in a stepwise manner. This style of documentation and tutorial writing is known as the Minimalist Philosophy (Carroll, 1984).
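The idea of pruning documentation against the application's current state can be sketched as a simple filter. This is an illustration under stated assumptions, not the mechanism used in the experiments: the function `prune`, the `requires` field, the section titles, and the state attributes are all hypothetical.

```python
def prune(sections, state):
    """Keep only sections whose required attributes all match the state."""
    return [
        section["title"]
        for section in sections
        if all(state.get(key) == value
               for key, value in section["requires"].items())
    ]


# Hypothetical documentation database: each section names the attributes
# of the application state under which it is relevant.
SECTIONS = [
    {"title": "Overview", "requires": {}},
    {"title": "Saving a file", "requires": {"mode": "edit"}},
    {"title": "Printing", "requires": {"mode": "print"}},
    {"title": "Saving to a remote disk",
     "requires": {"mode": "edit", "disk": "remote"}},
]

# State as it might be reported by the application front-end (assumed).
state = {"mode": "edit", "disk": "local"}
visible = prune(SECTIONS, state)
print(visible)  # ['Overview', 'Saving a file']
```

Because the state is supplied by the application rather than entered by the reader, no menu selections are needed to reach the relevant sections in this sketch.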

The database could easily be designed to provide multiple levels of documentation. The information for first-time users and experts is available in a single database. An input to the pruning process could indicate the level of expertise of the user for the specific application. Should the user need more detailed information on a topic, the details could be obtained by changing the degree of pruning of text, or the user can make a menu selection to search another section of the database. Also, with the use of heuristics, the level of expertise could be automatically modified as the user becomes more familiar with the system. With the use of the pruning techniques, the Minimalist Philosophy can be realized.
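A single database serving several levels of expertise might be sketched as follows. Everything here is an assumption made for illustration: the audience tags, the `Reader` class, the promotion threshold of five completed tasks, and the sample text are hypothetical, and the heuristic shown is only one of many that could be tried.

```python
from dataclasses import dataclass


@dataclass
class Reader:
    """Tracks a user's experience with one application (hypothetical)."""
    completed_tasks: int = 0

    @property
    def level(self):
        # Assumed heuristic: promote the reader after five completed tasks.
        return "expert" if self.completed_tasks >= 5 else "novice"


def prune_for(paragraphs, level, show_all=False):
    """Return the text tagged for this level, or everything on request."""
    return [text for text, audiences in paragraphs
            if show_all or level in audiences]


# Hypothetical paragraphs, each tagged with its intended audiences.
PARAGRAPHS = [
    ("Press F3 to save.", {"novice", "expert"}),
    ("Saving writes your work to disk so that it is not lost.", {"novice"}),
    ("F3 also accepts a file name.", {"expert"}),
]

reader = Reader()
novice_view = prune_for(PARAGRAPHS, reader.level)
reader.completed_tasks = 5  # the level changes as experience accumulates
expert_view = prune_for(PARAGRAPHS, reader.level)
full_view = prune_for(PARAGRAPHS, reader.level, show_all=True)
```

The `show_all` flag stands in for the reader's request for more detail, which the text describes as changing the degree of pruning.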


Appendix A: Questionnaire for experiment 1



Appendix B: Questionnaire for experiment 2

Subject _____        Order ______



Compare the two methods which you have just used. Circle the appropriate choice.

Which method would you prefer if you had to use one of the two methods frequently?

(1) The first method. (2) The second method.

Rate the following aspects of the methods on the scales provided. Circle a number from 1 to 11. Carefully read each item and pay attention to the labels on the rating scales.

Manual Content

                                             Very Hard               Very Easy

Ease of learning:             first method   1  2  3  4  5  6  7  8  9  10 11

                             second method   1  2  3  4  5  6  7  8  9  10 11


Ease of use:                  first method   1  2  3  4  5  6  7  8  9  10 11

                             second method   1  2  3  4  5  6  7  8  9  10 11


Was it difficult to find the cursor (the green bar) on the screen?  no  yes

Would you have liked to have another key which would allow the cursor to
move in the opposite direction?  yes no

Were the keys located in a place easy to reach on the keyboard?  no  yes

If not, where would you rather have the keys located?

Are you right handed or left handed?  left right




Orientation

                                             Frequently          Infrequently

Got lost:                     first method   1  2  3  4  5  6  7  8  9  10 11

                             second method   1  2  3  4  5  6  7  8  9  10 11


Would it have helped when you got lost if a list of the titles of the sections you had already been shown were on the screen? yes no

What kind of aids would you have liked so you would know where you were in the manual?

Would a table of contents or an index have helped? no yes

Screen

                                             Very Hard               Very Easy

Find the right information on the screen:

                               first method  1  2  3  4  5  6  7  8  9  10 11

                              second method  1  2  3  4  5  6  7  8  9  10 11


Understand the information on the screen:

                               first method  1  2  3  4  5  6  7  8  9  10 11

                              second method  1  2  3  4  5  6  7  8  9  10 11


Detail

                                             Too Little              Too Much

Level of detail:               first method  1  2  3  4  5  6  7  8  9  10 11

                              second method  1  2  3  4  5  6  7  8  9  10 11


Organization

                                             Very Organized     Very Confusing

Organization of information:   first method  1  2  3  4  5  6  7  8  9  10 11

                              second method  1  2  3  4  5  6  7  8  9  10 11

General Questions

Did you get lost when using the first method? If so, why?

Did you get lost when using the second method? If so, why?

Comments about the first system:

Comments about the second system:

Comments about the experiment:

How would you improve the format of the information in the manual in the first method? In the second method?

How would you improve the methods of selecting information (e.g., use of keys) in the first method? In the second method?


Appendix C: Experiment 1 Paper-Tree manual

Appendix D: Experiment 2 manuals (samples)

Pruned-computer manual

(omitted)

Pruned-airplane manual

(omitted)

Unpruned-computer manual

(omitted)

Unpruned-airplane manual

(omitted)

Footnotes

1 A second version of TIES uses a touchscreen to allow pointing with a finger instead of left and right cursor motions (a keyboard is not needed).

2 A more thorough analysis and discussion of online help systems can be found in Borenstein (1985) and Houghton (1984).

3 Manual styles are described in Chapter 6.

4 Rutherford (1965, p.4) formally defined a lattice as "a partially ordered set such that any two elements of it possess both a least upper bound and a greatest lower bound."

5 Otherwise the structure is a tree.

6 Trees do not lose information since paths do not merge -- trees only become bushier.

7 The database of text and information will be referred to as a "database," and the database of decision information will be referred to as a "decision database."

8 The rules are associated with the database, and define the search space. Rules are described further in Chapter 5.

9 All of the previous sections in the current search path are held in a stack.

10 A detailed description of the features used is in Chapter 8 (The user interface).

11 A detailed description can be found in Chapter 8 (The user interface).

12 Green phosphor on a gray-black background.

13 In two instances in the paper conditions, the times for a trial were lost because of administrative errors. The data for the two lost trials were replaced with the mean time of the remaining nine subjects on these two trials.

14 Using reciprocals is a standard transformation used to get rid of interactions between means and standard deviations for response time data.

15 Green phosphor on a gray-black background.

16 Average time per page was calculated as (total-time-per-trial)/(pages-viewed-per-trial).

17 No trials contained fewer than the minimum number of pages in either condition.


References

Blair, David C. and Maron, M.E. "An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System." Communications of the ACM, March 1985, 289-299.

Borenstein, Nathaniel S. "The Design and Evaluation of On-line Help Systems." Ph.D. Dissertation, Carnegie-Mellon University, January 1985.

Braine, M.D.S. "On the relation between the natural logic of reasoning and standard logic." Psychological Review, (1978) 85, 1-12.

Carroll, John M. "Minimalist Training." Datamation, November 1, 1984, 125-136.

Tops-20 User's Guide, Seventh edition. Digital Equipment Corporation, Marlboro, Ma., 1980.

VAX/VMS Command Language User's Guide. Digital Equipment Corporation, 1977.

Ewing, John, Simin Mehrabanzad, Scott Sheck, Dan Ostroff and Ben Shneiderman. "An Experimental Comparison of a Mouse and Arrow Keys for an Interactive Encyclopedia." Department of Computer Science, University of Maryland, College Park. Technical Report CS-TR-1475. February 1985.

Ferrarini, Elizabeth M. "Move Over Electronic Mail - Here Comes Viewdata." Interface Age, December 1980, pp. 77-82. As reprinted in Tutorial: End User Facilities in the 1980's. James A. Larson (ed.), IEEE Computer Society Press.

Furnas, G.W., T.K. Landauer, L.M. Gomez, and S.T. Dumais. "Statistical Semantics: Analysis of the Potential Performance of Key-Word Information Systems." The Bell System Technical Journal, Vol. 62, No. 6, July-August 1983.

Gould, John D., and Nancy Grischkowsky. "Doing the Same Work with Hard Copy and with Cathode-Ray Tube (CRT) Computer Terminals." Human Factors, 1984, 26(3), 323-337.

Hansen, Wilfred J., Richard Doring, and Lawrence R. Whitlock. "Why an examination was slower on-line than on paper." Int. J. Man-Machine Studies (1978) 10, 507-519.

Houghton, Raymond C. Jr. "Online Help Systems: A Conspectus." Communications of the ACM, February 1984, 126-133.

Hsu, Andrew, and David Powell. "An Experimental Evaluation of Two Menu Designs for Information Retrieval." Student project, CMSC 434. University of Maryland, College Park. December 1984.

Document Composition Facility, Generalized Markup Language: Concepts and Design Guide. IBM Publication No. SH20-9188 (April 1980).

IBM Virtual Machine/System Product: CMS Primer. IBM, 1981.

Kolers, P.A., R.L. Duchnicky, and D.C. Ferguson. "Eye movement measurement of readability of CRT displays." Human Factors, 23 (1981), 517-527.

Koved, Larry. "Implicit vs. Explicit Menus." Working paper, December 1984. IBM T.J. Watson Research Center, Yorktown Heights, N.Y.

Koved, Larry, and Ben Shneiderman. "Embedded Menus: Selecting items within context." In preparation. IBM, Yorktown Heights, N.Y., and University of Maryland, College Park.

Landauer, T.K., S.T. Dumais, L.M. Gomez, and G.W. Furnas. "Human Factors in Data Access." The Bell System Technical Journal, Vol. 61, No. 9, November 1982.

Mills, Carol Bergfeld, and Linda J. Weldon. "Reading from Computer Screens." Center for Automation Research, Human-Computer Interaction