Constructive hypertext tools have evolved into two broad families that accommodate two distinct roles: argumentation and information farming. Argumentation tools such as SEPIA (Thüring, Haake et al. 1991) and gIBIS (Conklin and Begeman 1988) help writers fit information they understand into coherent, convincing, and valid argumentative structures. Information farming tools such as VIKI (Marshall, Shipman et al. 1994), MacWeb (Nanard and Nanard 1991), and Dolphin (Haake, Neuwirth et al. 1994) help writers assimilate information they do not (yet) completely understand into a growing and changing framework. They also provide opportunities for discovering unexpected structure and for expressing volatile relationships.
Spatial hypertext tools are convenient for information farming because space provides a natural expression for tentative and imprecise relationships. Where formal systems provide strict (and therefore powerful) roles such as is-a, part-of, and is-cited-by (Trigg and Weiser 1986)(Lenat, Ramanathan et al. 1990), the continuity and fluidity of spatial relationships facilitate ad hoc, exploratory, or speculative organizations: piles, clusters, and neighborhoods (Marshall and Shipman 1997). The natural transition from information farming to argumentation takes place in the mind of the user: once the data and their relations are understood, the focus naturally shifts from exploration to explication and presentation.
Hypertext tools also need to move information—more or less gracefully—between information farming and argumentation. Information farming tools need to accept data from argumentative and presentational systems, such as journals, books, reports, and web pages. Conversely, information farming tools may need to send information to presentation tools, and so may need to express ill-defined or volatile relationships in more formal terms.
Web Squirrel
The Eastgate Systems Web Squirrel™, an information farming application, provides a spatial hypertext environment for keeping track of Web pages and other internet resources. An individual’s research interests and reading patterns are informal and volatile, yet many people need to keep track of several hundred or thousand such references. We need tools to track and organize resources as they are found, even though such discoveries cannot easily be assigned to predefined roles or categories.
Web Squirrel represents each resource as a concrete, manipulable object. By grouping related objects in space, users remind themselves of relationships both within the data they represent and in the user’s work style. Because moving icons into and among clusters is easy and natural, it requires little commitment and poses slight risk. Moving two items a little closer together, or pulling one exceptional item to the edge of a cluster, is easily reversible, whereas assigning an item to a category (filing it in a folder, assigning it to an aggregate, tagging it with a keyword) represents a more emphatic and hazardous assertion.
Web Squirrel neighborhoods (Figure 1) also provide a natural framework for manipulation and housekeeping. Labels (or pictures), for example, can explain the purpose of a neighborhood to collaborators and remind the user of the pertinent organizing principle. Labels also provide a “handle” allowing selection of a group of related items. This makes it easy to move a neighborhood, to clear space for new items, or to clarify the information space by adjusting the spatial relationships of neighborhoods to each other.
Automating Structure Discovery
The immediate goal of information farming is to let structure emerge gradually as data accumulate. However, a system may need to discover and formalise structure automatically when moving data between tools—especially when moving from an information farming tool to an argumentation tool.
Suppose a user selects many Web Squirrel items, and drops them into an HTML editor. One reasonable interpretation of this gesture is that the user wants to create a hotlist in which each Web Squirrel item becomes an HTML link. But in what order should these items appear? How should the spatial layout of the items inform their order in the hotlist? While some representational impedance is inevitable, it is obviously desirable to preserve as much information as possible.
In the Storyspace hypertext environment (Bernstein, Bolter et al. 1991)(Joyce 1991), the position of each item in space implicitly defines its position in a sequence that starts at the upper left-hand corner and proceeds across and down. Since Storyspace hypertexts (following Guide (Brown 1989) and KMS (Akscyn, McCracken et al. 1987)) have a hierarchical backbone, this implicit ordering assigns a useful meaning to spatial position while encouraging writers to use both space and hierarchy. When other writers adopt other principles to organise an information space (arranging items to reveal link structure, for example, or to keep track of their progress in finishing a project), the sequence implied by the spatial ordering may surprise or confuse. Even experienced users may be unsure what sequence Figure 2 implies. Moreover, although spatial manipulation is continuous, its mapping onto implied sequence is not: a small change in the position of any of the three items in Figure 2 may alter the sequence unexpectedly. The Storyspace mapping of space to sequence thus appears ill-suited to Web Squirrel.
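The discontinuity can be illustrated with a minimal sketch. It assumes, purely for illustration, that the implied sequence is obtained by sorting items by their top edge and then by their left edge; the actual Storyspace rule may differ in detail, and the names and coordinates below are hypothetical.

```python
from typing import NamedTuple

class Space(NamedTuple):
    name: str
    x: float  # left edge; larger values lie further right
    y: float  # top edge; larger values lie further down

def implied_sequence(spaces):
    """Order spaces top-to-bottom, breaking ties left-to-right (an assumed rule)."""
    return [s.name for s in sorted(spaces, key=lambda s: (s.y, s.x))]

# Three spaces arranged in a gently descending row.
before = [Space("A", 10, 20), Space("B", 60, 21), Space("C", 110, 22)]
# The same map after nudging the rightmost space up by three units.
after = [Space("A", 10, 20), Space("B", 60, 21), Space("C", 110, 19)]

print(implied_sequence(before))  # ['A', 'B', 'C']
print(implied_sequence(after))   # ['C', 'A', 'B']
```

Under this assumed rule, a three-unit movement promotes the last item to the head of the sequence, which is the kind of jump the caption of Figure 2 describes.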
The visual prominence of labelled neighborhoods in Web Squirrel suggests that neighborhoods ought to structure the exported sequence. Many clustering algorithms might usefully apply here; in principle, almost any plausible algorithm should perform acceptably, since the most difficult choices (items placed ambiguously between neighborhoods, items not placed near any neighborhood, neighborhoods that overlap) are those in which the user herself has introduced ambiguity.
In practice, however, we discovered that immediate visual feedback was essential to comfortable use of neighborhoods as sequencers. Placing an item at the periphery of a neighborhood, or between two neighborhoods, is a commonplace way to represent a basic semantic relationship. Yet this placement can lead Web Squirrel’s underlying mechanism to amalgamate two clusters into one, with dramatic and unintended results. By displaying its interpretation of neighborhood boundaries, Web Squirrel always makes evident the structure it will use for export.
The need for immediate visual feedback, however, implies that the clustering algorithm must be efficient since neighborhood boundaries need to be recomputed whenever an item is added, moved, or deleted. The following algorithm has proven to be fast and to behave intuitively:
1) Define the vicinity of an item to be the item’s bounding rectangle, magnified by a fixed proportion (in Web Squirrel, this scaling factor is 1.5).
2) For each neighborhood label M, set the neighborhood boundary to the label’s vicinity.
3) For each item whose vicinity intersects the boundary of M, and for each neighborhood label whose vicinity does so and whose font size is smaller than the font size of M:
3a) add that item or neighborhood to the neighborhood M
3b) set the neighborhood bounds to the union of the former boundary and the vicinity of the item just added
4) Repeat until all items have been assigned to a neighborhood, or until the vicinities of all unassigned items lie outside every neighborhood.
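The steps above can be sketched in Python as follows. This is an illustrative reading of the algorithm, not Web Squirrel's actual code. The class and function names are hypothetical, the vicinity is assumed to be scaled about the rectangle's centre, plain items are assumed to carry a font size of zero so that only labels have a positive size, and each label's neighborhood is grown independently of the others.

```python
from dataclasses import dataclass

SCALE = 1.5  # vicinity scaling factor quoted in the text

@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    right: float
    bottom: float

    def scaled(self, factor):
        """Return this rectangle magnified about its centre (an assumption)."""
        cx, cy = (self.left + self.right) / 2, (self.top + self.bottom) / 2
        hw = (self.right - self.left) / 2 * factor
        hh = (self.bottom - self.top) / 2 * factor
        return Rect(cx - hw, cy - hh, cx + hw, cy + hh)

    def intersects(self, other):
        return (self.left <= other.right and other.left <= self.right
                and self.top <= other.bottom and other.top <= self.bottom)

    def union(self, other):
        return Rect(min(self.left, other.left), min(self.top, other.top),
                    max(self.right, other.right), max(self.bottom, other.bottom))

@dataclass
class Item:
    name: str
    bounds: Rect
    font_size: float = 0.0  # > 0 marks a neighborhood label (assumed encoding)

    @property
    def vicinity(self):
        return self.bounds.scaled(SCALE)  # step 1

def grow_neighborhood(label, items):
    """Steps 2-4 for one label: absorb candidates until nothing more fits."""
    boundary = label.vicinity  # step 2
    members = []
    # Candidates: plain items, plus labels set in a smaller font than this one.
    pending = [i for i in items
               if i is not label and i.font_size < label.font_size]
    changed = True
    while changed:  # step 4: repeat until a pass adds nothing
        changed = False
        for item in list(pending):
            if item.vicinity.intersects(boundary):          # step 3
                members.append(item.name)                   # step 3a
                boundary = boundary.union(item.vicinity)    # step 3b
                pending.remove(item)
                changed = True
    return sorted(members), boundary

items = [
    Item("Tools", Rect(0, 0, 40, 10), font_size=18),
    Item("b", Rect(0, 28, 20, 38)),   # reachable only after "a" is absorbed
    Item("a", Rect(0, 14, 20, 24)),
    Item("far", Rect(300, 300, 320, 310)),
]
members, boundary = grow_neighborhood(items[0], items)
print(members)  # ['a', 'b']
```

In this example item "b" does not touch the label's own vicinity; it joins the neighborhood only on the second pass, after "a" has enlarged the boundary. Because the loop runs to a fixed point, the resulting membership is the same whatever order the items are listed in.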
Note that, while this algorithm could take O(n²) time to cluster n items, it runs in O(n) time if it happens to examine items in a fortuitous sequence, such that each item can be assigned to a neighborhood the first time it is examined. Whenever items are moved, deleted, or added, we can use the old clustering information to choose the sequence in which items are examined. On modest personal computers, the clustering for a few hundred items can be recomputed while the computer plays a “click” or “snap” for audible enactment of the repositioning or deletion.
This algorithm is straightforward and stable; the assignment of items to clusters does not depend on the sequence in which the algorithm examines them. It also provides a simple way to express hierarchy: a large neighborhood can contain a smaller neighborhood while the small neighborhood retains its separate identity. The rectangular boundaries facilitate rapid intersection testing. Irregular neighborhood boundaries, on the other hand, might prove more intuitive (albeit at greater computational cost) and would avoid the tendency of rectangular neighborhoods to overlap and swallow adjacent neighborhoods. While immediate visual feedback makes it easy to correct for accidental overlap, more complex neighborhood boundaries would prove more flexible and would reflect more accurately the spirit of underlying spatial relationships.
Neighborhoods within Web Squirrel are simple, ad hoc, information farming structures that emerge naturally whenever the user labels a cluster of related items. When users copy neighborhoods or collections of neighborhoods into argumentative tools, the derived structures provide an easy means to sketch a preliminary structure adapted to the needs of the formal tool without imposing the burden of formalism on the information farmer.
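As an illustration of such an export, the sketch below renders an already clustered farm as a nested HTML hotlist, with each neighborhood becoming a sub-list. The data layout (pairs of a title and either a URL string or a list of further pairs), the function name, and the example URLs are all hypothetical; the text does not specify how Web Squirrel actually performs this export.

```python
from html import escape

def hotlist(entries, indent=0):
    """Render (title, target) pairs as an HTML list.

    A target that is a list is treated as a nested neighborhood;
    any other target is treated as the URL of a single resource.
    """
    pad = "  " * indent
    lines = [pad + "<ul>"]
    for title, target in entries:
        if isinstance(target, list):
            # A neighborhood: its label heads a nested list of its members.
            lines.append(f"{pad}  <li>{escape(title)}")
            lines.extend(hotlist(target, indent + 2))
            lines.append(f"{pad}  </li>")
        else:
            # A single item becomes one link; escape() also quotes the URL.
            lines.append(
                f'{pad}  <li><a href="{escape(target)}">{escape(title)}</a></li>'
            )
    lines.append(pad + "</ul>")
    return lines

# Hypothetical farm: one top-level item and one labelled neighborhood.
farm = [
    ("Eastgate", "https://example.com/eastgate"),
    ("Spatial hypertext", [
        ("VIKI notes", "https://example.org/viki?a=1&b=2"),
    ]),
]
markup = "\n".join(hotlist(farm))
print(markup)
```

The order of the links follows the order of the input pairs, so the question raised earlier (in what order should the items appear?) is answered by whatever sequence the clustering step supplies.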
Acknowledgements: I am grateful to Eric Cohen for many suggestions and corrections, both in the design described here and in the composition of this note. Web Squirrel™ is a trademark of Eastgate Systems, Inc.
Figure 1. A section of a typical Web Squirrel information farm. Neighborhood boundaries appear as fuzzy rectangles; notice how small neighborhoods are clustered within larger neighborhoods.
Figure 2. The sequence of Storyspace writing spaces is determined by their position, from the upper-left to the lower-right-hand corner. Moving the rightmost space slightly higher in the map will make it the first space of the sequence.
References
Akscyn, R., D. McCracken, et al. (1987). KMS: A Distributed Hypermedia System for Managing Knowledge in Organizations. Hypertext '87, Chapel Hill, NC, ACM.
Bernstein, M., J. D. Bolter, et al. (1991). Architectures for Volatile Hypertext. Hypertext '91, San Antonio, ACM.
Brown, P. J. (1989). “Do we need maps to navigate round hypertext documents?” Electronic Publishing—Organization, Dissemination and Design 2(2): 91–100.
Conklin, J. and M. L. Begeman (1988). “gIBIS: A Hypertext Tool for Exploratory Policy Discussion.” ACM Transactions on Office Information Systems 6(4): 303–331.
Haake, J. M., C. M. Neuwirth, et al. (1994). Coexistence and Transformation of Informal and Formal Structures: Requirements for More Flexible Hypermedia Systems. European Conference on Hypermedia Technology 1994, Edinburgh, Scotland.
Joyce, M. (1991). Storyspace as a hypertext system for writers and readers of varying ability. Hypertext’91. San Antonio, 381–387.
Lenat, D. B., V. G. Ramanathan, et al. (1990). “CYC: Towards programs with common sense.” Communications of the ACM 33(8): 30–49.
Marshall, C. C. and F. M. Shipman (1997). Spatial Hypertext and the Practice of Information Triage. Proc. of Hypertext’97. Southampton, UK, 124–133.
Marshall, C. C., F. M. Shipman, et al. (1994). VIKI: Spatial Hypertext Supporting Emergent Structure. ECHT’94. Edinburgh, 13–23.
Nanard, J. and M. Nanard (1991). Using Structured Types to Incorporate Knowledge in Hypertext. Hypertext’91. San Antonio, 329–343.
Thüring, M., J. M. Haake, et al. (1991). What’s Eliza doing in the Chinese Room? Incoherent hyperdocuments—and how to avoid them. Hypertext’91. San Antonio, 161–177.
Trigg, R. H. and M. Weiser (1986). “TEXTNET: A Network-based Approach to Text Handling.” ACM Transactions on Office Information Systems 4(1): 1–23.