Eckhart Köppen (Eckhart.Koeppen@uni-essen.de), Gustaf
Information Systems and Software Techniques
University of Essen
University Road 9, 45141 Essen
This paper presents an architectural implementation for web-based, active documents. Although several approaches for distributed, active documents exist already, we decided to establish a new model which provides more flexibility and interoperability without giving up formality. The model is based mainly on the Extensible Markup Language (XML) and makes use of the Document Object Model, Cascading Style Sheets and the Intrinsic Event Model, which are all open standards defined by the W3 Consortium.
Note: This version of the paper is the original submission to the Call for Papers for the 7th World Wide Web Conference in Brisbane, Australia. The final and published version can be found at http://nestroy.wi-inf.uni-essen.de/Forschung/Publikationen/WWW7/. It has been shortened and some minor corrections have been made.
The rapid success of the World Wide Web has led to a new class
of applications which are constructed using HTML
[Raggett (1997)] for the user interface and
scripts for the application's logic. They have a more or less
strong resemblance to mainframe programs: the user enters data
into a form which is transferred to the web-server, evaluated
and the results are passed back to the user agent. As a
result, the computational load and the application logic are
located entirely on the server side.
To gain more flexibility and to remove any distinctions
between server and client, the approach which we will present
in this paper incorporates the application code into the
HTML document, thus turning the formerly passive document
into an application itself. We will refer to these enabled
HTML documents as active, hyperlinked documents
(AHDs). With this document-centered architecture, a
different application model can be implemented. Typical
applications range from small programs like a bookmark page
which controls its logic and appearance itself to
workflow management systems which contain mostly independent
documents with different states and possible operations on
them. More generally speaking, the possible uses of AHDs
range from controlling the contents and layout of a single
document to the support of coordination and collaboration
techniques. The goal of our implementation of the model is to
provide a means to develop and evaluate different applications.
The general functionality which is needed by this
document-centered architecture covers several aspects: an
interface to the user agent, flexible structuring and semantic
annotation of the document (state, behavior, presentation),
introspection through an interface to the document itself, a
layout mechanism and a scripting language. Figure
shows the building blocks of an AHD.
In an implementation, the abstract building blocks have to be
replaced by concrete specifications and implementations.
Mainly, we will be using two standard proposals which have been
introduced recently by the W3 Consortium: the Extensible
Markup Language (XML) and the Document Object
Model (DOM) ([Bray et al. (1997)] and
[Byrne (1997)]). The
DOM provides a clear,
programming-language-independent interface to the document
structure. Additionally, it defines interaction with the user
and the user agent through the Intrinsic Event Model
(IEM). HTML is too limited as a means to annotate a document
semantically. Here,
XML will be used as the basis
for the semantic annotations and as a structuring
language. Finally, the document layout is described using
Cascading Style Sheets (CSS,
[Lie and Bos (1996)]).
The techniques used in the basic building blocks of the
model are shown in Figure
. Note, however, that
the use of Tcl and OTcl (
[Wetherall and Lindblad (1995)]) as the scripting
language is not a mandatory characteristic of an
AHD. Tcl/OTcl is chosen here because it offers a
reasonable balance of simplicity and capability.
The proposed model for
AHDs is embedded in a system
architecture which is a natural extension to the architecture
of current web-based applications. In those systems, web
servers provide access to a repository of either statically
available or dynamically created documents. On the client
side, web browsers are used as user agents to request the
documents from the web servers.
On the client side, the user agent will be extended to
incorporate a runtime environment (RE), which
provides an interface from the user agent to the AHD and
vice versa. In addition, it implements introspection and
execution mechanisms for the
AHDs. Related to the RE are
the presentation and networking layers of the user agent,
which handle the display and transport of AHDs. The accompanying figure
shows the different components of the user agent.
On the server side, the
RE will be incorporated into the web
server (this step will not be discussed in this paper), so
that certain functions in an
AHD can be executed prior to
the transfer of the
AHD to the user agent. An AHD can
also reside in the web server for a longer time, responding to
Although this architecture resembles a typical agent meeting
([White (1996)]), our model goes further
by defining not only the state and behavior of an
AHD but also aiming
at the end user through representation mechanisms like
CSS (layout, aural attributes) and the incorporation of a web
browser, thus letting the user interact with an AHD.
The basic components needed to implement our model of AHDs
will now be described.
The Extensible Markup Language (XML) is very
closely related to
HTML and is a subset of the
Standard Generalized Markup Language (SGML,
[Goldfarb (1990)]). According to
[Bray et al. (1997)], some of the main goals for the
XML documents are composed of physical units
(entities) and have a logical structure which is formed by
elements. Elements are declared in a Document Type
Definition (DTD) and marked with start- and end-tags in
For our approach, the main advantage of XML is
the ability to declare elements which are needed to form an
active document and to describe state, behavior and
presentation separately. In comparison to SGML, XML is
explicitly targeted at the World Wide Web, and more
widespread support from content creators and software
developers is expected.
The structure of
XML documents can be accessed
through the Document Object Model (DOM), which
describes the components of a document in terms of nodes
which are organized in a tree. The more interesting node
types are text, elements and
The interface to the different nodes is described via the
Interface Definition Language (IDL) of
the Object Management Group
[OMG (1997)]. An interface
definition contains the attributes and possible operations
for the node types. Corresponding definitions can be derived
for various languages, e.g. Java. Among the operations
defined for the node types are operations to create new
elements, manipulate their list of children and modify their
To map events to behavior, we use the Intrinsic Event
Model (IEM) defined in the
DOM. Events are tied to
the element where they occur. If an element does not process
an event, it is propagated to its parent element.
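The propagation rule can be illustrated with a minimal Python sketch. It is not taken from the Cineast implementation; the `Element` class, the `dispatch` function and the handler signature are hypothetical names chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Element:
    """Hypothetical stand-in for a document element with event handlers."""
    name: str
    parent: Optional["Element"] = None
    handlers: Dict[str, Callable[[str], str]] = field(default_factory=dict)


def dispatch(target: Element, event: str) -> Optional[str]:
    """Deliver `event` to `target`; walk up the parent chain until it is handled."""
    node: Optional[Element] = target
    while node is not None:
        handler = node.handlers.get(event)
        if handler is not None:
            # The handler is told where the event originally occurred.
            return handler(target.name)
        node = node.parent
    return None  # no element in the chain processes the event


order = Element("order", handlers={"onclick": lambda src: f"order handled click from {src}"})
buyer = Element("buyer", parent=order)

# <buyer> has no handler of its own, so the event reaches <order>.
result = dispatch(buyer, "onclick")
```

In this sketch an event that no ancestor handles is simply dropped; the paper does not state what happens in that case.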
Any activity is triggered by external events. They can be
grouped into user events and events caused by the runtime
environment. User events are pointer events (motion,
clicks), keyboard events (key pressed, key released),
form-related events (list selections, text changes) and focus
changes. Events triggered by the
RE are document loading
and document unloading.
XML defines only the structure of a document; a
separate layout definition is provided through style
sheets. The W3 Consortium proposes Cascading Style
Sheets (CSS) to be used together with
Style sheets imply a new set of attributes which are associated with an element. These attributes control the visible aspects of the elements, e.g. margins, borders, text styles. In addition, a style sheet can also describe aural properties for elements to enhance accessibility of a document with text-to-speech programs.
To implement the infrastructure for
AHDs, basic definitions
are needed to describe the structure and semantics of an
AHD. In addition, a mapping of the event model to the
document behavior has to be defined.
The structure of an
AHD is defined in a
DTD. To allow the
RE to process an
AHD, two approaches are
possible. On the one hand, the
DTD itself could declare a
set of elements which are recognized by the user agent and
contain any code and data required for the
AHD. This would
require a very detailed description since any specific state
and behavior elements would have to be made proper
elements. In our model, we chose a more flexible
approach. In an
AHD every element can contain code and data. For
this purpose, the
DTD declares only two specific elements,
<FUNC> and <VAR>. They will be explained in detail in
the next section. Instances of these elements are always
associated with their parent element, i.e. their scope is
the parent element. This association is established by the
RE, which manages the access to code and data for each
element through access and modification functions.
The following sample illustrates the mechanism. The first
part of the example declares a
DTD with the elements
<!ELEMENT ahd  o o ANY>
<!ELEMENT func - - CDATA>
<!ATTLIST func name CDATA #REQUIRED
               type CDATA #IMPLIED>
<!ELEMENT var  - - CDATA>
<!ATTLIST var  name CDATA #REQUIRED>
In the second part, two additional elements (<order> and
<buyer>) are declared. An instance of the
<order> element is
created. It contains the function
print_header and an
instance of the element
<buyer> with two functions (print and
duplicate) and two variables (name and email):
<!DOCTYPE ahd SYSTEM [
<!ELEMENT order - - (buyer | FUNC | VAR)*>
<!ATTLIST order id CDATA #IMPLIED>
<!ELEMENT buyer - - (#PCDATA | FUNC | VAR)*>
<!ATTLIST buyer id CDATA #IMPLIED>
]>
<order id="o80">
  <func name="print_header"> ... </func>
  <buyer id="p80">
    <func name="print"> ... </func>
    <func name="duplicate"> ... </func>
    <var name="name">John Doe</var>
    <var name="email">firstname.lastname@example.org</var>
  </buyer>
</order>
The RE associates both the functions and the variables
with the <buyer> element. As a consequence, the variables
and the functions can be accessed only through the <buyer>
element. In this example, the
<buyer> element itself can be
addressed using its
id attribute. The addressing
scheme for elements is defined in more detail in
[Bray et al. (1997)].
The local scoping of functions and variables requires a
method to facilitate access to functions and variables. We
will use parent delegation to look up function and
variable elements. In order to access an element through its
id, the parent chain of the
current element will be searched. Function and variable
elements are associated with an element if they are a direct
child of this element. In the example, the variable
name can be modified from within the functions of <buyer> (print and duplicate). The accompanying figure
shows the lookup of the function
print_header when it is called from a function of
<buyer>. Here, the lookup of
the function is continued in the parent of <buyer>, the
<order> element. This element contains the wanted
script as a direct child.
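The lookup by parent delegation can be sketched as follows. This is a minimal Python illustration using the standard `xml.etree.ElementTree` module, not the Kino widget's actual code; the function name `lookup`, the `PARENT` map and the placeholder function bodies are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Simplified instance of the <order> example; function bodies are placeholders.
DOC = """
<order id="o80">
  <func name="print_header">header code</func>
  <buyer id="p80">
    <func name="print">print code</func>
    <func name="duplicate">duplicate code</func>
    <var name="name">John Doe</var>
  </buyer>
</order>
"""

root = ET.fromstring(DOC)
# ElementTree keeps no parent pointers, so build a child -> parent map once.
PARENT: Dict[ET.Element, ET.Element] = {
    child: parent for parent in root.iter() for child in parent
}


def lookup(start: ET.Element, kind: str, name: str) -> Optional[ET.Element]:
    """Find a <func> or <var> called `name`, searching `start` and then its ancestors."""
    node: Optional[ET.Element] = start
    while node is not None:
        for child in node:  # direct children only
            if child.tag == kind and child.get("name") == name:
                return child  # the first match wins
        node = PARENT.get(node)  # delegate to the parent element
    return None


buyer = root.find("buyer")
# print_header is not a child of <buyer>, so the search continues in <order>.
found = lookup(buyer, "func", "print_header")
```

Because the search only ever moves towards the root, functions and variables of `<buyer>` stay invisible to `<order>`, which matches the local scoping described above.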
The two elements which are used to incorporate state and
behavior into an
AHD are <FUNC> and <VAR>. They contain
either function code or variable values and are a logical
part of their parent element. It is important that any other
element which will contain these elements is defined
appropriately by including the
<FUNC> and <VAR> elements in
the content declaration. Both elements have the CSS
display property set to
none so that they will be
neither laid out nor displayed.
The declaration of the
<FUNC> element is as follows:
<!ELEMENT FUNC - - CDATA>
<!ATTLIST FUNC name CDATA #REQUIRED
               type CDATA #IMPLIED>
The name attribute contains the name of the
function under which it can be called, and the
type attribute describes the MIME type
([Borenstein and Freed (1993)]) of the
function (which is basically the programming language
used). The contents of the
<FUNC> element (i.e. the text
between the start- and end-tag) are the code of the
function. The name of a function has to be locally unique,
which means that the parent element must not have any
other function element with the same name as an immediate
child. If another function with the same name exists, only
the first function is considered.
The following code shows the declaration of the <VAR> element:
<!ELEMENT VAR - - CDATA>
<!ATTLIST VAR name CDATA #REQUIRED>
The element has only one attribute, name. As in the
<FUNC> tag, this contains a locally unique name for
the variable. The contents of the
<VAR> element are the
value of the variable.
The events defined in the event model are mapped implicitly
to element functions. Any function whose name equals an
event name will be called when that event occurs. According
to the IEM, an event handling function can also be defined
in an attribute which has the name of the handled event. The
next examples (which are all equivalent) show the usage of
event handlers:
<label1>
  <func name="onclick"> echo Mouse clicked! </func>
  ...
</label1>
<label2 onclick="echo Mouse clicked!">
  ...
</label2>
<label3 onclick="mouse_handler">
  <func name="mouse_handler"> echo Mouse clicked! </func>
  ...
</label3>
Each event handler is passed additional information about the event, e.g. the pressed key or the pointer location.
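The equivalence of the three notations can be made concrete with a small Python sketch that resolves the script to run for an event. The names `find_func` and `handler_script` are hypothetical. The rule used to tell an inline script from a function name (an attribute value that matches a local function refers to that function) is an assumption of this sketch and is not specified in the text; parent delegation is left out for brevity.

```python
import xml.etree.ElementTree as ET
from typing import Optional

DOC = """
<ahd>
  <label1><func name="onclick">echo Mouse clicked!</func></label1>
  <label2 onclick="echo Mouse clicked!"/>
  <label3 onclick="mouse_handler"><func name="mouse_handler">echo Mouse clicked!</func></label3>
</ahd>
"""


def find_func(element: ET.Element, name: str) -> Optional[str]:
    """Return the code of the first direct <func> child called `name`, if any."""
    for child in element:
        if child.tag == "func" and child.get("name") == name:
            return (child.text or "").strip()
    return None


def handler_script(element: ET.Element, event: str) -> Optional[str]:
    """Resolve the script to run for `event` on `element`."""
    attr = element.get(event)
    if attr is not None:
        # Assumption: an attribute value naming a local function refers to it;
        # anything else is taken as inline code.
        named = find_func(element, attr)
        return named if named is not None else attr.strip()
    # No attribute: fall back to a function whose name equals the event name.
    return find_func(element, event)


root = ET.fromstring(DOC)
scripts = [handler_script(label, "onclick") for label in root]
```

All three labels resolve to the same script text, which would then be handed to the scripting language for execution.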
An element consists of five components as shown in Figure 5.
The access to the attributes and the child elements from
the RE and the
AHD is provided through the
functions defined in the
DOM. These functions can also be
used to access the functions, variables and the textual
contents of an element, but to ensure the parent
delegation mechanism for function and variable access,
convenience functions are implemented as well. These are:
setVar: sets a variable value (the textual
contents of a <VAR> element)
getVar: gets a variable value (the textual
contents of a <VAR> element)
setContents: sets the textual contents of
an element. If the element contains other child elements,
they will be deleted (to achieve more control over the
contents of an element, the functions of the DOM can be used)
getContents: gets the textual contents of
an element. Note that any child elements in the content are
ignored (e.g. the contents of the
<p> tag in
tag</p>" is "
To preserve the state of an
AHD and to transfer it over an HTTP
([Berners-Lee et al. (1996)]) connection,
another function is needed:
toText: returns a textual representation of
an AHD. The resulting text is an
XML document with
additional style information. It reflects any modifications
made through the DOM functions or the functions exported by
the RE.
Since the execution of an element function cannot be
achieved through the DOM functions, the
RE exports another function:
callFunc: calls an element function
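The behavior of the state access and serialization functions can be approximated in a few lines of Python. This is only a sketch under simplifying assumptions: the names `get_var`, `set_var`, `get_contents`, `set_contents` and `to_text` are Python spellings invented here, parent delegation is omitted, and style information is not serialized.

```python
import xml.etree.ElementTree as ET


def get_contents(element: ET.Element) -> str:
    """Text directly inside `element`; text inside child elements is skipped."""
    parts = [element.text or ""]
    parts.extend(child.tail or "" for child in element)
    return "".join(parts)


def set_contents(element: ET.Element, text: str) -> None:
    """Replace the contents by plain text, deleting any child elements."""
    for child in list(element):
        element.remove(child)
    element.text = text


def get_var(element: ET.Element, name: str) -> str:
    """Value of the first direct <var> child called `name`."""
    for child in element:
        if child.tag == "var" and child.get("name") == name:
            return child.text or ""
    raise KeyError(name)


def set_var(element: ET.Element, name: str, value: str) -> None:
    """Update the first matching <var> child, or create one if none exists."""
    for child in element:
        if child.tag == "var" and child.get("name") == name:
            child.text = value
            return
    ET.SubElement(element, "var", {"name": name}).text = value


def to_text(root: ET.Element) -> str:
    """Serialize the current state of the document, including modifications."""
    return ET.tostring(root, encoding="unicode")


buyer = ET.fromstring('<buyer id="p80"><var name="name">John Doe</var></buyer>')
set_var(buyer, "name", "Jane Doe")
saved = to_text(buyer)          # could be written to a file or sent over HTTP
restored = ET.fromstring(saved)  # later reloaded with its state intact
```

The round trip through `to_text` shows why a serialization function is sufficient to preserve state: the variables are part of the document itself.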
The implementation of the model is mainly tied to the
implementation of the user agent. We developed the extensible
web browser Cineast
[Koeppen et al. (1997)] to
develop and evaluate novel approaches like
AHDs. The main
part of the Cineast is written in OTcl, which is
also why we chose Tcl/OTcl as the scripting language
for AHDs. The Cineast is currently running under
Unix, but its main parts can be ported to other
operating systems, and the model for the
AHDs is platform independent.
The basis of the Cineast is the prototyping environment
Wafe [Neumann and Nusser (1993)]. It combines
Tcl as a scripting language with different widget sets such
as the Athena widget set or the Motif widget set. In addition,
other libraries are linked into Wafe; among these are
SSLeay [Hudson and Young (1997)],
LDAP [Howes and Smith (1997)] and OTcl. For
the implementation of the Cineast, a special purpose
widget called Kino
[Koeppen (1996)] is
integrated into Wafe to handle the parsing, layout and
display of XML source text. It is implemented in
to achieve high performance. The roles of Wafe and the
Kino widget in the different layers of the user agent are
discussed below. Figure
shows an overview.
Most of the functionality required in the
RE is provided
by the Kino widget. The main task of the Kino widget in the
RE is to parse any source text and to maintain an
internal representation of the element tree. It also
implements all of the
DOM functions and the additional
functions needed to access the functions and variables of
The Kino widget is made up of three components: the
Parser, the Layouter and the
Painter. In the
RE, only the Parser is of
interest. It produces a tree of parsed data
(PData) elements. The Kino widget uses four different
types of PData elements: a generic element, a
box element which can contain children, a text
element and an inset element which can be used
to insert other widgets (text entry fields, push buttons,
list boxes etc.) and images into the element tree.
The most interesting element is the box element. Besides its
role as a structuring element to contain other elements, it
also holds the CSS attributes. The
PData box elements
correspond directly to the
XML elements in the document, i.e. every
XML element results in a PData box. The accompanying figure
shows a part of a PData tree.
Navigation in the
PData tree is possible through the DOM functions;
alternatively, the components of the
PData elements can be used directly through their
The Kino widget is extensible in two ways. First, the application
can register a tag callback, which is called
whenever a tag is encountered during the parsing process. In
this callback, the application can, for example, modify the
PData tree. Second, a script callback handles the
execution of scripts: the Kino widget calls this
callback whenever a script has to be
executed, e.g. when an event occurs or a script is called
through callFunc. This makes the Kino widget
independent of the chosen scripting language because the
execution of a script is handled by the application, in this
case the Cineast. Figure
shows the possible
interactions with the Kino widget.
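The two extension points can be mimicked in Python as follows. The class `MiniWidget` and its attribute names are hypothetical and only illustrate the division of labor; the real Kino widget is an X Toolkit widget driven from Tcl, and the script callback here merely records the script instead of executing it.

```python
import io
import xml.etree.ElementTree as ET
from typing import Callable, List, Optional


class MiniWidget:
    """Hypothetical, heavily simplified stand-in for a parser widget with two hooks."""

    def __init__(self) -> None:
        self.tag_callback: Optional[Callable[[ET.Element], None]] = None
        self.script_callback: Optional[Callable[[str], object]] = None
        self.root: Optional[ET.Element] = None

    def parse(self, source: str) -> None:
        stream = io.BytesIO(source.encode("utf-8"))
        # "start" events fire as soon as an opening tag has been read.
        for _event, element in ET.iterparse(stream, events=("start",)):
            if self.root is None:
                self.root = element
            if self.tag_callback is not None:
                self.tag_callback(element)

    def call_func(self, element: ET.Element, name: str) -> object:
        """Hand the code of a <func> child to the application for execution."""
        for child in element:
            if child.tag == "func" and child.get("name") == name:
                if self.script_callback is None:
                    raise RuntimeError("no script callback registered")
                return self.script_callback(child.text or "")
        raise KeyError(name)


seen: List[str] = []
executed: List[str] = []

widget = MiniWidget()
widget.tag_callback = lambda element: seen.append(element.tag)
# The application decides how a script is run; here it is only recorded.
widget.script_callback = lambda code: executed.append(code.strip())

widget.parse('<order><func name="print_header">echo Header</func><buyer/></order>')
widget.call_func(widget.root, "print_header")
```

Since the widget never interprets the script text itself, swapping the scripting language only requires a different script callback.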
The presentation layer for
XML documents is implemented in
the Layouter and Painter components of the
Kino widget. They are not needed for the basic
functionality of an
AHD and can be omitted if the RE is
to be incorporated into a web server.
The Layouter works together with a
CSS database. The
database is built during the parsing process and contains
all style information. Currently, the
CSS database is
implemented completely in OTcl. The Layouter
positions any element so that no calculations are needed for
display by the Painter.
Our implementation handles most of
HTML 3.2, including important features like tables, images
and forms. The internal layout model is box-oriented, so every
PData box results in either a block-level or inline
element according to the
CSS specification. In contrast to
a simple flow-oriented model, boxes can be nested. Figure
shows the most important arrangements.
The box model allows the realization of more complex layouts such as tables. The current implementation supports neither incremental layout nor absolute positioning of boxes within a text flow.
The Kino widget cooperates closely with, and is embedded within, the infrastructure provided by Wafe and the Cineast browser.
The infrastructure requirements for the implementation of a browser are partly generic and partly Web specific. The generic requirements are usable in a wide range of applications. These generic components are provided either by Wafe or by libraries accessible through Wafe. For example, color and image management (for the image formats xbm, xpm, jpg, png and gif), event dispatching, resource management and module management are provided directly by Wafe. Generic components provided by libraries accessible through Wafe are, for example, the widget classes from OSF/Motif, basic networking facilities from Tcl, security algorithms and protocols from SSLeay, and compression from zlib.
The browser specific code is exclusively implemented in OTcl and is seen by Wafe as the Cineast source code, which is loaded by Wafe at startup time of the browser. The Cineast source code defines the typical browser semantics ranging from
CSS realization which can be separated from the Kino widget class (e.g.
FORM tags, implementation of the
HTTPS, access to non-networked documents such as files), access control (basic authentication), to
URL-completion, request life-cycle management (starting, redirection, killing), monitoring, to
MIME-type specific presenters, and
One of the most demanding tasks in implementing a browser
is to ensure that the browser does not block event
handling. Since most of the components of the browser are
not thread safe, multitasking is implemented through a
select() based event loop. All interaction and I/O
has to be performed asynchronously. If an I/O operation
blocked, the browser windows could not be
refreshed, for example, or only a single transfer would be possible at a
time. Furthermore, it would not be possible to provide
incremental loading and display, which are highly useful over
rather slow connections.
The logic of the Cineast source code is largely determined by
this asynchronous event handling. As a consequence of the
asynchronicity, various handlers and callbacks have to be
registered, for example to process incoming data from sockets
or to process user events in various windows, and to
associate the events with the corresponding tasks. Every handler
must have enough context information to continue a task in
the correct environment. The Cineast browser uses widget
IDs for the context selection for GUI purposes and OTcl objects
in the other cases. For example, for every request that is
started, a new OTcl object is created. It registers I/O callbacks on
the input side to process the input data
incrementally and reports important state
changes in its life cycle to the presentation objects
(typically image or
XML text): the object is created, a
MIME type is determined, some new data is available, the end of
data is reached, or the request was killed.
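The pattern of one context object per request, driven by a readiness-based event loop, can be sketched in Python with the standard `selectors` and `socket` modules. This is not the Cineast code, which uses OTcl objects and the Tcl event loop; the `Request` class, its state names and the use of local socket pairs in place of network connections are assumptions of this illustration.

```python
import selectors
import socket
from typing import List


class Request:
    """Hypothetical per-request context object; collects incoming data incrementally."""

    def __init__(self, name: str, sock: socket.socket, sel: selectors.BaseSelector) -> None:
        self.name, self.sock, self.sel = name, sock, sel
        self.chunks: List[bytes] = []
        self.states: List[str] = ["created"]
        sock.setblocking(False)
        # The bound method carries the context needed to continue this task.
        sel.register(sock, selectors.EVENT_READ, self.on_readable)

    def on_readable(self) -> None:
        data = self.sock.recv(4096)  # the selector reported readiness
        if data:
            self.chunks.append(data)
            self.states.append("data")
        else:  # an empty read means the peer closed the connection
            self.states.append("end")
            self.sel.unregister(self.sock)
            self.sock.close()


def run(sel: selectors.BaseSelector) -> None:
    """Dispatch callbacks until no request is registered any more."""
    while sel.get_map():
        for key, _mask in sel.select(timeout=1.0):
            key.data()  # call the callback stored at registration time


sel = selectors.DefaultSelector()
requests: List[Request] = []
for name, payload in [("page", b"<order/>"), ("image", b"GIF89a")]:
    reader, writer = socket.socketpair()  # stands in for a network connection
    writer.sendall(payload)
    writer.close()
    requests.append(Request(name, reader, sel))

run(sel)
sel.close()
```

Both transfers are served by the same loop, so neither can starve the other, which is the property the select()-based design is meant to guarantee.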
As stated earlier, the RE incorporated in the browser provides the basic platform for implementing the state and behavior components of AHDs. We will now show how a document-centered application based on AHDs differs from purely server- or client-centered approaches. We will use the selection and purchase of goods over the internet as an example.
As the central document, the catalog with the order form can
be identified and implemented as an
AHD. The life cycle of
the AHD begins with the transfer of the catalog into
the RE of the user's web browser. The first event the
AHD will receive is the document loading event
defined in the
IEM. Upon reception of this event, the
AHD can update its contents, e.g. recalculate prices
of special offers. We assume that the user now disconnects
from the internet and saves the catalog
AHD locally as a
file, which is achieved through the toText
function (introspection is necessary for this task). The
disconnect is not strictly necessary but emphasizes the
strengths of document-centered applications.
The AHD can later be loaded from the local file and
will reside in the web browser's
RE where it can react to
user input like entering amounts or selections of goods. When
the user is finished making his selection, he can save the new
state of the catalog
AHD and the order form again to a
file. If he decides to transmit his order to the online store,
the AHD generates an order
AHD. The order AHD is
transferred to the online store using a
where it will be saved in a file on the online store's web
server, ready for further processing.
Ideas for applications which are based on web techniques such
as CGI were presented early in the history of the
Web. In [Houh et al. (1994)], the authors already
point out the need for more control on both the client and
server side. Related research can also be found in
[Kaashoek et al. (1994)], where the Mosaic web browser
is extended to be able to execute Tcl scripts. However, a
more formal model is not presented.
Our proposed model for Active Hyperlinked Documents and the implementation of the overall system differ in two key points from existing approaches:
AHD is independent of commercial influence and easy to use and validate for others,
More important, however, are the next steps that have to be taken
to prove the potential of
AHDs. First, a formal model for
the use of
AHDs in distributed applications has to be
developed. Starting points can be frameworks like
[Muehlbacher and Neumann (1996)], where the potential of
active documents as tools for collaboration is discussed as
an alternative to legacy systems. Prerequisites for
collaboration are coordination issues
[Tolksdorf et al. (1996)] between
AHDs which are not
From the architectural point of view, the distinction
between user agents and web servers is not necessary. We
will work towards a clearly defined
RE that can be
incorporated into classical web servers or other applications.
Lastly, security issues are not addressed in this paper. We believe that AHDs have a huge potential in electronic commerce (especially when they are combined with technologies like cryptolopes [IBM (1996)]) or in intranet applications in combination with role-based access control [Neumann and Nusser (1997)].
[Berners-Lee et al. (1996)] T. Berners-Lee, R. Fielding, H. Frystyk: Hypertext Transfer Protocol - HTTP/1.0, Informational RFC, RFC 1945, http://www.w3.org/Protocols/rfc1945/rfc1945, May 1996.
[Borenstein and Freed (1993)] N. Borenstein and N. Freed: Multipurpose Internet Mail Extensions, Standards Track Protocol, RFC 1521, September 1993.
[Bray et al. (1997)] Tim Bray, Jean Paoli, C.M. Sperberg-McQueen: Extensible Markup Language (XML), W3C Working Draft, http://www.w3.org/TR/WD-xml, November 1997.
[Byrne (1997)] Steve Byrne: Document Object Model (Core) Level 1, W3C Working Draft, http://www.w3.org/TR/WD-DOM/level-one-core-971009, October 1997.
[Gosling and McGilton (1996)] James Gosling, Henry McGilton: The Java Language Environment, http://java.sun.com/docs/white/langenv/, May 1996.
[Goldfarb (1990)] Charles F. Goldfarb: The SGML Handbook, Oxford University Press, Oxford 1990.
[Houh et al. (1994)] Henry Houh, Cris Lindblad and David Wetherall: Active Pages: Intelligent Nodes on the World Wide Web, Proceedings of the First World Wide Web Conference, Geneva 1994.
[Howes and Smith (1997)] T. Howes and M. Smith: LDAP: Programming Directory-Enabled Applications with Lightweight Directory Access Protocol, Macmillan Technical Publishing, 1997.
[IBM (1996)] IBM Corp.: IBM Cryptolope Home, http://www.cryptolope.ibm.com/, 1996.
[Hudson and Young (1997)] T. J. Hudson and E. A. Young: SSLeay and SSLapps FAQ (Draft), http://www.psy.uq.edu.au/~ftp/Crypto/, September 1997.
[Kaashoek et al. (1994)] M. Frans Kaashoek, Tom Pinckney and Joshua A. Tauber: Dynamic Documents: Extensibility and Adaptability in the WWW, Proceedings of the Second International World Wide Web Conference, Chicago, 1994.
[Koeppen (1996)] Eckhart Koeppen: Entwicklung eines erweiterbaren Widgets zur Anzeige von HTML-Texten, Master's Thesis, University of Essen, Germany, 1996.
[Koeppen et al. (1997)] Eckhart Koeppen, Gustaf Neumann, Stefan Nusser: Cineast - An extensible Web Browser, Proceedings of WebNet 97, Toronto 1997.
[Levy (1996)] Jacob Levy: A Tk Netscape Plugin, Proceedings of the Fourth Annual USENIX Tcl/Tk Workshop, Monterey 1996.
[Lie and Bos (1996)] Håkon Wium Lie and Bert Bos: Cascading Style Sheets, level 1, W3C Recommendation, http://www.w3.org/TR/REC-CSS1, December 1996.
[Muehlbacher and Neumann (1996)] Robert Muehlbacher and Gustaf Neumann: Towards a Framework for Collaborative Software Development of Business Application System, Proceedings of the Fifth Workshops of WET ICE 96, Stanford, 1996.
[Netscape (1997a)] Netscape Communications Corp.: Plug-In Guide, http://developer.netscape.com/library/documentation/communicator/plugin/index.htm, May 1997.
[Neumann and Nusser (1993)] Gustaf Neumann, Stefan Nusser: Wafe - An X Toolkit Based Frontend for Application Programs in Various Programming Languages, USENIX Winter 1993 Technical Conference, San Diego, January 1993.
[Neumann and Nusser (1997)] Gustaf Neumann, Stefan Nusser: A Framework and Prototyping Environment for a W3 Security Architecture, Proceedings of Communications and Multimedia Security, Joint Working Conference IFIP TC-6 and TC-11, Athens, September 1997.
[OMG (1997)] Object Management Group: The Common Object Request Broker: Architecture and Specification, ftp://ftp.omg.org/pub/docs/formal/97-10-01.pdf, August 1997.
[Ousterhout (1990)] John K. Ousterhout: Tcl: An Embeddable Command Language, Proceedings of the USENIX Winter Conference, January 1990.
[Raggett (1997)] Dave Raggett: HTML 3.2 Reference Specification, W3C Recommendation, http://www.w3.org/TR/REC-html32.html, January 1997.
[Tolksdorf et al. (1996)] R. Tolksdorf, G. Neumann, W. Conen, P. Bertok, M. Fuchs: Working Group Report on Web Infrastructures for Collaborative Applications, Proceedings of the Fifth Workshops of WET ICE 96, Stanford, 1996.
[Wetherall and Lindblad (1995)] David Wetherall and Christopher J. Lindblad: Extending Tcl for Dynamic Object-Oriented Programming, Proceedings of the Tcl/Tk Workshop '95, Toronto, July 1995.
[White (1996)] Jim White: Mobile Agents, White Paper, http://www.genmagic.com/agents/Whitepaper/whitepaper.html, 1996.