Eckhart Köppen (Eckhart.Koeppen@uni-essen.de), Gustaf
Neumann (Gustaf.Neumann@uni-essen.de)
Information Systems and Software Techniques
University of Essen
University Road 9, 45141 Essen
This paper presents an architectural implementation for web-based, active documents. Although several approaches for distributed, active documents exist already, we decided to establish a new model which provides more flexibility and interoperability without giving up formality. The model is based mainly on the Extensible Markup Language (XML) and makes use of the Document Object Model, Cascading Style Sheets and the Intrinsic Event Model, which are all open standards defined by the W3 Consortium.
Note: This version of the paper is the original submission to the Call for Papers for the 7th World Wide Web Conference in Brisbane, Australia. The final and published version can be found at http://www7.conf.au/ or at http://nestroy.wi-inf.uni-essen.de/Forschung/Publikationen/WWW7/. It has been shortened and some minor corrections have been made.
The rapid success of the World Wide Web has led to a new class
of applications which are constructed using HTML
[Raggett (1997)] for the user interface and CGI
scripts for the application's logic. They have a more or less
strong resemblance to mainframe programs: the user enters data
into a form which is transferred to the web-server, evaluated
and the results are passed back to the user agent. As a
result, the computational load and the application logic are
located entirely on the server side.
In contrast to server-centered web applications, a client-centered application model has emerged through the use of scripting languages such as JavaScript [Netscape (1997b)]. Interfaces to the user agent and the current document exist in the form of plug-ins [Netscape (1997a)], Java applets [Gosling and McGilton (1996)] and embedded scripts [Levy (1996)]. However, with most of these solutions a number of problems exist: plug-ins are strongly tied to the chosen user agent and the client platform, the interface to the document is in all cases either non-existent or allows only the changing of text and there are still distinctions between client- and server-side application logic, making the design of applications which can operate on- and off-line difficult.
To gain more flexibility and to remove any distinctions between server and client, the approach which we will present in this paper incorporates the application code into the HTML document, thus turning the formerly passive document into an application itself. We will refer to these enabled HTML documents as active, hyperlinked documents (AHDs). With this document-centered architecture, a different application model can be implemented. Typical applications range from small programs, such as a bookmark page which controls its own logic and appearance, up to workflow management systems which contain mostly independent documents with different states and possible operations on them. More generally speaking, the possible uses of AHDs range from controlling the contents and layout of a single document to supporting coordination and collaboration techniques. The goal of our implementation of the model is to provide a means to develop and evaluate different applications of AHDs.
The general functionality which is needed by this document-centered architecture covers several aspects: an interface to the user agent, flexible structuring and semantic annotation of the document (state, behavior, presentation), introspection through an interface to the document itself, a layout mechanism and a scripting language. Figure 1 shows the building blocks of an AHD model.
[Figure 1: AHD model]
In an implementation, the abstract building blocks have to be replaced by concrete specifications and implementations. Mainly, we will be using two standard proposals which have been introduced recently by the W3 Consortium: the Extensible Markup Language (XML) and the Document Object Model (DOM) (see [Bray et al. (1997)] and [Byrne (1997)]). The DOM provides a clear, programming language independent interface to the document structure. Additionally, it defines interaction with the user and the user agent through the Intrinsic Event Model (IEM). As a means to annotate a document semantically, HTML is too limited. Here, XML will be used as the basis for the semantic annotations and as a structuring language. Finally, the document layout is described using Cascading Style Sheets (CSS, see [Lie and Bos (1996)]).
The techniques used in the basic building blocks of the AHD model are shown in Figure 2. Note, however, that the use of Tcl and OTcl ([Ousterhout (1990)], [Wetherall and Lindblad (1995)]) as the scripting language is not a mandatory characteristic of an AHD. Tcl/OTcl is chosen here because it has a reasonable balance of simplicity and capability.
[Figure 2: AHD]
The proposed model for AHDs is embedded in a system
architecture which is a natural extension to the architecture
of current web-based applications. In those systems, web
servers provide access to a repository of either statically
available or dynamically created documents. On the client
side, web browsers are used as user agents to request the
documents from the web servers.
On the client side, the user agent will be extended to incorporate a runtime environment (RE). The RE provides an interface from the user agent to the AHD and vice versa. In addition, it implements introspection and execution mechanisms for the AHDs. Related to the RE are the presentation and networking layers of the user agent, which handle the display and transport of AHDs. Figure 3 shows the different components of the user agent.
On the server side, the RE will be incorporated into the web server (this step will not be discussed in this paper), so that certain functions in an AHD can be executed prior to the transfer of the AHD to the user agent. An AHD could also reside in the web server for a longer time, responding to requests itself.
Although this architecture resembles a typical agent meeting place (see [White (1996)]), our model goes further by defining not only the state and behavior of an AHD but also aiming at the end user through representation mechanisms like CSS (layout, aural attributes) and the incorporation of a web browser, thus letting the user interact with an AHD.
The basic components needed to implement our model of AHDs will now be described.
The Extensible Markup Language (XML) is very closely related to HTML and is a subset of the Standard Generalized Markup Language (SGML, see [Goldfarb (1990)]). According to [Bray et al. (1997)], some of the main goals for the design of XML were:
[...] SGML
[...] XML processors
Like SGML, XML documents are composed of physical units (entities) and have a logical structure which is formed by elements. Elements are declared in a Document Type Definition (DTD) and marked with start- and end-tags in the document.
For our approach, the main advantage of XML over HTML is the ability to declare elements which are needed to form an active document and to describe state, behavior and presentation separately. In comparison to SGML, XML is explicitly targeted at the World Wide Web, and more widespread support from content creators and software developers is expected.
The structure of HTML and XML documents can be accessed through the Document Object Model (DOM). The DOM describes the components of a document in terms of nodes which are organized in a tree. The more interesting node types are text, elements and attributes.
The interface to the different nodes is described via the Interface Definition Language (IDL) from the Object Management Group [OMG (1997)]. An interface definition contains the attributes and possible operations for the node types. Corresponding definitions can be derived for various languages, e.g. Java. Among the operations defined for the node types are operations to create new elements, manipulate their list of children and modify their attributes.
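The kind of node operations described here can be tried out with Python's standard xml.dom.minidom module. This is only a stand-in for illustration: the paper's implementation exposes the DOM through its own runtime environment, and the element name "item" and the attribute "amount" below are hypothetical.

```python
from xml.dom.minidom import parseString

# Parse a small document into a tree of nodes.
doc = parseString('<order id="o80"><buyer id="p80">John Doe</buyer></order>')
order = doc.documentElement

# Create a new element, give it an attribute and a text child,
# and append it to the list of children of <order>.
item = doc.createElement("item")      # hypothetical element name
item.setAttribute("amount", "2")      # hypothetical attribute
item.appendChild(doc.createTextNode("Widget"))
order.appendChild(item)

# Modify an attribute of an existing element.
buyer = order.getElementsByTagName("buyer")[0]
buyer.setAttribute("id", "p81")

# Collect the names of the direct children of <order>.
child_names = [node.nodeName for node in order.childNodes]
print(child_names)
print(order.toxml())
```

The same three groups of operations (creating elements, manipulating the list of children, modifying attributes) are what the later convenience functions build on.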
To map events to behavior, we use the Intrinsic Event Model (IEM) defined in the DOM. Events are tied to the element where they occur. If an element does not process an event, it is propagated to its parent element.
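The propagation rule can be sketched in a few lines of Python. The Element class below is a hypothetical stand-in and not the data structure used in our implementation; it only shows an unhandled event travelling up the parent chain.

```python
class Element:
    """Minimal stand-in for a document element (hypothetical, for illustration)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.handlers = {}  # event name -> callable taking the target element

    def dispatch(self, event):
        """Deliver an event here; pass it on to the parent if it is not handled."""
        node = self
        while node is not None:
            handler = node.handlers.get(event)
            if handler is not None:
                return handler(self)
            node = node.parent
        return None  # no element in the parent chain processed the event


order = Element("order")
buyer = Element("buyer", parent=order)
order.handlers["onclick"] = lambda target: f"order handled click on {target.name}"

# <buyer> has no onclick handler, so the event reaches <order>.
result = buyer.dispatch("onclick")
unhandled = buyer.dispatch("onkeypress")
```

The handler receives the element where the event occurred, so a parent can still tell which child was the target.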
Any activity is triggered by external events. They can be grouped into user events and events caused by the runtime environment. User events are pointer events (motion, clicks), keyboard events (key pressed, key released), form-related events (list selections, text changes) and focus changes. Events triggered by the RE are document loading and document unloading.
Since XML defines only the structure of a document, a separate layout definition is provided through style sheets. The W3 Consortium proposes Cascading Style Sheets (CSS) to be used together with HTML and XML.
Style sheets imply a new set of attributes which are associated with an element. These attributes control the visible aspects of the elements, e.g. margins, borders, text styles. In addition, a style sheet can also describe aural properties for elements to enhance accessibility of a document with text-to-speech programs.
To implement the infrastructure for AHDs, basic definitions are needed to describe the structure and semantics of an AHD. In addition, a mapping of the event model to the document behavior has to be defined.
The structure of an AHD is defined in a DTD. To allow the RE to process an AHD, two approaches are possible. On the one hand, the DTD itself could declare a set of elements which are recognized by the user agent and contain any code and data required for the AHD. This would require a very detailed description since any specific state and behavior elements would have to be made proper elements. In our model, we chose a more flexible approach. In an AHD every element can contain code and data. For this purpose, the DTD declares only two specific elements, namely <FUNC> and <VAR>. They will be explained in detail in the next section. Instances of these elements are always associated with their parent element, i.e. their scope is the parent element. This association is established by the RE, which manages the access to code and data for each element through access and modification functions.
The following sample illustrates the mechanism. The first part of the example declares a DTD with the elements <FUNC> and <VAR>:
<!ELEMENT ahd o o ANY>
<!ELEMENT func - - CDATA>
<!ATTLIST func name CDATA #REQUIRED
               type CDATA #IMPLIED>
<!ELEMENT var - - CDATA>
<!ATTLIST var name CDATA #REQUIRED>
In the second part, two additional elements (<order> and <buyer>) are declared. An instance of the <order> element is created. It contains the function print_header and an instance of the element <buyer> with two functions (print and duplicate) and two variables (name and email):
<!DOCTYPE ahd SYSTEM [
<!ELEMENT order - - (buyer | FUNC | VAR)*>
<!ATTLIST order id CDATA #IMPLIED>
<!ELEMENT buyer - - (#PCDATA | FUNC | VAR)*>
<!ATTLIST buyer id CDATA #IMPLIED>
]>
<order id="o80">
  <func name="print_header"> ... </func>
  <buyer id="p80">
    <func name="print"> ... </func>
    <func name="duplicate"> ... </func>
    <var name="name">John Doe</var>
    <var name="email">doe@uni-essen.de</var>
  </buyer>
</order>
The RE associates both the functions and the variables with the <buyer> element. As a consequence, the variables and the functions can be accessed only through the <buyer> element. In this example, the <buyer> element itself can be addressed using its id attribute. The addressing scheme for elements is defined in more detail in the XML specification [Bray et al. (1997)].
The local scoping of functions and variables requires a method to facilitate access to functions and variables. We will use parent delegation to look up function and variable elements. In order to access an element through its name or id, the parent chain of the current element will be searched. Function and variable elements are associated with an element if they are a direct child of this element. In the example, the variable name can be modified from within the function print since it is a direct child of the <buyer> element.
Figure 4 shows the lookup of the function print_header which is called in the function print of the element <buyer>. Here, the lookup of the function is continued in the parent of <buyer>, <order>. This element contains the wanted script as a direct child.
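The lookup by parent delegation can be written down as a short search over the parent chain. The sketch below uses Python's xml.dom.minidom as a stand-in for the runtime environment; the helper name lookup is an assumption made for this illustration and is not part of the DOM or of our implementation.

```python
from xml.dom.minidom import parseString

SOURCE = """
<order id="o80">
  <func name="print_header">...</func>
  <buyer id="p80">
    <func name="print">...</func>
    <var name="name">John Doe</var>
  </buyer>
</order>
"""


def lookup(element, tag, name):
    """Search element, then its ancestors, for a direct <tag name=...> child."""
    node = element
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        for child in node.childNodes:
            if (child.nodeType == child.ELEMENT_NODE
                    and child.tagName == tag
                    and child.getAttribute("name") == name):
                return child  # the first matching child is used
        node = node.parentNode  # continue the lookup in the parent
    return None


doc = parseString(SOURCE.strip())
buyer = doc.getElementsByTagName("buyer")[0]

local = lookup(buyer, "func", "print")             # found in <buyer> itself
inherited = lookup(buyer, "func", "print_header")  # found in <order>
missing = lookup(buyer, "func", "undefined")       # not found anywhere
```

Only direct children are considered at each level, so a function inside a sibling element is never visible through this lookup.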
The two elements which are used to incorporate state and behavior into an AHD are <FUNC> and <VAR>. They contain either function code or variable values and are a logical part of their parent element. It is important that any other element which will contain these elements is defined appropriately by including the <FUNC> and <VAR> elements in the content declaration. Both elements have the CSS display property set to none so that they will not be laid out and displayed.
The declaration of the <FUNC> element is as follows:
<!ELEMENT FUNC - - CDATA>
<!ATTLIST FUNC name CDATA #REQUIRED
               type CDATA #IMPLIED>
The name attribute contains the name under which the function can be called, and the type attribute describes the MIME type [Borenstein and Freed (1993)] of the function (which is basically the programming language used). The contents of the <FUNC> element (i.e. the text between the start- and end-tag) is the code of the function. The name of a function has to be locally unique, which means that the parent element does not have any other function element with the same name as an immediate child. If another function with the same name exists, only the first function is considered.
The following code shows the declaration of the <VAR> element:
<!ELEMENT VAR - - CDATA>
<!ATTLIST VAR name CDATA #REQUIRED>
The element has only one attribute, name. As with the <FUNC> tag, it contains a locally unique name for the variable. The contents of the <VAR> element is the value of the variable.
The events defined in the event model are mapped implicitly to element functions. Any function whose name equals an event name will be called when that event occurs. According to the IEM, an event handling function can also be defined in an attribute which has the name of the handled event. The next examples (which are all equivalent) show the usage of event handlers.
<label1>
  <func name="onclick"> echo Mouse clicked! </func>
  ...
</label1>
<label2 onclick="echo Mouse clicked!"> ... </label2>
<label3 onclick="mouse_handler">
  <func name="mouse_handler"> echo Mouse clicked! </func>
  ...
</label3>
Each event handler is passed additional information about the event, e.g. the pressed key or the pointer location.
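How the three notations can resolve to the same script is sketched below in Python, again with xml.dom.minidom as a stand-in. The text does not specify how an attribute value holding code is told apart from one naming a function, so the rule used here (a value that names an existing function refers to that function, anything else is inline code) is an assumption. Parent delegation and the event information passed to the handler are left out to keep the sketch short, and the helper names are hypothetical.

```python
from xml.dom.minidom import parseString


def find_func(element, name):
    """Return the code of the first direct <func name=...> child, or None."""
    for child in element.childNodes:
        if (child.nodeType == child.ELEMENT_NODE
                and child.tagName == "func"
                and child.getAttribute("name") == name):
            return "".join(t.data for t in child.childNodes
                           if t.nodeType == t.TEXT_NODE).strip()
    return None


def handler_code(element, event):
    """Resolve the script to run when the given event occurs on this element."""
    if element.hasAttribute(event):
        value = element.getAttribute(event)
        # Assumption: a value naming an existing function refers to that
        # function; any other value is treated as inline code.
        named = find_func(element, value)
        return named if named is not None else value
    # Implicit mapping: a function whose name equals the event name.
    return find_func(element, event)


doc = parseString("""<doc>
<label1><func name="onclick"> echo Mouse clicked! </func></label1>
<label2 onclick="echo Mouse clicked!"/>
<label3 onclick="mouse_handler">
  <func name="mouse_handler"> echo Mouse clicked! </func>
</label3>
</doc>""")

codes = [handler_code(doc.getElementsByTagName(tag)[0], "onclick")
         for tag in ("label1", "label2", "label3")]
```

All three entries of codes hold the same script text, which is what makes the notations equivalent from the point of view of the script interpreter.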
An element consists of five components as shown in Figure 5.
[Figure 5: AHD element components]
Access to the attributes and the child elements from within the RE and an AHD is provided through the functions defined in the DOM. These functions can also be used to access the functions, variables and the textual contents of an element, but to ensure the parent delegation mechanism for function and variable access, convenience functions are implemented as well. These are:
setVar: sets a variable value (the textual contents of a <VAR> tag)
getVar: gets a variable value (the textual contents of a <VAR> tag)
setContents: sets the textual contents of an element. If the element contains other child elements, they will be deleted (to achieve more control over the contents of an element, using the functions of the DOM is recommended).
getContents: gets the textual contents of an element. Note that any child elements in the content are ignored (e.g. the contents of the <p> tag in "<p>A <em>nested</em> tag</p>" is "A tag").
To preserve the state of an AHD and to transfer it over an HTTP [Berners-Lee et al. (1996)] connection, another function is needed:
toText: returns a textual representation of the AHD. The resulting text is an XML document with additional style information. It reflects any modifications made through DOM functions or the functions exported by the RE.
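The interplay of a state change and the textual representation can be illustrated as follows. This is a simplified Python sketch on top of xml.dom.minidom: set_var and to_text are hypothetical counterparts of setVar and toText, the variable is looked up only among the direct children (no parent delegation), and no style information is added to the output.

```python
from xml.dom.minidom import parseString


def set_var(element, name, value):
    """Set the text of the direct <var name=...> child; report success."""
    for child in element.childNodes:
        if (child.nodeType == child.ELEMENT_NODE
                and child.tagName == "var"
                and child.getAttribute("name") == name):
            while child.firstChild is not None:
                child.removeChild(child.firstChild)
            child.appendChild(element.ownerDocument.createTextNode(value))
            return True
    return False


def to_text(document):
    """Return the current state of the document as XML source text."""
    return document.documentElement.toxml()


doc = parseString('<buyer id="p80"><var name="name">John Doe</var></buyer>')
changed = set_var(doc.documentElement, "name", "Jane Roe")
saved = to_text(doc)

# Parsing the saved text again restores the modified state.
reloaded = parseString(saved)
restored = reloaded.getElementsByTagName("var")[0].firstChild.data
```

Because the state lives in the document tree itself, saving and reloading the text is all that is needed to move the document between a file, a browser and a server.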
Since the execution of an element function cannot be triggered by DOM functions, the RE exports another utility function:
callFunc: calls an element function
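The behavior of the two content functions listed above (setContents and getContents) can be approximated in Python with xml.dom.minidom. The function names are hypothetical counterparts, and collapsing runs of whitespace is an assumption made so that the result matches the "A tag" example.

```python
from xml.dom.minidom import parseString


def get_contents(element):
    """Concatenate the element's own text nodes, ignoring child elements."""
    text = "".join(node.data for node in element.childNodes
                   if node.nodeType == node.TEXT_NODE)
    # Assumption: runs of whitespace are collapsed into single spaces.
    return " ".join(text.split())


def set_contents(element, text):
    """Replace all children (text and elements) with a single text node."""
    while element.firstChild is not None:
        element.removeChild(element.firstChild)
    element.appendChild(element.ownerDocument.createTextNode(text))


doc = parseString("<p>A <em>nested</em> tag</p>")
p = doc.documentElement

before = get_contents(p)      # the <em> child is ignored
set_contents(p, "plain text")  # the <em> child is deleted
after = p.toxml()
```

The deletion of child elements in set_contents is the reason why the DOM functions are the better choice when the structure below an element has to be preserved.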
The implementation of the model is mainly tied to the implementation of the user agent. We developed the extensible web browser Cineast [Koeppen et al. (1997)] to develop and evaluate novel approaches like AHDs. The main part of the Cineast is written in OTcl, which is also why we chose Tcl/OTcl as the scripting language for AHDs. The Cineast currently runs under Unix, but its main parts can be ported to other operating systems and the model for the AHDs is platform independent.
The basis of the Cineast is the prototyping environment Wafe [Neumann and Nusser (1993)]. It combines Tcl as a scripting language with different widget sets such as the Athena widget set or the Motif widget set. In addition, other libraries are linked into Wafe, among them SSLeay [Hudson and Young (1997)], LDAP [Howes and Smith (1997)] and OTcl. For the implementation of the Cineast, a special purpose widget called Kino [Koeppen (1996)] is integrated into Wafe to handle the parsing, layout and display of XML source text. It is implemented in C to achieve high performance. The roles of Wafe and the Kino widget in the different layers of the user agent are discussed below. Figure 6 shows an overview.
Most of the functionality required in the RE is provided by the Kino widget. The main task of the Kino widget in the RE is to parse any source text and to maintain an internal representation of the element tree. It also implements all of the DOM functions and the additional functions needed to access the functions and variables of the elements.
The Kino widget is made up of three components: the Parser, the Layouter and the Painter. In the RE, only the Parser is of interest. It produces a tree of parsed data (PData) elements. The Kino widget uses four different types of PData elements: a generic element, a box element which can contain children, a text element and an inset element which can be used to insert other widgets (text entry fields, push buttons, list boxes etc.) and images into the element tree.
The most interesting element is the box element. Besides its role as a structuring element to contain other elements, it holds the CSS attributes. The PData box elements correspond directly to the XML elements in the document, i.e. any XML element results in a PData box element. Figure 7 shows a part of a PData tree.
[Figure 7: PData tree]
Navigation in the PData tree is possible through the DOM; alternatively, the components of the PData elements can be used directly through their C pointers.
The Kino widget is extensible in two ways. First, the application can register a tag callback, which is called whenever a tag is encountered during the parsing process. In this callback, the application can for example modify the PData tree. The second callback handles the execution of scripts. The Kino widget calls this script callback whenever a script has to be executed, e.g. when an event occurs or a script is called via callFunc. This makes the Kino widget independent of the chosen scripting language because the execution of a script is handled by the application, in this case the Cineast. Figure 8 shows the possible interactions with the Kino widget.
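The division of labor between the widget and the application can be sketched with two registered callbacks. The following Python class is a hypothetical stand-in built on the standard expat parser, not the interface of the Kino widget: it reports every start tag to a tag callback and hands function code to a script callback without interpreting it.

```python
import xml.parsers.expat


class MiniWidget:
    """Hypothetical stand-in for a widget with a tag and a script callback."""

    def __init__(self, tag_callback, script_callback):
        self.tag_callback = tag_callback        # called for every start tag
        self.script_callback = script_callback  # called to execute a script
        self.funcs = {}                         # function name -> source text
        self._current = None                    # <func> currently being read

    def parse(self, source):
        parser = xml.parsers.expat.ParserCreate()
        parser.StartElementHandler = self._start
        parser.EndElementHandler = self._end
        parser.CharacterDataHandler = self._text
        parser.Parse(source, True)

    def _start(self, name, attrs):
        self.tag_callback(name, attrs)
        if name == "func":
            self._current = attrs["name"]
            self.funcs[self._current] = ""

    def _end(self, name):
        if name == "func":
            self._current = None

    def _text(self, data):
        if self._current is not None:
            self.funcs[self._current] += data

    def call_func(self, name):
        # The widget does not interpret the code; the application does.
        return self.script_callback(self.funcs[name])


seen_tags = []
widget = MiniWidget(
    tag_callback=lambda name, attrs: seen_tags.append(name),
    script_callback=lambda code: f"executed: {code.strip()}",
)
widget.parse('<label><func name="onclick">echo Mouse clicked!</func></label>')
result = widget.call_func("onclick")
```

Exchanging the scripting language only requires a different script callback; the parsing side stays untouched.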
The presentation layer for XML documents is implemented in the Layouter and Painter components of the Kino widget. They are not needed for the basic functionality of an AHD and can be omitted if the RE is to be incorporated into a web server.
The Layouter works together with a CSS database. The CSS database is built during the parsing process and contains all style information. Currently, the CSS database is implemented completely in OTcl. The Layouter positions every element so that no calculations are needed for display by the Painter.
Our implementation handles most of HTML 3.2, including important features like tables, images and forms. The internal layout model is box-oriented, so that a PData box results in either a block-level or inline element according to the CSS specification. In contrast to a simple flow-oriented model, boxes can be nested. Figure 9 shows the most important arrangements.
The box model allows the realization of more complex layouts such as tables. The current implementation supports neither incremental layout nor absolute positioning of boxes within a text flow.
The Kino widget cooperates closely with, and is embedded within, the infrastructure provided by Wafe and the Cineast browser.
The infrastructure requirements for the implementation of a browser are partly generic and partly Web specific. The generic requirements are usable in a wide range of applications. These generic components are provided either by Wafe or by libraries accessible through Wafe. For example, color and image management (for the image formats xbm, xpm, jpg, png and gif), event dispatching, resource management and module management are provided directly by Wafe. Generic components provided by libraries accessible through Wafe are, for example, the widget classes from OSF/Motif, basic networking facilities from Tcl, security algorithms and protocols from SSLeay, and compression from zlib.
The browser specific code is exclusively implemented in OTcl and is seen by Wafe as the Cineast source code, which is loaded by Wafe at startup time of the browser. The Cineast source code defines the typical browser semantics ranging from HTML, XML and CSS realization which can be separated from the Kino widget class (e.g. FORM tags, implementation of the CSS database), to [...] HTTP, IMAP, HTTPS, access to non-networked documents such as files), access control (basic authentication), to [...] URL-completion, request life-cycle management (starting, redirection, killing), monitoring, to [...] IMAP), MIME-type specific presenters, and [...]
One of the most demanding tasks in implementing a browser is to ensure that the browser does not block event handling. Since most of the components of the browser are not thread safe, multitasking is implemented through a select()-based event loop. All interaction and I/O has to be performed asynchronously. If an I/O operation were to block, the browser windows could not be refreshed, for example, or only a single transfer would be possible at a time. Furthermore, it would not be possible to provide incremental loading and display, which are highly useful over rather slow connections.
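The principle of such an event loop can be shown in Python with the standard select module. This is a minimal sketch and not the loop used in the browser: two local socket pairs stand in for two concurrent transfers, and the function names are hypothetical.

```python
import select
import socket


def run_event_loop(handlers, timeout=1.0):
    """Dispatch readable sockets to their handlers until none are registered.

    handlers maps a socket to a callable; a handler returns False once its
    transfer is complete, which unregisters the socket.
    """
    while handlers:
        readable, _, _ = select.select(list(handlers), [], [], timeout)
        if not readable:
            break  # nothing arrived within the timeout; stop waiting
        for sock in readable:
            if not handlers[sock](sock):
                del handlers[sock]


def make_reader(chunks):
    """Build a handler that collects incoming data incrementally."""
    def on_readable(sock):
        data = sock.recv(4096)
        if not data:  # the peer closed the connection: end of data
            return False
        chunks.append(data)
        return True
    return on_readable


# Two local socket pairs stand in for two concurrent transfers.
a_send, a_recv = socket.socketpair()
b_send, b_recv = socket.socketpair()
a_chunks, b_chunks = [], []
a_send.sendall(b"first document")
a_send.close()
b_send.sendall(b"second document")
b_send.close()

run_event_loop({a_recv: make_reader(a_chunks), b_recv: make_reader(b_chunks)})
a_recv.close()
b_recv.close()
```

Each handler reads only what is available when select() reports the socket as readable, so neither transfer can stall the other, and the collected chunks could be passed on for incremental display as they arrive.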
The Cineast source code logic is largely determined by this asynchronous event handling. As a consequence of the asynchronicity, various handlers and callbacks have to be registered, for example to process incoming data from sockets or to process user events in various windows and to associate the events with the corresponding tasks. Every handler must have enough context information to continue a task in the correct environment. The Cineast browser uses widget IDs for the context selection for GUI purposes and OTcl objects in the other cases. For example, for every request that is started a new OTcl object is created which registers I/O callbacks on the input side to process the input data incrementally and reports important state changes in its life cycle to the presentation objects (typically image or HTML/XML text): when the object is created, a MIME type is determined, some new data is available, the end of data is reached, or the request was killed.
As stated earlier, the RE incorporated in the browser provides the basic platform for implementing the state and behavior components of AHDs. We will now show how a document-centered application based on AHDs differs from purely server- or client-centered approaches. We will use the selection and purchase of goods over the internet as an example.
As the central document, the catalog with the order form can be identified and implemented as an AHD. The life cycle of the catalog AHD begins with the transfer of the catalog into the RE of the user's web browser. The first event the catalog AHD will receive is the onload event defined in the IEM. Upon reception of this event, the catalog AHD can update its contents, e.g. recalculate prices of special offers. We assume that the user now disconnects from the internet and saves the catalog AHD locally as a file, which is achieved through the RE's toText function (introspection is necessary for this task). The disconnect is not strictly necessary but emphasizes the strengths of document-centered applications.
The catalog AHD can later be loaded from the local file and will reside in the web browser's RE where it can react to user input like entering amounts or selections of goods. When the user is finished making his selection, he can save the new state of the catalog AHD and the order form again to a file. If he decides to transmit his order to the online store, the catalog AHD generates an order AHD. The order AHD is transferred to the online store using an HTTP PUT request, where it will be saved in a file on the online store's web server, ready for further processing.
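The final transfer step could look like the following Python sketch, which only prepares the PUT request with the standard urllib.request module and does not send it. The store URL, the content type and the order document are hypothetical values chosen for illustration.

```python
import urllib.request


def build_order_request(url, order_xml):
    """Prepare (but do not send) an HTTP PUT request carrying an order AHD."""
    return urllib.request.Request(
        url,
        data=order_xml.encode("utf-8"),
        headers={"Content-Type": "text/xml"},  # assumed content type
        method="PUT",
    )


# Hypothetical store URL and order document.
order_xml = '<order id="o80"><buyer id="p80">John Doe</buyer></order>'
request = build_order_request("http://store.example/orders/o80.xml", order_xml)

# Sending the request would be: urllib.request.urlopen(request)
```

The body of the request is simply the textual representation of the order AHD, so the server can store it unchanged.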
Ideas for applications which are based on web techniques such as HTML and CGI were presented early in the history of the web. In [Houh et al. (1994)], the authors already point out the need for more control on both the client and server side. Related research can also be found in [Kaashoek et al. (1994)], where the Mosaic web browser is extended to be able to execute Tcl scripts. However, a more formal model is not presented.
Our proposed model for Active Hyperlinked Documents and the implementation of the overall system differ in two key points from existing approaches:
[...] AHD is independent of commercial influence and easy to use and validate for others,
[...] AHD-based application.
More important, however, are the next steps that have to be taken to prove the potential of AHDs. First, a formal model for the use of AHDs in distributed applications has to be developed. Starting points can be frameworks like [Muehlbacher and Neumann (1996)], where the potential of active documents as tools for collaboration is discussed as an alternative to legacy systems. Prerequisites for collaboration are coordination issues [Tolksdorf et al. (1996)] between AHDs, which are not discussed here.
From the architectural point of view, the distinction between user agents and web servers is not necessary. We will work towards a clearly defined RE that can be incorporated into classical web servers or other applications as well.
Lastly, security issues are not addressed in this paper. We believe that AHDs have a huge potential in electronic commerce (especially when they are combined with technologies like Cryptolopes [IBM (1996)]) or in intranet applications in combination with role based access control [Neumann and Nusser (1997)].
[Berners-Lee et al. (1996)] T. Berners-Lee, R. Fielding, H. Frystyk: Hypertext Transfer Protocol - HTTP/1.0, Informational RFC, RFC 1945, http://www.w3.org/Protocols/rfc1945/rfc1945, May 1996.
[Bray et al. (1997)] Tim Bray, Jean Paoli, C. M. Sperberg-McQueen: Extensible Markup Language (XML), W3C Working Draft, http://www.w3.org/TR/WD-xml, November 1997.
[Byrne (1997)] Steve Byrne: Document Object Model (Core) Level 1, W3C Working Draft, http://www.w3.org/TR/WD-DOM/level-one-core-971009, October 1997.
[Gosling and McGilton (1996)] James Gosling, Henry McGilton: The Java Language Environment, http://java.sun.com/docs/white/langenv/, May 1996.
[Goldfarb (1990)] Charles F. Goldfarb: The SGML Handbook, Oxford University Press, Oxford 1990.
[IBM (1996)] IBM Corp.: IBM Cryptolope Home, http://www.cryptolope.ibm.com/, 1996.
[Hudson and Young (1997)] T. J. Hudson and E. A. Young: SSLeay and SSLapps FAQ (Draft), http://www.psy.uq.edu.au/~ftp/Crypto/, September 1997.
[Lie and Bos (1996)] Håkon Wium Lie and Bert Bos: Cascading Style Sheets, level 1, W3C Recommendation, http://www.w3.org/TR/REC-CSS1, December 1996.
[Netscape (1997a)] Netscape Communications Corp.: Plug-In Guide, http://developer.netscape.com/library/documentation/communicator/plugin/index.htm, May 1997.
[Netscape (1997b)] Netscape Communications Corp.: JavaScript Reference, http://developer.netscape.com/library/documentation/communicator/jsref/index.htm, October 1997.
[OMG (1997)] Object Management Group: The Common Object Request Broker: Architecture and Specification, ftp://ftp.omg.org/pub/docs/formal/97-10-01.pdf, August 1997.
[Raggett (1997)] Dave Raggett: HTML 3.2 Reference Specification, W3C Recommendation, http://www.w3.org/TR/REC-html32.html, January 1997.
[White (1996)] Jim White: Mobile Agents, White Paper, http://www.genmagic.com/agents/Whitepaper/whitepaper.html, 1996.