The engine developed and demonstrated here supported web content management, co-operative authoring of individual documents, and collaboration in general over the World Wide Web. The version that appeared here was the latest to have been tested successfully on the testbed servers. The testbed servers were used to test code before it was placed on the public server, and also to run tests that were better run on a standalone network, such as load and stress testing, because they might have interfered with other computers if run over the Internet.
The engine demonstrated the ability for people to edit documents over the Web, from any location, at any time. It stored documents in the local file system of the computer hosting the engine, and served pages over the Web, with the ability for anyone to fully edit them while keeping full version history. The engine also allowed documents to be stored in the web server's document tree, so that the web server could serve them as if they were local static documents. This allowed the documents to be read without regard to the status of the engine, giving better reliability than serving the pages dynamically every time.
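The storage idea described above can be sketched in miniature. This is a simplified, in-memory illustration, with class and method names that are invented for this sketch rather than taken from YEdit's actual code: every save appends to a per-page revision history, while a separate "document tree" always holds the latest revision, ready to be served as a static page even when the engine itself is unavailable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory sketch: full version history per page, plus a "document
// tree" holding the latest revision of each page as its static copy.
class VersionedPages {
    private final Map<String, List<String>> history = new HashMap<>();
    private final Map<String, String> documentTree = new HashMap<>();

    void save(String page, String content) {
        // Append to the page's revision history.
        history.computeIfAbsent(page, k -> new ArrayList<>()).add(content);
        // Publish the latest version as the static copy.
        documentTree.put(page, content);
    }

    String latest(String page) { return documentTree.get(page); }

    String revision(String page, int n) { return history.get(page).get(n); }

    int revisionCount(String page) {
        List<String> revs = history.get(page);
        return revs == null ? 0 : revs.size();
    }
}
```

In a real deployment the `documentTree` map would correspond to files written into the web server's document tree, so that the server can continue to serve the latest versions with no engine involvement.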
The engine allowed the free mixing of ideas by allowing anyone to edit any page. People who wanted to add their ideas only needed to browse the web site and start editing (whether creating new pages or editing current ones) to contribute new ideas or enhance those already present. Of course, should a Webmaster want to make a particular area read-only, or restrict access to a select few, the normal access provisions of the web server can be used. This means that YEdit can be used for publicly editable content, public read-only content, and private content, using the normal access controls that the web server provides.
All of this builds up to a vision of the future for the Web that allows groups the freedom to work together not only to provide information over the Web, but to create new information at the same time. That is, to allow information to flow not only from the web server to the web surfer, but between groups of web surfers and web servers. The engine enabled this flow in all directions by allowing people to co-operatively author, and collaborate on, documents whose creation they wished to share. They could receive comments and additions made as part of the actual document, rather than through some other communication channel, such as comment forms or email.
One of the early possibilities investigated for this thesis was to use web themes on the web server to allow a split between content and context information. This split allows multiple ways of viewing information, all from one content base, that is, only one file to update that contains the content for the page. The context information (images, buttons, other graphics, etc.) could be linked to, and the appropriate version appears when someone views the web page. For example, if the user is browsing in text-only mode, all the graphics are replaced with their text equivalents, or if it is near a particular time of year, the default images and graphics could be changed to match that time of year. If the user prefers a particular theme or colour scheme, and the web site supports that theme, the user can view the site using it. Because this is all done on the server side, no requirements are placed on the user, so anyone can use it. Prototypes were created, first in 'C' and then in Java, to test the idea and see if it would work. These themes allow users to choose the method of display that is most appropriate and convenient for them, with only one web page for the Webmaster to keep up to date. Thus users can view the page as they wish, and the Webmaster only has to keep one set of information up to date. The prototypes showed that the idea could work with fuller integration into the web server, rather than being called via SSIs or Servlets as was done in the prototypes. (SSIs, Server Side Includes, let an author embed special 'commands' into HTML; when the server reads an SSI document, it looks for these commands and performs the necessary action. For example, there is an SSI command that inserts the document's last modification time: when the server reads a file containing this command, it replaces the command with the appropriate time.)
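The core of the theming idea can be sketched as a simple server-side substitution. The `{{token}}` placeholder syntax below is assumed purely for illustration; the actual prototypes worked through SSI commands and Servlets. One content file is kept, and the theme chosen by (or appropriate for) each viewer supplies the substitutions:

```java
import java.util.Map;

// Server-side theme substitution sketch: one content file, many views.
// Each theme maps placeholder names to its own rendering of them.
class ThemeFilter {
    static String apply(String content, Map<String, String> theme) {
        String result = content;
        for (Map.Entry<String, String> e : theme.entrySet()) {
            result = result.replace("{{" + e.getKey() + "}}", e.getValue());
        }
        return result;
    }
}
```

A text-only theme would map each graphic placeholder to its text equivalent, while a graphical or seasonal theme would map the same placeholders to the appropriate images, so the Webmaster maintains only the single content file.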
This idea was considered for use in this thesis, but at the time (circa 1998) others were proposing templates and similar mechanisms for a similar job. Since then, other technologies for separating content from context, such as XSL/XML, JSP/XML, and JSP/HTML, have matured. Unfortunately, it is impossible to tell whether these technologies are widely used (or used at all), because the entire interaction happens on the web server, before the user sees the page.
Along with web themes, multi-tier and multi-homed web sites were looked at early on. The thinking behind multi-tiered and/or multi-homed web sites is that if one part of the site fails or is lost in some manner, another part can take over at least some of the responsibilities of the failed part: that is, to display and/or store information so that it is both stored safely and accessible. This had potential to be useful by itself, or in combination with other ideas. Using multi-tiered/multi-homed web sites in combination with ideas such as the free-flowing web of information allows for a more robust system, because it tolerates problems with the dynamic web site (which may be encountered in any newly designed system as a result of unexpected situations): most people will be viewing the web pages from the static read-only web site, while editing is done on the dynamic site. This allows both the read and edit parts of the dynamic web site to crash while inconveniencing only a small proportion of users. The level of service and information replicated over the different locations determines the level of redundancy offered by a multi-tier/multi-homed web site. The ability to access the latest version of a web page is the most important one to have available, as most people are probably surfing the Web, not expecting to be able to edit it. This is backed up by the access patterns observed when the engine was tested here. In the 2.5 months the test lasted, people viewed 2729 web pages on the web site while making only 25 page edits, so edits accounted for approximately 1% of page views. This shows that if there are any problems, the way to minimise the impact is to make sure that the majority of people can continue doing what they were doing, and that is surfing the Web (in this case, viewing the read-only pages).
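The quoted access-pattern figures can be checked directly: 25 edits against 2729 page views works out to just under one per cent.

```java
// Edit fraction from the observed access patterns: 25 edits against
// 2729 page views over the 2.5-month test, i.e. roughly 0.9%.
class AccessRatio {
    static double editFraction(long edits, long views) {
        return (double) edits / views;
    }
}
```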
This means that the most used part of the engine, and therefore the part most important to replicate and keep redundantly available, is the ability to view the latest version of the web pages. This split between reading the latest version of a web page and browsing or editing all of its versions is used in the YEdit engine to allow a multi-tier or multi-homed system to exist, and allows for the redundant storage and viewing of the latest versions. Another reason for the split and redundant availability of the web pages is so that any problems in the more complicated subsystem detract as little as possible from the overall system. There are other multi-tier/multi-homed approaches that can be used (and could be used in conjunction with this engine), such as load balancing between machines, mirror web sites, and lately Akamai Technologies (who have web servers located globally, and replicate web sites to the servers closest to users).
The engine was created in Java, using Java Servlets. At the time, Java was a very new language, but it had good support and several advantages over other languages, such as Perl or C/C++, for server-side applications. There are some very important considerations when writing applications that are to run on a web server, including robustness, the abilities of the language, portability, support, and general susceptibility to errors, especially buffer overruns, as they are one of the most common exploits. One of the things that was important for YEdit was the dynamic object nature of Java. C++ has similar abilities, but lacks the easy portability and can have problems with security (especially buffer overruns). Perl seemed to lack some of the needed abilities: it places a heavy load on the web server unless the server supports persistence for Perl (such as "mod_perl" in Apache), and offers less opportunity for dynamic object-oriented code loaded on the fly. Java Servlets stood out as a very good match for the needed abilities, with the advantage of a syntax similar to C/C++, which eased the initial learning of Java. As Java was such a young language at the time, it was important that it was chosen because it did what was needed, as has been the case. Today it is a much more accepted language, and over the past years the support for Java Servlets in the commercial web server market has increased, although it is still very low compared with other server-side technologies. There are now a reasonable number of web hosts to choose from, whereas when this work started there were hardly even a handful, and if you wanted Java Servlet support it was almost a case of setting it up yourself (which actually is not all that hard).
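The "dynamic object-oriented code loaded on the fly" ability that favoured Java can be illustrated in a few lines: a component can be named at run time, as a string, and instantiated by reflection, without recompiling or restarting the engine. The class loaded below is a stand-in from the standard library, not one of YEdit's own classes.

```java
// Run-time loading sketch: resolve a class by name and instantiate it
// via its no-argument constructor. A server can use this to plug in
// new components without being recompiled.
class OnTheFlyLoader {
    static Object load(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("Could not load " + className, e);
        }
    }
}
```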
The engine was flexible enough to be used for a wide range of document types, from web pages, to word processor documents, to any kind of document that one wishes to edit co-operatively. It also had the capability to store the documents in any kind of repository for which an interface could be created. This potentially allowed it to access and store information in any current document storage system, and to serve the stored documents from that system to any document-editing facilities available. This allowed any group or business to have the full advantages of content management while keeping prior investments in legacy technology intact. It enhances their ability to advance the technology they use to the latest available, without many of the normal side effects of changing the software that people are used to; those side effects are mitigated because most users can continue to use the software they are familiar with, while the software working in the background is advanced and upgraded. Because the user interface and storage interfaces of the engine were independent, there was no reason that a particular storage system had to be used with any particular software a user was using, which allows the storage system to be completely upgraded with little impact on users. This de-coupling of the storage system from the user-editing program also has the side effect of allowing a user interface for the engine to be created that can automatically convert documents to a standard format for storage. This has the benefit of allowing user-editing programs that store files in different formats to be used, accessing the documents transparently to the user: a document may have been edited in a completely different editing program, as the engine could provide the document in a particular program's native format, without the need for the user to intervene.
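The de-coupling described above amounts to programming against an interface rather than a concrete storage system. The interface and class names below are hypothetical, chosen for this sketch: the editing side talks only to `DocumentStore`, so the back end can be swapped, say from an in-memory store to files or a database, without the user-facing code changing at all.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical storage abstraction: the engine's editing side depends
// only on this interface, never on a concrete storage system.
interface DocumentStore {
    void put(String name, String content);
    String get(String name);
}

// One possible back end; a file-based or database-backed implementation
// could replace it without any change to code using DocumentStore.
class MemoryStore implements DocumentStore {
    private final Map<String, String> docs = new HashMap<>();
    public void put(String name, String content) { docs.put(name, content); }
    public String get(String name) { return docs.get(name); }
}
```

A format-converting implementation could also sit behind the same interface, translating documents to a standard storage format on `put` and back to a program's native format on `get`, transparently to the user.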
The engine was flexible enough to support future expansion and development in methods for storing and serving documents. It was designed to work over the Web to begin with, to test and evaluate possibilities for use and to allow people to become familiar with this type of system. The Web was also expected to be the most used interface, so it was the first one created. Future developments for storing documents include using databases to store the documents, and the possibility of storing documents in source code revision storage facilities, such as CVS, for advanced version control. Future developments for serving documents include a simplified web-editing ability (for those who would rather have a simpler way to edit documents), serving and editing documents using WebDAV (an open distributed authoring and versioning protocol), WAP (browsing simplified web pages on mobile devices), and serving documents directly to users' favourite document-editing programs, using their native formats.
The major goal of this research was to see whether it was possible to create a system that would handle co-operative authoring and content management over the Web, and if so, to create it. As a secondary goal, the system should be flexible enough to potentially handle any kind of document, any kind of storage system, and any kind of user application to access it. Successful implementation of the engine has demonstrated the viability of those ideas. The secondary goal was also achieved, allowing any kind of document to be stored in any kind of storage system for which an interface has been created, and accessed by any kind of user program for which an interface has been created. Now that the engine has been created, with interfaces for editing over the Web, it would have been appropriate to create interfaces to other applications, so that the engine could be used by a variety of different applications, and interfaces to a variety of different storage systems, so that when people first hear about this, they know that it can interface seamlessly with their current set-up. This would have allowed them to use the software that they want to, rather than being locked into any legacy system.
Further research could look into how a system such as YEdit would affect the use of authoring software in groups, especially from the point of view of businesses, as the engine has implications for the use of both legacy and current software. Other research areas could include how people use the interfaces (where they are visible to the end user), and whether the interfaces should be completely transparent or should give some sign to the user that they are being used. There could also be research into the refinement of the Web interface, and of web page editing interfaces in general, for editing documents live on the Web. This would have implications not only for YEdit, but also for web editing in general, and may have a flow-on effect on other editing interfaces.
Other things were created in the process of conducting this research that are not directly related to its aim, such as the design and construction of the web pages that support the engine. These web pages inform people about the uses and abilities of the engine, and they can be enhanced in the future. Along with enhancing the web pages, further promotion of http://www.YEdit.com/ can proceed, especially with regard to users who may wish to use the abilities of the engine to enhance their co-operative authoring, content management and collaboration.
The YEdit engine has succeeded in demonstrating that co-operative authoring and content management of individual documents over the Web, and collaboration in general, are possible, and that it is possible to create an engine general enough to support a wide variety of document types. The front end could interface with many different applications, allowing people to work in the manner that best suits them, and the storage interface could store documents in any locations appropriate for the documents concerned. The main user interface was through the Web, which means that it could be widely used and tested without the need to install specific software on users' computers, allowing a broad range of people to try it out and decide whether it would be useful for them. The engine also allowed a wide range of document storage systems to be used. The combined abilities to handle any type of document, interface with any kind of document storage system, and interface with users' most-used applications would have allowed the engine to enhance even legacy systems with the maximum of ease and usefulness.