Automatic web usability evaluation: what needs to be done?
By Giorgio Brajnik
Abstract
Website redesign and maintenance are likely to absorb more and more resources as web technologies and uses keep evolving at the current pace. Usability evaluation methods need to be run after each change in order to ensure a decent quality level. The means to control the complexity and cost of website maintenance lies in tools performing automatic usability evaluations. I present a survey of tools that analyze websites, illustrating what kind of automatic tests they perform and which usability factors the tests are more closely related to. The survey then leads to an analysis of the still remaining gaps and of research openings.
1. Introduction
It is well known that the average quality of websites is poor, “lack of navigability” being the #1 cause of user dissatisfaction [Fleming, 1998; Nielsen, 1999].
On the one hand, web technologies evolve extremely fast, enabling sophisticated tools to be deployed and complex interactions to take place. The life cycle of a website is also extremely fast: maintenance of a website is performed at a higher rate than that of other software products because of market pressure and the lack of distribution barriers. Moreover, the scope of maintenance often becomes so wide that a complete redesign takes place.
On the other hand, the quality of a website is rooted in its usability, which usually results from the adoption of user-centered development and evaluation approaches [Newman and Lamming, 1994; Fleming, 1998; Rosenfeld and Morville, 1998; Nielsen, 1999]. Usability testing is thus a necessary and repeated step during the life cycle of a website.
To test the usability of a website a developer can adopt two kinds of methods: usability inspection methods (e.g. heuristic evaluation [Nielsen and Mack, 1994]) or user testing [Nielsen, 2000]. Heuristic evaluation relies on a pool of experts who inspect and use (part of) a website and identify usability problems that they assume will affect end users. With user testing, a sample of the user population of the website is selected and asked to use (part of) the website and to report things that they think did not work or are not appropriate.
Even though the cost (in terms of time and effort) of both methods is not particularly high, and their application improves website quality and reduces the overall development cost, they are not systematically performed at a detailed level on every part of a website after each maintenance or development step.
It is clear that as change actions on a website increase rapidly in number and variety, more and more resources need to be deployed to ensure that website quality does not decrease (but hopefully increases). It is also clear that any tool that can, at least in part, automate the usability evaluation and maintenance processes will help to fill this ever widening gap.
The goal of this paper is to present a brief survey of what these tools do and how they contribute to the usability evaluation problem. From the analysis it appears that gaps exist between what these tools achieve and what is required to ensure usability. While some of these gaps are inherently unsolvable, others can probably be filled, provided that additional research is carried out to identify effective techniques.
2. A software engineering view of a website
A website is an interactive software system. It interacts with at least two different kinds of users: end users trying to achieve some goal and developers/maintainers striving to keep the system working and improving it.
End users can be characterized in terms of:
• goals and tasks: e.g. information seeking, choosing where to buy a specific product, buying it, writing a book review, etc.
• context: user behavior during information seeking is strongly affected by the users’ culture, language, previous knowledge of the field, and experience in using the web.
• technology: end users interact with the website through a layer of technology that is not under the control of the web designer: browsers, protocols, plug-ins, operating system platforms, interaction devices (screens, speaking devices, pens, reduced telephone keyboards, etc.), and network connections.
Information seeking through browsing is a process that almost all websites must support. Unfortunately, it is also a difficult task to model and support because it encompasses complex cognitive, social and cultural processes [Allen, 1996], spanning the interpretation of textual, visual and audio messages, the selection of relevant information, and learning.
On the other hand we have developers and maintainers. Amongst their activities, a prominent role is played by corrective maintenance (i.e. fixing problems with the website behavior or inserting missing contents), adaptive maintenance (i.e. upgrading the site with respect to new technologies, like new browsers’ capabilities), perfective maintenance (i.e. improving the site behavior or content), and preventive maintenance (i.e. fixing problems in behavior or content before they affect users). A large fraction of these activities is aimed at detecting system failures (that is, departures from the required behavior), analyzing them and identifying faults (that is, representations within the system of human errors that occurred during development, i.e. bugs).
Maintenance is meant to improve the quality of the website. ISO9126 defines quality as “the totality of features and characteristics of a software product that bear on its ability to satisfy stated or implied needs” and it includes properties like maintainability, robustness, reliability and usability that are particularly important for websites.
Usability can be defined (ISO9241) as “the effectiveness, efficiency and satisfaction with which specified users achieve specified goals in particular environments”, where:
• effectiveness means “the accuracy and completeness with which specified users can achieve specified goals in particular environments”,
• efficiency means “the resources expended in relation to the accuracy and completeness of goals achieved”, and
• satisfaction means “the comfort and acceptability of the work system to its users and other people affected by its use”.
General properties like these are not independent: for example, a robustness failure of a website (e.g. some browser incompatibility) will result also in a usability failure (e.g. user inability to complete a task and dissatisfaction).
In order to be operationalized these properties need to be decomposed into more detailed ones that can be assessed in a simpler and perhaps more standard way. For example, maintainability can be decomposed into complexity of the DHTML code, its size, the number of absolute URLs, etc.
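As a rough illustration of such a decomposition, the sketch below computes three simple internal attributes of a single page: its size, the number of tags, and the number of absolute URLs. The choice of attributes that carry URLs (`href`, `src`, `action`), the function name `measure` and the metric names are assumptions made for this example only; they are not taken from any of the tools discussed in this paper.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Attributes assumed to carry URLs (an illustrative, incomplete list).
URL_ATTRS = {"href", "src", "action"}


class MaintainabilityMetrics(HTMLParser):
    """Collects a few simple internal attributes of one HTML page."""

    def __init__(self):
        super().__init__()
        self.tag_count = 0
        self.absolute_urls = []

    def handle_starttag(self, tag, attrs):
        self.tag_count += 1
        for name, value in attrs:
            # A URL is counted as absolute when it names an http(s) scheme.
            if name in URL_ATTRS and value and urlparse(value).scheme in ("http", "https"):
                self.absolute_urls.append(value)


def measure(html):
    """Return size, tag count and absolute-URL count for an HTML string."""
    parser = MaintainabilityMetrics()
    parser.feed(html)
    parser.close()
    return {
        "size_bytes": len(html.encode("utf-8")),
        "tag_count": parser.tag_count,
        "absolute_urls": len(parser.absolute_urls),
    }


if __name__ == "__main__":
    page = (
        '<html><body>'
        '<a href="http://example.com/a.html">A</a>'
        '<a href="b.html">B</a>'
        '<img src="https://example.com/x.gif">'
        '</body></html>'
    )
    print(measure(page))
```

Counts like these say nothing by themselves about quality; they only become useful once a threshold or a trend over successive versions of the site is agreed upon.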
The same applies to usability. It can be described in terms of usability factors (like speed of use, error rate, ease of error recovery, etc.) which in turn can be reduced to other lower-level properties. The most important properties for website usability include those related to “navigability” (most of them taken from [Fleming, 1998]):
• consistency of presentation and controls
• adequate feedback
• natural organization of the information (systematic labels, clear hierarchical structure)
• contextual navigation (in each state all and only the possible navigation options are available)
• efficient navigation (in terms of time and effort needed to complete a task)
• clear and meaningful labels.
Other properties relevant to usability of a website are:
• robustness (i.e. how well the website handles technology used by users that has not been foreseen by developers)
• flexibility (for example: availability of graphic and textual versions, redundant indexes and site maps, duplicated image map links)
• functionality (i.e. support of users’ goals)
The latter can be further decomposed if we narrow users' goals. For e-commerce sites, for example, other relevant attributes can be:
• how security is handled and how easy it is to get information about it
• similarly for privacy
• how easy and effective it is to find the desired item
• how easy and effective it is to search the catalog for an item not known a priori
• how easy and effective it is to preview an item
• what are the return policies and how they are communicated
The Web Accessibility Initiative [W3C, 2000] is an effort by the W3C to improve website accessibility. It publishes a set of guidelines [WAI, 1999] in which accessibility is defined as the ability of a website to be used by someone with disabilities. An accessible website:
• ensures graceful transformation: it should remain accessible despite physical, sensory and cognitive disabilities, work constraints and technological barriers;
• makes content understandable and navigable: it should present its content in a clear and simple language, and should provide understandable mechanisms to navigate within and between pages.
While usability implies accessibility (at least when an unconstrained user population is considered), the contrary is not necessarily true. For example, a missing link to the home page may be a fault affecting usability, while it does not affect accessibility.
All these properties (whether related to usability or to accessibility) may be further decomposed into more detailed ones that refer to specific attributes of the website implementation. Actually, such a decomposition has to be done in order to support usability inspection methods and to identify and fix faults. For example, to determine how flexible a website is, we need to inspect the implementation (or perhaps the design specifications) to determine if there is a textual version of the page, if there are textual links that duplicate those embedded in images, etc.
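The flexibility check mentioned above (textual links that duplicate those embedded in images) can be approximated by a small script. The following is a minimal sketch under simplifying assumptions: only `<a href>` elements are considered (image maps are ignored), two links are treated as duplicates when their `href` values are identical strings, and the names `LinkCollector` and `image_links_without_text_duplicate` are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Records, for each <a href>, whether it contains an image and/or text."""

    def __init__(self):
        super().__init__()
        self.links = []        # dicts with keys "href", "has_img", "text"
        self._current = None   # the link currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self._current = {"href": attributes["href"], "has_img": False, "text": ""}
        elif tag == "img" and self._current is not None:
            self._current["has_img"] = True

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.links.append(self._current)
            self._current = None


def image_links_without_text_duplicate(html):
    """Return hrefs reachable only through image links on this page."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    # Targets that are reachable through at least one link with visible text.
    textual = {link["href"] for link in collector.links if link["text"].strip()}
    image_only = {
        link["href"]
        for link in collector.links
        if link["has_img"] and not link["text"].strip()
    }
    return sorted(image_only - textual)


if __name__ == "__main__":
    page = (
        '<a href="home.html"><img src="home.gif"></a>'
        '<a href="news.html"><img src="news.gif"></a>'
        '<p><a href="home.html">Home</a></p>'
    )
    print(image_links_without_text_duplicate(page))
```

A check of this kind inspects the implementation only; whether the text of a duplicate link is meaningful to users still requires human judgment.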
Some of these lower-level properties refer to attributes that depend only on how the website has been designed and developed (e.g. textual duplicates of links embedded in images): these are internal attributes. Others depend on the website and its usage (e.g. how meaningful a label is): these are external attributes. The latter is always the case for properties referring to content, which require some sort of interpretation that assigns meaning to symbols in order to be assessed.
While both internal and external attributes are needed to evaluate the usability of a website, only the former are amenable to automatic testing. External attributes can be evaluated only via semi-automatic means that entail a human evaluation step. However, tools can provide useful assistance by filtering and ranking content that is potentially relevant (for example, by adopting statistical techniques developed in Information Retrieval [Belkin and Croft, 1987]).
3. Automatic tools for usability evaluation
Tools that support the developer/maintainer in finding usability faults and fixing them can be classified according to:
• location: web-based vs off-line
• type of service: failure identifiers (they discover potential failures via simulation of user actions, like filling a form; sometimes they rank them according to severity); fault analyzers (they find failures and highlight their causes, i.e. faults; usually they systematically analyze the source code of the website; sometimes ranking the list of faults according to their severity); analysis and repair tools (they assist the developer also in fixing the faults)
• information source: automatic usability analysis can be performed on the basis of the actual implementation of a website (sources), or on webserver logs, or data acquired during user testing (user testing data); this paper deals only with tools analyzing website sources
• scope, i.e. the set of attributes that are considered during the automatic analysis. A classification based on scope is:
  a) HTML validators and cleaners (they assist in removing non standard usage of the language)
  b) HTML/graphic optimizers (they improve downloading and rendering performance by recoding certain parts of HTML or graphic documents)
  c) link checkers (they probe all the links leaving a page to determine if their targets exist)
  d) usability tools (they detect and sometimes help to fix usability faults).
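To make the simplest of these scopes concrete, the sketch below shows the core of a link checker: it extracts the links leaving a page, resolves them against the page URL, and asks whether each target exists. The existence test is passed in as a function (`target_exists`, a hypothetical name) so that the example stays self-contained; an actual tool would issue HTTP requests at that point, and none of the tools surveyed here is claimed to work this way.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class HrefExtractor(HTMLParser):
    """Collects the href values of all <a> elements in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def broken_links(page_url, html, target_exists):
    """Return the resolved link targets for which target_exists() is false.

    target_exists is supplied by the caller (an assumption of this sketch);
    it takes an absolute URL and returns True or False.
    """
    extractor = HrefExtractor()
    extractor.feed(html)
    extractor.close()
    broken = []
    for href in extractor.hrefs:
        # Resolve relative links and drop the "#fragment" part.
        target, _fragment = urldefrag(urljoin(page_url, href))
        if urlparse(target).scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        if not target_exists(target):
            broken.append(target)
    return broken


if __name__ == "__main__":
    # A stand-in for the web: the set of pages assumed to exist.
    site = {"http://example.org/index.html", "http://example.org/about.html"}
    page = (
        '<a href="about.html#team">About</a> '
        '<a href="missing.html">Gone</a> '
        '<a href="mailto:someone@example.org">Mail</a>'
    )
    print(broken_links("http://example.org/index.html", page, site.__contains__))
```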
In the rest of the paper I will discuss only tools with scope d), since it is the most general one. At the moment the following tools have been developed and are available (or will soon be available) on the web:
• A-Prompt: developed by the University of Toronto [ATRC, 1999]; off-line, with ranking; does fault analysis and repair
• Bobby: available from CAST [CAST, 1999]; web-based and off-line, with ranking; fault analyzer
• Doctor HTML: available from Imagiware [Imagiware, 1997]; web-based and off-line; fault analyzer
• LIFT: available from UsableNet.com [Usablenet, 2000]; web-based and off-line, with ranking; fault analyzer and repair tool
4. The test effectiveness problem
While these tools offer a test suite that is reasonably wide and open, at the moment there is no standard way to assess the usability of the tools themselves. This is particularly true of their effectiveness, that is, how accurate the tests they run are. Determining the means to measure and evaluate test effectiveness is an important requirement, from both research and pragmatic viewpoints. In fact, a standard tool evaluation methodology:
• could be used to assess validity of each test and consequently each tool;
• could be used to compare effectiveness of different tools;
• could be used to define standard levels of effectiveness, that might then automatically reflect on standard usability levels of websites that have been passed through certified tests;
• could provide insights for a proper interpretation of the results produced by tests (what can be the consequences of the problems identified and fixed by tools).
The research on web usability and accessibility guidelines [WAI, 1999; Scapin et al., 2000] is a first step towards such a methodology. But more is needed to define a proper methodology.
An evaluation methodology, given the fast evolution pace of web technologies and uses, can probably only be based on experiments comparing test results with results obtained through other usability evaluation methods, namely usability inspection methods and user testing.
It should specify a set of tests (by identifying possible usability failures and related faults), how test effectiveness is to be measured and how the experiment should be performed (what kind of user testing, what kind of questionnaires or data acquisition methods should be adopted, etc.) in order to be valid. The Goal-Question-Metrics approach [Fenton and Lawrence Pfleeger, 1997] could be followed as a framework to define such a methodology.
Notice that even though many tests are likely to yield false positives, the major consequence of this is a reduced productivity of the maintainer (who has to cope with incorrect information). In my view, it is more important to define effectiveness in terms of the number of false negatives, that is, cases where the automatic tool was not able to identify a fault that was instead uncovered by other means.
Test sites could be set up where specific faults are injected with the purpose of exercising certain tests. Tools then could be evaluated on the basis of the number of faults that they uncover.
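The evaluation described here can be sketched as a comparison between the set of injected faults and the set of faults a tool reports. The fault identifiers (pairs of page and fault type), the function name `effectiveness` and the example data below are all hypothetical; the sketch only illustrates how false negatives and a detection rate could be computed once such a test site exists.

```python
def effectiveness(injected, reported):
    """Compare faults injected into a test site with faults a tool reported.

    Both arguments are collections of hashable fault identifiers, here
    assumed to be (page, fault_type) pairs.
    """
    injected, reported = set(injected), set(reported)
    detected = injected & reported
    false_negatives = injected - reported   # injected faults the tool missed
    false_positives = reported - injected   # reports matching no injected fault
    detection_rate = len(detected) / len(injected) if injected else 1.0
    return {
        "detected": sorted(detected),
        "false_negatives": sorted(false_negatives),
        "false_positives": sorted(false_positives),
        "detection_rate": detection_rate,
    }


if __name__ == "__main__":
    # Hypothetical faults seeded into a test site.
    injected = {
        ("index.html", "image-without-alt-text"),
        ("index.html", "image-link-without-text-duplicate"),
        ("cart.html", "missing-link-to-home"),
    }
    # Hypothetical output of a tool run on that site.
    reported = {
        ("index.html", "image-without-alt-text"),
        ("cart.html", "absolute-url"),
    }
    result = effectiveness(injected, reported)
    print(result["false_negatives"])
    print(round(result["detection_rate"], 2))
```

Note that on a seeded site a report outside the injected set is not necessarily wrong: it may point to a genuine fault that was not deliberately injected, so false positives computed this way need a manual check.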
5. Conclusions
In this paper a brief survey of automatic usability evaluation tools for websites has been presented. These tools consider a large set of properties that depend on attributes of websites only (and not on the context in which websites are used, thus not considering their contents). Especially those supporting repair actions (in addition to identification of usability faults) have the potential to dramatically reduce the time and effort needed to perform maintenance activities.
Several tests are still not covered by these tools, even though they seem viable with currently available technology. In other cases, advancing the state of the art in automatic usability evaluation requires that the test effectiveness problem be formulated and solved, that is, the problem of defining a standard methodology for evaluating the effectiveness of these tools. This in turn requires that appropriate models for usability are defined.
 
   
 