There are several problems with the current state of the art in user interface evaluation. First, a wide range of evaluation methods has been proposed, each with its own body of advocates, but there is no common agreement on how the methods compare with one another. Second, existing methods either provide insufficient information or are too time-consuming to be practical in real cases. Finally, many methods are unreliable, with individual evaluators exerting undue influence on the results, and there is in general no way to translate results into solid recommendations for improving the interface under test. This paper surveys a representative subset of evaluation methods and suggests ways in which their differences may be resolved.