Usability Evaluation is a Heterogeneous Process

Many have guessed it, some have ignored it.

But my recent results are definite: usability defects differ in how easily they are detected, and in most cases evaluators differ in their skill at detecting them. The same holds for participants in usability testing studies.

If you read the old papers by Virzi and by Nielsen & Landauer on predicting the usability evaluation process, you find that they

  • argue that there are differences between defects and/or evaluators
  • but then forget about that and choose the cumulative geometric function as a prediction model, which has only a single parameter p
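For readers who have not seen it, the cumulative geometric function can be sketched in a few lines of Python. It assumes that every defect is found in every session with the same probability p, so the expected proportion of defects found after n independent sessions is 1 - (1 - p)^n. The function names and the value p = 0.3 are illustrative assumptions of mine and are not taken from the cited papers.

```python
import math


def discovery_proportion(p: float, n: int) -> float:
    """Expected proportion of defects found after n independent sessions,
    assuming one shared detection probability p: 1 - (1 - p)^n."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if n < 0:
        raise ValueError("n must not be negative")
    return 1.0 - (1.0 - p) ** n


def sessions_needed(p: float, target: float) -> int:
    """Smallest n for which the model predicts at least `target` of the
    defects found, obtained by solving 1 - (1 - p)^n >= target for n."""
    if not 0.0 < p < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("p and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


if __name__ == "__main__":
    p = 0.3  # hypothetical detection probability, for illustration only
    for n in range(1, 9):
        print(n, round(discovery_proportion(p, n), 3))
    print("sessions for 80%:", sessions_needed(p, 0.8))
```

With p = 0.3 the model predicts that about 83% of the defects are found after five sessions. Note that p is the only thing one can adjust: there is no place in this model for easy and hard defects, or for strong and weak evaluators.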

Now, I have analyzed five data sets from the literature with some quite advanced statistical techniques. The results are clear:

  • usability defects differ in their “detectability” regardless of the method or any other context factors
  • in three of five studies the detection skills of evaluators differed
  • only in two data sets, both from highly systematic experiments, did the evaluator skills appear homogeneous.

Sounds like academic subtleties? Read the latest papers of Lewis. He still struggles to find a good estimator for p. A further step in my analysis revealed that heterogeneity of defects accounts for harmful overestimation in the cumulative geometric process model, which has been in use since the papers of Virzi.
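One way to see why heterogeneous defects lead to overestimation is a small numerical sketch. It is not the analysis I ran on the five data sets, only an illustration under stated assumptions: each defect i has its own detection probability p_i, sessions are independent, and the p_i values below are made up. The single-parameter model plugs the average p into 1 - (1 - p)^n, whereas the expected proportion actually found is the average of 1 - (1 - p_i)^n over all defects. Because (1 - p)^n is convex in p, the first number can never be smaller than the second.

```python
def homogeneous_prediction(p_values, n):
    """Prediction of the single-parameter model: plug the mean detection
    probability into 1 - (1 - p)^n."""
    p_mean = sum(p_values) / len(p_values)
    return 1.0 - (1.0 - p_mean) ** n


def heterogeneous_expectation(p_values, n):
    """Expected proportion of defects found when defect i has its own
    detection probability p_i: the mean of 1 - (1 - p_i)^n."""
    return sum(1.0 - (1.0 - p) ** n for p in p_values) / len(p_values)


if __name__ == "__main__":
    # Hypothetical detection probabilities, from very hard to very easy defects.
    p_values = [0.05, 0.1, 0.3, 0.6, 0.9]
    for n in (1, 3, 5, 10):
        predicted = homogeneous_prediction(p_values, n)
        expected = heterogeneous_expectation(p_values, n)
        print(n, round(predicted, 3), round(expected, 3))
```

For these made-up values and five sessions, the single-parameter model predicts about 0.92 of the defects found, while the expectation under heterogeneity is about 0.69. The gap comes from the hard defects, which the average p hides.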

Lewis, J. R. Evaluation of Procedures for Adjusting Problem-Discovery Rates Estimated From Small Samples. International Journal of Human-Computer Interaction, 2001, 13, 445-479.

Nielsen, J. & Landauer, T. K. A Mathematical Model of the Finding of Usability Problems. CHI '93: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM Press, 1993, 206-213.

Virzi, R. A. Refining the Test Phase of Usability Evaluation: How Many Subjects Is Enough? Human Factors, 1992, 34, 457-468.
