Daniel M. Berry
Cheriton School of Computer Science
University of Waterloo
Waterloo, ON, Canada
A hairy requirements or software engineering task involving natural language (NL) documents is one that is not inherently difficult for NL-understanding humans on a small scale but becomes unmanageable on a large scale. A hairy task demands tool assistance. Because humans need far more help in carrying out a hairy task completely than they do in making the local yes-or-no decisions, a tool for a hairy task should have as close to 100% recall as possible, even at the expense of high imprecision. A tool that falls short of 100% recall may even be useless, because to find the missing information, a human has to do the entire task manually anyway. Any such tool based on NL processing (NLP) techniques inherently fails to achieve 100% recall, because even the best parsers are no more than 91% correct. Therefore, a tool for a hairy task that is to achieve 100% recall needs to be based on something other than traditional NLP.
The reality is that a tool's achieving exactly 100% recall, which may be impossible anyway, may not be necessary. It suffices for a human working with the tool on a task to achieve better recall than a human working on the task entirely manually.
This talk describes research whose goal is to discover and test a variety of non-traditional approaches to building tools for hairy tasks to see which, if any, allows a human working with the tool to achieve better recall than a human working entirely manually. Among the early results is some advice about the correct F-measure to use to evaluate tools for hairy tasks.
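As a rough illustration of why the choice of F-measure matters, the sketch below uses the standard weighted F-measure, in which a parameter beta states how many times more important recall is than precision. The abstract does not say which weighting the talk recommends, so the value beta = 5 and the precision and recall figures for the two tools are hypothetical, chosen only to show that the ranking of two tools can flip once recall is weighted more heavily.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard weighted F-measure: recall counts beta times as much as precision."""
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    if beta <= 0.0:
        raise ValueError("beta must be positive")
    if precision == 0.0 and recall == 0.0:
        return 0.0  # conventional value when both measures are zero
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical tools (illustrative numbers, not results from the talk):
# a precise tool that misses much, and an imprecise tool that misses little.
precise_tool = {"precision": 0.90, "recall": 0.60}
high_recall_tool = {"precision": 0.30, "recall": 0.95}

for beta in (1.0, 5.0):  # beta = 5 is an assumed weighting
    a = f_beta(precise_tool["precision"], precise_tool["recall"], beta)
    b = f_beta(high_recall_tool["precision"], high_recall_tool["recall"], beta)
    print(f"beta={beta}: precise tool {a:.3f}, high-recall tool {b:.3f}")
```

With beta = 1 the precise tool scores higher; with the assumed beta = 5 the high-recall tool does, which matches the argument that a tool for a hairy task should be judged mainly on recall.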