R. Chillarege, I. S. Bhandari, J. K. Chaar, M. J. Halliday, D. S. Moebus, B. K. Ray, and M. Y. Wong, "Orthogonal defect classification -- a concept for in-process measurements," IEEE Transactions on Software Engineering, vol. 18, no. 11, pp. 943-956, 1992.

Chillarege et al. describe a technique for identifying problem spots in the software development process whilst a project is underway, by classifying the software defects that come up. They suggest a paradigm and describe a pilot study to validate it, but overall I wasn't convinced. There's a lot in this paper that smacks of advertising over content -- maybe the guts of the results are found in all the subsequent papers they link to?
Anyhow, my interest in this paper is for the authors' concept of software quality, their use of defect classification, and their thinking on the link between the two. Plus, it seems to be widely referenced.
To begin, the authors point out a gap in the qualitative-quantitative spectrum of measurement methods we have for software quality. They want a measurement scheme that is lightweight, sensitive to the development process (in that it can help locate process problems), and consistent across phases of a project and between projects (so those who use it can learn from their own and others' experience). Statistical defect models (quantitative) and root cause analysis (qualitative) are both done retrospectively and are time-consuming, and the statistical methods often intentionally ignore the details of the software development process used, so they can't provide detailed process feedback.
Enter ODC (Orthogonal Defect Classification). The main idea: come up with various classifications of defects and map those classes onto the software development process so that every defect points to a process problem. (The word "orthogonal" is used both because it means "mutually independent" and because the authors run with the metaphor of software-development-process-as-a-mathematical-vector-space, with defect classes "spanning" this space.)
There are two tricks to doing this. The first trick is to use a layer of indirection in the mapping of defects to parts of the process. Defects are first mapped to defect types, and then defect types are mapped to parts of the software development process. Why? Because mapping directly isn't something practitioners can do in the moment (it's error prone, and the attempt to do so is nothing more than a "good opinion survey ... not ... a measurement"), and because the indirection allows us to compare results across projects and phases.
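To make the indirection concrete, here's a minimal sketch of the two steps as I read them -- this is my own illustration, not code or tables from the paper, and the process-area names are placeholders:

```python
# Step 1: the engineer who fixes the defect assigns it a defect type.
# Step 2: a fixed, project-independent table maps defect types to parts
# of the development process. Only step 1 involves human judgement;
# step 2 is a mechanical lookup, which is what's meant to make results
# comparable across phases and projects.

# Illustrative placeholders only -- not the paper's actual associations.
TYPE_TO_PROCESS_AREA = {
    "Function": "high-level design",
    "Assignment": "coding",
    "Timing/serialisation": "low-level design",
    # ... one entry per defect type
}

def process_area_for(defect_type: str) -> str:
    return TYPE_TO_PROCESS_AREA[defect_type]
```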
My view on this indirection: I'm not sure assigning defect types or mapping a defect type onto a part of the process is any less error prone or requires any less opinion; it just seems to divide the opinion-making into smaller chunks.
Anyhow, the second trick is about making sure your defect classes actually span the process space. The authors point out that a sufficient classification scheme is a work in progress that ultimately needs to be empirically validated. A good chunk of the paper is devoted to describing a pilot study of ODC to validate it, or referencing future work.
Looking at defect types in more depth, then. The first important point: defect types are chosen by the semantics of the fix, rather than only by qualities of the defects themselves. They are assigned by the engineer making the fix. Here are the 8 types of defects:
- Function -- errors that affect capability and require a formal design change.
- Assignment -- logical errors in small bits of code.
- Interface -- errors in interacting with other components.
- Checking -- errors in data validation.
- Timing/serialisation -- errors in the use of shared/real-time resources.
- Build/package/merge -- errors in libraries or change management systems.
- Documentation -- errors in documentation and publications.
- Algorithm -- efficiency or correctness problems that require fixes through reimplementation (not requiring a formal design change).
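In data-structure terms (my own rendering, not the paper's notation), the scheme amounts to one enumerated attribute recorded against each fix, plus the phase in which the defect was found:

```python
from dataclasses import dataclass
from enum import Enum

class DefectType(Enum):
    # The eight ODC defect types; the engineer picks one based on the
    # semantics of the fix they just made.
    FUNCTION = "function"
    ASSIGNMENT = "assignment"
    INTERFACE = "interface"
    CHECKING = "checking"
    TIMING_SERIALISATION = "timing/serialisation"
    BUILD_PACKAGE_MERGE = "build/package/merge"
    DOCUMENTATION = "documentation"
    ALGORITHM = "algorithm"

@dataclass
class DefectRecord:
    defect_type: DefectType
    found_in_phase: str  # phase names are illustrative, not prescribed by the paper
```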
The authors investigate how the quantities of defects of each type vary over different phases of a project. They note, for instance, that function bugs appear in greater numbers in the design phase, while timing bugs appear more in the system test phase. The authors then take this "trend analysis a stage deeper" and provide a correlation table that maps principal defect types to stages in the software development process.
This section is maddeningly vague on details -- it's not clear where the process stages have come from or how the correlations were done specifically. This is a shame, because this mapping is crucial to the underlying argument for the usefulness of ODC. Any deviation from the "expected" principal variation trends in defect quantities is taken by ODC to point to a process problem, but what exactly the "expected" variations ought to be isn't well described, nor is what exactly constitutes a deviation, other than to say the judgement is "determined with experience".
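For what it's worth, here's how I picture the trend analysis working mechanically, building on the DefectRecord sketch above. The "expected" profile and the deviation threshold here are stand-ins for the experience-based judgement the paper never pins down:

```python
from collections import Counter

def type_distribution(defects, phase):
    """Fraction of defects of each type found in a given phase."""
    counts = Counter(d.defect_type for d in defects if d.found_in_phase == phase)
    total = sum(counts.values())
    return {t: n / total for t, n in counts.items()} if total else {}

def flag_deviations(observed, expected, threshold=0.15):
    """Return defect types whose observed share differs from the expected
    profile by more than `threshold`. Both the expected profile and the
    threshold are my placeholders -- the paper only says such judgements
    are 'determined with experience'."""
    return {
        t: (observed.get(t, 0.0), exp)
        for t, exp in expected.items()
        if abs(observed.get(t, 0.0) - exp) > threshold
    }
```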
Overall I'm suspicious of ODC as described in this paper. Partly because the paper lacks detail, but also because I wonder if the classification scheme is objective enough to work as the authors claim -- especially across projects.
El Emam and Wieczorek (1998) show some evidence that defect classification is repeatable across members of a development team, but even then theirs is a highly qualified experiment.
For my interests, this is the first paper I looked at that hints that it might be possible to compare the quality of two projects by looking at the types of defects that turn up. But it only hints. I'll be posting about other work that adds a lot more murk to these waters.