1999 International Conference on Software Quality Cambridge, MA
Toward Quantitative Process Management With
Exploratory Data Analysis
Mark C. Paulk
Software Engineering Institute
Carnegie Mellon University
Pittsburgh, PA 15213
Abstract
The Capability Maturity Model® for Software is a model for building organizational
capability that has been widely adopted in the software community and beyond. The
Software CMM® is a five-level model that prescribes process improvement priorities for
software organizations. Level 4 in the CMM focuses on using quantitative techniques,
particularly statistical techniques, for controlling the software process. In statistical
process control terms, this means eliminating assignable (or special) causes of variation.
Organizations beginning to use quantitative management typically begin by "informally
stabilizing" their process. This paper describes typical questions and issues associated
with the exploratory data analysis involved in initiating quantitative process management.
Introduction
The Capability Maturity Model (CMM) for Software [Paulk95], developed by the
Software Engineering Institute (SEI) at Carnegie Mellon University, is a model for
building organizational capability that has been widely adopted in the software community
and beyond. The Software CMM is a five-level model that describes good engineering
and management practices and prescribes improvement priorities for software
organizations. The five maturity levels are summarized in Figure 1.
The higher maturity levels in the CMM are based on applying quantitative techniques,
particularly statistical techniques [Florac99], to controlling and improving the software
process. In statistical process control (SPC) terms, level 4 focuses on removing assignable
causes of variation, and level 5 focuses on systematically addressing common causes of
variation. This gives the organization the ability to understand the past, control the
present, and predict the future – quantitatively. Regardless of the specific tools used (and
control charts are implied by SPC), the foundation of levels 4 and 5 is statistical thinking
[Hare95], which is based on three fundamental axioms:
· all work is a series of interconnected processes
· all processes are variable
· understanding variation is the basis for management by fact and systematic
improvement
® Capability Maturity Model and CMM are registered with the U.S. Patent and Trademark Office.
SM Personal Software Process and PSP are service marks of Carnegie Mellon University.
The Software Engineering Institute is a federally funded research and development center sponsored by
the U.S. Department of Defense.
The statistical thinking characteristic of a high maturity organization depends on two
fundamental principles. First, process data is collected at the “process step” level for real-
time process control. This is perhaps the most important single attribute of a level 4
organization – that engineers are using data to drive technical decision making in real-
time, thereby maximizing efficiency. Second, and a direct consequence of statistical
thinking, is that decision making at the process level incorporates an understanding of
variation. A wide range of analytic techniques can be used for systematically
understanding variation, ranging from simple graphs, such as histograms and bar charts,
and statistical formulas, such as standard deviation, to statistical process control tools,
such as XmR charts, u-charts and beyond. The simplicity of a histogram does not lessen
its power – a simple picture that imparts insight is more powerful than a sophisticated
formula whose implications are not understood.
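As a minimal sketch of one of the tools named above, the following Python computes the
center lines and limits for an XmR (individuals and moving range) chart using the
conventional constants 2.66 and 3.268. The function names and the sample values are
hypothetical and are not taken from any organization discussed in this paper.

```python
from statistics import mean


def xmr_limits(values):
    """Return center lines and limits for an XmR chart.

    Uses the conventional XmR constants: 2.66 for the individuals
    chart and 3.268 for the moving range chart.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving ranges: absolute differences between successive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "x_center": x_bar,
        "x_ucl": x_bar + 2.66 * mr_bar,
        "x_lcl": x_bar - 2.66 * mr_bar,
        "mr_center": mr_bar,
        "mr_ucl": 3.268 * mr_bar,
    }


def out_of_limits(values, limits):
    """Indices of observations outside the individuals-chart limits."""
    return [i for i, v in enumerate(values)
            if v > limits["x_ucl"] or v < limits["x_lcl"]]


# Hypothetical defects-per-inspection values, for illustration only.
sample = [4, 5, 3, 6, 4, 5, 20, 4, 5, 3]
limits = xmr_limits(sample)
signals = out_of_limits(sample, limits)
```

A point flagged in signals is only a candidate assignable cause; the limits say that
something may have changed, not what changed.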
Level           Focus                          Key Process Areas
5 Optimizing    Continual process improvement  Defect Prevention
                                               Technology Change Management
                                               Process Change Management
4 Managed       Product and process quality    Quantitative Process Management
                                               Software Quality Management
3 Defined       Engineering processes and      Organization Process Focus
                organizational support         Organization Process Definition
                                               Training Program
                                               Integrated Software Management
                                               Software Product Engineering
                                               Intergroup Coordination
                                               Peer Reviews
2 Repeatable    Project management processes   Requirements Management
                                               Software Project Planning
                                               Software Project Tracking & Oversight
                                               Software Subcontract Management
                                               Software Quality Assurance
                                               Software Configuration Management
1 Initial       Competent people and heroics
Figure 1. An overview of the Software CMM.
Although the Software CMM has been extensively used to guide software process
improvement, the majority of software organizations are at the lower maturity levels; as of
March 1999, of the 807 organizations active in the SEI's assessment database, only 35
were at levels 4 and 5. While the number of high maturity organizations is growing
rapidly, it takes time to institutionalize a measurement program and the quantitative
management practices that take good advantage of its capabilities. The typical software
organization takes over two years to move from level 1 to level 2 and from level 2 to level
3 [Herbsleb97]. One to two years is a reasonable expectation for building, deploying, and
refining quantitatively managed processes.
One of the challenges in moving to level 4 is the discovery organizations typically make
when looking at their process data: the defined processes used by the projects are not as
consistently implemented or measured as believed. When a process is being placed under
statistical process control in a rigorous sense, it is "stabilized" by removing assignable
causes of variation. "Informal stabilization" occurs simply by examining the data
(graphically) before even placing it on a control chart, as patterns in the data suggestive of
mixing and stratification are seen.
If there is a great deal of variability in the data (a common complaint from those arguing
that SPC cannot be applied to the software process [Ould96]), the control limits on a
control chart will be wide. High variability has consequences: if the limits are wide,
predictability is poor, and highly variable performance is to be expected in the future. If
highly variable performance is unacceptable, then the process will have to be changed.
Ignoring reality will not change it. Since some studies suggest a 20:1 difference in the
performance of programmers, variability is a fact of life in a design-intensive,
human-centric process. The impact of a disciplined process can be significant in minimizing
variation while improving both quality and productivity, as demonstrated by the Personal
Software Process(SM) [Humphrey95, Hayes97]. Some software organizations are using
control charts appropriately [Paulk99a, Paulk99b], so there are at least a few examples of
SPC for software providing business value.
Informally stabilizing the process can be characterized as an exercise in exploratory data
analysis, which is a precursor to the true quantitative management of level 4. The
processes that are first stabilized tend to be design, code, and test, since there is usually an
adequate amount of inspection and test data to apply statistical techniques in a fairly
straightforward manner. A fairly typical subset of code inspection data in Table 1
illustrates what an organization might start with. The organization that provided this data
was piloting the use of control charts on a maintenance project.
Table 1. Representative Code Inspection Data From an Organization Beginning
Quantitative Management.
Number of Inspectors | Inspection Preparation Time | Code Inspection Time (number of inspectors X inspection hours) | Number of Defects | Lines of Code
7 2
6 2
5 0
5 1
6 2
6 2
6 0
5 3
If you were asked to analyze the data in Table 1, what questions might you ask? They will
probably fall in four broad categories: operational definitions, process consistency,
aggregation, and organizational implications.
Operational Definitions
Good operational definitions must satisfy two important criteria [Florac99]:
· communication. If someone uses the definition as a basis for measuring or describing
a measurement result, will others know precisely what has been measured, how it was
measured, and what has been included and excluded?
· repeatability. Could others, armed with the definition, repeat the measurements and
get essentially the same results?
In looking at the data in Table 1, the first question is likely to be, "How is a line of code
defined?" The fact that the LOC values are not integers is a warning sign, yet the question
should already have arisen on first hearing that this is a maintenance project: "How do they
deal with modified, deleted, and unchanged lines?" In this case, a formula was used to
create an aggregate size measure that is a weighted function of new, modified, deleted,
and unchanged lines. It is more important to know that the formula exists and is being
consistently used than to know what the specific formula is.
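To make the idea of an aggregate size measure concrete, here is a minimal sketch in
Python. The paper does not give the organization's formula, so the weights, the LineCounts
type, and the function name below are all assumptions chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights, chosen only for illustration; the organization's
# actual formula is not given in the paper.
ASSUMED_WEIGHTS = {"new": 1.0, "modified": 0.5, "deleted": 0.25, "unchanged": 0.1}


@dataclass(frozen=True)
class LineCounts:
    new: int
    modified: int
    deleted: int
    unchanged: int


def weighted_size(counts, weights=ASSUMED_WEIGHTS):
    """Aggregate size as a weighted sum of the four line categories."""
    return (weights["new"] * counts.new
            + weights["modified"] * counts.modified
            + weights["deleted"] * counts.deleted
            + weights["unchanged"] * counts.unchanged)


size = weighted_size(LineCounts(new=40, modified=10, deleted=4, unchanged=200))
```

With fractional weights the result is generally not an integer, which is consistent with
the non-integer LOC values noted above.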
The second question might be, "What are the time units?" One panelist at the 1999
European SEPG Conference reported a case where the unit was not clearly
communicated, and the organization discovered that its time data included both hours and
minutes (the pragmatic solution was to assume that every value of 5 or less was in hours).
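The pragmatic rule reported by the panelist can be sketched as follows. The threshold of 5
comes from the account above; the function name and the recorded values are hypothetical.
The rule is a heuristic, so it will misread any genuine entry of 5 minutes or less.

```python
HOURS_THRESHOLD = 5  # from the rule described above: values of 5 or less are hours


def to_hours(raw_value, threshold=HOURS_THRESHOLD):
    """Convert a recorded time to hours under the pragmatic rule.

    Values at or below the threshold are assumed to already be in hours;
    larger values are assumed to be minutes. This is a heuristic: a
    genuine 4-minute entry would be misread as 4 hours.
    """
    if raw_value < 0:
        raise ValueError("time cannot be negative")
    if raw_value <= threshold:
        return float(raw_value)
    return raw_value / 60.0


# Hypothetical recorded values mixing hours and minutes.
recorded = [1.5, 90, 2, 45, 5, 120]
normalized = [to_hours(v) for v in recorded]
```

A clear operational definition of the time unit makes this kind of after-the-fact repair
unnecessary.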
The third question is obviously, "How is a defect defined?" While the first two metrics
can be collected in a fairly objective fashion, getting a good operational definition of
"defect" can be challenging. Are multiple severity levels included, from life-critical to
major to trivial? Are trivial defects even recorded? How does the inspection team
determine what category a defect belongs in? Again, the crucial question is whether the
data can be collected consistently and repeatably.
A fourth question should be, "Is the data collected at the same point in the process each
time?" For example, are code inspections performed before or after a clean compile is
obtained? If this is not specified, there may be a mix of compiled/not compiled inspection
data, which will increase variability significantly.
Process Consistency
Even if the data is collected consistently and repeatably, the process itself may vary from
one execution to the next. For example, some teams may engage in "pre-reviews" before
inspections (to ensure that the inspected code is of acceptable quality – not an
unreasonable practice if the number of defects reported in the inspections has ever been
used in a performance appraisal). This, too, can lead to a mix of pre-reviewed/not pre-
reviewed data that will increase variability.
Another panelist at the 1999 European SEPG Conference identified a case where
examination of data revealed two operationally distinct inspection processes. The
distinguishing attribute of the inspection was the size of the work product. If the code
module being inspected was larger than about 50 lines of code, the inspection rates were
significantly different – even though the same inspection process was supposedly being
performed. The important insight is not whether the existence of two operationally
different inspections is appropriate, but that the decision be a conscious one.
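One way to look for two operationally distinct processes is to stratify the data by work
product size and compare inspection rates. The sketch below assumes a boundary of 50 LOC,
as in the panelist's account; the record layout, function names, and data values are
hypothetical.

```python
from statistics import mean

SIZE_THRESHOLD_LOC = 50  # approximate boundary reported by the panelist


def split_by_size(inspections, threshold=SIZE_THRESHOLD_LOC):
    """Partition (loc, hours) records into small and large work products."""
    small = [r for r in inspections if r[0] <= threshold]
    large = [r for r in inspections if r[0] > threshold]
    return small, large


def mean_rate(records):
    """Mean inspection rate in LOC per hour for (loc, hours) records."""
    return mean(loc / hours for loc, hours in records)


# Hypothetical (lines of code, inspection hours) pairs, for illustration only.
inspections = [(20, 0.5), (30, 0.5), (45, 1.0), (200, 1.0), (400, 2.0), (600, 2.0)]
small, large = split_by_size(inspections)
small_rate = mean_rate(small)
large_rate = mean_rate(large)
```

If the two strata show clearly different rates, charting them together mixes two cause
systems; charting them separately is one option to consider.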
In the case of the data in Table 1, two questions immediately arise: "Is it a good idea to
have inspections covering this wide a range of code sizes?" and the corollary question,
"Are the inspection rates for these different sizes of module reasonable?" Different
organizations may establish somewhat different guidelines, but 150 LOC per hour is a
reasonable target [Fagan86]. A casual examination of the data provided in Table 1
suggests that some inspection rates are running at greater than 2,000 LOC per hour, which
suggests a significant process consistency issue.
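The rate check described above can be sketched in Python. Table 1 reports code inspection
time as number of inspectors multiplied by inspection hours, so the sketch divides that
figure by the number of inspectors to recover meeting hours. The 150 LOC per hour target is
from the text; the tolerance factor of 2, the function names, and the records are
assumptions, and the records are not the Table 1 values.

```python
TARGET_RATE_LOC_PER_HOUR = 150  # target cited above [Fagan86]


def inspection_rate(loc, inspectors, inspection_effort_hours):
    """LOC inspected per meeting hour.

    Table 1 reports code inspection time as effort (number of inspectors
    multiplied by inspection hours), so the meeting duration is recovered
    by dividing the effort by the number of inspectors.
    """
    if inspectors <= 0 or inspection_effort_hours <= 0:
        raise ValueError("inspectors and effort must be positive")
    meeting_hours = inspection_effort_hours / inspectors
    return loc / meeting_hours


def flag_fast_inspections(records, factor=2.0):
    """Return (index, rate) for records whose rate exceeds factor x target.

    The factor of 2 is an assumed tolerance, not a value from the paper.
    """
    flagged = []
    for i, (loc, inspectors, effort) in enumerate(records):
        rate = inspection_rate(loc, inspectors, effort)
        if rate > factor * TARGET_RATE_LOC_PER_HOUR:
            flagged.append((i, rate))
    return flagged


# Hypothetical (LOC, inspectors, effort hours) records; not the Table 1 values.
records = [(120.0, 6, 6.0), (2500.0, 5, 5.0), (300.0, 6, 9.0)]
flagged = flag_fast_inspections(records)
```

A flagged inspection is a prompt to ask how the inspection was actually performed, not
proof that it was performed badly.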
Aggregation
When analyzing process data, there are many potential sources of variation in the process.
It is easy to overlook sources of variation when data are aggregated. Common causes of
overly aggregated data include [Florac99]:
· poor operational definitions
· inadequate contextual information
· lack of traceability from data back to its original context
· working with data whose elements are combinations (mixtures) of values from
different sources
The predominant source of aggregated data is simply that different work products are
produced by different members of the project team. Collecting data on an individual basis
would address this, but could have severe consequences in terms of motivational use of
the data, e.g., during performance appraisals, which can lead to dysfunctional behavior
[Austin96], and in terms of the amount of the data available for statistical analyses. There
are no easy answers to this question.
It is, however, possible on occasion to disaggregate data. For example, defect data could
be separated into different categories, and control charts on each category may provide
significantly better insight into separate common cause systems [Florac99].
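As a sketch of disaggregation, the following Python builds a separate u-chart for each
defect category, using the standard u-chart limits of the center line plus or minus three
times the square root of the center line divided by the sample size. The category names,
counts, and function names are hypothetical.

```python
from math import sqrt


def u_chart(defects, sizes):
    """Center line and per-sample limits for a u-chart.

    defects[i] is the defect count and sizes[i] the area of opportunity
    (for example, KLOC inspected) for sample i.
    """
    u_bar = sum(defects) / sum(sizes)
    limits = []
    for n in sizes:
        half_width = 3 * sqrt(u_bar / n)
        # A defect rate cannot be negative, so the lower limit is floored at 0.
        limits.append((max(0.0, u_bar - half_width), u_bar + half_width))
    return u_bar, limits


def charts_by_category(samples):
    """Build one u-chart per defect category.

    samples maps a category name to a list of (defect_count, size) pairs.
    """
    charts = {}
    for category, pairs in samples.items():
        defects = [d for d, _ in pairs]
        sizes = [s for _, s in pairs]
        charts[category] = u_chart(defects, sizes)
    return charts


# Hypothetical categories and counts, for illustration only.
samples = {
    "logic": [(2, 0.5), (3, 1.0), (1, 0.5)],
    "interface": [(0, 0.5), (1, 1.0), (1, 0.5)],
}
charts = charts_by_category(samples)
```

Splitting the data this way reduces the number of points per chart, which is the trade-off
between insight and sample size raised in the preceding paragraph.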
Organizational Implications
In the particular example we have gone through above, the data was used within a single
project. When dealing with organizational data, these problems are exacerbated. In
moving between projects, application domains, and customers, operational definitions may
be "adjusted" to suit the unique needs of the new environment, thus it is crucial to
understand the context of the data when doing cross-project comparisons. It can be
particularly challenging when government regulations or customers demand that data be
reported in different ways than the organization would normally collect it.
Conclusion
This paper provides a simple road map through some of the issues that an analyst must
deal with in implementing quantitative process management. As we frequently say about
the CMM, this is not rocket science, but it is easy to miss an important point, and it can be
quite frustrating at times to work through these issues. These are, however, typical
problems that most organizations have to work through on the journey of continual process
improvement; "informal stabilization" seems to be a necessary precursor to the useful
application of rigorous SPC techniques.
References
Austin96 Robert D. Austin, Measuring and Managing Performance in Organizations,
Dorset House Publishing, ISBN: 0-932633-36-6, New York, NY, 1996.
Fagan86 . Fagan, "Advances in Software Inspections," IEEE Transactions on
Software Engineering, Vol. 12, No. 7, July 1986, pp. 744-751, reprinted in
Software Engineering Project Management, . Thayer (ed), IEEE
Computer Society Press, IEEE Catalog No. EH0263-4, 1988, pp. 416-
423.
Florac99 William A. Florac and Anita D. Carleton, Measuring the Software Process:
Statistical Process Control for Software Process Improvement, ISBN 0-201-
60444-2, Addison-Wesley, Reading, MA, 1999.
Hare95 Lynne B. Hare, Roger W. Hoerl, John D. Hromi, and Ronald D. Snee,
"The Role of Statistical Thinking in Management," ASQC Quality
Progress, Vol. 28, No. 2, February 1995, pp. 53-60.
Hayes97 Will Hayes and James W. Over, "The Personal Software Process (PSP):
An Empirical Study of the Impact of PSP on Individual Engineers,"
Software Engineering Institute, Carnegie Mellon University, CMU/SEI-97-
TR-001, December 1997.
Herbsleb97 James Herbsleb, David Zubrow, Dennis Goldenson, Will Hayes, and Mark Paulk,
"Software Quality and the Capability Maturity Model," Communications of the
ACM, Vol. 40, No. 6, June 1997, pp. 30-40.
Humphrey95 Watts S. Humphrey, A Discipline for Software Engineering, ISBN 0-201-
54610-8, Addison-Wesley Publishing Company, Reading, MA, 1995.
Ould96 Martyn A. Ould, "CMM and ISO 9001," Software Process: Improvement and
Practice, Vol. 2, Issue 4, December 1996, -289.
Paulk95 Carnegie Mellon University, Software Engineering Institute (Principal
Contributors and Editors: Mark C. Paulk, Charles V. Weber, Bill Curtis,
and Mary Beth Chrissis), The Capability Maturity Model: Guidelines
for Improving the Software Process, ISBN 0-201-54664-7, Addison-
Wesley Publishing Company, Reading, MA, 1995.
Paulk99a Mark C. Paulk, "Practices of High Maturity Organizations," The 11th
Software Engineering Process Group (SEPG) Conference, Atlanta,
Georgia, 8-11 March 1999.
Paulk99b Mark C. Paulk, "Using the Software CMM With Good Judgment," ASQ
Software Quality Professional, Vol. 1, No. 3, June 1999, pp. 19-29.