PHOTOOG Photography writings by Olivier Giroux

26 Jul 2007

Summary: A Model of Large Program Development (part 1 of 3)

See part 2 here, and part 3 here.

A paper by Belady & Lehman, published in IBM Systems Journal, vol. 15, no. 3 (1976).

Summarized for my management.

The further along you are in the development of a large software project, the bigger and harder to understand the system is, and the more bugs are created per unit of work done. At some point fixing bugs takes as much of your time as adding new features, or more. Eventually an arbitrary pain threshold is reached and management decides to build a brand new system instead of continuing with the current one.

As far as the modern state of the art (in 2007) is concerned, all software systems exhibit these trends. No one can claim to do better. Yet the claim is routinely made (either by the unlearned, or the optimistic) and it is quite common to plan unrealistically.

It is worth noting that OS/360, the project from which this data was taken, was widely regarded as a failure. The system was almost aborted at birth, and struggled with its mass for most of the decade it was in use. The paper may therefore be most valuable as a case study in failure.

From their experiences with OS/360 and other IBM systems, Belady and Lehman postulate 3 laws of software development:

  1. Continuing change: a system changes constantly until it is discarded.
  2. Increasing entropy: each change makes the system more complex than before (on average).
  3. Smooth growth: each change makes the system bigger than it was before.

To model the development of OS/360, they select a few independent variables for which data was available and attempt to correlate them with the cost and defect rates of the project over time.

They propose 3 models to convey their observations: a Project Model, a Cost Model and a Fault Model.

Project Model

They measure 4 parameters:

  1. M(R), the total number of modules at revision R.
  2. M(D), the total number of modules on day D.
  3. MH(R), the number of modules modified between revision R-1 and R.
  4. MH(D), the number of module modifications between day 0 and day D.

They note the following observations:

  1. M(R) is linear with R.
  2. M(D) is logarithmic with D.
  3. MH(R) is exponential with R.
  4. MH(D) is linear with D.

Note on the graph on the right that the relationship between M(R) and M(D) implies a relationship between R and D as shown (observed). See the very first graph above for the exponential shape of MH(R).
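
The implied link between R and D can be worked out if we assume that both fits hold exactly. The constants a, b, c and d below are hypothetical fit parameters, not values from the paper, and D_R (the day on which revision R ships) is notation introduced here purely for illustration.

```latex
\begin{align*}
M(R) &= a + bR         && \text{(linear in } R\text{)} \\
M(D) &= c + d\ln D     && \text{(logarithmic in } D\text{)} \\
a + bR &= c + d\ln D_R && \text{(same module count at revision } R\text{, shipped on day } D_R\text{)} \\
D_R &= \exp\!\left(\frac{a - c + bR}{d}\right)
\end{align*}
```

Under these assumptions, and with b and d positive, the day count grows exponentially with the revision number, so the interval between successive revisions lengthens over time.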

Complexity Relation

They additionally define complexity C(R) as the fraction of modules modified between revision R-1 and R, i.e., C(R) = MH(R)/M(R). Over the span of data they had available, C(R) could be approximated with a quadratic polynomial.
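
As a rough illustration of the definition above, here is a minimal Python sketch that computes C(R) from two series and fits a quadratic to it by ordinary least squares. The function names and the sample numbers are invented for this example; they are not the OS/360 data, and least squares is only an assumed fitting method, since the summary does not say how the authors obtained their approximation.

```python
def complexity(modules_handled, modules_total):
    """Return C(R) = MH(R) / M(R) for each revision R."""
    if len(modules_handled) != len(modules_total):
        raise ValueError("both series must cover the same revisions")
    return [mh / m for mh, m in zip(modules_handled, modules_total)]


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c).

    Solves the 3x3 normal equations directly. Needs at least three
    distinct x values, otherwise the system is singular.
    """
    # Power sums s[k] = sum(x**k) and moments t[k] = sum(y * x**k).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix of the normal equations for (a, b, c).
    rows = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return tuple(rows[i][3] / rows[i][i] for i in range(3))


# Invented sample data (NOT from the paper): five revisions.
revisions = [1, 2, 3, 4, 5]
m_total = [100, 120, 140, 160, 180]   # M(R), total modules
m_handled = [10, 15, 24, 38, 58]      # MH(R), modules modified

c_of_r = complexity(m_handled, m_total)
a, b, c = fit_quadratic(revisions, c_of_r)
print(c_of_r)
print(a, b, c)
```

With real data, M(R) and MH(R) would come from release records, and the quality of the quadratic fit would need to be checked rather than assumed.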

