Can Systems Engineering help in Zero-Defect Delivery (ZDD)? Is it really possible, in particular when delivering updates to decades-old, mission-critical legacy systems? Or is it Rocket Science, a Beautiful Myth, or Snake Oil?
Weapon systems critical to national security are becoming increasingly complex and expensive. The cost overruns alone are estimated at about half a trillion dollars, an amount that would rank as the second-largest defense budget in the world. In the USA, several weapon system programs have breached Nunn-McCurdy thresholds for cost and schedule overruns.
“Inadequate Systems Engineering” is almost always the critical factor in these troubled programs. Systems Engineering has morphed several times since its days at Bell Labs, where the term was first coined, and has lost much of its original purpose: guiding the design and development of systems (in particular, large and complex systems like 5ESS and Aegis BMD; both are in the 100-million-lines-of-code class).
As John Vu stated in his keynote address at Lucent’s 1997 Software Symposium: “No system failed due to bad coding. All failures could be traced to bad design” (note: most weapon systems are software-intensive). Smith and Williams, in Performance Solutions: A Practical Guide to Creating Responsive, Scalable Software, analyzed root causes of performance problems and concluded that “the problem is often due to a fundamental misunderstanding of how to achieve performance objectives” during the analysis of requirements and design (Systems Engineering).
It has become accepted practice to deliver complex weapon systems under a “work until it works” paradigm, with its associated cost and schedule overruns and poor quality. For example, immediately after joining the GPS-OCX program as the Deputy Chief Engineer, I asked a hypothetical question: “If someone from GPS-OCX needs to stand in front of the customer and say, ‘Requirement-xxx is delivered. If anyone can find a problem/bug in that, I will gladly pay $5 for each bug,’ who would be that person? Do you think it can be done?” Almost all the answers I received were: “It is impossible. It cannot be done.”
However, it can be done, and it has been done. Prof. Donald Knuth of Stanford University did it. I have done it routinely at Bell Labs on a project that was much more challenging than GPS-OCX. I routinely offered $5 for each bug found in the software developed by the team (and never had to pay out in eight years). In other words, we are not attempting the impossible.
This paper, after providing a brief overview of the problems and the state of the art, presents the process for Zero-Defect Delivery (ZDD) and shows how Systems Engineering plays a crucial role. Large systems have four levels of systems engineering (Bell Labs calls them Tier-1 to Tier-4).
It also addresses common concerns such as: (1) Does it apply only to legacy systems or also to the development of new systems? (2) Does it work only for software-intensive systems or for others as well? (Yes, it works for the development and sustainment of software-intensive systems, hardware systems, and even—apologies to F. P. Brooks—for “grave digging.”) (3) Does it work only for closely collaborating teams? (No, it works for cooperating, indifferent, and even hostile teams.) (4) Does it help in maintaining a stable workforce critical for weapon systems development? (5) Is it Rocket Science? (No: “Even though I can do rocket science.”)
It provides examples, lessons learned, and even lessons we are going to learn (a.k.a. pre-mortem) from the systems I have worked on and left my fingerprints on.