The Weekly Reflektion 14/2025
Thank you to everyone who attended Reflekt’s Breakfast Seminar, ‘The black box, can it be trusted?’. One of the points we emphasised in the seminar was the way simple errors can compromise the introduction of new technology. That is, it is not necessarily the ‘new’ part that fails, it is the ‘old’ part that is integrated into the ‘new’.
Do you need a Major Accident to discover a simple error?

On June 4, 1996, the maiden flight of the European Space Agency’s Ariane 5 rocket ended in failure just 37 seconds after lift-off. The rocket veered off its flight path and subsequently self-destructed, resulting in the loss of the vehicle and its payload of four Cluster satellites intended to study Earth’s magnetosphere. The cost of the failure was USD 370 million in 1996 dollars.
The failure was traced to a software error in the rocket’s Inertial Reference System (IRS). Specifically, a 64-bit floating-point number representing the horizontal velocity was improperly converted to a 16-bit signed integer. When the velocity exceeded the storage capacity of the 16-bit integer during the rocket’s acceleration, the resulting error caused both the primary and backup IRS units to cease operation.
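To make the failure mode concrete, here is a minimal sketch, purely for illustration, of what happens when a value that needs 64-bit floating-point range is forced into a signed 16-bit integer. The actual flight software was written in Ada and the values below are invented; in the real system the unprotected conversion raised an exception that shut the unit down rather than wrapping silently, but the underlying limit is the same.

```python
def cast_to_int16(value: float) -> int:
    """Simulate a C-style cast of a float into a signed 16-bit integer.

    A signed 16-bit integer can only hold -32768..32767; anything larger
    wraps around and comes back as a meaningless number.
    """
    n = int(value) & 0xFFFF
    return n - 0x10000 if n >= 0x8000 else n

# Hypothetical horizontal-velocity values, for illustration only.
print(cast_to_int16(20000.0))   # 20000  -> fits, as on the Ariane 4 trajectory
print(cast_to_int16(40000.0))   # -25536 -> overflows, as on the faster Ariane 5
```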
The software containing the faulty conversion was reused from the Ariane 4 rocket. The Ariane 4 trajectory never produced a horizontal velocity that exceeded the storage capacity of the 16-bit integer, hence the system had functioned satisfactorily. The potential for overflow of the storage capacity for certain measurement parameters was known; however, the programmers had not protected all the critical measurements, including the horizontal velocity. During the testing of the IRS, simulated data was used for the horizontal velocity. This meant the measurement system was never fully tested at the velocities the Ariane 5 rocket was likely to attain. Simple software testing on the measurement system would have uncovered the limitation on the storage capacity for the velocity value.
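A guarded conversion, exercised by a test that sweeps the full range of values the new trajectory could produce, is the kind of simple check referred to above. The sketch below is only an assumption of how such a test might look; the function name and the velocity range are hypothetical and are not taken from the Ariane software.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def safe_cast_to_int16(value: float) -> int:
    """Convert a value to 16-bit integer range, failing loudly instead of wrapping."""
    if not INT16_MIN <= value <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return int(value)

# Sweep an illustrative range of horizontal velocities (0..50000, purely made up).
# Such a test, run against the expected Ariane 5 flight profile, would have
# exposed the 16-bit limit long before launch.
for velocity in range(0, 50001, 1000):
    try:
        safe_cast_to_int16(float(velocity))
    except OverflowError as err:
        print(f"Range check failed at {velocity}: {err}")
        break
```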
The Ariane 5 failure underscored the critical importance of rigorous software testing and validation, especially when reusing code in new contexts. It highlighted the need for thorough analysis of software behaviour under all possible operating conditions and the implementation of robust error-handling mechanisms to prevent similar failures in future missions.
John A. McDermid wrote an interesting paper in 2001 addressing this issue, ‘Software Safety: Where’s the evidence?’. He highlighted the challenges in fully testing software and all the potential failure scenarios that could arise. He warned against accepting ‘conventional wisdom’ as a basis for the testing. Conventional wisdom relies on widely accepted beliefs and ideas that are commonly regarded as true and that are not subject to rigorous challenge. He advocated an ‘evidence-based’ approach in which the documentation generated in the testing process serves as proof that the tests achieved the correct outcomes. He also recognised that this approach will increase the cost of introducing new software. Often there is a balance to be reached; however, this balance cannot be reduced to a ‘trial and error’ approach where taking the system into operation becomes the main testing forum. We have all experienced ‘bugs’ in our PCs and smartphones and accept that from time to time there is a need for a software update. Hopefully the consequence of any bug is not catastrophic.