By Dominic Gates / The Seattle Times
Seven weeks after the crash of a Boeing 737 Max operated by Lion Air killed 189 people in Indonesia, the jetmaker made a detailed presentation to the Federal Aviation Administration (FAA) justifying its design of the flight control system that had repeatedly pushed the jet’s nose down.
It concluded, in an exculpatory phrase repeated on multiple slides, that there was “No process violation or non-compliance” in how the jet was certified by the regulators.
But in hindsight, details in the December 2018 slide presentation reveal serious holes in the original evaluation of the Maneuvering Characteristics Augmentation System (MCAS) flight-control software.
Equally troubling, despite clear indications from the previous month’s Lion Air tragedy that the pilots had not responded as Boeing’s safety analyses assumed, the presentation reiterated the same assumptions and never approached the question of whether the Max should still be flying.
Flaws in the original safety analysis of MCAS are apparent now after a second crash involving an Ethiopian Airlines Max in March, and a great deal of reporting on what went wrong on both flights. That December presentation reveals Boeing’s thinking soon after the first crash and indicates both a substantial effort to deflect blame and a missed opportunity to reevaluate before the second crash happened.
The presentation shows that Boeing in its original certification of the Max:
• Presented MCAS to the FAA as not being a “new and novel” technology — and thus not requiring deeper scrutiny. The justification given was a doubtful comparison with the 767 tanker.
• Did not consider in its safety assessment the effect of multiple system failures and how this would affect the reactions of the pilots.
• Used questionable math to downgrade the system’s risk classification below a level that would have required more redundancy with at least two sensors to activate it.
• Made a key safety assessment prior to a major change in the design of MCAS, and did not reevaluate the system before certification.
• Dismissed one scenario in which an MCAS failure was assessed as “catastrophic,” sticking — despite the Lion Air experience — to its prior assumption that “appropriate flight crew action” would save the aircraft.
Boeing’s message to the FAA that December — which formed the basis of multiple public statements by CEO Dennis Muilenburg since — was that MCAS had been certified using the company’s standard processes and was compliant with all FAA regulations.
In a statement Friday, Boeing reiterated: “The FAA considered the final configuration and operating parameters of MCAS during Max certification, and concluded that it met all certification and regulatory requirements.”
Peter Lemme, a former Boeing flight-controls engineer and avionics expert, describes this as the company’s “stay-the-course, admit-no-fault mentality.”
“Boeing failed to properly reassess the situation, doubling down on their assumptions instead of immediately disabling MCAS to remove any chance of further disaster,” Lemme wrote on his blog devoted to analysis of the Max crashes.
As a result, Boeing and the FAA maintained their position that the Max was safe until forced to ground the jet 12 weeks later after another 157 people died in a similar crash in Ethiopia.
A flawed process
The U.S. House Transportation and Infrastructure Committee, which displayed one slide from Boeing’s presentation during an appearance by CEO Muilenburg at a hearing last week, provided all 43 slides in the document at the request of the Seattle Times. The presentation is titled “MCAS Development and Certification Overview.”
It notes that MCAS was not evaluated as an individual system that was “new/novel on the Max.” The significance of this term is that the FAA is required to be closely involved in the testing and certification of any new and novel features on an aircraft.
Though MCAS was new on the Max version of the 737, Boeing argued that it wasn’t new and novel because a similar system “had been previously implemented on the 767” tanker for the Air Force.
Yet MCAS on the Max was triggered by just one of the jet’s two angle-of-attack sensors, whereas MCAS on the 767 tanker compared signals from both sensors on the plane. When asked after the second crash to explain why the airliner version lacked this same redundancy, Boeing’s response was that the architecture, implementation, and pilot interface of the KC-46 tanker MCAS were so different that the two systems shared little but the acronym.
Laying out how Boeing originally assessed MCAS internally, the December 2018 presentation describes how pilots in flight simulators first carried out a standard preliminary risk assessment, called a Functional Hazard Assessment (FHA).
They did not simulate the real-world scenario that occurred in the crash flights when a single sensor failed and prompted the cascade of warnings in the cockpit. Instead, the pilots simply induced the horizontal tail, also known as the stabilizer, to swivel as MCAS would have moved it to pitch the nose down in a single activation.
The pilots successfully demonstrated that they could then recover the aircraft. They did so simply by pulling back on the control column. They didn’t even have to use what Boeing later described as the final step in stopping MCAS: hitting the cut-off switches that would have killed electrical power to the stabilizer.
“Accumulation or combination of failures leading to unintended MCAS activation were not simulated nor their combined flight deck effects,” Boeing said in the presentation.
Those pilots also did not simulate the crash flight scenario of MCAS misfiring multiple times — in the case of Lion Air, 27 times before the plane nose-dived into the sea.
Boeing notes in the presentation that much later, in June 2016 during flight tests of the Max, its engineers did discuss this scenario of “repeated unintended MCAS activation” with its test pilots. They concluded that this would be “no worse than single unintended activation.”
As proof that discussion occurred, Boeing’s presentation mentions an internal email summary. Yet Boeing concedes that the discussion and its conclusion apparently never made it to the ears of the FAA. Boeing said it was “not documented in formal certification” papers.
The initial FHA classified an erroneous activation of MCAS during the normal phases of flight as a “major” risk.
This is a significant yet relatively low-level risk category, signifying an event that could cause some upset inside the aircraft but would not typically lead to serious injuries or damage. A manufacturer must do detailed calculations to prove that the chance of such a failure happening is less than one in 100,000.
This classification of MCAS proved fateful. It meant that Boeing did not go on to conduct two more detailed analyses of MCAS — a Fault Tree Analysis and a Failure Modes and Effects Analysis — for the system safety assessment it sent to the FAA.
It also meant that MCAS could be designed with just a single sensor.
This is despite the fact that the same FHA established that a similar MCAS malfunction during an extreme, high-speed, banked turn would be a “hazardous” risk. This is a much more serious risk category where some serious injuries and fatalities could be expected. It’s one level short of “catastrophic,” in which the plane is lost with multiple fatalities. The probability of a “hazardous” failure has to be demonstrated as less than 1 in 10 million.
Lemme notes that a “hazardous” classification typically requires that redundancy be designed into the system, with a comparison of at least two sensors being used to activate it.
However, Boeing avoided this for MCAS.
It argued that the probability of a Max airliner getting into such an extreme, high-speed, banked turn was just 1 in 1,000 and the chance of an MCAS “major” malfunction was less than 1 in 100,000, so the chance of both happening together was less than 1 in 100 million, which “meets the Hazardous integrity requirements.”
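The arithmetic behind that argument can be reproduced in a few lines. The sketch below is illustrative only: it uses the figures quoted from the presentation, treats them as plain upper bounds, and assumes the two events are independent, which is what multiplying them implies. The variable and function names are hypothetical and do not come from Boeing’s documents.

```python
from fractions import Fraction

# Figures as quoted from the presentation, treated here as upper bounds.
P_EXTREME_TURN = Fraction(1, 1_000)        # chance of the extreme banked turn
P_MCAS_MAJOR_FAULT = Fraction(1, 100_000)  # ceiling for a "major" failure
HAZARDOUS_LIMIT = Fraction(1, 10_000_000)  # ceiling for a "hazardous" failure


def combined_probability(p_condition: Fraction, p_fault: Fraction) -> Fraction:
    """Multiply the two probabilities, which treats them as independent."""
    return p_condition * p_fault


combined = combined_probability(P_EXTREME_TURN, P_MCAS_MAJOR_FAULT)
print(f"combined: 1 in {combined.denominator:,}")
print("under hazardous limit:", combined < HAZARDOUS_LIMIT)
```

Under these assumptions the product comes to 1 in 100 million, a factor of 10 below the 1-in-10-million ceiling for a “hazardous” failure. The result depends entirely on being allowed to discount the fault by the rarity of the maneuver.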
A report on the certification of the Max released last month by an international panel of air-safety regulators, the Joint Authorities Technical Review (JATR), states that this mathematical discounting of the risk “is not a standard industry approach.”
An FAA safety engineer, who asked for anonymity because he spoke without agency approval, explained why that’s questionable math. He offered the example of how aviation engineers work out the probability of an engine failure complicated by an added factor of ice forming around the engine.
They don’t consider that an aircraft will encounter icing in, say, one in 500 flights and then combine that probability with whatever system failure is in question to produce a lower probability. Instead, they just assume that icing will happen, because at some point it definitely will.
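Applied to the MCAS figures, the engineer’s reasoning would look roughly like the sketch below. This is an illustration of the approach he describes, not a calculation from the presentation or from the FAA: it simply sets the probability of the flight condition to 1 and reuses the “major” and “hazardous” ceilings cited in this story. All names are hypothetical.

```python
from fractions import Fraction

# Ceilings cited in this story, treated as plain upper bounds.
P_MCAS_MAJOR_FAULT = Fraction(1, 100_000)
HAZARDOUS_LIMIT = Fraction(1, 10_000_000)


def conservative_probability(p_fault: Fraction,
                             p_condition: Fraction = Fraction(1)) -> Fraction:
    """Assume the flight condition is certain to occur (probability 1),
    so the fault alone has to meet the requirement."""
    return p_condition * p_fault


conservative = conservative_probability(P_MCAS_MAJOR_FAULT)
shortfall = conservative / HAZARDOUS_LIMIT

print("meets hazardous limit:", conservative <= HAZARDOUS_LIMIT)
print(f"exceeds the limit by a factor of {shortfall}")
```

With the condition assumed certain, a fault rated at 1 in 100,000 misses the “hazardous” ceiling by a factor of 100 in this sketch, which is the gap the multiplication by 1 in 1,000 closed.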
On Friday, Boeing said that it calculated the probability according to an accepted method, adding that “recently there has been discussion of revising this practice but no new standards have been set.”
The presentation also describes how a separate analysis was done of multiple system failures on the 737 Max, which would have included MCAS.
However, this was “completed prior to the design change to MCAS,” when Boeing decided in March 2016 to extend the system’s operation to low-speed normal flight. The presentation states, “reevaluation of design change not required,” per Boeing’s process.
As a result, Boeing conceded that the version of MCAS included in the system safety assessment sent to the FAA “was not updated to reflect certified design.”
However, it assured the FAA that it had done a new post-Lion Air assessment of the redesigned MCAS, which concluded that a revised analysis “would have included the same crew action that is already considered” and so wouldn’t have changed the outcome.
On Friday, Boeing in a statement said that despite this admitted glitch in the documentation, “Boeing informed the FAA about the expansion of MCAS to low speeds, including by briefing the FAA and international regulators on multiple occasions about MCAS’s final configuration.”
One scenario in the multiple system failure analysis is on the slide the House committee displayed during last week’s hearing.
It shows that when engineers analyzed the case of one angle of attack sensor not working and the second giving an erroneous signal, the combined effect on all systems, not just MCAS, was “deemed potentially catastrophic.”
However, again Boeing concluded this was “acceptable” because of the expectation of “appropriate crew action” to counter the emergency, plus the calculation that such a dual angle-of-attack failure was “extremely remote,” specifically that it would occur in less than one in a billion flights.
Yet it took only one angle-of-attack failure for MCAS to go haywire, a far more probable event.
Boeing knew that such an event had happened twice on successive Lion Air flights just seven weeks earlier, both on the crashed flight and on the prior flight.
And it knew that the crew action it had expected hadn’t occurred, even on the prior flight when the pilots managed to recover.
Nevertheless, Boeing’s presentation both justified its original analysis and reiterated its position: if MCAS failed, the crew would save the plane and all on board.
The presentation also solves one small mystery. Boeing notes that it and the FAA agreed to remove all mention of MCAS from the pilot manuals and pilot training.
So why was the acronym MCAS listed in the glossary at the back of the pilot manual, though nowhere else?
That was a mistake, Boeing said, “left behind from earlier drafts” before mention of MCAS was excised.