Comments by "ke6gwf - Ben Blackburn" (@ke6gwf) on "" video.

@hazardous458 it's a failure because the control software's core design philosophy is catastrophically flawed if a single error can do this, on top of a total failure of QC and testing in not verifying that the software had the correct address programmed in. And if any other maneuvers had been attempted, or the crew had attempted to use manual control, what other serious errors would have been discovered?
14
@_Andrew2002 there are a lot of differences between the MAX and Starliner, but both MCAS and the Starliner control system are software systems that were given great power while being based on bad assumptions and working from only one data source. And neither received proper testing, quality control, or verification. On one, a bad vane made it overpower the pilots into the ground. On the other, a bad clock was blindly trusted despite all the other data disagreeing with it. And both problems could easily have been spotted and corrected if the systems had been verified and tested.
6
@jersey282 the Crew Dragon and Starliner capsules are both intended to autonomously dock with the ISS. No one wants manual docking, it's too unreliable.
4
What many are missing is that the "minor software issue" is actually a serious, ground-up design problem: the system needs to be completely redesigned with different assumptions and criteria. They designed it so that a single point of failure would cause the capsule to blindly do things regardless of any other inputs. It was simply running a fixed sequence based on the clock, which means we won't find out what else has zero fault tolerance or error checking until it's actually put to use. I do industrial automation, so I am familiar with making large machines perform actions automatically, and you always want to verify your inputs and have some form of error checking or redundancy. Take the easiest example: if the clock says 11, then before you do the step for 11, you verify that the previous steps have been completed. Or, before changing the orientation of the craft, you check the star trackers and GPS and compare where you are supposed to be with where you actually are. You can also compare SECO/deployment with the download clock time and verify that they match the schedule. But, just like MCAS, it is a system with lots of power and no way to verify or error check what it's doing, so one bad input is all it needs to jump off the bridge blindly. You can say that the crew could have overridden it and everything would be fine, but that is not a valid assumption. They obviously failed in basic design and ground testing, designing a system that could fail so confidently and then not testing it adequately to discover this problem. The rest of the system was designed and tested by the same team, so it would be surprising if this blind, Fail Dangerous system didn't have other similar flaws, bad assumptions, and improperly tested aspects that could have terrible consequences without a review as detailed as the one MCAS is receiving.
3
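The "verify that the previous steps have been completed" check from the comment above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Starliner or any real flight software works; `Sequencer`, `SequenceError`, and the assumption that steps are numbered 1..n and must run in order are all hypothetical.

```python
from dataclasses import dataclass, field


class SequenceError(Exception):
    """Raised when a step is requested before its predecessors have run."""


@dataclass
class Sequencer:
    # Hypothetical model: steps are numbered 1..n and must run in order.
    completed: set = field(default_factory=set)

    def run_step(self, step: int, action) -> None:
        # Before acting on "the clock says N", confirm steps 1..N-1 are done.
        missing = [s for s in range(1, step) if s not in self.completed]
        if missing:
            raise SequenceError(
                f"step {step} requested but steps {missing} are not completed"
            )
        action()
        self.completed.add(step)
```

With this guard, a bad clock that jumps straight to step 11 produces an explicit error that something can react to, instead of silently running step 11 out of sequence.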
Yup, sounds like they had no error checking or fault tolerance here, just a list of actions blindly executed based on a single time code taken from another system, which then didn't get thoroughly tested and verified. Sounds like MCAS!
3
@simongeard4824 I agree completely that redundant computers wouldn't solve this, and yes, it's an error checking issue. It comes back to only looking at one data point (the MET clock) and ignoring all other data such as spatial position, the UTC clock, or even internal data such as whether previous steps in the script had been run. And yes, the more data you look at, the more likely it is that you will get a disagree error. That's where redundancy comes in, but a disagree error also means that SOMETHING is wrong, and you need to identify and correct it!
2
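The cross-check described in the comment above, comparing the MET clock against a second source and raising a disagree error, might look roughly like this in Python. It is a sketch under stated assumptions: the function name, the idea of deriving elapsed time from a UTC clock and a known liftoff time, and the 5-second tolerance are all made up for illustration and do not describe the real system.

```python
class DisagreeError(Exception):
    """Raised when two independent time sources do not agree."""


def cross_check_met(met_s: float, utc_now_s: float, liftoff_utc_s: float,
                    tolerance_s: float = 5.0) -> float:
    """Return the MET only if it agrees with the UTC-derived elapsed time.

    All parameters are hypothetical: met_s is the mission elapsed time in
    seconds, utc_now_s and liftoff_utc_s are UTC timestamps in seconds.
    """
    derived_s = utc_now_s - liftoff_utc_s
    if abs(met_s - derived_s) > tolerance_s:
        # Something is wrong with at least one source: stop and flag it
        # rather than acting on either value blindly.
        raise DisagreeError(
            f"MET {met_s:.0f}s vs UTC-derived {derived_s:.0f}s"
        )
    return met_s
```

The point of the design is that a disagreement becomes a visible fault to be identified and corrected, not something one input is allowed to win by default.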
@steveelectronics7819 LOL! I was trying to think of something like that earlier, I shall now steal it from you. Ty!
1
@DC2022 what you and Scott are missing is that the "minor software issue" is actually a serious, ground-up design problem: the system needs to be completely redesigned with different assumptions and criteria. They designed it so that a single point of failure would cause the capsule to blindly do things regardless of any other inputs. It was simply running a fixed sequence based on the clock, which means we won't find out what else has zero fault tolerance or error checking until it's actually put to use. I do industrial automation, so I am familiar with making large machines perform actions automatically, and you always want to verify your inputs and have some form of error checking or redundancy. Take the easiest example: if the clock says 11, then before you do the step for 11, you verify that the previous steps have been completed. Or, before changing the orientation of the craft, you check the star trackers and GPS and compare where you are supposed to be with where you actually are. You can also compare SECO/deployment with the clock time and verify that they match the schedule. But, just like MCAS, it is a system with lots of power and no way to verify or error check what it's doing, so one bad input is all it needs to jump off the bridge blindly. You can say that the crew could have overridden it and everything would be fine, but that is not a valid assumption. They obviously failed in basic design and ground testing, designing a system that could fail so confidently and then not testing it adequately to discover this problem. The rest of the system was designed and tested by the same team, so it would be surprising if this blind, Fail Dangerous system didn't have other similar flaws, bad assumptions, and improperly tested aspects that could have terrible consequences without a review as detailed as the one MCAS is receiving.
1
@DC2022 oh, and the Crew Dragon was designed fine for normal use; it was only after multiple test cycles and pressure variations that the leak occurred. It probably would never have occurred in flight conditions, because they wouldn't be constantly cycling and refilling the tanks. So it was a failure that would only occur under those specific circumstances, and not in other types of testing or simulation. The Starliner issue, on the other hand, could have been found by a competent programmer checking the code, or through ground testing. And again, it's an issue of no error checking in the design, rather than just a glitch.
1
@DC2022 yes, SpaceX discovered an unexpected design flaw in a check valve that allowed a small backflow leak. So they replaced the check valve with a burst disc, and were ready to fly again. Simple fix, minor change. The only downside is that the disc has to be replaced after every test, instead of just refilling as with the check valve. In future iterations, I suspect they will put in a different sort of check valve, or an additional electric positive shutoff valve to allow ground testing, but the burst disc is all that's needed for abort situations. The software issue on Starliner, on the other hand, needs a full review and testing regime, and then a rewrite, before it's safe.
1
@G5rry yes, I seem to remember that SpaceX uses either 2 or 3 redundant computers constantly cross checking with each other.
1