Comments by "ke6gwf - Ben Blackburn" (@ke6gwf) on "" video.
-
14
-
6
-
4
-
What many are missing is that the "minor software issue" is actually a serious, ground-up design problem: the system needs to be completely redesigned with different assumptions and criteria.
They designed the system in a way where a single point of failure would cause the capsule to blindly do things regardless of any other inputs.
It was simply responding to a fixed sequence based on the clock, and that means that we won't find out what else has zero fault tolerance or error checking, until it's actually put to use.
I do industrial automation, so I am familiar with making large machines perform actions automatically, and you always want to verify your inputs and have some form of error checking or redundancy.
For instance, the easiest one: if the clock says 11, before you do the step for 11, you verify that the previous steps have been completed.
Or, before changing the orientation of the craft, you check the star trackers and GPS and compare where you are supposed to be against where you actually are.
You can also compare SECO/deployment with the clock time and verify that they match the schedule.
But, just like with MCAS, it is a system with lots of power and no way to verify or error check what it's doing, so one input is all it needs to jump off the bridge blindly.
And you can say that the crew could have overridden it and everything would be fine, but that is not a valid assumption. They obviously failed at basic design and ground testing: they designed a system that could fail so confidently, and did not test it adequately enough to discover this problem. The rest of the system was designed and tested by the same team, so it would be surprising if this blind and Fail Dangerous system didn't have other similar flaws, bad assumptions, and improperly tested aspects that could have terrible consequences without a review as detailed as the one MCAS is receiving.
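The "verify that the previous steps have been completed" check described above can be sketched roughly as follows. This is a minimal illustration, not anything from the actual flight software: the `Sequencer` class, `InterlockError`, and the numbered-step model are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


class InterlockError(Exception):
    """Raised when a step is requested before its prerequisites are done."""


@dataclass
class Sequencer:
    # Hypothetical model: steps are numbered 1..n and must run in order.
    completed: set = field(default_factory=set)

    def run_step(self, step: int, action) -> None:
        # Verify that every earlier step is confirmed complete before acting.
        missing = [s for s in range(1, step) if s not in self.completed]
        if missing:
            raise InterlockError(f"step {step} blocked, not completed: {missing}")
        action()
        self.completed.add(step)


seq = Sequencer()
log = []
seq.run_step(1, lambda: log.append("step 1"))
seq.run_step(2, lambda: log.append("step 2"))
try:
    # The clock "says 11", but steps 3 to 10 never ran, so this is refused.
    seq.run_step(11, lambda: log.append("step 11"))
except InterlockError as err:
    log.append(f"held: {err}")
```

The point of the design is that the clock alone can request a step, but it cannot force one: the sequencer holds and reports instead of acting blindly.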
3
-
3
-
@simongeard4824 I agree completely that redundant computers wouldn't solve this, and yes, it's an error checking issue. It comes back to only looking at one data point (the MET clock) and ignoring all other data, such as spatial position, the UTC clock, or even internal data such as whether previous steps in the script had been run.
And yes, the more data you look at, the more likely it is that you will get a disagree error. That's where redundancy comes in, but a disagree error also means that SOMETHING is wrong, and you need to identify and correct it!
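As a rough sketch of cross-checking the MET clock against a second source, the function below compares it with elapsed time derived from UTC and raises a disagree error when the two differ by more than a tolerance. The function name, the parameters, the 2-second tolerance, and all the numbers are assumptions made up for illustration.

```python
def check_elapsed_time(met_s: float, utc_now_s: float, utc_liftoff_s: float,
                       tolerance_s: float = 2.0) -> float:
    """Return the MET if it agrees with UTC-derived elapsed time, else raise."""
    derived_s = utc_now_s - utc_liftoff_s
    if abs(met_s - derived_s) > tolerance_s:
        # A disagree means SOMETHING is wrong: stop and surface it.
        raise ValueError(
            f"MET disagree: clock={met_s}s, UTC-derived={derived_s}s"
        )
    return met_s


# Illustrative values only: liftoff at UTC second 1_000_000, now 900.5 s later.
agreed = check_elapsed_time(900.0, 1_000_900.5, 1_000_000.0)

try:
    # Same UTC readings, but the mission clock is wildly off.
    check_elapsed_time(40_500.0, 1_000_900.5, 1_000_000.0)
    disagree_detected = False
except ValueError:
    disagree_detected = True
```

A disagree here does not say which source is wrong; it only stops the sequence so that the fault can be identified before anything acts on it.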
2
-
1
-
@DC2022 what you and Scott are missing is that the "minor software issue" is actually a serious, ground-up design problem: the system needs to be completely redesigned with different assumptions and criteria.
They designed the system in a way where a single point of failure would cause the capsule to blindly do things regardless of any other inputs.
It was simply responding to a fixed sequence based on the clock, and that means that we won't find out what else has zero fault tolerance or error checking, until it's actually put to use.
I do industrial automation, so I am familiar with making large machines perform actions automatically, and you always want to verify your inputs and have some form of error checking or redundancy.
For instance, the easiest one: if the clock says 11, before you do the step for 11, you verify that the previous steps have been completed.
Or, before changing the orientation of the craft, you check the star trackers and GPS and compare where you are supposed to be against where you actually are.
You can also compare SECO/deployment with the clock time and verify that they match the schedule.
But, just like with MCAS, it is a system with lots of power, and no way to verify or error check what it's doing, so one input is all it needs to jump off the bridge blindly.
And you can say that the crew could have overridden it and everything would be fine, but that is not a valid assumption. They obviously failed at basic design and ground testing: they designed a system that could fail so confidently, and did not test it adequately enough to discover this problem. The rest of the system was designed and tested by the same team, so it would be surprising if this blind and Fail Dangerous system didn't have other similar flaws, bad assumptions, and improperly tested aspects that could have terrible consequences without a review as detailed as the one MCAS is receiving.
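The "where you are supposed to be versus where you actually are" check before a maneuver might look something like the sketch below. The function name, the coordinate tuples, and the 10 km tolerance are hypothetical and chosen only to show the shape of the check.

```python
import math


def cleared_for_maneuver(expected_km, measured_km, tolerance_km=10.0) -> bool:
    """Allow the maneuver only if measured position is near the planned one."""
    # Straight-line distance between planned and measured position vectors.
    error_km = math.dist(expected_km, measured_km)
    return error_km <= tolerance_km


planned = (7000.0, 0.0, 0.0)      # where the schedule says the craft should be
on_track = (7003.0, 4.0, 0.0)     # measured fix, 5 km from the plan
off_track = (6500.0, 0.0, 0.0)    # measured fix, 500 km from the plan

ok_to_fire = cleared_for_maneuver(planned, on_track)
must_hold = not cleared_for_maneuver(planned, off_track)
```

If the measured fix is far from the plan, the safe response is to hold and flag the mismatch rather than burn propellant on a schedule that no longer matches reality.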
1
-
1
-
@DC2022 yes, SpaceX discovered an unexpected design flaw in a check valve, allowing a small back flow leak.
So they replaced the check valve with a burst disc, and were ready to fly again.
Simple fix, minor change.
The only downside is that it requires replacement of the disc after every test, instead of just refilling as with the check valve.
In future iterations, I suspect they will put in a different sort of check valve, or an additional electric positive shutoff valve to allow ground testing, but the burst disc is all that's needed for abort situations.
The software issue on Starliner on the other hand needs a full review and testing regime, and then a rewrite, before it's safe.
1
-
1