Comments by @grokitall on the "Continuous Delivery" channel.

  1. Your comment demonstrates some of the reasons people don't get TDD. First, you are equating the module in your code with a unit, then equating the module test suite with the unit test, and then positing that you have to write the entire test suite before you write the code. That just is not how modern testing defines a unit test.
An example of a modern unit test would be a simple test that, given the number to enter into the cell, checks whether the number is between 1 and the product of the grid sizes and returns true or false. Your common sudoku uses a 3 x 3 grid, requiring that the number be less than or equal to 9, so the code would take the grid parameters, cache the product, check that the value was between 1 and 9, and return true or false based on the result. This would all be hidden behind an API, and you would test that, given a valid number, it returns true (sketched below).
You would then run the test and prove that it fails. A large number of tests written after the fact can pass not only when you run them, but also when you invert the condition or comment out the code that supplies the result. You would then write the first simple code that provides the correct result, run the test, and see it pass. Now you have validated your regression test in both the passing and the failing mode, giving you an executable specification of the code covered by that test. You also have a piece of code which implements that specification, and a documented example of how to call that module and what its parameters are, for use when writing the documentation.
Assuming it was not your first line of code, you would then look to see if the code could be generalised, and if it could you would refactor it, which is now easier because the implemented code already has regression tests. You would then add another unit test, which might check that the number you want to add isn't already used in a different position, and go through the same routine again, and then another bit of test and another bit of code, all the while growing your test suite until you have covered the whole module.
This is where test first wins: by rapidly producing the test suite and the code it tests, and by making sure the next change doesn't break something you have already written. It does require you to write the tests first, which some people regard as slowing you down, but if you want to know that your code works before you give it to someone else, you either have to take the risk that it is full of bugs, or you have to write the tests anyway for continuous integration, so doing it first does not actually cost you anything.
It does, however, gain you a lot. First, you know your tests will fail. Second, you know that when the code is right they will pass. Third, you can use your tests as examples when you write your documentation. Fourth, you know that the code you wrote is testable, as you already tested it. Fifth, you can now easily refactor, as the code you wrote is covered by tests. Sixth, it discourages the use of various anti-patterns which produce hard-to-test code. There are other positives, like making debugging fairly easy, but you get my point.
As your codebase gets bigger and more complex, or your problem domain is less well understood initially, the advantages rapidly expand, while the disadvantages largely evaporate. The test suite is needed for CI and refactoring, and the refactoring step is needed to handle technical debt.
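As a rough illustration, a minimal sketch of that first test and the simplest code to pass it, in C. The function name, the grid parameters, and the use of plain assert() are invented for the example, not taken from any particular codebase:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical API under test: is this value legal for a grid built from
   box_rows x box_cols boxes? For a common 3 x 3 sudoku the legal range is 1..9. */
bool cell_value_in_range(int value, int box_rows, int box_cols)
{
    int max = box_rows * box_cols;      /* cache the product of the grid sizes */
    return value >= 1 && value <= max;
}

int main(void)
{
    /* Written first, these fail until the function above exists and is correct,
       and fail again if you invert the condition or stub out the body. */
    assert(cell_value_in_range(1, 3, 3));
    assert(cell_value_in_range(9, 3, 3));
    assert(!cell_value_in_range(0, 3, 3));
    assert(!cell_value_in_range(10, 3, 3));
    return 0;
}
```

The next test in the cycle, that the number is not already used in a different position, would be added the same way: one test, then one bit of code.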
  5.  @julianbrown1331  Partly it is down to the training data, but the nature of how these systems work does not filter that data by quality before, during, or after training, so lots of them produce code which is as bad as the average of the training data, most of which was written by newbies learning either the language or the tools.
You also misrepresent how copyright law works in practice. When someone claims you are using their code, they only have to show that it is a close match. To avoid summary judgement against you, you have to show that it is a convergent solution arising from the constraints of the problem space, and that there was no opportunity to copy the code. Given that studies have shown that, for edge cases with very few examples, these tools have produced identical code snippets right down to the comments, good luck proving there was no chance to copy the code. Just saying "I got it from Microsoft Copilot" does not relieve you of the responsibility to audit the origins of the code. Even worse, Microsoft cannot prove it was not copied either, as the nature of statistical AI obfuscates how it got from the source data to the code it gave you. Worse still, the training data does not even flag which license the original code was under, so you could find yourself with GPL code with matching comments, leaving your only choice being to release your proprietary code under the GPL to avoid triple damages and comply with the license. On top of that, the original code is usually not written to be security or testability aware, so it has security holes, is hard to test, and you can't fix it.
  22.  @ulrichborchers5632  Dealing with your points in no particular order.
1, the EU did not favour competition over quality. Having previously found Microsoft guilty of trying to extend their OS monopoly to give them control of the browser market, they flagged up to Microsoft that the proposed change was equally guilty, and thus illegal, with regard to the security market. All they said to Microsoft was: go have another think and look for a legal way to do what you want to do. Making the API accessible to all security firms equally would have solved that problem, removing the attempt to expand their monopoly, but instead Microsoft chose to abandon the proposed API. This was Microsoft's choice, not the EU's. When Apple had the same choice later, they provided a kernel API to all userland programs, which could thus move the user code out of the kernel, which is usually a good idea. The fact that the change was initially proposed, and that Apple later did basically the same thing, shows that the idea had merit. The fact that Microsoft chose to roll back the proposed change rather than share it with everyone was a commercial decision at Microsoft, not a technical one, as can be seen from the fact that a lot later they basically had to copy Apple. The EU literally had nothing to say about the technical merits or the options Microsoft had; they just said you can't do that, it's illegal, have another go and get back to us. There were multiple options for Microsoft, who chose for commercial reasons to just roll back the change, but it was their choice.
2, third party code in the kernel. There are various good reasons to have third party code in the kernel, and multiple ways you can handle it. One reason is speed, which is the case for low level drivers like graphics. Another is the level of access needed to do the job, which is the case for security software. Linux handles it by requiring this code to be added to the kernel under the same GPL2 license as the rest of the code, or, if you want to keep it private, you get a lower level of access to some of the APIs. Closed source systems like Microsoft's have a number of options. They tried just letting any old garbage in, which was one of the reasons the Windows 95 to Windows ME consumer kernel was such a flaky and crash-prone piece of garbage. They tried forcing the graphics drivers into user space, which is why Windows 3.11 did not really crash that much, but this made the thing slow. They tried going down the driver signing route, but that is underfunded and thus both slow and expensive; more importantly, CrowdStrike showed that by allowing the code to be a binary blob, it was easy to subvert.
The backdoor, as you called it, was basically a REPL inside the driver, which partly makes sense for this use case, but the way it was implemented was as another binary blob which their own code did not get to see. The only way to handle this safely is to create a cryptographic hash at creation time, which would be easy to check at download time to spot corruption. They did not do this. That only tells you it did not get corrupted in flight, so you also need testing; while they claimed to be doing that, they did not do it. Next, as it is kernel code, you provide a crash detection method which rolls back to an earlier version if the machine can't boot. Again they claimed to do this but didn't. Lastly, you allow critical systems to run the previous version.
Again they claimed to do this, but the REPL binaries were explicitly excluded, so when the primaries crashed due to the bad update and people tried switching to systems which were supposed to run the previous version, they found that the policy only applied to the core code and not to the binary REPL blobs, so as soon as those blobs were updated the fallback systems crashed as well.
I find their argument that they did not have time for testing and canary releasing as specious as the security through obscurity argument. If you don't have time for tests, you really don't have time to manually reboot all the machines you crashed. The testing time to determine that your new update crashes the system is minimal: just install the update on some local machines, reboot, and have each machine ping another machine to say it booted up OK, and you are done. They could have done this easily, but we know from the way they crashed the Linux kernel with a previous bug (which happened to be in the kernel) that they did not do this. We have known how not to do this for at least three decades for user space code, and for kernel space code you should be even more careful, but they just could not be bothered to do any of the things which would have turned this into a non event, including doing what they told their customers they were doing. This is why the lawsuits are coming, and why at least some of them have a good chance of winning against them.
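A minimal sketch of the download-time integrity check described above, assuming the publisher ships the expected SHA-256 alongside the update; it uses OpenSSL's one-shot SHA256() helper (link with -lcrypto), and the file name and expected digest are placeholders invented for the illustration:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/sha.h>

/* Return 0 if the file's SHA-256 matches expected_hex, non-zero otherwise. */
static int verify_update(const char *path, const char *expected_hex)
{
    FILE *f = fopen(path, "rb");
    if (!f) return 1;

    /* read the whole update into memory (fine for a sketch) */
    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    fseek(f, 0, SEEK_SET);
    if (len < 0) { fclose(f); return 1; }
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf || fread(buf, 1, (size_t)len, f) != (size_t)len) { fclose(f); free(buf); return 1; }
    fclose(f);

    unsigned char digest[SHA256_DIGEST_LENGTH];
    SHA256(buf, (size_t)len, digest);
    free(buf);

    char hex[2 * SHA256_DIGEST_LENGTH + 1];
    for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
        sprintf(hex + 2 * i, "%02x", digest[i]);

    return strcmp(hex, expected_hex) != 0;   /* 0 means the hashes match */
}

int main(void)
{
    /* hypothetical file name; the digest would come from the publisher */
    if (verify_update("update.bin", "expected-sha256-hex-from-the-publisher") != 0) {
        fprintf(stderr, "update rejected: hash mismatch or unreadable file\n");
        return 1;
    }
    puts("update hash verified");
    return 0;
}
```

As the comment says, this only catches corruption in flight; bad content that hashes correctly still has to be caught by the testing, rollback and canary steps.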
  29. You call them requirements, which generally implies big up-front design, but if you call them specifications it makes things clearer. TDD has three phases (sketched in code below).
In the first phase, you write a simple and fast test to document the specification of the next bit of code you are going to write. Because you know what that is, you should understand the specification well enough to write a test that is going to fail, and then it fails. This gives you an executable specification of that piece of code. If it doesn't fail, you fix the test.
Then you write just enough code to meet the specification, and the test passes, proving the test good because it works as expected and the code good because it meets the specification. If it still fails, you fix the code.
Finally you refactor the code, reducing technical debt and proving that the test you wrote is testing the API, not an implementation detail. If a valid refactoring breaks the test, you fix the test, and keep fixing it until you get it right.
At any point you can spot another test, make a note of it, and carry on, and when you have completed the cycle you can pick another test from your notes, or write a different one. In this way you grow your specification with your code, and use it incrementally to feed back into the higher level design of your code. Nothing stops you from using A.I. tools to produce higher level documentation from your code to give hints at the direction your design is going in.
This is the value of test first, and even more so of TDD. It encourages the creation of an executable specification of the entirety of your covered codebase, which you can then throw out and reimplement if you wish. Because test after, or worse, does not produce this implementation-independent executable specification, it is inherently weaker. The biggest win from TDD is that people doing classical TDD well do not generally write any new legacy code, which is not something you can generally say about those who don't practice it.
If you are doing any form of incremental development, you should have a good idea of the specification of the next bits of code you want to add. If you don't, you have much bigger problems than testing. This is different from knowing all of the requirements for the entire system up front; you just need to know enough to do the next bit. As to the issue of multi-threading and microservices, don't do it until you have to, and then do just enough. Anything else multiplies the problems massively before you need to.
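A minimal sketch of the three phases on a throwaway example; the clamp function and its names are invented purely for the illustration:

```c
#include <assert.h>

/* Phase 2 (green): just enough code to meet the specification below.
   Phase 3 (refactor): tidy the implementation however you like; the test must
   keep passing because it only talks to the public signature, not to how the
   work is done internally. */
static int clamp(int value, int lo, int hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

int main(void)
{
    /* Phase 1 (red): this executable specification is written first and fails
       (it won't even build) until clamp() exists and is correct. */
    assert(clamp(-5, 0, 10) == 0);
    assert(clamp(15, 0, 10) == 10);
    assert(clamp(7, 0, 10) == 7);
    return 0;
}
```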
  55.  @lapis.lazuli.  From what little information has leaked out, a number of things can be said about what went wrong.
First, the file seems to have had a big block replaced with zeros. If that block was in the driver, it would have been found by testing on the first machine you tried it on, as lots of tests would just fail, which should block the deployment. If it was a config file or a signature file, lots of very good programmers will write a checker so that a broken file is not even allowed to be checked in to version control (a sketch of that kind of checker follows below). In either case, basic good-practice testing should have caught it and stopped it before it even went out of the door. As that did not happen, we can tell that their testing regime was not good.
Then, they were not running this thing in house. If they were, the release would have been blocked almost immediately. Then they did not do canary releasing, and specifically the software did not include smoke tests to ensure it even got to the point of allowing the system to boot. If it had, the system would have disabled itself when the machine rebooted the second time without having set a simple flag to say yes, it worked. It could then also have phoned home, flagging up the problem and blocking the deployment.
According to some reports, they also made this particular update ignore customer upgrade policies. If so, they deserve everything thrown at them. Some reports even go as far as to say that some manager specifically said to ship without bothering to do any tests. In either case, a mandatory automatic update policy for anything, let alone a kernel module, is really stupid.
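A minimal sketch of the kind of pre-commit checker described above, assuming (as reported publicly) that the bad file contained a large block of zero bytes; the file name and threshold are placeholders, and a real checker would also parse the file's format:

```c
#include <stdio.h>

/* Reject a content/config file before it can be checked in or shipped:
   an unreadable file, an empty file, or a long run of zero bytes all fail. */
static int content_file_looks_sane(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f) return 0;

    long total = 0, zero_run = 0, longest_zero_run = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        total++;
        if (c == 0) {
            zero_run++;
            if (zero_run > longest_zero_run) longest_zero_run = zero_run;
        } else {
            zero_run = 0;
        }
    }
    fclose(f);

    /* placeholder threshold: a long run of NUL bytes is treated as corruption */
    return total > 0 && longest_zero_run < 64;
}

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "update.bin";   /* hypothetical name */
    if (!content_file_looks_sane(path)) {
        fprintf(stderr, "%s rejected: unreadable, empty, or zero-filled\n", path);
        return 1;   /* a non-zero exit blocks the commit or the deployment */
    }
    printf("%s passed the basic sanity check\n", path);
    return 0;
}
```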
  58. Not really; we all know how hard it is to fix bad attitudes in bosses. In the end it comes down to which sort of bad boss you have.
If it is someone who makes bad choices because they don't know any better, train them by ramming home the points they are missing at every opportunity until they start to get it. For example, if they want to get a feature out the door quickly, point out that by not letting you test, future changes will be slower. If they still don't let you test, point out that now it is done, you need to spend the time to test and to get it right, or the next change will be slower. If they still don't let you test, then when the next change comes along, point out how it will now take longer to do, as you still have to do all the work you were not allowed to do before to get the code into a shape where it is easy to add the new stuff.
If after doing that for a while there is still no willingness to let you test, then you have the second sort: a careerist boss. Their only interest is climbing the company ladder, and they will do anything to make themselves look good in the short term to get the promotion. The way to deal with this is simply to keep a paper trail of every time you advise them why something is a bad idea and they force you to do it anyway, and to encourage your colleagues to do the same. Eventually one of the inevitable failures will get looked into, and their habit of ignoring advice and trying to shift blame onto others will come to light. In the worst case, you won't be able to put up with it any more and will look for another job. When you do, make sure you put all of their behaviour in the resignation letter, and make sure copies go directly to HR and the CEO, who will then wonder what is going on and, in a good company, will look to find out.
  79.  @ansismaleckis1296  The problem with branching is that when you take more than a day between merges, it becomes very hard to keep the build green, and it pushes you towards merge hell. The problem with code review and pull requests is that when you issue the pull request and then have to wait for code review before the merge, it slows everything down. This in turn makes it more likely that the patches will get bigger, which take longer to review, making the process slower and harder, and thus more likely to miss your one-day merge window.
The whole problem comes from the question of what the purpose of version control is, which is to take a continuous backup of every change. However, this soon turned out to be of less use than expected, because most of the backups ended up in a broken state, sometimes going months between releasable builds, which made most of the backups of very little value. The solution turned out to be smaller patches merged more often, but pre-merge manual review was found not to scale well, so a different solution was needed, which turned out to be automated regression tests against the public API, which guard against the next change breaking existing code. This is what continuous integration is: running all those tests to make sure nothing broke.
The best time to write the tests is before you write the code, as then you have tested that the test can both fail and pass, which tells us that the code does what the developer intended it to do. TDD adds refactoring into the cycle, which further checks the test to make sure it does not depend on the implementation. The problem with not merging often enough is that it breaks refactoring: either you cannot do it, or the merge for the huge patch has to manually apply the refactoring to the unmerged code.
Continuous delivery takes the output from continuous integration, which is all the deployable items, and runs every other sort of test against it, trying to prove it unfit for release. If it fails to find any issues, it can then be deployed. The deployment can then be done using canary releasing, with chaos engineering being used to test the resilience of the system, performing a rollback if needed. It looks too good to be true, but it is what is actually done by most of the top companies in the DORA State of DevOps report.
  80. Alpha, beta and releasable date back to the old days of pressed physical media, and their meanings have changed somewhat in the modern world of online updates.
Originally, alpha software was not feature complete and was also buggy as hell, and thus was only usable for testing which parts worked and which didn't. Beta software was what your alpha software became when it was feature complete and the emphasis moved from adding features to bug fixing and optimisation, but it was usable for non-business-critical purposes. When beta software was optimised enough, with few enough bugs, it was deemed releasable and sent out for pressing in the expensive factory. Later, as more bugs were found by users and more optimisations were done, you might get service packs. This is how Windows 95 was developed, and it shipped with 4 known bugs, which hit Bill Gates at the product announcement demo to the press, after the release had already been pressed. After customers got their hands on it, the number of known bugs in the original release eventually went up to 15,000.
Now that online updates are a thing, especially when you do continuous delivery, the meanings are completely different. Alpha software on its initial release is the same as it ever was, but now the code is updated using semantic versioning. After the initial release, both the separate features and the project as a whole have the software levels mentioned above. By the second release, the completed features of version 1 have already moved into a beta state, with ongoing bug fixes and optimisations. The software as a whole remains in an alpha state until it is feature complete, and the previous recommendations still apply, with one exception: if you write code yourself that runs on top of it, you can make sure you don't use any alpha-level features. If someone else is writing the code, there is no guarantee that the next update to their code will not depend on a feature that is not yet mature, or even implemented, if the code is a compatibility library being reimplemented.
As you continue to update the software, you get more features and your minor version number goes up. Bug fixes don't increase the minor number, only the patch number (the rules are sketched in code after this comment). In general, the project is moving closer to being feature complete, and in the meantime the underlying code moves from alpha to beta to maintenance mode, where it only needs bug fixes as new bugs are found.
Thus you can end up with things like ReactOS, which takes the stable Wine code, removes 5 core libraries which are OS specific and which it implements itself, and produces something which can run a lot of older Windows programs at least as well as current Wine and current Windows. However, it is still alpha software because it does not fully implement the total current Windows API. Wine, on the other hand, is regarded as stable, as can be seen from the fact that its Proton variant used by Steam can run thousands of games, including some that are quite new. This is because those 5 core OS-specific libraries do not need to implement those features, only translate them from the Windows call to the underlying OS calls. The software is released as soon as each feature is complete, so releasable now does not mean ready for an expensive release process, but instead means that it does not have any major regressions as found by your CI and CD processes.
The software as a whole can remain alpha until it is feature complete, which can take a while, or, if you are writing something new, it can move to beta as soon as you decide that enough of it is good enough, and when those features enter maintenance mode it can be given a major version increment. This is how most projects now reach their next major version, unless they are a compatibility library. So now code is split into two branches, stable and experimental, with code moved to stable when CI is run, but not turned on until it is good enough, so you are releasing the code at every release, but not enabling every feature.
So now the project is alpha (which can suddenly crash or lose data), beta (which should not crash but might be slow and buggy) or stable (where it should not be slow, should not crash, and should have as few bugs as possible). With the new way of working, alpha software is often stable as long as you don't do something unusual, in which case it might lose data or crash; beta software does not usually crash, but can still be buggy, and its stable parts are OK for non-business-critical use; and stable software should not crash, lose data or otherwise misbehave, and should have as few known bugs as possible, making it usable for business-critical use. A different way of working, with very different results.
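A minimal sketch of the versioning rules described above; the struct and function names are invented for the illustration:

```c
#include <stdio.h>

/* MAJOR.MINOR.PATCH, as in the semantic versioning scheme described above */
struct semver { int major, minor, patch; };

/* a bug fix only bumps the patch number */
static struct semver after_bugfix(struct semver v)   { v.patch++; return v; }

/* a new backwards-compatible feature bumps minor and resets patch */
static struct semver after_feature(struct semver v)  { v.minor++; v.patch = 0; return v; }

/* a breaking change bumps major and resets the rest */
static struct semver after_breaking(struct semver v) { v.major++; v.minor = 0; v.patch = 0; return v; }

int main(void)
{
    struct semver v = {1, 0, 0};
    v = after_feature(v);    /* 1.1.0 */
    v = after_bugfix(v);     /* 1.1.1 */
    v = after_breaking(v);   /* 2.0.0 */
    printf("%d.%d.%d\n", v.major, v.minor, v.patch);
    return 0;
}
```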
  84. The problem with the idea of using statistical AI for refactoring is that the entire method is about producing plausible hallucinations that conform to very superficial correlations. To automate refactoring, you need to understand why the current code is wrong in this context. That is fundamentally outside the scope of how these systems are designed to work, and no minor tweaking can remove the lack of understanding from the underlying technology. The only way around this is to use symbolic AI, like expert systems or the Cyc project, but that is not where the current money is going.
Given the currently known problems with LLM-generated code, lots of projects are banning it completely. These issues include: exact copies of the training data right down to the comments, leaving you open to copyright infringement; code with massive security bugs, because the training data was not written to be security aware; hard-to-test code, because the training data was not written with testing in mind; suggested code that is identical to code under a different license, leaving you open to infringement claims; and the fact that when the code is identified as generated it is not copyrightable, while not flagging it up moves the liability for infringement onto the programmer. The only way to fix a model that generates bad code is to completely retrain it from scratch, which does not guarantee fixing the problem and risks introducing more errors. These are just some of the issues with statistical methods; there are many more.
  88.  @ContinuousDelivery  This is exactly the correct analogy to use. In science, what you are doing is crowdsourcing the tests based upon existing theories and data, and using the results to create new tests, data and theories. Peer review is then the equivalent of running the same test suite on different machines with different operating systems and library versions to see what breaks due to unspecified assumptions and sensitivity to initial conditions. This demonstrates that the testing is robust, and any new data can be fed back into improving the theory. And as with science, the goal is falsifiability of the initial assumptions.
Of course, the other problem is that there is a big difference between writing code and explaining it, and people are bad at explaining things they are perfectly good at doing. Testing is just explaining the code with tests, and the worst code to learn that skill on is legacy code with no tests. So people come along and try to fit tests to legacy code, only to find that the tests can only be implemented as flaky and fragile tests because the code under test was not designed for testability, which just convinces them that testing is not worth it. What they actually need is to take some TDD project which evolved as bugs were found, delete the tests, and compare how many and what types of bugs they find as they step through the commit history. If someone was being really nasty, they could delete the code and reimplement it with a bug for every test until they got code with zero passes, and then see what percentage of the bugs they found when they implemented their own test suite.
  89. TDD comes with a number of costs and benefits, and so does not doing TDD or continuous integration.
The cost of doing TDD is that you move your regression tests to the front of the process and refactor as you go, and it can cost up to 35 percent extra in time to market. What you get back is an executable specification, in the form of tests, that anyone can run to reimplement the code; a set of code designed to be testable, with very few bugs; and a combination that is optimised for doing continuous integration. You also spend very little time on bug hunting. It also helps in areas that are heavily regulated, as you can demonstrate on an ongoing basis that the code meets the regulations. All of this helps with getting customers to come back for support and for repeat business.
Not doing TDD also comes with benefits and costs. The benefit is mainly that your initial code dump comes fast, giving a fast time to market. The costs are significant. As you are not doing incremental testing, the code tends to be hard to test and modify. It also tends to be riddled with bugs which take a long time to find and fix. Because it is hard to modify, it is also hard to extend, and if someone else has to fix it, it can sometimes be quicker to just reimplement the whole thing from scratch. This tends to work against getting support work and repeat business. As for the snowflake code no one will touch, it will eventually break, at which point you end up having to do the same work anyway, but on an emergency basis, with all the costs that implies.
Testing is like planting a tree: the best time to do it is a number of years ago, the second best time is now. The evidence for incremental development with testing is in, in the DORA reports. Not testing is a disaster. Test after gives some advantages initially, while costing more, but rapidly plateaus. Test first costs a very little more than comprehensive test after, but as more code is covered you get an ever-accelerating speed of improvements and ease of implementing those improvements, and it is very easy for others to come along and maintain and expand the code, assuming they don't ask you to do the maintenance and extensions.
  90. I doubt it, but you do not need them. If you look at history you can see multitudes of examples of new tech disrupting industries, and project that onto what effect real AI will have. Specialisation led us away from being serfs; automation removed horses as the primary power source and moved us from working near 18-hour days, seven days per week, towards the current 40-hour, 5-day standard. Mechanisation also stopped us using 98 percent of the population for agriculture, moving most of them to easier, lower-hour, better paying work. This led to more office work, where word processors and then computers killed both the typing pool and the secretarial pool, as bosses became empowered to do work that used to have to be devolved to secretaries. As computers have become more capable they have spawned multiple new industries with higher creative input, and that trend will continue, with both AI and additive manufacturing only speeding up the process.
The tricky bit is not having the industrial and work background change, but having the social, legal and ethical background move fast enough to keep up. When my grandfather was born, the majority of people still worked on the land with horses, we did not have powered flight, and the control systems for complex mechanical systems were camshafts and simple feedback loops. When I was born, we had just stepped on the moon, computers had less power than a modern scientific calculator app on your smartphone, and everyone was trained at school on the assumption of a job for life. By the time I left school, it had become obvious that the job-for-life assumption had been on its way out since the early seventies, and that we needed to train people in school for lifelong learning instead, which a lot of countries still do not do. By the year 2000, it became clear that low-wage, low-skilled work was not something to map your career around, and that you needed to continually work to upgrade your skills, so that when you had to change career after less than 20 years, you had options for other, higher skilled and thus higher paid employment.
Current AI is hamstrung by the fact that the companies developing it are so pleased by the quantity of available training data that they ignore all other considerations, and so the output is absolutely dreadful. Take the Grammarly app or plugin: it can be very good at spotting when you have typed something which is garbage, but it can be hilariously bad at suggesting valid alternatives which don't mangle the meaning. It is also rubbish at the task given to schoolchildren of determining whether you should use which or witch, or their, there or they're. Copilot makes even worse mistakes, as you use it wanting quality code, but the codebases it was trained upon were written mostly by programmers with less than 5 years of experience, due to the exponential growth of programming giving a doubling of the number of programmers every 5 years. It also does nothing to determine the license the code was released under, thereby encouraging piracy and similar legal problems, and even if you could get away with claiming that the code was generated by Copilot and approved by you, it is not usually committed to version control that way, leaving you without an audit trail to defend yourself. To the extent you do commit it that way, it is not copyrightable in the US, so your company's lawyers should be screaming at you not to use it, for legal reasons.
Because no attempt was made, as a first step, to create an AI to quantify how bad the code was, the output is typically at the level of the average inexperienced programmer, so again, it should not be accepted uncritically; you would not accept it uncritically from a new hire, so why let the AI contribute equally bad code? The potential of AI is enormous, but the current commercial methodology would get your project laughed out of any genuinely peer-reviewed journal as anything but a proof of concept, and until they start using better methods with their AI projects there are a lot of good reasons not to let them near anything you care about in anything but a trivial manner. Also, as long as a significant percentage of lawmakers are as incompetent as your typical magazine Republican representative, we have no chance of producing a legal framework which has any relationship to the needs of the industry, pushing development to less regulated and less desirable locations, just as currently happens with alternative nuclear power innovations.
  96.  @lucashowell7653  The tests in TDD are unit tests and integration tests that assert that the code does what it did the last time the tests were run. These are called regression tests, but unless they have high coverage and are run automatically with every commit, you have large areas of code where you don't know when something broke. If the code was written before the tests, especially if the author isn't good at testing, it is hard to retrofit regression tests, and to the extent you succeed they tend to be flakier and more fragile. This is why it is better to write them first.
Assuming that the code was written by someone who understands how to write testable code, you could use A.I. to create tests automatically, but then you probably would not have tests where you could easily understand what a failure meant, due to poor naming. When you get as far as doing continuous integration the problem is even worse, as the point of the tests is to prove that the code still does what the programmer understood was needed, and to document this, and software cannot understand that yet. If you go on to continuous delivery, you have additional acceptance tests whose purpose is to prove that the programmer has the same understanding of what is needed as the customer, which requires an even higher level of understanding of the problem space, and software just does not understand either the customer or the programmer that well, either now or in the near future.
This means that to do the job well, the tests need to be written by humans so as to be easily understood, and the time which makes this easiest is to write one test, followed by the code to pass the test. For acceptance tests, the easiest time is as soon as the code is ready for the customer to test, adding tests where the current version does not match customer needs. Remember, customers don't even know what they need over 60% of the time.
  105.  @georganatoly6646  This is where the CI and CD distinction comes in useful. Using C for illustrative purposes, you decide to write mylibrary. This gives you mylibrary.h, which contains your public API, and mylibrary.c, which contains the code that provides an implementation of that public API. To the extent your tests break this separation, they become fragile and implementation dependent, which is usually very bad.
By implementing your unit and integration tests against the public API in mylibrary.h, you gain a number of benefits (sketched in code after this comment), including: 1, you can throw away mylibrary.c and replace it with a total rewrite, and the tests still work; to the extent they do not, you have either broken that separation, or you have not yet written the code to pass the test that failed. 2, you provide an executable specification of what you think the code should be doing; if a test then breaks, your change to mylibrary.c changed the behaviour of the code, breaking the specification, which lets you be the first one to find out when you do something wrong. 3, your suite of tests gives lots of useful examples of how to use the public API, which makes it easier for your users to figure out how to use the code and provides you with detailed examples for when you write the documentation.
Finally, you use the code in myprogram.c, and you have only added the functions you need to the library (until someone else starts using it in theirprogram.c, where the two programs might each have extra functions the other does not need, which should be pushed down into the library when it becomes obvious that the code should live there instead of in the programs). You then use CI to compile and test the program, at which point you know that the code behaves as you understand it should. This is then passed to CD, where further acceptance tests are run, which determine whether your understanding of the behaviour matches what your customer understood the behaviour to be. If a mismatch is found, you add more acceptance tests until it is well enough documented, and go back and fix the code until it passes the acceptance tests as well. At that point, not only do you know that the code does what you expect it to do, but also that this matches what the customer expected it to do, in a way that immediately complains if you get a regression which causes any of the tests to fail. In your example, you failed because you did not have a program being implemented to use the code, so it was only at the acceptance test point that it was determined that there were undocumented requirements.
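A minimal sketch of that layout, split across the three files named in the comments; the add_numbers function and its tests are invented purely to show where the tests point:

```c
/* mylibrary.h -- the public API; this is all the tests are allowed to see */
#ifndef MYLIBRARY_H
#define MYLIBRARY_H
int add_numbers(int a, int b);
#endif

/* mylibrary.c -- one possible implementation; it can be rewritten freely */
#include "mylibrary.h"
int add_numbers(int a, int b) { return a + b; }

/* test_mylibrary.c -- unit tests written only against mylibrary.h */
#include <assert.h>
#include "mylibrary.h"
int main(void)
{
    assert(add_numbers(2, 2) == 4);    /* survives any rewrite of mylibrary.c */
    assert(add_numbers(-1, 1) == 0);   /* and doubles as a usage example */
    return 0;
}
```

myprogram.c would include the same header; the CI job compiles and runs the test binary on every commit, while the CD acceptance tests run against the built program.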
  106.  @deanschulze3129  There are reasons behind the answers to some of your questions, and I will try to address them here.
First, the reason TDD followers take automated regression testing seriously is that a lot of the early advocates came from experience with large teams writing complex software that needed long development times. In that context, regression tests are not optional, as lots of people are making lots of changes to different parts of the code that they don't know very well. This led to the development of continuous integration, where code coverage for regression testing is essential. TDD came along after the development of continuous integration, adding the awareness of technical debt and the refactoring step to the continuous integration cycle.
You don't seem to understand just how recent the understanding of how to do regression testing is. Even the idea of what a unit test is was not present in the 2012 edition of the book "The Art of Software Testing", yet it forms the base of the testing pyramid at the heart of regression testing. Also, automated regression testing cannot work unless you get management buy-in to the idea that code needs tests, and that broken tests are the most important code to fix, which is even harder to get quickly, but all of the tech giants do exactly that. You cannot do continuous integration without it. Even worse, you cannot learn good test practices by trying to fit tests to code that was written without testing in mind: the resulting tests tend to depend on implementation details and are often flaky and fragile, further pushing against the adoption of regression testing.
As to statistics, the DORA metrics produced from the annual State of DevOps report clearly indicate that no testing produces the worst results; that test after initially provides better results than no testing, but only up to a certain point, due to the previously mentioned problems with retrofitting regression tests to code not designed for it; and that test first produces ever faster delivery of code of higher quality than either of the other two. The methodology behind the report is given in detail in the Accelerate book, written by the authors of the State of DevOps report because they got fed up with having to explain it in detail to every new reader they encountered.
Bear in mind that the number of programmers doubles every five years, so by definition most programmers have less than five years of experience in any software development methodology, let alone advanced techniques. Those techniques are often not covered in training courses for new programmers, and sometimes are not even well covered in degree level courses.
  107.  @trignals  Not really. The history of programming has been a migration away from hard to understand, untestable, clever code which almost nobody can understand, towards code which better models the problem space and the design goals needed to get something good enough to do the job, and which is easier to maintain, as the costs have moved away from the hardware, then away from the initial construction, until most of the cost now sits in the multi-year maintenance phase.
There are lots of people in lots of threads on lots of videos about the subject who seem to buy the hype that you can just throw statistical AI at legacy code and it will suddenly create massive amounts of easy to understand tests, which you can then throw at another AI which can trivially create wonderful code to replace that big ball of mud with optimum code behind optimum tests, so that the whole system is basically AI-generated tests and code, built by systems which fundamentally can never reach the point of knowing the problem space and the design options, because they fundamentally do not work that way. As some of those problems are analogous to the halting problem, I am fundamentally sceptical of the hype, which goes on to suggest that if there is not enough data to create enough superficial correlations, we can just go ahead and use AI to fake up some more data to improve the training of the other AI systems.
As you can guess, a lot of these assumptions just do not make sense. A system which cannot model the software cannot use the model it does not have to make high level design choices that make the code testable; it cannot then go on to use the analysis of the code it does not do to figure out the difference between bad code and good code, or to figure out how to partition the problem; and finally, it cannot use the understanding it does not have to decide how big a chunk to regenerate, or whether the new code is better than the old code. For greenfield projects, it is barely plausible that you might be able to figure out the right tests to give it to get it to generate something which does not totally stink, but I have my doubts. For legacy code, everything depends on understanding what is already there and figuring out how to make it better, which is something these systems are basically not designed to be able to do.