Comments by @grokitall on the "Continuous Delivery" channel.

  1. Your comment demonstrates some of the reasons people don't get TDD. First, you are equating the module in your code with a unit, then equating the module test suite with the unit test, and then positing that you have to write the entire test suite before you write the code. That just is not how modern testing defines a unit test.
An example of a modern unit test would be a simple test that, given the number to enter into the cell, checks whether the number is between 1 and the product of the grid sizes and returns true or false. Your common sudoku uses a 3 x 3 grid, requiring that the number be less than or equal to 9, so the code would take the grid parameters, cache the product, check that the value was between 1 and 9, and return true or false based on the result. This would all be hidden behind an API, and you would test that, given a valid number, it returns true (sketched below).
You would then run the test and prove that it fails. A large number of tests written after the fact can pass not only when you run them, but also when you invert the condition or comment out the code that supplies the result. You would then write the first simple code that provides the correct result, run the test, and see it pass. Now you have validated your regression test in both the passing and the failing mode, giving you an executable specification of the code covered by that test. You also have a piece of code which implements that specification, and a documented example of how to call that module and what its parameters are, for use when writing the documentation.
Assuming it was not your first line of code, you would then look to see if the code could be generalised, and if it could you would refactor it, which is now easier because the implemented code already has regression tests. You would then add another unit test, which might check that the number you want to add isn't already used in a different position, and go through the same routine again, and then another bit of test and another bit of code, all the while growing your test suite until you have covered the whole module.
This is where test first wins: by rapidly producing the test suite and the code it tests, and by making sure the next change doesn't break something you have already written. It does require you to write the tests first, which some people regard as slowing you down, but if you want to know that your code works before you give it to someone else, you either have to take the risk that it is full of bugs, or you have to write the tests anyway for continuous integration, so doing it first does not actually cost you anything.
It does, however, gain you a lot. First, you know your tests will fail. Second, you know that when the code is right they will pass. Third, you can use your tests as examples when you write your documentation. Fourth, you know that the code you wrote is testable, as you already tested it. Fifth, you can now easily refactor, as the code you wrote is covered by tests. Sixth, it discourages the use of various anti-patterns which produce hard-to-test code. There are other positives, like making debugging fairly easy, but you get my point.
As your codebase gets bigger and more complex, or your problem domain is less well understood initially, the advantages rapidly expand, while the disadvantages largely evaporate. The test suite is needed for CI and refactoring, and the refactoring step is needed to handle technical debt.
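As a rough illustration, a minimal sketch of that first test and the simplest code to pass it, in C. The function name, the grid parameters, and the use of plain assert() are invented for the example, not taken from any particular codebase:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical API under test: is this value legal for a grid built from
   box_rows x box_cols boxes? For a common 3 x 3 sudoku the legal range is 1..9. */
bool cell_value_in_range(int value, int box_rows, int box_cols)
{
    int max = box_rows * box_cols;      /* cache the product of the grid sizes */
    return value >= 1 && value <= max;
}

int main(void)
{
    /* Written first, these fail until the function above exists and is correct,
       and fail again if you invert the condition or stub out the body. */
    assert(cell_value_in_range(1, 3, 3));
    assert(cell_value_in_range(9, 3, 3));
    assert(!cell_value_in_range(0, 3, 3));
    assert(!cell_value_in_range(10, 3, 3));
    return 0;
}
```

The next test in the cycle, that the number is not already used in a different position, would be added the same way: one test, then one bit of code.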
  5.  @julianbrown1331  Partly it is down to the training data, but the nature of how these systems work does not filter that data by quality before, during, or after training, so lots of them produce code which is as bad as the average of the training data, most of which was written by newbies learning either the language or the tools.
You also misrepresent how copyright law works in practice. When someone claims you are using their code, they only have to show that it is a close match. To avoid summary judgement against you, you have to show that it is a convergent solution arising from the constraints of the problem space, and that there was no opportunity to copy the code. Given that studies have shown that, for edge cases with very few examples, these tools have produced identical code snippets right down to the comments, good luck proving there was no chance to copy the code. Just saying "I got it from Microsoft Copilot" does not relieve you of the responsibility to audit the origins of the code. Even worse, Microsoft cannot prove it was not copied either, as the nature of statistical AI obfuscates how it got from the source data to the code it gave you. Worse still, the training data does not even flag which license the original code was under, so you could find yourself with GPL code with matching comments, leaving your only choice being to release your proprietary code under the GPL to avoid triple damages and comply with the license. On top of that, the original code is usually not written to be security or testability aware, so it has security holes, is hard to test, and you can't fix it.
  22.  @ulrichborchers5632  Dealing with your points in no particular order.
1, the EU did not favour competition over quality. Having previously found Microsoft guilty of trying to extend their OS monopoly to give them control of the browser market, they flagged up to Microsoft that the proposed change was equally guilty, and thus illegal, with regard to the security market. All they said to Microsoft was: go have another think and look for a legal way to do what you want to do. Making the API accessible to all security firms equally would have solved that problem, removing the attempt to expand their monopoly, but instead Microsoft chose to abandon the proposed API. This was Microsoft's choice, not the EU's. When Apple had the same choice later, they provided a kernel API to all userland programs, which could thus move the user code out of the kernel, which is usually a good idea. The fact that the change was initially proposed, and that Apple later did basically the same thing, shows that the idea had merit. The fact that Microsoft chose to roll back the proposed change rather than share it with everyone was a commercial decision at Microsoft, not a technical one, as can be seen from the fact that a lot later they basically had to copy Apple. The EU literally had nothing to say about the technical merits or the options Microsoft had; they just said you can't do that, it's illegal, have another go and get back to us. There were multiple options for Microsoft, who chose for commercial reasons to just roll back the change, but it was their choice.
2, third party code in the kernel. There are various good reasons to have third party code in the kernel, and multiple ways you can handle it. One reason is speed, which is the case for low level drivers like graphics. Another is the level of access needed to do the job, which is the case for security software. Linux handles it by requiring this code to be added to the kernel under the same GPL2 license as the rest of the code, or, if you want to keep it private, you get a lower level of access to some of the APIs. Closed source systems like Microsoft's have a number of options. They tried just letting any old garbage in, which was one of the reasons the Windows 95 to Windows ME consumer kernel was such a flaky and crash-prone piece of garbage. They tried forcing the graphics drivers into user space, which is why Windows 3.11 did not really crash that much, but this made the thing slow. They tried going down the driver signing route, but that is underfunded and thus both slow and expensive; more importantly, CrowdStrike showed that by allowing the code to be a binary blob, it was easy to subvert.
The backdoor, as you called it, was basically a REPL inside the driver, which partly makes sense for this use case, but the way it was implemented was as another binary blob which their own code did not get to see. The only way to handle this safely is to create a cryptographic hash at creation time, which would be easy to check at download time to spot corruption. They did not do this. That only tells you it did not get corrupted in flight, so you also need testing; while they claimed to be doing that, they did not do it. Next, as it is kernel code, you provide a crash detection method which rolls back to an earlier version if the machine can't boot. Again they claimed to do this but didn't. Lastly, you allow critical systems to run the previous version.
Again they claimed to do this, but the REPL binaries were explicitly excluded, so when the primaries crashed due to the bad update and people tried switching to systems which were supposed to run the previous version, they found that the policy only applied to the core code and not to the binary REPL blobs, so as soon as those blobs were updated the fallback systems crashed as well.
I find their argument that they did not have time for testing and canary releasing as specious as the security through obscurity argument. If you don't have time for tests, you really don't have time to manually reboot all the machines you crashed. The testing time to determine that your new update crashes the system is minimal: just install the update on some local machines, reboot, and have each machine ping another machine to say it booted up OK, and you are done. They could have done this easily, but we know from the way they crashed the Linux kernel with a previous bug (which happened to be in the kernel) that they did not do this. We have known how not to do this for at least three decades for user space code, and for kernel space code you should be even more careful, but they just could not be bothered to do any of the things which would have turned this into a non event, including doing what they told their customers they were doing. This is why the lawsuits are coming, and why at least some of them have a good chance of winning against them.
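A minimal sketch of the download-time integrity check described above, assuming the publisher ships the expected SHA-256 alongside the update; it uses OpenSSL's one-shot SHA256() helper (link with -lcrypto), and the file name and expected digest are placeholders invented for the illustration:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/sha.h>

/* Return 0 if the file's SHA-256 matches expected_hex, non-zero otherwise. */
static int verify_update(const char *path, const char *expected_hex)
{
    FILE *f = fopen(path, "rb");
    if (!f) return 1;

    /* read the whole update into memory (fine for a sketch) */
    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    fseek(f, 0, SEEK_SET);
    if (len < 0) { fclose(f); return 1; }
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf || fread(buf, 1, (size_t)len, f) != (size_t)len) { fclose(f); free(buf); return 1; }
    fclose(f);

    unsigned char digest[SHA256_DIGEST_LENGTH];
    SHA256(buf, (size_t)len, digest);
    free(buf);

    char hex[2 * SHA256_DIGEST_LENGTH + 1];
    for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
        sprintf(hex + 2 * i, "%02x", digest[i]);

    return strcmp(hex, expected_hex) != 0;   /* 0 means the hashes match */
}

int main(void)
{
    /* hypothetical file name; the digest would come from the publisher */
    if (verify_update("update.bin", "expected-sha256-hex-from-the-publisher") != 0) {
        fprintf(stderr, "update rejected: hash mismatch or unreadable file\n");
        return 1;
    }
    puts("update hash verified");
    return 0;
}
```

As the comment says, this only catches corruption in flight; bad content that hashes correctly still has to be caught by the testing, rollback and canary steps.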
  29. You call them requirements, which generally implies big up-front design, but if you call them specifications it makes things clearer. TDD has three phases (sketched in code below).
In the first phase, you write a simple and fast test to document the specification of the next bit of code you are going to write. Because you know what that is, you should understand the specification well enough to write a test that is going to fail, and then it fails. This gives you an executable specification of that piece of code. If it doesn't fail, you fix the test.
Then you write just enough code to meet the specification, and the test passes, proving the test good because it works as expected and the code good because it meets the specification. If it still fails, you fix the code.
Finally you refactor the code, reducing technical debt and proving that the test you wrote is testing the API, not an implementation detail. If a valid refactoring breaks the test, you fix the test, and keep fixing it until you get it right.
At any point you can spot another test, make a note of it, and carry on, and when you have completed the cycle you can pick another test from your notes, or write a different one. In this way you grow your specification with your code, and use it incrementally to feed back into the higher level design of your code. Nothing stops you from using A.I. tools to produce higher level documentation from your code to give hints at the direction your design is going in.
This is the value of test first, and even more so of TDD. It encourages the creation of an executable specification of the entirety of your covered codebase, which you can then throw out and reimplement if you wish. Because test after, or worse, does not produce this implementation-independent executable specification, it is inherently weaker. The biggest win from TDD is that people doing classical TDD well do not generally write any new legacy code, which is not something you can generally say about those who don't practice it.
If you are doing any form of incremental development, you should have a good idea of the specification of the next bits of code you want to add. If you don't, you have much bigger problems than testing. This is different from knowing all of the requirements for the entire system up front; you just need to know enough to do the next bit. As to the issue of multi-threading and microservices, don't do it until you have to, and then do just enough. Anything else multiplies the problems massively before you need to.
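A minimal sketch of the three phases on a throwaway example; the clamp function and its names are invented purely for the illustration:

```c
#include <assert.h>

/* Phase 2 (green): just enough code to meet the specification below.
   Phase 3 (refactor): tidy the implementation however you like; the test must
   keep passing because it only talks to the public signature, not to how the
   work is done internally. */
static int clamp(int value, int lo, int hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

int main(void)
{
    /* Phase 1 (red): this executable specification is written first and fails
       (it won't even build) until clamp() exists and is correct. */
    assert(clamp(-5, 0, 10) == 0);
    assert(clamp(15, 0, 10) == 10);
    assert(clamp(7, 0, 10) == 7);
    return 0;
}
```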
  55.  @lapis.lazuli.  From what little information has leaked out, a number of things can be said about what went wrong.
First, the file seems to have had a big block replaced with zeros. If that block was in the driver, it would have been found by testing on the first machine you tried it on, as lots of tests would just fail, which should block the deployment. If it was a config file or a signature file, lots of very good programmers will write a checker so that a broken file is not even allowed to be checked in to version control (a sketch of that kind of checker follows below). In either case, basic good-practice testing should have caught it and stopped it before it even went out of the door. As that did not happen, we can tell that their testing regime was not good.
Then, they were not running this thing in house. If they were, the release would have been blocked almost immediately. Then they did not do canary releasing, and specifically the software did not include smoke tests to ensure it even got to the point of allowing the system to boot. If it had, the system would have disabled itself when the machine rebooted the second time without having set a simple flag to say yes, it worked. It could then also have phoned home, flagging up the problem and blocking the deployment.
According to some reports, they also made this particular update ignore customer upgrade policies. If so, they deserve everything thrown at them. Some reports even go as far as to say that some manager specifically said to ship without bothering to do any tests. In either case, a mandatory automatic update policy for anything, let alone a kernel module, is really stupid.
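A minimal sketch of the kind of pre-commit checker described above, assuming (as reported publicly) that the bad file contained a large block of zero bytes; the file name and threshold are placeholders, and a real checker would also parse the file's format:

```c
#include <stdio.h>

/* Reject a content/config file before it can be checked in or shipped:
   an unreadable file, an empty file, or a long run of zero bytes all fail. */
static int content_file_looks_sane(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f) return 0;

    long total = 0, zero_run = 0, longest_zero_run = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        total++;
        if (c == 0) {
            zero_run++;
            if (zero_run > longest_zero_run) longest_zero_run = zero_run;
        } else {
            zero_run = 0;
        }
    }
    fclose(f);

    /* placeholder threshold: a long run of NUL bytes is treated as corruption */
    return total > 0 && longest_zero_run < 64;
}

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "update.bin";   /* hypothetical name */
    if (!content_file_looks_sane(path)) {
        fprintf(stderr, "%s rejected: unreadable, empty, or zero-filled\n", path);
        return 1;   /* a non-zero exit blocks the commit or the deployment */
    }
    printf("%s passed the basic sanity check\n", path);
    return 0;
}
```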
  58. Not really; we all know how hard it is to fix bad attitudes in bosses. In the end it comes down to which sort of bad boss you have.
If it is someone who makes bad choices because they don't know any better, train them by ramming home the points they are missing at every opportunity until they start to get it. For example, if they want to get a feature out the door quickly, point out that by not letting you test, future changes will be slower. If they still don't let you test, point out that now it is done, you need to spend the time to test and to get it right, or the next change will be slower. If they still don't let you test, then when the next change comes along, point out how it will now take longer to do, as you still have to do all the work you were not allowed to do before to get the code into a shape where it is easy to add the new stuff.
If after doing that for a while there is still no willingness to let you test, then you have the second sort: a careerist boss. Their only interest is climbing the company ladder, and they will do anything to make themselves look good in the short term to get the promotion. The way to deal with this is simply to keep a paper trail of every time you advise them why something is a bad idea and they force you to do it anyway, and to encourage your colleagues to do the same. Eventually one of the inevitable failures will get looked into, and their habit of ignoring advice and trying to shift blame onto others will come to light. In the worst case, you won't be able to put up with it any more and will look for another job. When you do, make sure you put all of their behaviour in the resignation letter, and make sure copies go directly to HR and the CEO, who will then wonder what is going on and, in a good company, will look to find out.
  79.  @ansismaleckis1296  The problem with branching is that when you take more than a day between merges, it becomes very hard to keep the build green, and it pushes you towards merge hell. The problem with code review and pull requests is that when you issue the pull request and then have to wait for code review before the merge, it slows everything down. This in turn makes it more likely that the patches will get bigger, which take longer to review, making the process slower and harder, and thus more likely to miss your one-day merge window.
The whole problem comes from the question of what the purpose of version control is, which is to take a continuous backup of every change. However, this soon turned out to be of less use than expected, because most of the backups ended up in a broken state, sometimes going months between releasable builds, which made most of the backups of very little value. The solution turned out to be smaller patches merged more often, but pre-merge manual review was found not to scale well, so a different solution was needed, which turned out to be automated regression tests against the public API, which guard against the next change breaking existing code. This is what continuous integration is: running all those tests to make sure nothing broke.
The best time to write the tests is before you write the code, as then you have tested that the test can both fail and pass, which tells us that the code does what the developer intended it to do. TDD adds refactoring into the cycle, which further checks the test to make sure it does not depend on the implementation. The problem with not merging often enough is that it breaks refactoring: either you cannot do it, or the merge for the huge patch has to manually apply the refactoring to the unmerged code.
Continuous delivery takes the output from continuous integration, which is all the deployable items, and runs every other sort of test against it, trying to prove it unfit for release. If it fails to find any issues, it can then be deployed. The deployment can then be done using canary releasing, with chaos engineering being used to test the resilience of the system, performing a rollback if needed. It looks too good to be true, but it is what is actually done by most of the top companies in the DORA State of DevOps report.
  80. Alpha, beta and releasable date back to the old days of pressed physical media, and their meanings have changed somewhat in the modern world of online updates.
Originally, alpha software was not feature complete and was also buggy as hell, and thus was only usable for testing which parts worked and which didn't. Beta software was what your alpha software became when it was feature complete and the emphasis moved from adding features to bug fixing and optimisation, but it was usable for non-business-critical purposes. When beta software was optimised enough, with few enough bugs, it was deemed releasable and sent out for pressing in the expensive factory. Later, as more bugs were found by users and more optimisations were done, you might get service packs. This is how Windows 95 was developed, and it shipped with 4 known bugs, which hit Bill Gates at the product announcement demo to the press, after the release had already been pressed. After customers got their hands on it, the number of known bugs in the original release eventually went up to 15,000.
Now that online updates are a thing, especially when you do continuous delivery, the meanings are completely different. Alpha software on its initial release is the same as it ever was, but now the code is updated using semantic versioning. After the initial release, both the separate features and the project as a whole have the software levels mentioned above. By the second release, the completed features of version 1 have already moved into a beta state, with ongoing bug fixes and optimisations. The software as a whole remains in an alpha state until it is feature complete, and the previous recommendations still apply, with one exception: if you write code yourself that runs on top of it, you can make sure you don't use any alpha-level features. If someone else is writing the code, there is no guarantee that the next update to their code will not depend on a feature that is not yet mature, or even implemented, if the code is a compatibility library being reimplemented.
As you continue to update the software, you get more features and your minor version number goes up. Bug fixes don't increase the minor number, only the patch number (the rules are sketched in code after this comment). In general, the project is moving closer to being feature complete, and in the meantime the underlying code moves from alpha to beta to maintenance mode, where it only needs bug fixes as new bugs are found.
Thus you can end up with things like ReactOS, which takes the stable Wine code, removes 5 core libraries which are OS specific and which it implements itself, and produces something which can run a lot of older Windows programs at least as well as current Wine and current Windows. However, it is still alpha software because it does not fully implement the total current Windows API. Wine, on the other hand, is regarded as stable, as can be seen from the fact that its Proton variant used by Steam can run thousands of games, including some that are quite new. This is because those 5 core OS-specific libraries do not need to implement those features, only translate them from the Windows call to the underlying OS calls. The software is released as soon as each feature is complete, so releasable now does not mean ready for an expensive release process, but instead means that it does not have any major regressions as found by your CI and CD processes.
The software as a whole can remain alpha until it is feature complete, which can take a while, or, if you are writing something new, it can move to beta as soon as you decide that enough of it is good enough, and when those features enter maintenance mode it can be given a major version increment. This is how most projects now reach their next major version, unless they are a compatibility library. So now code is split into two branches, stable and experimental, with code moved to stable when CI is run, but not turned on until it is good enough, so you are releasing the code at every release, but not enabling every feature.
So now the project is alpha (which can suddenly crash or lose data), beta (which should not crash but might be slow and buggy) or stable (where it should not be slow, should not crash, and should have as few bugs as possible). With the new way of working, alpha software is often stable as long as you don't do something unusual, in which case it might lose data or crash; beta software does not usually crash, but can still be buggy, and its stable parts are OK for non-business-critical use; and stable software should not crash, lose data or otherwise misbehave, and should have as few known bugs as possible, making it usable for business-critical use. A different way of working, with very different results.
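A minimal sketch of the versioning rules described above; the struct and function names are invented for the illustration:

```c
#include <stdio.h>

/* MAJOR.MINOR.PATCH, as in the semantic versioning scheme described above */
struct semver { int major, minor, patch; };

/* a bug fix only bumps the patch number */
static struct semver after_bugfix(struct semver v)   { v.patch++; return v; }

/* a new backwards-compatible feature bumps minor and resets patch */
static struct semver after_feature(struct semver v)  { v.minor++; v.patch = 0; return v; }

/* a breaking change bumps major and resets the rest */
static struct semver after_breaking(struct semver v) { v.major++; v.minor = 0; v.patch = 0; return v; }

int main(void)
{
    struct semver v = {1, 0, 0};
    v = after_feature(v);    /* 1.1.0 */
    v = after_bugfix(v);     /* 1.1.1 */
    v = after_breaking(v);   /* 2.0.0 */
    printf("%d.%d.%d\n", v.major, v.minor, v.patch);
    return 0;
}
```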
  84. The problem with the idea of using statistical AI for refactoring is that the entire method is about producing plausible hallucinations that conform to very superficial correlations. To automate refactoring, you need to understand why the current code is wrong in this context. That is fundamentally outside the scope of how these systems are designed to work, and no minor tweaking can remove the lack of understanding from the underlying technology. The only way around this is to use symbolic AI, like expert systems or the Cyc project, but that is not where the current money is going.
Given the currently known problems with LLM-generated code, lots of projects are banning it completely. These issues include: exact copies of the training data right down to the comments, leaving you open to copyright infringement; code with massive security bugs, because the training data was not written to be security aware; hard-to-test code, because the training data was not written with testing in mind; suggested code that is identical to code under a different license, leaving you open to infringement claims; and the fact that when the code is identified as generated it is not copyrightable, while not flagging it up moves the liability for infringement onto the programmer. The only way to fix a model that generates bad code is to completely retrain it from scratch, which does not guarantee fixing the problem and risks introducing more errors. These are just some of the issues with statistical methods; there are many more.
  88.  @ContinuousDelivery  This is exactly the correct analogy to use. In science, what you are doing is crowdsourcing the tests based upon existing theories and data, and using the results to create new tests, data and theories. Peer review is then the equivalent of running the same test suite on different machines with different operating systems and library versions to see what breaks due to unspecified assumptions and sensitivity to initial conditions. This demonstrates that the testing is robust, and any new data can be fed back into improving the theory. And as with science, the goal is falsifiability of the initial assumptions.
Of course, the other problem is that there is a big difference between writing code and explaining it, and people are bad at explaining things they are perfectly good at doing. Testing is just explaining the code with tests, and the worst code to learn that skill on is legacy code with no tests. So people come along and try to fit tests to legacy code, only to find that the tests can only be implemented as flaky and fragile tests because the code under test was not designed for testability, which just convinces them that testing is not worth it. What they actually need is to take some TDD project which evolved as bugs were found, delete the tests, and compare how many and what types of bugs they find as they step through the commit history. If someone was being really nasty, they could delete the code and reimplement it with a bug for every test until they got code with zero passes, and then see what percentage of the bugs they found when they implemented their own test suite.
  89. TDD comes with a number of costs and benefits, and so does not doing TDD or continuous integration.
The cost of doing TDD is that you move your regression tests to the front of the process and refactor as you go, and it can cost up to 35 percent extra in time to market. What you get back is an executable specification, in the form of tests, that anyone can run to reimplement the code; a set of code designed to be testable, with very few bugs; and a combination that is optimised for doing continuous integration. You also spend very little time on bug hunting. It also helps in areas that are heavily regulated, as you can demonstrate on an ongoing basis that the code meets the regulations. All of this helps with getting customers to come back for support and for repeat business.
Not doing TDD also comes with benefits and costs. The benefit is mainly that your initial code dump comes fast, giving a fast time to market. The costs are significant. As you are not doing incremental testing, the code tends to be hard to test and modify. It also tends to be riddled with bugs which take a long time to find and fix. Because it is hard to modify, it is also hard to extend, and if someone else has to fix it, it can sometimes be quicker to just reimplement the whole thing from scratch. This tends to work against getting support work and repeat business. As for the snowflake code no one will touch, it will eventually break, at which point you end up having to do the same work anyway, but on an emergency basis, with all the costs that implies.
Testing is like planting a tree: the best time to do it is a number of years ago, the second best time is now. The evidence for incremental development with testing is in, in the DORA reports. Not testing is a disaster. Test after gives some advantages initially, while costing more, but rapidly plateaus. Test first costs a very little more than comprehensive test after, but as more code is covered you get an ever-accelerating speed of improvements and ease of implementing those improvements, and it is very easy for others to come along and maintain and expand the code, assuming they don't ask you to do the maintenance and extensions.
  90. I doubt it, but you do not need them. If you look at history you can see multitudes of examples of new tech disrupting industries, and project that onto what effect real AI will have. Specialisation led us away from being serfs; automation removed horses as the primary power source and moved us from working near 18-hour days, seven days per week, towards the current 40-hour, 5-day standard. Mechanisation also stopped us using 98 percent of the population for agriculture, moving most of them to easier, lower-hour, better paying work. This led to more office work, where word processors and then computers killed both the typing pool and the secretarial pool, as bosses became empowered to do work that used to have to be devolved to secretaries. As computers have become more capable they have spawned multiple new industries with higher creative input, and that trend will continue, with both AI and additive manufacturing only speeding up the process.
The tricky bit is not having the industrial and work background change, but having the social, legal and ethical background move fast enough to keep up. When my grandfather was born, the majority of people still worked on the land with horses, we did not have powered flight, and the control systems for complex mechanical systems were camshafts and simple feedback loops. When I was born, we had just stepped on the moon, computers had less power than a modern scientific calculator app on your smartphone, and everyone was trained at school on the assumption of a job for life. By the time I left school, it had become obvious that the job-for-life assumption had been on its way out since the early seventies, and that we needed to train people in school for lifelong learning instead, which a lot of countries still do not do. By the year 2000, it became clear that low-wage, low-skilled work was not something to map your career around, and that you needed to continually work to upgrade your skills, so that when you had to change career after less than 20 years, you had options for other, higher skilled and thus higher paid employment.
Current AI is hamstrung by the fact that the companies developing it are so pleased by the quantity of available training data that they ignore all other considerations, and so the output is absolutely dreadful. Take the Grammarly app or plugin: it can be very good at spotting when you have typed something which is garbage, but it can be hilariously bad at suggesting valid alternatives which don't mangle the meaning. It is also rubbish at the task given to schoolchildren of determining whether you should use which or witch, or their, there or they're. Copilot makes even worse mistakes, as you use it wanting quality code, but the codebases it was trained upon were written mostly by programmers with less than 5 years of experience, due to the exponential growth of programming giving a doubling of the number of programmers every 5 years. It also does nothing to determine the license the code was released under, thereby encouraging piracy and similar legal problems, and even if you could get away with claiming that the code was generated by Copilot and approved by you, it is not usually committed to version control that way, leaving you without an audit trail to defend yourself. To the extent you do commit it that way, it is not copyrightable in the US, so your company's lawyers should be screaming at you not to use it, for legal reasons.
Because no attempt was made, as a first step, to create an AI to quantify how bad the code was, the output is typically at the level of the average inexperienced programmer, so again, it should not be accepted uncritically; you would not accept it uncritically from a new hire, so why let the AI contribute equally bad code? The potential of AI is enormous, but the current commercial methodology would get your project laughed out of any genuinely peer-reviewed journal as anything but a proof of concept, and until they start using better methods with their AI projects there are a lot of good reasons not to let them near anything you care about in anything but a trivial manner. Also, as long as a significant percentage of lawmakers are as incompetent as your typical magazine Republican representative, we have no chance of producing a legal framework which has any relationship to the needs of the industry, pushing development to less regulated and less desirable locations, just as currently happens with alternative nuclear power innovations.
  96.  @lucashowell7653  The tests in TDD are unit tests and integration tests that assert that the code does what it did the last time the tests were run. These are called regression tests, but unless they have high coverage and are run automatically with every commit, you have large areas of code where you don't know when something broke. If the code was written before the tests, especially if the author isn't good at testing, it is hard to retrofit regression tests, and to the extent you succeed they tend to be flakier and more fragile. This is why it is better to write them first.
Assuming that the code was written by someone who understands how to write testable code, you could use A.I. to create tests automatically, but then you probably would not have tests where you could easily understand what a failure meant, due to poor naming. When you get as far as doing continuous integration the problem is even worse, as the point of the tests is to prove that the code still does what the programmer understood was needed, and to document this, and software cannot understand that yet. If you go on to continuous delivery, you have additional acceptance tests whose purpose is to prove that the programmer has the same understanding of what is needed as the customer, which requires an even higher level of understanding of the problem space, and software just does not understand either the customer or the programmer that well, either now or in the near future.
This means that to do the job well, the tests need to be written by humans so as to be easily understood, and the time which makes this easiest is to write one test, followed by the code to pass the test. For acceptance tests, the easiest time is as soon as the code is ready for the customer to test, adding tests where the current version does not match customer needs. Remember, customers don't even know what they need over 60% of the time.
  105.  @georganatoly6646  This is where the CI and CD distinction comes in useful. Using C for illustrative purposes, you decide to write mylibrary. This gives you mylibrary.h, which contains your public API, and mylibrary.c, which contains the code that provides an implementation of that public API. To the extent your tests break this separation, they become fragile and implementation dependent, which is usually very bad.
By implementing your unit and integration tests against the public API in mylibrary.h, you gain a number of benefits (sketched in code after this comment), including: 1, you can throw away mylibrary.c and replace it with a total rewrite, and the tests still work; to the extent they do not, you have either broken that separation, or you have not yet written the code to pass the test that failed. 2, you provide an executable specification of what you think the code should be doing; if a test then breaks, your change to mylibrary.c changed the behaviour of the code, breaking the specification, which lets you be the first one to find out when you do something wrong. 3, your suite of tests gives lots of useful examples of how to use the public API, which makes it easier for your users to figure out how to use the code and provides you with detailed examples for when you write the documentation.
Finally, you use the code in myprogram.c, and you have only added the functions you need to the library (until someone else starts using it in theirprogram.c, where the two programs might each have extra functions the other does not need, which should be pushed down into the library when it becomes obvious that the code should live there instead of in the programs). You then use CI to compile and test the program, at which point you know that the code behaves as you understand it should. This is then passed to CD, where further acceptance tests are run, which determine whether your understanding of the behaviour matches what your customer understood the behaviour to be. If a mismatch is found, you add more acceptance tests until it is well enough documented, and go back and fix the code until it passes the acceptance tests as well. At that point, not only do you know that the code does what you expect it to do, but also that this matches what the customer expected it to do, in a way that immediately complains if you get a regression which causes any of the tests to fail. In your example, you failed because you did not have a program being implemented to use the code, so it was only at the acceptance test point that it was determined that there were undocumented requirements.
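A minimal sketch of that layout, split across the three files named in the comments; the add_numbers function and its tests are invented purely to show where the tests point:

```c
/* mylibrary.h -- the public API; this is all the tests are allowed to see */
#ifndef MYLIBRARY_H
#define MYLIBRARY_H
int add_numbers(int a, int b);
#endif

/* mylibrary.c -- one possible implementation; it can be rewritten freely */
#include "mylibrary.h"
int add_numbers(int a, int b) { return a + b; }

/* test_mylibrary.c -- unit tests written only against mylibrary.h */
#include <assert.h>
#include "mylibrary.h"
int main(void)
{
    assert(add_numbers(2, 2) == 4);    /* survives any rewrite of mylibrary.c */
    assert(add_numbers(-1, 1) == 0);   /* and doubles as a usage example */
    return 0;
}
```

myprogram.c would include the same header; the CI job compiles and runs the test binary on every commit, while the CD acceptance tests run against the built program.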
  106.  @deanschulze3129  There are reasons behind the answers to some of your questions, and I will try to address them here.
First, the reason TDD followers take automated regression testing seriously is that a lot of the early advocates came from experience with large teams writing complex software that needed long development times. In that context, regression tests are not optional, as lots of people are making lots of changes to different parts of the code that they don't know very well. This led to the development of continuous integration, where code coverage for regression testing is essential. TDD came along after the development of continuous integration, adding the awareness of technical debt and the refactoring step to the continuous integration cycle.
You don't seem to understand just how recent the understanding of how to do regression testing is. Even the idea of what a unit test is was not present in the 2012 edition of the book "The Art of Software Testing", yet it forms the base of the testing pyramid at the heart of regression testing. Also, automated regression testing cannot work unless you get management buy-in to the idea that code needs tests, and that broken tests are the most important code to fix, which is even harder to get quickly, but all of the tech giants do exactly that. You cannot do continuous integration without it. Even worse, you cannot learn good test practices by trying to fit tests to code that was written without testing in mind: the resulting tests tend to depend on implementation details and are often flaky and fragile, further pushing against the adoption of regression testing.
As to statistics, the DORA metrics produced from the annual State of DevOps report clearly indicate that no testing produces the worst results; that test after initially provides better results than no testing, but only up to a certain point, due to the previously mentioned problems with retrofitting regression tests to code not designed for it; and that test first produces ever faster delivery of code of higher quality than either of the other two. The methodology behind the report is given in detail in the Accelerate book, written by the authors of the State of DevOps report because they got fed up with having to explain it in detail to every new reader they encountered.
Bear in mind that the number of programmers doubles every five years, so by definition most programmers have less than five years of experience in any software development methodology, let alone advanced techniques. Those techniques are often not covered in training courses for new programmers, and sometimes are not even well covered in degree level courses.
  107.  @trignals  Not really. The history of programming has been a migration away from hard to understand, untestable, clever code which almost nobody can understand, towards code which better models the problem space and the design goals needed to get something good enough to do the job, and which is easier to maintain, as the costs have moved away from the hardware, then away from the initial construction, until most of the cost now sits in the multi-year maintenance phase.
There are lots of people in lots of threads on lots of videos about the subject who seem to buy the hype that you can just throw statistical AI at legacy code and it will suddenly create massive amounts of easy to understand tests, which you can then throw at another AI which can trivially create wonderful code to replace that big ball of mud with optimum code behind optimum tests, so that the whole system is basically AI-generated tests and code, built by systems which fundamentally can never reach the point of knowing the problem space and the design options, because they fundamentally do not work that way. As some of those problems are analogous to the halting problem, I am fundamentally sceptical of the hype, which goes on to suggest that if there is not enough data to create enough superficial correlations, we can just go ahead and use AI to fake up some more data to improve the training of the other AI systems.
As you can guess, a lot of these assumptions just do not make sense. A system which cannot model the software cannot use the model it does not have to make high level design choices that make the code testable; it cannot then go on to use the analysis of the code it does not do to figure out the difference between bad code and good code, or to figure out how to partition the problem; and finally, it cannot use the understanding it does not have to decide how big a chunk to regenerate, or whether the new code is better than the old code. For greenfield projects, it is barely plausible that you might be able to figure out the right tests to give it to get it to generate something which does not totally stink, but I have my doubts. For legacy code, everything depends on understanding what is already there and figuring out how to make it better, which is something these systems are basically not designed to be able to do.