Comments by "" (@grokitall) on "Continuous Delivery" channel.

  5. @georganatoly6646 This is where the CI and CD distinction comes in useful. Using C for illustrative purposes, say you decide to write mylibrary. This gives you mylibrary.h, which contains your public API, and mylibrary.c, which contains the code that implements that public API. To the extent that your tests break this separation, they become fragile and implementation-dependent, which is usually very bad. By writing your unit and integration tests against the public API in mylibrary.h, you gain a number of benefits, including:

     1. You can throw away mylibrary.c and replace it with a total rewrite, and the tests still work. To the extent they do not, you have either broken that separation or not yet written the code to pass the failing test.
     2. You provide an executable specification of what you think the code should be doing. If a test then breaks, your change to mylibrary.c changed the behaviour of the code, breaking the specification. This lets you be the first to find out when you do something wrong.
     3. Your suite of tests gives lots of useful examples of how to use the public API. This makes it easier for your users to figure out how to use the code, and provides you with detailed examples for when you write the documentation.

     Finally, you use the code in myprogram.c, and you only add the functions you need to the library (until someone else starts using it in theirprogram.c, where the two programs might each have extra functions the other does not need; those should be pushed down into the library once it becomes obvious the code belongs there instead of in the program). You then use CI to compile and test the program, at which point you know that the code behaves as you understand it should. This is then passed to CD, where further acceptance tests are run, which determine whether the behaviour you understood matches the behaviour your customer understood. If a mismatch is found, you add more acceptance tests until it is well enough documented, then go back and fix the code until it passes the acceptance tests as well. At this point you know not only that the code does what you expect it to do, but that this matches what the customer expected it to do, in a way that immediately complains if a regression causes any of the tests to fail. A minimal sketch of the header/implementation split is shown after this comment.

     In your example, you failed because you did not have a program being implemented to use the code, so it was only at the acceptance test stage that the undocumented requirements were discovered.
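
     A minimal sketch of the split described above, assuming a hypothetical my_clamp() function (the function, file names, and build command are invented for illustration, not taken from the video or the thread). The test includes only mylibrary.h, so mylibrary.c can be thrown away and rewritten, and the test still compiles and passes as long as the public behaviour is preserved.

     ```c
     /* mylibrary.h -- the public API; the unit tests depend only on this file. */
     #ifndef MYLIBRARY_H
     #define MYLIBRARY_H

     /* Hypothetical function: clamp value into the range [lo, hi]. */
     int my_clamp(int value, int lo, int hi);

     #endif

     /* mylibrary.c -- one possible implementation, free to be rewritten. */
     #include "mylibrary.h"

     int my_clamp(int value, int lo, int hi)
     {
         if (value < lo) return lo;
         if (value > hi) return hi;
         return value;
     }

     /* test_mylibrary.c -- exercises only the public API from mylibrary.h.  */
     /* Build and run:  cc -o tests test_mylibrary.c mylibrary.c && ./tests  */
     #include <assert.h>
     #include "mylibrary.h"

     int main(void)
     {
         assert(my_clamp(5, 0, 10) == 5);    /* value already in range          */
         assert(my_clamp(-3, 0, 10) == 0);   /* clamped up to the lower bound   */
         assert(my_clamp(99, 0, 10) == 10);  /* clamped down to the upper bound */
         return 0;                           /* exit code 0 means tests pass    */
     }
     ```

     If a rewrite of mylibrary.c changes the observable behaviour, the failing assert is the executable specification complaining, which is exactly the point above.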
  6. @deanschulze3129 There are reasons behind the answers to some of your questions, and I will try to address them here.

     First, the reason TDD followers take automated regression testing seriously is that a lot of the early advocates came from large teams writing complex software that needed long development times. In that context regression tests are not optional, as lots of people are making lots of changes to parts of the code they do not know very well. This led to the development of continuous integration, where code coverage for regression testing was essential. TDD came along later, after continuous integration, and with the added awareness of technical debt it brought refactoring into the continuous integration cycle.

     You don't seem to appreciate just how recent the understanding of how to do regression testing is. Even the idea of what a unit test is was not present in the 2012 version of the book "The Art of Software Testing", yet it forms the base of the testing pyramid at the heart of regression testing. Also, automated regression testing cannot work unless you get management buy-in to the idea that code needs tests and that broken tests are the most important code to fix, which is even harder to get quickly, but all of the tech giants do exactly that. You cannot do continuous integration without it. Even worse, you cannot learn good test practices by trying to fit tests to code written without testing in mind. The resulting tests tend to depend on implementation details and are often flaky and fragile, further pushing against the adoption of regression testing (see the sketch after this comment).

     As for statistics, the DORA metrics produced from the annual State of DevOps report clearly indicate that no testing produces the worst results; that test-after initially provides better results than no testing, but only up to a certain point, due to the previously mentioned problems with retrofitting regression tests onto code not designed for them; and that test-first produces ever faster delivery of code of higher quality than either of the other two. The methodology behind the report is given in detail in the book Accelerate, written by the authors of the State of DevOps report because they got fed up with having to explain it in detail to every new reader they encountered.

     Bear in mind that the number of programmers doubles every five years, so by definition most programmers have less than five years of experience in any software development methodology, let alone advanced techniques. Those techniques are often not covered in training courses for new programmers, and are sometimes not even well covered in degree-level courses.
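
     A small, invented illustration of that fragility point, assuming a hypothetical legacy counter module (none of these file or function names come from the video or the thread). The retrofitted test reaches into private state by including the implementation file, so any refactor of that state breaks it even though the behaviour is unchanged; the behavioural test uses only the public header and survives a rewrite.

     ```c
     /* counter.h -- the public API of a hypothetical legacy module.          */
     void record_hit(void);
     int  hit_count(void);

     /* counter.c -- implementation with hidden state, not written for test.  */
     #include "counter.h"
     static int hits;                        /* private implementation detail  */
     void record_hit(void) { hits++; }
     int  hit_count(void)  { return hits; }

     /* fragile_test.c -- retrofitted test coupled to the implementation.      */
     /* Build and run:  cc fragile_test.c && ./a.out                           */
     #include <assert.h>
     #include "counter.c"                    /* includes the .c to reach 'hits' */

     int main(void)
     {
         hits = 41;                          /* pokes at private state          */
         record_hit();
         assert(hits == 42);                 /* breaks if 'hits' is renamed,    */
         return 0;                           /* split, or moved into a struct   */
     }

     /* behavioural_test.c -- depends only on counter.h, survives refactors.   */
     /* Build and run:  cc behavioural_test.c counter.c && ./a.out             */
     #include <assert.h>
     #include "counter.h"

     int main(void)
     {
         int before = hit_count();
         record_hit();
         assert(hit_count() == before + 1);  /* observable behaviour only       */
         return 0;
     }
     ```

     The fragile version fails the moment the hidden counter is renamed or restructured, even though record_hit() still works exactly as before, which is how retrofitted suites end up flaky and fragile.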
  7. @trignals Not really. The history of programming has been a migration away from hard to understand, untestable, clever code that almost nobody can follow, towards code that better models the problem space and the design goals needed to do the job well, and which is easier to maintain. That matters because the costs have moved away from the hardware, then away from the initial construction, until most of the cost now sits in multi-year maintenance.

     There are lots of people in lots of threads on lots of videos about this subject who seem to buy the hype that you can just throw statistical AI at legacy code and it will suddenly create massive amounts of easy to understand tests, which you can then throw at another AI which will trivially create wonderful code to replace that big ball of mud with optimum code behind optimum tests, so that the whole system is basically AI-generated tests and code. But it would be built by systems which fundamentally can never reach the point of knowing the problem space and the design options, because they simply do not work that way. As some of those problems are analogous to the halting problem, I am fundamentally sceptical of the hype which goes on to suggest that if there is not enough data to create enough superficial correlations, we can just use AI to fake up some more data to improve the training of the other AI systems.

     As you can guess, a lot of these assumptions just do not make sense. A system which cannot model the software cannot use the model it does not have to make the high-level design choices that make the code testable. It cannot then use the analysis of the code it does not do to tell bad code from good code, or to figure out how to partition the problem. Finally, it cannot use that understanding it does not have to decide how big a chunk to regenerate, or whether the new code is better than the old code.

     For green field projects it is barely plausible that you might figure out the right tests to give it so that it generates something which does not totally stink, but I have my doubts. For legacy code, everything depends on understanding what is already there and figuring out how to make it better, which is something these systems basically are not designed to do.