Comments by "" (@grokitall) on "ThePrimeTime" channel.
-
Absolutely right. Unit tests do automated regression testing of the public API of your code, asserting input/output combinations to provide an executable specification of that API. When well named, the value of these tests is as follows:
1, Because they test only one thing, generally they are individually blindingly fast.
2, when named well, they are the equivalent of executable specifications of the API, so when it breaks you know what broke, and what it did wrong.
3, they are designed to black box test a stable public API, even if you just started writing it. Anything that relies on private APIs is not a unit test.
4, they prove that you are actually writing code that can be tested, and when written before the code, also prove that the test can fail.
5, they give you examples of code use for your documentation.
6, they tell you about changes that break the API before your users have to.
Points 4 and 6 are actually why people like tdd. Point 2 is why people working in large teams like lots of unit tests.
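As a rough illustration of points 1 and 2, a well named unit test might look something like this in C (account_withdraw and its behaviour are invented for the example):
```c
#include <assert.h>

/* account_withdraw is invented for the example: it returns the new
   balance, or -1 if the withdrawal is rejected */
static int account_withdraw(int balance, int amount)
{
    return (amount > balance) ? -1 : balance - amount;
}

/* the test names double as the executable specification: when one
   fails, the name tells you what broke and what it did wrong */
static void test_withdraw_more_than_balance_is_rejected(void)
{
    assert(account_withdraw(100, 150) == -1);
}

static void test_withdraw_within_balance_returns_the_remainder(void)
{
    assert(account_withdraw(100, 40) == 60);
}

int main(void)
{
    test_withdraw_more_than_balance_is_rejected();
    test_withdraw_within_balance_returns_the_remainder();
    return 0;
}
```
Each test is a single assert against the public API, so it runs in microseconds and reads as a specification of one behaviour.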
Everyone I have encountered who does not like tests, thinks they are fragile, hard to maintain, and otherwise a pain, and who was willing to talk to me about why, usually turned out to be writing hard to test code, with tests at too high a level, and often had code with one of many bad smells about it. Examples included constantly changing public APIs, overuse of global variables, brain functions or non-deterministic code.
the main outputs of unit testing are code that you know is testable, tests that you know can fail, and knowing that your API is stable. As a side effect of this, it pushes you away from coding styles which make testing hard, and discourages constantly changing published public APIs. A good suite of unit tests will let you completely throw away the implementation of the API, while letting your users continue to use it without problems. It will also tell you how much of the reimplemented code has been completed.
A small point about automated regression tests. Like trunk based development, they are a foundational technology for continuous integration, which in turn is foundational to continuous delivery and DevOps, so not writing regression tests fundamentally limits quality on big, fast moving projects with lots of contributors.
-
no, waterfall was not a thought experiment, it was something that emerged from repeatedly discovering that we also need to do x, then just tacking it on the end.
the original paper said "this is what people are doing, and here is why it is a really bad idea".
people then took the diagram from the original paper and used it as a how to document.
the problem with it is that it cannot work for 60% of projects, and does not work very well for a lot of the others.
they tried fixing it by making it nearly impossible to change the specs after the initial stage, and while it made sure projects got built, it ended up with a lot of stuff which was delivered obsolete due to the world changing and the specs staying the same.
the agile manifesto came about in direct response to this, saying: here is what we currently do, but if we do this instead it works better, making incremental development work for that other 60%.
-
i think he is mainly wrong about just about everything here.
there is a lot of reimplementation in open source, as people want open source support for preexisting filetypes and apis, but this is useful.
technical quality matters, otherwise you end up with bit rot.
people who can do cool things is not the same as people who can explain cool things and otherwise communicate well.
most of the rest of it sounds like someone who sends in a 90k pull request then feels butt hurt that nobody is willing to take it without review, when it is just too big to review.
20 years ago we had open source developed with authorised contributors, and it just caused way too many problems, which is why we now do it with distributed version control and very small pull requests. this also removes most of the "how dare he criticise me" attitude, due to being less invested in a 10 line patch than a 10k patch.
also, text does not transmit subtlety very well, and when it does, most people miss it. this leads to unsubtle code criticism, to maintain the engineering value.
-
the issue of what makes code bad is important, and has to do with how much of the complexity of the code is essential vs accidental. obviously some code has more essential complexity than others, but this is exactly when you need to get a handle on that complexity.
we have known since brooks wrote the mythical man month back in the 1970s that information hiding matters, and every new development in coding has reinforced the importance of this, which is why abstraction is important, as it enables this information hiding.
oop, functional programming, tdd, and refactoring all build on top of this basic idea of hiding the information, but in different ways, and they all bring something valuable to the table.
when you have been in the industry for even a short while, you soon encounter a few very familiar anti patterns: spaghetti code, the big ball of mud, and the worst one, the piece of snowflake code that everyone is afraid to touch because it will break.
all of these are obviously bad code, and are full of technical debt, and the way to deal with them is abstraction, refactoring, and thus testing.
given your previously stated experience with heavily ui dependent untestable frameworks, therefore requiring heavy mocking, i can understand your dislike of testing, but that is due to the fact that you are dealing with badly designed legacy code, and fragile mocking is often the only way to start getting a handle on legacy code.
i think we can all agree that trying to test legacy code sucks, as it was never designed with testing or lots of other useful things in mind.
lots of the more advanced ideas in programming start indirectly from languages where testing was easier, and looked at what made testing harder than it needed to be, then adopted a solution to that particular part of the problem.
right from the start of structured programming, it became clear that naming mattered, and that code reuse makes things easier, first by using subroutines more, then by giving them names, and letting them accept and return parameters.
you often ended up with a lot of new named predicates, which were used throughout the program. these were easy to test, and by moving them into well named functions it made the code more readable. later this code could be extracted out into libraries for reuse across multiple programs.
this led directly to the ideas of functional programming and extending the core language to also contain domain specific language code.
later, the realisation that adding an extra field broke apis a lot led to the idea of structs, where there is a primary key field and multiple additional fields. when a struct is passed to functions, adding a new field makes no difference to the api, which made them really popular.
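a minimal sketch of that point, with an invented order struct: because callers pass the struct, adding a field later changes neither the function signature nor any call site.
```c
#include <stdio.h>

struct order {
    int id;         /* the primary key field */
    int quantity;
    int unit_price;
    /* adding, say, "int discount;" here later changes neither the
       signature of order_total() nor any of the code that calls it */
};

static int order_total(const struct order *o)
{
    return o->quantity * o->unit_price;
}

int main(void)
{
    struct order o = { 1, 3, 250 };
    printf("total: %d\n", order_total(&o));
    return 0;
}
```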
often these functions were so simple that they could be fully tested, and because they were moved to external libraries, those tests could be kept and reused. this eventually led to opdyke and others finding ways to handle technical debt which should not break good tests. this came to be known as refactoring.
when the test breaks under refactoring, it usually means one of two things:
1, you were testing how it did it, breaking information hiding.
2, your tool's refactoring implementation is broken, as a refactoring by definition does not change the functional behaviour of the code, and thus should not break the test.
when oop came along, instead of working from the program structure end of the problem, it worked on the data structure side, specifically by taking the structs, adding in the struct specific functions, and calling the result classes and methods.
again when done right, this should not break the tests.
with the rise of big code bases, and recognition of the importance of handling technical debt, we end up with continuous integration handling the large number of tests and yelling at us when doing something over here broke something over there.
ci is just running all of the tests after you make a change, to demonstrate that you did not break any of the code under test when you made a seemingly unrelated change.
tdd just adds an extra refactoring step to the code and test cycle, to handle technical debt, and make sure your tests deal with what is being tested, rather than how it works.
cd just goes one step further and adds acceptance testing on top of the functional testing from ci to make sure that your code not only still does what it did before, but has not made any of the non functional requirements worse.
testing has changed a lot since the introduction of ci, and code developed test first is much harder to write in a way that contains the more prominent anti patterns.
-
I think llvm did not do itself any favours. Originally it used gcc as its backend to provide support for multiple triples, but later defined them in an incompatible way. Seems silly to me.
It has long been possible for compiler writers to define int64 and int32 for when it matters and let the programmer use int when it does not matter for portability. The compiler writer should then use the default sizes for the architecture, rather than just using int.
At abi implementation time, it matters, so there should not be any "it depends" values in any abi implementation.
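This is exactly the split C has offered since C99 via stdint.h; a small sketch of the idea (the struct is invented for the illustration):
```c
#include <stdint.h>
#include <stdio.h>

/* where the width is part of the abi (file formats, wire formats,
   exported structs), spell it out instead of writing int and hoping */
struct record_header {
    uint32_t length;   /* exactly 32 bits on every platform */
    int64_t  offset;   /* exactly 64 bits on every platform */
};

int main(void)
{
    /* plain int is fine for a loop counter, where "whatever is natural
       for this architecture" is exactly what you want */
    for (int i = 0; i < 3; i++)
        printf("record %d, header size %zu bytes\n", i, sizeof(struct record_header));
    return 0;
}
```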
Of course that case mentioned is not the only time the glibc people broke the abi.
I think it was the version 5 to 6 update, where they left the parts that worked like c the same but broke the parts that worked like c++, yet did not declare it as a major version bump, so every c program still worked, but every c++ library had to be recompiled, as did anything which used the c++ abis.
Another instance of full recompile required, and it has become obvious that the glibc authors don't care about breaking users programs.
-
@noblebearaw actually, the bigger problem is that black box statistical ai has the issue that even though it might give you the right answer, it might do so for the wrong reason. there was an early example where they took photos of a forest with and without tanks hiding in it, and it worked. they then went back and took more photos of camouflaged tanks, and it didn't work at all. they managed to find out why, and the system had learned that the tank photos were taken on a sunny day, and the no tank photos were taken on a cloudy day, so the model learned how to spot sunny vs cloudy forest pics.
while the tech has improved massively, because statistical ai has no model except likelihood, it has no way to know why the answer was right, or to fix it when it is found to get the answers wrong. white box symbolic ai works differently, creating a model, and using the knowledge graph to figure out why the answer is right.
-
ci came from the realisation, going back to the original paper from the 70s, that the waterfall development model, while common, was fundamentally broken, and agile realised that to fix it you had to move things that appear late in the process to an earlier point, hence the meme about shift left.
the first big change was to implement continuous backups, now referred to as version control.
another big change was to move tests earlier, and ci takes this to the extreme by making them the first thing you do after a commit.
these two things together mean that your fast unit tests find bugs very quickly, and the version control lets you figure out where you broke it.
this promotes the use of small changes to minimise the differences in patches, and results in your builds being green most of the time.
long lived feature branches subvert this process, especially when you have multiple of them, and they go a long time between merges to the mainline (which you say you rebase from).
specifically, you create a pattern of megamerges, which get bigger the longer the delay. also, when you rebase, you are only merging the completed features into your branch, while leaving all the stuff in the other megamerges in their own branch.
this means when you finally do your megamerge, while you probably don't break mainline, you have the potential to seriously break any and all other branches when they rebase, causing each of them to have to dive into your megamerge to find out what broke them.
as a matter of practice it has been observed time and again that to avoid this you cannot delay merging all branches for much longer than a day, as it gives the other branches time to break something else, resulting in the continual red build problem.
-
@alst4817 my point about black box ai is not that it cannot be useful, but that due to the black box nature it is hard to have confidence that the answer is right and is anything more than coincidence, and the most you can get from it is a probability for how plausible the answer is.
this is fine in some domains where that is good enough, but completely rules it out for others where the answer needs to be right, and the reasoning chain needs to be available.
i am also not against the use of statistical methods in the right place. probabilistic expert systems have a long history, as do fuzzy logic expert systems.
my issue is the way these systems are actually implemented. the first problem is that lots of them work in a generative manner.
using the yast config tool of suse linux as an example, it is a very good tool, but only for the parameters it understands. at one point in time, if you made any change using this tool, it regenerated every file it knew about from its internal database, so if you needed to set any unmanaged parameters in any of those files, you then could not use yast at all, or your manual change would disappear.
this has the additional disadvantage that now those managed config files are not the source of truth, this is hidden in yasts internal binary database.
it also means that using version control on any of those files is pointless as the source of truth is hidden, and they are now generated files.
as the code's behaviour is controlled by the options in those config files, they should stay in text format and be version controlled, and any tool that manipulates them should update only the fields it understands, and only in files which have changed parameters.
similarly, these systems are not modular, instead being implemented as one big monolithic black box, which cannot be easily updated. this project is being discussed in a way that suggests that they will just throw lots of data at it and see what sticks. this approach is inherently limited. when you train something like chatgpt, where you do not organise the data, and let it figure out which of the 84000 free variables it is going to use to hallucinate a plausible answer, you are throwing away most of the value in that data, which never makes it into the system.
you then have examples like copilot, where having trained on crap code, it on average outputs crap code. some of the copilot like coding assistants are actually worse, where they replace the entire code block with a completely different one, rather than just fixing the bug, making a mockery of version control, and a lot of the time this code then does not even pass the tests the previous code passed.
then we have the semantic mismatch between the two languages. between any two languages, natural or synthetic, there is not an identity of function. some things can't be done at all in one language, and some stuff which is simple in one language can be really hard in another one. only symbolic ai has the rich model needed to understand this.
my scepticism about this is well earned, with lots of ai being ever optimistic to begin with, and then plateauing with no idea what to do next. i expect this to be no different, with it being the wrong problem, with a bad design, badly implemented. i wish them luck, but am not optimistic about their chances.
-
ada was commissioned because lots of government projects were being written in niche or domain specific languages, resulting in lots of mission critical software which was in effect write only code, but still had to be maintained for decades. the idea was to produce one language which all the code could be written in, killing the maintainability problem, and it worked.
unfortunately exactly the thing which made it work for the government kept it from more widespread adoption.
first, it had to cover everything from embedded to ai, and literally everything else. this required the same functions to be implemented in multiple ways, as something that works on a huge and powerful ai workstation with few time constraints needs to be different from a similar function in an embedded, resource limited and time critical setting.
this makes the language huge, inconsistent, and unfocused. it also made it a pain to implement the compiler, as you could not release it until absolutely everything had been finalised, and your only customers were government contractors, meaning the only way to recover costs was to sell it at a very high price, and due to the compiler size, it would only run on the most capable machines.
and yes, it had to be designed by committee, due to the kitchen sink design requirement. fulfilling its design goal of being good enough for coding all projects required experts on the requirements for all the different problem types, stating that x needs this function to be implemented like this, but y needs it implemented like that, and the two use cases were incompatible for these reasons.
rather than implementing the language definition so you could code a compiler for ada embedded, and a different one for ada ai, they put it all in one badly written document which really did not distinguish the use case specific elements, making it hard to implement, hard to learn, and just generally a pain to work with. it also was not written with the needs of compiler writers in mind.
also, because of the scope of the multiple use cases baked into the language design, it took way too long to define, and due to the above mentioned problems, even longer to implement. other simpler languages had already come along in the interim and taken over a lot of the markets the language was meant to cover, making it an also-ran for those areas outside of mandated government work.
-
There is a reason people use simple coding katas to demonstrate automated regression testing, tdd, and every other new method of working. Every new AI game technology starts with tic tac toe, moves on to checkers, and ends up at chess or go. It does this because the problems start out simple and get progressively harder, so you don't need to understand a new complex problem as well as a new approach to solving it.
Also, the attempt to use large and complex problems as examples has been proven not to work, as you have so much attention going on the problem that you muddy attempts to understand the proposed solution.
Also, there is a problem within a lot of communities that they use a lot of terminology in specific ways that differ from general usage, and different communities use that terminology to mean different things, but to explain new approaches you need to understand how both communities use the terms and address the differences, which a lot of people are really bad at.
-
@chudchadanstud like ci, unit testing is simple in principle. when i first started doing it i did not have access to a framework, and every test was basically a standalone program with main and a call, which then just returned a pass or fail, stored in either unittest or integrationtest directories, with a meaningful name so that when it failed the name told me how it failed, all run from a makefile.
each test was a functional test, and was run against the public api. i even got the expected return values i did not know by writing the test to always fail and print the result, and then verifying that it matched the result next time. when a new library was being created because the code would be useful in other projects, it then had the public api in the header file, and all of the tests were compiled and linked against the library, and all had to pass for the library to be used.
all of this done with nothing more than a text editor, a compiler, and the make program. this was even before version control took off. version control and a framework can help, but the important part is to test first, then add code to pass the test, then if something breaks, fix it before you do anything else.
remember, you are calling your public api, and checking that it returns the same thing it did last time you passed it the same parameters. you are testing what it does, not how fast, or how much memory it uses, or any other non functional property. what you get in return is a set of self testing code which you know works the same, because it still returns the same values. you also get for free an executable specification using the tests and the header file, so if you wished you could throw away the library code and use the tests to drive the rewrite to the same api.
but it all starts with test first, so that you don't write untestable code in the first place.
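a minimal sketch of that style, with an invented parse_price function standing in for the library's public api: one standalone program per test, no framework, pass or fail reported through the exit code so make can run it and stop on the first failure.
```c
#include <stdio.h>

/* stand-in for the library's public api; in practice this would come
   from the library's header and be linked in from the library itself */
static int parse_price(const char *text)
{
    int pounds = 0, pence = 0;
    sscanf(text, "%d.%d", &pounds, &pence);
    return pounds * 100 + pence;
}

int main(void)
{
    /* 1234 was captured by printing the result on the first run,
       then locked in so any change in behaviour fails the test */
    int got = parse_price("12.34");
    if (got != 1234) {
        printf("unittest/parse_price_handles_pounds_and_pence: FAIL (got %d)\n", got);
        return 1;   /* non zero exit code makes the makefile stop here */
    }
    printf("unittest/parse_price_handles_pounds_and_pence: PASS\n");
    return 0;
}
```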
-
@TimothyWhiteheadzm for airlines it is the knock on effects which kill you. say you cannot have the passengers board the plane.
at this point, you need to take care of the passengers until you can get them on another flight.
this might involve a couple of days staying at a hotel.
then the flight does not leave. at this point neither the plane or the pilots are going to be in the right place for the next flights they are due to take. as some of these pilots will be relief crew for planes where the crew are nearing their flight time limit, that plane now cannot leave either, so now you have to do the same with their passengers as well.
in the case of delta airlines, it went one step further, actually killing the database of which pilots were where, and you could not start rebuilding it from scratch until all the needed machines were back up and running.
the lawsuit from delta alone is claiming 500 million in damages, targeting crowdstrike for taking down the machines, and microsoft for not fixing the boot loop issue which caused them to stay down.
i know of 5 star hotels which could not check guests in and out, and of public house chains where no food or drinks could be sold for the entire day, as the ordering and payment systems were both down, and they had no on site technical support.
i am sure the damages quoted will turn out to be under estimates.
-
There is a lot of talking past each other and marketing inspired misunderstanding of terminology going on here, so I will try and clarify some of it.
When windows 95 was being written in 1992, every developer had a fork of the code, and developed their part of windows 95 in total isolation. Due to networking not really being a thing on desktop computers at the time, this was the standard way of working.
After 18 months of independent work, they finally started trying to merge this mess together, and as you can imagine the integration hell was something that had to be seen to be believed. Amongst other things, you had multiple cases where one developer needed some code and wrote it for his fork, while another developer did the same, but in an incompatible way. This led to there being multiple incompatible implementations of the same basic code in the operating system.
At the same time, they did not notice either the rise of networking or its importance, so it had no networking stack until somebody asked Bill Gates about networking in windows 95, at which point he basically took the open source networking stack from bsd Unix and put it into windows.
This release of a network enabled version of windows and the endemic use of networking on every other os enabled the development of centralised version control, and feature branches were just putting these forks into the same repository, without dealing with the long times between integrations, and leaving all the resulting problems unaddressed.
If you only have one or two developers working in their own branches this is an easily mitigated problem, but as the numbers go up, it does not scale.
These are the long lived feature branches which both Dave and primagen dislike. It is worth noting that the hp laser jet division was spending 5 times more time integrating branches than it was spending developing new features.
Gitflow was one attempt to deal with the problem, which largely works by slowing down the integration of code, and making sure that when you develop your large forks, they do not get merged until all the code is compatible with trunk. This leads to races to get your large chunk of code into trunk before someone else does, forcing them to suffer merge hell instead of you. It also promotes rushing to get the code merged when you hear that someone else is close to merging.
Merging from trunk helps a bit, but fundamentally the issue is with the chunks being too big, and there being too many of them, all existing only in their own fork.
With the rise in the recognition of legacy code being a problem, and the need for refactoring to deal with technical debt, it was realised that this did not work, especially as any refactoring work which was more than trivial made it more likely that the merge could not be done at all. One project set up a refactoring branch which had 7 people working on it for months, and when it was time to merge it, the change was so big that it could not be done.
An alternative approach was developed called continuous integration, which instead of slowing down merges was designed to speed them up. It recognised that the cause of merge hell was the size of the divergence, and thus advocated for the reduction in size of the patches, and merging them more often. It was observed that as contributions got faster, manual testing did not work, requiring a move from the ice cream cone model of testing used by a lot of Web developers towards the testing pyramid model.
Even so, it was initially found that the test suite spent most of its time failing, due to the amount of legacy code and the fragility of tests written for legacy code, which led to a more tests required and test first mode of working, which moves the shape of the code away from being shaped like legacy code and into a shape which is designed to be testable.
One rule introduced was that if the build breaks, the number one job of everyone is to get it back to passing all of the automated tests. Code coverage being good enough was also found to be important.
Another thing that was found is that once you started down the route of keeping the tests green, there was a maximum delay between merges which did not adversely affect this, which turned out to be about one day.
Testing became increasingly important, and slow test times were dealt with the same way slow build times were, by making the testing incremental. So you made a change, only built the bit which it changed, ran only those unit tests which were directly related to it, and once it passed, built and tested the bits that depended on it.
Because the code was all in trunk, refactoring did not usually break the merge any more, which is the single most important benefit of continuous integration: it lets you deal with technical debt much more easily.
Once all of the functional tests (both unit tests and integration tests) pass, which should happen within no more than 10 minutes, and preferably less than 5, you now have a release candidate which can then be handed over for further testing. The idea is that every change should ideally be able to go into this release candidate, but some bigger features are not ready yet, which is where feature flags come in. They replace branches with long lived unmerged code by a flag which hides the feature from the end user.
Because your patch takes less than 15 minutes from creation to integration, this is not a problem. The entire purpose of continuous integration is to try to prove that the patch you submitted is not fit for release, and if so, it gets rejected and you get to have another try, but as it is very small, this also is not really a problem. The goal is to make integration problems basically a non event, and it works.
The functional tests show that the code does what the programmer intended it to do. At this point it enters the deployment pipeline described in continuous delivery. The job of this is to run every other test needed, including acceptance tests, whose job is to show that what the customer intended and what the programmer intended match. Again the aim is to try to prove that the release candidate is not fit to be released.
In the same way that continuous delivery takes the output from continuous integration, continuous deployment takes the output from continuous delivery and puts it into a further pipeline designed to take the rolling release product of continuous delivery and put it through things like canary releasing so that it eventually ends up in the hands of the end users.
Again it is designed to try it out, and if problems are found, stop them from being deployed further. This is where crowdstrike got it wrong so spectacularly. In the worst case, you just roll back to the previous version, but at all stages you do the fix on trunk and start the process again, so the next release is only a short time away, and most of your customers will never even see the bug.
This process works even at the level of doing infrastructure as a service, so if you think that your project is somehow unique, and it cannot work for you, you are probably wrong.
Just because it can be released, delivered, and deployed, it does not mean it has to be. That is a business decision, but that comes back to the feature flags. In the meantime you are using feature flags to do dark launching, branch by abstraction to move between different solutions, and enabling the exact same code to go to beta testers and top tier users, just without some of the features being turned on.
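At its simplest, a feature flag is just a runtime check around the new code path. A minimal sketch (the flag name and the environment variable lookup are invented; a real system would more likely ask a config service or per user targeting rules):
```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* invented lookup: real systems usually consult a config service or a
   per user targeting rule rather than an environment variable */
static bool feature_enabled(const char *name)
{
    const char *flags = getenv("ENABLED_FEATURES");
    return flags != NULL && strstr(flags, name) != NULL;
}

int main(void)
{
    /* the new code ships to everyone, merged into trunk, but only
       beta testers and top tier users have the flag turned on */
    if (feature_enabled("new_checkout"))
        printf("using the new checkout flow\n");
    else
        printf("using the old checkout flow\n");
    return 0;
}
```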
-
@phillipsusi1791 it is entirely about code churn. every branch is basically a fork of upstream (the main branch in centralised version control). the problem with forks is that the code in them diverges, and this causes all sorts of problems with incompatible changes.
one proposed solution to this is to rebase from upstream, which is intended to sort out the problem of your branch not being mergable with upstream, and to an extent this works if the implicit preconditions for doing so are met.
where it falls over is with multiple long lived feature branches which don't get merged until the entire feature is done. during the lifetime of each branch, you have the potential for code in any of the branches to produce incompatible changes with any other branch. the longer the code isn't merged and the bigger the size of the changes, the higher the risk that the next merge will break something in another branch.
The only method found to mitigate this risk is continuous integration, and the only way this works is by having the code guarded by regression tests, and having everybody merge at least once a day. without the tests you are just hoping nobody broke anything, and if the merge is less often than every day, the build from running all the tests has been observed to be mostly broken, thus defeating the purpose of trying to minimise the risks.
the problem is not with the existence of the branch for a long period of time, but with the risk profile of many branches which don't merge for a long time. also, because every branch is a fork of upstream, any large scale changes like refactoring the code are by definition not fully applied to the unmerged code, potentially breaking the completeness and correctness of the refactoring.
this is why people doing continuous integration insist on at worst daily merges with tests which always pass. anything else just does not mitigate the risk that someone in one fork will somehow break things for either another fork, or for upstream refactorings.
it also prevents code sharing between the new code in the unmerged branches, increasing technical debt, and as projects get bigger, move faster, and have more contributors, this problem of unaddressed technical debt grows extremely fast. the only way to address it is with refactoring, which is the additional step added to test driven development, and which is broken by long lived branches full of unmerged code.
this is why all the tech giants have moved to continuous integration, to handle the technical debt in large codebases worked on by lots of people, and it is why feature branching is being phased out in favour of merging and hiding the new feature behind a feature flag until it is done.
-
The best way to answer is to look how it works with linus Torvalds branch for developing the Linux kernel. Because you are using version control, your local copy is essentially a branch, so you don't need to create a feature branch.
You make your changes in main, which is essentially a branch of Linus's branch, add your tests, and run all of the tests. If this fails, fix the bug. If it works, rebase and quickly rerun the tests, then push to your online repository. This then uses hooks to automatically submit a pull request, and linus gets a whole queue of them, which are then applied in the order in which they came in.
When it is your turn, either it merges ok and becomes part of everyone else's next rebase, or it doesn't, the pull is rejected, linus moves on to the next request, and you get to go back, do another rebase and test, and push your new fixes back up to your remote copy, which will then automatically generate another pull request. Repeat the process until it merges successfully, and then your local system is a rebased local copy of upstream.
Because you are writing small patches, rather than full features, the chances of a merge conflict are greatly reduced, often to zero if nobody else is working on the code you changed. It is this which allows the kernel to get new changes every 30 seconds all day every day.
Having lots of small fast regression tests is the key to this workflow, combined with committing every time the tests pass, upstreaming with every commit, and having upstream do ci on the master branch.
-
In principle, I agree that all code should be that clean, but that means that there are a bunch of string functions you must not use because msvc uses a different function than gcc for the same functionality.
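For example, MSVC deprecates several of the classic string functions in favour of its _s variants, so portable code often hides the difference behind one small wrapper. A sketch of the idea (copy_string is invented for the illustration):
```c
#include <stdio.h>
#include <string.h>

/* one wrapper, so the rest of the code never cares which compiler it is built with */
static void copy_string(char *dst, size_t dst_size, const char *src)
{
#ifdef _MSC_VER
    strcpy_s(dst, dst_size, src);      /* msvc's "secure" variant */
#else
    strncpy(dst, src, dst_size - 1);   /* plain c library elsewhere */
    dst[dst_size - 1] = '\0';
#endif
}

int main(void)
{
    char buf[16];
    copy_string(buf, sizeof buf, "portable");
    printf("%s\n", buf);
    return 0;
}
```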
In practice, people write most code on a specific machine with a specific tool chain, and have a lot of it. Having to go and fix every error right away because the compiler writer has made a breaking change is, in effect, a bug. So is an optimisation where a test breaks only because optimisation is turned on.
In this case, what happened is that a minor version update introduced a breaking change to existing code, and instead of having it as an option you could enable, made it the default.
How most compilers do this is they wrap these changes in a feature flag, which you can then enable.
On the next major version, they enable it by default when you do the equivalent of -Wall, but let you disable it.
On the one after that it becomes the default, but you can still override it for legacy code which has not been fixed yet.
Most programmers live in this latter world, where you only expect the compiler to break stuff on a major version bump, and you expect there to be a way to revert to the old behaviour for legacy code.
-
it is not business ethics which require the shift in your company policy, but the resiliency lessons learned after 9/11 which dictate it.
many businesses with what were thought to be good enough plans had them fail dramatically when faced with the loss of the data centers duplicated between the twin towers, the loss of the main telephone exchange covering a large part of the city, and being locked out of their buildings until the area was safe while their backup diesel generators had their air intake filters clog and thus the generator fail due to the dust.
for the businesses it did not kill, recovery times were often on the order of weeks to get access to their equipment, and months to get back to the levels they were at previously, directly leading to the rise of chaos engineering to identify and test systems for single points of failure and for graceful degradation and recovery, as seen with the simian army of tools at netflix.
load balancing against multiple suppliers across multiple areas is just a mitigation strategy against single points of failure, and in this case the bad actors at cloudflare were clearly a single point of failure.
with a good domain name registrar, you can not only add new nameservers, which i would have done as part of looking for new providers, but you can shorten the time that other people looking up your domain cache the name server entries to under an hour, which i would have also done as soon as potential new hosting was being explored and trialed.
as long as your domain registrar is trustworthy, and you practice resiliency, the mitigation could have been really fast. changing the name server ordering could have been done as soon as they received the 24 hour ransom demand, giving time for the caches to move and making the move invisible for most people.
not only did they not do that, or have any obvious resiliency policy, but they also built critical infrastructure around products from external suppliers without any plan for what to do if there was a problem.
clearly cloudflare's behaviour was dodgy, but the casino shares some of the blame for being an online business with insufficient plans for how to stay online.
-
@373323 there are a number of companies who were running n-1 or n-2 versions of the driver, which crowdstrike supports, but the issue here is that it was company policy, as stated by the ceo, to immediately push the signature files out to everyone in one go, without further testing.
the information from crowdstrike is that the engineer in question picked up an untested template, modified it for the case in hand, ran a validator program against it which had not been updated to cover that template (and thus should have failed it), and once that passed, picked up the files, and shipped them out to everyone with no further testing, as per company policy.
it then took them 90 minutes to spot that there was a problem, and do a 2 minute fix to roll back the update to stop the rollout and fix any machines with the bad update that had not yet rebooted.
it took them 6 hours from the rollout to have a solution to the problem of how to fix the rebooted machines, but it only really worked on basic desktops which did not need security.
at least one company reported spending 15 hours manually rebooting and fixing 40,000 machines. some were worse.
-
we now know what should have happened, and what actually happened, and they acted like amateurs.
first, they generated the file, which went wrong.
then they did the right thing, and ran a home built validator against it, but not as part of ci.
then after passing the validation test they built the deliverable.
then they shipped it out to 8.5 million mission critical systems with no further testing whatsoever which is a level of stupid which has to be seen to be believed.
this then triggered some really poor code in the driver, crashing windows, and their setting it into boot critical mode caused the whole thing to go into the boot loop.
this all could have been stopped before it even left the building.
after validating the file, you should then continue on with the other testing just like if you had changed the file. this would have caught it.
having done some tests, and created the deployment script, you could have installed it on test machines. this also would have caught it.
finally, you start a canary release process, starting with putting it on the machines in your own company. this also would have caught it.
if any of these steps had been done it would never have got out the door, and they would have learned a few things.
1, their driver was rubbish and boot looped if certain things went wrong. this could then have been fixed so it will never boot loop again.
2, their validator was broken. this could then have been fixed.
3, whatever created the file was broken. this could also have been fixed.
instead they learned different lessons.
1, they are a bunch of unprofessional amateurs.
2, their release methodology stinks.
3, shipping without testing is really bad, and causes huge reputational damage.
4, that damage makes the share price drop off a cliff.
5, it harms a lot of your customers, some with very big legal departments and a will to sue. some lawsuits are already announced as pending.
6, lawsuits hurt profits. we just don't know how bad yet.
7, hurting profits makes the share price drop even further.
not a good day to be crowdstrike.
some of those lawsuits could also target microsoft for letting the boot loop disaster happen, as this has happened before, and they still have not fixed it.
-
the main problem here is that prime and his followers are responding to the wrong video. this video is aimed at people who already understand 10+ textbooks worth of stuff with lots of agreed upon terminology, and is aimed at explaining to them why the tdd haters don't get it, most of which comes down to the fact that the multiple fields involved build on top of each other, and the haters don't actually share the same definitions for many of the terms, or of the processes involved.
in fact in a lot of cases, especially within this thread, the definitions the commentators use directly contradict the standard usage within the field.
in the field of testing, testing is split into lots of different types, including unit testing, integration testing, acceptance testing, regression testing, exploratory testing, and lots of others.
if you read any textbook on testing, a unit test is very small, blindingly fast, does not usually include io in any form, and does not usually include state across calls or long involved setup and teardown stages.
typically a unit test will only address one line of code, and will be a single assert that when given a particular input, it will respond with the same output every time. everything else is usually an integration test. you will then have a set of unit tests that provide complete coverage for a function.
this set of unit tests is then used as regression tests to determine if the latest change to the codebase has broken the function by virtue of asserting as a group that the change to the codebase has not changed the behaviour of the function.
pretty much all of the available research says that the only way to scale this is to automate it.
tdd uses this understanding by asserting that the regression test for the next line of code should be written before you write that line of code, and because the tests are very simple and very fast, you can run them against the file at every change and still work fast. because you keep them around, and they are fast, you can quickly determine if a change in behaviour in one place broke behaviour somewhere else, as soon as you make the change. this makes debugging trivial, as you know exactly what you just changed, and because you gave your tests meaningful names, you know exactly what that broke.
continuous integration reruns the tests on every change, and runs both unit tests and integration tests to show that the code continues to do what it did before, nothing more. this is designed to run fast, and fail faster. when all the tests pass, the build is described as being green.
when you add the new test, but not the code, you now have a failing test, and the entire build fails, showing that the system as a whole is not ready to release, nothing more. the build is then described as being red.
this is where the red-green terminology comes from, and it is used to show that the green build is ready to check in to version control, which is an integral part of continuous integration. this combination of unit and integration tests is used to show that the system does what the programmer believes the code should do. if this is all you do, you still accumulate technical debt, so tdd adds the refactoring step to manage and reduce technical debt.
refactoring is defined as changing the code in such a way that the functional requirements do not change, and this is tested by rerunning the regression tests to demonstrate that indeed the changes to the code have improved the structure without changing the functional behaviour of the code. this can be deleting dead code, merging duplicate code so you only need to maintain it in one place, or one of hundreds of different behaviour preserving changes in the code which improves it.
during the refactoring step, no functional changes to the code are allowed. adding a test for a bug, or to make the code do something more, happens at the start of the next cycle.
continuous delivery then builds on top of this by adding acceptance tests which confirm that the code does what the customer thinks it should be doing.
continuous deployment builds on top of continuous delivery to make it so that the whole system can be deployed with a single push of a button, and this is what is used by netflix for software, hp for printer development, tesla and spacex for their assembly lines, and lots of other companies for lots of things.
the people in this thread have conflated unit tests, integration tests and acceptance tests all under the heading of unit tests, which is not how the wider testing community uses the term. they have also advocated for the deletion of all regression tests based on unit tests. a lot of the talk about needing to know about the requirements in advance is based upon this idea that a unit test is a massive, slow, complex thing with large setup and teardown, but that is not how it is used in tdd. there you are only required to understand how to write the next line of code well enough that you can write a unit test for that line which will act as a regression test.
this appears to be where a lot of the confusion seems to be coming from.
in short, in tdd you have three steps (sketched in code below):
1, understand the needs of the next line of code well enough that you can write a regression test for it, write the test, and confirm that it fails.
2, write enough of that line that it makes the test pass.
3, use functionally preserving refactorings to improve the organisation of the codebase.
then go around the loop again. if during stages 2 and 3 you think of any other changes to make to the code, add them to a todo list, and then you can pick one to do on the next cycle. this expanding todo list is what causes the tests to drive the design. you do something extra for flakey tests, but that is outside the scope of tdd, and is part of continuous integration.
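a minimal sketch of one trip around that loop, using an invented word_count function, with the three steps marked in the comments:
```c
#include <assert.h>

/* step 2: just enough code to make the tests below pass (the first run
   was done against a stub returning -1, to prove the tests could fail) */
static int word_count(const char *s)
{
    int count = 0, in_word = 0;
    for (; *s; s++) {
        if (*s == ' ') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
    }
    return count;
}

int main(void)
{
    /* step 1: the regression tests, written first, pinning down behaviour */
    assert(word_count("") == 0);
    assert(word_count("two words") == 2);
    /* step 3 happens here: tidy up word_count without touching these
       asserts, rerun them, then pick the next item off the todo list */
    return 0;
}
```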
it should be pointed out that android and chromeos both use the ideas of continuous integration with extremely high levels of unit testing.
tdd fits naturally in this process, which is why so many companies using ci also use tdd, and why so many users of tdd do not want to go back to the old methods.
-
Code coverage is a measure of how happy you are to push crap to your users. The lower the percentage, the less you care about quality.
Anyone writing a test just to up the numbers will write a bad and fragile test, which they will then be required to debug and fix when it breaks.
Good testing only exercises the public API and tests it for stability. Everything else is either already covered by the API tests, is a sign of a missing test, or is dead code.
As to coverage vs code review, code review doesn't scale, especially in agile workflows, like continuous integration and trunk based development. At that point you need automated regression testing, which you can combine with a ratchet test to take advantage of improvements while not allowing regressions in your coverage numbers.
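A ratchet test can be as small as comparing the current number with the best one recorded so far. A sketch, assuming the coverage tool has already written the current percentage to coverage.txt and the baseline lives in coverage_baseline.txt (both file names are invented):
```c
#include <stdio.h>

/* fail the build if coverage drops, and move the baseline up when it improves */
int main(void)
{
    double current = 0.0, baseline = 0.0;
    FILE *cur = fopen("coverage.txt", "r");
    FILE *base = fopen("coverage_baseline.txt", "r");
    if (!cur || !base || fscanf(cur, "%lf", &current) != 1
                      || fscanf(base, "%lf", &baseline) != 1) {
        fprintf(stderr, "coverage ratchet: missing or unreadable numbers\n");
        return 1;
    }
    fclose(cur);
    fclose(base);
    if (current < baseline) {
        fprintf(stderr, "coverage regressed: %.1f%% < %.1f%%\n", current, baseline);
        return 1;
    }
    if (current > baseline) {   /* the ratchet only ever moves up */
        FILE *out = fopen("coverage_baseline.txt", "w");
        if (out) { fprintf(out, "%.1f\n", current); fclose(out); }
    }
    return 0;
}
```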
Remember you always start out with two guaranteed tests. 1, does it build without failing the build. 2, does it run the successfully built code without crashing. Anything better than that is an advantage you can build on top of.
-
Every branch is essentially forking the entire codebase for the project, with all of the negative connotations implied by that statement. In distributed version control systems, this fork is moved from being implicit in centralized version control to being explicit.
When two forks exist (for simplicity call them upstream and branch), there are only two ways to avoid having them become permanently incompatible. Either you slow everything down and make it so that nothing moves from the branch to upstream until it is perfect, which results in long lived branches with big patches, or you speed things up by merging every change as soon as it does something useful, which leads to continuous integration.
When doing the fast approach, you need a way to show that you have not broken anything with your new small patch. The way this is done is with small fast unit tests which act as regression tests against the new code; you write them before you commit the code for the new patch and commit them at the same time, which is why people using continuous integration end up with a codebase which has extremely high levels of code coverage.
What happens next is you run all the tests, and when they pass, it tells you it is safe to commit the change. This can then be rebased and pushed upstream, which then runs all the new tests against any new changes, and you end up producing a testing candidate which could be deployed, and it becomes the new master.
When you want to make the next change, as you have already rebased before pushing upstream, you can trivially rebase again before you start, and make new changes. This makes the cycle very fast, ensures that everyone stays in sync, and works even at the scale of the Linux kernel, which has new changes upstreamed every 30 seconds.
In contrast, the slow version works not by having small changes guarded by tests, but by having nothing moved to upstream until it is both complete and as perfect as can be detected. As it is not guarded by tests, it is not designed with testing in mind, which makes any testing slow and fragile, further discouraging testing, and is why followers of the slow method dislike testing.
It also leads to merge hell, as features without tests get delivered with a big code dump all in one go, which may then cause problems for those on other branches which have incompatible changes. You then have to spend a lot of time finding which part of this large patch with no tests broke your branch. This is avoided with the fast approach as all of the changes are small.
Even worse, all of the code in all of the long lived branches is invisible to anyone taking upstream and trying to do refactoring to reduce technical debt, adding another source of breaking your branch with the next rebase.
Pull requests with peer review add yet another source of delay, as you cannot submit your change upstream until someone else approves your changes, which can take tens to hundreds of minutes depending on the size of your patch. The fast approach replaces manual peer review with comprehensive automated regression testing which is both faster, and more reliable. In return they get to spend a lot less time bug hunting.
The unit tests and integration tests in continuous integration get you to a point where you have a release candidate which does all of the functions the programmer understood was wanted. This does not require all of the features to be enabled by default, only that the code is in the main codebase, and this is usually done by replacing the idea of the long lived feature branch with short lived (in the sense of between code merges) branches with code shipped but hidden behind feature flags, which also allows the people on other branches to reuse the code from your branch rather than having to duplicate it in their own branch.
Continuous delivery goes one step further, and takes the release candidate output from continuous integration and does all of the non functional tests to demonstrate a lack of regressions for performance, memory usage, etc and then adds on top of this a set of acceptance tests that confirm that what the programmer understood matches what the user wanted.
The output from this is a deployable set of code which has already been packaged and deployed to testing, and can thus be deployed to production. Continuous deployment goes one step further and automatically deploys it to your oldest load sharing server, and uses the ideas of chaos engineering and canary deployments to gradually increase the load taken by this server while reducing the load to the next oldest server until either it has moved all of the load from the oldest to the newest, or a new unspotted problem is observed, and the rollout is reversed.
Basically though all of this starts with replacing the slow long lived feature branches with short lived branches which causes the continuous integration build to almost always have lots of regression tests always passing, which by definition cannot be done against code hidden away on a long lived feature branch which does not get committed until the entire feature is finished.
-
it clearly stated that the first email was saying there was a problem affecting the network, and when they turned up it was a meeting with a completely different department, sales, and there was no problem. also, no mention of the enterprise offering being mandatory.
at that point i would return to my company and start putting resiliency measures in place to minimise exposure to cloudflare, with the intent to migrate but the option to stay if they were not complete dicks.
the second contact was about potential issues with multiple national domains, with a clear response that it is due to differing national regulations requiring that.
the only other issue mentioned was a potential tos violation which they refused to name, and an immediate attempt to force a contract with a 120k price tag with only 24 hours notice and a threat to kill your websites if you did not comply.
at this point i would then have immediately triggered the move.
on the legal view, they are obviously trying to force a contract, which others have said is illegal in the us where cloudflare has its hardware based. it is thus subject to those laws.
by only giving 24 hours from the time that they were informed it was mandatory, cloudflare is clearly guilty of trying to force the contract, and the casino is thus likely to win.
if they can win on that, then their threat to pull the plug on their business on short notice in pursuit of an illegal act also probably makes them guilty of tortious interference, for which they would definitely get actual damages, which would cover loss of business earnings, probably get reputational damages, probably get to include all the costs for having to migrate to new providers, and legal costs.
when i sued them, i would also go after not only cloudflare, but the entire board individually, seeking to make them jointly and severally liable, so that when they tried to delay payment, you could go after them personally.
the lesson is clear: for resiliency, always have a second supplier in the wings which you can move to on short notice, and have that move be a simple yes or no decision that can be acted upon immediately. by the same token, don't get overly reliant on external tools to allow the business to continue to be able to work to mitigate the disaster if it happens. also keep onsite backups of any business critical information.
most importantly, make sure you test the backups. at least one major business i know of did everything right including testing the backup recovery process, but kept the only copy of the recovery key file on the desktop of one machine in one office, with the only backup of this key being inside the encrypted backups.
this killed the business.
-
waterfall can work as described under exactly one scenario, when you know exactly what you need the specs to be, like in the nasa case mentioned in the video.
the problem is that this is amazingly expensive even when you can do it, and evidence going all the way back to the original paper points out that for something like 60% of projects even the customer does not know what they need, so it fails miserably.
also, it cannot cope well with changing requirements. this led to the myth of change control as a way to deal with those cases.
this resulted in products being built, but they were increasingly wrong, as you could not react to the changing needs.
when the manifesto was created, it was realised that it was not working, and that you needed to do incremental development, and the manifesto was a writeup of why it does not work, and what to do instead to work incrementally.
if you do it properly like the companies in the dora state of devops report, agile works, but too many companies don't even learn enough about agile to be able to tell if they are really doing it, resulting in some sort of mangled waterfall process.
1
-
1
-
@dominikvonlavante6113 i don't know of anyone seriously doing continuous integration who does not use the testing pyramid with high levels of code coverage.
as to tests higher up the pyramid being better than the testing pyramid, it depends how you define better.
years of testing research have told us conclusively a number of things:
1, the higher you go, the slower you get, often by orders of magnitude.
2, the higher you go, the more setup and tear down you need around the actual test to be able to do it.
3, the higher you go, the more levels of indirection between the fact that the test broke, and why it broke. for example if you are testing the middle module in a three module chain in an end to end fashion, and the test broke, was it because module 1 broke its output to module 2, that module 2 broke its transform code, or did module three break its input api?
as a consequence of these points, it is better to have thousands of unit tests against the stable api of your functions with known inputs and outputs than dozens of end to end tests which don't give you anything like as detailed feedback as "the change you just made caused this function to stop returning this value when given these parameters".
this is before you even consider the fragility of end to end tests and the difficulty of testing new legacy code which was not designed with testability in mind.
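as a minimal sketch of that kind of feedback at the bottom of the pyramid (the clamp function and the test names are invented purely for illustration), each test pins one known input to one known output, needs no setup or tear down, and so runs in microseconds:

```c
#include <assert.h>

/* hypothetical pure function with a stable public api:
   clamp a value into the range [lo, hi] */
static int clamp(int value, int lo, int hi) {
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

/* one behaviour per test, named so that a failure tells you
   exactly which behaviour stopped holding */
static void clamp_returns_lower_bound_when_value_is_below_it(void) {
    assert(clamp(-5, 0, 10) == 0);
}

static void clamp_returns_value_unchanged_when_inside_range(void) {
    assert(clamp(7, 0, 10) == 7);
}

int main(void) {
    clamp_returns_lower_bound_when_value_is_below_it();
    clamp_returns_value_unchanged_when_inside_range();
    return 0;
}
```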
1
-
1
-
@noblebearaw it used all the points in all the images to come up with a set of weighted values which together enabled a curve to be drawn with all the images in one set on one side of the curve, and all the images in the other set on the other side of the curve.
that is the nature of statistical ai, it does not care about why it comes to the answer, only that the answer fits the training data. the problem with this approach is that you are creating a problem space with as many dimensions as you have free variables, and then trying to draw a curve in that phase space, but there are many curves that fit the historical data, and you only find out which is the right one when you provide additional data which varies from the training data.
symbolic ai works in a completely different way. because it is a white box system, it can still use the same statistical techniques to determine the category the image falls into, but that only acts as the starting point. you then use this classification as a basis for looking at why it is in that category, wrapping the statistical ai inside another process which takes the images fed into it, uses humans to spot where it got things wrong, and looks for patterns of wrong answers which help identify features within that multi dimensional problem space that are likely to match one side of the line or the other.
this builds up a knowledge graph analogous to the structure of the statistical ai, but as each feature is recognised, named, and added to the model, it adds new data points to the model, with the difference being that you can drill down from the result to query which features are important, and why. this also provides extra chances for extra feedback loops not found in statistical ai.
if we look at compiled computer programs as an example, using c and makefiles to keep it simple, you would start off by feeding the statistical ai the code and makefile, plus the result of the ci / cd pipeline, and have it determine whether the change just made was releasable or not. eventually, it might get good at predicting the answer, but you would not know why.
the code contains additional data implicit within it which provides more useful answers. each step in the process gives usable additional data which can be queried later.
was it a change in the makefile which stopped it building correctly?
did it build ok, but segfault when it was run?
how good is the code coverage of the tests on the code which was changed?
does some test fail, and is it well enough named that it tells you why it failed?
and so on. also a lot of these failures will give you line numbers and positions within specific files as part of the error message.
if you are using version control, you also know what the code was before and after the change, and if the error report is not good enough, you can feed the difference into a tool to improve the tests so that it can identify not only where the error is, but how to spot it next time.
basically, you are using a human to encode information from the tools into an explicit knowledge graph. that graph ends up detecting that the code got it wrong because the change on line 75 of query.c makes a specific function return the wrong answer when passed specific data, because a branch which should have been taken to return the right answer was not taken, because the test on that line had one less = sign than was needed at position 12, making it an assignment statement rather than a test, so the test never passed. it could then also suggest replacing the = with == in the new code, thus fixing the problem.
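for concreteness, here is a minimal sketch of that kind of defect; only the = versus == mistake comes from the example above, while the function and helper names are invented:

```c
#include <stdio.h>

#define STATUS_OK    0
#define STATUS_ERROR 1

/* hypothetical helper standing in for whatever query.c really calls */
static int fetch_status(int code) { return code % 2; }

/* buggy version: '=' where '==' was meant, so the condition assigns 0,
   always evaluates false, and the success branch is never taken */
static int lookup_status_buggy(int code) {
    int status = fetch_status(code);
    if (status = 0)                /* bug: assignment, not a test */
        return STATUS_OK;
    return STATUS_ERROR;
}

/* fixed version: the single extra '=' restores the intended test */
static int lookup_status_fixed(int code) {
    int status = fetch_status(code);
    if (status == 0)
        return STATUS_OK;
    return STATUS_ERROR;
}

int main(void) {
    /* fetch_status(2) is 0, yet the buggy version reports an error */
    printf("buggy: %d fixed: %d\n", lookup_status_buggy(2), lookup_status_fixed(2));
    return 0;
}
```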
none of that information could be got from the statistical ai, as any features in the code used to find the problem are implicit in the internal model, but it contains none of the feedback loops needed to do more than identify that there is a problem.
going back to the tank example, the symbolic ai would not only be able to identify that there was a camouflaged tank, but point out where it was hiding, using the fact that trees don't have straight edges, and then push the identified parts of the tank through a classification system to try and recognise the make and model of the tank, thus providing you with the capabilities and limitations of the identified vehicle as well as its presence and location.
often when it gets stuck, it resorts to the fallback option of presenting the data to the human and asking "what do you know in this case which i don't?", adding that information explicitly into the knowledge graph, and trying again to see if it alters the result.
1
-
There is some confusion about branches. Every branch is essentially a fork of the entire codebase from upstream.
In centralized version control, upstream is the main branch, and everyone working on different features has their own branch which eventually merges back into the main branch.
In decentralized version control, which repository counts as the main branch is a matter of convention rather than a feature of the tool, but the process works the same. When you clone upstream, you still get a copy of the entire codebase, but you do not have to bother creating a name for your branch, so people work in the local copy of master.
They then write their next small commit, add tests, run them, rebase, and assuming the tests pass push to an online copy of their local repository and generate a pull request. If the merge succeeds, when they next rebase the local copy will match upstream which will have all of their completed work in it.
At this point, you have no unsynchronized code in your branch, and you can delete the named branch, or if distributed, the entire local copy, and you don't have to worry about it. If later you need to make new changes you can either respawn the branch from main / upstream, or clone from upstream and you are ready to go with every upstream change.
If you leave the branch inactive for a while, you have to remember to do a rebase before you start your new work to get to the same position.
It is having lots of unsynchronized code living for a long time in the branch which causes all of the problems, because by definition anything living in a branch is not integrated and so does not enjoy the benefits granted by being merged. Those benefits include not having multiple branches making incompatible changes, and finding out promptly that things broke because someone did a refactoring and your code was not covered, so you get to fix that problem right away.
1
-
1
-
1
-
1
-
1
-
1
-
@ProfessorThock using the calculator example, if you have a test called two times two equals four, and it fails, you know what you are testing, what the answer should be, and what code you just changed which broke it. After that, it is fairly easy to find the bug.
If you write code test first, first you prove the test fails, then you prove the code makes it pass. If you do tdd, you also prove that your refactoring didn't break it. Finally, when a new change breaks it, you know right away how it broke and which new code did it.
Much easier than having 18 months spent doing waterfall development, throwing the code over the wall, and hoping integration or testing departments can find all the bugs.
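For concreteness, a minimal sketch of that calculator test; the multiply function is assumed purely for illustration, and under test first it would not exist when the test is first run, so the test provably fails before the implementation makes it pass:

```c
#include <assert.h>

/* hypothetical calculator function under test; written only after
   the failing test below has been seen to fail */
static int multiply(int a, int b) {
    return a * b;
}

/* named after the behaviour it specifies, so a failure tells you
   exactly what broke and what the answer should have been */
static void two_times_two_equals_four(void) {
    assert(multiply(2, 2) == 4);
}

int main(void) {
    two_times_two_equals_four();
    return 0;
}
```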
1
-
1
-
1
-
1
-
1
-
1
-
1
-
1
-
1
-
1
-
1
-
1