Comments by "" (@grokitall) on "ThePrimeTime"
channel.
-
Absolutely right. Unit tests do automated regression testing of the public API of your code, asserting input/output combinations to provide an executable specification of the public API. When well named, the value of these tests is as follows:
1, Because they test only one thing, generally they are individually blindingly fast.
2, when named well, they are the equivalent of executable specifications of the API, so when a test breaks you know what broke, and what it did wrong.
3, they are designed to black box test a stable public API, even if you just started writing it. Anything that relies on private APIs is not a unit test.
4, they prove that you are actually writing code that can be tested, and when written before the code, they also prove that the test can fail.
5, they give you examples of code use for your documentation.
6, they tell you about changes that break the API before your users have to.
Points 4 and 6 are actually why people like TDD. Point 2 is why people working in large teams like lots of unit tests.
Everyone I have encountered who does not like tests thinks they are fragile, hard to maintain, and otherwise a pain, and everyone who was willing to talk to me about why usually turned out to be writing hard to test code, with tests at too high a level, and often had code with one of many bad smells about it. Examples included constantly changing public APIs, overuse of global variables, brain functions or non deterministic code.
The main outputs of unit testing are code that you know is testable, tests that you know can fail, and the knowledge that your API is stable. As a side effect, it pushes you away from coding styles which make testing hard, and discourages constantly changing published public APIs. A good suite of unit tests will let you completely throw away the implementation of the API, while letting your users continue to use it without problems. It will also tell you how much of the reimplemented code has been completed.
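To make point 2 concrete, here is a minimal sketch of a test acting as an executable specification. parse_port and its contract are made up for illustration, with a stand-in implementation so the sketch compiles on its own; in a real project the function would come from the library's public header.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the public API under test; normally this would be
 * declared in the library's public header and linked in. */
static int parse_port(const char *text)
{
    char *end;
    long value = strtol(text, &end, 10);
    if (*text == '\0' || *end != '\0' || value < 1 || value > 65535)
        return -1;
    return (int)value;
}

/* The test name is the specification: parse_port accepts 1-65535 and
 * rejects everything else. A failing assert tells you exactly what broke. */
int main(void)
{
    assert(parse_port("8080") == 8080);  /* valid input maps to the same value */
    assert(parse_port("65536") == -1);   /* out-of-range input is rejected */
    assert(parse_port("abc") == -1);     /* non-numeric input is rejected */
    puts("PASS: parse_port behaves as specified");
    return 0;
}
```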
A small point about automated regression tests. Like trunk based development, they are a foundational technology for continuous integration, which in turn is foundational to continuous delivery and DevOps, so not writing regression tests fundamentally limits quality on big, fast moving projects with lots of contributors.
-
no, waterfall was not a thought experiment, it was something that emerged from discovering "we also need to do x" and then just tacking it on the end.
the original paper said "this is what people are doing, and here is why it is a really bad idea".
people then took the diagram from the original paper and used it as a how to document.
the problem with it is that it cannot work for 60% of projects, and does not work very well for a lot of the others.
they tried fixing it by making it nearly impossible to change the specs after the initial stage, and while it made sure projects got built, it ended up with a lot of stuff which was delivered obsolete due to the world changing and the specs staying the same.
the agile manifesto came about in direct response to this, saying "here is what we currently do, but if we do this instead, it works better", making incremental development work for the other 60%.
-
i think he is mainly wrong about just about everything here.
there is a lot of reimplementation in open source, as people want open source support for preexisting filetypes and apis, but this is useful.
technical quality matters, otherwise you end up with bit rot.
people who can do cool things are not the same as people who can explain cool things and otherwise communicate well.
most of the rest of it sounds like someone who sends in a 90k pull request then feels butthurt that nobody is willing to take it without review, and it is just too big to review.
20 years ago we had open source developed with authorised contributors, and it just caused way too many problems, which is why we now do it with distributed version control and very small pull requests. this also removes most of the "how dare he criticise me" attitude, due to being less invested in a 10 line patch than a 10k patch.
also, text does not transmit subtlety very well, and when it does, most people miss it. this leads to unsubtle code criticism, to maintain the engineering value.
-
the issue of what makes code bad is important, and has to do with how much of the complexity of the code is essential vs accidental. obviously some code has more essential complexity than others, but this is exactly when you need to get a handle on that complexity.
we have known since parnas wrote about information hiding back in the early 1970s that it matters, and every new development in coding has reinforced its importance, which is why abstraction is important, as it enables this information hiding.
oop, functional programming, tdd, and refactoring all build on top of this basic idea of hiding the information, but in different ways, and they all bring something valuable to the table.
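as a minimal sketch of what that information hiding looks like in plain c (the names are made up), an opaque type keeps the representation private to one file, so callers cannot depend on it:

```c
/* counter.h -- the public API: callers see only an opaque type */
typedef struct counter counter;          /* layout is hidden from callers */
counter *counter_new(void);
void     counter_add(counter *c, int amount);
int      counter_value(const counter *c);
void     counter_free(counter *c);

/* counter.c -- the only place that knows the representation, so it can
 * change (say from int to long long) without touching any caller */
#include <stdlib.h>
struct counter { long long total; };
counter *counter_new(void)                   { return calloc(1, sizeof(counter)); }
void     counter_add(counter *c, int amount) { c->total += amount; }
int      counter_value(const counter *c)     { return (int)c->total; }
void     counter_free(counter *c)            { free(c); }

/* a caller only ever goes through the functions above */
#include <stdio.h>
int main(void)
{
    counter *c = counter_new();
    counter_add(c, 3);
    printf("%d\n", counter_value(c));    /* prints 3 */
    counter_free(c);
    return 0;
}
```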
when you have been in the industry for a short while, you soon encounter a couple of very familiar anti patterns: spaghetti code, the big ball of mud, and worst of all the piece of snowflake code that everyone is afraid to touch because it will break.
all of these are obviously bad code, and are full of technical debt, and the way to deal with them is abstraction, refactoring, and thus testing.
given your previously stated experience with heavily ui dependent untestable frameworks, therefore requiring heavy mocking, i can understand your dislike of testing, but that is due to the fact that you are dealing with badly designed legacy code, and fragile mocking is often the only way to start getting a handle on legacy code.
i think we can all agree that trying to test legacy code sucks, as it was never designed with testing or lots of other useful things in mind.
lots of the more advanced ideas in programming start indirectly from languages where testing was easier, and looked at what made testing harder than it needed to be, then adopted a solution to that particular part of the problem.
right from the start of structured programming, it became clear that naming mattered, and that code reuse makes things easier, first by using subroutines more, then by giving them names, and letting them accept and return parameters.
you often ended up with a lot of new named predicates, which were used throughout the program. these were easy to test, and moving them into well named functions made the code more readable. later this code could be extracted out into libraries for reuse across multiple programs.
this led directly to the ideas of functional programming and extending the core language to also contain domain specific language code.
later, the realisation that adding an extra field broke apis a lot led to the idea of structs, where there is a primary key field and multiple additional fields. when passed to functions, adding a new field made no difference to the api, which made them really popular.
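a tiny sketch of that idea (hypothetical names): because the function takes a pointer to the struct, adding another field later changes nothing in the api:

```c
#include <stdio.h>

/* adding a new member here later (say an email field) does not change
 * the signature of any function that already takes a struct user * */
struct user {
    int  id;            /* primary key style field */
    char name[32];      /* additional fields can grow over time */
};

static void print_user(const struct user *u)
{
    printf("%d: %s\n", u->id, u->name);
}

int main(void)
{
    struct user u = { 42, "grace" };
    print_user(&u);     /* callers are unaffected by new fields */
    return 0;
}
```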
often these functions were so simple that they could be fully tested, and because they were moved to external libraries, those tests could be kept and reused. this eventually led to opdyke and others finding ways to handle technical debt which should not break good tests. this came to be known as refactoring.
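and a tiny sketch of a behaviour preserving refactoring (again the names are made up): extracting a function changes the structure but not the observable behaviour, so the existing test stays green:

```c
#include <assert.h>

/* before the refactoring the calculation was inlined in price_with_tax();
 * afterwards it is extracted into its own named function. the observable
 * behaviour, and therefore the test, is unchanged. */
static int tax_for(int net_pence)
{
    return net_pence / 5;               /* 20% tax, in integer pence */
}

static int price_with_tax(int net_pence)
{
    return net_pence + tax_for(net_pence);
}

int main(void)
{
    assert(price_with_tax(100) == 120); /* same assertion before and after */
    return 0;
}
```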
when the test breaks under refactoring, it usually means one of 2 things:
1, you were testing how it did it, breaking information hiding.
2, your tool's refactoring implementation is broken, as a refactoring by definition does not change the functional nature of the code, and thus does not break the test.
when oop came along, instead of working from the program structure end of the problem, it worked on the data structure side, specifically by taking the structs, adding in the struct specific functions, and calling the result classes and methods.
again when done right, this should not break the tests.
with the rise of big code bases, and recognition of the importance of handling technical debt, we end up with continuous integration handling the large number of tests and yelling at us when doing something over here broke something over there.
ci is just running all of the tests after you make a change, to demonstrate that you did not break any of the code under test when you made a seemingly unrelated change.
tdd just adds an extra refactoring step to the code and test cycle, to handle technical debt, and make sure your tests deal with what is being tested, rather than how it works.
cd just goes one step further and adds acceptance testing on top of the functional testing from ci to make sure that your code not only still does what it did before, but has not made any of the non functional requirements worse.
testing has changed a lot since the introduction of ci, and code developed test first is much harder to write in a way that contains the prominent anti patterns.
-
I think llvm did not do itself any favours. Originally it used gcc as its backend to provide support for multiple triples, but later defined them in an incompatible way. Seems silly to me.
It has long been possible for compiler writers to define int64 and int32 types for when the size matters, and let the programmer use int when it does not matter, for portability. The compiler then uses the default size for the architecture for plain int, rather than the programmer just using int everywhere.
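A small sketch of that split using the standard fixed width types from <stdint.h> (the struct and function are made up): fixed width where the size is part of the contract, plain int where any native size will do.

```c
#include <stdint.h>
#include <stdio.h>

/* use fixed-width types where the size is part of the contract
 * (file formats, wire protocols, ABIs) ... */
struct on_disk_header {
    uint32_t magic;
    uint64_t payload_length;
};

/* ... and plain int where any native size is fine */
static int count_set_bits(unsigned int value)
{
    int count = 0;
    while (value) { count += value & 1u; value >>= 1; }
    return count;
}

int main(void)
{
    printf("header is %zu bytes, 0xff has %d bits set\n",
           sizeof(struct on_disk_header), count_set_bits(0xffu));
    return 0;
}
```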
At abi implementation time it matters, so there should not be any "it depends" values in any abi implementation.
Of course that case mentioned is not the only time the glibc people broke the abi.
I think it was the version 5 to 6 update, where they left the parts that worked like C the same, but broke the parts that worked like C++, and did not declare it as a major version bump, so every C program still worked, but every C++ library had to be recompiled, as did anything which used the C++ abis.
Another instance of full recompile required, and it has become obvious that the glibc authors don't care about breaking users' programs.
-
@noblebearaw actually, the bigger problem is that black box statistical ai has the issue that even though it might give you the right answer, it might do so for the wrong reason. there was an early example where they took photos of a forest with and without tanks hiding in it, and it worked. they then went back and took more photos of camouflaged tanks, and it didn't work at all. they managed to find out why, and the system had learned that the tank photos were taken on a sunny day, and the no tank photos were taken on a cloudy day, so the model learned how to spot sunny vs cloudy forest pics.
while the tech has improved massively, because statistical ai has no model except likelihood, it has no way to know why the answer was right, or to fix it when it is found to get the answers wrong. white box symbolic ai works differently, creating a model, and using the knowledge graph to figure out why the answer is right.
-
ci came from the realisation, going back to the original paper from the 70s, that the waterfall development model, while common, was fundamentally broken, and agile realised that to fix it you had to move things that appear late in the process to an earlier point, hence the meme about shifting left.
the first big change was to implement continuous backups, now referred to as version control.
another big change was to move tests earlier, and ci takes this to the extreme by making them the first thing you do after a commit.
these two things together mean that your fast unit tests find bugs very quickly, and the version control lets you figure out where you broke it.
this promotes the use of small changes to minimise the differences in patches, and results in your builds being green most of the time.
long lived feature branches subvert this process, especially when you have multiple of them, and they go a long time between merges to the mainline (which you say you rebase from).
specifically, you create a pattern of megamerges, which get bigger the longer the delay. also, when you rebase, you are only merging the completed features into your branch, while leaving all the stuff in the other megamerges in their own branches.
this means when you finally do your megamerge, while you probably don't break mainline, you have the potential to seriously break any and all other branches when they rebase, causing each of them to have to dive into your megamerge to find out what broke them.
as a matter of practice it has been observed time and again that to avoid this you cannot delay merging all branches for much longer than a day, as it gives the other branches time to break something else, resulting in the continual red build problem.
-
@alst4817 my point about black box ai is not that it cannot be useful, but that due to the black box nature, it is hard to have confidence that the answer is right, or that this is anything more than coincidence, and the most you can get from it is a probability value for how plausible the answer is.
this is fine in some domains where that is good enough, but completely rules it out for others where the answer needs to be right, and the reasoning chain needs to be available.
i am also not against the use of statistical methods in the right place. probabilistic expert systems have a long history, as do fuzzy logic expert systems.
my issue is the way these systems are actually implemented. the first problem is that lots of them work in a generative manner.
using the yast config tool of suse linux as an example, it is a very good tool, but only for the parameters it understands. at one point in time, if you made any change using this tool, it regenerated every file it knew about from its internal database, so if you needed to set any unmanaged parameters in any of those files, you then could not use yast at all, or your manual change would disappear.
this has the additional disadvantage that those managed config files are no longer the source of truth, which is instead hidden in yast's internal binary database.
it also means that using version control on any of those files is pointless as the source of truth is hidden, and they are now generated files.
as the code is controlled by the options in the config files, those files should be in text format and version controlled, and any tool that manipulates them should update only the fields it understands, and only in files whose parameters have changed.
similarly, these systems are not modular, instead being implemented as one big monolithic black box, which cannot be easily updated. this project is being discussed in a way that suggests that they will just throw lots of data at it and see what sticks. this approach is inherently limited. when you train something like chatgpt, where you do not organise the data, and let it figure out which of the 84000 free variables it is going to use to hallucinate a plausible answer, you are throwing away most of the value in that data, which never makes it into the system.
you then have examples like copilot, where having trained on crap code, it on average outputs crap code. some of the copilot like coding assistants are actually worse, where they replace the entire code block with a completely different one, rather than just fixing the bug, making a mockery of version control, and a lot of the time this code then does not even pass the tests the previous code passed.
then we have the semantic mismatch between the two languages. in any two languages, either natural or synthetic, there is not an identity of function between the two. some things can't be done at all in one of the languages, and some stuff which is simple in one language can be really hard in another one. only symbolic ai has the rich model needed to understand this.
my scepticism about this is well earned, with lots of ai efforts being over optimistic to begin with, and then plateauing with no idea what to do next. i expect this to be no different, with it being the wrong problem, with a bad design, badly implemented. i wish them luck, but am not optimistic about their chances.
-
ada was commissioned because lots of government projects were being written in niche or domain specific languages, resulting in lots of mission critical software which was in effect write only code, but still had to be maintained for decades. the idea was to produce one language which all the code could be written in, killing the maintainability problem, and it worked.
unfortunately exactly the thing which made it work for the government kept it from more widespread adoption.
first, it had to cover everything from embedded to ai, and literally everything else. this required the same functions to be implemented in multiple ways, as something that works on a huge and powerful ai workstation with few time constraints needs to be different from a similar function in an embedded, resource limited and time critical setting.
this makes the language huge, inconsistent, and unfocused. it also made it a pain to implement the compiler, as you could not release it until absolutely everything had been finalised, and your only customers were government contractors, meaning the only way to recover costs was to sell it at a very high price, and due to the compiler size, it would only run on the most capable machines.
and yes, it had to be designed by committee, due to the kitchen sink design requirement. covering the different use cases needed to fulfil its design goal of being good enough for coding all projects required experts on the requirements for all the different problem types, stating that x needs this function to be implemented like this, but y needs it implemented like that, and the two use cases were incompatible for these reasons.
rather than implementing the language definition so you could code a compiler for ada embedded, and a different one for ada ai, they put it all in one badly written document which really did not distinguish the use case specific elements, making it hard to compile, hard to learn, and just generally a pain to work with. it was not written with the needs of compiler writers in mind either.
also, because of the scope of the multiple language encodings in the language design, it took way too long to define, and due to the above mentioned problems, even longer to implement. other simpler languages had already come along in the interim, and taken over a lot of the markets the language would cover, making it an also ran for those areas outside of mandated government work.
-
There is a reason people use simple coding katas to demonstrate automated regression testing, tdd, and every other new method of working. Every new AI game technology starts with tic tac toe, moves on to checkers, and ends up at chess or go. It does this because the problems start out simple and get progressively harder, so you don't need to understand a new complex problem as well as a new approach to solving it.
Also, the attempt to use large and complex problems as examples has been proven not to work, as you have so much attention going on the problem that you muddy attempts to understand the proposed solution.
Also, there is a problem within a lot of communities that they use a lot of terminology in specific ways that differ from general usage, and different communities use that terminology to mean different things, but to explain new approaches you need to understand how both communities use the terms and address the differences, which a lot of people are really bad at.
-
@chudchadanstud like ci, unit testing is simple in principle. when i first started doing it i did not have access to a framework, and every test was basically a stand alone program with main and a call, which then just returned a pass or fail, stored in either unittest or integrationtest directories, with a meaningful name so when it failed the name told me how it failed, all run from a makefile.
each test was a functional test, and was run against the public api. i even got the return values i did not know by making the test always fail while printing the result, and then verifying that it matched that result the next time. when a new library was being created because the code would be useful in other projects, it then had the public api in the header file, and all of the tests were compiled and linked against the library, and all had to pass for the library to be used.
all of this done with nothing more than a text editor, a compiler, and the make program. this was even before version control took off. version control and a framework can help, but the important part is to test first, then add code to pass the test, then if something breaks, fix it before you do anything else.
remember, you are calling your public api, and checking that it returns the same thing it did last time you passed it the same parameters. you are testing what it does, not how fast, how much memory it uses, or any other non functional property. what you get in return is a set of self testing code which you know works the same, because it still returns the same values. you also get for free an executable specification using the tests and the header file, so if you wished you could throw away the library code and use the tests to drive the rewrite to the same api.
but it all starts with test first, so that you don't write untestable code in the first place.
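as a rough sketch of what one of those stand alone tests could look like (the function and values are made up, with a stand-in implementation so it compiles on its own), make only needs the exit status to know pass or fail:

```c
/* unittest/test_celsius_to_fahrenheit_freezing_point.c
 * hypothetical example: one stand alone program per test, no framework.
 * a rule in the makefile builds and runs it; exit status 0 means pass. */
#include <stdio.h>

/* stand-in for a function that would come from the library's public header */
static double celsius_to_fahrenheit(double c)
{
    return c * 9.0 / 5.0 + 32.0;
}

int main(void)
{
    double got = celsius_to_fahrenheit(0.0);
    if (got != 32.0) {
        printf("FAIL: expected 32.0, got %f\n", got);
        return 1;                     /* non-zero exit fails the make run */
    }
    printf("PASS\n");
    return 0;
}
```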
-
@TimothyWhiteheadzm for airlines it is the knock on effects which kill you. say you cannot have the passengers board the plane.
at this point, you need to take care of the passengers until you can get them on another flight.
this might involve a couple of days staying at a hotel.
then the flight does not leave. at this point neither the plane nor the pilots are going to be in the right place for the next flights they are due to take. as some of these pilots will be relief crew for planes where the crew are nearing their flight time limit, that plane now cannot leave either, so now you have to do the same with their passengers as well.
in the case of delta airlines, it went one step further, actually killing the database of which pilots were where, and you could not start rebuilding it from scratch until all the needed machines were back up and running.
the lawsuit from delta alone is claiming 500 million in damages, targeting crowdstrike for taking down the machines, and microsoft for not fixing the boot loop issue which caused them to stay down.
i know of 5 star hotels which could not check guests in and out, and of public house chains where no food or drinks could be sold for the entire day, as the ordering and payment systems were both down, and they had no on site technical support.
i am sure the damages quoted will turn out to be underestimates.
-
There is a lot of talking past each other and marketing inspired misunderstanding of terminology going on here, so I will try and clarify some of it.
When windows 95 was being written in 1992, every developer had a fork of the code, and developed their part of windows 95 in total isolation. Due to networking not really being a thing on desktop computers at the time, this was the standard way of working.
After 18 months of independent work, they finally started trying to merge this mess together, and as you can imagine the integration hell was something that had to be seen to be believed. Amongst other things, you had multiple cases where one developer needed some code and wrote it for his fork, while another developer did the same, but in an incompatible way. This led to there being multiple incompatible implementations of the same basic code in the operating system.
At the same time, they did not notice either the rise of networking or its importance, so windows 95 had no networking stack until somebody asked Bill Gates about networking in windows 95, at which point he basically took the open source networking stack from bsd Unix and put it into windows.
This release of a network enabled version of windows and the endemic use of networking on every other os enabled the development of centralised version control, and feature branches were just putting these forks into the same repository, without dealing with the long times between integrations, and leaving all the resulting problems unaddressed.
If you only have one or two developers working in their own branches this is an easily mitigated problem, but as the numbers go up, it does not scale.
These are the long lived feature branches which both Dave and primagen dislike. It is worth noting that the hp laser jet division was spending 5 times more time integrating branches than it was spending developing new features.
Gitflow was one attempt to deal with the problem, which largely works by slowing down the integration of code, and making sure that when you develop your large forks, they do not get merged until all the code is compatible with trunk. This leads to races to get your large chunk of code into trunk before someone else does, forcing them to suffer merge hell instead of you. It also promotes rushing to get the code merged when you hear that someone else is close to merging.
Merging from trunk helps a bit, but fundamentally the issue is with the chunks being too big, and there being too many of them, all existing only in their own fork.
With the rise in the recognition of legacy code being a problem, and the need for refactoring to deal with technical debt, it was realised that this did not work, especially as any refactoring work which was more than trivial made it more likely that the merge could not be done at all. One project set up a refactoring branch which had 7 people working on it for months, and when it was time to merge it, the change was so big that it could not be done.
An alternative approach was developed called continuous integration, which instead of slowing down merges was designed to speed them up. It recognised that the cause of merge hell was the size of the divergence, and thus advocated for the reduction in size of the patches, and merging them more often. It was observed that as contributions got faster, manual testing did not work, requiring a move from the ice cream cone model of testing used by a lot of Web developers towards the testing pyramid model.
Even so, it was initially found that the test suite spent most of its time failing, due to the amount of legacy code and the fragility of tests written for legacy code, which led to a more test required and test first mode of working, which moves the shape of the code away from being shaped like legacy code, and into a shape which is designed to be testable.
One rule introduced was that if the build breaks, the number one job of everyone is to get it back to passing all of the automated tests. Code coverage being good enough was also found to be important.
Another thing that was found is that once you started down the route of keeping the tests green, there was a maximum delay you could have which did not adversely affect this, which turned out to be about once per day.
Testing became increasingly important, and slow test times were dealt with the same way slow build times were, by making the testing incremental. So you made a change, built only the bit which it changed, ran only those unit tests which were directly related to it, and once it passed, built and tested the bits that depended on it.
Because the code was all in trunk, refactoring did not usually break the merge any more, which is the single most important benefit of continuous integration: it lets you deal with technical debt much more easily.
Once all of the functional tests (both unit tests and integration tests) pass, which should happen within no more than 10 minutes, and preferably less than 5, you now have a release candidate which can then be handed over for further testing. The idea is that every change should ideally be able to go into this release candidate, but some bigger features are not ready yet, which is where feature flags come in. They replace branches with long lived unmerged code by a flag which hides the feature from the end user.
Because your patch takes less than 15 minutes from creation to integration, this is not a problem. The entire purpose of continuous integration is to prove that the patch you submitted is not fit for release, and if so, it gets rejected and you get to have another try, but as it is very small, this also is not really a problem. The goal is to make integration problems basically a non event, and it works.
The functional tests show that the code does what the programmer intended it to do. At this point it enters the deployment pipeline described in continuous delivery. The job of this is to run every other test needed, including acceptance tests, whose job is to show that what the customer intended and what the programmer intended match. Again the aim is to prove that the release candidate is not fit to be released.
In the same way that continuous delivery takes the output from continuous integration, continuous deployment takes the output from continuous delivery and puts it into a further pipeline designed to take the rolling release product of continuous delivery and put it through things like canary releasing so that it eventually ends up in the hands of the end users.
Again it is designed to try it out, and if problems are found, stop them from being deployed further. This is where crowdstrike got it wrong so spectacularly. In the worst case, you just roll back to the previous version, but at all stages you do the fix on trunk and start the process again, so the next release is only a short time away, and most of your customers will never even see the bug.
This process works even at the level of doing infrastructure as a service, so if you think that your project is somehow unique, and it cannot work for you, you are probably wrong.
Just because it can be released, delivered, and deployed, it does not mean it has to be. That is a business decision, but that comes back to the feature flags. In the meantime you are using feature flags to do dark launching, branch by abstraction to move between different solutions, and enabling the exact same code to go to beta testers and top tier users, just without some of the features being turned on.
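As a minimal sketch of a feature flag (the names and the lookup mechanism are made up, here just an environment variable), the unfinished feature is merged into trunk but stays dark until the flag is switched on:

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* hypothetical flag lookup: here an environment variable, but it could
 * just as well be a config file or a remote flag service */
static bool feature_enabled(const char *name)
{
    const char *value = getenv(name);
    return value != NULL && value[0] == '1';
}

static void legacy_checkout(void) { puts("using the existing checkout flow"); }
static void new_checkout(void)    { puts("using the new checkout flow"); }

int main(void)
{
    /* the new code ships in trunk, but stays invisible to users
     * until FEATURE_NEW_CHECKOUT=1 is set for them */
    if (feature_enabled("FEATURE_NEW_CHECKOUT"))
        new_checkout();
    else
        legacy_checkout();
    return 0;
}
```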
-
@phillipsusi1791 it is entirely about code churn. every branch is basically a fork of upstream (the main branch in centralised version control). the problem with forks is that the code in them diverges, and this causes all sorts of problems with incompatible changes.
one proposed solution to this is to rebase from upstream, which is intended to sort out the problem of your branch not being mergable with upstream, and to an extent this works if the implicit preconditions for doing so are met.
where it falls over is with multiple long lived feature branches which don't get merged until the entire feature is done. during the lifetime of each branch, you have the potential for code in any of the branches to produce incompatible changes with any other branch. the longer the code isn't merged and the bigger the size of the changes, the higher the risk that the next merge will break something in another branch.
The only method found to mitigate this risk is continuous integration, and the only way this works is by having the code guarded by regression tests, and having everybody merge at least once a day. without the tests you are just hoping nobody broke anything, and if the merge is less often than every day, the build from running all the tests has been observed to be mostly broken, thus defeating the purpose of trying to minimise the risks.
the problem is not with the existence of the branch for a long period of time, but with the risk profile of many branches which don't merge for a long time. also, because every branch is a fork of upstream, any large scale changes like refactoring the code by definition is not fully applied to the unmerged code, potentially breaking the completeness and correctness of the refactoring.
this is why people doing continuous integration insist on at worst daily merges with tests which always pass. anything else just does not mitigate the risk that someone in one fork will somehow break things for either another fork, or for upstream refactorings.
it also prevents code sharing between the new code in the unmerged branches, increasing technical debt, and as projects get bigger, move faster, and have more contributors, this problem of unaddressed technical debt grows extremely fast. the only way to address it is with refactoring, which is the additional step added to test driven development, and which is broken by long lived branches full of unmerged code.
this is why all the tech giants have moved to continuous integration, to handle the technical debt in large codebases worked on by lots of people, and it is why feature branching is being phased out in favour of merging and hiding the new feature behind a feature flag until it is done.
-
The best way to answer is to look at how it works with Linus Torvalds' branch for developing the Linux kernel. Because you are using distributed version control, your local copy is essentially a branch, so you don't need to create a feature branch.
You make your changes in main, which is essentially a branch of Linus's branch, add your tests, and run all of the tests. If this fails, fix the bug. If it works, rebase and quickly rerun the tests, then push to your online repository. This then uses hooks to automatically submit a pull request, and Linus will get a whole queue of them, which are then applied in the order in which they came in.
When it is your turn, either it merges ok and becomes part of everyone else's next rebase, or it doesn't, the pull is reverted, Linus moves on to the next request, and you get to go back, do another rebase and test, and push your new fixes back up to your remote copy, which will then automatically generate another pull request. Repeat the process until it merges successfully, and then your local system is a rebased local copy of upstream.
Because you are writing small patches, rather than full features, the chances of a merge conflict are greatly reduced, often to zero if nobody else is working on the code you changed. It is this which allows the kernel to get new changes every 30 seconds all day every day.
Having lots of small fast regression tests is the key to this workflow, combined with committing every time the tests pass, upstreaming with every commit, and having upstream do ci on the master branch.
-
In principle, I agree that all code should be that clean, but that means that there are a bunch of string functions you must not use because msvc uses a different function than gcc for the same functionality.
In practice, people write most code on a specific machine with a specific tool chain, and have a lot of it. Having to go and fix every error right away because the compiler writer has made a breaking change is a bug. So is an optimisation where the test breaks because optimisation is turned on.
In this case, what happened is that a minor version update introduced a breaking change to existing code, and instead of having it as an option you could enable, made it the default.
How most compilers do this is they wrap these changes in a feature flag, which you can then enable.
On the next major version, they enable it by default when you do the equivalent of -Wall, but let you disable it.
On the one after that it becomes the default, but you can still override it for legacy code which has not been fixed yet.
Most programmers live in this latter world, where you only expect the compiler to break stuff on a major version bump, and you expect there to be a way to revert to the old behaviour for legacy code.
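The same staged rollout can be sketched for a library's own breaking change (the macro names are hypothetical): first the new behaviour is opt-in, then in the next major version it is on by default with an opt-out for legacy code.

```c
#include <stdio.h>

/* Stage 1: new behaviour only if the user opts in with MYLIB_ENABLE_STRICT_PARSE.
 * Stage 2 (next major version): on by default, MYLIB_LEGACY_PARSE opts back out. */
#ifndef MYLIB_API_VERSION
#define MYLIB_API_VERSION 1
#endif

#if defined(MYLIB_ENABLE_STRICT_PARSE) || \
    (MYLIB_API_VERSION >= 2 && !defined(MYLIB_LEGACY_PARSE))
#define MYLIB_STRICT_PARSE 1
#else
#define MYLIB_STRICT_PARSE 0
#endif

int main(void)
{
    printf("strict parsing is %s in api version %d\n",
           MYLIB_STRICT_PARSE ? "on" : "off", MYLIB_API_VERSION);
    return 0;
}
```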