Comments by "" (@grokitall) on "ThePrimeTime"
channel.
-
Absolutely right. Unit tests do automated regression testing of the public API of your code, asserting input/output combinations to provide an executable specification of the public API. When well named, the value of these tests is as follows:
1, Because they test only one thing, generally they are individually blindingly fast.
2, when named well, they are the equivalent of executable specifications of the API, so when a test breaks you know what broke, and what it did wrong.
3, they are designed to black box test a stable public API, even if you just started writing it. Anything that relies on private APIs is not a unit test.
4, they prove that you are actually writing code that can be tested, and when written before the code, they also prove that the test can fail.
5, they give you examples of code use for your documentation.
6, they tell you about changes that break the API before your users have to.
Points 4 and 6 are actually why people like TDD. Point 2 is why people working in large teams like lots of unit tests.
Everyone I have encountered who does not like tests thinks they are fragile, hard to maintain, and otherwise a pain, and everyone who was willing to talk to me about why usually turned out to be writing hard to test code, with tests at too high a level, and often had code with one of many bad smells about it. Examples included constantly changing public APIs, overuse of global variables, brain functions or non deterministic code.
The main outputs of unit testing are code that you know is testable, tests that you know can fail, and the knowledge that your API is stable. As a side effect, it pushes you away from coding styles which make testing hard, and discourages constantly changing published public APIs. A good suite of unit tests will let you completely throw away the implementation of the API, while letting your users continue to use it without problems. It will also tell you how much of the reimplemented code has been completed.
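To make point 2 concrete, here is a minimal sketch of a test acting as an executable specification. parse_port and its contract are made up for illustration, with a stand-in implementation so the sketch compiles on its own; in a real project the function would come from the library's public header.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the public API under test; normally this would be
 * declared in the library's public header and linked in. */
static int parse_port(const char *text)
{
    char *end;
    long value = strtol(text, &end, 10);
    if (*text == '\0' || *end != '\0' || value < 1 || value > 65535)
        return -1;
    return (int)value;
}

/* The test name is the specification: parse_port accepts 1-65535 and
 * rejects everything else. A failing assert tells you exactly what broke. */
int main(void)
{
    assert(parse_port("8080") == 8080);  /* valid input maps to the same value */
    assert(parse_port("65536") == -1);   /* out-of-range input is rejected */
    assert(parse_port("abc") == -1);     /* non-numeric input is rejected */
    puts("PASS: parse_port behaves as specified");
    return 0;
}
```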
A small point about automated regression tests. Like trunk based development, they are a foundational technology for continuous integration, which in turn is foundational to continuous delivery and DevOps, so not writing regression tests fundamentally limits quality on big, fast moving projects with lots of contributors.
-
no, waterfall was not a thought experiment, it was something that emerged from discovering "we also need to do x" and then just tacking it on the end.
the original paper said "this is what people are doing, and here is why it is a really bad idea".
people then took the diagram from the original paper and used it as a how to document.
the problem with it is that it cannot work for 60% of projects, and does not work very well for a lot of the others.
they tried fixing it by making it nearly impossible to change the specs after the initial stage, and while it made sure projects got built, it ended up with a lot of stuff which was delivered obsolete due to the world changing and the specs staying the same.
the agile manifesto came about in direct response to this, saying "here is what we currently do, but if we do this instead, it works better", making incremental development work for the other 60%.
-
i think he is mainly wrong about just about everything here.
there is a lot of reimplementation in open source, as people want open source support for preexisting filetypes and apis, but this is useful.
technical quality matters, otherwise you end up with bit rot.
people who can do cool things are not the same as people who can explain cool things and otherwise communicate well.
most of the rest of it sounds like someone who sends in a 90k pull request then feels butthurt that nobody is willing to take it without review, and it is just too big to review.
20 years ago we had open source developed with authorised contributors, and it just caused way too many problems, which is why we now do it with distributed version control and very small pull requests. this also removes most of the "how dare he criticise me" attitude, due to being less invested in a 10 line patch than a 10k patch.
also, text does not transmit subtlety very well, and when it does, most people miss it. this leads to unsubtle code criticism, to maintain the engineering value.
-
the issue of what makes code bad is important, and has to do with how much of the complexity of the code is essential vs accidental. obviously some code has more essential complexity than others, but this is exactly when you need to get a handle on that complexity.
we have known since parnas wrote about information hiding back in the early 1970s that it matters, and every new development in coding has reinforced its importance, which is why abstraction is important, as it enables this information hiding.
oop, functional programming, tdd, and refactoring all build on top of this basic idea of hiding the information, but in different ways, and they all bring something valuable to the table.
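as a minimal sketch of what that information hiding looks like in plain c (the names are made up), an opaque type keeps the representation private to one file, so callers cannot depend on it:

```c
/* counter.h -- the public API: callers see only an opaque type */
typedef struct counter counter;          /* layout is hidden from callers */
counter *counter_new(void);
void     counter_add(counter *c, int amount);
int      counter_value(const counter *c);
void     counter_free(counter *c);

/* counter.c -- the only place that knows the representation, so it can
 * change (say from int to long long) without touching any caller */
#include <stdlib.h>
struct counter { long long total; };
counter *counter_new(void)                   { return calloc(1, sizeof(counter)); }
void     counter_add(counter *c, int amount) { c->total += amount; }
int      counter_value(const counter *c)     { return (int)c->total; }
void     counter_free(counter *c)            { free(c); }

/* a caller only ever goes through the functions above */
#include <stdio.h>
int main(void)
{
    counter *c = counter_new();
    counter_add(c, 3);
    printf("%d\n", counter_value(c));    /* prints 3 */
    counter_free(c);
    return 0;
}
```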
when you have been in the industry for a short while, you soon encounter a couple of very familiar anti patterns: spaghetti code, the big ball of mud, and worst of all the piece of snowflake code that everyone is afraid to touch because it will break.
all of these are obviously bad code, and are full of technical debt, and the way to deal with them is abstraction, refactoring, and thus testing.
given your previously stated experience with heavily ui dependent untestable frameworks, therefore requiring heavy mocking, i can understand your dislike of testing, but that is due to the fact that you are dealing with badly designed legacy code, and fragile mocking is often the only way to start getting a handle on legacy code.
i think we can all agree that trying to test legacy code sucks, as it was never designed with testing or lots of other useful things in mind.
lots of the more advanced ideas in programming start indirectly from languages where testing was easier, and looked at what made testing harder than it needed to be, then adopted a solution to that particular part of the problem.
right from the start of structured programming, it became clear that naming mattered, and that code reuse makes things easier, first by using subroutines more, then by giving them names, and letting them accept and return parameters.
you often ended up with a lot of new named predicates, which were used throughout the program. these were easy to test, and moving them into well named functions made the code more readable. later this code could be extracted out into libraries for reuse across multiple programs.
this led directly to the ideas of functional programming and extending the core language to also contain domain specific language code.
later, the realisation that adding an extra field broke apis a lot led to the idea of structs, where there is a primary key field and multiple additional fields. when passed to functions, adding a new field made no difference to the api, which made them really popular.
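a tiny sketch of that idea (hypothetical names): because the function takes a pointer to the struct, adding another field later changes nothing in the api:

```c
#include <stdio.h>

/* adding a new member here later (say an email field) does not change
 * the signature of any function that already takes a struct user * */
struct user {
    int  id;            /* primary key style field */
    char name[32];      /* additional fields can grow over time */
};

static void print_user(const struct user *u)
{
    printf("%d: %s\n", u->id, u->name);
}

int main(void)
{
    struct user u = { 42, "grace" };
    print_user(&u);     /* callers are unaffected by new fields */
    return 0;
}
```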
often these functions were so simple that they could be fully tested, and because they were moved to external libraries, those tests could be kept and reused. this eventually led to opdyke and others finding ways to handle technical debt which should not break good tests. this came to be known as refactoring.
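and a tiny sketch of a behaviour preserving refactoring (again the names are made up): extracting a function changes the structure but not the observable behaviour, so the existing test stays green:

```c
#include <assert.h>

/* before the refactoring the calculation was inlined in price_with_tax();
 * afterwards it is extracted into its own named function. the observable
 * behaviour, and therefore the test, is unchanged. */
static int tax_for(int net_pence)
{
    return net_pence / 5;               /* 20% tax, in integer pence */
}

static int price_with_tax(int net_pence)
{
    return net_pence + tax_for(net_pence);
}

int main(void)
{
    assert(price_with_tax(100) == 120); /* same assertion before and after */
    return 0;
}
```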
when the test breaks under refactoring, it usually means one of 2 things:
1, you were testing how it did it, breaking information hiding.
2, your tool's refactoring implementation is broken, as a refactoring by definition does not change the functional nature of the code, and thus does not break the test.
when oop came along, instead of working from the program structure end of the problem, it worked on the data structure side, specifically by taking the structs, adding in the struct specific functions, and calling the result classes and methods.
again when done right, this should not break the tests.
with the rise of big code bases, and recognition of the importance of handling technical debt, we end up with continuous integration handling the large number of tests and yelling at us when doing something over here broke something over there.
ci is just running all of the tests after you make a change, to demonstrate that you did not break any of the code under test when you made a seemingly unrelated change.
tdd just adds an extra refactoring step to the code and test cycle, to handle technical debt, and make sure your tests deal with what is being tested, rather than how it works.
cd just goes one step further and adds acceptance testing on top of the functional testing from ci to make sure that your code not only still does what it did before, but has not made any of the non functional requirements worse.
testing has changed a lot since the introduction of ci, and code developed test first is much harder to write in a way that contains the prominent anti patterns.
-
I think llvm did not do itself any favours. Originally it used gcc as its backend to provide support for multiple triples, but later defined them in an incompatible way. Seems silly to me.
It has long been possible for compiler writers to define int64 and int32 types for when the size matters, and let the programmer use int when it does not matter, for portability. The compiler then uses the default size for the architecture for plain int, rather than the programmer just using int everywhere.
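A small sketch of that split using the standard fixed width types from <stdint.h> (the struct and function are made up): fixed width where the size is part of the contract, plain int where any native size will do.

```c
#include <stdint.h>
#include <stdio.h>

/* use fixed-width types where the size is part of the contract
 * (file formats, wire protocols, ABIs) ... */
struct on_disk_header {
    uint32_t magic;
    uint64_t payload_length;
};

/* ... and plain int where any native size is fine */
static int count_set_bits(unsigned int value)
{
    int count = 0;
    while (value) { count += value & 1u; value >>= 1; }
    return count;
}

int main(void)
{
    printf("header is %zu bytes, 0xff has %d bits set\n",
           sizeof(struct on_disk_header), count_set_bits(0xffu));
    return 0;
}
```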
At abi implementation time it matters, so there should not be any "it depends" values in any abi implementation.
Of course that case mentioned is not the only time the glibc people broke the abi.
I think it was the version 5 to 6 update, where they left the parts that worked like C the same, but broke the parts that worked like C++, and did not declare it as a major version bump, so every C program still worked, but every C++ library had to be recompiled, as did anything which used the C++ abis.
Another instance of full recompile required, and it has become obvious that the glibc authors don't care about breaking users' programs.
-
@noblebearaw actually, the bigger problem is that black box statistical ai has the issue that even though it might give you the right answer, it might do so for the wrong reason. there was an early example where they took photos of a forest with and without tanks hiding in it, and it worked. they then went back and took more photos of camouflaged tanks, and it didn't work at all. they managed to find out why, and the system had learned that the tank photos were taken on a sunny day, and the no tank photos were taken on a cloudy day, so the model learned how to spot sunny vs cloudy forest pics.
while the tech has improved massively, because statistical ai has no model except likelihood, it has no way to know why the answer was right, or to fix it when it is found to get the answers wrong. white box symbolic ai works differently, creating a model, and using the knowledge graph to figure out why the answer is right.
-
ci came from the realisation, going back to the original paper from the 70s, that the waterfall development model, while common, was fundamentally broken, and agile realised that to fix it you had to move things that appear late in the process to an earlier point, hence the meme about shifting left.
the first big change was to implement continuous backups, now referred to as version control.
another big change was to move tests earlier, and ci takes this to the extreme by making them the first thing you do after a commit.
these two things together mean that your fast unit tests find bugs very quickly, and the version control lets you figure out where you broke it.
this promotes the use of small changes to minimise the differences in patches, and results in your builds being green most of the time.
long lived feature branches subvert this process, especially when you have multiple of them, and they go a long time between merges to the mainline (which you say you rebase from).
specifically, you create a pattern of megamerges, which get bigger the longer the delay. also, when you rebase, you are only merging the completed features into your branch, while leaving all the stuff in the other megamerges in their own branches.
this means when you finally do your megamerge, while you probably don't break mainline, you have the potential to seriously break any and all other branches when they rebase, causing each of them to have to dive into your megamerge to find out what broke them.
as a matter of practice it has been observed time and again that to avoid this you cannot delay merging all branches for much longer than a day, as it gives the other branches time to break something else, resulting in the continual red build problem.
-
@alst4817 my point about black box ai is not that it cannot be useful, but that due to the black box nature, it is hard to have confidence that the answer is right, or that this is anything more than coincidence, and the most you can get from it is a probability value for how plausible the answer is.
this is fine in some domains where that is good enough, but completely rules it out for others where the answer needs to be right, and the reasoning chain needs to be available.
i am also not against the use of statistical methods in the right place. probabilistic expert systems have a long history, as do fuzzy logic expert systems.
my issue is the way these systems are actually implemented. the first problem is that lots of them work in a generative manner.
using the yast config tool of suse linux as an example, it is a very good tool, but only for the parameters it understands. at one point in time, if you made any change using this tool, it regenerated every file it knew about from its internal database, so if you needed to set any unmanaged parameters in any of those files, you then could not use yast at all, or your manual change would disappear.
this has the additional disadvantage that those managed config files are no longer the source of truth, which is instead hidden in yast's internal binary database.
it also means that using version control on any of those files is pointless as the source of truth is hidden, and they are now generated files.
as the code is controlled by the options in the config files, those files should be in text format and version controlled, and any tool that manipulates them should update only the fields it understands, and only in files whose parameters have changed.
similarly, these systems are not modular, instead being implemented as one big monolithic black box, which cannot be easily updated. this project is being discussed in a way that suggests that they will just throw lots of data at it and see what sticks. this approach is inherently limited. when you train something like chatgpt, where you do not organise the data, and let it figure out which of the 84000 free variables it is going to use to hallucinate a plausible answer, you are throwing away most of the value in that data, which never makes it into the system.
you then have examples like copilot, where having trained on crap code, it on average outputs crap code. some of the copilot like coding assistants are actually worse, where they replace the entire code block with a completely different one, rather than just fixing the bug, making a mockery of version control, and a lot of the time this code then does not even pass the tests the previous code passed.
then we have the semantic mismatch between the two languages. in any two languages, either natural or synthetic, there is not an identity of function between the two. some things can't be done at all in one of the languages, and some stuff which is simple in one language can be really hard in another one. only symbolic ai has the rich model needed to understand this.
my scepticism about this is well earned, with lots of ai efforts being over optimistic to begin with, and then plateauing with no idea what to do next. i expect this to be no different, with it being the wrong problem, with a bad design, badly implemented. i wish them luck, but am not optimistic about their chances.
-
ada was commissioned because lots of government projects were being written in niche or domain specific languages, resulting in lots of mission critical software which was in effect write only code, but still had to be maintained for decades. the idea was to produce one language which all the code could be written in, killing the maintainability problem, and it worked.
unfortunately exactly the thing which made it work for the government kept it from more widespread adoption.
first, it had to cover everything from embedded to ai, and literally everything else. this required the same functions to be implemented in multiple ways, as something that works on a huge and powerful ai workstation with few time constraints needs to be different from a similar function in an embedded, resource limited and time critical setting.
this makes the language huge, inconsistent, and unfocused. it also made it a pain to implement the compiler, as you could not release it until absolutely everything had been finalised, and your only customers were government contractors, meaning the only way to recover costs was to sell it at a very high price, and due to the compiler size, it would only run on the most capable machines.
and yes, it had to be designed by committee, due to the kitchen sink design requirement. covering the different use cases needed to fulfil its design goal of being good enough for coding all projects required experts on the requirements for all the different problem types, stating that x needs this function to be implemented like this, but y needs it implemented like that, and the two use cases were incompatible for these reasons.
rather than implementing the language definition so you could code a compiler for ada embedded, and a different one for ada ai, they put it all in one badly written document which really did not distinguish the use case specific elements, making it hard to compile, hard to learn, and just generally a pain to work with. it was not written with the needs of compiler writers in mind either.
also, because of the scope of the multiple language encodings in the language design, it took way too long to define, and due to the above mentioned problems, even longer to implement. other simpler languages had already come along in the interim, and taken over a lot of the markets the language would cover, making it an also ran for those areas outside of mandated government work.
-
There is a reason people use simple coding katas to demonstrate automated regression testing, tdd, and every other new method of working. Every new AI game technology starts with tic tac toe, moves on to checkers, and ends up at chess or go. It does this because the problems start out simple and get progressively harder, so you don't need to understand a new complex problem as well as a new approach to solving it.
Also, the attempt to use large and complex problems as examples has been proven not to work, as you have so much attention going on the problem that you muddy attempts to understand the proposed solution.
Also, there is a problem within a lot of communities that they use a lot of terminology in specific ways that differ from general usage, and different communities use that terminology to mean different things, but to explain new approaches you need to understand how both communities use the terms and address the differences, which a lot of people are really bad at.
-
@chudchadanstud like ci, unit testing is simple in principle. when i first started doing it i did not have access to a framework, and every test was basically a stand alone program with main and a call, which then just returned a pass or fail, stored in either unittest or integrationtest directories, with a meaningful name so when it failed the name told me how it failed, all run from a makefile.
each test was a functional test, and was run against the public api. i even got the return values i did not know by making the test always fail while printing the result, and then verifying that it matched that result the next time. when a new library was being created because the code would be useful in other projects, it then had the public api in the header file, and all of the tests were compiled and linked against the library, and all had to pass for the library to be used.
all of this done with nothing more than a text editor, a compiler, and the make program. this was even before version control took off. version control and a framework can help, but the important part is to test first, then add code to pass the test, then if something breaks, fix it before you do anything else.
remember, you are calling your public api, and checking that it returns the same thing it did last time you passed it the same parameters. you are testing what it does, not how fast, how much memory it uses, or any other non functional property. what you get in return is a set of self testing code which you know works the same, because it still returns the same values. you also get for free an executable specification using the tests and the header file, so if you wished you could throw away the library code and use the tests to drive the rewrite to the same api.
but it all starts with test first, so that you don't write untestable code in the first place.
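as a rough sketch of what one of those stand alone tests could look like (the function and values are made up, with a stand-in implementation so it compiles on its own), make only needs the exit status to know pass or fail:

```c
/* unittest/test_celsius_to_fahrenheit_freezing_point.c
 * hypothetical example: one stand alone program per test, no framework.
 * a rule in the makefile builds and runs it; exit status 0 means pass. */
#include <stdio.h>

/* stand-in for a function that would come from the library's public header */
static double celsius_to_fahrenheit(double c)
{
    return c * 9.0 / 5.0 + 32.0;
}

int main(void)
{
    double got = celsius_to_fahrenheit(0.0);
    if (got != 32.0) {
        printf("FAIL: expected 32.0, got %f\n", got);
        return 1;                     /* non-zero exit fails the make run */
    }
    printf("PASS\n");
    return 0;
}
```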
-
@TimothyWhiteheadzm for airlines it is the knock on effects which kill you. say you cannot have the passengers board the plane.
at this point, you need to take care of the passengers until you can get them on another flight.
this might involve a couple of days staying at a hotel.
then the flight does not leave. at this point neither the plane nor the pilots are going to be in the right place for the next flights they are due to take. as some of these pilots will be relief crew for planes where the crew are nearing their flight time limit, that plane now cannot leave either, so now you have to do the same with their passengers as well.
in the case of delta airlines, it went one step further, actually killing the database of which pilots were where, and you could not start rebuilding it from scratch until all the needed machines were back up and running.
the lawsuit from delta alone is claiming 500 million in damages, targeting crowdstrike for taking down the machines, and microsoft for not fixing the boot loop issue which caused them to stay down.
i know of 5 star hotels which could not check guests in and out, and of public house chains where no food or drinks could be sold for the entire day, as the ordering and payment systems were both down, and they had no on site technical support.
i am sure the damages quoted will turn out to be underestimates.
-
There is a lot of talking past each other and marketing inspired misunderstanding of terminology going on here, so I will try and clarify some of it.
When windows 95 was being written in 1992, every developer had a fork of the code, and developed their part of windows 95 in total isolation. Due to networking not really being a thing on desktop computers at the time, this was the standard way of working.
After 18 months of independent work, they finally started trying to merge this mess together, and as you can imagine the integration hell was something that had to be seen to be believed. Amongst other things, you had multiple cases where one developer needed some code and wrote it for his fork, while another developer did the same, but in an incompatible way. This led to there being multiple incompatible implementations of the same basic code in the operating system.
At the same time, they did not notice either the rise of networking or its importance, so windows 95 had no networking stack until somebody asked Bill Gates about networking in windows 95, at which point he basically took the open source networking stack from bsd Unix and put it into windows.
This release of a network enabled version of windows and the endemic use of networking on every other os enabled the development of centralised version control, and feature branches were just putting these forks into the same repository, without dealing with the long times between integrations, and leaving all the resulting problems unaddressed.
If you only have one or two developers working in their own branches this is an easily mitigated problem, but as the numbers go up, it does not scale.
These are the long lived feature branches which both Dave and primagen dislike. It is worth noting that the hp laser jet division was spending 5 times more time integrating branches than it was spending developing new features.
Gitflow was one attempt to deal with the problem, which largely works by slowing down the integration of code, and making sure that when you develop your large forks, they do not get merged until all the code is compatible with trunk. This leads to races to get your large chunk of code into trunk before someone else does, forcing them to suffer merge hell instead of you. It also promotes rushing to get the code merged when you hear that someone else is close to merging.
Merging from trunk helps a bit, but fundamentally the issue is with the chunks being too big, and there being too many of them, all existing only in their own fork.
With the rise in the recognition of legacy code being a problem, and the need for refactoring to deal with technical debt, it was realised that this did not work, especially as any refactoring work which was more than trivial made it more likely that the merge could not be done at all. One project set up a refactoring branch which had 7 people working on it for months, and when it was time to merge it, the change was so big that it could not be done.
An alternative approach was developed called continuous integration, which instead of slowing down merges was designed to speed them up. It recognised that the cause of merge hell was the size of the divergence, and thus advocated for the reduction in size of the patches, and merging them more often. It was observed that as contributions got faster, manual testing did not work, requiring a move from the ice cream cone model of testing used by a lot of Web developers towards the testing pyramid model.
Even so, it was initially found that the test suite spent most of its time failing, due to the amount of legacy code and the fragility of tests written for legacy code, which led to a more test required and test first mode of working, which moves the shape of the code away from being shaped like legacy code, and into a shape which is designed to be testable.
One rule introduced was that if the build breaks, the number one job of everyone is to get it back to passing all of the automated tests. Code coverage being good enough was also found to be important.
Another thing that was found is that once you started down the route of keeping the tests green, there was a maximum delay you could have which did not adversely affect this, which turned out to be about once per day.
Testing became increasingly important, and slow test times were dealt with the same way slow build times were, by making the testing incremental. So you made a change, built only the bit which it changed, ran only those unit tests which were directly related to it, and once it passed, built and tested the bits that depended on it.
Because the code was all in trunk, refactoring did not usually break the merge any more, which is the single most important benefit of continuous integration: it lets you deal with technical debt much more easily.
Once all of the functional tests (both unit tests and integration tests) pass, which should happen within no more than 10 minutes, and preferably less than 5, you now have a release candidate which can then be handed over for further testing. The idea is that every change should ideally be able to go into this release candidate, but some bigger features are not ready yet, which is where feature flags come in. They replace branches with long lived unmerged code by a flag which hides the feature from the end user.
Because your patch takes less than 15 minutes from creation to integration, this is not a problem. The entire purpose of continuous integration is to prove that the patch you submitted is not fit for release, and if so, it gets rejected and you get to have another try, but as it is very small, this also is not really a problem. The goal is to make integration problems basically a non event, and it works.
The functional tests show that the code does what the programmer intended it to do. At this point it enters the deployment pipeline described in continuous delivery. The job of this is to run every other test needed, including acceptance tests, whose job is to show that what the customer intended and what the programmer intended match. Again the aim is to prove that the release candidate is not fit to be released.
In the same way that continuous delivery takes the output from continuous integration, continuous deployment takes the output from continuous delivery and puts it into a further pipeline designed to take the rolling release product of continuous delivery and put it through things like canary releasing so that it eventually ends up in the hands of the end users.
Again it is designed to try it out, and if problems are found, stop them from being deployed further. This is where crowdstrike got it wrong so spectacularly. In the worst case, you just roll back to the previous version, but at all stages you do the fix on trunk and start the process again, so the next release is only a short time away, and most of your customers will never even see the bug.
This process works even at the level of doing infrastructure as a service, so if you think that your project is somehow unique, and it cannot work for you, you are probably wrong.
Just because it can be released, delivered, and deployed, it does not mean it has to be. That is a business decision, but that comes back to the feature flags. In the meantime you are using feature flags to do dark launching, branch by abstraction to move between different solutions, and enabling the exact same code to go to beta testers and top tier users, just without some of the features being turned on.
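As a minimal sketch of a feature flag (the names and the lookup mechanism are made up, here just an environment variable), the unfinished feature is merged into trunk but stays dark until the flag is switched on:

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* hypothetical flag lookup: here an environment variable, but it could
 * just as well be a config file or a remote flag service */
static bool feature_enabled(const char *name)
{
    const char *value = getenv(name);
    return value != NULL && value[0] == '1';
}

static void legacy_checkout(void) { puts("using the existing checkout flow"); }
static void new_checkout(void)    { puts("using the new checkout flow"); }

int main(void)
{
    /* the new code ships in trunk, but stays invisible to users
     * until FEATURE_NEW_CHECKOUT=1 is set for them */
    if (feature_enabled("FEATURE_NEW_CHECKOUT"))
        new_checkout();
    else
        legacy_checkout();
    return 0;
}
```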
-
@phillipsusi1791 it is entirely about code churn. every branch is basically a fork of upstream (the main branch in centralised version control). the problem with forks is that the code in them diverges, and this causes all sorts of problems with incompatible changes.
one proposed solution to this is to rebase from upstream, which is intended to sort out the problem of your branch not being mergable with upstream, and to an extent this works if the implicit preconditions for doing so are met.
where it falls over is with multiple long lived feature branches which don't get merged until the entire feature is done. during the lifetime of each branch, you have the potential for code in any of the branches to produce incompatible changes with any other branch. the longer the code isn't merged and the bigger the size of the changes, the higher the risk that the next merge will break something in another branch.
The only method found to mitigate this risk is continuous integration, and the only way this works is by having the code guarded by regression tests, and having everybody merge at least once a day. without the tests you are just hoping nobody broke anything, and if the merge is less often than every day, the build from running all the tests has been observed to be mostly broken, thus defeating the purpose of trying to minimise the risks.
the problem is not with the existence of the branch for a long period of time, but with the risk profile of many branches which don't merge for a long time. also, because every branch is a fork of upstream, any large scale changes like refactoring the code by definition is not fully applied to the unmerged code, potentially breaking the completeness and correctness of the refactoring.
this is why people doing continuous integration insist on at worst daily merges with tests which always pass. anything else just does not mitigate the risk that someone in one fork will somehow break things for either another fork, or for upstream refactorings.
it also prevents code sharing between the new code in the unmerged branches, increasing technical debt, and as projects get bigger, move faster, and have more contributors, this problem of unaddressed technical debt grows extremely fast. the only way to address it is with refactoring, which is the additional step added to test driven development, and which is broken by long lived branches full of unmerged code.
this is why all the tech giants have moved to continuous integration, to handle the technical debt in large codebases worked on by lots of people, and it is why feature branching is being phased out in favour of merging and hiding the new feature behind a feature flag until it is done.
-
The best way to answer is to look at how it works with Linus Torvalds' branch for developing the Linux kernel. Because you are using distributed version control, your local copy is essentially a branch, so you don't need to create a feature branch.
You make your changes in main, which is essentially a branch of Linus's branch, add your tests, and run all of the tests. If this fails, fix the bug. If it works, rebase and quickly rerun the tests, then push to your online repository. This then uses hooks to automatically submit a pull request, and Linus will get a whole queue of them, which are then applied in the order in which they came in.
When it is your turn, either it merges ok and becomes part of everyone else's next rebase, or it doesn't, the pull is reverted, Linus moves on to the next request, and you get to go back, do another rebase and test, and push your new fixes back up to your remote copy, which will then automatically generate another pull request. Repeat the process until it merges successfully, and then your local system is a rebased local copy of upstream.
Because you are writing small patches, rather than full features, the chances of a merge conflict are greatly reduced, often to zero if nobody else is working on the code you changed. It is this which allows the kernel to get new changes every 30 seconds all day every day.
Having lots of small fast regression tests is the key to this workflow, combined with committing every time the tests pass, upstreaming with every commit, and having upstream do ci on the master branch.
-
In principle, I agree that all code should be that clean, but that means that there are a bunch of string functions you must not use because msvc uses a different function than gcc for the same functionality.
In practice, people write most code on a specific machine with a specific tool chain, and have a lot of it. Having to go and fix every error right away because the compiler writer has made a breaking change is a bug. So is an optimisation where the test breaks because optimisation is turned on.
In this case, what happened is that a minor version update introduced a breaking change to existing code, and instead of having it as an option you could enable, made it the default.
How most compilers do this is they wrap these changes in a feature flag, which you can then enable.
On the next major version, they enable it by default when you do the equivalent of -Wall, but let you disable it.
On the one after that it becomes the default, but you can still override it for legacy code which has not been fixed yet.
Most programmers live in this latter world, where you only expect the compiler to break stuff on a major version bump, and you expect there to be a way to revert to the old behaviour for legacy code.
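The same staged rollout can be sketched for a library's own breaking change (the macro names are hypothetical): first the new behaviour is opt-in, then in the next major version it is on by default with an opt-out for legacy code.

```c
#include <stdio.h>

/* Stage 1: new behaviour only if the user opts in with MYLIB_ENABLE_STRICT_PARSE.
 * Stage 2 (next major version): on by default, MYLIB_LEGACY_PARSE opts back out. */
#ifndef MYLIB_API_VERSION
#define MYLIB_API_VERSION 1
#endif

#if defined(MYLIB_ENABLE_STRICT_PARSE) || \
    (MYLIB_API_VERSION >= 2 && !defined(MYLIB_LEGACY_PARSE))
#define MYLIB_STRICT_PARSE 1
#else
#define MYLIB_STRICT_PARSE 0
#endif

int main(void)
{
    printf("strict parsing is %s in api version %d\n",
           MYLIB_STRICT_PARSE ? "on" : "off", MYLIB_API_VERSION);
    return 0;
}
```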