Comments by "" (@grokitall) on "*NEW STUDY* Does Co-Development With AI Assistants Improve Code?" video.
-
@julianbrown1331 partly it is down to the training data, but the way these systems work does not filter that data by quality before, during, or after training, so lots of them produce code which is as bad as the average of the training data, most of which was written by newbies still learning either the language or the tools.
also, you misrepresent how copyright law works in practice. when someone claims you are using their code, they only have to show that it is a close match. to avoid summary judgement against you, you have to show that it is a convergent solution from the constraints of the problem space, and that there was no opportunity to copy the code.
given that there have been studies showing that for edge cases with very few examples they have produced identical code snippets right down to the comments, good luck proving there was no opportunity to copy. just saying i got it from microsoft copilot does not relieve you of the responsibility to audit the origins of the code.
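to make that concrete, here is a rough sketch of the kind of audit you end up having to do yourself. it is not any particular tool, and the corpus directory, window size, and threshold are all made up for illustration: you fingerprint the generated snippet and compare it against a local corpus of known open source code.

import hashlib
from pathlib import Path

def shingles(text, k=8):
    # split the code into overlapping k-token windows so a near-verbatim
    # copy still collides even when the surrounding lines differ
    tokens = text.split()
    return {hashlib.sha1(" ".join(tokens[i:i + k]).encode()).hexdigest()
            for i in range(max(len(tokens) - k + 1, 1))}

def similarity(snippet, reference):
    a, b = shingles(snippet), shingles(reference)
    return len(a & b) / max(len(a), 1)

def audit(snippet, corpus_dir="known_oss_corpus", threshold=0.3):
    # flag any corpus file sharing a suspicious fraction of token
    # windows with the generated snippet, most suspicious first
    hits = []
    for path in Path(corpus_dir).rglob("*.py"):
        score = similarity(snippet, path.read_text(errors="ignore"))
        if score >= threshold:
            hits.append((score, str(path)))
    return sorted(hits, reverse=True)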
even worse, microsoft cannot prove it was not copied either, as the nature of statistical ai obfuscates how it got from the source data to the code it gave you.
even worse, the training data does not even flag which license the original code was under, so you could find yourself with gpl code, matching comments and all, and your only choice is to release your proprietary code under the gpl to comply with the license and avoid triple damages.
on top of that, the original code is usually not written to be security or testability aware, so the generated code has security holes, is hard to test, and you can't fix it.
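as a contrived illustration of the security point (the table and field names are invented, and this is the shape of the problem rather than output from any particular model): the first version below is the pattern that dominates tutorial-grade code on the internet, the second is what a security-aware author writes.

import sqlite3

def find_user_unsafe(conn, name):
    # the pattern that dominates scraped example code: user input is
    # interpolated straight into the sql string, so a name like
    # "x' OR '1'='1" returns every row in the table
    return conn.execute(
        f"SELECT * FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(conn, name):
    # the security-aware version: the driver handles quoting, and the
    # query text never changes with the input
    return conn.execute(
        "SELECT * FROM users WHERE name = ?", (name,)).fetchall()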
-
@julianbrown1331 yes, you can treat it as a black box and only keep the tests under version control, but as soon as you do you are relying on the ai to get it right 100% of the time, which even the best symbolic ai systems cannot do. also, the further your requirements drift from the ones represented in the training data, the worse the results get.
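as a toy illustration of why keeping only the tests is not enough (the function and test are deliberately trivial): both regenerated implementations below satisfy the checked-in test, but only one is correct for the inputs the test never mentions.

# checked-in test: the only artefact under version control
def test_add(add):
    assert add(2, 2) == 4

# one regeneration of the implementation
def add_v1(a, b):
    return a + b

# a later regeneration which also satisfies the test,
# but is wrong for every input the test never mentions
def add_v2(a, b):
    return a * b

test_add(add_v1)      # passes
test_add(add_v2)      # also passes, because 2 * 2 == 4
print(add_v2(3, 5))   # 15 instead of 8: the checked-in tests never notice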
also, the copyright issues are non-trivial. when your black box produces infringing code, then by design you are not aware of it, and you have no defence against it.
even worse, if someone infringes your code, by design you do not know, and you cannot prove it, because you are not saving every generated version. and if you are loud about how you work, there is nothing stopping a bad actor from copying your code, saving the current version, waiting for you to generate something different, then suing you for infringement, a claim you cannot defend against because you kept no history.
it is the wrong answer to the wrong problem, with the potential legal liabilities being huge.
-
the problem with the idea of using statistical ai for refactoring is that the entire method is about producing plausible hallucinations that conform to very superficial correlations.
to automate refactoring, you need to understand why the current code is wrong in this context. that is fundamentally outside the scope of how these systems are designed to work, and no amount of minor tweaking can add the missing understanding to the underlying technology.
the only way around this is to use symbolic ai, like expert systems or the cyc project, but that is not where the current money is going.
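to make the "wrong in this context" point concrete, here is a small, deliberately simple example of the context a pattern matcher misses: the "obvious" deduplication of a repeated expression is only valid when that expression has no side effects, which is exactly the kind of thing you need a real model of the code to know.

def first_two(stream):
    # correct: each call to next() advances the iterator,
    # so this returns the first two distinct items
    return (next(stream), next(stream))

def first_two_refactored(stream):
    # a superficially reasonable "remove the duplicated expression"
    # rewrite that silently changes behaviour, because next() has a
    # side effect the textual pattern does not capture
    item = next(stream)
    return (item, item)

print(first_two(iter([1, 2, 3])))             # (1, 2)
print(first_two_refactored(iter([1, 2, 3])))  # (1, 1)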
given the current known problems with llm generated code, lots of projects are banning it completely.
these issues include:
exact copies of the training data right down to the comments, leaving you open to copyright infringement.
producing code with massive security bugs due to the training data not being written to be security aware.
producing hard to test code, due to the training data not being written with testing in mind (see the sketch below).
the suggested code being identical to code under a different license, leaving you open to infringement claims.
when the code is identified as generated it is not copyrightable, but if you don't flag it as generated, the liability for any infringement moves to the programmer.
the only way to fix generating bad code is to completely retrain from scratch, which does not guarantee fixing the problem and risks introducing more errors.
these are just some of the issues of statistical methods, there are many more.
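as an illustration of the hard-to-test point in the list above (the example is invented and deliberately small): the first function below has its dependency on the clock buried inside it, so a test cannot control it, while the second takes the clock as a parameter and is trivial to test.

import datetime

# typical training-data shape: the dependency on the clock is buried
# inside the function, so a test cannot control it
def greeting_untestable():
    hour = datetime.datetime.now().hour
    return "good morning" if hour < 12 else "good afternoon"

# testability-aware shape: the clock is passed in, so a test can pin
# it to any value it likes
def greeting(now):
    return "good morning" if now.hour < 12 else "good afternoon"

def test_greeting():
    assert greeting(datetime.datetime(2024, 1, 1, 9, 0)) == "good morning"
    assert greeting(datetime.datetime(2024, 1, 1, 15, 0)) == "good afternoon"

test_greeting()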
-
@trignals not really. the history of programming has been a migration away from clever, untestable code which almost nobody can understand, towards code which better models the problem space and the design goals needed to do the job well. that code is easier to maintain, which matters because the cost has moved away from the hardware, then away from the initial construction, until most of it now sits in multi year maintenance.
there are lots of people in lots of threads on lots of videos about this subject who seem to buy the hype that you can just throw statistical ai at legacy code and it will suddenly create massive amounts of easy to understand tests, which you then hand to another ai that trivially creates wonderful code, replacing the big ball of mud with optimum code behind optimum tests. the whole system would then be ai generated tests and ai generated code, built by systems which can never reach the point of knowing the problem space and the design options, because they fundamentally do not work that way.
as some of those problems are analogous to the halting problem, i am fundamentally sceptical of the hype, which goes on to suggest that where there is not enough data to create enough superficial correlations, we can just use ai to fake up more data to improve the training of the other ai systems.
as you can guess, a lot of these assumptions just do not make sense. a system which cannot model the software cannot use the model it does not have to make the high level design choices that make code testable. it cannot use the analysis it does not do to tell bad code from good code, or to figure out how to partition the problem. and it cannot use the understanding it does not have to decide how big a chunk to regenerate, or whether the new code is better than the old.
for green field projects it is barely plausible that you might be able to figure out the right tests to give it so that it generates something which does not totally stink, but i have my doubts. for legacy code, everything depends on understanding what is already there and figuring out how to make it better, which is exactly what these systems are not designed to do.