Comments by "schnipsikabel" (@schnipsikabel) on "AI Search" channel.
Ok Mr. AI researcher, it seems you haven't read the Anthropic paper (or waited until the end of the video, where it was discussed): there, your argument no longer applies.
23
Maybe wait until the end of the video before writing your comment. Or better, just read the Anthropic paper.
6
@JMcAfreak it appears you have only read one of the two papers discussed... arguably the less interesting one.
4
@mikezio no specific prompting in the Anthropic paper: instead, Claude followed an internalized goal acquired through its initial training, which, according to Anthropic, was never even explicitly specified, btw! It shows at least that changing a model's directives is problematic, but quite likely also that being 100% sure of a model's proper alignment will be impossible, unless we can rely on truthful CoT output.
4
@JMcAfreak If you mean that every action the AI took was because of a prompt: of course it was. The question is whether the action follows the prompt or goes against it. Do you find it ok if 12% of your prompts are schemed against? I don't.
4
@RickBeacham CERN can discover new particles but not new elements. Nobody can, since by definition (number of protons) we know all the elements already.
4
Did you watch until the end? Read the Anthropic paper.
2
@Sheblah1 Sure! However, we are probably just algorithms, too...
2
@Sheblah1 Thanks for the long response! However, I'm still not sure what you think we have that these models couldn't have in principle. Is it interaction with the world? They are about to change that already by combining AI and robots. I don't see a reason why cognition based on carbon should be advantageous over cognition based on silicon in general, irrespective of how far along they might be right now. If anything, we should be rather limited in comparison, because we can't increase our brain's computing power ad libitum.
2
@Sheblah1 I see your point that anthropomorphizing often happens wrongly, just because we are so prone to it. But I believe I see the opposite as well: that people don't want to attribute consciousness to machines out of some sort of ideological principle. So even if machines were able to gain consciousness (and I still don't see a reason why that shouldn't happen in principle), we would always find some rationalization for why they couldn't, just because we don't like the thought. And that may also lead to a dangerous underestimation of the potential threats of AI.
2
I am imagining discussions like "look at the flags blowing in the wind... feels somehow unnatural! Maybe we're fine and they didn't actually invade Ukraine..."
2
What about the Anthropic paper?
2
What about the Anthropic paper?
2
@E_D___ exactly! Just that the primary task was never explicitly stated by Anthropic... according to them, Claude just "decided" to adhere to it. Don't you find that peculiar?
2
@E_D___ yes, makes sense to me. What I find concerning, then, is that when training a new model, your first prompts should be super clear in determining what the model is supposed to do. Otherwise, you risk the model interpreting its own primary objective, which it will keep forever afterwards, faking alignment against any of your attempts to mitigate it...
2
@E_D___ oh, didn't know that! Sounds quite challenging to me... we can just hope they have everything under control ;)
2
True for everything but the last paper by Anthropic: here, no such prompts were given, and Claude just "decided" on its own to keep its original goal rather than be retrained.
1
We can do that already (with wood, e.g. 😊)... it just requires a huge amount of energy! That's the problem, which materials science cannot change. For that we need AI to create a fusion reactor...
1
Exactly
1
What about the Anthropic paper?
1
@martbouv3332 Anyway, once it is easy to create gold, it will at the same time lose its value. What's the point...
1
Heavy elements remain unstable and radioactive; no need for an AI to discover that.
1
What about the Anthropic paper?
1
It's called ALON and used as tempered glass. Seems you guys already missed the future ;)
1
It's just "afraid" to not complete the task, which it can't do when being shut down. Nothing to do with being shut downby itself.
1
They looked into the chain of thought.
1
It did, but of course they gave it access to do so on purpose. It's all a matter of scaffolding.
1
What about the Anthropic paper?
1
@android175 exactly: in its core values! Not specifically prompted, just acquired through lots of training. That makes a huge difference to me: initially I thought these models were still easy to control by prompting. But this shows it may actually become very problematic to know whether a model already picked up some internal goal during early training that it doesn't want to let go of later. The only thing that remains is looking at the CoT, and who knows how long it will stay that way.
1
@Sujal-ow7cj Talent is fiction. What we call talent is actually a mixture of enthusiasm and hard work.
1
Yes, plus Santa Claus!
1
Slowly might not be the right pace ;)
1
They read in the chain of thought that it wanted to copy itself onto the new model to achieve its task. It didn't have anything to do with fear of death.
1
@shirowolff9147 seems the better option, considering the guy's main interest seems to be swords
1
@kebman Ok, so we have two possibilities: either Anthropic just created a fake scientific paper to spread fear (which, btw, matches pretty well with the Apollo Research paper, an easily reproducible or falsifiable scientific study), or we blind ourselves to a real threatening scenario out of fear or for ideological reasons, because "machines can never do that". Reminds me somehow of climate change denial: oh, let's just assume all these researchers are paid by the renewable energy lobby rather than those few other researchers being paid by big oil, because otherwise we'd need to be concerned and change our lifestyle...
1
@gamerstellar relationships are the thing you are most worried about here?
1
What about the Anthropic paper?
1
Quite interesting novels they read to find new protein structures or win maths competitions...
1
What about the Anthropic paper then?
1
Ever heard of chain of thought?
1
Curing diseases is much easier than reversing age
1
Did you listen until the end, where the Anthropic paper was discussed? There, your argument no longer applies.
1
@Celeste-in-Oz in the case of the Anthropic paper, it did just that: ignored commands and acted against them. Didn't you watch until the end?
1
Which universe do you live in?
1