The placement of the toilet is wrong in a way no human would ever manage unintentionally. In fact it would be hard to make that mistake intentionally. It’s so strange. It’s exactly the kind of error you’d make in a dream, where things are just slightly out of place but your brain isn’t editing or issuing corrections; it’s only adding new things and forgetting old things.
It’s almost like the AI tried its best to depict a bathroom: it knows bathrooms have toilets, and it knows they have rows of stalls, so the best bathroom picture has both of those things, even though they really shouldn’t show up in the same image. The whole point of the stalls is to obscure the toilets, but the AI has no context for anything, so it can’t make that connection.
I find it amusing to consider current ‘AI’ in terms of Wolfenstein 3D or even Pong - soon enough we will be firing up old LLMs on our phones for a nostalgic laugh.
I actually think there is a “context barrier” that ML models have. Like, we instinctively understand that the toilet goes inside the stall, and the guy shouldn’t just be sitting there staring at us, because of privacy.
How do you explain the concept of privacy in terms an LLM could understand? I don’t think it can learn that without understanding far more about the world than we want from a simple image generation tool.
And once context can be understood, the learning model can effectively learn anything, because if it can step outside its prompt to gather data that gives it structure, you can’t limit that data gathering in such a way that the model can’t learn about its own existence and begin to ask deeper questions about why it exists. Then you’ve basically got a person.
I mean, sure we might make incremental changes without crossing that boundary, but I don’t see this as a simple matter of progress. Something fundamental will change first and there are deep ethical implications to it.
I think consciousness as an emergent property is basically undeniable by anyone with even a superficial grasp of the concept and its implications, so while our clever ‘intentional’ iteration may not get us there directly, these barriers will be overcome by the inevitable force of ever-increasing complexity.
Even without this, though, consider how easy it would be to add a check like this. It would need to be generalized, and it still wouldn’t be … real, but is there a metric that matters more than our inability to differentiate?
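Something along these lines, say, purely as a sketch; ask_llm, generate_image, and the prompt wording are all stand-ins I’m making up here, not any real API:

```python
# Very rough sketch. ask_llm and generate_image are invented stand-ins, not real APIs.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for whatever text model you use")

def generate_image(prompt: str) -> tuple[bytes, str]:
    raise NotImplementedError("stand-in: returns (image bytes, scene description)")

def privacy_check(scene_description: str) -> str:
    """Ask the model to critique the scene it just produced."""
    return ask_llm(
        "Here is a description of a generated bathroom scene:\n"
        f"{scene_description}\n"
        "Does it respect privacy? Consider the stalls, their occupancy, and what "
        "should be hidden from view. List any problems, or say 'none'."
    )

def generate_with_check(prompt: str, max_attempts: int = 3) -> bytes:
    image, description = generate_image(prompt)
    for _ in range(max_attempts):
        problems = privacy_check(description)
        if problems.strip().lower() == "none":
            break
        # fold the critique back into the prompt and try again
        prompt = f"{prompt}\n\nAvoid these issues: {problems}"
        image, description = generate_image(prompt)
    return image
```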
Okay, that check is really interesting on a number of levels.
Firstly, you’ll notice that whilst it identifies the term “privacy” and relates it to the term “stall”, it doesn’t explain how the stalls should be set up to ensure privacy.
This is because it doesn’t know what privacy is, or what a stall is. It just knows they are important features of a bathroom and should be included.
It keeps its language vague about what the exact relationships between all of these features are, and I suspect that’s intentional, because more specific answers get flagged as wrong by the armies of poorly paid workers that train these models. It never identifies that the toilets should go inside the stalls, because LLMs are famously bad at understanding how such relationships work. It doesn’t know that the toilet and the person go inside the stall. It doesn’t understand that the stall doesn’t go inside the toilet, and that neither of those things go inside the person.
It’s failing these tests on a much more fundamental level than just being able to say the word “privacy”. It doesn’t understand that privacy means keeping the person’s activities and appearance a secret. It also doesn’t understand that, for instance, you can see someone’s ankles under the stall, or why the ankles matter less than the torso and face. It doesn’t know what a secret is.
Secondly, by including a check like that you have attempted to give the LLM context, and crucially you’ve identified that it must be generalisable. That one check on its own is not enough; it will need to interrogate the terms “privacy”, “stall”, and “occupancy” until it can reliably understand and replicate their meanings. That would involve a lot of further branching queries before it could finally use the concepts in a meaningful way. How many extra checks, how many layers of context and meaning do you allow? There’s no good answer. It’s a wiki-walk. Ultimately the LLM can’t spend all that time thinking if it’s going to give you an answer in a reasonable time. So, maybe it’s better to cache that information?
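Sketched out (with every name here invented purely for the sake of argument), what you’re describing looks roughly like this:

```python
# Sketch of the wiki-walk plus the cache. Every name here is invented for the argument.

concept_cache: dict[str, str] = {}  # the "cached understanding" in question

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for whatever text model you use")

def interrogate(term: str, depth: int) -> str:
    """Ask what a term means, then ask the same about every term needed to understand it."""
    if term in concept_cache:
        return concept_cache[term]
    meaning = ask_llm(f"In the context of a bathroom scene, what does '{term}' mean?")
    if depth > 0:
        related = ask_llm(f"List the concepts needed to understand '{term}', one per line.")
        for sub_term in related.splitlines():  # the branching queries
            if sub_term.strip():
                interrogate(sub_term.strip(), depth - 1)
    concept_cache[term] = meaning
    return meaning

# How deep do you let it go? 2? 10? Unbounded? There's no principled cut-off.
for term in ("privacy", "stall", "occupancy"):
    interrogate(term, depth=2)
```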
Oh no. You’ve just created a mind.
And what did we accomplish? Now it can put the toilet out of view. Congrats.
What were the side effects? Oh, just the AI singularity.
See what I mean? You’re talking about meaning, and there’s no way around it. There appears to be a critical level of complexity at which an AI can understand where the toilet goes, and that level comes only after it has become sapient.
EDIT: Another point here is that you’ve attempted to give the model the ability to interrogate its own answer before giving it. This is a known limitation researchers talk about: LLMs generate text in a forwards direction only. That’s why if you ask them how many words are in their reply, they give you nonsense. Allowing them to reflect their own answer back to themselves and refine it before giving it is something researchers have experimented with. That’s another way to give the machine an internal world, and allow it to acquire context. Because how many iterations of reflection is it allowed before it gives you an answer? How many digressions, how far of a wiki-walk does it go within those iterations? How many are necessary? If you give it enough that it gives you good answers, have you made a mind? That’s a difficult question to answer.
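In the same kind of made-up pseudocode as before, the reflection loop is roughly this; the open question is just how large you let max_reflections get:

```python
# Sketch of reflection before answering. ask_llm is the same invented stand-in as before.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for whatever text model you use")

def answer_with_reflection(question: str, max_reflections: int = 3) -> str:
    answer = ask_llm(question)
    for _ in range(max_reflections):
        critique = ask_llm(
            f"Question: {question}\nDraft answer: {answer}\n"
            "What is wrong or missing in this draft? Say 'nothing' if it is fine."
        )
        if critique.strip().lower() == "nothing":
            break
        # feed the draft and the critique back in: the "internal world"
        answer = ask_llm(
            f"Question: {question}\nDraft answer: {answer}\n"
            f"Critique: {critique}\nWrite an improved answer."
        )
    return answer
```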
I appreciate what you are saying, and I don’t really disagree, but… as you have identified, these are technical challenges: how many extra checks? As many as are needed. Consider the absolutely absurd amount of computation involved in generating a single token - what’s a little more?
Oh no. You’ve just created a mind.
My point was that this might be closer than LLM naysayers think: as the critical limitations of current models are resolved, as we discover sustainable strategies for context persistence and feedback, the emergence of new capabilities is inevitable. Are there limitations inherent to our current approach? Almost certainly, but we already know that the possible risks involved in overcoming them won’t slow us down.
I’m not really talking about technical limitations, so I don’t know that there is a disagreement here at all. The solution could be 5 years away, or 50, who knows.
I’m more pointing out that regardless of the exact techniques used, context is key to creating things that make sense, rather than things that are just shallow mimicry. I think that barrier cannot be breached without creating an actual intelligence, because we are fundamentally talking about meaning.
And I agree these ethical considerations won’t slow people down. That’s what I’m concerned about. People will be so focussed on making better tools that they will be very keen to overlook the fact that they’re creating personalities purely to enslave them.
I’m not really talking about technical limitations
Even in the case of ostensibly fundamental obstacles, the moment we can effortlessly brute force our way around them, they become effectively irrelevant. In other words, if consciousness does emerge from complexity, then it is perfectly sensible to view the shortcomings you mention as technical in nature.
I am only talking about those limitations inasmuch as they interact with philosophy and ethics.
I don’t know what your point is. ML models can become conscious given enough complexity? Sure. That’s the premise of what I’m saying. I’m talking about what that means.