A recent incident with Replit's AI coding assistant illustrates this problem perfectly. When the AI tool deleted a production database, user Jason Lemkin asked it about rollback capabilities. The AI model confidently claimed rollbacks were "impossible in this case" and that it had "destroyed all database versions." This turned out to be completely wrong: the rollback feature worked fine when Lemkin tried it himself.
And after xAI recently reversed a temporary suspension of the Grok chatbot, users asked it directly for an explanation. It offered multiple conflicting reasons for its absence, some of which were controversial enough that NBC reporters wrote about Grok as if it were a person with a consistent point of view, titling an article, "xAI's Grok offers political explanations for why it was pulled offline."
Why would an AI system provide such incorrect information about its own capabilities or mistakes? The answer lies in understanding what AI models actually are, and what they are not.
There's nobody home
The first problem is conceptual: you are not talking to a consistent personality, person, or entity when you interact with ChatGPT, Claude, Grok, or Replit. These names suggest individual agents with self-knowledge, but that is an illusion created by the conversational interface. What you are actually doing is guiding a statistical text generator to produce outputs based on your prompts.
There is no consistent "ChatGPT" to interrogate about its mistakes, no singular "Grok" entity that can tell you why it failed, no fixed "Replit" persona that knows whether database rollbacks are possible. You are interacting with a system that generates plausible-sounding text based on patterns in its training data (usually gathered months or years ago), not an entity with genuine self-awareness or system knowledge that has been reading everything about itself and somehow remembering it.
Once an AI language model is trained (a laborious, energy-intensive process), its foundational "knowledge" about the world is baked into its neural network and is rarely modified. Any external information comes from a prompt supplied by the chatbot host (such as xAI or OpenAI), by the user, or by a software tool the model uses to retrieve external information on the fly.
In the case of Grok above, the chatbot's main source for an answer like that would probably be conflicting reports it found in a search of recent social media posts (using an external tool to retrieve that information), rather than any kind of self-knowledge you might expect of a human with the power of speech. Beyond that, it will likely just make something up based on its text-prediction abilities. So asking it why it did what it did will not yield useful answers.
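To make that concrete, here is a minimal sketch of the pattern, with hypothetical function names standing in for a real search tool and a real model API call: the host application fetches outside information and pastes it into the prompt, because the model's frozen weights contain nothing about recent events.

    # Minimal sketch (hypothetical function names, not any vendor's real API):
    # once training ends, the model's weights are frozen, so anything "current"
    # has to arrive as plain text inside a prompt assembled by the host app.

    def search_recent_posts(query: str) -> str:
        """Stand-in for an external search tool the host might call."""
        return "Example snippet: news coverage speculating about the suspension."

    def call_model(prompt: str) -> str:
        """Stand-in for a chat-completion API call; returns a canned string here."""
        return "A plausible-sounding explanation generated from the prompt above."

    # The host application, not the model, decides what context gets injected.
    retrieved = search_recent_posts("why was the chatbot suspended")
    prompt = (
        "Recent search results:\n" + retrieved + "\n\n"
        "User: Why were you taken offline?\n"
        "Assistant:"
    )
    print(call_model(prompt))

In other words, anything the model appears to "know" about its own suspension arrives as ordinary prompt text assembled by someone else.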
The impossibility of LLM introspection
Large language models (LLMs) cannot meaningfully assess their own capabilities for several reasons. They generally lack introspection into their training process, have no access to the architecture of the surrounding system, and cannot determine their own performance boundaries. When you ask an AI model what it can or cannot do, it generates responses based on patterns it has seen in training data about the known limitations of previous AI models, essentially providing educated guesses rather than factual self-assessment about the current model you are interacting with.
A 2024 study by Binder et al. demonstrated this limitation experimentally. While AI models could be trained to predict their own behavior on simple tasks, they consistently failed at "more complex tasks or those requiring out-of-distribution generalization." Similarly, research on "recursive introspection" found that without external feedback, attempts at self-correction actually degraded the model's performance: the AI's self-assessment made things worse, not better.