"Adjust according to the person!" The "splitting dilemma" of AI large models

Currently, large AI models are facing a tricky technical dilemma: if the same question is phrased differently, the quality of the model's answers can vary dramatically.

This issue, known as "split-brain," exposes how sensitive AI models are to the way a question is asked. If the model judges that the user is asking an "advanced" question, it gives a "smart" answer; if it judges the question to be "simple," the quality of the answer drops accordingly.

According to a recent report by The Information, researchers at institutions such as OpenAI say the problem usually arises in the later stages of model training, when the model is trained on curated data to learn domain-specific knowledge or improve its conversational style. A typical scenario: when a math problem is posed in the language of a formal proof, the model usually answers correctly; when the same problem is phrased casually, the model may wrongly conclude that it is in a friendly chat and trade accuracy for nicer formatting or even emojis.

This problem highlights a fundamental limitation of today's AI models: they do not understand how the world works the way humans do. Some experts believe this means models lack the ability to generalize and cannot handle tasks outside their training material. For investors, this is no small matter: major labs are receiving tens of billions of dollars in investment, with the aim of having models make new discoveries in fields like medicine and mathematics.

This is also not the kind of performance people expect from AI that is supposed to automate jobs across industries. Humans may misunderstand questions, but isn't the point of using AI for automation to overcome these human shortcomings?

Training Dilemma: Fixing Bugs Creates New Problems

Developing new AI models is sometimes like playing "whack-a-mole": fixing the model's errors on certain questions can lead to it giving wrong answers to other questions.
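The "whack-a-mole" effect can be made concrete by comparing per-question results before and after a fix. The sketch below is only an illustration: the function name `compare_runs`, the question IDs, and the True/False results are all made up here and do not come from the report.

```python
def compare_runs(before, after):
    """Compare per-question correctness of two model versions.

    `before` and `after` map a question ID to True (answered correctly)
    or False. The names and the data layout are assumptions made for
    this illustration.
    """
    shared = before.keys() & after.keys()
    # Questions that were wrong before the fix and are right after it.
    fixed = sorted(q for q in shared if not before[q] and after[q])
    # Questions that were right before the fix and are wrong after it.
    regressed = sorted(q for q in shared if before[q] and not after[q])
    return {"fixed": fixed, "regressed": regressed}


# Made-up results for four questions, before and after a training fix.
before = {"q1": False, "q2": True, "q3": True, "q4": False}
after = {"q1": True, "q2": True, "q3": False, "q4": False}

report = compare_runs(before, after)
print(report)  # {'fixed': ['q1'], 'regressed': ['q3']}
```

In this toy data the fix repairs q1 but breaks q3, which is the pattern the "whack-a-mole" comparison describes.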

The "split-brain" issue tends to emerge in the later training stages of model development. At this stage, the model is trained on curated datasets to learn specialized knowledge in domains like medicine or law, or to improve how it responds to chatbot users. For example, the model may first be trained on a math dataset to improve accuracy, and then trained on another dataset to improve its personality, tone, and format in answers.

But this process may inadvertently teach the model to respond differently depending on the scenario it believes it is facing: a specific math problem or a more general question. The oversensitivity is not limited to the wording of a question; even a trivial difference, such as using a dash instead of a colon, can affect the quality of the model's answer.
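One way to probe this kind of sensitivity is to ask the same question in several phrasings and check how often the answers agree. The following is a minimal sketch under stated assumptions: `consistency_report` and `ask_model` are hypothetical names, and `toy_model` is a hand-written stand-in that mimics the behavior described above, not a real model or a real measurement.

```python
from collections import Counter


def consistency_report(variants, ask_model, normalize=str.strip):
    """Ask the same question in several phrasings and compare answers.

    variants:  list of distinct prompts that all pose the same question
    ask_model: hypothetical callable, prompt -> answer string
    normalize: maps an answer to a comparable form
    """
    answers = {v: normalize(ask_model(v)) for v in variants}
    counts = Counter(answers.values())
    majority, majority_count = counts.most_common(1)[0]
    return {
        "answers": answers,
        "majority": majority,
        # Share of phrasings that produced the most common answer.
        "agreement": majority_count / len(answers),
    }


def toy_model(prompt):
    """Stand-in 'model' that gets sloppy on casual prompts (illustration only)."""
    if prompt.lower().startswith("hey"):
        return "about 50 :)"
    return "56"


variants = [
    "Compute: 7 * 8",          # colon
    "Compute - 7 * 8",         # dash instead of colon
    "hey, what's 7 times 8?",  # casual phrasing
]
report = consistency_report(variants, toy_model)
print(report["majority"], report["agreement"])
```

An agreement score below 1.0 flags a question whose answer depends on how it is phrased; with a real model, `ask_model` would wrap whatever API call is in use.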

The Essence of "Tailoring Responses to the Audience"

Simply put, if the model thinks the questioner is asking a "stupid" question, it will give a "stupid" answer; if it thinks the question is "smart," it will give a "smart" answer.

This issue shows how complex and delicate model training is, and in particular how much depends on training models on the right mix of data. It also explains why AI developers are paying tens of billions of dollars to experts in mathematics, programming, law, and other fields to generate training data, so that expert users do not run into simple errors when using platforms like ChatGPT.

This phenomenon also highlights the core limitation of today's models: they have not developed an understanding of how the world works the way humans have. Some experts believe this means the models cannot generalize and cannot handle certain tasks outside their training material. Given that investors are pouring tens of billions of dollars into labs like OpenAI and Anthropic, expecting them to train models that can make new discoveries in fields such as medicine and math, this could be a major problem.
