Recently, the National Committee for Terms in Sciences and Technologies issued a public notice recommending that "Token" in the artificial intelligence field be translated as "词元" (Ci Yuan), and opened the name for public trial use. Subsequently, People's Daily published an article titled "Expert Interpretation: Why Token's Chinese Name is Determined as 'Ci Yuan'," explaining the naming systematically from a professional perspective.

The article notes that the term "token" originates from the Old English "tācen," meaning "symbol" or "mark." In language models, a token is the minimum discrete unit obtained after text segmentation or byte-level encoding; it can take different forms such as words, subwords, affixes, or characters. It is precisely by modeling token sequences that models demonstrate certain intelligent capabilities.
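As a minimal illustration of such segmentation, here is a toy greedy longest-match subword tokenizer. The vocabulary is hypothetical and hand-picked for the example; real systems learn theirs from data (e.g., via byte-pair encoding):

```python
# Toy subword tokenizer: greedy longest-match against a tiny, hypothetical
# vocabulary. It shows how one string can yield word, subword, or
# character-level tokens, depending on what the vocabulary contains.
VOCAB = {"token", "iz", "ation", "s", "un"}

def tokenize(word: str) -> list[str]:
    """Segment a word into discrete units by longest vocabulary match."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest candidate first
            if word[i:j] in VOCAB:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])          # unknown: fall back to a character
            i += 1
    return pieces

print(tokenize("tokenization"))  # ['token', 'iz', 'ation']
print(tokenize("cat"))           # ['c', 'a', 't']
```

The fallback branch is why tokens "can take different forms": a string fully covered by the vocabulary splits into subwords, while out-of-vocabulary material degrades gracefully to characters.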

Within the experts' line of argument, this translation was judged to conform to the principles of unambiguity, scientific accuracy, conciseness, and coordination, and to have an established base of usage in current Chinese contexts. After reading the relevant interpretations, however, I came to a different view of this naming.

From a standardization perspective, this naming offers short-term advantages in comprehensibility and communication. Examined along the dimensions of computational ontology, information structure, multimodal evolution, and back-translation consistency, however, its long-term adaptability remains to be verified. Against this background, an equally worthy alternative, "符元" (Fu Yuan, Symbol-Unit), gradually reveals stronger structural consistency and cross-context stability.

I. Definition Dislocation: Cannot Replace "Essence" with "Origin"

Expert Viewpoint (Chen Xilin, Researcher at Institute of Computing Technology, Chinese Academy of Sciences): Token's initial role in artificial intelligence is "basic semantic unit of language," therefore "Ci Yuan" can better align with its essence.

This judgment was reasonable in its historical context, but in an era of leaping technological paradigms, such thinking is an academic version of "carving a mark on the boat to find the sword": fundamentally outdated.

At the logical level of terminology definition, we must strictly distinguish between "initial application scenario" and "structural essential attributes."

Token indeed originated from Natural Language Processing (NLP), but in AGI's evolutionary path, it has long transcended language model boundaries, evolving into a fundamental unit for unified processing of text, images, speech, and even physical signals. In modern computing systems, Token's true structural ontology is "discrete symbol unit," not a single-modality language unit.

If naming followed the "initial role," the Computer should still be called the "Electronic Calculator" (after its initial function of replacing human calculators), and the Internet should be called the "Cold War Military Network." The fatal flaw of this naming logic is that it sees only a technology's "temporary job" at a specific historical moment while ignoring the "physical ontology" that spans eras.

Historical paths cannot equal essential attributes. Similarly, we cannot permanently lock Token within the narrow context of "words" simply because it was initially used for text processing.

Defining basic concepts by their "initial application scenarios" essentially substitutes historical path dependency for structural ontological truth. Such definitions may aid understanding in a technology's early stages, but during the paradigm-expansion phase of a multimodal explosion they rapidly fail and become cognitive shackles. In contrast, "Fu Yuan" aligns directly with the symbol ontology of cross-modal computing: it defines not Token's "past" but Token's "truth."

II. Analogy Boundaries: When Explanation Becomes Definition, Deviation Begins

Expert Viewpoint (Dong Yuxiao, Associate Professor at Computer Science Department, Tsinghua University): Through analogies like "word cloud" and "bag of words," discrete units in multimodal contexts can be understood as "generalized words."

Professor Dong's analogy aids understanding but shouldn't replace definition. This approach offers certain heuristic value at the explanation level, but if further elevated to naming basis, it may trigger conceptual category errors.

Methodologically, the function of analogy is to lower the threshold of understanding, while the responsibility of definition is to draw semantic boundaries. When "word" expands to cover image patches, speech segments, vector representations (embeddings), and even broader perceptual signals, its original linguistic attributes are continuously diluted and its semantic boundaries grow blurred. This "analogy-driven" expansion path can maintain explanatory consistency in the short term, but it easily causes semantic drift over long-term evolution.

As cross-modal expansion proceeds, we must remain vigilant against "analogy" sliding into "definition." In terminology standardization, the boundary between "explanatory metaphor" and "ontological definition" must be kept clear, and the former must not replace the latter.

An intuitive comparison: in popular science, we may analogize a light bulb to an "artificial sun" to make it more intuitive; but in a scientific naming system, we cannot on that basis rename the current unit "Ampere" as "Light-Unit." The former is descriptive expression; the latter involves a strict measurement system and standardized definitions. The two cannot be confused.

Similarly, terms like "word cloud" and "bag of words" are essentially descriptive or statistical metaphors whose function is to help people understand data structures or distribution patterns; Token, as a fundamental unit of measurement in large models, is by contrast deeply embedded in computing-power billing, model training, and academic measurement systems. When its usage reaches hundreds of billions to trillions of calls per day, the name carries not just an explanatory function but the weight of a basic concept with engineering and standards significance. At this level, terminology must all the more align with its ontological attributes rather than rely on analogical extension.

Pushing this analogical logic further, to the level of naming, actually implies a dangerous premise: since people have grown accustomed to understanding Token through "word," we might as well keep the analogy. But this merely perpetuates path dependency, substituting existing cognitive convenience for ontological correction of the concept. In this sense, the naming leans toward "linguistic romanticism" rather than strict alignment with computational ontology.

We do not demand that motor engineering discuss "electronic horses" simply because "horsepower" contains "horse." Analogy can inspire understanding, but it cannot define standards.

In contrast, "Fu" (Symbol) as a more neutral concept naturally possesses cross-modal adaptability, covering text, images, speech, and various information forms without additional explanation. Therefore, the naming path centered on "symbol unit" at the definition level approaches Token's structural essence more closely. In this logic, "Fu Yuan" as the corresponding translation name possesses higher conceptual consistency and long-term adaptability.

III. Cognitive Cost: When Semantic Anchoring Creates Systematic Misunderstanding

Expert Viewpoint (Comprehensive Expert Opinion): The expression "Ci Yuan" is concise, conforms to Chinese habits, and is easy to disseminate.

This judgment holds certain rationality at the communication level, but its implicit premise is: the public can accept "word's" cross-modal analogy. However, analogy is essentially an expert thinking tool, not the public's natural cognitive approach. For ordinary users, "word" possesses extremely strong semantic anchoring effect—once hearing "word," their intuition necessarily points to language systems, not images, sounds, actions, or other modalities. This cognitive path isn't a technical problem, but a stable structure at the cognitive psychology level.

On this basis, when "word" expands to so-called "generalized word," it actually creates deviation in user cognition. Users first form intuitive understanding of "word = language unit," not the abstract concept of "cross-modal symbol unit." Once this misunderstanding is established, all subsequent explanations become corrections to existing cognition rather than natural understanding extension.

For example, when media reports "model trained using 10 trillion Ci Yuan," the public easily understands this as "reading massive text," ignoring the large amount of images, speech, and other modal data included. This misunderstanding isn't isolated but systematic induction produced by the term's own semantic anchoring.

In actual engineering contexts, this naming may also bring friction to interdisciplinary communication. When discrete units in vision models or speech models are called "words," it not only easily triggers semantic misunderstanding but also creates unnecessary language conflicts between different fields. Multimodal systems need "symbol layer" unification, not language category expansion.

In contrast, "Fu" (Symbol), as a more abstract concept, has a slightly higher initial threshold of understanding but a more neutral semantic direction that does not pre-lock cognition at the language layer. In long-term use, it is more conducive to building a stable, unified cognitive framework, reducing overall explanation costs and providing a more stable cognitive foundation for multimodal unification.

Naming costs don't occur at definition time, but at correction time. Once early naming forms semantic anchoring, subsequent cognitive repair costs will rise exponentially.

Experts can expand "word's" boundaries through analogy, but the public won't understand concepts through analogy. Naming doesn't serve experts—it serves the entire era's cognitive system.

IV. Unambiguity Illusion: When One Word Tries to Carry Two Systems

Expert Viewpoint (Terminology Standardization Principles): "Ci Yuan" conforms to unambiguity principles, helping solve chaotic translation problems.

Regarding terminology unambiguity, we need to pay special attention to systematic risks that "one word, two meanings" may trigger. In scientific terminology standardization, "unambiguity" is one of the foundational principles. If a term requires relying on context or additional explanation to distinguish meanings, its value as a standard component is already lost.

However, from the perspective of the existing academic system, this judgment still leaves room for discussion. "Ci Yuan" has long been "claimed" by linguistics and Natural Language Processing (NLP). In classical linguistics, its long-standing English counterpart is Lemma, meaning a word's normalized base form (for example, the Lemma of is/am/are is be). This usage has formed a stable consensus in linguistics and in foundational NLP textbooks and academic papers.

Against this background, translating Token as "Ci Yuan" easily produces semantic conflicts in specific expressions.

For example, when describing "lemmatize a token" in NLP, the Chinese expression comes out as "perform 'lemmatization' ('Ci Yuan' restoration) on a 'Ci Yuan.'" This not only increases understanding costs but also introduces ambiguity into academic writing and information retrieval, making it difficult for readers to tell whether "Ci Yuan" refers to the segmented discrete unit or to a word's normalized base form.

From a conceptual function perspective, the two also have clear distinctions: Lemma emphasizes "restoration" at the language level, corresponding to normalized expression after morphological changes; while Token emphasizes "segmentation" during computational processes, corresponding to the minimum discrete unit when models process information. This "restoration" versus "segmentation" difference precisely corresponds to different dimensions of semantic layer and symbol layer.
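The contrast between "restoration" and "segmentation" can be sketched in a few lines of Python. The lemma table below is a hypothetical toy, not a real morphological analyzer:

```python
# Lemma vs. Token, as two different operations on the same surface form.
# The lemma table is a tiny hand-written illustration.
LEMMAS = {"is": "be", "am": "be", "are": "be", "running": "run"}

def lemmatize(word: str) -> str:
    """Restoration: map an inflected form back to its normalized base (Lemma)."""
    return LEMMAS.get(word, word)

def tokenize_chars(word: str) -> list[str]:
    """Segmentation: split a form into discrete units a model can index."""
    return list(word)

print(lemmatize("are"))       # 'be'            : many surface forms, one lemma
print(tokenize_chars("are"))  # ['a', 'r', 'e'] : one form, many tokens
```

Restoration collapses variants toward a single canonical form; segmentation splits a single form into multiple discrete units. The two operations run in opposite directions, which is exactly why one translated term struggles to cover both.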

Therefore, when a term needs to cover multiple existing concepts simultaneously through "generalization," its unambiguity actually transforms into "explanatory unification" rather than "semantic stability."

When a term requires explanation to maintain unification, its stability as a standard term often begins to shake.

In contrast, "Fu Yuan" has no semantic conflicts in the existing terminology system. On one hand, it retains Token's ontological attributes as discrete symbols; on the other hand, it avoids overlap with Lemma's existing translation name, thereby demonstrating higher stability in semantic clarity and system consistency.

V. Ontological Return: Token is Essentially "Symbol," Not "Word"

Expert Viewpoint (General Explanation): Token is the minimum unit used for processing text in language models.

This expression is valid at the functional level but still remains at the "how to use" level without touching its ontological attributes in computational theory. From the perspectives of information theory and computational theory, the basic objects processed by computing systems are not "words" but "symbols."

This point can be further understood from two levels:

On one hand, from the information theory perspective, the essence of information lies in eliminating uncertainty; its unit of measurement is the bit, and its carrier is the discrete symbol. Symbols are indifferent to semantic content; they relate only to probability distributions and encoding structure.

On the other hand, at the computational implementation level, the lower layers of a large model do not "recognize characters"; their processing objects are discrete index representations (IDs). Whether an ID corresponds to a Chinese character, an image patch, or an audio sample, it participates in computation in a unified symbolic form.
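The two levels can be sketched minimally in Python. The probability distribution, vocabularies, and indices below are all hypothetical illustrations:

```python
import math

# Level 1 (information theory): Shannon entropy in bits depends only on the
# probability distribution over symbols, never on what the symbols "mean".
def entropy(probs: list[float]) -> float:
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Four equiprobable symbols carry 2 bits each, whether they are words,
# image patches, or arbitrary marks.
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0

# Level 2 (implementation): the model's input is integer IDs, regardless of
# which modality produced them. These toy vocabularies are illustrative only.
text_vocab  = {"你": 0, "好": 1}            # a Chinese character -> ID
patch_vocab = {("patch", 3, 7): 2}          # a quantized image patch -> ID
audio_vocab = {("frame", 144): 3}           # a quantized audio frame -> ID

sequence = [
    text_vocab["你"],
    patch_vocab[("patch", 3, 7)],
    audio_vocab[("frame", 144)],
]
print(sequence)  # [0, 2, 3] : pure symbols, with no semantics attached
```

Once the sequence of IDs is formed, the downstream computation cannot tell which entries began life as text; this is the "unified symbolic form" the paragraph above describes.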

Within this framework, Token's essence lies at the "symbol layer," not the "semantic layer": symbols themselves carry no semantics, existing only as the basic carriers of encoding and computation.

Naming Token as "Ci Yuan" introduces implicit direction from the language semantic layer to some extent, pulling this concept originally at the symbol layer back to a language-centered understanding path. This naming approach may provide intuitiveness at the explanatory level but easily blurs boundaries between "symbolic computation" and "semantic understanding" at the theoretical level.

In contrast, "Fu Yuan" remains within the symbol layer conceptually. On one hand, it accurately reflects Token's computational attributes as discrete symbols; on the other hand, it avoids introducing semantic features into ontological definitions, thereby better conforming to the basic frameworks of information theory and computational theory.

From a broader perspective, as AI systems evolve toward generalization and multimodal fusion, basic-concept naming that aligns directly with their mathematical and computational ontology is more conducive to building stable, scalable cognitive systems. In this sense, the naming path centered on "symbol unit" is not merely a linguistic choice but a consistent expression of computational essence, and "Fu Yuan" is precisely its natural correspondence within this framework.

Defining concepts from the symbol layer aligns with computational essence; naming concepts from the semantic layer approaches explanation rather than definition.

VI. Language Rupture: Mapping Failure in Back-Translation Mechanisms

Expert Viewpoint (Comprehensive Interpretation): "Ci Yuan" has gradually formed usage foundations in the Chinese academic community, possessing certain communication advantages.

In cross-language contexts, we must stay alert to the systematic impact of terminological "back-translation rupture." Whether a scientific and technical term has long-term vitality depends not only on its expressive capability in Chinese but, more importantly, on whether it can be mapped stably into the international academic system. Ideal terminology should be "reversible," with semantics that survive the round trip between languages.

The above judgment reflects "Ci Yuan's" acceptability in local contexts, but from a cross-language perspective, there's still room for further discussion. If a term is established only in a single language system without forming stable correspondence in international contexts, it may introduce additional understanding costs in academic exchanges.

Specifically, "Ci Yuan" lacks a clear, unique path back into English. When restored to English, it diverges among multiple approximate concepts: "word unit" lacks a strict academic definition, "morpheme" corresponds to the smallest meaning-bearing unit in linguistics, and "lexeme" points to lexical items. None of these accurately covers Token's meaning in computational contexts; each instead introduces a category shift.

In contrast, "Fu Yuan" corresponds quite naturally to "symbolic unit." This concept has clear theoretical foundations and stable usage in information theory, discrete mathematics, and multimodal representation, maintaining a consistent semantic direction across contexts. It can therefore more easily form a one-to-one mapping between Chinese and English.

From a practical perspective, once terminology enters academic papers, technical documents, and international exchange scenarios, its back-translation capability will directly impact expression efficiency and understanding accuracy. If a term requires additional explanation to complete cross-language conversion, its long-term usage costs will continuously accumulate.

Therefore, in cross-language systems, the main problem "Ci Yuan" faces is an unstable mapping path, while "Fu Yuan" demonstrates greater certainty in semantic correspondence and conceptual consistency. Against the background of increasingly globalized artificial intelligence, choosing terminology with good back-translation characteristics is more conducive to building open, interoperable academic and technical systems.

Terminology's international reversibility is essentially the key benchmark for whether it possesses long-term academic vitality.

VII. Unification Misconception: Formal Consistency Doesn't Equal Structural Consistency

Expert Viewpoint (Comprehensive Expert Opinion): "Ci Yuan" maintains consistent expression style with terms like "embedding" and "attention," concise and abstract, conforming to Chinese technical contexts.

To state the conclusion first: the unification of a terminology system should be built on "conceptual isomorphism," not "linguistic homomorphism."

A common rationale in "Ci Yuan's" favor is that its style is consistent with terms like "embedding" and "attention": concise, abstract, and suited to Chinese technical contexts. This rationale captures a real demand for terminology-system unification, but the problem is that if unification stays only at the language level rather than the structural level, it slides from "order" into "illusion."

"Embedding" and "attention" became stable terms because they correspond to clear computational structures: the former is vector mapping, the latter is weight mechanisms, their naming directly points to computational essence. While "Ci Yuan" belongs to explanatory naming, its rationality depends on the "generalized word" analogy framework. Once separated from explanation, this naming itself lacks self-consistent structural direction.
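The structural contrast can be made concrete with a minimal sketch; the sizes and values below are arbitrary illustrations, not any real model's parameters:

```python
import math

# "Embedding" and "attention" each name a computation directly:
# one is a table lookup into vectors, the other a normalized weighting.
embedding_table = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # vocab=3, dim=2

def embed(token_id: int) -> list[float]:
    """'Embedding': a vector mapping, ID in, vector out."""
    return embedding_table[token_id]

def attention_weights(scores: list[float]) -> list[float]:
    """'Attention': a weight mechanism, raw scores normalized via softmax."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print(embed(1))                       # [0.3, 0.4]
print(attention_weights([1.0, 1.0]))  # equal scores -> equal weights
```

Both names survive removal from any explanatory context because the computation itself is the definition; that is the "self-consistent structural direction" an explanatory name lacks.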

This difference raises a key problem: consistency in form, drift in meaning.

Linguistic homomorphism reduces expression costs; conceptual isomorphism guarantees cognitive stability. If "linguistic homomorphism" is prioritized, complexity does not disappear; it transfers into long-term cognitive burden. Only naming built on "conceptual isomorphism" can remain stable through cross-context and multimodal evolution.

When "embedding," "attention," and "Ci Yuan" appear together, it's easy to form an illusion of "conceptual same layer." But actually, the former two are mechanisms, the latter is an object; the former two possess strict definitions, the latter depends on contextual explanation. This structural misalignment plants hidden fractures in cognitive systems.

More importantly, when a basic concept's naming relies on analogy rather than structural definition, its impact won't stay within a single term but will diffuse to the entire terminology system. When subsequent concepts try to expand around this naming, they will have to continuously maintain consistency through explanation, thereby forming implicit structural misalignment.

In this sense, "Fu Yuan" provides an expression path closer to underlying structure. It directly points to the basic object in computing systems—symbol—requiring no analogy explanation to maintain consistency across different contexts.

Terminology isn't just labels—it's cognition's entrance. Good terminology makes explanation gradually disappear; poor terminology makes annotations continuously increase. When basic concepts deviate from structure, terminology systems can only maintain through explanation, not through self-consistent definition.

Conclusion

Essentially, terminology choice isn't just a language problem but early shaping of a field's cognitive structure. Once naming deviates from its structural ontology in the initial stage, subsequent systems can only operate through continuous explanation, unable to form self-consistent conceptual networks.

As artificial intelligence moves toward generalization and multimodal fusion, a term that aligns with computational ontology and possesses cross-context stability is more likely to become a lasting cognitive cornerstone. In this sense, the naming path centered on "symbol unit" shows stronger adaptability in balancing technical essence with cognitive clarity.