Introduction: Bridging the Language Gap in Programming

Programming has long been dominated by English-based syntax. From the earliest days of Fortran and COBOL to modern languages like Python and JavaScript, keywords, operators, and structure have been rooted in English vocabulary. This creates a significant barrier for non-English speakers, particularly in business contexts where domain experts understand the problem deeply but struggle with the syntactic requirements of traditional programming languages.

AScript, an open-source C# library for parsing and executing scripts dynamically, offers one way around this barrier. Because it supports custom syntax parsing, developers can build scripting engines whose keywords and type names come from another language, including Chinese. This article walks step by step through a Chinese scripting engine built on AScript, using a conditional statement with native Chinese keywords as the worked example.

Understanding AScript: The Foundation

Before diving into implementation, it helps to understand what AScript provides and why it is well suited to this task.

What is AScript?

AScript is a dynamic scripting library built for the .NET ecosystem. Its core capabilities include:

  • Dynamic Script Parsing: Interprets script code at runtime without compilation
  • Custom Syntax Support: Allows developers to define their own grammatical structures
  • Extensible Token Handling: Provides interfaces for creating custom language constructs
  • C# Integration: Seamlessly interoperates with existing C# code and libraries

Why AScript for Chinese Scripting?

The key advantage of AScript for natural language scripting lies in its token-based architecture. Rather than requiring a complete language grammar from the start, AScript allows incremental addition of language constructs through token handlers. This modular approach makes it practical to build a Chinese scripting engine piece by piece.
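To make the idea of incremental registration concrete, here is a minimal, self-contained sketch of keyword-to-handler dispatch. It does not use AScript at all: KeywordRegistry, Add, and Handle are hypothetical names invented for this illustration, and the handlers are plain string functions rather than real parser callbacks.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a keyword-driven parser; this is NOT AScript's API.
public sealed class KeywordRegistry
{
    private readonly Dictionary<string, Func<string, string>> _handlers =
        new Dictionary<string, Func<string, string>>();

    // Register one keyword at a time; a later registration replaces an earlier one.
    public void Add(string keyword, Func<string, string> handler)
    {
        _handlers[keyword] = handler;
    }

    // Dispatch on the first space-separated word of the statement.
    public string Handle(string statement)
    {
        string trimmed = statement.Trim();
        int space = trimmed.IndexOf(' ');
        string keyword = space < 0 ? trimmed : trimmed.Substring(0, space);
        string rest = space < 0 ? "" : trimmed.Substring(space + 1);

        return _handlers.TryGetValue(keyword, out var handler)
            ? handler(rest)
            : "unhandled: " + trimmed;
    }
}
```

A statement whose keyword has no handler yet is simply reported as unhandled, so keywords can be added one by one without touching the dispatch code.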

Implementation Walkthrough: Creating Chinese Conditional Statements

Let's implement a Chinese conditional statement with the structure: "如果 ... 则 ... 否则 ..." (If ... Then ... Else ...). This example demonstrates the core principles that can be extended to build a complete Chinese scripting language.

Step 1: Implementing the ITokenHandler Interface

The first step is creating a custom token handler that recognizes and processes the Chinese "如果" (if) keyword. This handler defines how the parser should interpret conditional statements.

using System.Collections.Generic; // HashSet<string>

public class 如果语法处理器 : ITokenHandler
{
    private static readonly HashSet<string> _StatementEndTokens = 
        new HashSet<string> { "则", "否则" };

    public void Build(DefaultSyntaxAnalyzer analyzer, TokenAnalyzingArgs e)
    {
        e.IsHandled = true;
        e.End = true;

        // If there's already a statement, push back the token and return
        if (e.TreeBuilder.Root != null)
        {
            e.TokenReader.Push(e.CurrentToken);
            return;
        }

        // Build the condition expression (everything until "则" or "否则")
        var condition = analyzer.BuildOneStatement(
            e.BuildContext, 
            e.ScriptContext, 
            e.Options, 
            e.TokenReader, 
            e.Control, 
            e.Ignore, 
            endTokens: _StatementEndTokens
        );
        
        // Validate and consume the "则" (then) keyword
        analyzer.ValidateNextToken(e.TokenReader, "则");
        
        // Build the main body of the if statement
        var createAllOptions = new BuildOptions(e.Options) 
        { 
            CreateFullTreeNode = true 
        };
        var body = analyzer.BuildOneStatement(
            e.BuildContext, 
            e.ScriptContext, 
            createAllOptions, 
            e.TokenReader, 
            e.Control, 
            e.Ignore, 
            endTokens: _StatementEndTokens
        );
        
        // Create the if node with condition and body
        var node = new IfNode 
        { 
            Condition = condition, 
            Body = body 
        };
        
        // Read the next token, skipping an optional ';' that terminates the body
        var nextToken = e.TokenReader.Read();
        if (nextToken.HasValue && nextToken.Value.Value == ";")
        {
            nextToken = e.TokenReader.Read();
        }
        
        // Handle optional "否则" (else) clause
        if (nextToken.HasValue)
        {
            if (nextToken.Value.Value == "否则")
            {
                node.Else = analyzer.BuildOneStatement(
                    e.BuildContext, 
                    e.ScriptContext, 
                    createAllOptions, 
                    e.TokenReader, 
                    e.Control, 
                    e.Ignore
                );
            }
            else
            {
                e.TokenReader.Push(nextToken.Value);
            }
        }
        
        // Add the completed if node to the syntax tree
        e.TreeBuilder.Add(
            e.BuildContext, 
            e.ScriptContext, 
            e.Options, 
            e.Control, 
            node
        );
    }
}

Understanding the Token Handler Logic

Let's break down what this handler accomplishes:

  1. Token Recognition: The handler identifies "如果" as the start of a conditional statement
  2. Condition Parsing: It collects tokens until reaching "则" (then) or "否则" (else), forming the condition expression
  3. Validation: It ensures the "则" keyword is present, maintaining syntactic correctness
  4. Body Construction: It parses the statement block that executes when the condition is true
  5. Optional Else Handling: It checks for and processes an optional "否则" clause
  6. Tree Building: It constructs the syntax tree node representing the complete conditional statement

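This is the same approach a hand-written recursive-descent parser takes for an if-then-else construct: consume the keyword, parse the condition, require the separator, parse the body, then look ahead for an optional else. The only difference is that the keywords are Chinese rather than English.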
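The six steps above can be sketched without any AScript types. The following is a simplified illustration, not the library's implementation: MiniIfParser is a hypothetical class, it works on tokens that have already been split, and it returns a bracketed string instead of real syntax tree nodes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of the parsing steps; not part of AScript.
public static class MiniIfParser
{
    // Parses "如果 <cond> 则 <body> [否则 <else>]" from pre-split tokens and
    // returns a bracketed description of the resulting tree.
    public static string Parse(IReadOnlyList<string> tokens)
    {
        int pos = 0;
        string result = ParseStatement(tokens, ref pos);
        if (pos != tokens.Count)
            throw new FormatException("Unexpected token: " + tokens[pos]);
        return result;
    }

    private static string ParseStatement(IReadOnlyList<string> tokens, ref int pos)
    {
        if (pos < tokens.Count && tokens[pos] == "如果")
        {
            pos++; // step 1: consume "如果"
            string condition = ReadUntilEndToken(tokens, ref pos); // step 2

            // Step 3: "则" is mandatory after the condition.
            if (pos >= tokens.Count || tokens[pos] != "则")
                throw new FormatException("Expected '则' after the condition");
            pos++; // consume "则"

            string body = ParseStatement(tokens, ref pos); // step 4

            // Step 5: "否则" is optional and may itself start another "如果".
            if (pos < tokens.Count && tokens[pos] == "否则")
            {
                pos++; // consume "否则"
                string elseBranch = ParseStatement(tokens, ref pos);
                return "If(" + condition + ", " + body + ", " + elseBranch + ")";
            }
            return "If(" + condition + ", " + body + ")"; // step 6
        }
        return ReadUntilEndToken(tokens, ref pos);
    }

    // Joins tokens up to, but not including, the next "则" or "否则".
    private static string ReadUntilEndToken(IReadOnlyList<string> tokens, ref int pos)
    {
        var parts = new List<string>();
        while (pos < tokens.Count && tokens[pos] != "则" && tokens[pos] != "否则")
            parts.Add(tokens[pos++]);
        if (parts.Count == 0)
            throw new FormatException("Expected an expression");
        return string.Join(" ", parts);
    }
}
```

Because the else branch is parsed by the same routine as any other statement, a chain such as 否则 如果 nests naturally, and a missing 则 is rejected with a FormatException.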

Step 2: Defining the Chinese Language Environment

Next, we create a class that inherits from ScriptLang to define our Chinese language environment. This class registers all the Chinese keywords and their associated handlers.

public class 中文语言 : ScriptLang
{
    // Singleton instance for the Chinese language
    public static readonly 中文语言 实例 = new 中文语言();

    public 中文语言()
    {
        // Register Chinese type names
        AddType<int>("整型");
        AddType<string>("文本");

        // Register token handlers for Chinese keywords
        AddTokenHandler("如果", new 如果语法处理器());
        AddTokenHandler("返回", AScript.TokenHandlers.ReturnTokenHandler.Instance);
    }
}

Key Components Explained

Type Registration:

  • 整型 (Integer Type): Maps to C#'s int type
  • 文本 (Text Type): Maps to C#'s string type

This allows scripts to declare variables using Chinese type names, making the language feel natural to Chinese speakers.

Token Handler Registration:

  • 如果 (If): Associates the Chinese "if" keyword with our custom handler
  • 返回 (Return): Uses AScript's built-in return handler for function returns

This registration process is how AScript builds its understanding of the language. Each keyword maps to specific parsing and execution logic.

Step 3: Registering the Chinese Language

Once the language class is defined, it must be registered with the AScript engine:

Script.Langs["中文"] = 中文语言.实例;

This single line makes the Chinese language available to the scripting engine. Scripts can now specify "中文" as their language, and AScript will use the registered handlers and type definitions.

Step 4: Writing and Executing Chinese Scripts

With the language registered, we can write and execute scripts using Chinese syntax. The example below is written as a unit-test assertion and assumes the Script instance is using the registered "中文" language:

string s = @"
整型 n=10;
文本 s='';
如果 n<5 则 {
    s='小于 5';
} 否则 如果 n<20 则 {
    s='大于等于 5 且小于 20';
} 否则 {
    s='大于等于 20';
}
返回 $'{n},{s}';
";

var script = new Script();
Assert.AreEqual("10,大于等于 5 且小于 20", script.Eval(s));

Script Analysis

Let's walk through what each part of the script does:

  1. Variable Declaration: 整型 n=10; declares an integer variable named n with value 10
  2. String Initialization: 文本 s=''; creates an empty string variable
  3. Conditional Logic: The if-then-else structure evaluates conditions in Chinese
  4. Nested Conditions: 否则 如果 creates an else-if chain for multiple conditions
  5. String Interpolation: $'{n},{s}' combines values into a formatted output string
  6. Return Statement: 返回 sends the result back to the caller

The expected output "10,大于等于 5 且小于 20" confirms that:

  • The variable n equals 10
  • The condition n<5 is false
  • The condition n<20 is true
  • The appropriate branch executes and returns the correct message

Technical Deep Dive: How It Works

Tokenization Process

When AScript processes a Chinese script, it goes through several stages:

  1. Lexical Analysis: The input string is broken into tokens (keywords, identifiers, operators, literals)
  2. Token Recognition: Each token is matched against registered handlers
  3. Syntax Building: Handlers construct syntax tree nodes according to language rules
  4. Execution: The syntax tree is traversed and executed
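The first of these stages, lexical analysis, can be illustrated with a toy tokenizer. This is a sketch under simplifying assumptions and says nothing about how AScript's own lexer is written: ToyLexer is a hypothetical class that recognizes only words, single-quoted strings, and one-character symbols.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy lexer for illustration; not AScript's tokenizer.
public static class ToyLexer
{
    // Splits a script into words (letters, digits, underscore, including CJK
    // characters), single-quoted strings, and single-character symbols.
    public static List<string> Tokenize(string source)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < source.Length)
        {
            char c = source[i];
            if (char.IsWhiteSpace(c)) { i++; continue; }

            int start = i;
            if (char.IsLetterOrDigit(c) || c == '_')
            {
                // A word runs until the first non-letter, non-digit character.
                while (i < source.Length &&
                       (char.IsLetterOrDigit(source[i]) || source[i] == '_'))
                    i++;
            }
            else if (c == '\'')
            {
                i++; // opening quote
                while (i < source.Length && source[i] != '\'') i++;
                if (i == source.Length)
                    throw new FormatException("Unterminated string literal");
                i++; // closing quote
            }
            else
            {
                i++; // any other character is a one-character token
            }
            tokens.Add(source.Substring(start, i - start));
        }
        return tokens;
    }
}
```

Chinese characters count as letters for char.IsLetterOrDigit, so 如果 and 则 come out as ordinary word tokens. The flip side is that this toy lexer needs whitespace between a Chinese keyword and an adjacent identifier: 如果n would be read as a single word.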

Custom Token Handling

The ITokenHandler interface is the extension point that makes Chinese scripting possible. By implementing this interface, developers can:

  • Define new keywords in any language
  • Create custom control structures
  • Implement domain-specific operations
  • Extend the language with new features

This extensibility is what transforms AScript from a simple scripting engine into a platform for creating domain-specific languages.

Syntax Tree Construction

The syntax tree represents the hierarchical structure of the script. For our conditional statement:

IfNode
├── Condition: n < 5
├── Body: s = '小于 5'
└── Else: IfNode
    ├── Condition: n < 20
    ├── Body: s = '大于等于 5 且小于 20'
    └── Else: s = '大于等于 20'

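Note that the else-if chain needs no node type of its own: the second 如果 is simply an IfNode stored in the Else slot of the first. An evaluator therefore only has to test Condition and then descend into either Body or Else, and the same tree can be inspected or optimized before it runs.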
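As a rough illustration of how a tree of this shape is evaluated, the sketch below rebuilds it with hypothetical node classes. Node, ResultNode, SketchIfNode, and TreeDemo are invented for this example and are not AScript's types; each assignment in the script is reduced to the string it would assign to s.

```csharp
using System;

// Hypothetical node types for illustration; AScript's own IfNode is different.
public abstract class Node
{
    public abstract string Evaluate(int n);
}

// Leaf node: stands in for "s = '<text>'" and yields the assigned text.
public sealed class ResultNode : Node
{
    private readonly string _text;
    public ResultNode(string text) { _text = text; }
    public override string Evaluate(int n) => _text;
}

public sealed class SketchIfNode : Node
{
    public Func<int, bool> Condition { get; set; }
    public Node Body { get; set; }
    public Node Else { get; set; } // null when there is no else branch

    public override string Evaluate(int n)
    {
        if (Condition(n)) return Body.Evaluate(n);
        return Else != null ? Else.Evaluate(n) : "";
    }
}

public static class TreeDemo
{
    // Builds the same shape as the tree shown above.
    public static Node Build() => new SketchIfNode
    {
        Condition = n => n < 5,
        Body = new ResultNode("小于 5"),
        Else = new SketchIfNode
        {
            Condition = n => n < 20,
            Body = new ResultNode("大于等于 5 且小于 20"),
            Else = new ResultNode("大于等于 20")
        }
    };
}
```

Calling TreeDemo.Build().Evaluate(10) follows the path from the script: the outer condition fails, evaluation moves into the nested node in Else, and the second condition selects the middle message.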

Practical Applications

Chinese scripting engines built with AScript have numerous practical applications:

Domain-Specific Languages (DSLs)

Business domains often have terminology and logic that don't map cleanly to traditional programming constructs. A Chinese scripting engine allows:

  • Business Rule Expression: Rules written in business terminology
  • Workflow Definition: Processes described in natural language
  • Configuration Scripts: Settings specified by domain experts

Business Rule Engines

Organizations can encode business logic in scripts that business analysts can read and modify:

如果 客户.等级 == "VIP" 则 {
    订单.折扣 = 0.2;
} 否则 如果 订单.金额 > 1000 则 {
    订单.折扣 = 0.1;
} 否则 {
    订单.折扣 = 0;
}

This approach bridges the gap between technical implementation and business understanding.
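For comparison, the same rule written directly in C# might look like the sketch below. The Customer and Order classes and their property names are hypothetical English stand-ins for the 客户 and 订单 objects in the script; a reference version like this can also serve as a baseline when checking that a script produces the intended discounts.

```csharp
// Hypothetical host-side types mirroring the objects used in the script.
public sealed class Customer
{
    public string Level { get; set; } = "";   // 等级
}

public sealed class Order
{
    public decimal Amount { get; set; }        // 金额
    public decimal Discount { get; set; }      // 折扣
}

public static class DiscountRule
{
    // Same decision order as the script: VIP first, then large orders, then default.
    public static void Apply(Customer customer, Order order)
    {
        if (customer.Level == "VIP")
        {
            order.Discount = 0.2m;
        }
        else if (order.Amount > 1000m)
        {
            order.Discount = 0.1m;
        }
        else
        {
            order.Discount = 0m;
        }
    }
}
```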

Educational Tools

Chinese scripting can serve as an educational bridge:

  • Programming Introduction: Students learn programming concepts in their native language
  • Logic Training: Computational thinking developed without language barriers
  • Progressive Learning: Transition from Chinese to English syntax as skills advance

Rapid Prototyping

Development teams can prototype logic quickly using natural language scripts before implementing in production code. This accelerates iteration and improves communication between technical and non-technical stakeholders.

Extending the Language

The conditional statement is just the beginning. A complete Chinese scripting language would include:

Additional Control Structures

  • Loops: 当 ... 时 (while), 对于每个 (for each)
  • Exception Handling: 尝试 (try), 捕获 (catch), 最终 (finally)
  • Switch Statements: 选择 (switch), 情况 (case)

Data Structures

  • Arrays: 数组 with Chinese indexing syntax
  • Objects: 对象 with Chinese property access
  • Collections: 列表, 字典, 集合

Function Definitions

  • Function Declaration: 函数 名称 (参数) { ... }
  • Lambda Expressions: Chinese arrow function syntax
  • Anonymous Functions: Inline function definitions

Standard Library

  • String Operations: Chinese-named string manipulation functions
  • Math Functions: Mathematical operations with Chinese identifiers
  • I/O Operations: File and network operations in Chinese

Challenges and Considerations

Ambiguity Resolution

Natural languages are inherently ambiguous. Unlike programming languages designed for precision, natural language contains:

  • Contextual Meaning: Words change meaning based on context
  • Implicit Relationships: Connections not explicitly stated
  • Cultural Nuances: Expressions specific to language culture

The scripting engine must resolve these ambiguities through careful grammar design and validation rules.

Performance Optimization

Interpreted languages trade performance for flexibility. Optimization strategies include:

  • Syntax Tree Caching: Reusing parsed structures for repeated scripts
  • JIT Compilation: Converting frequently executed scripts to native code
  • Expression Optimization: Simplifying expressions during parsing
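The first strategy, syntax tree caching, can be sketched independently of any particular engine. In the example below, ParseCache is a hypothetical helper: TTree stands for whatever parsed representation the engine produces, and the parse delegate is supplied by the caller. Whether and how AScript exposes its parsed trees for reuse is not covered here.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper: caches the result of an expensive parse step,
// keyed by the exact script text.
public sealed class ParseCache<TTree>
{
    private readonly ConcurrentDictionary<string, TTree> _cache =
        new ConcurrentDictionary<string, TTree>(StringComparer.Ordinal);
    private readonly Func<string, TTree> _parse;

    public ParseCache(Func<string, TTree> parse)
    {
        _parse = parse ?? throw new ArgumentNullException(nameof(parse));
    }

    // Returns the cached tree for this source text, parsing it on first use.
    public TTree GetOrParse(string source) => _cache.GetOrAdd(source, _parse);

    // Number of distinct scripts parsed so far.
    public int Count => _cache.Count;
}
```

One design caveat: under concurrent access, ConcurrentDictionary.GetOrAdd may invoke the parse delegate more than once for the same key, although only one result is kept. That is usually acceptable when parsing has no side effects.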

Tooling and Debugging

A complete language ecosystem requires:

  • Syntax Highlighting: Editor support for Chinese keywords
  • Debugging Tools: Breakpoints, step-through execution, variable inspection
  • Error Messages: Clear, helpful error messages in Chinese

The Future of Natural Language Programming

The Chinese scripting engine example demonstrates a broader trend: programming languages becoming more accessible and expressive. As AI and natural language processing advance, we may see:

  • Hybrid Languages: Mixing natural language with formal syntax
  • AI-Assisted Coding: Natural language descriptions converted to code
  • Domain-Specific Evolution: Languages tailored to specific industries

AScript provides a practical foundation for exploring these possibilities today.

Conclusion: Empowering Developers Through Language

The Chinese scripting engine built with AScript represents more than a technical achievement—it's a step toward democratizing programming. By allowing developers and domain experts to express logic in their native language, we:

  • Reduce Barriers: Lower the entry threshold for programming
  • Improve Communication: Bridge gaps between technical and business teams
  • Enhance Productivity: Enable faster iteration and modification
  • Preserve Knowledge: Capture business logic in understandable form

The implementation demonstrated here—conditional statements with "如果 ... 则 ... 否则 ..."—provides a template for building complete natural language scripting engines. The principles extend beyond Chinese to any language, opening possibilities for truly global, accessible programming.

For developers interested in exploring this further, the AScript library provides the tools. The question isn't whether natural language programming has value—it's what you'll build with it.

The future of programming isn't just about more powerful languages—it's about more accessible ones. Chinese scripting with AScript is a concrete step in that direction.