Introduction: Python's Elegant Metamodel Design

While Python has established itself as the predominant language in the AI and machine learning domains, developers coming from statically typed languages like C# often find aspects of Python's design questionable. However, there is one absolute highlight in Python's architecture that deserves appreciation: its metaclass-based metamodel. This article explores how Python's metamodel works, why it matters, and how understanding it can elevate your mastery of the language.

1. The Metaverse Analogy: Understanding "Meta"

To grasp the concept of metaclasses, let's draw an analogy with the metaverse concept that gained significant attention in recent years. From a "creation" perspective, the metaverse defines the rules of a universe (our reality). The universe itself is an instance constructed from metadata. The terms "meta" and "instance" are not mutually exclusive—a metaverse is also a universe, and its rules are defined by a higher-level "meta-metaverse." The metaverse becomes an instance of this meta-metaverse.

This abstraction can continue indefinitely until we reach an ultimate origin point where we cannot find the source of rules. Some call this the "Creator," Laozi referred to it as "Tao," and we might term it the "Genesis Universe." Since we cannot find a meta for the Genesis Universe, it serves as its own meta, forming a self-consistent closed loop. If we view the universe as an instance of its metaverse, then the Genesis Universe's instances include itself.

The meta can be simpler than the instance—as the saying goes, "the greatest truths are simple." The Tao generates One, One generates Two, Two generates Three, and Three generates all things. Alternatively, the meta can be more complex than the instance, containing intricate rules from which we select subsets to construct instances.

In Python's "metaverse," we use meta to define rules for instances and serve as factories for creating them. Thus, classes are the meta of instances. We can use metaclasses to create classes, making conventional classes instances of metaclasses. Metaclasses are the meta of conventional classes. In this sense, "metaclass" might be more accurately termed "class-meta," but the term "metaclass" also conveys that metaclasses themselves are classes. Since metaclasses are classes, they can have their own metaclasses. Therefore, class is meta, and meta is also class—just as "metaverse is also universe, and universe can serve as metaverse."

Python's Genesis Universe is type. Our classes are constructed by it, making type the metaclass of classes. Since type is a metaclass, it is itself a class and has everything a class has. Because it sits at the genesis level, type is also its own instance: type(type) is type.
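
The closed loop described above can be checked directly in the interpreter. The following is a minimal sketch; the class name Plain is a hypothetical placeholder introduced only for illustration.

```python
# "Plain" is a hypothetical class used only for this illustration.
class Plain:
    pass

assert type(Plain) is type        # ordinary classes are instances of type
assert type(type) is type         # type is its own metaclass
assert isinstance(type, type)     # hence type is an instance of itself
assert type(object) is type       # object, the root base class, is built by type
assert issubclass(type, object)   # while type, like every class, inherits from object
```

The last two assertions show that the loop involves two objects: object is an instance of type, and type is a subclass of object.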

2. Metaclass Definition: The Instance Creation Process

By default, when we use a class to create an instance, the underlying process involves:

  1. Passing the class and parameters (including any added keyword arguments) to the class's __new__ method to create a base object
  2. Passing this base object and parameters to the class's __init__ method for initialization
  3. Returning this initialized object

Consider the following Foobar class definition. The two ways of creating instances shown below (foobar1 and foobar2) are equivalent:

from typing_extensions import Self
from typing import Any

class Foobar:
    def __new__(cls, *args: Any, **kwargs: Any) -> Self:
        return super().__new__(cls)
    
    def __init__(self, foo: int, bar: int) -> None:
        self.foo = foo
        self.bar = bar
    
    def __eq__(self, value: object) -> bool:
        if not isinstance(value, Foobar):
            return NotImplemented
        return self.foo == value.foo and self.bar == value.bar

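# Normal instantiation: __new__ and __init__ run automatically
foobar1 = Foobar(foo=111, bar=222)
# Manual instantiation: call __new__ and then __init__ explicitly
foobar2 = Foobar.__new__(Foobar, foo=111, bar=222)
Foobar.__init__(foobar2, foo=111, bar=222)
assert foobar1 == foobar2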

Similarly, a metaclass, being a class itself, can define __new__ and __init__ methods. In an ordinary class these methods create and initialize its instances. In a metaclass they serve the same purpose; the instances are simply the classes that use it as their metaclass.

When the Python interpreter encounters a class definition with a specified metaclass, it follows a similar process:

  1. Call the metaclass's __new__ method with a fixed set of parameters to construct a class object:

    • The current metaclass
    • The class name
    • The base classes, as a tuple
    • Class members represented as a dictionary
    • Keyword arguments specified during class definition
  2. Call the metaclass's __init__ method to initialize the class object created above:

    • self is the class object
    • Remaining parameters include the class name, base classes, members dictionary, and keyword arguments

Consider this metaclass Meta example. The __new__ and __init__ methods print all of their input parameters. In __new__, the keyword arguments are added to the namespaces dictionary so that they become class attributes, and then type's __new__ method is called to create the class object. In __init__, a function repr that builds a string from the keyword arguments is defined and set as the class's __repr__ method:

from typing import Any

class Meta(type):
    def __new__(cls, name: str, bases: tuple[type, ...], namespaces: dict[str, Any], /, **kwds: Any):
        print(f"""
__new__
 cls: {cls}
 name: {name}
 bases: {bases}
 namespaces: {namespaces}
 kwds: {kwds}
""")
        for key, value in kwds.items():
            namespaces[key] = value
        return super().__new__(cls, name, bases, namespaces)
    
    def __init__(self, name: str, bases: tuple[type, ...], dict: dict[str, Any], /, **kwds: Any) -> None:
        print(f"""
__init__
 cls: {self}
 name: {name}
 bases: {bases}
 dict: {dict}
 kwds: {kwds}
""")
        def repr(self) -> str:
            express = ", ".join(f"{key}={value!r}" for key, value in kwds.items())
            return f"({express.strip()})"
        setattr(self, "__repr__", repr)

class Foo:
    ...

class Bar(Foo, metaclass=Meta, x=-1, y=-1):
    ...

print(Bar())

Here, Meta serves as the metaclass for class Bar, while Foo is the base class of Bar. Two keyword arguments are specified when defining Bar, and Meta's __new__ method turns them into class attributes. Defining Bar prints the parameters received by __new__ and __init__, and print(Bar()) then goes through the injected __repr__ and prints (x=-1, y=-1).
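
To make the process above concrete, a class statement can be compared with an explicit call to the metaclass, which is roughly what the interpreter performs on our behalf. This is a minimal sketch: KwMeta, Base, ViaStatement, and ViaCall are hypothetical names, and KwMeta mirrors Meta without the printing.

```python
from typing import Any

# Hypothetical metaclass: copies class-definition keywords into the namespace.
class KwMeta(type):
    def __new__(mcls, name: str, bases: tuple[type, ...],
                namespace: dict[str, Any], /, **kwds: Any):
        namespace.update(kwds)
        # The keywords are consumed here, so they are not forwarded to type.__new__.
        return super().__new__(mcls, name, bases, namespace)

    def __init__(cls, name: str, bases: tuple[type, ...],
                 namespace: dict[str, Any], /, **kwds: Any) -> None:
        super().__init__(name, bases, namespace)

class Base:
    pass

# Form 1: the usual class statement.
class ViaStatement(Base, metaclass=KwMeta, x=-1, y=-1):
    pass

# Form 2: calling the metaclass directly with name, bases, namespace, keywords.
ViaCall = KwMeta("ViaCall", (Base,), {}, x=-1, y=-1)

assert type(ViaStatement) is KwMeta and type(ViaCall) is KwMeta
assert ViaStatement.x == ViaCall.x == -1
assert ViaCall.__mro__ == (ViaCall, Base, object)
```

Both forms go through KwMeta's __new__ and __init__ with the same kinds of arguments; the class statement additionally collects the class body into the namespace dictionary first.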

3. Instances Are Created by Metaclasses

The extensive discussion above aims to illustrate the class instance creation process: first calling __new__ to build a base object, then passing it to __init__ for initialization. However, this is merely the surface appearance, not the essence. The fundamental truth is: instances of a class are created by its metaclass.

When we speak of "using a class to construct an instance," we are already treating the class as a factory function. In Python's world, everything is an object. Functions are no exception: a function is an instance of the function class. More generally, a callable is an instance of a class that has a __call__ method (either self-defined or inherited from a base class), and calling an object essentially invokes the __call__ method defined by that object's class.

Invoking a class in function form therefore means that its instances are created through the __call__ method defined in its metaclass. The following demonstration illustrates this point:

from typing import Any

class Foo:
    ...

class Meta(type):
    def __call__(self, *args: Any, **kwds: Any) -> Any:
        assert self is Bar
        assert type(self) is Meta
        assert args == ("111", "222")
        assert kwds == {"c": "333", "d": "444"}
        return Foo()

class Bar(metaclass=Meta):
    def __init__(self, a: str, b: str, **kwargs) -> None:
        self.x = a
        self.y = b
        self.kwargs = kwargs

assert isinstance(Bar("111", "222", c="333", d="444"), Foo)

As with any method, the first parameter is the object on which the method is called (the class for class methods, the instance for instance methods). The assertions in the __call__ method show that this object is the Bar class, and that type(self) is naturally Meta, the metaclass that created Bar. The positional and keyword arguments passed in Bar("111", "222", c="333", d="444") arrive in args and kwds as a tuple and a dictionary respectively. Because Meta's __call__ returns Foo() directly, Bar's own __init__ never runs.

Then why does calling Foo() return a Foo object for ordinary classes like Foo? The same rule still applies: although Foo doesn't specify a concrete metaclass, type serves as the fallback metaclass. The instance returned by Foo() is actually created by the __call__ method defined in type, which follows this logic:

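class type:
    # Simplified sketch of the default logic (the one-argument type(obj) form is omitted)
    def __call__(self, *args: Any, **kwds: Any) -> Any:
        instance = self.__new__(self, *args, **kwds)
        # __init__ only runs if __new__ returned an instance of this class
        if isinstance(instance, self):
            self.__init__(instance, *args, **kwds)
        return instance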

This method creates objects by:

  1. First calling the class's __new__ to build a base object
  2. Then, provided that object is an instance of the class, passing it to the class's __init__ method for initialization, and finally returning it

We need not worry about whether a class defines __new__ and __init__, as the ultimate base class object provides fallbacks:

  • object's __new__ allocates memory for a new, empty instance
  • object's __init__ does nothing
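
One detail of the default flow is easy to observe: __init__ is only invoked when __new__ returns an instance of the class being called. The sketch below records the calls in a list; the names calls, Other, ReturnsOther, and Normal are hypothetical and exist only for this illustration.

```python
calls: list[str] = []  # records which special methods actually ran

class Other:
    pass

class ReturnsOther:
    def __new__(cls):
        calls.append("ReturnsOther.__new__")
        return Other()  # deliberately NOT an instance of ReturnsOther

    def __init__(self):
        calls.append("ReturnsOther.__init__")

class Normal:
    def __new__(cls):
        calls.append("Normal.__new__")
        return super().__new__(cls)  # falls back to object.__new__

    def __init__(self):
        calls.append("Normal.__init__")

result = ReturnsOther()
normal = Normal()

assert isinstance(result, Other)
assert calls == [
    "ReturnsOther.__new__",   # __init__ skipped: wrong type returned
    "Normal.__new__",
    "Normal.__init__",
]
```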

4. The Ultimate Metaclass: type

4.1 The type.__new__ Method

Let's examine the __new__ method defined in the type class, which constructs a class object. The __new__ method of our custom metaclass Meta ultimately calls this method (return super().__new__(cls, name, bases, namespaces)). The signature is:

class type:
    def __new__(
        cls: type[Self],
        name: str,
        bases: tuple[type, ...],
        namespace: dict[str, Any], /,
        **kwds: Any
    ) -> Self: ...

The five parameters are:

  • cls: The metaclass constructing the class
  • name: Names the ultimately constructed class
  • bases: Base classes for the constructed type
  • namespace: Class members, as a dictionary
  • kwds: Additional keyword arguments

If the cls parameter is the type class object itself, a conventional class is constructed, named by the name parameter, with bases as its base classes and the members defined by namespace. type's __new__ does not use kwds itself; it forwards them to the __init_subclass__ hook of the base class, and the default object.__init_subclass__ raises TypeError if it receives any.

If the cls parameter specifies a custom metaclass, things become interesting. Both the normal metaclass call and a direct call to type's __new__ can create the class, but they are not equivalent. A direct call wins simply because it is what we explicitly invoked: the resulting class is an instance of the custom metaclass, yet the custom metaclass's own __new__ and __init__ are bypassed and never take effect.
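
The difference between calling a custom metaclass and calling type's __new__ directly can be sketched as follows. LoggingMeta, the log list, and the class names Regular and Direct are hypothetical names introduced for this illustration.

```python
log: list[str] = []  # records which metaclass hooks ran

# Hypothetical metaclass that logs its own __new__ and __init__.
class LoggingMeta(type):
    def __new__(mcls, name, bases, namespace, /, **kwds):
        log.append(f"__new__ {name}")
        return super().__new__(mcls, name, bases, namespace)

    def __init__(cls, name, bases, namespace, /, **kwds):
        log.append(f"__init__ {name}")
        super().__init__(name, bases, namespace)

# Normal path: calling the metaclass runs its __new__ and __init__.
Regular = LoggingMeta("Regular", (), {})

# Direct path: type.__new__ builds the class without touching LoggingMeta's hooks.
Direct = type.__new__(LoggingMeta, "Direct", (), {})

assert log == ["__new__ Regular", "__init__ Regular"]
assert type(Regular) is LoggingMeta
assert type(Direct) is LoggingMeta   # still an instance of the custom metaclass
```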

4.2 The type Function

As the ultimate metaclass, type is also a class. As mentioned earlier: when we treat a class as an executable function, the metaclass's __call__ method is called behind the scenes. Since type's metaclass is itself, calling the type function essentially invokes type's __call__ method.

The type function's logic is:

  • If a single object is passed, it returns the object's class. Specifically, for a conventional object its class is returned; for a class, its metaclass is returned, which is the fallback metaclass type when no custom metaclass applies
  • Otherwise, the type function creates a class and expects, in order: class name, base classes, class members, and keyword arguments (with the metaclass type itself prepended, these exactly match the parameters of the __new__ method)

Therefore, type's __call__ determines whether to return the class of a specified object or create a class based on parameter format.
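
Both behaviors of the type function can be exercised side by side. In this sketch DemoMeta, WithMeta, describe, and Point2D are hypothetical names used only for illustration.

```python
class DemoMeta(type):
    pass

class WithMeta(metaclass=DemoMeta):
    pass

# One argument: report the class of the object.
assert type(42) is int                   # conventional object -> its class
assert type(WithMeta()) is WithMeta
assert type(WithMeta) is DemoMeta        # class -> its metaclass
assert type(int) is type                 # no custom metaclass -> type

# Three arguments: build a new class from name, bases, and members.
def describe(self) -> str:
    return f"Point2D({self.x}, {self.y})"

Point2D = type("Point2D", (object,), {"x": 0, "y": 0, "describe": describe})

p = Point2D()
assert p.describe() == "Point2D(0, 0)"
assert type(Point2D) is type
```

The three-argument form produces the same kind of object as a class statement with the same name, bases, and body.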

5. Reorganizing Class Generation and Instantiation

Let me summarize class generation and class-based instantiation:

For a class code fragment written using the class keyword, the Python interpreter constructs this class as follows:

  1. Determine the metaclass (the one explicitly specified, otherwise the one inherited from the base classes, otherwise type), and call the metaclass's __new__ method with it and the class definition information (class name, base classes, all members defined for the class, and keyword arguments) as parameters:

    • For custom metaclasses, if it (or its custom base class) overrides the __new__ method, it can theoretically return any object
    • If the metaclass is type, or the custom metaclass (including its custom base class) doesn't override __new__, then type's __new__ method is called, which generates and returns the class object exactly as we defined it
  2. Pass the class object returned by the __new__ method, along with the class definition information, as parameters to the metaclass's __init__ method:

    • If the custom metaclass (or its custom base class) overrides __init__, it can theoretically perform arbitrary processing on the class object, such as adding or replacing class attributes
    • The one thing it can no longer change is the slots layout: __slots__ is processed when the class object is created, so the memory layout of instances is already fixed and slots cannot be added or removed afterwards
    • If the metaclass is type, or the custom metaclass (including its custom base class) doesn't override __init__, then type's __init__ method is called, which performs no operations

When we call a class object as a factory function to instantiate it, the metaclass's __call__ method is invoked, with the class object as its first parameter:

  • If the custom metaclass (or its custom base class) overrides __call__, it can theoretically return any object
  • If the metaclass is type, or the custom metaclass (including its custom base class) doesn't override __call__, then type's __call__ method is called, which implements the default instantiation flow
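
The order of events summarized above can be traced with a small experiment. This is a sketch under the assumption that every hook simply delegates to its default; TraceMeta, Widget, and the events list are hypothetical names.

```python
events: list[str] = []  # chronological record of the hooks that ran

class TraceMeta(type):
    def __new__(mcls, name, bases, namespace, /, **kwds):
        events.append("meta.__new__")
        return super().__new__(mcls, name, bases, namespace)

    def __init__(cls, name, bases, namespace, /, **kwds):
        events.append("meta.__init__")
        super().__init__(name, bases, namespace)

    def __call__(cls, *args, **kwds):
        events.append("meta.__call__")
        return super().__call__(*args, **kwds)  # default flow in type.__call__

class Widget(metaclass=TraceMeta):
    def __new__(cls, *args, **kwds):
        events.append("cls.__new__")
        return super().__new__(cls)

    def __init__(self, size: int) -> None:
        events.append("cls.__init__")
        self.size = size

# Defining the class ran only the metaclass's __new__ and __init__.
assert events == ["meta.__new__", "meta.__init__"]

w = Widget(3)

# Instantiation went through the metaclass's __call__, then the class's hooks.
assert events[2:] == ["meta.__call__", "cls.__new__", "cls.__init__"]
assert w.size == 3
```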

6. Metaclass Instance Methods

Since instances of a metaclass are the classes that use it as their metaclass, instance methods defined in the metaclass behave like class methods of those classes. Taking the PointMeta metaclass as an example, parse is its instance method, but for Point, which uses PointMeta as its metaclass, parse can be called like a class method:

class PointMeta(type):
    def __new__(cls, name, bases, namespace, **kwds):
        namespace["x"] = 0
        namespace["y"] = 0
        return super().__new__(cls, name, bases, namespace)
    
    def parse(cls, s):
        x_str, y_str = s.split(",")
        point = cls()
        point.x = int(x_str)
        point.y = int(y_str)
        return point

class Point(metaclass=PointMeta):
    pass

p = Point.parse("1,2")
assert p.x == 1
assert p.y == 2
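
One difference from a method declared with @classmethod is worth noting: a method defined on the metaclass is reachable from the class but not from its instances, because instance attribute lookup searches the class and its bases, not the metaclass. A minimal sketch, with ShapeMeta, Shape, describe, and create as hypothetical names:

```python
class ShapeMeta(type):
    def describe(cls) -> str:          # instance method of the metaclass
        return f"class {cls.__name__}"

class Shape(metaclass=ShapeMeta):
    @classmethod
    def create(cls) -> "Shape":        # ordinary class method
        return cls()

shape = Shape.create()

assert Shape.describe() == "class Shape"      # callable on the class
assert isinstance(shape.create(), Shape)      # classmethod works on instances too

try:
    shape.describe()                          # metaclass method is not visible here
except AttributeError:
    visible_from_instance = False
else:
    visible_from_instance = True

assert visible_from_instance is False
```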

7. Determining Instance Types

Have you ever wondered how isinstance or type functions determine which class a specified object belongs to? Some might say, doesn't every object have a __class__ field? Correct, but where does this field originate?

Regardless of programming language, an object always corresponds to one (contiguous) or multiple (non-contiguous) memory spaces. All information provided by the object originates from here, including the class to which the object belongs. Therefore, understanding an object's memory layout reveals all information about the object.

For a conventional object (not a class object—class object layouts are far more complex), its memory layout depends on whether it uses slots mode.

7.1 Non-slots Mode: Dynamic Dictionary Layout (Default)

This is Python's most commonly used mode, with flexibility as its core. Memory is laid out as follows:

  • PyObject Header: Contains the reference count and a pointer to the type object
  • __dict__ pointer: Points to a real Python dictionary object
  • __weakref__ pointer: Used to support weak references

Dynamic capability is this layout's greatest advantage. Since attributes aren't stored in the instance itself but in the dictionary pointed to by __dict__, we can add arbitrary members at any time. The tradeoff is higher memory overhead: each instance must maintain an additional hash table object, and dictionaries reserve spare space to reduce collisions. Each attribute lookup also involves hashing, which affects access speed.

7.2 Slots Mode

This mode adopts a compact array layout. When we define __slots__ = ('a', 'b'), Python structures object memory as follows:

  • PyObject Header: Reference count and type pointer
  • Fixed offset attribute slots: Directly reserve positions for a and b in memory (storing pointers to specific objects)

There's no __dict__ (unless you explicitly add '__dict__' to __slots__). This resembles the memory layout used by statically compiled languages, so instances can only have the members defined in __slots__. Attempting to add a new attribute raises AttributeError.

While dynamic capability is sacrificed, the compact layout removes the overhead of the entire dictionary object. With millions of small objects, memory usage typically drops by 40%-70%. Attribute access becomes direct memory addressing via base address plus fixed offset, with no hash calculation, which improves performance.
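
The behavioral difference between the two layouts can be observed without looking at memory at all. The following sketch uses two hypothetical classes, DictPoint and SlotPoint, that differ only in whether __slots__ is declared.

```python
class DictPoint:                     # default layout: attributes live in __dict__
    def __init__(self, a: int, b: int) -> None:
        self.a = a
        self.b = b

class SlotPoint:                     # slots layout: fixed set of attributes
    __slots__ = ("a", "b")

    def __init__(self, a: int, b: int) -> None:
        self.a = a
        self.b = b

d = DictPoint(1, 2)
s = SlotPoint(1, 2)

d.c = 3                              # new attributes can be added freely
assert d.__dict__ == {"a": 1, "b": 2, "c": 3}

assert not hasattr(s, "__dict__")    # no per-instance dictionary exists
try:
    s.c = 3                          # not declared in __slots__
except AttributeError:
    rejected = True
else:
    rejected = False

assert rejected
assert (s.a, s.b) == (1, 2)          # declared slots work as usual
```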

Returning to the question: how is an instance's type determined? Simply put, regardless of which memory layout is adopted, the object has a PyObject Header containing a pointer to its class object, and this pointer is the basis for judging the object's type. In the instantiation process described above, who writes this pointer? Instantiating an ordinary class ultimately calls object's __new__ method, which calculates the memory size required for the object based on the specified class, allocates a matching memory segment, and writes the class object's address into the type pointer in the PyObject Header.

class object:
    def __new__(cls) -> Self: ...
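
That the type information is already in place after __new__, before any initialization, can be checked by calling object's __new__ by hand. In this sketch Tagged is a hypothetical class and raw is the uninitialized object.

```python
class Tagged:
    def __init__(self) -> None:
        self.ready = True

# Allocate the object only; __init__ is deliberately not called.
raw = object.__new__(Tagged)

assert type(raw) is Tagged            # the type pointer is already set
assert raw.__class__ is Tagged        # __class__ reports the same class
assert isinstance(raw, Tagged)
assert not hasattr(raw, "ready")      # initialization has not happened yet
```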

Conclusion

Python's metaclass system represents one of the language's most elegant design decisions. By understanding how metaclasses work, developers gain deeper insight into Python's object model and can leverage this knowledge for advanced metaprogramming tasks. The metamodel's self-consistent design—where type serves as its own metaclass—creates a beautiful closed loop that exemplifies Python's philosophical approach to language design.