
Python Type Hints – Part 1: Type Hints

A type hint is a kind of comment in Python, used by developers to indicate which class they expect certain objects to be instances of.

A simple example can be:

a: int = 5
b: int = 3
c: float = a/b

However, there are a few more subtleties to it.

1 – Type-hints do nothing

In a practical sense, the type-hint is, well, a hint, and is not enforced in any way.
For example, the following code is perfectly legitimate, and works with no problems:

def f(a: int) -> list:
    return a*5

print( f("abc") )
# abcabcabcabcabc

This is an example of a function that declares it expects an int and returns a list, yet in practice it happily accepts a string and returns a string.

Why does it work?
Because, as the title hinted, type-hints do nothing. They are ignored by Python.

Can we prove it?

We can prove it like so:

# When Python first sees a line of code, it parses it into an AST.
# In the AST, we do see that there is an `annotation`.
>>> import ast
>>> print( ast.dump( ast.parse("x: int = 'a'"), indent=2) )
Module(
  body=[
    AnnAssign(
      target=Name(id='x', ctx=Store()),
      annotation=Name(id='int', ctx=Load()),
      value=Constant(value='a'),
      simple=1)],
  type_ignores=[])

# If we compile the line of code
#     (meaning that Python will convert it into an AST,
#         and will then convert it into byte-code),
#     then we still see the annotation
>>> import dis
>>> dis.dis(compile("x: int = 'a'", "file_name", "exec"))
  0           0 RESUME                   0

  1           2 SETUP_ANNOTATIONS
              4 LOAD_CONST               0 ('a')
              6 STORE_NAME               0 (x)
              8 LOAD_NAME                1 (int)
             10 LOAD_NAME                2 (__annotations__)
             12 LOAD_CONST               1 ('x')
             14 STORE_SUBSCR
             18 LOAD_CONST               2 (None)
             20 RETURN_VALUE

# However, once we place the code in a function,
#     we see that the annotation is gone
>>> def f():
...     x: int = 'a'

>>> dis.dis(f)
  1           0 RESUME                   0

  2           2 LOAD_CONST               1 ('a')
              4 STORE_FAST               0 (x)
              6 LOAD_CONST               0 (None)
              8 RETURN_VALUE

What’s that opcode?

What’s that magical STORE_SUBSCR ( __annotations__ ) thing?
Well, it seems that it adds annotations to variables defined in the interpreter:
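For example, straight from the REPL:

>>> x: int = 'a'     # runs fine; nothing is enforced
>>> __annotations__  # but the hint was recorded at the top level
{'x': <class 'int'>}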

Why is it ignored?

It may seem a bit weird at first that the type hints are available internally (in the AST, and in code run by the interpreter outside of any function), while being invisible (not showing up in a function's byte-code) and ignored (there's no enforcement that x will be an int).

But there’s some logic behind that.
Python, as a language, has every object (in the C level) inherit from the same type – everything is PyObject*.
For example, here’s: PyObject_SetItem:

 /* o[key]=v. */
PyAPI_FUNC(int) PyObject_SetItem(PyObject *o, PyObject *key, PyObject *v);

This function signature cannot tell the type of the objects coming in. It simply receives PyObject*.
If the function really wants to know the type of the object, then it can do the C equivalent of
if isinstance(obj, typ): ....
And that’s by design.
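As a hedged illustration, here's how such a check could look in C (the helper function is hypothetical; PyLong_Check and PyObject_IsInstance are real CPython APIs):

#include <Python.h>

/* Hypothetical helper – the C spelling of `isinstance(obj, int)`.
   PyLong_Check is CPython's fast check for int (and its subclasses);
   PyObject_IsInstance(obj, cls) is the general form. */
static int is_an_int(PyObject *obj)
{
    return PyLong_Check(obj);
}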

However, it is sometimes not totally ignored

When assigning type hints to variables, like shown above, the type-hint is totally ignored by Python.
However, when assigning type-hints to functions, or classes, the type hints are stored somewhere.

>>> def f(a: int, b: float) -> str:
...     return a+b

>>> f.__annotations__
{'a': int, 'b': float, 'return': str}

Likewise, for classes:

>>> class A:
...     a: int
...     b: str = 3

>>> A.__annotations__
{'a': int, 'b': str}

>>> a = A()
>>> a.__annotations__
{'a': int, 'b': str}

Other than storing the type-hints in the __annotations__ dict, they are ignored.

While type-hints are mostly ignored, there are some things that use them

Who’s using type-hints?

There are 2 obvious answers:
The first, and the most important, is: programmers.
Type hints are a way of adding comments to the code, and explaining the code in a short, concise, and readable way. They are great.

(One such example I personally like is when working with physical calculations that have units.
Writing stuff like: earth_radius: KiloMeter = ... or rotation: Radians = np.pi)

The second is: static code analysis.
The two most common are PyCharm and mypy.

Yet there are more uses for type hints. Famous examples are pydantic and dataclasses.

How does the dataclasses module use __annotations__?

The first thing CTRL+F found is the following from the _process_class function:
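Paraphrased (the exact code varies between CPython versions), the hit looks like:

# Lib/dataclasses.py, _process_class (paraphrased; exact code varies by version)
cls_annotations = cls.__dict__.get('__annotations__', {})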

Next, it sets up all the annotations into the fields property:

If you’re curious about the fields property, then here it is:
_FIELDS = '__dataclass_fields__'
setattr(cls, _FIELDS, fields)

And, lastly, the dataclass creates the __init__ function:

Now, what piqued my interest here is that there's an option to choose the name of the self argument. And it's passed as a string.
Answering how it works will also tell us how dataclasses dynamically create a function.

Onto _init_fn

Basically, the whole function fits in a single screenshot, and is pretty self-explanatory

The interesting thing is that it generates strings!
As a consequence, we can pretty much predict what _create_fn does:

Oh my! an exec in the wild!

Basically, it is merely automation for writing self.%s = %s,
which is exactly what's promised.

Yet there’s a nice trick they’re doing – they create a function inside a function.
Line 428 creates our __init__ function
and line 431 creates a different function, whose output is our __init__ function.

This is done so that the newly created function will have access to the local scope (hence passing locals)
Note that it also has access to the same global scope, as passed in exec.
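To make the trick concrete, here's a minimal sketch of the same pattern (hypothetical names, not the actual dataclasses code):

def make_init(field_names):
    # Build the source of __init__ as a string, like dataclasses does.
    args = ", ".join(field_names)
    body = "\n".join(f"        self.{name} = {name}" for name in field_names) or "        pass"
    # Wrap it in an outer function, so the exec'd inner function is
    # created with access to the scope we pass in.
    src = (
        f"def __create_fn__():\n"
        f"    def __init__(self, {args}):\n"
        f"{body}\n"
        f"    return __init__\n"
    )
    namespace = {}
    exec(src, globals(), namespace)      # compile & run the generated source
    return namespace["__create_fn__"]()  # call the outer fn to get __init__

class Point:
    pass

Point.__init__ = make_init(["x", "y"])
p = Point(1, 2)
print(p.x, p.y)  # 1 2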

How does the pydantic module use __annotations__?

When pydantic creates a new model class, it calls a function called collect_model_fields.
The fields are exactly the annotations of that class:
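We can see the effect from the outside; a minimal example, assuming pydantic v2 is installed:

from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

# The model's fields come straight from the class annotations.
print(list(User.model_fields))  # ['name', 'age']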

Another place where pydantic uses annotations is when generating a schema:

Other worthy mentions

There’s functools.wraps, that now has __annotations__ in its list-of-properties-to-copy-from-wrapped-to-wrapper
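For example (output from Python 3.11; newer versions may add entries):

>>> import functools
>>> functools.WRAPPER_ASSIGNMENTS
('__module__', '__name__', '__qualname__', '__annotations__', '__doc__')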

There’s inspect, which exposes get_annotations.
It also has _signature_is_functionlike, which checks that obj.__annotations__ is either dict or None.
(for functions, it’s always dict or None. But for classes that behave like functions, we can alter that)

There’s typing, (duh).
It exposes a function called get_type_hints, which not only returns the annotations dict, but also handles strings as annotations.
We’ll talk about strings as annotations later, but I’ll just point out here that
def f(a: int)
and
def f(a: "int")
have the same meaning to us, as developers, and also for static analysis tools.
But, to Python, the former is a dict (with one key: "a") and a value of the type int itself, whereas the latter has a string as the value.

Thus, the get_type_hints function handles that:
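For example:

>>> import typing
>>> def f(a: "int"): ...
...
>>> f.__annotations__           # the raw dict keeps the string
{'a': 'int'}
>>> typing.get_type_hints(f)    # get_type_hints resolves it to the type
{'a': <class 'int'>}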

How does this function work?
A simple exception reveals it all:

typing also exposes a version of NamedTuple with annotations (that is, the programmer defines a class, inherits from NamedTuple, and writes the type hints themselves).
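For reference, here's the annotated NamedTuple in use:

>>> from typing import NamedTuple
>>> class Point(NamedTuple):
...     x: int
...     y: int
...
>>> Point(1, 2)
Point(x=1, y=2)
>>> Point.__annotations__
{'x': <class 'int'>, 'y': <class 'int'>}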
All typing does is define the class and make it behave like collections.namedtuple,
as well as adding the following magic (if you're not familiar with metaclasses, I can suggest my own posts about them. Otherwise, the following code will be pure magic, and can be ignored):

How is __annotations__ implemented?

In fact, there’s not much to it. The implementation is rather simple.

For functions, for example, there are but a few results, none of which reveal anything interesting.

Modules appear to have annotations.
That’s just like the example we had above, where a variable we defined in the interpreter created the global variable __annotations__.
Likewise, defining variables in the module’s scope (i.e. not inside functions/classes) will create an __annotations__ variable for the module.

The place where annotations are created is in the byte code.

Creating annotations for functions

In order to generate byte-code that creates annotations, let us define a function, f, whose code creates a function (g) with annotations:

def f():
    def g(a: int):
        return 1
    return g

Looking at the byte code, we see:

Starting from MAKE_FUNCTION, we see the following implementation:

Basically, it takes the top item, that’s code object g (at 18) – this will be the codeobj.
Then, it takes the next top item – this will be the annotations.

What’s that next-top-item?
We see that load a ; load int ; build tuple does exactly that – it builds a tuple (which will be converted to dict) that has the annotations.
Neat.

Side note: Python versions

The above byte-code is generated in Python 3.11.8.
For Python 3.13, we see a slightly different byte code:

Simply put, MAKE_FUNCTION has been split into the actual making of the function, and the setting of its attributes. There are minor changes in the implementation, but the main point is that the opcode has been split into two.

Creating annotations for classes

The byte code for creating classes is a bit more complicated (compared to load const codeobj ; make function).
However, we’re going to see that the part relevant to us – creating class annotations – is composed of 2 parts:
first, create the annotations property (this will be done by SETUP_ANNOTATIONS)
second, populate the dictionary with values.

Let’s see it in action:

The second step is in front of us – the STORE_SUBSCR does __annotations__['a'] = int
The first step is inside the SETUP_ANNOTATIONS opcode, which simply creates an empty dict

from __future__ import annotations

In [another post], I expanded more on how __future__ imports work.
I’ll just mention here that using from __future__ import annotations changes the compiler’s behavior so that annotations won’t be parsed to the objects they point to, but rather to the actual string that’s written.
In other words, def f(a: int) won't have the object int as the annotation, but rather the string "int".

We can see it in the following code:
When the future feature is turned on, the annotation value (the string) is used.
When turned off, the object it points to is used.
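A minimal reproduction (demo.py is a hypothetical file name):

# demo.py
from __future__ import annotations

def f(a: int):
    pass

print(f.__annotations__)  # {'a': 'int'} – the string, not the class int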

Summary

It’s that time again – the end of a post.
Is this post too long? Too short? Too detailed? I'm not sure, but I hope that it was right for you 🙂

In this post, we started looking at Python’s type hints.
At the syntax, and at the no-enforcement of it.
At the use cases, such as dataclasses and pydantic.
And at the implementation of it.

In the next post, we’re going to look at some special types that were created specifically for type hints.

See you next time 🙂

Matlab Internals

Below is my agonizing journey of overriding one of Matlab’s magic methods.

Setup

In Matlab, every function is a file.
If you wish to split your code into functions, you need some way of organizing the folder hierarchy.

(For those interested: there may be multiple functions defined in the same file. However, when one calls a function, say, foo, what Matlab does is search for a file with that name (in this case, foo.m) and call the first function in that file, regardless of that function's name)

So, what can one do to achieve a reasonable structure to the code?
Classes!

Basically, you're better off avoiding classes in Matlab, as they have spooky behavior.
But when the code becomes large enough, there's no escape..
(for example, I'm at the point of more than 500 files)

Step 1: Wrapping a list

A basic code for a class with a list inside can look as follows:

classdef MyClass
    properties
        data
    end

    methods
        function result = get(self, index)
            result = self.data(index);
        end
        function set(self, index, value)
            self.data(index) = value;
        end
    end
end

That’s not intimidating. Yet.
The next step is purely syntactic sugar, and is not needed.

Step 2: Overriding __getitem__

In Python, we’d write something like

class MyClass:
    def __getitem__(self, index):
        return self.data[index]

And this would allow us to write instance[index] rather than instance.get(index).

In Matlab, however, there’s a whole different story.

The function we need to override is called subsref.

subsref overview

Heading to the documentation, we find that “subsref” stands for “subscripted reference”. Lovely.
Plus, we see that the syntax is B = subsref(A,S). Aha. Yep. That tells us a lot, doesn’t it?

Reading further, we see that A(i), A{i}, and A.i are all mapped into calling B = subsref(A, S).
So, by overriding __getitem__, we also override __getattr__. How lovely. That surely won’t complicate things.

And, saving the best part for last, we have the following syntax:
A{1}.field(3:5), which translates to having that S as a 3x1 struct:

disp(S(1))
    type: '{}'
    subs: {[1]}
disp(S(2))
    type: '.'
    subs: 'field'
disp(S(3))
    type: '()'
    subs: {[3 4 5]}

Yep. You heard right. It’s not something logical, like “call __get_curly_brackets__ on A. Then, on the result, call __get_attr__, and on the result of that, call __call_function__”.
No.
It’s “Let’s give the caller all the information, and let him do the language processing for us”.

Fantastic.

Step 3: A Simple Bypass

What do we do when we see weird-looking code?
We let someone else handle it!

In pseudo code:
"""
if it is our very specific case:
    we deal with it
else:
    let the default behavior deal with it
"""

In Matlab code:

        function value = subsref(self, s)
            switch s(1).type
                case '()'
                    %{
                        The simple use case:
                            mps(5)
                            causes `s` to be
                                s(1).type = '()'
                                s(1).subs = {5}
                        The complicated use case:
                            mps(3 , 1:5)
                            causes `s` to be
                                s(1).type = '()'
                                s(1).subs = {3, [1 2 3 4 5]}
                    %}

                    subscripts = s(1).subs;
                    assert(length(subscripts) == 1, "Can't handle more at the moment");

                    indices = subscripts{1};

                    value = [];
                    for index = 1:numel(indices)
                        value = [value , self.get(indices(index))];
                    end

                otherwise
                    value = builtin('subsref', self, s);
            end
        end

Simple.

Well, we could simplify it a bit by having the for-loop actually take place inside self.get, but we're not doing that, since the objects returned from self.get in my real use case are a bit complicated, so this turns out to be a simpler way of iterating.

Do you see any problem?
Neither did I, for a while.
But then..

Step 4: Calling Conventions

See the last line in the above example?

                otherwise
                    value = builtin('subsref', self, s);

It seems to simply call the builtin function “subsref” with A=self and S=s. Doesn’t look too bad.

And indeed, when calling my_instance.get(3), it behaves nicely, just like my_instance(3), or my_instance.data(3). They are all equivalent and nice.

The problem is with my_instance.set(index, value).
You see, when we defined this function, it had no outputs.
But our bypass always calls it via value = builtin('subsref', self, s) – i.e. it requests one output.

Nope.
Matlab doesn’t approve.

Now, calling our set raises an error:

>> my_instance.set(index, value)
Error using MyClass/set
Too many output arguments.

Why? What went wrong?

I’ll tell you why.

When calling a Matlab function, Matlab creates 2 special variables: nargin and nargout.
They store the number of arguments that were passed into the function, and the number that’s expected to be returned.

The problem is that when Matlab "compiled" the line value = builtin('subsref', self, s);, it set nargout = 1. However, our set function cannot return an output, thus Matlab raised an error before calling set.
It took me too many debug attempts to understand that Matlab doesn’t even call set, it simply crashes since it determines that it cannot call it.
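As a quick aside, nargin and nargout can be observed directly; a sketch (save as demo.m – the file name and function are illustrative):

function out = demo(varargin)
    % Prints how many inputs were passed and how many outputs were requested.
    fprintf('nargin = %d, nargout = %d\n', nargin, nargout);
    out = [];
end

>> demo(1, 2)      % nargin = 2, nargout = 0
>> x = demo(1);    % nargin = 1, nargout = 1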

Thus, we update our code with a hardcoded fix:

                otherwise
                    if string(s(1).type) == "." && string(s(1).subs) == "set"
                        % no nargout
                        builtin('subsref', self, s);
                    else
                        value = builtin('subsref', self, s);
                    end

Fantastic.
I find it hard to excuse such behavior.
But, that’s what we’ve got, so we’ll deal with it, no matter how “fantastic” we think it is.

And indeed, it worked fine, until the next bug.

Step 5: Accessing Properties

What did we achieve so far?
– We were able to make my_instance(index) call my_instance.get(index)
– We were able to pass any other call to subsref into the builtin behavior
– And, we fixed the call to set

What’s the current bug?

Calling my_instance.data fails.

Well, a few paragraphs ago it worked fine. What changes were made?

In the class itself, I overloaded some Matlab functions.
The new additions are:

classdef MyClass
    methods
        function n = numel(self)
            n = numel(self.data);
        end

        function n = length(self)
            n = length(self.data);
        end

        function n = size(self)
            n = size(self.data);
        end

        function result = end(self, varargin)
            % this function is called with `{self, [1], [1]}`
            result = numel(self);
        end

    end
end

And the problematic one is numel (stands for NUMber of ELements).

Now, we have the following behavior:
– Writing func(...) in a file, causes nargout to be 0
– Writing a = func(...) in a file, causes nargout to be 1
– Writing [a,b] = func(...) in a file, causes nargout to be 2
However:
– Writing a = func(...) in the terminal, causes nargout to be 1
– and the same for [a,b] = ...
The problem is that
– Writing my_instance.func(...) in the terminal, causes nargout to be numel(my_instance), which is now equal to the number of elements in the list.

Oh boy.
If numel(my_instance) indicates the number of items in the list, but also indicates the number of items that should be returned when calling a function without collecting the output (like we usually do in the terminal), then we have a problem.

Specifically, we have my_instance.data, which is a property of my_instance.
my_instance.data is a single item (a list)
but numel(my_instance) returns a value larger than 1 (the amount of items in the list).

How do we solve that?

Step 6: Patching a Patch

Matlab has already thought about that problem, of course.

If numel is used for 2 different purposes, then there’s surely a solution.
Since users can override subsref, they also need a way to indicate how many items will be returned.

Entering: numArgumentsFromSubscript
Translation: this is a function with a signature nearly identical to subsref. The difference is that subsref actually returns the output, while this function returns a number indicating how many outputs subsref would return, if it were called with equivalent arguments.

Lovely.

When you reach the point of functions with such names, it has surely passed the point of "is it really worth it?", and the answer, of course, is no.
However, this is now a battle that has to be won.
Thus, we shall continue.

So, we implement numArgumentsFromSubscript, and manually check that:
– if we access .set, then nargout = 0
– if we access a property, then nargout = 1
– otherwise, default behavior (nargout = numel(self))

However, the 2nd option troubled me, since checking whether we access a property was done manually, comparing the access name to each property name, hardcoded.

Step 7: Inheritance

Getting all the properties of a class in Matlab is done via the properties function.

However, as the title suggests, I had some inheritance in my code, and there were problems.

You see, calling properties(my_instance) returns the properties of the class of my_instance, but doesn’t return the properties of its parent classes.
Lovely.

Thus, I implemented a function called does_property_exist, which uses another function I wrote, get_all_properties, which handles the inheritance structure.

I would expect Matlab to handle such things, but, well, my expectations are very low at this point.

Behold, get_all_properties:

function all_properties = get_all_properties(obj)
    obj_class = metaclass(obj);

    all_properties = obj_class.PropertyList;

    % Get metaclasses of superclasses
    superClass = superclasses(class(obj));
    for i = 1:length(superClass)
        metaClass = meta.class.fromName(superClass{i});
        all_properties = [all_properties; metaClass.PropertyList];
    end
end

The highlighted lines indicate:
– line 2: We get the metaclass of our object. The metaclass is an object that holds the list of properties, list of methods, and so on.
– line 7: we get all the parent classes of our class
– line 9: we get the metaclass of each parent class
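With that in place, does_property_exist (mentioned above) can be a thin wrapper; a sketch, assuming the get_all_properties function above:

function exists = does_property_exist(obj, property_name)
    % A sketch: look the name up in the merged property list.
    all_properties = get_all_properties(obj);
    exists = any(strcmp({all_properties.Name}, property_name));
end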

Step 8: Writing Generic!

Why stop here?
Let’s further add get_all_methods.

The code for get_all_methods is exactly the same.
Plus, I added a get_all_methods_names function, to return a list of strings, rather than a list of complicated objects.
And then, does_method_exist(instance, method_name) finishes the job.

But wait! There’s more!
The method object has some information:

  method with properties:

                   Name: 'get'
            Description: ''
    DetailedDescription: ''
                 Access: 'public'
                 Static: 0
               Abstract: 0
                 Sealed: 0
     ExplicitConversion: 0
                 Hidden: 0
             InputNames: {2x1 cell}
            OutputNames: {'result'}
          DefiningClass: [1x1 meta.class]

  method with properties:

                   Name: 'set'
            Description: ''
    DetailedDescription: ''
                 Access: 'public'
                 Static: 0
               Abstract: 0
                 Sealed: 0
     ExplicitConversion: 0
                 Hidden: 0
             InputNames: {3x1 cell}
            OutputNames: {0x1 cell}
          DefiningClass: [1x1 meta.class]

Aha! So there's OutputNames, from which we can generically know whether some function should be called as func() or as output = func().
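Which gives us everything needed for a does_method_has_nargout sketch (assuming a get_all_methods analogous to get_all_properties above, with MethodList instead of PropertyList):

function has_nargout = does_method_has_nargout(obj, method_name)
    % A sketch: find the method in the merged method list and check
    % whether it declares any output arguments. Assumes the method exists.
    all_methods = get_all_methods(obj);
    index = find(strcmp({all_methods.Name}, method_name), 1);
    has_nargout = ~isempty(all_methods(index).OutputNames);
end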

Victory!

Well, “success” is more appropriate, since it’s hard to call the following code a “victory”:

classdef MyClass
    methods % subsref

        function value = subsref(self, s)
            switch s(1).type
                case '()'
                    subscripts = s(1).subs;
                    value = cell(size(subscripts));

                    for subs_index = 1:numel(subscripts)
                        indices = subscripts{subs_index};

                        current_value = [];
                        for index = 1:numel(indices)
                            % disp("    Getting index: " + num2str(index));
                            current_value = [current_value, self.get(indices(index))];
                        end
                        value{subs_index} = current_value;
                    end

                    if numel(value) == 1
                        value = value{1};
                    end

                otherwise
                    if string(s(1).type) == "."
                        access_name = s(1).subs;
                        if does_property_exist(self, access_name)
                            value = builtin('subsref', self, s);
                        elseif does_method_exist(self, access_name)
                            if does_method_has_nargout(self, access_name)
                                value = builtin('subsref', self, s);
                            else % no nargout
                                builtin('subsref', self, s);
                            end
                        end
                    else % can't determine (well, it may be in s(2), s(3), or who knows where)
                        value = builtin('subsref', self, s);
                    end
            end
        end

        function n = numArgumentsFromSubscript(self, s, indexingContext)
            if strcmp(s(1).type, '.')
                if does_property_exist(self, s(1).subs)
                    n = 1;
                elseif does_method_exist(self, s(1).subs)
                    if does_method_has_nargout(self, s(1).subs)
                        n = 1;
                    else
                        n = 0;
                    end
                else % what are you trying to access? This will probably fail with a non-indicative error
                    n = numel(self);
                end
            else % default
                n = numel(self);
            end
        end
    end
end

Just, lovely.
So much time and effort was put into trying to make such a simple thing possible.
And what a shame that we need to delete it.

Step 9: Consequences

Every adventure has consequences.
You can’t have “fun” without feeling slowed down by all the “fun” you did.

Well, slowing down is accurate, as the code above turned out to be really slow. Like, really slow.

About 10% of runtime slow.

Crazy, don’t you think?

Well, it'll sound less crazy when we dress it in different words:
“For every call, access, get-item, or any operation whatsoever done to MyClass, and note that we’re dealing with recursive and iterative algorithms, we need to resolve the full inheritance structure of the object, just to know if it’s a regular call or not.
And not just resolving it, rather, resolving it 6 times on every access
(is it a property? is it a method? is it a method that has nargout? and then again, all those 3, once for numArgumentsFromSubscript and once for subsref)”
For example, my_instance(3) calls my_instance.get(3), which calls my_instance.data(3), thus multiplying the 6 resolves by 3.

Well, that indeed sounds like trouble.
But, I don’t really blame myself, as
A) It was about the journey, not the destination, right?
B) I really can’t think of a simpler way to write the code. I really am trying very hard to write code that documents itself, but, well, let’s just say that Matlab doesn’t make it easy for me…

(Not going into) Step 10: Caching

Great idea!
If we call something many times, why don’t we use caching?

I’ll tell you why.

Reason #1: There’s no real caching in Matlab.
There are, however, “persistent variables”, which, to me, sounds like a hell of a debugging experience to understand what’s persistent and how to deal with them.
Many websites turned me off from this direction.

(I do have some caching in my code, but that’s done to an object that should always return the same result.
This class we dealt with in this post is dynamic, which makes caching way more difficult, as we need some mechanism to know when the object changes, thus invalidating the cache)

But the real problem is the 2nd reason:

Reason #2: Matlab keeps accessing the disk

Imagine writing in the console my_instance = MyClass()
Then, editing the file containing MyClass, and adding a new method.
Then, like magic, my_instance has that method.

Isn’t it magical?
On each method-call, Matlab accesses the disk and calls what’s written there, instead of storing it in memory.

Won't it make the caching oh so much more fun? Each time we update the code, we'd have to clean the cache.

This makes our "cache-the-methods-of-an-object" obsolete every time the file is updated.
Should we have another thread monitor changes to the file, and send a "clear-cache" on file update?
Nope. Threading for file-updates just to cache method lookups is too much for this syntax-sugar.

Sad Ending

And so, we end our journey with some (irrelevant) understanding, and a large sentimental piece of code that we have to delete since it slows down our code by some crazy amount.

Oh well, it wasn’t all for nothing, I guess.

  • We learned that Matlab executes the code from the disk, rather than storing it in memory.
  • We learned that Matlab patches __getitem__ and __getattr__ together into subsref
  • We learned that Matlab first decides how many inputs and outputs a certain call will have, then decides whether to crash or continue, and only then calls the function. Lovely.
  • We learned that Matlab expects the output to be numel(obj), but this can be overridden by numArgumentsFromSubscript
  • We learned that getting all the properties/methods of some objects doesn’t take inheritance into account, and we need to do it ourselves.
  • And, last but not least, we learned that splitting into functions, writing generic code, and avoiding hard-coded constants has consequences. Matlab doesn't want us to write this way, and thus makes our code slower.

With that, I hope you learned what I learned, while feeling a bit of what I felt along the way.

An Exceptional Flow #3 – Almost Finally There

The climax of this series. The post that had the most build-up. So much preparation, understanding, byte-code, and gdb went into it.
And, after all that, we will get an answer – but not the full answer we're used to in this series.

Why, you ask?
Because after too many hours spent on gdb and on the code, I decided that it’s better to put the current understanding onto a blog, rather than keep digging into what seems to get farther and farther from the topic of this post-series.

Finally, what about finally?

The short answer is: Python does what it can to ensure that the content of the finally block will get executed.
In other words, RETURN won’t stop Python from executing the finally block.

When a return, break or continue statement is executed in the try suite of a try…finally statement, the finally clause is also executed 'on the way out.'

The return value of a function is determined by the last return statement executed. Since the finally clause always executes, a return statement executed in the finally clause will always be the last one executed

Python docs: https://docs.python.org/3/reference/compound_stmts.html#finally-clause

So.. that’s the result. finally will get executed.

But, Why?

We can answer that with a proof.
It’s not a full explanation, it’s more of a redirect of the question. From “Why is finally being executed after try?” to “Why is the code compiled to this byte-code?”
Nevertheless, let’s see some examples:
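The function under inspection is the minimal try/finally pattern discussed below:

def f():
    try:
        return 1
    finally:
        return 2  # the finally's return wins

print(f())  # 2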

In Python 3.11, we get the following byte-code:

In Python 3.13, we get:

In both cases, we see that the RETURN of the finally block is the one that will determine the result.

Case closed?

A Small Confession

What’s this L1, L2, L3 thing in the Python 3.13 byte code?
These are labels (like in C), making the exception table more readable.

How come we didn’t see them in the previous posts?
Uhm, well, the point is that in this series I used
– The source of Python 3.13
– A compiled Python 3.13 with gdb
– And ipython 3.11 for the screenshots

It didn't change things all that much, so I kept the posts as they are, since I noticed this mistake a bit too late 😅.

The point where I noticed the mistake was when I was banging my head against the wall, trying to understand why the finally block generates byte-code that ignores the return from the try block.
Allow me to take you on that journey; I promise to keep it short(er than it took me to go through it).

The Journey

How can we approach it?
First, what exactly is it that we’re approaching?

The thing is that we have a rather simple piece of Python code: "try: return ; finally: return". We can compile it in our heads, and we should expect to see 2 different return statements. However, the result we see doesn't have 2 different return statements.
Where is the missing return statement from the try block?

Station #1 – parser.c

So I decided to see how Python parses the code, and to go on from there.
In hindsight, this place was too deep for the question I was asking.

This code is both well written, and complicated.

As an example, the above code is a small part of the parsing of a try statement, one with no except or else.

While this part is readable, the rest is, well, still readable, but deals with many things I had no interest in, thus dragging me deeper and deeper into tokens and the structure of this mysterious Parser *p.

Automatically Generated Code

As a side-fact, I’d like to point your attention to the very first line in parser.c:
// @generated by pegen from python.gram

This is a common practice in programming languages.
The parsing of the long-string-of-code given from the user is a process which is both long and tedious, and is worth optimizing.
Thus, python.gram is the grammar file for Python, and there's code that takes the grammar and converts it into optimized C code for parsing strings of Python code.

After spending (way too much) time in parser.c, we move on to

Station #2 – AST

Looking at the call stack above try_stmt_rule, we find:

Having a quick look at those functions (using gdb and step out), I wasn’t able to find the specific location that messes with the byte-code, and removes the RETURN of the try block.

Station #3 – Compile

A few more searches around, and I saw that the above stack-trace has pyrun_file calling _PyParser_ASTFromFile, and that the next line in pyrun_file calls run_mod.

Looking in run_mod, a specific line caught my attention:
PyCodeObject *co = _PyAST_Compile(...)

So we dug deeper and deeper in this direction, until we got to

Station #4 – optimize_and_assemble_code_unit

Now this sounds like the place that will have our answers.

I started by adding some debug prints to show us the filename and the function name of the code we're optimizing:

Followed by a print of the byte code pre-optimization:

Here’s me parsing an object holding the list of opcodes:

And here’s the print in action:

Translation:
– 149 = RESUME
– 30 = NOP
– 83 = LOAD_CONST (at index 1)
– 36 = RETURN_VALUE
– 83 = LOAD_CONST (at index 0)
– 36 = RETURN_VALUE

Note #1

Along my journeys in the parser and the AST, somewhere along the way I found a function that adds a return None opcode to each code object, which sounds reasonable, as every function in python returns None implicitly.

It was nice to see it in action.

Note #2

Who’s the const at index 1? and who’s at index 0?
There's a nice trick I did in order to find out 🙂

This u object I’m parsing above is struct compiler_unit *u.
One part of the struct is:

Which is what’s parsed above.

Another part of the struct is:

Sounds very similar to the attributes of a code object!

A small difference is that code.co_consts is a tuple, whereas u.u_metadata.u_consts is a dict.
Bummer. Parsing a dict sounds like a lot of trouble.

If only I could execute Python inside gdb, passing a pointer directly.

Wait a second.. I am inside Python!
Thus, I added a printf of: convert_PyUnicode_to_C_str ( dict_repr ( u.u_metadata.u_consts ) )
Added a print for names and varnames, and we got:

Now this is the full print message.

Why are there 2 prints, you ask?
I asked that as well. And, after translating the opcodes, I found that the 2nd print has 83 -> 26, with 83 being LOAD_CONST, and 26 being MAKE_FUNCTION.
Meaning that the first print is of the inside of our function, and the second print is of the def statement.

I'm pretty proud of the idea to call dict_repr in C 🙂

Side-note #3

From the small glance I gave at MAKE_FUNCTION and the functions it calls, it seems like a creation of the struct of a python-function-that's-exposed-in-python, and it doesn't seem to call any optimization.

Station #4 – optimize_and_assemble_code_unit

Yep. It’s the same station #4.
That’s where we end.

The journey into the optimization of the byte code offers a sense of delight, yet it is extensive and elaborate.
We'll take that path another time.

For now, we’ll end with the saying that Python achieves what it promises.
It promised us that finally will run no matter what, and it indeed achieves that goal.

We cornered the magic of removing the return of the try block into the land of optimization.
We plan to visit it some day in the future.
But, until then, we shall keep it there, as we’re happy with the knowledge we got along the way.

In the meantime, there's a connect-the-dots byte-code-jumps bonus down below 🙂

Station #5 – Stack Overflow

A noteworthy question on Stack Overflow: [link]


This is part 3 out of 5 in the “An Exceptional Flow” series.
– Part 1: [Frames] (about the normal execution flow in Python)
– Part 2: [Exceptions] (how exceptions propagate through the stack trace, and how their handlers are called)
– Part 3: [Almost Finally There] (about the weird behavior of finally)
– Part 4: [Generators] (how generators store their state between calls)
– Part 5: [Exec] (how exec is implemented)


Bonus

def raise_in_raise():
    print("outer start")
    try:
        print("middle")
        try:
            print("inner")
        except ValueError:
            print("inner except")
    except KeyError:
        print("middle except")
    print("outer end")

An Exceptional Flow #4 – Generators

After looking at the regular flow of functions, and at the flow of exceptions, it is time to go on some side quests!
We’ll use the knowledge we got in part [6.1 (normal flow)], [6.2 (exception flow)], and [6.3 (finally)] in order to answer 2 new questions:
– How do generators work? How do they store their state?
– How is exec implemented?
The first question will be answered in this post, the other in [the next one].

Generators

What are generators in Python?
We can say that generators are functions that have multiple return statements, and carry their state between them.
(Instead of return, generators use yield).

A basic example would be:

def create_generator():
	state: list = []

	yield state
	state.append(1)
	yield state
	state.append(2)
	yield state

generator = create_generator()

print(next(generator))
# []
print(next(generator))
# [1]
print(next(generator))
# [1, 2]
print(next(generator))
# StopIteration

In other words, the “state”, in this case, the list, was kept untouched between calls.

How can we implement such behavior?

The series in which this post appears gives us a clue – we've looked at Python's byte-code evaluation framework, and encountered several behaviors that can help implement it:
– We saw that the frame holds the Python-stack
– We saw that the frame holds the locals and globals dict.
– We saw that a frame has a “next instruction” pointer

Thus, if we were to have the ability to “call” a frame, then “put it out”, and “put it back in” at will, we could achieve our desired behavior!

Starting with structs

Frame Structs

Let’s first review the struct of the frame object, in order to see what it contains, thus helping us paint the picture of which object is responsible for what kind of data.
Then, after looking at several frame structs, we’ll look at the generator struct.

Initially, I searched for a file with frame in its name. The options were:
– frame.c
– frameobject.c
– frameobject.h

Thus, let us start with a clarification: there's a frame object used in the source code – the one we saw in _PyEval_EvalFrameDefault. Additionally, there's a frame object that's exposed in Python.
The frameobject.c file handles the object that's exposed in Python.

We know that since
A) it resides in the Objects folder, and
B) the file starts with static PyMemberDef frame_memberlist,
thus hinting that
C) the file has a PyTypeObject PyFrame_Type declaration.

Luckily, the file also lends us a lead.
The file starts with a few functions, such as the following:

PyFrame_GetLineNumber(PyFrameObject *f)

Aha! onto PyFrameObject.

typedef struct _frame PyFrameObject;

Onto _frame

They say the data resides in _PyInterpreterFrame:

It became cyclic. Let’s recap:

  • PyTypeObject PyFrame_Type – the pythonic type.
    Its methods get:
  • PyFrameObject == _frame
    • holds trace information
    • holds a pointer to the previous frame
    • holds a pointer to the data:
  • _PyInterpreterFrame – the data
    • has a pointer to the function
    • globals, locals, builtins
    • handles the stack
    • and, holds a pointer to PyFrameObject.

Okay. So these are a few structs.
Let’s look at a few other structs – those of generators:

Generator Structs

Inside genobject.c, we find:

Here are some of its attributes:

As for the C-level object, that is PyGenObject:

Aha! it has prefix##_iframe (the ## symbol means “concatenate the strings at the pre-processor level”, i.e. it takes the value of prefix, and puts it right in front of _iframe, thus creating the attribute gi_iframe)

That’s what we looked for!

Seeing is Believing

Back to our example from the start of this post, we had a generator object:

Well, maybe there’s no frame since we called next enough times until StopIteration occurred and “killed” the frame?
Let’s create a new object and find out:

Aha!
A – it’s the exact same code object
B – we now have a frame object!

Additionally, once we start the generator, we get:

Let’s move on to the frame object:

Ooh-la-la.
We see the same code object,
A pointer to lasti, and an indication of the line number,
And the builtins, globals and locals dictionaries!

Let’s have some fun:

Nice! We’re able to change the local variables of the generator, thus altering its behavior!
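A reconstruction of the idea (mutating the object a local points to works on any CPython version; rebinding a local through f_locals only sticks on 3.13+, after PEP 667):

gen = create_generator()
print(next(gen))                             # []
gen.gi_frame.f_locals["state"].append("hi")  # reach into the paused frame
print(next(gen))                             # ['hi', 1]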

We can also alter its behavior from gdb, by changing attributes of the frame (such as the locals, the stack, or the next-instruction pointer).
This is a bit harder to do in gdb than in Python.
The advantage is that most attributes are read-only from Python, so gdb has more power.
The con is that changing the stack, or the next-instruction pointer, can lead to a seg-fault if not done carefully.

With that, the basic idea of how generators are implemented is clear – the generator object holds a “state” in the form of a frame object.

The question now is – how can we enter and exit a frame?
Previously, we saw the opcode CALL, which calls:
_PyInterpreterFrame *new_frame = _PyEvalFramePushAndInit(..., callable, ...)
followed by
DISPATCH_INLINED(new_frame);

Additionally, we saw RETURN, which calls:
_PyEval_FrameClearAndPop(tstate, dying);
(dying is a pointer to the frame that we exit from)

The same exit-routine (_PyEval_FrameClearAndPop) was done when we had an exception that we didn’t catch.

Now, both operations (_PyEvalFramePushAndInit & _PyEval_FrameClearAndPop) sound like “start” and “end”.
When working with generators, we’d assume that there will be “suspend” and “resume” operations.

For “suspend”, our lead would be the opcode YIELD_VALUE.
For “resume”, we have a function named gen_send in genobject.c (this function wraps the function gen_send_ex2)

Starting with “SUSPEND”

Instead of asking “how do we suspend a frame?”, let us ask “what’s the difference between suspending and ending a frame?”
This question translates to the difference between RETURN_VALUE and YIELD_VALUE.
Let’s view them side by side:

With RETURN on the left, YIELD on the right, and the similarities highlighted, we can get a vague understanding of the difference.

RETURN seems to:
– clear the frame
– load the stack-pointer
– load the instruction-pointer
– and let the previous frame DISPATCH onto the next instruction.

YIELD seems to:
– suspend the frame
– store some info in the generator
– pass the exception to the generator
– load the instruction-pointer
– call resume_frame, which loads the stack-pointer

Seems like they both free the way for the previous frame to keep going.
One key difference is that RETURN calls _PyEval_FrameClearAndPop, i.e. clearing the object as if it won't be used again, whereas YIELD stores frame->instr_ptr = next_instr, and sets gen->gi_frame_state = FRAME_SUSPENDED, meaning that it simply pauses the execution of this frame, but keeps it ready and waiting for the next execution.

Yet there’s one more difference to point out – the stack trace! (and the chain of frames, i.e. frame->previous).
Let’s show it with a picture:

In other words, our call to next(generator) used gen_send_ex2 in order to evaluate the frame of our generator.

A consequence of this stack trace is that our current run in _PyEval_EvalFrameDefault has only one frame.
To be exact, there are 2 frames: the entry_frame, which is a dummy frame, and the frame we wish to evaluate.
Once YIELD ends, it calls resume_frame, in other words, it calls DISPATCH on entry_frame.

However, entry_frame.instr_ptr = INTERPRETER_EXIT.
TARGET(INTERPRETER_EXIT) can be summarized as:
retval = stack_pointer[-1]; return retval

And indeed, inside gen_send_ex2, we have the line PyObject *result = _PyEval_EvalFrame(tstate, frame, exc);, after which, result is of type list, as expected 🙂

Afterwards, leaving gen_send_ex2 sets us back to the original _PyEval_EvalFrameDefault, under TARGET(CALL) with a callable named next.

OK. Quick summary:
We wanted to see the suspension process of a frame, i.e. the “bottom up” version of looking at generators.

We started from looking at the difference between RETURN and YIELD, which can be summarized as
RETURN closes the frame and cleans it
YIELD keeps the state and updates the next-instruction-pointer.

Then, YIELD loads the previous frame, which is the entry_frame, thus causing _PyEval_EvalFrameDefault to return, and, a few functions later, we’re back at the original _PyEval_EvalFrameDefault with our return value.

Next, let’s look at the other way, in order to complete the picture:

Continuing with “RESUME”

How do we resume a suspended frame?
And is there a special treatment for the initialization of a frame, rather than continuing a suspended one?

Where should we look?
genobject.c sounds like a good place to start.
And, lucky us, the functions in this file can be easily grouped into 3 groups, helping us to find our answer:
– functions wrapping gen_send_ex2
– gen_send_ex2 itself
– functions not related to gen_send_ex2, and not related to calling/resuming/stopping generators.

Well, when all roads (functions) point towards one, how can we ignore the hint? Let's look at gen_send_ex2

After the initialization:

We start with some validations:

A) For a frame that’s just created, we cannot send something, thus only None is allowed as an arg.
B) If the frame is already executing, we cannot send to it.
C) We also cannot send to a finished frame.
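These validations are visible from Python:

>>> def g():
...     yield 1
...
>>> gen = g()
>>> gen.send(42)   # (A) just-created: only None may be sent
TypeError: can't send non-None value to a just-started generator
>>> gen.close()
>>> gen.send(None) # (C) sending to a finished generator
StopIteration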

Afterwards, we verify the state of the frame:

Pass the argument that was sent to it using the stack:

Then, we shift the frame to executing mode, call it, and make sure that its state was shifted back to non-executing:

Then, we return our result.
The code below, responsible for returning the result, handles either returning the result from a generator that yielded a result, or verifying that not-returning-a-result is due to StopIteration

With the results defined as:

Nice.
Looking at gen_send_ex2 did clarify the picture that TARGET(YIELD) started to paint.
And, with that, we can conclude our understanding of generators.

Summary

In this post, we looked at the implementation of generators.

Starting in the interpreter, we saw that a generator object has several interesting attributes, one of them is generator.gi_frame.
Using the frame, we could mess with its state, e.g. using generator.gi_frame.f_locals, which is a dict of the local variables.

Next, we looked at the source code, and tried to understand the difference between a function and a generator, noting that a function has “start-frame” and “end-frame”, while a generator requires “resume-frame” and “suspend-frame”.

We saw that “resume-frame” is done in gen_send_ex2, which calls _PyEval_EvalFrameDefault, with the frame to evaluate being generator->gi_frame.
And we saw that “suspend-frame” is done in TARGET(YIELD), which keeps the frame alive (i.e. not cleaning it, like RETURN), instead, keeping the next-instruction-pointer relevant for the next execution.
Additionally, we saw that the generator is what’s responsible for telling the state of the frame (suspended / executing / finished / created)

And with that, I hope you now have a better understanding of the nuts-and-bolts of generators, as well as their similarities and differences from regular functions 🙂


This is part 4 out of 5 in the “An Exceptional Flow” series.
– Part 1: [Frames] (about the normal execution flow in Python)
– Part 2: [Exceptions] (how exceptions propagate through the stack trace, and how their handlers are called)
– Part 3: [Almost Finally There] (about the weird behavior of finally)
– Part 4: [Generators] (how generators store their state between calls)
– Part 5: [Exec] (how exec is implemented)