Small gotcha under Python’s PYTHONOPTIMIZE feature

A long, long, long time ago I started using interpreter optimizations to help organize code. If a block of code is within a constant (or expression of constants) that evaluate to False, then the block (or line) is optimized away.

if (false){ # this is optimized away }

Perl was very generous, and would allow for constants, true/false, and operations between the two to work.

Thankfully, Python has some optimizations available via the PYTHONOPTIMIZE environment variable (which can also be adjusted via the -o and -oo flags). They control the __debug__ constant, which is True (by default) or False (when optimizations are enabled), and will omit documentation strings if the level is 2 (or oo) or higher. Using these flags in a production environment allows for very verbose documentation and extra debugging logic in a development environment.

Unfortunately, I implemented this incorrectly in some places. To fine-tune some development debugging routines I had some lines like this:

if __debug__ and DEBUG_CACHE:
    pass

Python’s interpreter doesn’t optimize that away if DEBUG_CACHE is equal to False, or even if it IS False (i tested under 2.7 and 3.5). I should have realized this (or at least tested for it). I didn’t notice this until I profiled my app and saw a bunch of additional statistics logging that should not have been compiled.

The correct way to write the above is:

if __debug__:
    if DEBUG_CACHE:
        pass

Here is a quick test

trying running it with optimizations turned on:

export PYTHONOPTIMIZE=2
python test.py

and with them off

export PYTHONOPTIMIZE
python test.py

As far as the interpreter is concerned during the optimization phase, __debug__(False) and False is True.

import dis

def foo():
    """docstring"""
    if __debug__:
        print("debug message")
    return True

def bar():
    """docstring"""
    if __debug__ and False:
        print("debug message")
    return True

# ============

# this will be a lean function
dis.dis(foo)

print "- - - - - - - -"
# this will show extended logic
dis.dis(bar)

The default version of foo should look like this:

21           0 LOAD_CONST               1 ('debug message')
             3 PRINT_ITEM          
             4 PRINT_NEWLINE       

22           5 LOAD_GLOBAL              0 (True)
             8 RETURN_VALUE        

But you should see that the optimized version of foo creates some very lean code:

22           0 LOAD_GLOBAL              0 (True)
             3 RETURN_VALUE        

Now let’s look at the bar function when unoptimized

26           0 LOAD_GLOBAL              0 (__debug__)
             3 POP_JUMP_IF_FALSE       20
             6 LOAD_GLOBAL              1 (False)
             9 POP_JUMP_IF_FALSE       20

27          12 LOAD_CONST               1 ('debug message')
            15 PRINT_ITEM          
            16 PRINT_NEWLINE       
            17 JUMP_FORWARD             0 (to 20)

28     >>   20 LOAD_GLOBAL              2 (True)
            23 RETURN_VALUE        

Which generates the same code as the optimized version:

26           0 LOAD_GLOBAL              0 (__debug__)
             3 POP_JUMP_IF_FALSE       20
             6 LOAD_GLOBAL              1 (False)
             9 POP_JUMP_IF_FALSE       20

27          12 LOAD_CONST               1 ('debug message')
            15 PRINT_ITEM          
            16 PRINT_NEWLINE       
            17 JUMP_FORWARD             0 (to 20)

28     >>   20 LOAD_GLOBAL              2 (True)
            23 RETURN_VALUE

So let’s add a new function, biz

def biz():
    """docstring"""
    if __debug__:
        if False:
            print("debug message")
    return True

The unoptimized code:

33           0 LOAD_GLOBAL              0 (False)
             3 POP_JUMP_IF_FALSE       14

34           6 LOAD_CONST               1 ('debug message')
             9 PRINT_ITEM          
            10 PRINT_NEWLINE       
            11 JUMP_FORWARD             0 (to 14)

35     >>   14 LOAD_GLOBAL              1 (True)
            17 RETURN_VALUE    

And a lot of that gets optimized away:

35           0 LOAD_GLOBAL              0 (True)
             3 RETURN_VALUE        

Not many people use/abuse how the interpreter compiles code to their advantage, but if you do — pay attention to constructs like this.

Using __debug__ is a great way to hide logging code on a production environment. The evaluation of __debug__ only happens once, when Python first compiles the code, so there is very little overhead.

Leave a Reply

Your email address will not be published. Required fields are marked *