Mon 30 June 2025
Low Latency Computation
Anything that loads faster than 200ms is instantaneous for humans, anything slower and we would perceive the delay. Some systems are built to respond quicker than this, business cost can be sensitive to latency and every ms can make a difference.
How do we develop highly performant systems? I happened to share a train ride with Mark Shannon, who at the time was leading a team at Microsoft that had one goal: Make Python Fast.
I asked him, how does one make code more performant? To which he responded.
Make it do less.
Here's what doing less looks like:
Loop Invariant
A simple example of doing less in order to achieve better performance is with loop invariant conditions.
We loop over arrays while we code and make calls to other functions while looping. During a refactor or while we move code around we might not realise that some of the variables we are computing are constant.
def expensive(trades: list[Trade], n: int):
rolling_sum = 0
for trade in trades:
beta = compute_beta(n)
rolling_sum += beta * trade.risk
return rolling_sum / len(trades)
You'll notice in the example that I am looping over the list
trades
and computing a beta
value, but this beta
value
is computed with the integer n
which is consistent within
the scope of this function. If we move the computation of
beta
to a point before we've entered loop we avoid having
to compute beta
for each trade
in trades
.
def expensive(trades: list[Trade], n: int):
rolling_sum = 0
beta = compute_beta(n)
for trade in trades:
rolling_sum += beta * trade.risk
return rolling_sum / len(trades)
Now beta
is computed at most once within this method
scope, we have essentially achieved doing less. If the
compute_beta
method incurs an expensive overhead and we
are looking at a lengthy trades
list this subtle oversight
would have a significant impact on our latency.
Python
There are several layers of abstraction in Python and there are ways we can code that allow us to utilise computation in C more effectively. One such example of this: using list comprehension over a vanilla for-loop.
If we wish to
understand the number of steps the machine is doing
for a method, Python allows us to disassemble bytecode using
import dis
. You might notice that there are additional
LOAD_ATTR
instructions in a vanilla for-loop so defaulting
to a list comprehension avoids this overhead and allows us
to do less.
Databases
Relying on external systems can be expensive, in some cases we might query the same system several times in order to complete an operation. If we have a distributed system or intend to scale our server horizontally we are increasing the number of things that open and close connections to the database. Each time we open/close a connection the CPU has to process these instructions, this overhead can be significant when we are working in low latency environments. To get around this we introduce a connection pool between the database and our server, the connection pool manages long lived connections on the database and frees up the database to focus on database things.
A common anti-pattern involving a database is having your
ORM make N+1 queries. For this to occur we have a 1-many
entity relationship. If we ask the database to return a
group of n countries and then for each country we ask for
all cities in that country we are making N
(number of
countries in our initial query) + 1
(the initial query)
total requests to the database. These can be tricky to spot
since the individual query could be performant, but the
accumulative overhead and latency of all these queries can
cause problems.
We get around this by either asking for all countries and
then ask for all cities given a set of country ids and
perform the mapping in memory on the server, or we can
increase the complexity of the query using a JOIN and allow
the database to make the mapping. Either way we avoid the
back and forth overhead of making N+1
queries, effectively
doing less.
When we can't do less
Sometimes we can't do less, but we can decide when work is done. There are situations that allow us to pre compute results and provide performant computation at the cost of a slower start time.
Chess programming is an area which relies on high performance, the quick our program the further into the future we can look on each move during the game. One way we speed up a chess engine is by precomputing the movesets for every piece on the board for every possible position they might be in before the game starts. This allows us to look up all possible board positions for a piece during the game instead of computing only when we need it.1
In 3b1b's wordle solver he actually computes a pattern matrix for each word and saves it to a file in order to avoid having to incur the overhead on each execution of his wordle solver.
Constant Folding
Lastly there's an interesting optimisation that compilers make which is also utilised in python called Constant Folding.
When python is interpreted it parses the AST before
executing. During this AST construction it finds expressions
who's outcome is constant and computes the constant at this
point instead of computing the expression every time it is
referenced. As an example the variable x = 24
* 60
will be converted to 1440
before compiling the
bytecode instructs for the machine.
So in all cases, if we want a computer to be highly performant we just have to make it do less.
Further Reading
- perflint try running this on your python codebase to check for performance optimisations.
-
More on this will be explained when I write my second post on chess programming. ↩
Mon 31 August 2020
Python Deque
This is now my third article on lists. As someone that uses the built-in python list on a fairly regular basis, I might have built up a false sense of security. I'm pretty familiar with these listy-boys. However, recently I found out that I was not thinking about them correctly. Readers might smack themselves if they're familiar with data-structures but don't know how lists are implemented internally. The built-in lists are dynamic arrays.
How else could they optimise a sweet O(1)
lookup time
on indexing: mylist[4]
. Especially when analysts are
trying to avoid the built-in iterator and cursing their code
with: for i in len(mylist): mylist[i]
.
Another trait an established data-structurer
with be familiar with when it comes to dynamic arrays
is that the append
and pop
methods are an amortised
O(1)
. Amortised; because occasionally you have to suffer
a cost of realloc(ating) memory across larger arrays.
Where the list starts to suffer is from pop
ing and
insert
ing at arbitrary positions.
Linked-List
The data-structurer will have had the linked-list
slammed into their head often enough that it will
pain them to hear about it again. So theory aside,
I'll give you that sweet O(1)
append
and last item pop
that you expect from a performant Stack
.
Python deque
provides a comparatively larger
performance hit on initialisation to list
and
has poor O(n)
performance when you want any arbitrary
item somewhere in the middle. It does, however, have
O(1)
; popleft
, pop
, append
and appendleft
. Due
to being a doubly-linked list (or double-ended queue to
get the abbreviation deque
)
Deque in the wild
I saw a nice little quote from an enginneer on Quora:
In 8 years of getting paid to write computer programs, this post is the only time I’ve typed ‘deque.’
There are many places deque
is used in the stdlib, most
commonly whenever someone needs a queue
or stack
such as
constructing a traceback, parsing python's sytax tree and
keeping track of context scope.
My little run-in with deque
was using it instead of a
recursive function to avoid python's
maximum recursion depth exceeded
This limit happens to be set to 10^4
. The solution was
to add child nodes to a deque
and when you were done with
analysing the current node, popleft
the next node.
Python Queue
You might be tempted to ask, well if deque
is for queues.
What on earth is from queue import Queue
.
These queues are different (although, still using deque
under
the hood). They are optimised for communication across threads,
which need to involve locking mechanisms and support methods like
put_nowait()
and join()
. These are not intended to be used
as a collective data-structure, hence the lack of support for
the in
operator.
More information
There is some neat documentation in the cpython repo which
contains more data-structures and other alternatives to
the standard built-in list
. Tools for working with
lists
References
- How are lists implemented:
- https://stackoverflow.com/a/15121933/3407256
- https://stackoverflow.com/a/23487658/3407256
Thu 06 December 2018
Chain of Responsibility
Easily one of my favourite patterns from the gang of four is the chain of responsibility pattern. It aims to avoid coupling the object that sends a request to the handlers of the request. This is especially useful if the structure to our handlers follows a sense of hierarchy.
I've had very few opportunities to implement design patterns, but gaming has really helped me to envision a context where several designs can be applied as handy solutions to managing complexity.
So we will apply the chain of responsibility to a situation where we have gathered four of the most powerful wizards in a room, each of whom will test their power by casting a spell.
For this example we have our wizards as the objects which send requests.
class Wizard:
"""Create a wizard."""
def __init__(self, name: str, intelligence: int):
#: Identify the wizard by name
self.name = name
#: Intelligence as a proxy of a wizard's power.
self.intelligence = intelligence
def cast_spell(self, spell: Spell):
"""Have the wizard cast a spell."""
print(f"{self.name} casts {spell.name} spell.")
spell.cast(self)
We will make each behaviour of the spell, which is determined by the power of the wizard casting it, as an object handling a request. If the wizard does not meet the handler's requirements it passes the request on, to it's successor. This is where we have applied a sense of hierarchy to the handler.
To begin we have an Abstract Handler:
from abc import ABC
from abc import abstractmethod
class AbstractHandler(ABC):
def __init__(self, successor=None):
self._successor = successor
def handle(self, creature):
reaction = self._handle(creature)
if not reaction:
self._successor.handle(create)
@abstractmethod
def _handle(self, spell):
raise NotImplementedError("Must provide implementation in subclass.")
Now we can define our handler hierarchy:
class LowPowerFireSpell(AbstractHandler):
def _handle(self, creature):
if creature.intelligence < 10:
print("Spell backfires.")
return True
class MediumPowerFireSpell(AbstractHandler):
def _handle(self, creature):
if creature.intelligence < 20:
print("Small fire ball is cast.")
return True
class HighPowerFireSpell(AbstractHandler):
def _handle(self, creature):
if creature.intelligence < 30:
print("A fire ball blazes across the room")
return True
class GodlikePowerFireSpell(AbstractHandler):
def _handle(self, creature):
print("A Massive column of fire burns through the room!")
return True
Finally a single object to identify the spell chain.
class Spell:
def __init__(self, name: str):
#: Identifying the spell by a name
self.name = name
#: How the spell behaves at differing levels of user's power
self.chain = LowPowerFireSpell(
MediumPowerFireSpell(
HighPowerFireSpell(
GodlikePowerFireSpell()
)
)
)
def cast(self, creature):
self.chain.handle(creature)
To test their skill, we have a spell which casts a "Fire Ball". We have the following wizards present:
if __name__ == '__main__':
fire_spell = Spell('Fire Ball')
merlin = Wizard("Merlin", 8)
albus = Wizard("Albus", 18)
howl = Wizard("Howl", 28)
gandalf = Wizard("Gandalf", 38)
They are all gathered in a room, and take turns casting the same spell.
room = [merlin, albus, howl, gandalf]
for wizard in room:
wizard.cast_spell(fire_spell)
print("")
To which we should see the spell behave according to their intelligence:
(myenv) pc-name ~ $ python script.py
Merlin casts Fire Ball spell.
Spell backfires.
Albus casts Fire Ball spell.
Small fire ball is cast.
Howl casts Fire Ball spell.
A fire ball blazes across the room
Gandalf casts Fire Ball spell.
A Massive column of fire burns through the room!
(myenv) pc-name ~ $
Thu 11 October 2018
Iterator Design Pattern
Before I started reading the gang of 4 ["Go4"], I was convinced I would not need an iterator. After all, in python, they are already implemented:
x = ['a', 'b', 'c']
for i in x:
print(i)
In the frame of python lists as aggregate objects, the intent of an iterator is satisfied.
Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representations.
Displaying notes
Here I explain, how the iterator became my favourite design pattern.
I have written about representation of lists before, and I don't think this will be my last time. When implementing the drop-downs in foolscap to allow a user to see sections contained in their note, I ran into an issue of complexity where I had blocks of conditionals dictating what should be displayed to a user. I wanted the user to have an indication that a note contains sections, and for them to be able to toggle a drop-down to see each section. Similar to:
(+) | circleci |
| docker |
Where my "circleci"
note would contain sections
and the "docker"
note does not. Then expanding:
<-> | circleci |
└─ | -workflow |
| docker |
This is where I realised the abstraction power of an
iterator, And I could hide the collapsed sections
behind an "if"
conditional in an iterator.
class Menu:
def __init__(self, items):
self.items = items
def visible(self):
"""Yield the next appropriate item for display."""
for item in self.items:
yield item.title
if hasattr(item, 'expand') and item.expand:
for sub_item in item.sub_items:
yield sub_item.title
Supporting further traversal policies in my Menu
class is straightforward, and
drawing becomes absurdly simplified:
def draw(menu):
"""Draw all viewable menu items."""
for item in menu.visible():
draw_item(item)
After this I thought of a more complex aggregate structure that I could traverse, like a tree depth first.
Realising the abstraction strength of the iterator, one finds its definition of intent far more compelling.
Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representations.
Tue 09 October 2018
Procedural Villages
My inspiration for any tool or project has always been the thought that it could be done better, either that or I'd feel like I would benefit a lot from a missing feature.
In my first embarked I attempted to create a dungeon crawler, full of all the creatures and characters I thought were missing from traditional games.
It was through brute-force of the Roguebasin tutorial where I learnt how to code.
I must have created the same game 4 or 5 times before I decided to scrap the libtcod library and create a game without an interface.
Armed with stdout
and turn-based printing, I
implemented feature after feature. One of the
features I was quite proud of was the randomly
generated villages I'd spawn my player into.
Objects
I was aiming for something simple, just populate
the map with houses that the player can interact
with, for this I deconstructed a house
into a
rectangle.
class Rect:
def __init__(self, x, y, h, v):
self.x1 = x
self.y1 = y
self.x2 = x + h
self.y2 = y + v
def center(self):
center_x = (self.x1 + self.x2) / 2
center_y = (self.y1 + self.y2) / 2
return (center_x, center_y)
def internal(self, x, y):
"""
[ ][ ][ ][ ]
[ ][X][X][ ]
[ ][X][X][ ]
[ ][X][X][ ]
[ ][ ][ ][ ]
"""
...
return bool()
def edges(self, x, y):
"""
[X][X][X][X]
[X][ ][ ][X]
[X][ ][ ][X]
[X][ ][ ][X]
[X][X][X][X]
"""
...
return bool()
def sides(self, x, y):
"""
[ ][X][X][ ]
[X][ ][ ][X]
[X][ ][ ][X]
[X][ ][ ][X]
[ ][X][X][ ]
"""
...
return bool()
The poorly defined object above could now provide boolean confirmation to a co-ordinates existence in appropriate sections of a rectangle.
From the above Rect
class I can place a door on
a Rect.sides()
and I can fill the area
defined by Rect.internal()
with items.
Empty space: | . |
Wall: | # |
Monster: | m |
Door: | + |
| . | . | . | . | . | . |
| . | # | # | + | # | . |
| . | # | m | . | # | . |
| . | # | . | . | # | . |
| . | # | . | . | # | . |
| . | # | # | # | # | . |
| . | . | . | . | . | . |
Throwing in some size variations and randomising the house position on the map, (making sure there are no houses intersecting each other). Allowed me to generate maps that looked like these:
Example 1
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | # | # | # | # | # | # | . | # | # | # | # | . | . | . |
| . | # | u | . | . | @ | # | . | # | . | m | # | . | . | . |
| . | + | . | . | . | . | # | . | # | . | . | + | . | . | . |
| . | # | # | # | # | # | # | . | # | # | # | # | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | > | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | # | # | + | # | # | # | . | . | . | . | . | . | . | . |
| . | # | . | . | r | . | # | . | . | . | . | . | . | . | . |
| . | # | . | . | . | . | # | . | . | . | . | . | . | . | . |
| . | # | # | # | # | # | # | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
Example 2
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | # | # | + | # | . | . | . | . |
| . | . | . | . | . | . | . | # | . | . | # | . | . | . | . |
| . | . | . | . | . | . | . | # | . | . | # | . | . | . | . |
| . | . | . | . | . | . | . | # | m | . | # | . | . | . | . |
| . | . | . | . | . | . | . | # | # | # | # | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | # | # | # | # |
| . | # | # | # | # | # | # | # | . | . | . | # | . | . | # |
| . | # | . | . | . | . | u | # | . | . | . | + | . | . | # |
| . | + | . | . | . | . | . | # | . | . | . | # | > | . | # |
| . | # | X | . | . | . | . | # | . | . | . | # | . | r | # |
| . | # | # | # | # | # | # | # | . | . | . | # | # | # | # |
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
Unfortunately my code was not beautiful back then. I still think it was a neat idea