s3bw - software

Mon 11 August 2025

The Bleeding Edge

A tech stack is more often inherited than it is chosen. On the rare occasion someone gets the chance to decide which tech the company gets to use for the next decade. I've seen this go well and I've seen this go belly up.

If you're in the right club, you know the answer; stick with boring.

The Postgres Paradox

When thinking about how long a technology will last I often use the current age of the software to determine how much longer it will be around. So if it's 2 years old, it might be around 2 more years. Postgres dates back 35 years, so it will probably be around for 35 more years.¹

We can view this in terms of experienced employees; we are more likely to find someone with 20 years of Postgres experience than we are to find someone with 20 years in a tech that's only existed for 2 years.

Taking the age of the tech into consideration allows us to earmark the hiring pool and the likelihood that fixes or workarounds exist for problems we might run into. Diving into newer tech increases the chance we are the first to come across certain problems.

The bleeding edge might appear shiny and new but we aren't here to ship blog posts we are here to ship features and delight customers.

These avoidable issues are hard to solve because of two reasons:

They're novel, we're the first to run into them.
The people that have faced this issue are expensive, because not many of them exist.

Make a Difference

Starting a business is a risk, picking fancy new tech is a risk. These risks add up. Where you decide to take the risk depends on where you are trying to drive innovation.

In 2015, London, two neobanks were founded; this is the tech they are built on and the tech's age at the time. Revolut is built on Postgres (25yo) and Java (19yo), the other is built on Cassandra (7yo) and Golang (6yo). The former has 50million customers, the later has 10million.

Risks are allowed to be taken, but we should limit the number of unknowns and not make things harder than they need to be. If we pick our entire stack on the principle of shiny and new we are going to be fighting our tech more often that we are solving customer use-cases.

Get excited about delighting users, not the fancy tech.

Don't be First, Be Fast

Early market leaders have much greater long-term success and enter an average of 13 years after pioneers.

Golder and Tellis (1993)

People make the assumption that if they're not using the latest technology they're not staying competitive, when they are instead just becoming a case-study for the industry.²

The study by Golder and Tellis tells us that first movers have a failure rate that is 6 times higher than fast followers.

If the tech stack mattered, you'd see it in advertising, but you don't, because the customer doesn't care. Your tech stack doesn't determine your product market fit³

Your investors might care (🤷).

If I had to choose between customers and investors I'd pick customers.⁴

The Right Tool for The Job

Ask a graduate what tool or language they would use for their next project and they'll give you a political answer: "I would pick the best tool for the job". Here's a list of programming languages. There are 50 that start with the letter A. Let me know when you've found the one that's best suited for the job.

Perhaps we should define what "best for the job", means. If we are in a team of 12 engineers that are maintaining 2 Postgres databases; introducing a MySQL database on the basis that it addresses a specific use-case better than postgres is not a good idea.

If we hire a DB expert in the future are we going to ensure they're versed in both Postgres and MySQL? It's niche but we could find them at a price.

We don't have to hire this DB expert, our team has built up experience with Postgres, now they'll need to familiarise themselves with the maintenance process of a new db. Just bumping your database version is now slightly tricker and you're monitoring news and bug fixes for both releases.

We could define "best for the job" as perhaps the one we are most familiar with, and this is often the best choice. If you can skip learning a new language or learning the framework step, you're going to get to market faster than others. Don't let learning the tech be the procrastination for learning the market.

In a startup we are testing our market assumptions, this is the largest risk to the company, we shouldn't slow down the speed we get to market. It's better to know we are wrong sooner rather than later.

Getting Experience

Engineers at startups might use the job as an opportunity to focus on CV driven develop instead of focusing on the needs of the business.

I've seen a graduate write every new project in a new programming language.

Anyone know Elixir? We're trying to fix a bug in production and the developer that wrote this is no longer at the company.

From a slack thread.

Starting every new project in a new language isn't a fast way to learn how to deliver. It's a fast-way to get 3 months of experience 5 times. Learn how to deal with a language's pain points and how to solve issues with a specific framework is what makes you valuable. If you're looking to stretch yourself, then use the same language and framework for larger projects.

If you've only got 3 months of experience in a language and framework you're probably not that hard to replace.

The new thing won’t be better, you just aren’t aware of all of the ways it will be terrible yet

Slide 84

Propaganda

Propaganda doesn't just occur in war or between competing states, corporate propaganda is a well established strategy available to large technical institutions.

Let's say you are working on an internal framework that holds up your billion dollar social media platform, you wish to hire more engineers to maintain and build the project. Training new hires on this framework is time consuming, wouldn't it be great if the engineers we hired were already familiar with the framework?

They can also hire developer advocates, these are the influencers of the software engineering world, corporate mercenaric missonaries that are very good at making the work they're doing exciting.

Ignore the hype, your job is to keep things simple.

Bleeding Edge

One reason you might wish to be in the bleeding edge of tech is to gain the expertise and then build around some of the fundamentals you've learnt. Perhaps your customers are others trying to enter the market, this is perhaps a safety-net for the risk you take, but the market can also flatten out and you're left holding a bag of air.

This is known as the Lindy Effect ↩
Your competitors will thank you for your sacrifice. ↩
Unless your tech stack is your product ↩
A blog post on capital allocation and the VC world might happen in the future. ↩

S Williams-Wynn at 12:05 | Comments() |

Mon 04 August 2025

How the Blog Works

There are a few parts that make up the blog, the static site generator, the API and the repo that hosts all the HTML. This article provides an overview of these pieces and how they're all brought together to create the website.

Framework

The framework is called Pelican a python based static site generator. Each post is written in markdown which starts with meta tags such as Title and Category. Pelican then uses this and a custom theme template to create HTML. The theme uses jinja templating to construct template HTML pages.

Pelican also provides a config file which I have used for feature flags and to enable extensions. The extensions I've enabled include footnotes and code-highlighting. Additionally Pelican hosts a local version of my blog which it recompiles whenever it detects a change to any of the files.

In the config there's a list which contains all the links rendered in the sidebar. Extending or updating these links is only a matter of making changes to this list.

There are two directories inprogress/ and content/. inprogress/ stores drafts that are half complete, completed or minor thoughts for potential blog posts. When these are ready to be published they are moved over to content/ and when these are pushed to the repo, using a github action, pelican automatically renders the entire blog into HTML. The action then creates a commit for any changes and pushes the changes to a public repo.

The public repo contains the entire rendering of the blog and theme. Relying on Github pages, this is then hosted.

The CV

Within the same repo as the drafts and published content there's a CV written in Latex. This latex file can be rendered to a PDF. I have a bash script which allows me to recompile this PDF whenever changes are made to the Latex script.

The API

There's an API that contains edge functions written in Go. These serve requests for the comment sections that sit below each blog post. There are three endpoints, one fetches the count of comments given a list of article URLs. This is used on index pages.

There's an endpoint to fetch comments given an article URL and an endpoint to post a comment to an article URL. The repo containing these endpoints has an action which deploys updates on each new commit.

Finally the API handles the mailing list. It handles registering new emails, verifying the email is valid and lastly triggers an edge function to look for new posts using the blog's RSS feed in order to send updates to recipients in the mailing list.

S Williams-Wynn at 12:04 | Comments() |

Mon 28 July 2025

Software Is Planning

Software is planning, and as the adage goes, failing to plan is planning to fail. This article goes into the development of Git and UNIX with the goal of dispelling the myth that the genius software engineer exists and when we hear something was written entirely over night there's key information being omitted.

Everyone likes the story of the programmer that disappears and resurfaces after a week with some revolutionising software. We all enjoy a compelling narrative but in reality great software takes time, additionally for most great projects the code is actually the smallest part of the project.

Planning

Projects don't go wrong they start wrong

How Big Things Get Done (2023)

We shouldn't view software creation from the point when coding starts. The process should include the upfront planning. In 1975 Brooke advocated that the coding portion of a software project should amount to 17% of the overall time and it should be noted that in the 1970s the most dominant programming languages were FORTRAN and COBOL.

If a software requirements error is detected and corrected during the plans and requirements phase, its correction is a relatively simple matter of updating the requirements specification.

Software Engineering Economics (1981)

When we think of planning we often think of planning permissions, regulation and endless bureaucracy. We don't have time for that, we are trying to move fast and break things. However, in software, the planning stage is usually when most of the thinking happens and the hard problems get ironed out. Thinking is hard and most people avoid it, make thinking easier and you can stand out from other developers.

The planning phase of a project is also the point in time when large changes are the cheapest.¹

UNIX

One fable within software is that UNIX was developed over the weekend by Ken Thompson. Ken is quite the mythical figure in the software world, Brian Kernighan attests to this in multiple interviews.

We must keep in mind though that Ken worked for three years on Multics which he carried many features forward to UNIX. Here's not saying Ken isn't a great programmer, but we shouldn't discount that his mind was focused on an operating system for quite an amount of time and probably made some contribution towards formulating a better system.

Imagine your best programmer said to you, "sure let me work on the project for two to three years then we will scrap it; and start again. You are bound to have the best in class software." Instead businesses want every project to have a fabled programmer and to continue the trope.

Ken Thompson developed UNIX over 3 weeks, there's a video where Brian Kernighan remarks that modern software engineers aren't as productive².

Git

At this point Git is the most popular form of version control for software, it's also famously known to have been written in a fairly short amount of time. Linus states that he wrote Git in 10 days and we mostly assume the timeline for the project was - I have an idea and ten days later we have Git.

However this is not the case. Linus wrote Git in 10 days but it took him 4 month of thinking about the problem until he had a solution that he was satisfied with. 10 days after that we had Git.

Being a Creator

When it comes to creation, sometimes we are too excited by the answer that we fail to think about the question. A lot of software gets written without getting to the root of the problem and thus fail to materialise any value.

Large projects fail when creation comes first and planning comes never. Even after you have complete knowledge of the problem, solutions may have challenges which require addressing upfront, addressing them when you are midway through a project is going to cost both time and money.

For some problems you can have an attitude of "we will work it out when we get there", but you don't want to have this attitude to all problems as when you get there you might be 95% of the way through your time and budget and you have crossed the point of no return and getting over this hurdle requires going over budget.

Use your planning to categorise problems into "it won't work unless we know this" and "we can figure this out when we get there".

Board Games

"Is it fun" is the most important question in board game design. Ideally you should find this out before you hire an artist, write storyboards or even print and design cards. In board game creation circles they advocate tearing up pieces of paper and scribbling out a prototype to answer this question. The same thing applies to software.

Determine if the question you are answering is the correct one and determine if your software will actual solve the problem, do this; Wizard of Oz style, before putting a shovel into the ground.

Assumptions

All projects start with assumptions. These are typically the first things that need to be addressed before you start coding. Sometimes asking someone if a solution exists or how something works can knock off some unknowns and you'll be in a better position to start than had you remained silent.

Design it Twice

A Philosophy of Software Design (2018)

John Ousterhout advocates that designing software once is not enough and when you are planning something out you should make multiple attempts at it before you move ahead. He's found this leads to better design.

Every project comes with its unknowns and challenges, we are probably never going to rebuild the exact same solution twice. So packing in as much learning as you can upfront about the challenges ahead will leave you at least the most prepared you could have been.

The beginning of project is the cheapest time to learn so we should maximise and front load learning. Learning midway through a project is an expensive way of learning. It's fine if we can afford these learnings but on big and critical projects you don't want to bring up the point that the work we've been doing for the last 6 months was all for nothing especially if we could have learnt this earlier on.

Break things down, explore and figure out how each piece of the project will work and that we are actually providing the right answer to the correct question, this is the only way to deliver on time and on budget.

If we jump straight into code our design strategy is hope.

Whenever I hear "prompt engineering" I roll my eyes, mainly because it's a term that offers zero value. If instead we referred to it as "prompt planning" I think we would get people onto the correct page. Having a clear idea of the answer, having thought about what you want to create and providing a clear unambiguous prompt is essentially doing the upfront planning, which I hope you're doing when writing your software. ↩
and I took that personally. ↩

S Williams-Wynn at 12:05 | Comments() |

Mon 07 July 2025

Database Indexes

Often the most crucial part of a system is the database. Databases manage and handle the storage of our business data, an inefficient database can impact the overall performance of our system. There are, however, a few ways we can improve database performance, one such way is defining appropriate indexes.

This article describes indexes in Postgres and some of the lesser known indexes it uses, such as GIN and BRIN indexes.

Why do we use indexes.

We use indexes in postgres to optimise how we scan the data we have stored in the database. Similar to how binary search can optimise how quickly we find things in a dictionary (see my post on binary search).

We can use an index to inform the database on how to structure data in a way that allows us to find what we are looking for effectively i.e. fewer operations, fewer i/o and fewer comparisons leading to lower CPU usage and better performance.

What is an index.

An index is a data structure which allows us to minimise the number of instructions performed by the database in order to lookup table rows or improve the speed at which a result is sorted.

In the case of sorting if there's an index on a column that you require results to be ordered by, the query can scan the index to return the results in sorted order, if there is no index on the column the data must first be fetched and then sorted in a separate step.

Indexes aren't always good

Over doing the number of indexes on a table can lead to lower write performance as every update or insert into the table requires updating the index. Indexes also require disk space, so having unnecessary indexes will contribute to the rate at which your disk usage grows.

B-Tree

The default index type Postgres uses is a B-Tree. This data structure is similar to a binary tree except instead of having a node with only two pointers a node can contain many pointer. The number of pointers is determined by Postgres's default block size, which is 8kb. A single node will store as many sorted keys as it can until it reaches this size limit. Postgres refers to these nodes as pages.

Having a smaller index keys can improve the performance of your index as you'll be able to store more keys within the page limit. Resulting in fewer i/o instructions as fewer pages are required to be read from disk.

In postgres leaf nodes will point to the physical location of the row on disk and the intermediary nodes (or internal pages) will point to the nodes on the next level down the tree.

There's a neat animation that can help visualise b-trees: https://btree.app/

Partial Indexes

We can optimise the indexes we create by understanding usage patterns and having some intuition about the system and the business. One such optimisation is the use of partial indexes, these are handy when you have a large table but only a subset of the data is used most frequently.

As an example, we can imagine an online order system, it's quite likely that once an order is fulfilled the status transitions to "complete" and this order isn't likely to be accessed as frequently as the orders that are still in progress. We can partition our index so that it contains only unfulfilled orders which will be a significantly smaller portion of our overall orders.

CREATE INDEX idx_uncomplete_orders_customer ON
orders(customer_id) WHERE status != 'complete';

We also have to include this WHERE filter in our queries if we wish to make use of this index.

Covering Index

A covering index allows us to add columns to the index's leaf nodes and thus avoids the lookup and reading of the entire the table row from disk. Essentially everything that the query needs is available on the index, reducing the number of operations required to complete the query.

As an example if we typically request a user's first name and last name with an email we can write an index on the email that includes the first and last name.

CREATE INDEX idx_user_email_include
ON users (email) INCLUDE (firstname, lastname);

We have to bare in mind that we should only include columns which change as frequently or less frequently than the key we are indexes otherwise we are duplicating data and increasing the write overhead. This isn't an issue for columns that rarely change.

Gin Index

GIN stands for Generalised Inverted Index, similar to the inverted index that Lucene is built on, the same Lucene that Elasticsearch is built on. The slight difference is that this inverted index is generalised so it expects items that are indexed to be composed of multiple values the same way a sentence is composed of multiple words, with the goal to support full-text search.

An index entry is created for each element in the composite column and the entry points to a list of matching locations. So as an example a row that has a column with the value "Here's is a blog post" will create an index entry for each word in the value, (e.g. "blog") and then each entry will point to the rows that contain "blog" inside the composite column.¹

ALTER TABLE blogposts
ADD COLUMN search_vector tsvector;

CREATE INDEX idx_blogposts_search ON blogposts
USING GIN (search_vector);

GIN indexes aren't limited to just text, they are generalised so they can be used with any composite type, JSONB for example.

BRIN Index

BRIN stands for Block Range Index. These are indexes that specialise in handling very large tables in which certain columns have some natural correlation to their physical storage location. Essentially if each row in the table can be grouped in some manner to the blocks on disk then we have a potential use-case for the BRIN Index.

GPS tracking points and log tables are good examples of data with this natural correlation, if you have a table that is small or data is updated or inserted out of chronological order then they won't form a good fit for BRIN indexes.

Instead of storing pointers to entry locations in the B-Tree leaf nodes, BRIN indexes store summary information for the ranges that the blocks on disk are ordered. This allows for a smaller overhead compared to the default B-Tree which stores all entries and their corresponding locations.

CREATE INDEX idx_transactions_created_at_brin
ON transactions using BRIN (created_at);

This can be used to optimise queries that rely on fetching rows between certain ranges.

SELECT * FROM orders WHERE
created_at BETWEEN '2024-01-01' AND '2024-12-31';

I'm describing a reverse index. ↩

S Williams-Wynn at 12:32 | Comments() |

Mon 30 June 2025

Low Latency Computation

Anything that loads faster than 200ms is instantaneous for humans, anything slower and we would perceive the delay. Some systems are built to respond quicker than this, business cost can be sensitive to latency and every ms can make a difference.

How do we develop highly performant systems? I happened to share a train ride with Mark Shannon, who at the time was leading a team at Microsoft that had one goal: Make Python Fast.

I asked him, how does one make code more performant? To which he responded.

Make it do less.

Here's what doing less looks like:

Loop Invariant

A simple example of doing less in order to achieve better performance is with loop invariant conditions.

We loop over arrays while we code and make calls to other functions while looping. During a refactor or while we move code around we might not realise that some of the variables we are computing are constant.

def expensive(trades: list[Trade], n: int):
    rolling_sum = 0
    for trade in trades:
        beta = compute_beta(n)
        rolling_sum += beta * trade.risk

    return rolling_sum / len(trades)

You'll notice in the example that I am looping over the list trades and computing a beta value, but this beta value is computed with the integer n which is consistent within the scope of this function. If we move the computation of beta to a point before we've entered loop we avoid having to compute beta for each trade in trades.

def expensive(trades: list[Trade], n: int):
    rolling_sum = 0
    beta = compute_beta(n)
    for trade in trades:
        rolling_sum += beta * trade.risk

    return rolling_sum / len(trades)

Now beta is computed at most once within this method scope, we have essentially achieved doing less. If the compute_beta method incurs an expensive overhead and we are looking at a lengthy trades list this subtle oversight would have a significant impact on our latency.

Python

There are several layers of abstraction in Python and there are ways we can code that allow us to utilise computation in C more effectively. One such example of this: using list comprehension over a vanilla for-loop.

If we wish to understand the number of steps the machine is doing for a method, Python allows us to disassemble bytecode using import dis. You might notice that there are additional LOAD_ATTR instructions in a vanilla for-loop so defaulting to a list comprehension avoids this overhead and allows us to do less.

Databases

Relying on external systems can be expensive, in some cases we might query the same system several times in order to complete an operation. If we have a distributed system or intend to scale our server horizontally we are increasing the number of things that open and close connections to the database. Each time we open/close a connection the CPU has to process these instructions, this overhead can be significant when we are working in low latency environments. To get around this we introduce a connection pool between the database and our server, the connection pool manages long lived connections on the database and frees up the database to focus on database things.

A common anti-pattern involving a database is having your ORM make N+1 queries. For this to occur we have a 1-many entity relationship. If we ask the database to return a group of n countries and then for each country we ask for all cities in that country we are making N (number of countries in our initial query) + 1 (the initial query) total requests to the database. These can be tricky to spot since the individual query could be performant, but the accumulative overhead and latency of all these queries can cause problems.

We get around this by either asking for all countries and then ask for all cities given a set of country ids and perform the mapping in memory on the server, or we can increase the complexity of the query using a JOIN and allow the database to make the mapping. Either way we avoid the back and forth overhead of making N+1 queries, effectively doing less.

When we can't do less

Sometimes we can't do less, but we can decide when work is done. There are situations that allow us to pre compute results and provide performant computation at the cost of a slower start time.

Chess programming is an area which relies on high performance, the quick our program the further into the future we can look on each move during the game. One way we speed up a chess engine is by precomputing the movesets for every piece on the board for every possible position they might be in before the game starts. This allows us to look up all possible board positions for a piece during the game instead of computing only when we need it.¹

In 3b1b's wordle solver he actually computes a pattern matrix for each word and saves it to a file in order to avoid having to incur the overhead on each execution of his wordle solver.

Constant Folding

Lastly there's an interesting optimisation that compilers make which is also utilised in python called Constant Folding.

When python is interpreted it parses the AST before executing. During this AST construction it finds expressions who's outcome is constant and computes the constant at this point instead of computing the expression every time it is referenced. As an example the variable x = 24 * 60 will be converted to 1440 before compiling the bytecode instructs for the machine.

So in all cases, if we want a computer to be highly performant we just have to make it do less.