Mon 12 January 2026

Seating Charts

We have two large tables and 100 guests coming to our wedding and we have to figure out how they will be seated. Defining the seating chart tends to be the most enjoyable part of wedding planning1.

After drawing circles for all the seats and guests in excalidraw.com, I began connecting the circles with colourful lines to map out specific relationships. As an example, a couple at the wedding should be seated together.

It dawned on me that this is a constraint satisfaction problem (CSP); and modeling CSPs is something I've been doing for the last year.

Constraint Satisfaction Problems

CSPs are a family of problems where you are required to find the value to a number of variables given certain constraints to those variables. There are many areas that such problems occur, one such area is packing optimisations.

There are software engineering roles that rely on solving these type of questions; typically these roles will include the term "Formal Methods" under their list of responsibilities.

To solve a CSP problem, we begin by modelling the problem mathematically. There are a few notations that allow us to define the variables and constraints for these kinds of problems, the one I am familiar with is called SMT-language.

Here is a simple example where we wish to find valid values for x and y:

(declare-const x Int)
(declare-const y Int)
(assert (> x 0))
(assert (< y 10))
(assert (= (+ x y) 15))
(check-sat)
(get-model)

You may have noticed that this uses prefix notation, i.e. (> x 0) which is equivalent to (x > 0) in the familiar infix notation.

To find valid solutions we need a solver such as z3. If we provide z3 with the file above it will give us a solution for x and y that fits the constraint. E.g x = 15 and y = 0. There may be more than one valid solution. Sometimes there is not solution and it will return unsat. Short for unsatisfiable.

Knapsack problems

We can use z3 and smt to model and solve a classic knapsack problem. Given the following products:

  • Product A has a size of 3 and value of 4.
  • Product B has a size of 5 and a value of 7.

Pack a bag with a capacity of 16. Maximizing the total value of the bag.

(declare-const productA_count Int)
(declare-const productB_count Int)
(declare-const total_value Int)

(assert (>= productA_count 0))
(assert (>= productB_count 0))

; Product A: size=3, value=4
; Product B: size=5, value=7
; Bag capacity: 16

(assert (<= (+ (* 3 productA_count) (* 5 productB_count)) 16))
(assert (= total_value (+ (* 4 productA_count) (* 7 productB_count))))

(maximize total_value)
(check-sat)
(get-model)

Solving this tells us that if we pack two of each product we maximize the total value of the bag; this being 22.

Seating

Getting back into the seating chart. I wrote (vibed) a program that allowed me to draw different connections between each of my guests. These connections account for the different constraints we wanted to map onto the guests. So if A and B are a couple, ensuring they sit next to each other is a constraint.

Here is a complete list of the constraints that were modeled:

  1. Sit next to each other.
  2. Sit opposite each other. Or constraint 1.
  3. Sit diagonally across from each other. Or constraint 2.
  4. Not next to, not opposite and not diagonal from each other.
  5. A should be next to B OR C.

After drawing all these constraints between our guests this is how the wedding looks:

Initial Grouping

Green is constraint 1, yellow 2, orange 3, dashed red 4 and dashed purple is constraint 5.

We are seating everyone at two large tables. As would have been done in a classic viking longhouse. In order to model the guests at these tables each guest will need a variable for the position, the side and the table.

The groom will have an assigned seat, in the middle of the first table facing inward, towards the guests. As guest number 1 he would have the following variables:

# table_1: int = 0
# pos_1: int = 12
# side_1: bool = true

This would seat me in the middle of the table looking out across the room. We then define these variables for all 100 guests. We also need to include a constraint that all the table_{n} variables must be 0 or 1, since there are only two tables. Additionally the pos_{n} variables have to be between [0 and 25) since there are only 25 seats on each side of the tables.

Constraints

Now we model each constraint listed above. The first constraint states that guest A and B must be next to each other. If we take guests 11 and 12 this would look like the following:

(assert (= table_11 table_12))
(assert (= side_11 side_12))
(assert (or (= pos_11 (+ pos_12 1) (= pos_12 (+ pos_11 1)))))

I.e. Same table, same side but their positions are offset by 1 or -1.

The next constraint allows the guest to not only sit next to someone but as an alternative they can be opposite each other. To do this we assign the above constraint to same_side_const and define a new constraint assign it to opposite_const and in order to satisfy either of them we say. (assert (or same_side_const opposite_const)). Here's the definition of opposite_const:

(assert (= table_11 table_12))
(assert (distinct side_11 side_12))
(assert (= pos_11 pos_12))

I.e. Same table, different sides but same position.

For the 3rd constraint we merge constraint 1 and 2 together. B needs to have a different side to A and the position needs to be offset by 1 or -1.

Using the tool above to define the graph constraints I could then export the constraints to be passed into a script that generates the smt file for z3. It then provided a solution that I could import into the same tool to see a visual representation of how it would seat all the guests. You can see that below:

Programming Output

As you'll notice from the colour of each line, all the constraints are satisfied.2

Results

So after 5 hours was this seating chart useful? Not really...

The guests that are on the edge of each of the clusters weren't really people we wanted to sit together. However the arrangement within each cluster was great.

Was this fun? Totally!

I'm not going to do this again, because I'm not going to be married again, but in the event that I have to create a seating chart for 100 people in the future there are certain things that might provide a better outcome.

Perhaps it would be better to rely on more flexible constraints, as an example instead of saying A and B need to be in close vicinity I would instead say A should sit next to at least B, C or D. Giving more flexibility to who A can be sat next to.

After presenting this analysis to my future wife she showed me how she had laid out the tables. You can see this below.

Final Output

Since the constraints are already linking the guests, I was able to spot some improvements to the chart and get some value from this effort.

There was another improvement to the modelling I could have made when looking at the final chart.

If we wanted A to sit next to C and A was in a couple with B. I had not considered that it was probably fine for B to sit next to C instead of A. If C would get on with A there's a chance they would also get on with their partner B. So I could have relaxed this constraint and had C sit next to either A or B.

Creating a seating chart is not a recurring problem of mine, so sadly there's no need for me to refine this solution further.

Guests forgetting to tell me that they're coming has also caused me to pull some hair out, as with most math, when it attempts to model the real world.


  1. So they say; I don't want to build experience in getting married. 

  2. Right-click and open the image in a new tab if you need to zoom in. 

S Williams-Wynn at 12:25 | Comments() |

Mon 11 August 2025

The Bleeding Edge

A tech stack is more often inherited than it is chosen. On the rare occasion someone gets the chance to decide which tech the company gets to use for the next decade. I've seen this go well and I've seen this go belly up.

If you're in the right club, you know the answer; stick with boring.

The Postgres Paradox

When thinking about how long a technology will last I often use the current age of the software to determine how much longer it will be around. So if it's 2 years old, it might be around 2 more years. Postgres dates back 35 years, so it will probably be around for 35 more years.1

We can view this in terms of experienced employees; we are more likely to find someone with 20 years of Postgres experience than we are to find someone with 20 years in a tech that's only existed for 2 years.

Taking the age of the tech into consideration allows us to earmark the hiring pool and the likelihood that fixes or workarounds exist for problems we might run into. Diving into newer tech increases the chance we are the first to come across certain problems.

The bleeding edge might appear shiny and new but we aren't here to ship blog posts we are here to ship features and delight customers.

These avoidable issues are hard to solve because of two reasons:

  1. They're novel, we're the first to run into them.
  2. The people that have faced this issue are expensive, because not many of them exist.

Make a Difference

Starting a business is a risk, picking fancy new tech is a risk. These risks add up. Where you decide to take the risk depends on where you are trying to drive innovation.

In 2015, London, two neobanks were founded; this is the tech they are built on and the tech's age at the time. Revolut is built on Postgres (25yo) and Java (19yo), the other is built on Cassandra (7yo) and Golang (6yo). The former has 50million customers, the later has 10million.

Risks are allowed to be taken, but we should limit the number of unknowns and not make things harder than they need to be. If we pick our entire stack on the principle of shiny and new we are going to be fighting our tech more often that we are solving customer use-cases.

Get excited about delighting users, not the fancy tech.

Don't be First, Be Fast

Early market leaders have much greater long-term success and enter an average of 13 years after pioneers.

Golder and Tellis (1993)

People make the assumption that if they're not using the latest technology they're not staying competitive, when they are instead just becoming a case-study for the industry.2

The study by Golder and Tellis tells us that first movers have a failure rate that is 6 times higher than fast followers.

If the tech stack mattered, you'd see it in advertising, but you don't, because the customer doesn't care. Your tech stack doesn't determine your product market fit3

Your investors might care (🤷).

If I had to choose between customers and investors I'd pick customers.4

The Right Tool for The Job

Ask a graduate what tool or language they would use for their next project and they'll give you a political answer: "I would pick the best tool for the job". Here's a list of programming languages. There are 50 that start with the letter A. Let me know when you've found the one that's best suited for the job.

Perhaps we should define what "best for the job", means. If we are in a team of 12 engineers that are maintaining 2 Postgres databases; introducing a MySQL database on the basis that it addresses a specific use-case better than postgres is not a good idea.

If we hire a DB expert in the future are we going to ensure they're versed in both Postgres and MySQL? It's niche but we could find them at a price.

We don't have to hire this DB expert, our team has built up experience with Postgres, now they'll need to familiarise themselves with the maintenance process of a new db. Just bumping your database version is now slightly tricker and you're monitoring news and bug fixes for both releases.

We could define "best for the job" as perhaps the one we are most familiar with, and this is often the best choice. If you can skip learning a new language or learning the framework step, you're going to get to market faster than others. Don't let learning the tech be the procrastination for learning the market.

In a startup we are testing our market assumptions, this is the largest risk to the company, we shouldn't slow down the speed we get to market. It's better to know we are wrong sooner rather than later.

Getting Experience

Engineers at startups might use the job as an opportunity to focus on CV driven develop instead of focusing on the needs of the business.

I've seen a graduate write every new project in a new programming language.

Anyone know Elixir? We're trying to fix a bug in production and the developer that wrote this is no longer at the company.

From a slack thread.

Starting every new project in a new language isn't a fast way to learn how to deliver. It's a fast-way to get 3 months of experience 5 times. Learn how to deal with a language's pain points and how to solve issues with a specific framework is what makes you valuable. If you're looking to stretch yourself, then use the same language and framework for larger projects.

If you've only got 3 months of experience in a language and framework you're probably not that hard to replace.

The new thing won’t be better, you just aren’t aware of all of the ways it will be terrible yet

Slide 84

Propaganda

Propaganda doesn't just occur in war or between competing states, corporate propaganda is a well established strategy available to large technical institutions.

Let's say you are working on an internal framework that holds up your billion dollar social media platform, you wish to hire more engineers to maintain and build the project. Training new hires on this framework is time consuming, wouldn't it be great if the engineers we hired were already familiar with the framework?

They can also hire developer advocates, these are the influencers of the software engineering world, corporate mercenaric missonaries that are very good at making the work they're doing exciting.

Ignore the hype, your job is to keep things simple.

Bleeding Edge

One reason you might wish to be in the bleeding edge of tech is to gain the expertise and then build around some of the fundamentals you've learnt. Perhaps your customers are others trying to enter the market, this is perhaps a safety-net for the risk you take, but the market can also flatten out and you're left holding a bag of air.


  1. This is known as the Lindy Effect 

  2. Your competitors will thank you for your sacrifice. 

  3. Unless your tech stack is your product 

  4. A blog post on capital allocation and the VC world might happen in the future. 

S Williams-Wynn at 12:05 | Comments() |

Mon 04 August 2025

How the Blog Works

There are a few parts that make up the blog, the static site generator, the API and the repo that hosts all the HTML. This article provides an overview of these pieces and how they're all brought together to create the website.

Framework

The framework is called Pelican a python based static site generator. Each post is written in markdown which starts with meta tags such as Title and Category. Pelican then uses this and a custom theme template to create HTML. The theme uses jinja templating to construct template HTML pages.

Pelican also provides a config file which I have used for feature flags and to enable extensions. The extensions I've enabled include footnotes and code-highlighting. Additionally Pelican hosts a local version of my blog which it recompiles whenever it detects a change to any of the files.

In the config there's a list which contains all the links rendered in the sidebar. Extending or updating these links is only a matter of making changes to this list.

There are two directories inprogress/ and content/. inprogress/ stores drafts that are half complete, completed or minor thoughts for potential blog posts. When these are ready to be published they are moved over to content/ and when these are pushed to the repo, using a github action, pelican automatically renders the entire blog into HTML. The action then creates a commit for any changes and pushes the changes to a public repo.

The public repo contains the entire rendering of the blog and theme. Relying on Github pages, this is then hosted.

The CV

Within the same repo as the drafts and published content there's a CV written in Latex. This latex file can be rendered to a PDF. I have a bash script which allows me to recompile this PDF whenever changes are made to the Latex script.

The API

There's an API that contains edge functions written in Go. These serve requests for the comment sections that sit below each blog post. There are three endpoints, one fetches the count of comments given a list of article URLs. This is used on index pages.

There's an endpoint to fetch comments given an article URL and an endpoint to post a comment to an article URL. The repo containing these endpoints has an action which deploys updates on each new commit.

Finally the API handles the mailing list. It handles registering new emails, verifying the email is valid and lastly triggers an edge function to look for new posts using the blog's RSS feed in order to send updates to recipients in the mailing list.

S Williams-Wynn at 12:04 | Comments() |

Mon 28 July 2025

Software Is Planning

Software is planning, and as the adage goes, failing to plan is planning to fail. This article goes into the development of Git and UNIX with the goal of dispelling the myth that the genius software engineer exists and when we hear something was written entirely over night there's key information being omitted.

Everyone likes the story of the programmer that disappears and resurfaces after a week with some revolutionising software. We all enjoy a compelling narrative but in reality great software takes time, additionally for most great projects the code is actually the smallest part of the project.

Planning

Projects don't go wrong they start wrong

How Big Things Get Done (2023)

We shouldn't view software creation from the point when coding starts. The process should include the upfront planning. In 1975 Brooke advocated that the coding portion of a software project should amount to 17% of the overall time and it should be noted that in the 1970s the most dominant programming languages were FORTRAN and COBOL.

If a software requirements error is detected and corrected during the plans and requirements phase, its correction is a relatively simple matter of updating the requirements specification.

Software Engineering Economics (1981)

When we think of planning we often think of planning permissions, regulation and endless bureaucracy. We don't have time for that, we are trying to move fast and break things. However, in software, the planning stage is usually when most of the thinking happens and the hard problems get ironed out. Thinking is hard and most people avoid it, make thinking easier and you can stand out from other developers.

The planning phase of a project is also the point in time when large changes are the cheapest.1

UNIX

One fable within software is that UNIX was developed over the weekend by Ken Thompson. Ken is quite the mythical figure in the software world, Brian Kernighan attests to this in multiple interviews.

We must keep in mind though that Ken worked for three years on Multics which he carried many features forward to UNIX. Here's not saying Ken isn't a great programmer, but we shouldn't discount that his mind was focused on an operating system for quite an amount of time and probably made some contribution towards formulating a better system.

Imagine your best programmer said to you, "sure let me work on the project for two to three years then we will scrap it; and start again. You are bound to have the best in class software." Instead businesses want every project to have a fabled programmer and to continue the trope.

Ken Thompson developed UNIX over 3 weeks, there's a video where Brian Kernighan remarks that modern software engineers aren't as productive2.

Git

At this point Git is the most popular form of version control for software, it's also famously known to have been written in a fairly short amount of time. Linus states that he wrote Git in 10 days and we mostly assume the timeline for the project was - I have an idea and ten days later we have Git.

However this is not the case. Linus wrote Git in 10 days but it took him 4 month of thinking about the problem until he had a solution that he was satisfied with. 10 days after that we had Git.

Being a Creator

When it comes to creation, sometimes we are too excited by the answer that we fail to think about the question. A lot of software gets written without getting to the root of the problem and thus fail to materialise any value.

Large projects fail when creation comes first and planning comes never. Even after you have complete knowledge of the problem, solutions may have challenges which require addressing upfront, addressing them when you are midway through a project is going to cost both time and money.

For some problems you can have an attitude of "we will work it out when we get there", but you don't want to have this attitude to all problems as when you get there you might be 95% of the way through your time and budget and you have crossed the point of no return and getting over this hurdle requires going over budget.

Use your planning to categorise problems into "it won't work unless we know this" and "we can figure this out when we get there".

Board Games

"Is it fun" is the most important question in board game design. Ideally you should find this out before you hire an artist, write storyboards or even print and design cards. In board game creation circles they advocate tearing up pieces of paper and scribbling out a prototype to answer this question. The same thing applies to software.

Determine if the question you are answering is the correct one and determine if your software will actual solve the problem, do this; Wizard of Oz style, before putting a shovel into the ground.

Assumptions

All projects start with assumptions. These are typically the first things that need to be addressed before you start coding. Sometimes asking someone if a solution exists or how something works can knock off some unknowns and you'll be in a better position to start than had you remained silent.

Design it Twice

A Philosophy of Software Design (2018)

John Ousterhout advocates that designing software once is not enough and when you are planning something out you should make multiple attempts at it before you move ahead. He's found this leads to better design.

Every project comes with its unknowns and challenges, we are probably never going to rebuild the exact same solution twice. So packing in as much learning as you can upfront about the challenges ahead will leave you at least the most prepared you could have been.

The beginning of project is the cheapest time to learn so we should maximise and front load learning. Learning midway through a project is an expensive way of learning. It's fine if we can afford these learnings but on big and critical projects you don't want to bring up the point that the work we've been doing for the last 6 months was all for nothing especially if we could have learnt this earlier on.

Break things down, explore and figure out how each piece of the project will work and that we are actually providing the right answer to the correct question, this is the only way to deliver on time and on budget.

If we jump straight into code our design strategy is hope.


  1. Whenever I hear "prompt engineering" I roll my eyes, mainly because it's a term that offers zero value. If instead we referred to it as "prompt planning" I think we would get people onto the correct page. Having a clear idea of the answer, having thought about what you want to create and providing a clear unambiguous prompt is essentially doing the upfront planning, which I hope you're doing when writing your software. 

  2. and I took that personally. 

S Williams-Wynn at 12:05 | Comments() |

Mon 07 July 2025

Database Indexes

Often the most crucial part of a system is the database. Databases manage and handle the storage of our business data, an inefficient database can impact the overall performance of our system. There are, however, a few ways we can improve database performance, one such way is defining appropriate indexes.

This article describes indexes in Postgres and some of the lesser known indexes it uses, such as GIN and BRIN indexes.

Why do we use indexes.

We use indexes in postgres to optimise how we scan the data we have stored in the database. Similar to how binary search can optimise how quickly we find things in a dictionary (see my post on binary search).

We can use an index to inform the database on how to structure data in a way that allows us to find what we are looking for effectively i.e. fewer operations, fewer i/o and fewer comparisons leading to lower CPU usage and better performance.

What is an index.

An index is a data structure which allows us to minimise the number of instructions performed by the database in order to lookup table rows or improve the speed at which a result is sorted.

In the case of sorting if there's an index on a column that you require results to be ordered by, the query can scan the index to return the results in sorted order, if there is no index on the column the data must first be fetched and then sorted in a separate step.

Indexes aren't always good

Over doing the number of indexes on a table can lead to lower write performance as every update or insert into the table requires updating the index. Indexes also require disk space, so having unnecessary indexes will contribute to the rate at which your disk usage grows.

B-Tree

The default index type Postgres uses is a B-Tree. This data structure is similar to a binary tree except instead of having a node with only two pointers a node can contain many pointer. The number of pointers is determined by Postgres's default block size, which is 8kb. A single node will store as many sorted keys as it can until it reaches this size limit. Postgres refers to these nodes as pages.

Having a smaller index keys can improve the performance of your index as you'll be able to store more keys within the page limit. Resulting in fewer i/o instructions as fewer pages are required to be read from disk.

In postgres leaf nodes will point to the physical location of the row on disk and the intermediary nodes (or internal pages) will point to the nodes on the next level down the tree.

There's a neat animation that can help visualise b-trees: https://btree.app/

Partial Indexes

We can optimise the indexes we create by understanding usage patterns and having some intuition about the system and the business. One such optimisation is the use of partial indexes, these are handy when you have a large table but only a subset of the data is used most frequently.

As an example, we can imagine an online order system, it's quite likely that once an order is fulfilled the status transitions to "complete" and this order isn't likely to be accessed as frequently as the orders that are still in progress. We can partition our index so that it contains only unfulfilled orders which will be a significantly smaller portion of our overall orders.

CREATE INDEX idx_uncomplete_orders_customer ON
orders(customer_id) WHERE status != 'complete';

We also have to include this WHERE filter in our queries if we wish to make use of this index.

Covering Index

A covering index allows us to add columns to the index's leaf nodes and thus avoids the lookup and reading of the entire the table row from disk. Essentially everything that the query needs is available on the index, reducing the number of operations required to complete the query.

As an example if we typically request a user's first name and last name with an email we can write an index on the email that includes the first and last name.

CREATE INDEX idx_user_email_include
ON users (email) INCLUDE (firstname, lastname);

We have to bare in mind that we should only include columns which change as frequently or less frequently than the key we are indexes otherwise we are duplicating data and increasing the write overhead. This isn't an issue for columns that rarely change.

More on covering indexes see "INCLUDE".

Gin Index

GIN stands for Generalised Inverted Index, similar to the inverted index that Lucene is built on, the same Lucene that Elasticsearch is built on. The slight difference is that this inverted index is generalised so it expects items that are indexed to be composed of multiple values the same way a sentence is composed of multiple words, with the goal to support full-text search.

An index entry is created for each element in the composite column and the entry points to a list of matching locations. So as an example a row that has a column with the value "Here's is a blog post" will create an index entry for each word in the value, (e.g. "blog") and then each entry will point to the rows that contain "blog" inside the composite column.1

ALTER TABLE blogposts
ADD COLUMN search_vector tsvector;

CREATE INDEX idx_blogposts_search ON blogposts
USING GIN (search_vector);

GIN indexes aren't limited to just text, they are generalised so they can be used with any composite type, JSONB for example.

BRIN Index

BRIN stands for Block Range Index. These are indexes that specialise in handling very large tables in which certain columns have some natural correlation to their physical storage location. Essentially if each row in the table can be grouped in some manner to the blocks on disk then we have a potential use-case for the BRIN Index.

GPS tracking points and log tables are good examples of data with this natural correlation, if you have a table that is small or data is updated or inserted out of chronological order then they won't form a good fit for BRIN indexes.

Instead of storing pointers to entry locations in the B-Tree leaf nodes, BRIN indexes store summary information for the ranges that the blocks on disk are ordered. This allows for a smaller overhead compared to the default B-Tree which stores all entries and their corresponding locations.

CREATE INDEX idx_transactions_created_at_brin
ON transactions using BRIN (created_at);

This can be used to optimise queries that rely on fetching rows between certain ranges.

SELECT * FROM orders WHERE
created_at BETWEEN '2024-01-01' AND '2024-12-31';

  1. I'm describing a reverse index. 

S Williams-Wynn at 12:32 | Comments() |
Socials
Friends
Subscribe