s3bw

Mon 28 July 2025

Software Is Planning

Software is planning, and as the adage goes, failing to plan is planning to fail. This article looks at the development of Git and UNIX with the goal of dispelling the myth of the genius software engineer. When we hear that something was written entirely overnight, key information is being omitted.

Everyone likes the story of the programmer who disappears and resurfaces a week later with some revolutionary software. We all enjoy a compelling narrative, but in reality great software takes time. For most great projects, the code is actually the smallest part of the project.

Planning

Projects don't go wrong they start wrong

How Big Things Get Done (2023)

We shouldn't view software creation as starting at the point when coding starts. The process should include the upfront planning. In 1975 Brooks advocated that the coding portion of a software project should amount to roughly 17% (one sixth) of the overall time, and it should be noted that in the 1970s the dominant programming languages were FORTRAN and COBOL.

If a software requirements error is detected and corrected during the plans and requirements phase, its correction is a relatively simple matter of updating the requirements specification.

Software Engineering Economics (1981)

When we think of planning we often think of planning permissions, regulation and endless bureaucracy. We don't have time for that; we are trying to move fast and break things. However, in software, the planning stage is usually when most of the thinking happens and the hard problems get ironed out. Thinking is hard and most people avoid it. Make thinking easier and you can stand out from other developers.

The planning phase of a project is also the point in time when large changes are the cheapest.¹

UNIX

One fable within software is that UNIX was developed over a weekend by Ken Thompson. Ken is quite the mythical figure in the software world; Brian Kernighan attests to this in multiple interviews.

We must keep in mind, though, that Ken worked for three years on Multics, from which he carried many features forward to UNIX. This isn't to say Ken isn't a great programmer, but we shouldn't discount that his mind had been focused on operating systems for quite some time, which probably contributed towards formulating a better system.

Imagine your best programmer said to you, "Sure, let me work on the project for two to three years, then we will scrap it and start again. You are bound to have the best in class software." Instead, businesses want every project to have a fabled programmer and to continue the trope.

Ken Thompson developed UNIX over 3 weeks. There's a video where Brian Kernighan remarks that modern software engineers aren't as productive².

Git

At this point Git is the most popular form of version control for software. It's also famously known to have been written in a fairly short amount of time. Linus states that he wrote Git in 10 days, and we mostly assume the timeline for the project was: I have an idea, and ten days later we have Git.

However, this is not the case. Linus wrote Git in 10 days, but it took him 4 months of thinking about the problem until he had a solution that he was satisfied with. 10 days after that we had Git.

Being a Creator

When it comes to creation, sometimes we are so excited by the answer that we fail to think about the question. A lot of software gets written without getting to the root of the problem and thus fails to materialise any value.

Large projects fail when creation comes first and planning comes never. Even after you have complete knowledge of the problem, solutions may have challenges which require addressing upfront. Addressing them when you are midway through a project is going to cost both time and money.

For some problems you can have an attitude of "we will work it out when we get there", but you don't want to have this attitude to all problems. When you get there you might be 95% of the way through your time and budget, past the point of no return, and getting over the hurdle requires going over budget.

Use your planning to categorise problems into "it won't work unless we know this" and "we can figure this out when we get there".

Board Games

"Is it fun" is the most important question in board game design. Ideally you should find this out before you hire an artist, write storyboards or even print and design cards. In board game creation circles they advocate tearing up pieces of paper and scribbling out a prototype to answer this question. The same thing applies to software.

Determine if the question you are answering is the correct one and determine if your software will actually solve the problem. Do this Wizard of Oz style, before putting a shovel into the ground.

Assumptions

All projects start with assumptions. These are typically the first things that need to be addressed before you start coding. Sometimes asking someone if a solution exists or how something works can knock off some unknowns and you'll be in a better position to start than had you remained silent.

Design it Twice

A Philosophy of Software Design (2018)

John Ousterhout advocates that designing software once is not enough and when you are planning something out you should make multiple attempts at it before you move ahead. He's found this leads to better design.

Every project comes with its unknowns and challenges, and we are probably never going to rebuild the exact same solution twice. So packing in as much learning as you can upfront about the challenges ahead will leave you as prepared as you could have been.

The beginning of a project is the cheapest time to learn, so we should maximise and front load learning. Learning midway through a project is an expensive way of learning. It's fine if we can afford these learnings, but on big and critical projects you don't want to bring up the point that the work we've been doing for the last 6 months was all for nothing, especially if we could have learnt this earlier on.

Break things down, explore and figure out how each piece of the project will work, and check that we are actually providing the right answer to the correct question. This is the only way to deliver on time and on budget.

If we jump straight into code our design strategy is hope.

¹ Whenever I hear "prompt engineering" I roll my eyes, mainly because it's a term that offers zero value. If instead we referred to it as "prompt planning" I think we would get people onto the correct page. Having a clear idea of the answer, having thought about what you want to create and providing a clear unambiguous prompt is essentially doing the upfront planning, which I hope you're doing when writing your software.
² and I took that personally.

Mon 31 August 2020

Python Deque

This is now my third article on lists. As someone that uses the built-in python list on a fairly regular basis, I might have built up a false sense of security. I'm pretty familiar with these listy-boys. However, recently I found out that I was not thinking about them correctly. Readers might smack themselves if they're familiar with data-structures but don't know how lists are implemented internally. The built-in lists are dynamic arrays.

How else could they offer a sweet O(1) lookup time on indexing: mylist[4]? Especially when analysts are trying to avoid the built-in iterator and cursing their code with: for i in range(len(mylist)): mylist[i].
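As a small sketch of the difference (the list contents here are made up for illustration), indexing goes straight to the slot, while looping is usually better left to the built-in iterator:

```python
mylist = ["a", "b", "c", "d", "e"]

# Indexing a dynamic array is O(1): the slot is found directly from the index.
fifth = mylist[4]

# Index-based loop: works, but needs range(); len() on its own is not iterable.
by_index = [mylist[i] for i in range(len(mylist))]

# Idiomatic: let the built-in iterator hand you each item.
by_iterator = [item for item in mylist]

# If you need the position as well, enumerate gives you both.
with_position = [(i, item) for i, item in enumerate(mylist)]
```

All three loops visit the same items in the same order; the iterator versions just have less to get wrong.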

Another trait an established data-structurer will be familiar with when it comes to dynamic arrays is that the append and pop methods are amortised O(1). Amortised, because occasionally you have to suffer the cost of realloc(ating) memory for a larger array.

Where the list starts to suffer is popping and inserting at arbitrary positions, which is O(n) because every item after that position has to be shifted along.
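One way to see the amortised behaviour is to watch the allocated size while appending. This is only a sketch: it assumes CPython, where sys.getsizeof on a list reflects the over-allocated buffer, and capacity_changes is a helper name made up for this example.

```python
import sys

def capacity_changes(n):
    """Return the list lengths at which the allocated size changed during n appends."""
    items = []
    last_size = sys.getsizeof(items)
    changes = []
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The underlying array was reallocated to make room.
            changes.append(len(items))
            last_size = size
    return changes

resize_points = capacity_changes(1000)
print(len(resize_points), "reallocations for 1000 appends")
```

The exact resize points are an implementation detail and vary between versions, but there should be far fewer reallocations than appends, which is where the amortised O(1) comes from.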

Linked-List

The data-structurer will have had the linked-list slammed into their head often enough that it will pain them to hear about it again. So theory aside, I'll give you that sweet O(1) append and last item pop that you expect from a performant Stack.

Compared to list, Python's deque takes a larger performance hit on initialisation and has poor O(n) performance when you want an arbitrary item somewhere in the middle. It does, however, have O(1) popleft, pop, append and appendleft, due to being implemented as a doubly-linked list (of fixed-size blocks, in CPython). The name deque is an abbreviation of double-ended queue.
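A quick sketch of those four operations on collections.deque, with made-up values:

```python
from collections import deque

d = deque([2, 3, 4])

d.append(5)          # add on the right: deque([2, 3, 4, 5])
d.appendleft(1)      # add on the left:  deque([1, 2, 3, 4, 5])

last = d.pop()       # remove from the right, returns 5
first = d.popleft()  # remove from the left, returns 1

# Indexing is supported, but reaching into the middle is O(n).
middle = d[1]
```

Used with append and pop only, it behaves as a stack; with append and popleft, as a queue.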

Deque in the wild

I saw a nice little quote from an engineer on Quora:

In 8 years of getting paid to write computer programs, this post is the only time I’ve typed ‘deque.’

There are many places deque is used in the stdlib, most commonly whenever someone needs a queue or stack, such as constructing a traceback, parsing python's syntax tree and keeping track of context scope.

My little run-in with deque was using it instead of a recursive function to avoid python's

maximum recursion depth exceeded

This limit happens to be set to 10^3 by default in CPython. The solution was to add child nodes to a deque and, when you were done with analysing the current node, popleft the next node.
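The approach looks roughly like the sketch below. It is not the original code: the nodes are assumed to be plain dicts with a "children" list, and count_nodes and build_chain are hypothetical names for illustration.

```python
from collections import deque

def count_nodes(root):
    """Visit every node breadth-first using a deque instead of recursion."""
    pending = deque([root])
    seen = 0
    while pending:
        node = pending.popleft()          # take the next node to analyse
        seen += 1
        pending.extend(node["children"])  # queue its children for later
    return seen

def build_chain(depth):
    """Build a degenerate tree: one leaf wrapped in `depth` single-child parents."""
    node = {"children": []}
    for _ in range(depth):
        node = {"children": [node]}
    return node

# Far deeper than a recursive walk could manage with the default limit.
deep_tree = build_chain(50_000)
total = count_nodes(deep_tree)
print(total)  # 50001
```

Swapping popleft for pop turns the same loop into a depth-first walk.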

Python Queue

You might be tempted to ask: well, if deque is for queues, what on earth is from queue import Queue?

These queues are different (although still using deque under the hood). They are optimised for communication across threads, which involves locking mechanisms, and they support methods like put_nowait() and join(). They are not intended to be used as a general-purpose collection, hence the lack of support for the in operator.
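A minimal sketch of that thread-to-thread use, assuming a single worker and using None as a made-up stop signal:

```python
import queue
import threading

def worker(jobs, results):
    """Pull items off the queue until the None sentinel arrives."""
    while True:
        item = jobs.get()          # blocks until an item is available
        if item is None:
            jobs.task_done()
            break
        results.append(item * 2)
        jobs.task_done()           # tell join() this item is finished

jobs = queue.Queue()
results = []
thread = threading.Thread(target=worker, args=(jobs, results))
thread.start()

for n in range(5):
    jobs.put_nowait(n)             # enqueue without blocking
jobs.put_nowait(None)              # sentinel: no more work

jobs.join()                        # wait until every item has been marked done
thread.join()

# Queue is not a container: membership tests raise TypeError.
try:
    3 in jobs
    supports_in = True
except TypeError:
    supports_in = False
```

With one worker the results come out in the order they were put in; with several workers the order is no longer guaranteed.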

More information

There is some neat documentation in the cpython repo which contains more data-structures and other alternatives to the standard built-in list: Tools for working with lists.

References

How are lists implemented:
https://stackoverflow.com/a/15121933/3407256
https://stackoverflow.com/a/23487658/3407256