Mon 28 July 2025
Software Is Planning
Software is planning, and as the adage goes, failing to plan is planning to fail. This article looks at the development of Git and UNIX to dispel the myth of the genius software engineer: when we hear that something was written entirely overnight, key information is being omitted.
Everyone likes the story of the programmer who disappears and resurfaces a week later with some revolutionary software. We all enjoy a compelling narrative, but in reality great software takes time. For most great projects, the code is actually the smallest part of the project.
Planning
Projects don't go wrong they start wrong
How Big Things Get Done (2023)
We shouldn't measure software creation from the point when coding starts. The process should include the upfront planning. In 1975 Brooks advocated that the coding portion of a software project should amount to about 17% (one sixth) of the overall time, and it should be noted that in the 1970s the dominant programming languages were FORTRAN and COBOL.
If a software requirements error is detected and corrected during the plans and requirements phase, its correction is a relatively simple matter of updating the requirements specification.
Software Engineering Economics (1981)
When we think of planning we often think of planning permissions, regulation and endless bureaucracy. We don't have time for that; we are trying to move fast and break things. However, in software, the planning stage is usually when most of the thinking happens and the hard problems get ironed out. Thinking is hard and most people avoid it. Make thinking easier and you can stand out from other developers.
The planning phase of a project is also the point in time when large changes are the cheapest.[1]
UNIX
One fable within software is that UNIX was developed over the weekend by Ken Thompson. Ken is quite the mythical figure in the software world; Brian Kernighan attests to this in multiple interviews.
We must keep in mind, though, that Ken worked for three years on Multics, from which he carried many features forward to UNIX. This isn't to say Ken isn't a great programmer, but we shouldn't discount that his mind had been focused on an operating system for quite some time, which probably contributed towards formulating a better system.
Imagine your best programmer said to you, "Sure, let me work on the project for two to three years, then we will scrap it and start again. You are bound to have best-in-class software." Instead, businesses want every project to have a fabled programmer and to keep the trope alive.
Ken Thompson developed UNIX over 3 weeks. There's a video where Brian Kernighan remarks that modern software engineers aren't as productive[2].
Git
At this point Git is the most popular form of version control for software. It's also famously known to have been written in a fairly short amount of time. Linus states that he wrote Git in 10 days, and we mostly assume the timeline for the project was: I have an idea, and ten days later we have Git.
However, this is not the case. Linus wrote Git in 10 days, but it took him 4 months of thinking about the problem until he had a solution that he was satisfied with. 10 days after that we had Git.
Being a Creator
When it comes to creation, sometimes we are so excited by the answer that we fail to think about the question. A lot of software gets written without getting to the root of the problem and thus fails to deliver any value.
Large projects fail when creation comes first and planning comes never. Even after you have complete knowledge of the problem, solutions may have challenges which require addressing upfront. Addressing them when you are midway through a project is going to cost both time and money.
For some problems you can have an attitude of "we will work it out when we get there", but you don't want to take this attitude to all problems. When you get there you might be 95% of the way through your time and budget, past the point of no return, and getting over the hurdle then means going over budget.
Use your planning to categorise problems into "it won't work unless we know this" and "we can figure this out when we get there".
Board Games
"Is it fun" is the most important question in board game design. Ideally you should find this out before you hire an artist, write storyboards or even design and print cards. In board game creation circles they advocate tearing up pieces of paper and scribbling out a prototype to answer this question. The same thing applies to software.
Determine if the question you are answering is the correct one and determine if your software will actually solve the problem. Do this Wizard of Oz style, before putting a shovel into the ground.
Assumptions
All projects start with assumptions. These are typically the first things that need to be addressed before you start coding. Sometimes asking someone if a solution exists or how something works can knock off some unknowns and you'll be in a better position to start than had you remained silent.
Design it Twice
A Philosophy of Software Design (2018)
John Ousterhout advocates that designing software once is not enough and when you are planning something out you should make multiple attempts at it before you move ahead. He's found this leads to better design.
Every project comes with its unknowns and challenges, and we are probably never going to rebuild the exact same solution twice. So packing in as much learning as you can upfront about the challenges ahead will leave you as prepared as you could have been.
The beginning of a project is the cheapest time to learn, so we should maximise and front-load learning. Learning midway through a project is an expensive way of learning. It's fine if we can afford these learnings, but on big and critical projects you don't want to be the one pointing out that the work we've been doing for the last 6 months was all for nothing, especially if we could have learnt this earlier on.
Break things down, explore and figure out how each piece of the project will work, and check that we are actually providing the right answer to the correct question. This is the only way to deliver on time and on budget.
If we jump straight into code, our design strategy is hope.
[1] Whenever I hear "prompt engineering" I roll my eyes, mainly because it's a term that offers zero value. If instead we referred to it as "prompt planning" I think we would get people onto the correct page. Having a clear idea of the answer, having thought about what you want to create and providing a clear unambiguous prompt is essentially doing the upfront planning, which I hope you're doing when writing your software.
[2] and I took that personally.
Mon 31 August 2020
Python Deque
This is now my third article on lists. As someone who uses the built-in Python list on a fairly regular basis, I might have built up a false sense of security. I'm pretty familiar with these listy-boys. However, recently I found out that I was not thinking about them correctly. Readers might smack themselves if they're familiar with data-structures but don't know how lists are implemented internally. The built-in lists are dynamic arrays.
How else could they offer a sweet O(1) lookup time on indexing: mylist[4]? Especially when analysts are trying to avoid the built-in iterator and cursing their code with: for i in range(len(mylist)): mylist[i].
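To make the contrast concrete, here is a small sketch of index-based access next to the built-in iterator. The list contents and variable names are made up for illustration.

```python
mylist = ["a", "b", "c", "d", "e"]

# Indexing a dynamic array is a direct O(1) lookup.
fifth = mylist[4]

# Index-based loop: it works, but note that len() must be wrapped in range().
by_index = [mylist[i] for i in range(len(mylist))]

# The built-in iterator does the same job without any index bookkeeping.
by_iter = [item for item in mylist]

# enumerate() covers the cases where the position is genuinely needed.
pairs = list(enumerate(mylist))
```

Both loops produce the same items; the iterator version just has fewer places to go wrong.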
Another trait an established data-structurer will be familiar with when it comes to dynamic arrays is that the append and pop methods are amortised O(1). Amortised, because occasionally you have to suffer the cost of realloc(ating) memory into a larger array.
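One rough way to see the amortisation is to watch how often the list's reported size changes while appending. This is only a sketch: it assumes CPython, where sys.getsizeof reflects the over-allocated backing array, and count_resizes is a helper name invented for this example.

```python
import sys


def count_resizes(n_appends):
    """Count how often the list's allocated size changes during appends."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n_appends):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The backing array was reallocated to make room.
            resizes += 1
            last_size = size
    return resizes


resizes = count_resizes(10_000)
```

The exact count depends on the interpreter version, but it should be far smaller than the number of appends, which is what makes the average cost per append constant.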
Where the list starts to suffer is from popping and inserting at arbitrary positions, which are O(n) because every later element has to shift.
Linked-List
The data-structurer will have had the linked-list slammed into their head often enough that it will pain them to hear about it again. So theory aside, I'll give you that sweet O(1) append and last item pop that you expect from a performant Stack.
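As a minimal sketch, a stack built on collections.deque only needs append and pop. The values pushed here are placeholders.

```python
from collections import deque

stack = deque()

# Push items onto the top of the stack.
stack.append("first")
stack.append("second")
stack.append("third")

# Pop removes the most recently pushed item (last in, first out).
top = stack.pop()

# Peek at the new top without removing it.
peek = stack[-1]
```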
Python's deque incurs a comparatively larger performance hit on initialisation than list and has poor O(n) performance when you want an arbitrary item somewhere in the middle. It does, however, have O(1) popleft, pop, append and appendleft, due to being a doubly-linked list (or double-ended queue, to get the abbreviation deque).
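The four constant-time operations look like this in practice. It is a small sketch with made-up values, and the list at the end shows the equivalent left-end operations that cost O(n) on a dynamic array.

```python
from collections import deque

d = deque([2, 3, 4])

d.append(5)         # add on the right
d.appendleft(1)     # add on the left
right = d.pop()     # remove from the right
left = d.popleft()  # remove from the left

# Indexing into the middle is supported, but it is O(n) on a deque.
middle = d[1]

# The list equivalents of the left-end operations shift every element.
items = [1, 2, 3]
items.insert(0, 0)
first = items.pop(0)
```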
Deque in the wild
I saw a nice little quote from an engineer on Quora:
In 8 years of getting paid to write computer programs, this post is the only time I’ve typed ‘deque.’
There are many places deque is used in the stdlib, most commonly whenever someone needs a queue or stack, such as constructing a traceback, parsing Python's syntax tree and keeping track of context scope.
My little run-in with deque was using it instead of a recursive function to avoid Python's "maximum recursion depth exceeded" error. This limit defaults to 1000 (10^3). The solution was to add child nodes to a deque and, when you were done analysing the current node, popleft the next node.
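The approach can be sketched roughly as follows. The Node class and visit_all function are hypothetical stand-ins, not the original code, and the chain is built deeper than the recursion limit purely to show that the loop is unaffected by it.

```python
import sys
from collections import deque


class Node:
    """Hypothetical tree node, used only for this illustration."""

    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []


def visit_all(root):
    """Visit every node without recursion, using a deque of pending nodes."""
    visited = []
    pending = deque([root])
    while pending:
        node = pending.popleft()        # take the next node to analyse
        visited.append(node.value)
        pending.extend(node.children)   # queue its children for later
    return visited


# Build a chain deeper than the recursion limit.
depth = sys.getrecursionlimit() + 100
root = Node(0)
current = root
for value in range(1, depth):
    child = Node(value)
    current.children.append(child)
    current = child

values = visit_all(root)
```

Using popleft gives a breadth-first order; swapping it for pop would give a depth-first order with the same loop.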
Python Queue
You might be tempted to ask: well, if deque is for queues, what on earth is from queue import Queue?
These queues are different (although still using deque under the hood). They are optimised for communication across threads, which involves locking mechanisms and support for methods like put_nowait() and join(). They are not intended to be used as a collection data-structure, hence the lack of support for the in operator.
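Here is a minimal sketch of that thread-oriented usage. The worker function, the None sentinel and the doubling "work" are all assumptions made for the example.

```python
import threading
from queue import Queue

tasks = Queue()
results = []


def worker():
    """Consume items until the None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        results.append(item * 2)
        tasks.task_done()


thread = threading.Thread(target=worker)
thread.start()

for n in range(5):
    tasks.put_nowait(n)   # enqueue without blocking

tasks.join()              # wait until every queued item is marked done

tasks.put(None)           # tell the worker to stop
thread.join()

# Queue is not a collection: membership tests are not supported.
try:
    3 in tasks
    supports_in = True
except TypeError:
    supports_in = False
```

With a single worker the results come back in the order they were queued. With several workers that ordering would no longer be guaranteed.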
More information
There is some neat documentation in the cpython repo which contains more data-structures and other alternatives to the standard built-in list: Tools for working with lists
References
- How are lists implemented:
- https://stackoverflow.com/a/15121933/3407256
- https://stackoverflow.com/a/23487658/3407256