Mon 31 August 2020
Python Deque
This is now my third article on lists. As someone that uses the built-in python list on a fairly regular basis, I might have built up a false sense of security. I'm pretty familiar with these listy-boys. However, recently I found out that I was not thinking about them correctly. Readers might smack themselves if they're familiar with data-structures but don't know how lists are implemented internally. The built-in lists are dynamic arrays.
How else could they optimise a sweet O(1)
lookup time
on indexing: mylist[4]
. Especially when analysts are
trying to avoid the built-in iterator and cursing their code
with: for i in len(mylist): mylist[i]
.
Another trait an established data-structurer
with be familiar with when it comes to dynamic arrays
is that the append
and pop
methods are an amortised
O(1)
. Amortised; because occasionally you have to suffer
a cost of realloc(ating) memory across larger arrays.
Where the list starts to suffer is from pop
ing and
insert
ing at arbitrary positions.
Linked-List
The data-structurer will have had the linked-list
slammed into their head often enough that it will
pain them to hear about it again. So theory aside,
I'll give you that sweet O(1)
append
and last item pop
that you expect from a performant Stack
.
Python deque
provides a comparatively larger
performance hit on initialisation to list
and
has poor O(n)
performance when you want any arbitrary
item somewhere in the middle. It does, however, have
O(1)
; popleft
, pop
, append
and appendleft
. Due
to being a doubly-linked list (or double-ended queue to
get the abbreviation deque
)
Deque in the wild
I saw a nice little quote from an enginneer on Quora:
In 8 years of getting paid to write computer programs, this post is the only time I’ve typed ‘deque.’
There are many places deque
is used in the stdlib, most
commonly whenever someone needs a queue
or stack
such as
constructing a traceback, parsing python's sytax tree and
keeping track of context scope.
My little run-in with deque
was using it instead of a
recursive function to avoid python's
maximum recursion depth exceeded
This limit happens to be set to 10^4
. The solution was
to add child nodes to a deque
and when you were done with
analysing the current node, popleft
the next node.
Python Queue
You might be tempted to ask, well if deque
is for queues.
What on earth is from queue import Queue
.
These queues are different (although, still using deque
under
the hood). They are optimised for communication across threads,
which need to involve locking mechanisms and support methods like
put_nowait()
and join()
. These are not intended to be used
as a collective data-structure, hence the lack of support for
the in
operator.
More information
There is some neat documentation in the cpython repo which
contains more data-structures and other alternatives to
the standard built-in list
. Tools for working with
lists
References
- How are lists implemented:
- https://stackoverflow.com/a/15121933/3407256
- https://stackoverflow.com/a/23487658/3407256