Top Few

05 Jul, 2020

I was reading Tim Bray's recent blog posts (here and here) about his Topfew utility with interest, and wondered how my go-to systems programming language (Nim) stacks up compared to Go. I knocked up a quick and dirty implementation and then ran it against a 900MB access_log from my own site. A minute or so later I hit CTRL+C, realising my quick and dirty implementation was (a) not quick at all, and (b) perhaps a bit too "dirty". ಠ_ಠ

Once I switched from the naive read-everything-into-memory approach to Nim's streams, I ended up with something slightly more acceptable. Caveat: not really optimised (well... apart from compiling with the -d:release --passC:-mcpu=native --boundChecks:off flags), lazily developed, probably still naive, but at least it has borderline acceptable performance (GitHub).

Results on my laptop (with an approx 1.5 million line access_log):

               Elapsed  User  System  vs Go
Tim's topfew      2.68  3.86    0.41    1.0
ls/uniq/sort      9.10  1.76    0.26    3.4
Nim               3.91  3.65    0.25    1.5

1.5x isn't brilliant, but it's not bad considering the minimal amount of effort I spent on it. But that led me to wonder, what about Python performance? Another Q&D implementation later (and this time I haven't bothered to dig into the performance at all)...

               Elapsed  User  System  vs Go
Tim's topfew      2.68  3.86    0.41    1.0
ls/uniq/sort      9.10  1.76    0.26    3.4
Nim               3.91  3.65    0.25    1.5
Python            8.57  7.77    0.79    3.2

Not much better than the uniq/sort version, but the Python version is arguably a little more readable than Nim. One interesting difference between Python and Nim -- the Python version does read the whole file into memory before tokenising...

with open(filename) as f:
    for line in f.readlines():

...yet the performance was significantly better than my earlier naive Nim implementation which also read the whole file. So it's not just the right tool for the job, but the right technique for the tool.
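For comparison, here is a minimal sketch of the line-at-a-time technique in Python. It is not the Q&D implementation timed above: the function names, the default of field 7 and the default of ten results are all assumptions for illustration, and no performance claim is attached to it.

```python
from collections import Counter


def top_few(lines, field=7, few=10):
    """Count one whitespace-separated field (1-based) and return the most common values."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= field:  # skip lines that are too short to have the field
            counts[parts[field - 1]] += 1
    return counts.most_common(few)


def top_few_in_file(filename, field=7, few=10):
    # Iterating over the file object yields one line at a time,
    # whereas f.readlines() builds a list of every line first.
    with open(filename, errors="replace") as f:
        return top_few(f, field, few)


sample = [
    "a b c d e f /x g",
    "a b c d e f /y g",
    "a b c d e f /x g",
    "short line",
]
print(top_few(sample))
```

Because top_few accepts any iterable of lines, the same function works on a list in memory or on an open file, which makes the two techniques easy to compare.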

Do I feel like dusting off the rather rusty Haskell skills though...?

Problem with bouncing ball

30 May, 2020

Lennier M writes:

I am on page 202 of your book, but when I followed your instructions, the ball stopped moving, instead of moving in multiple directions. Here is my code:

class Ball:
    def __init__(self, canvas, color):
        self.canvas = canvas
        self.id = canvas.create_oval(10, 10, 25, 25, fill=color)
        self.canvas.move(self.id, 245, 100)
        starts = [-3, -2, -1, 1, 2, 3, ]
        self.x = starts[0]
        self.y = -3
        self.canvas_width = self.canvas.winfo_width
        self.canvas_height = self.canvas.winfo_height()
    def draw(self):
        self.canvas.move(self.id, self.x, self.y)
        pos = self.canvas.coords(self.id)
        if pos[1] <= 0:
            self.y = 1
        if pos[3] >= self.canvas_height:
            self.y = -1
        if pos[0] <= 0:
            self.x = 3
        if pos[2] >= self.canvas_width:
            self.x = -3

please help

If I run your code I get the following error:

Traceback (most recent call last):
  File "", line 40, in <module>
  File "", line 26, in draw
    if pos[2] >= self.canvas_width:
TypeError: '>=' not supported between instances of 'float' and 'method'

So that tells us the line which is failing, but why?

If you look at these two lines, hopefully you'll see the difference (and the reason for your problem):

        self.canvas_width = self.canvas.winfo_width
        self.canvas_height = self.canvas.winfo_height()

The missing brackets on the first line mean that you haven't actually called the winfo_width function (or method). So self.canvas_width isn't a number of pixels - it's actually a reference to the function itself. If we added a print statement at that point in the code it would be even more obvious...

<bound method Misc.winfo_width of <tkinter.Canvas object .!canvas>>

This is the reason why comparing pos[2] (which is a number - to be exact it's a floating point number) with self.canvas_width (which is the reference to a function/method) comes back with the error message: "'>=' not supported between instances of 'float' and 'method'".

If you add the missing brackets, you'll hopefully find the ball moves as expected.
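To see the difference without opening a window, here is a small sketch that uses a stand-in class instead of tkinter. FakeCanvas and its fixed width of 500 are hypothetical; only the method name winfo_width is taken from the code above.

```python
class FakeCanvas:
    """Hypothetical stand-in for tkinter's Canvas, so no window is needed."""

    def winfo_width(self):
        return 500


canvas = FakeCanvas()
without_brackets = canvas.winfo_width   # a reference to the method itself
with_brackets = canvas.winfo_width()    # the number the method returns

print(callable(without_brackets))  # the reference can still be called later
print(with_brackets)

try:
    250.0 >= without_brackets      # comparing a float with a method
except TypeError as err:
    message = str(err)
    print(message)
```

The comparison fails in the same way as the traceback above, because the left side is a float and the right side is a method rather than a number.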

Online supermarket delivery in the UK needs to evolve

03 May, 2020

We managed to get a Tesco delivery last week, which was a challenge - the best we've been able to do so far is about one delivery every two weeks or so (from different supermarkets). But that basically means sitting online for hours hitting the refresh button to see if a slot becomes available.

Hunter-gathering in the twenty-first century...

A couple of hours after our delivery had arrived, I happened to look out the window and see another Tesco delivery being dropped off about two doors down. So not only dumb UX, but from a logistics perspective, poor design as well.

What would make far more sense to me, particularly in this time of lockdown (probably after as well), is you load up your basket with your shopping, then specify half-day slots when you'll be at home, e.g. "here's the 30 things I want, and I can be home for delivery any time Monday to Sunday for the next two weeks", or "I can be home for delivery on Friday morning, Saturday morning, and all day Sunday". Periodically the supermarket runs an algorithm to find the optimal delivery for each order (and obviously you don't get charged if you don't get a delivery) -- perhaps with some prioritisation for those who've been waiting longer for their delivery slot.

There's nothing massively complicated about this as an algorithm - group addresses in the same street which have an intersecting time period. Then group addresses with close proximity (maybe half a mile or so). Then look for addresses slightly farther apart, but still close enough that delivery can be optimised (half a mile up to a couple of miles) - perhaps some of those can be grouped with one of the first two groups as well. Give each of these delivery groups a certain amount of points. You then rank the deliveries by their points, and by the age of the oldest order in the group -- ungrouped orders would just be ranked by their age -- and send out notifications to the confirmed orders. Discount delivery charges for grouped orders, and if there are any remaining slots available, open them up to ad hoc deliveries with a slightly higher charge.
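A rough sketch of just the first step (same street, intersecting slots) plus the ranking might look like the following. Everything here is an assumption for illustration: the Order fields, the slot labels, the greedy grouping, and using group size as the "points". The proximity-based grouping and van capacity are left out entirely.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    customer: str
    street: str
    slots: frozenset      # half-day slots, e.g. {"Fri-AM", "Sat-AM"}
    days_waiting: int


def group_by_street(orders):
    """Greedily group same-street orders that share at least one slot."""
    by_street = defaultdict(list)
    for order in orders:
        by_street[order.street].append(order)

    groups = []  # each group is (shared_slots, members)
    for street_orders in by_street.values():
        street_groups = []
        # Place the longest-waiting orders first.
        for order in sorted(street_orders, key=lambda o: -o.days_waiting):
            for i, (shared, members) in enumerate(street_groups):
                common = shared & order.slots
                if common:  # still at least one slot that suits everyone
                    street_groups[i] = (common, members + [order])
                    break
            else:
                street_groups.append((order.slots, [order]))
        groups.extend(street_groups)
    return groups


def rank(groups):
    """Rank by points (here simply the group size), then by the oldest order."""
    def key(group):
        _, members = group
        return (len(members), max(o.days_waiting for o in members))
    return sorted(groups, key=key, reverse=True)


orders = [
    Order("A", "High St", frozenset({"Fri-AM", "Sat-AM"}), 3),
    Order("B", "High St", frozenset({"Sat-AM"}), 10),
    Order("C", "High St", frozenset({"Sun-PM"}), 1),
    Order("D", "Low Rd", frozenset({"Fri-AM"}), 12),
]
ranked = rank(group_by_street(orders))
for shared, members in ranked:
    print(sorted(shared), [o.customer for o in members])
```

With this sample data, A and B end up in one High St group on Saturday morning, and the ungrouped orders fall back to being ranked by how long they have waited.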

There's more complexity to take into account, of course - the number of delivery vans versus the number of deliveries per van, distance to the supermarket/depot, etc. But still, the result is likely an improved user experience for everyone, better for the environment and probably more cost effective for the supermarkets.

Long and short dashes

14 Apr, 2020

Jan vK writes:

here's an example from your book that gives an error:
count_down_by_twos = list(range(40, 10, −2))
SyntaxError: invalid character in identifier
please inform me how to solve this problem

It looks like you might have copied and pasted the code, perhaps from the digital version of the book? The dash in front of the 2 in your example is actually a longer typographic dash (−) rather than the plain hyphen-minus character on your keyboard (-), which is what Python expects for a minus sign. So if I try the version of the code you sent, I get the same error:

>>> count_down_by_twos = list(range(40, 10, −2))
  File "<stdin>", line 1
    count_down_by_twos = list(range(40, 10, −2))
SyntaxError: invalid character in identifier

However, if I try with the correct character, there's no error:

>>> count_down_by_twos = list(range(40, 10, -2))

The next question you might ask is what does "invalid character in identifier" actually mean? An identifier is the name of something (the name of a keyword, a variable, a function or a class, and so on) -- valid identifiers are a sequence of letters (characters), digits and underscores. In effect you're getting that error message because Python doesn't recognise "−2" (a long dash followed by 2) as any recognisable keyword, or variable, or anything resembling a valid identifier.
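If you want to check a suspicious line without retyping it, a small sketch like this can help. It assumes the pasted character is the Unicode minus sign (U+2212), written below as an escape so it can't be confused with the keyboard character; the check function is a hypothetical helper. The exact wording of the error message varies between Python versions, so the sketch only reports whatever message it gets.

```python
long_dash = "count_down_by_twos = list(range(40, 10, \u22122))"   # typographic dash
short_dash = "count_down_by_twos = list(range(40, 10, -2))"       # keyboard hyphen-minus


def check(source):
    """Try to compile a line of code and report whether Python accepts it."""
    try:
        compile(source, "<example>", "exec")
        return "ok"
    except SyntaxError as err:
        return f"SyntaxError: {err.msg}"


print(check(long_dash))
print(check(short_dash))

# Identifiers are made of letters, digits and underscores.
print("count_down_by_twos".isidentifier())
print("\u22122".isidentifier())
```

The two source lines look almost identical on screen, which is exactly why this kind of copy-and-paste problem is hard to spot by eye.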

Hope that helps.

ICANN corruption

18 Jan, 2020

Belatedly... this article about ICANN & dot-com price increases (which does look rather like ICANN corruption, in my opinion), annoyed me more than I can properly express, despite the fact that initially (first few years) the 7% increase will still be less than I'm paying for dot-nz domain names. This is pretty obvious profiteering by Verisign, and I worry that this will trickle out from the US to domain names for other countries, inevitably turning the ownership of a domain name from something that's a petty cash expense into a real, and significant, cost. It's particularly concerning if you take into consideration similar news in the dot-org space.

It irritated me enough that I started looking at alternatives. However, none of the blockchain domain name options look like a particularly economic, straightforward, sure-fire win (particularly not for a non-technical audience) - even OpenNIC, which is the closest tech to the incumbent, would require jumping through additional hoops because I don't believe my current hosting provider supports DNS alternatives - on their forums, I can't find any mention of OpenNIC apart from a note on a "Rejected Feature Proposals" forum, back in 2007, about it being a stale project.

Maybe the global Internet community will eventually route its way around the "damage" by selecting a generally acceptable alternative. Or a privacy-focused browser maker like Mozilla will come up with (and promote) a viable domain name system that the other browser makers will have to implement or be left behind.

In the meantime, perhaps I'll look at redirecting my primary domain elsewhere and route around the problem myself, before the name comes up for renewal in a few years' time...

Problems with restarting the game

22 Sep, 2019

Serhii writes (excerpted from two emails):

I teach programming lessons for the pupils. We try to do restart button for the "Bounce" as it follows:
...there is a problem in "command=restart".

If I run your code I get the following error:

Traceback (most recent call last):
  File "", line 105, in <module>
  File "", line 86, in add_restart
    self.restart_button = Button(tk, text="Click to Restart Game", command=restart, bg="green")
NameError: name 'restart' is not defined

Looking at your code...

    def add_restart(self):
        self.restart_button = Button(tk, text="Click to Restart Game", command=restart, bg="green")

...the problem is you don't actually have a function called restart anywhere - which explains the error message "name 'restart' is not defined". The easiest way to fix this is to define that function inside your Game class, in which case the above function should actually be:

    def add_restart(self):
        self.restart_button = Button(tk, text="Click to Restart Game", command=self.restart, bg="green")
(note the addition of self there)

The restart function itself should remove the restart button from the screen, move the paddle and ball back to the starting position, and reset the score. I suggest you create a simple function first, just to prove the button works:

    def restart(self):
        print("Restart the game!")

If you see "Restart the game!" printed when clicking the button, you know you're good to start adding the code to do the actual restart (you might also find this post useful: journal/2018/03/04/restarting-the-bounce-game-revisited). The other thing you might want to think about changing, is to only add the restart button if the game is over (so that's a small change to the while loop at the bottom).

Hope that helps.

IDLE3 on Ubuntu

07 Sep, 2019

Chris K writes (excerpted):

I'm teaching myself and home educating my three young daughters at the same time. Just a little bit every day (excepting Sunday which is entirely reserved for pancakes and not inter-computery-things) . Thank you for providing the opportunity for me to introduce the subject of computer programming in a fun way.
Anyhow, we hit a snag early on that does not seem to get a mention on the publishers site or your blog. When instructed to search for IDLE on the Ubuntu software centre nothing of relevance was listed. I had one of the girls do it and she was very disappointed...
...I've done some homework and followed instructions on installing idle3, which worked. So we are all set to go today.

I don't know how useful this feedback is to you but this is an opportunity to express my appreciation for all your hard work and skill, so I'm taking it.

The next re-print of Python for Kids will include updated instructions for installing IDLE3 on Ubuntu - which obviously doesn't help anyone reading the current print of the book. My steps are pretty similar to the link you've referenced:

snippet from the book

Thanks for the email - it's a good prompt to put something on my site, which others in the same position might come across.

Not the normal knock-off

02 Aug, 2019

Usually on Amazon, we see Python for Kids knock-offs which are an exact copy of the book, with cruddy printing and/or binding. No Starch have a clever binding which allows books to open flat, without falling apart after reading a couple of chapters (clever enough that a few people thought it was actually a failure in the glue) - so these were pretty obviously cheap copies, even without the often misprinted and missing pages.

However, an eagle-eyed reader recently notified No Starch of a new type of knock-off -- where they have slightly rewritten the text (I assume just enough to fool a copyright-checking algorithm), and included content from (I think) other sources, to make it even less likely that any automation would flag the book.

For example, here's an excerpt from Python for Kids...

pfk excerpt 1

And here's the dodgy knock off...

knockoff excerpt 1

Erm... what the heck is a "Trump String"?

Here's another one from PfK:

pfk excerpt 2

And here's the knock off again...

knockoff excerpt 2

Yeah... way to rewrite it to be more boooooooring, Book Pirates!

The code examples are pretty much exactly the same in the knock-off (at least the examples I checked) - if badly formatted (including misprinted wingdings characters and other artifacts).

So, the first part of the book is basically a slightly (and extremely poorly) rewritten knock-off of mine. The second part of the book has things like bubble sort, insertion sort and...

knockoff pagerank

...because every self-respecting kid needs to know how to write a sorting algorithm (by just looking at the code) and how to use numpy and pandas for page-rank???

And from there, on to games like Hangman, but written with Python 2 and incorrectly formatted as well...

knockoff hangman

A garbage knock-off, and 29 five-star reviews in a couple of weeks, no less (I assume paid for). Interestingly, I clicked through a few of the other reviews by those same reviewers and found more poorly written texts. It's an Amazonian (sic) nest of crappy Python books!

Tick tick tick. I wonder how long it'll take Amazon to catch on...

String formatting

23 Jul, 2019

Lou O writes:

Hi Jason, Read a few good reviews of your book on Amazon.
One of the reviews pointed out "The explanation of String formatting needs to be updated. We don't do embedded values using %s anymore. I recommend skipping the chapters on Turtle Graphics and tkinter. The introductory chapter on classes and objects is not bad, but the topic is beyond what most kids will need, and they should really focus on imperative / procedural programming first using just lists and dictionaries as their basic data structures."
And I was wondering if those points had been taken into account and updated since then.

In terms of string formatting, the reviewer is correct but, on the other hand, % formatting hasn't actually been deprecated yet. From the official Python 3 documentation:

The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly). Using the newer formatted string literals, the str.format() interface, or template strings may help avoid these errors. Each of these alternatives provides their own trade-offs and benefits of simplicity, flexibility, and/or extensibility.

I have thought about updating the section on formatting though, just because using str.format is the more accepted/modern method -- but this will probably have to wait for a second edition, or perhaps the next major reprint.
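For anyone curious about the difference, here is a short sketch of the same message written three ways, plus the tuple quirk the documentation mentions. The variable names and values are made up for illustration.

```python
name = "Ada"
score = 42

percent_style = "%s scored %s points" % (name, score)     # the style used in the book
format_style = "{} scored {} points".format(name, score)  # str.format
f_string_style = f"{name} scored {score} points"          # formatted string literal

print(percent_style)
print(format_style)
print(f_string_style)

# The tuple quirk: % treats a tuple as a list of separate arguments.
point = (3, 4)
try:
    "Position: %s" % point
except TypeError as err:
    quirk = str(err)
    print(quirk)

safe_percent = "Position: %s" % (point,)    # wrap it in a one-item tuple
safe_format = "Position: {}".format(point)  # str.format has no such surprise
print(safe_percent)
print(safe_format)
```

All three styles produce the same text here, so which one to teach first is mostly a question of which quirks you would rather explain.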

In terms of classes and objects, I don't agree at all. When originally writing the book, I thought rather hard about whether it was worth going into the complexity of that topic and, in the end, came to the conclusion that there is too much in Python which is object-oriented, and would be more confusing to explain without at least covering the basics (IMHO).

And finally, in regard to the comment about skipping the chapters on turtle and tkinter... sure, if they want a dry book on programming fundamentals, with nothing fun for a kid to experiment with -- one that they will then put down 10 minutes after opening and never return to -- by all means, skip those chapters.

For everyone else: will your child use the turtle and/or tkinter modules in the future? Probably not. But are they a useful tool to learn how to use those fundamental programming concepts (without needing to install any complicated third party libraries)? Personally, I believe so.

Bouncing in Polski

08 May, 2019

Marzena writes:

I'm writing to you because neither me nor my 11-year old daughter with whom we're learning Python can figure out where the problem is. We get the following error:
 ================ RESTART: C:/Users/Enarpol/Desktop/ ================
 Traceback (most recent call last):
   File "C:/Users/Enarpol/Desktop/", line 5, in <module>
     class Piłka:
   File "C:/Users/Enarpol/Desktop/", line 20, in Piłka
     if pozycja[1] <= 0:
 NameError: name 'pozycja' is not defined
And the code is exactly like in the book (some words are in Polish, but I assume it's not a problem for you to trace the error despite of it)

Your problem is caused by indentation and the idea of "scope" - I guess you're using the Polish language version of the book, so I'm not sure of the correct page number, but in the English language version of the book the section on Variables and Scope (page 84) would be useful to re-read.

In short, here is the incorrect bit of your code:

    def rysuj(self):
        self.płótno.move(self.id, self.x, self.y)
        pozycja = self.płótno.coords(self.id)
    if pozycja[1] <= 0:
        self.y = 3

If I re-indent this with visible spaces, to show how it should look, hopefully you can see what you need to fix in the rest of your code:

    def rysuj(self):
        self.płótno.move(self.id, self.x, self.y)
        pozycja = self.płótno.coords(self.id)
    ␣␣␣␣if pozycja[1] <= 0:
    ␣␣␣␣    self.y = 3

Why does this make a difference? Because in the case of rysuj above, the variable pozycja is only visible within the function - or to be exact, within the block of code that makes up the function. And how do we create a block of code? Basically through indentation. Your if statement was at the same indentation level as def rysuj(self), so it wasn't part of the function and that's why you're getting the error name 'pozycja' is not defined.
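The effect is easy to reproduce without tkinter. In this sketch the class and variable names are hypothetical stand-ins (in English, to keep it short), and the two versions are kept as text so that the broken one can be run safely inside a try block.

```python
broken = '''
class Ball:
    def draw(self):
        pos = [0, -1, 10, 10]
    if pos[1] <= 0:      # same indentation as "def", so this is class-level code
        y = 3
'''

fixed = '''
class Ball:
    def draw(self):
        pos = [0, -1, 10, 10]
        if pos[1] <= 0:  # indented under "def", so it is part of draw()
            self.y = 3
'''


def try_to_define(source):
    """Run a class definition and report what happened."""
    try:
        exec(source, {})
        return "ok"
    except NameError as err:
        return f"NameError: {err}"


print(try_to_define(broken))
print(try_to_define(fixed))
```

In the broken version the if statement runs while the class itself is being created, before draw has ever been called, so the variable inside the function doesn't exist yet.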

Hope that helps.