Em Học Python

16 Nov, 2021

The Vietnamese translation of Python for Kids is now available from the publisher Sputnik Edu. I think that's something in the region of 12 languages in total now (including English)! 👍


Spark-on-Pi

19 Mar, 2021

I've built a Docker image for running single-node Spark and Hadoop (one worker) on a Raspberry Pi, since I couldn't find anything to experiment with. It's certainly not suitable for anything other than experimentation, but the image can be found on Docker Hub here:

hub.docker.com/repository/docker/jasonrbriggs/pi-spark

The "source", such as it is (Dockerfile, etc.), can be accessed from Radicle via this URN:

rad:git:hwd1yreyw4xcjarwtb4yuxtsd5dmip9wx9fr9tomjib1mmtkmojnicnn4bo

Booting Raspberry Pi from External USB

05 Feb, 2021

I wanted to boot my Raspberry Pi 4 from an external SSD connected via a USB-to-SATA cable. According to Jeff Geerling's video, it's pretty straightforward using the latest version of the firmware - but try as I might I couldn't get it to start. Tried re-imaging the SSD; also used the SD card copier, to copy the SD contents across to the SSD; but regardless it didn't seem to detect on boot (it was accessible after starting up using the SD, so I knew it probably wasn't a powering issue).

After much muddling around, I came across rpi-eeprom-config, and realised that BOOT_ORDER was set to 0x1 -- according to this page that's SD card mode. Also according to that page, the default is supposed to be 0xf41 (try SD, if not found, try USB). I ran sudo rpi-eeprom-config --edit, changed the value, saved, rebooted and voila, the SSD was bootable, and has been working seamlessly since then.
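To make the two BOOT_ORDER values above easier to compare, here is a minimal Python sketch that decodes them. It assumes the value is read one hex digit at a time, starting from the right-hand digit, and that 0x1 means SD card, 0x4 means USB mass storage and 0xf means restart. Only "0x1 is SD card mode" and "0xf41 tries SD, then USB" come from the text above; the rest of the mapping and the decode_boot_order name are assumptions for illustration, so check the bootloader documentation before relying on it.

```python
# Assumed meaning of each hex digit in BOOT_ORDER (illustrative subset only).
BOOT_MODES = {
    0x1: "SD card",
    0x4: "USB mass storage",
    0xF: "restart from the first mode",
}


def decode_boot_order(value):
    """Return the boot modes in the order they are tried (right-most digit first)."""
    modes = []
    while value:
        digit = value & 0xF  # take the lowest hex digit
        modes.append(BOOT_MODES.get(digit, "unknown mode 0x%x" % digit))
        value >>= 4  # move on to the next hex digit
    return modes


print(decode_boot_order(0x1))    # the value found on the Pi
print(decode_boot_order(0xF41))  # the documented default
```

With 0x1 there is only one entry in the list, which would explain why the SSD was never tried at boot.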

I'm sure this info has been posted elsewhere, but given I couldn't find it, posting here in case it's of use to someone else.


Breaking out of loops

30 Jan, 2021

Sherry W writes (excerpted):

Excellent book so far for my grandson.

On page 78 should it read?:

while True:
    lots of code here
    lots of code here
    lots of code here
    if some value == False
        break

Book is written very well for that age group. It’s great to have a book that is able to explain concepts with simple examples.

The example on page 78 is not supposed to be executable code (obviously the text "lots of code here" repeated 3 times isn't), so it doesn't actually matter if the condition is "some_value == False" or "some_value == True". If I was going to write a runnable version of the example, it might look something like this:

some_value = True
while True:
    print("aaaa")
    print("bbbb")
    print("cccc")
    if some_value == True:
        break 

Which, if it was run, would print the following just once:

aaaa
bbbb
cccc

But you could use True or False in the above example (on the first line and the second-to-last line), and it would work just as well.

However, a shortcut not mentioned in the book (because I think simplicity and clarity are better for beginners) is that when you're checking for True, you can omit the "== True" altogether:

some_value = True
while True:
    print("aaaa")
    print("bbbb")
    print("cccc")
    if some_value:
        break 
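The same kind of shortcut exists for False, using the keyword not. The following is a small sketch along the same lines (the count variable is only there to show how many times the loop ran; it is not part of the book's example):

```python
some_value = False
count = 0
while True:
    count = count + 1
    print("aaaa")
    if not some_value:  # the same as: if some_value == False
        break

print(count)
```

Because some_value is False, "not some_value" is True, so the loop breaks after the first pass.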

Tkinter colorchooser problems (revisited)

23 Jul, 2020

Suranga writes:

My son and I have been learning Python with your great book. Unfortunately, we hit an issue with Colorchooser - the problem appears identical to this one: https://jasonrbriggs.com/journal/2013/05/01/tkinter-colorchooser-problems.html In the solution, you suggest we revisit Chapter 1 but can see no mention of a way to ensure IDLE is launched with No Subprocesses. Have we missed something here? Perhaps there was an earlier version of the book that did not contain this? (ours is the Tenth Printing). Thanks for any guidance you can provide!

No, you haven't missed anything. In subsequent printings of the book, my advice about using "No subprocess" has been removed -- that mode is no longer valid with the versions of Python 3 released since Python for Kids came out in 2012.

Interestingly, despite the fact all the code in the original print was tested by multiple people (including me!), I can't now find a version of Python where from tkinter import * actually results in colorchooser.askcolor() working properly. So in yet later printings (some time after Jan 2017), I changed the instructions to reflect the fact that colorchooser is not imported by default when using import *.

This is now the corrected code:

from tkinter import *
from tkinter import colorchooser
tk = Tk()
tk.update()
colorchooser.askcolor()
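The behaviour is not specific to tkinter: import * on a package only brings in the names the package itself has already defined, not every module inside it. Here is a minimal sketch that shows this without needing a window, using a throwaway package. The names demo_pkg, chooser and ask are made up for this illustration and stand in for tkinter, colorchooser and askcolor.

```python
import importlib
import os
import sys
import tempfile

# Build a tiny package on disk: demo_pkg/__init__.py and demo_pkg/chooser.py
tmp_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(tmp_dir, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("top_level = 1\n")
with open(os.path.join(pkg_dir, "chooser.py"), "w") as f:
    f.write("def ask():\n    return 'asked'\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# Run "from demo_pkg import *" in an empty namespace and see what arrives.
namespace = {}
exec("from demo_pkg import *", namespace)
star_names = {name for name in namespace if not name.startswith("_")}
print(star_names)  # top_level is there, chooser is not

# The submodule has to be imported by name, just like colorchooser.
from demo_pkg import chooser
print(chooser.ask())
```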

Top Few

05 Jul, 2020

I was reading Tim Bray's recent blog posts (here and here) about his Topfew utility with interest, and wondered how my go-to systems programming language (Nim) stacks up compared to Go. I knocked up a quick and dirty implementation and then ran it against a 900MB access_log from my own site. A minute or so later I hit CTRL+C, realising my quick and dirty implementation was (a) not quick at all, and (b) perhaps a bit too "dirty". ಠ_ಠ

Once I changed from the naive read-everything-into-memory approach to using Nim's streams, I ended up with something slightly more acceptable. Caveat: not really optimised (well... apart from compiling with -d:release --passC:-mcpu=native --boundChecks:off flags), lazily developed, probably still naive, but at least it has borderline acceptable performance (GitHub).

Results on my laptop (with an approx 1.5 million line access_log):

Implementation   Elapsed   User   System   vs Go
Tim's topfew        2.68   3.86     0.41     1.0
ls/uniq/sort        9.10   1.76     0.26     3.4
Nim                 3.91   3.65     0.25     1.5

1.5x isn't brilliant, but it's not bad considering the minimal amount of effort I spent on it. But that led me to wonder, what about Python performance? Another Q&D implementation later (and this time I haven't bothered to dig into the performance at all)...

Implementation   Elapsed   User   System   vs Go
Tim's topfew        2.68   3.86     0.41     1.0
ls/uniq/sort        9.10   1.76     0.26     3.4
Nim                 3.91   3.65     0.25     1.5
Python              8.57   7.77     0.79     3.2

Not much better than the uniq/sort version, but the Python version is arguably a little more readable than Nim. One interesting difference between Python and Nim -- the Python version does read the whole file into memory before tokenising...

with open(filename) as f:
    for line in f.readlines():
        ...

...yet the performance was significantly better than my earlier naive Nim implementation which also read the whole file. So it's not just the right tool for the job, but the right technique for the tool.
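For comparison, a streaming version in Python iterates over the file object directly instead of calling readlines(), so only one line is held in memory at a time. The following is a minimal sketch of that technique, not the implementation timed above: the top_few function, its field and few parameters, and the sample log lines are all made up for illustration, and io.StringIO stands in for an open file.

```python
import io
from collections import Counter


def top_few(lines, field=0, few=3):
    """Count the values in one whitespace-separated field, reading line by line."""
    counts = Counter()
    for line in lines:  # a file object yields one line at a time
        parts = line.split()
        if len(parts) > field:
            counts[parts[field]] += 1
    return counts.most_common(few)


# Stand-in for open(filename): a few made-up log lines.
sample = io.StringIO(
    "10.0.0.1 GET /index.html\n"
    "10.0.0.2 GET /about.html\n"
    "10.0.0.1 GET /index.html\n"
    "10.0.0.3 GET /index.html\n"
)
print(top_few(sample, field=2, few=2))
```

Whether this is actually faster than readlines() for a given log would need measuring; the point is only that the loop itself doesn't change.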

Do I feel like dusting off the rather rusty Haskell skills though...?


Problem with bouncing ball

30 May, 2020

Lennier M writes:

I am on page 202 of your book, but when I followed your instructions, the ball stopped moving, instead of moving in multiple directions. Here is my code:

class Ball:
    def __init__(self, canvas, color):
        self.canvas = canvas
        self.id = canvas.create_oval(10, 10, 25, 25, fill=color)

        self.canvas.move(self.id, 245, 100)

        starts = [-3, -2, -1, 1, 2, 3, ]
        random.shuffle(starts)

        self.x = starts[0]
        self.y = -3
        self.canvas_width = self.canvas.winfo_width
        self.canvas_height = self.canvas.winfo_height()

    def draw(self):
        self.canvas.move(self.id, self.x, self.y)

        pos = self.canvas.coords(self.id)

        if pos[1] <= 0:
            self.y = 1
        if pos[3] >= self.canvas_height:
            self.y = -1
        if pos[0] <= 0:
            self.x = 3
        if pos[2] >= self.canvas_width:
            self.x = -3

please help

If I run your code I get the following error:

Traceback (most recent call last):
  File "test.py", line 40, in <module>
    ball.draw()
  File "test.py", line 26, in draw
    if pos[2] >= self.canvas_width:
TypeError: '>=' not supported between instances of 'float' and 'method'

So that tells us the line which is failing, but why?

If you look at these two lines, hopefully you'll see the difference (and the reason for your problem):

        self.canvas_width = self.canvas.winfo_width
        self.canvas_height = self.canvas.winfo_height()

The missing brackets on the first line mean that you haven't actually called the winfo_width function (or method). So self.canvas_width isn't a number of pixels - it's actually a reference to the function itself. If we added a print statement at that point in the code it would be even more obvious...

print(self.canvas_width)
<bound method Misc.winfo_width of <tkinter.Canvas object .!canvas>>
print(self.canvas_height)
400

This is the reason why comparing pos[2] (which is a number - to be exact it's a floating point number) with self.canvas_width (which is the reference to a function/method) comes back with the error message: "'>=' not supported between instances of 'float' and 'method'".

If you add the missing brackets, you'll hopefully find the ball moves as expected.
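The difference between naming a method and calling it can also be seen without tkinter. This is a small sketch using a stand-in class (FakeCanvas and its fixed width of 500 are made up for this illustration):

```python
class FakeCanvas:
    def winfo_width(self):
        return 500


canvas = FakeCanvas()

not_called = canvas.winfo_width   # no brackets: a reference to the method
called = canvas.winfo_width()     # brackets: the number the method returns

print(callable(not_called))
print(called)

# Comparing a number with the method itself fails, just like in the traceback.
try:
    250.0 >= not_called
except TypeError as error:
    message = str(error)
    print(message)
```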


Online supermarket delivery in the UK needs to evolve

03 May, 2020

We managed to get a Tesco delivery last week, which was a challenge - the best we've been able to do so far is about one delivery every two weeks or so (from different supermarkets). But that basically means sitting online for hours hitting the refresh button to see if a slot becomes available.

Hunter-gathering in the twenty-first century...

A couple of hours after our delivery had arrived, I happened to look out the window and see another Tesco delivery being dropped off about two doors down. So not only dumb UX, but from a logistics perspective, poor design as well.

What would make far more sense to me, particularly in this time of lockdown (probably after as well), is you load up your basket with your shopping, then specify half-day slots when you'll be at home, e.g. "here's the 30 things I want, and I can be home for delivery any time Monday to Sunday for the next two weeks", or "I can be home for delivery on Friday morning, Saturday morning, and all day Sunday". Periodically the supermarket runs an algorithm to find the optimal delivery for each order (and obviously you don't get charged if you don't get a delivery) -- perhaps with some prioritisation for those who've been waiting longer for their delivery slot.

There's nothing massively complicated about this as an algorithm - group addresses in the same street which have an intersecting time period. Then group addresses in close proximity (maybe half a mile or so). Then look for addresses slightly farther apart, but still close enough that delivery can be optimised (half a mile up to a couple of miles) - perhaps some of those can be grouped with one of the first two groups as well. Give each of these delivery groups a certain amount of points. You then rank the deliveries by their points, and by the age of the oldest order in the group -- ungrouped orders would just be ranked by their age -- and send out notifications to the confirmed orders. Discount delivery charges for grouped orders, and if there are any remaining slots available, open them up to ad hoc deliveries with a slightly higher charge.
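A rough Python sketch of just the first step (same street, intersecting slots) plus the ranking might look like this. Everything in it is an assumption made for illustration: the Order and DeliveryGroup classes, the slot labels, the greedy grouping and the scoring of one point per order. The proximity-based groups and the pricing are left out.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Order:
    customer: str
    street: str
    slots: frozenset  # half-day slots when the customer is home, e.g. "Sat-AM"
    age_days: int     # how long the order has been waiting


@dataclass
class DeliveryGroup:
    slots: frozenset  # slots shared by every order in the group
    orders: list = field(default_factory=list)

    @property
    def points(self):
        return len(self.orders)  # hypothetical scoring: one point per order

    @property
    def oldest(self):
        return max(order.age_days for order in self.orders)


def group_by_street(orders):
    """Greedily group orders in the same street that share at least one slot."""
    by_street = defaultdict(list)
    for order in orders:
        by_street[order.street].append(order)
    groups = []
    for street_orders in by_street.values():
        street_groups = []
        # Oldest orders first, so they get first pick of a group.
        for order in sorted(street_orders, key=lambda o: o.age_days, reverse=True):
            for group in street_groups:
                shared = group.slots & order.slots
                if shared:
                    group.slots = shared
                    group.orders.append(order)
                    break
            else:
                street_groups.append(DeliveryGroup(order.slots, [order]))
        groups.extend(street_groups)
    return groups


def rank(groups):
    """Rank by points, then by the age of the oldest order in the group."""
    return sorted(groups, key=lambda g: (g.points, g.oldest), reverse=True)


orders = [
    Order("A", "High St", frozenset({"Fri-AM", "Sat-AM"}), 3),
    Order("B", "High St", frozenset({"Sat-AM", "Sun-PM"}), 10),
    Order("C", "Mill Ln", frozenset({"Fri-AM"}), 12),
    Order("D", "High St", frozenset({"Mon-PM"}), 1),
]
ranked = rank(group_by_street(orders))
for group in ranked:
    print([order.customer for order in group.orders], sorted(group.slots))
```

In this made-up data, A and B end up in one High St group delivered on Saturday morning, while C and D stay ungrouped and are ranked by how long they have been waiting.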

There's more complexity to take into account, of course - the number of delivery vans versus the number of deliveries per van, distance to the supermarket/depot, etc. But still, the result is likely an improved user experience for everyone, better for the environment and probably more cost effective for the supermarkets.


Long and short dashes

14 Apr, 2020

Jan vK writes:

here's an example from your book that gives an error:
count_down_by_twos = list(range(40, 10, −2))
SyntaxError: invalid character in identifier
please inform me how to solve this problem

It looks like you might have copied-and-pasted the code, perhaps from the digital version of the book? The character in front of the 2 in your example is actually a long dash (−) rather than the short dash on the keyboard (-), which is the one Python expects for a minus. So if I try the version of the code you sent, I get the same error:

>>> count_down_by_twos = list(range(40, 10, −2))
  File "<stdin>", line 1
    count_down_by_twos = list(range(40, 10, −2))
                                             ^
SyntaxError: invalid character in identifier

However, if I try with the correct character, there's no error:

>>> count_down_by_twos = list(range(40, 10, -2))
>>> 

The next question you might ask is what does "invalid character in identifier" actually mean? An identifier is the name of something (the name of a keyword, a variable, a function or a class, and so on) -- valid identifiers are a sequence of letters, digits and underscores. In effect you're getting that error message because Python doesn't recognise "−2" (a long dash followed by 2) as a keyword, a variable, or anything else resembling a valid identifier.
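If a pasted line looks right but still fails, one way to spot the odd character is to ask Python for the name of anything that isn't plain ASCII. The following is a minimal sketch, assuming the pasted dash is the Unicode character U+2212 (written as \u2212 in the code so there's no doubt which character is meant); the dash in your copy may be a different one, and the find_non_ascii name is made up for this example.

```python
import unicodedata


def find_non_ascii(code):
    """Return (position, character, Unicode name) for each non-ASCII character."""
    found = []
    for position, character in enumerate(code):
        if ord(character) > 127:
            found.append((position, character, unicodedata.name(character, "unknown")))
    return found


pasted = "count_down_by_twos = list(range(40, 10, \u22122))"
print(find_non_ascii(pasted))

# Swap the long dash for the keyboard one and the line becomes valid Python.
fixed = pasted.replace("\u2212", "-")
print(find_non_ascii(fixed))
```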

Hope that helps.


ICANN corruption

18 Jan, 2020

Belatedly... this article about ICANN & dot-com price increases (which does look rather like ICANN corruption, in my opinion) annoyed me more than I can properly express, despite the fact that initially (first few years) the 7% increase will still be less than I'm paying for dot-nz domain names. This is pretty obvious profiteering by Verisign, and I worry that this will trickle out from the US to domain names for other countries, inevitably turning the ownership of a domain name from something that's a petty cash expense into a real, and significant, cost. It's particularly concerning if you take into consideration similar news in the dot-org space.

It irritated me enough that I started looking at alternatives. However, none of the blockchain domain name options look like a particularly economic, straightforward, sure-fire win (particularly not for a non-technical audience) - even OpenNIC, which is the closest tech to the incumbent, would require jumping through additional hoops because I don't believe my current hosting provider supports DNS alternatives - on their forums, I can't find any mention of OpenNIC apart from a note on a "Rejected Feature Proposals" forum, back in 2007, about it being a stale project.

Maybe the global Internet community will eventually route its way around the "damage" by selecting a generally acceptable alternative. Or a privacy-focused browser maker like Mozilla will come up with (and promote) a viable domain name system that the other browser makers will have to implement or be left behind.

In the meantime, perhaps I'll look at redirecting my primary domain elsewhere and route around the problem myself, before the name comes up for renewal in a few years' time...