Egregious Thoughts

Below are 20 journal entries, after skipping by the 20 most recent ones recorded in Greg Hewgill's LiveJournal:

    thursday, february 18, 2010
    10:41 pm
    sunrise/sunset update, new google chart api features
    Just a small update today: I updated the Sunrise and Sunset graph page to show a green line on the graph to indicate the current date.

    In related news, I noticed that the Google Chart API has been significantly upgraded recently, with interactive charts (a fully JavaScript solution, no plugins such as Flash), plus new types of static charts including Formulas using TeX.

    tuesday, february 16, 2010
    8:49 pm
    careless people
    Today when I went to get my bike from the bike shed at work on the way home, I noticed that it had been shifted from one side of its stall to the other. As soon as I got on, I noticed the right handgrip was tweaked to the side, and as soon as I shifted gears I noticed that the derailleur was wonky. I could still shift but the gears were all out of alignment.

    It looks like somebody knocked my bike over at some point today, which scuffed up the right handgrip and bent the derailleur hanger. I'm annoyed because of careless people who knock over other people's bikes, and doubly annoyed because it's going to cost me a new derailleur hanger to fix this.
    wednesday, february 10, 2010
    9:24 pm
    twig - twitter irc gateway
    My primary "instant" communications medium is IRC. I've used IRC for roughly 15 years and it's an integral part of my communications toolkit.

    In order to follow Twitter, rather than having to reload the Twitter home page all the time or use a separate Twitter client, I was looking for a way to make tweets show up in an IRC channel. Originally I wrote an IRC "bot", which reloaded my Twitter page periodically and posted new tweets to an IRC channel. This was a bit fragile and wasn't a great solution, since it depended on connecting to some existing IRC network.

    There is a project called BitlBee that is an IRC server that connects to various different IM networks (AIM, Jabber, etc). You connect your IRC client to your own local BitlBee server, which then connects to the IM networks of your choice. BitlBee is written in C and is therefore less easy to adapt and modify than if it were implemented in a more modern language.

    I wanted to play with the Twitter Streaming API, so I wrote a Twitter-IRC gateway in Python called twig. Right now it does the simplest thing possible - it forwards tweets from the people you follow on Twitter into an IRC channel. It's a select-loop based single threaded implementation which should be easy to extend and modify.
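
    As a rough illustration of the select-loop style mentioned above (this is not twig's actual code; the function name forward_once and its parameters are made up for this sketch), a single pass of such a loop might look like this:

```python
import select
import socket


def forward_once(sources, sink, timeout=1.0):
    """Wait until any source socket is readable, then copy its data to sink.

    Hypothetical helper: returns the number of bytes forwarded in this pass.
    A real gateway would call this repeatedly from a single thread.
    """
    readable, _, _ = select.select(sources, [], [], timeout)
    forwarded = 0
    for sock in readable:
        data = sock.recv(4096)
        if data:
            sink.sendall(data)
            forwarded += len(data)
    return forwarded
```

    Everything happens in one thread, so adding another input is just a matter of adding another socket to the list passed to select().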

    If you use IRC and want to follow Twitter, check it out on Github.
    monday, january 25, 2010
    7:34 am
    the state of the livejournal

    My LiveJournal "friends" page looks pretty sad today. It's got 25 entries covering four days (back in the heyday of LJ, my friends list would typically roll over 25 entries in a day). There's one of my own entries, but as for the rest of them:

    • Number of new posts in the last 24 hours: 0 (except for status telling everybody the site is still up)
    • Number of posts by people I've actually met in real life: 0
    • 12 entries by jwz (occasionally amusing but mostly videos I don't watch)
    • 2 annoying "loudtwitter" crossposts
    • 1 LJ news update from news
    • 1 post where a mutual friend actually posts words they've written instead of just pictures.

    LiveJournal = dead.

    saturday, january 23, 2010
    2:11 pm
    two dead mp3 players, what next?

    I usually only use my MP3 player (Creative Zen Stone Plus) to listen to podcasts when I take the bus to work. After the Christmas holiday break, I went to plug in my player to copy the latest podcasts to it, and it didn't power up. I tried the reset button on it, I tried "charging" it for a whole day, I tried everything suggested on the Creative troubleshooting site, and didn't get so much as a blink out of its little LED. It's possible that the battery was drained to a level below some threshold where it refuses to charge again (apparently that sometimes happens with Li-ion batteries).

    I liked the player, so I found another identical one on Trademe (New Zealand's own version of Ebay). I charged it up, copied my current files to it, and it worked like a charm. For just two days. Then, as I was copying some more files to it, I got some odd I/O errors during copying, followed by generally slow performance (and USB timeout errors in my Macbook system.log file), followed by a complete loss of files on the device and an inability to boot it up (turning it on shows the "Creative" logo screen and nothing further). That was an unfortunate waste of $60.

    I'm now on the hunt for a new player. One of the features I use on the player is the ability to delete a file while I'm listening to it. I do this whenever I'm done listening to a podcast since, unlike with music, wanting to keep the file is the exception. Anyway, a file delete operation is a must, and I never realised that this was apparently an uncommon feature. Fortunately, today virtually all manufacturers provide the full user manuals for their devices online for download. Because deleting files seems to be an obscure feature, it generally doesn't seem to be mentioned in the short blurbs.

    • Sony Walkman - no delete function
    • Philips GoGear - no delete function
    • Samsung U4 - delete function, but not current file!
    • Creative Zen Stone - delete function, including current file

    I'm sort of having trouble finding other models because I know I don't want an iPod (I've written my own program to download and manage podcast feeds, so I need an ordinary filesystem interface). I also don't care about: video, photos, hackability (ie. I don't need to load Rockbox), more than about 4GB of space, or anything over about NZ$100.

    I'm currently undecided between trying my luck with another Zen Stone from Trademe (as they don't seem to be sold at retail here anymore), or getting a Samsung U4 at retail so I can return/replace it if it's not working for me. Does anybody have any other ideas?

    thursday, january 7, 2010
    11:45 pm
    y2010 bugs found via google code search

    Last week the news started coming out about various computing system failures caused by the rollover to the year 2010. I wondered how easy it would be to identify such bugs in open source software, using Google Code Search. What kind of bug would be easy to identify? A common error in the last century was to use a C printf format string "19%d", which would roll over from 1999 to 19100 at the turn of the century.

    What if people used "200%d" as a format string? That would roll over from 2009 to 20010 in the year 2010. But surely nobody would actually do that, right? Wrong. Some of those hits are false hits and not relevant to dates, but I found about 10 open source projects with such date-related format strings. Some of them are:

    I've sent suggested patches to fix the bug(s) to each project that I could find.
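
    To make the failure mode concrete, here is a small sketch in Python (whose % operator follows the same rules as C's printf for %d). The function names are made up for illustration, and tm_year is assumed to be "years since 1900" as in C's struct tm:

```python
def legacy_year(tm_year):
    # The classic Y2K bug: a hard-coded "19" prefix.
    return "19%d" % tm_year


def buggy_year(tm_year):
    # The Y2010 variant: a hard-coded "200" prefix plus the years since 2000.
    return "200%d" % (tm_year - 100)


def fixed_year(tm_year):
    # Format the full year instead of gluing digits onto a prefix.
    return "%d" % (tm_year + 1900)


print(legacy_year(100))  # 19100
print(buggy_year(109))   # 2009
print(buggy_year(110))   # 20010
print(fixed_year(110))   # 2010
```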

    I'm pretty sure this technique of using Google Code Search has been used to locate unsafe coding practices related to software security vulnerabilities, but I wonder whether anybody has successfully applied it to other types of software bugs.

    friday, january 1, 2010
    4:22 pm
    2009 in review

    2009 seems to have had even fewer changes than previous years. But it still had lots going on!

    friday, december 18, 2009
    12:09 am
    relight: restart a crashing process

    One of the projects I have running is a temperature monitor. I built a QK145 temperature monitor kit a few years ago and have been monitoring local temperature data for a few years now. However, the process I have monitoring and logging the temperature occasionally crashes with a bus error or illegal instruction or some other weird error. I don't know whether it's related to the old version of FreeBSD running the server, or dodgy hardware (memory?), a broken driver, or what. It doesn't happen often enough to get concerned about (yet).

    Anyway, I got annoyed with having to restart the monitor if it happened to fall over. The result is relight, a small Python script that automatically restarts a process that crashes occasionally. Here's the usage:

    Usage: ./ [options] command args ...
            -n restarts
                number of restarts within a minute before we give up (default 5)
            -l logfile
                name of log file (default relight.log)
            -w wait
                seconds to wait between restarts (default 5)

    Relight also comes with complete unit tests! It was a bit tricky to write automated tests that deliberately killed a process spawned by a child process (a "grandchild" process). But by having the grandchild process echo its own pid, the test code was able to read that and send a SIGKILL to the correct process.
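
    The restart policy described by the options above could be sketched roughly as follows. This is not relight's source; the function name supervise, the window parameter, and the choice to stop on a clean exit (status 0) are assumptions made for the example:

```python
import subprocess
import sys
import time


def supervise(command, max_restarts=5, wait=5.0, window=60.0):
    """Run command, restarting it each time it exits with a non-zero status.

    Gives up once max_restarts restarts have already happened within the
    last `window` seconds. Returns the number of restarts performed.
    """
    restart_times = []
    restarts = 0
    while True:
        status = subprocess.call(command)
        if status == 0:
            return restarts  # assumed: a clean exit needs no restart
        now = time.monotonic()
        # Only count restarts that happened inside the sliding window.
        restart_times = [t for t in restart_times if now - t < window]
        if len(restart_times) >= max_restarts:
            return restarts
        restart_times.append(now)
        restarts += 1
        time.sleep(wait)
```

    For example, supervise([sys.executable, "monitor.py"]) would keep a (hypothetical) monitor.py running until it crashes too often within a minute.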

    monday, december 14, 2009
    9:49 pm
    marginal costs of leisure activities
    I just sent a resignation email to the Canterbury Gliding Club. I haven't been flying in something over a year, and I couldn't justify the continued membership expense if I'm not even going to fly.

    Flying with Fault Line Flyers in Texas was pretty much a no-brainer for me at the time. It was about US$23 for a tow, and time in the air (glider rental) was a flat $3 or so per flight. I was living by myself and had plenty of time and enough money to spend on a weekend activity.

    Here, a similar tow is about NZ$50, and glider rental is just under $1 per minute of flying time. As you can imagine, that adds up extremely quickly and the reality is it exceeds what I can justify for a weekend activity today. Especially since I like to spend weekend time with my wife (as one or both of us often seem to be busy during weeknights).

    Coming to this conclusion was tough because I truly enjoy flying, but was always conscious of how much it cost. Naturally, that dampened the enjoyment somewhat. I realised that there are two types of leisure activities: those that have a marginal cost, and those that don't. Activities with a marginal cost are those where you have to shell out some amount of cash every time you do whatever it is. Gliding is definitely one of these activities; skiing, skydiving, and golfing are other examples. On the other hand, activities without a marginal cost usually require you to purchase equipment of some kind, but doing the activity just once doesn't cost anything. There are many examples of this: cycling, surfing, hiking, fishing, diving, even things like sailing or motorcycling (where the initial cost might be substantial). I have discovered that I'm a no-marginal-cost kind of guy. (Having said that, I'll still go skiing!)

    Speaking of gliding, congratulations to Terry Delore who just yesterday broke the world distance gliding record with a 2501 km flight!
    wednesday, december 9, 2009
    9:49 pm
    debug line information in psil compiler

    I've only had a bit of time to work on the Psil compiler, but it's coming along well. The compiler now generates Python AST code for many kinds of examples.

    As mentioned previously, I've tried annotating the AST with line and column information to help locate Python runtime errors. The representation of lists that I'm currently using (just a Python list) doesn't leave a lot of room to store the extra annotation information. For example, the code:

    (print (+ foo 5))

    is represented as the following Python lists:

    [Symbol("print"), [Symbol("+"), Symbol("foo"), 5]]

    There's not a lot of room in this representation to store source line annotations. The first thing I tried was to declare a global DebugInfo dictionary, indexed by the id() of the list (Python's id() represents a unique identifier such as a machine address). So for example, the above debug info might be:

    DebugInfo[123456] = [(1, 2), (1, 8)]
    DebugInfo[123460] = [(1, 9), (1, 11), (1, 15)]

    The first DebugInfo[123456] represents the starting line and column of each element in the first (outer) list. The second DebugInfo[123460] represents the same for the three elements of the inner list. This seemed like a great idea, and it worked well for small examples. However, with some more complex examples, particularly those involving macro expansion (very common in a Lisp language), the original code was garbage collected and the addresses of lists were re-used, causing the DebugInfo addresses to align with different source lists! This was tricky to track down.

    I've moved that code to another branch until I figure out what to do with it. I may be able to manage it by not letting the original code as read from the source be garbage collected (by keeping a reference somewhere else); that could work. The information only needs to be kept during the compile phase; as soon as it's embedded in the AST it doesn't need to hang around any longer.
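
    One possible shape for the "keep a reference" idea is sketched below. It is only an illustration, not the Psil code: the Symbol class is a stand-in, and DebugInfoTable with its record/lookup/clear methods is a hypothetical name. The table stores each list next to its positions, so the list stays alive and its id() cannot be handed to a different object while the entry exists:

```python
class Symbol:
    """Stand-in for Psil's symbol type (assumed for this sketch)."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "Symbol(%r)" % self.name


class DebugInfoTable:
    """Maps id(node) to (node, positions), holding a reference to each node."""

    def __init__(self):
        self._entries = {}

    def record(self, node, positions):
        # Storing the node itself prevents it from being garbage collected.
        self._entries[id(node)] = (node, positions)

    def lookup(self, node):
        entry = self._entries.get(id(node))
        # The identity check guards against ever matching a different object.
        if entry is None or entry[0] is not node:
            return None
        return entry[1]

    def clear(self):
        # Call once the compile phase is over to release the source lists.
        self._entries.clear()


inner = [Symbol("+"), Symbol("foo"), 5]
outer = [Symbol("print"), inner]
table = DebugInfoTable()
table.record(outer, [(1, 2), (1, 8)])
table.record(inner, [(1, 9), (1, 11), (1, 15)])
```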

    Anyway, more work needed. Just writing this post helped me sort out some ideas. Source on GitHub.

    wednesday, november 25, 2009
    9:00 pm
    psil presentation and compiler details

    As promised last week, I've been working on making Psil work in a Python 3 environment. But first, a small but entirely relevant digression which includes a video of me giving a presentation:

    A few weeks ago, I attended Kiwi PyCon here in Christchurch. I had thought about maybe possibly presenting something when I signed up, but I didn't actually prepare anything ahead of time. Immediately after lunch on the first day, there was a "lightning talk" session where people get 5 minutes to present something, no questions from the audience. During lunch, I decided that my opportunity was right then, and that if I didn't try to present something then I'd probably be annoyed at myself for not having tried.

    Anyway, I hurriedly prepared a presentation (PDF, 8 slides) in the 30 minutes before the presentation, and wondered whether I would even be able to find 5 minutes worth of stuff to say. It turns out I did, because although it felt like I had zipped through my presentation in like 3 minutes, I actually took slightly over 5 minutes. There is a video recording of my (terrible) presentation that you can watch online (skip forward to about 52:00 using the slider).

    Right, now that you're back, I'll continue with what I was going to write about.

    The Psil compiler (as opposed to the interpreter) compiles Psil code into Python code, which is then executed directly by the Python runtime system. For example, here is a simple function and its use:

        Psil                            Python
        ----                            ------
        (define (sq x) (* x x))         def sq(x):
                                            return x * x
        (print (sq (+ 2 3)))            print sq(2 + 3)

    Previously (before the current Python 3 migration), I did this by generating literal Python source code as shown in the right hand column above. But instead of generating the source code directly, I generated a Python AST (Abstract Syntax Tree) representation, then implemented a "de-parser" that creates equivalent Python source code from an AST. The original idea was to have Python compile and execute the AST directly (rather than going through the source code step), but I think the version of Python 2.x that I was using didn't support passing an AST to the compile() function.

    Since Psil now requires Python 3, the compile() function supports direct compilation of an AST that represents Python code. So now Psil can generate the AST, pass it to compile() without going through source code, and execute the resulting code object directly. One of the difficulties with this was that the internal Python AST changed completely from version 2 to 3, so I had to rework most of the compiler code. Fortunately, it turned out to be reasonably straightforward. I also maintained the de-parser because debugging actual source code is a lot easier than debugging with only an AST.
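
    A toy version of that pipeline, using only the standard ast module, might look like the following. It is not the Psil compiler: the nested-list input format, the OPERATORS table, and the to_ast and compile_form names are assumptions for the sketch, and it handles only integers and two binary operators:

```python
import ast

# Hypothetical mapping from Psil operator names to Python AST operator nodes.
OPERATORS = {"+": ast.Add, "*": ast.Mult}


def to_ast(form):
    """Translate an integer or a three-element [op, left, right] list."""
    if isinstance(form, int):
        return ast.Constant(value=form)
    op, left, right = form
    return ast.BinOp(left=to_ast(left), op=OPERATORS[op](), right=to_ast(right))


def compile_form(form):
    tree = ast.Expression(body=to_ast(form))
    # Fill in placeholder line/column numbers so compile() accepts the tree.
    ast.fix_missing_locations(tree)
    return compile(tree, "<psil>", "eval")


# (* (+ 2 3) (+ 2 3)) goes straight from AST to code object, no source text.
result = eval(compile_form(["*", ["+", 2, 3], ["+", 2, 3]]))
print(result)  # 25
```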

    Right now I've got this all working for simple examples. I still need to re-implement the "lambda lift" which promotes a Psil lambda expression (which may contain an arbitrary number of forms) to a Python function (because Python's lambdas can only contain a single expression).

    The next interesting thing I can do with an AST is annotate the nodes with source line and column numbers. This means that if a Python exception happens at runtime, the traceback will actually point to the location in the Psil source code that caused it! That's a pretty cool feature and I'm looking forward to playing with that.
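
    Here is a small sketch of how that could work with the standard ast module. The file name example.psil, the line number 7, the column offsets, and the located helper are all invented for the illustration. Each node is given an explicit source position, and the resulting NameError traceback reports that position:

```python
import ast
import traceback


def located(node, line, col, end_col):
    """Attach a single-line source position to an AST node (hypothetical helper)."""
    node.lineno = line
    node.end_lineno = line
    node.col_offset = col
    node.end_col_offset = end_col
    return node


# Pretend "(+ foo 5)" was read from line 7 of example.psil.
name = located(ast.Name(id="foo", ctx=ast.Load()), 7, 10, 13)
five = located(ast.Constant(value=5), 7, 14, 15)
expr = located(ast.BinOp(left=name, op=ast.Add(), right=five), 7, 7, 16)
code = compile(ast.Expression(body=expr), "example.psil", "eval")

try:
    eval(code, {})  # foo is undefined, so this raises NameError
except NameError as error:
    frame = traceback.extract_tb(error.__traceback__)[-1]
    location = (frame.filename, frame.lineno)

print(location)  # ('example.psil', 7)
```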

    saturday, november 21, 2009
    4:47 pm
    thawte web-of-trust shut down, what next?
    Since Thawte has shut down their free Personal Email Certificates program, there doesn't appear to be a free community-driven replacement for S/MIME certificates. Thawte VeriSign offers personal email certificates for US$20 per year, but no thanks.

    It looks like we're back to the tried-and-true PGP for no-cost secure person-to-person communications.
    thursday, november 19, 2009
    9:45 pm
    python 2 to 3 upgrade and exception handling

    I've just (partially) upgraded Psil to Python 3. I had originally been developing it in Python 2, but there are a few particular things about Python 3 that make it a better choice. There's no particular reason it needs to run on Python 2, so I decided to not develop two parallel versions, and not try to run the same code in both Python 2 and 3, but just upgrade wholesale to Python 3 and not look back.

    There were in fact only a few simple changes that were required. The most obvious is the print() function, but I also ran into:

    • The result of map() is an iterator not a list (this affected some unit test code)
    • Renaming of raw_input() to input()
    • Use of next(it) instead of it.next() (and my use of a local variable called next)
    • Need to use f(*a) instead of apply(f, a)
    • Lack of callable(x) (use hasattr(x, "__call__"))
    • Exception syntax: except Exception as e instead of except Exception, e
    • reduce() now lives in the functools module
    • Some test code needed to handle Unicode strings properly
    • A difference in default exception handling (more on that below)
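
    For reference, several of the items above look like this in Python 3 (a small self-contained sketch; the variable and function names are arbitrary):

```python
from functools import reduce  # reduce() is no longer a builtin

# map() returns an iterator, so wrap it in list() when a list is needed.
squares = list(map(lambda n: n * n, [1, 2, 3]))

# next(it) replaces it.next().
it = iter(squares)
first = next(it)


def add(a, b):
    return a + b


# f(*a) replaces apply(f, a).
args = (2, 3)
total = add(*args)

# Works whether or not callable() is available.
is_callable = hasattr(add, "__call__")

product = reduce(lambda a, b: a * b, [1, 2, 3, 4])

# "except ... as e" replaces "except ..., e".
try:
    raise ValueError("boom")
except ValueError as e:
    message = str(e)
```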

    So I certainly got a pretty good coverage of the major differences.

    The problem I had with exception handling was related to my unorthodox method of handling tail recursion. When I ran a program that used a lot of tail recursion, the memory usage immediately and quickly went through the roof! Clearly there was a memory leak, but Python is generally supposed to handle that for me with its garbage collector. The clue to solving this lay in an obscure warning in the documentation for sys.exc_info:

    Warning: Assigning the traceback return value to a local variable in a function that is handling an exception will cause a circular reference. This will prevent anything referenced by a local variable in the same function or by the traceback from being garbage collected.

    I had read that warning, but I wasn't using the traceback of sys.exc_info at all so I thought that shouldn't be a problem. However, Python 3 now automatically includes a __traceback__ attribute of every exception (see PEP 3134). Due to the way I was calling a function referenced within the exception object itself, the presence of the traceback was creating a huge chain of unfreeable function and exception references.

    Fortunately, there was a simple solution:

            except TailCall as t:
                a = t
                a.__traceback__ = None

    Setting the __traceback__ attribute explicitly to None releases the reference to previous stack frames and my code no longer leaks memory.
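
    To show the fix in context, here is a minimal trampoline that uses an exception to signal a tail call. It is only a guess at the general shape, not Psil's implementation: the TailCall attributes, the trampoline function, and the countdown example are all assumptions.

```python
class TailCall(Exception):
    """Raised in place of a recursive call made in tail position."""

    def __init__(self, func, *call_args):
        self.func = func
        self.call_args = call_args


def trampoline(func, *args):
    """Call func, re-dispatching whenever it raises TailCall."""
    while True:
        try:
            return func(*args)
        except TailCall as t:
            # Drop the traceback so it does not keep old frames alive.
            t.__traceback__ = None
            func, args = t.func, t.call_args


def countdown(n):
    if n == 0:
        return "done"
    raise TailCall(countdown, n - 1)


# Far deeper than the default recursion limit, yet the stack never grows.
print(trampoline(countdown, 100000))  # done
```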

    On the recommendation of the python-dev mailing list, I filed a documentation bug to clarify the warning quoted above.

    Finally, I said at first that this was a partial upgrade, because I haven't even addressed the compiler part of Psil (that compiles Psil code to native Python). The modules and interfaces that I was using previously are either gone or changed in Python 3, so a slightly different approach is needed. More on that in a later post.

    thursday, november 12, 2009
    11:45 pm
    look before you leap
    I started to investigate yesterday's idea of a shell implementation in Perl, and immediately got mired in the Byzantine shell quoting rules. The most basic thing about the shell turns out to be unnecessarily hard, and besides, it's already been done. Why did I think this was a good idea again?

    Somehow I had got it into my head that msysgit on Windows would only run from the msysgit shell. Since I don't actually use Git on Windows, it turns out I was mistaken. If you run any Git command from the regular Windows command prompt, it already works. The appropriate shell is invoked to execute the shell script (git-rebase is one such shell script, instead of being part of the native C implementation).

    Nothing to see here, move along...
    7:59 am
    git on windows
    Git is mostly written in C, with some important parts in shell and Perl. This works well on Unix-like platforms, but unfortunately suffers on Windows because there is no native Windows (Bourne) shell. Git can be built under the Cygwin environment, but that is a heavyweight solution and Cygwin is not appropriate for general use. msysgit builds a native Windows executable plus uses the msys shell implementation, which provides a workable solution, but still requires Git operations to be done within the msys shell and not the native Windows command shell.

    I had a thought the other day, that one could implement a Bourne shell interpreter in a more widely available language such as Python. This would provide a more widely portable option for using Git, and on Windows in particular. Such a shell implementation would provide the minimum implementation necessary to run Git's shell scripts (and which would include minimal implementations of various utilities such as grep, sed, and awk). That wouldn't solve the Perl script problem, so it seems obvious that a portable shell interpreter designed for use with Git should be written in Perl.

    A long time ago there was the Perl Power Tools project, which aimed to provide implementations of various Unix utilities in native Perl. Unfortunately, nobody ever submitted an implementation for sh. Is there any existing implementation of a Bourne-like shell in Perl? More generally, is this a viable plan to make Git work better on Windows?

    Update: See the next post to find out just how far I got with this.
    saturday, november 7, 2009
    9:38 pm
    and google app engine
    Last week, I was quietly working on my web server when all of a sudden the whole thing ground to a near-halt. It wasn't completely dead, because it would still ping and every few minutes I would receive another packet of characters (I was literally in the middle of refreshing a screen I was looking at). Not knowing what was happening and not having any way to find out, I went and did something else for a while.

    45 minutes later, my server returned to normal operation as if nothing had happened. This did not appear to be just a network congestion problem, it was definitely something my server was busy doing. A bit of investigation showed that the culprit was in fact Hundreds of machines all across Canada had all accessed the same short link at the same time, and completely pegged the PostgreSQL database processes, and also run my server out of memory. It's quite a testament to both FreeBSD and PostgreSQL that they survived at all.

    What I believe happened was that somebody had sent an email containing a link to a mailing list to which lots of people from Canada were subscribed. (The link in question happened to be a job opening at Looking up the reverse PTR records for the machines that loaded the URL, there are names like "mail", "barracuda", "filtre", "antispam", "mx1", "incoming-smtp", "guardian", etc. It seems that they all accessed the link for purposes of virus checking, all at pretty much exactly the same time. This was not good for my poor server.

    I decided that it might be time to move to a different server. It's written in Python, so it's an ideal candidate for Google App Engine, and I've been looking for an excuse to play with GAE. So I downloaded the SDK, converted the code over to GAE (using Google's datastore instead of SQL), and made it work locally. This part was refreshingly easy and worked well.

    The next step is to set up the Google site so it responds to and handles the requests appropriately. Given that I've already got the code working locally, that should be straightforward. However, there is one gigantic caveat when using Google web site services (that I've actually already run into for another project): You cannot have Google's servers respond to a "naked" domain name that doesn't have a hostname. This means that having Google respond to is not possible.

    (There is in fact a good technical reason for the above restriction. When you set up a site with Google hosting, you add a CNAME record to the DNS for your hostname, ie. " CNAME". This lets Google completely manage the association between "" and any particular IP address(es), which is critical for their load balancing setup. The caveat is that a record with a CNAME must not have any other DNS records associated with it, including an SOA record. The SOA record is required on a "naked" domain name like, so you can't add a CNAME there.)

    To work around this, I'll have to set up a hostname that Google can respond to, something like Of course, that's a pretty lame name for a link shortener to use, so I'll still want the published link to be This means that I'll have to have some other, non-Google server respond to a request with a redirect to This adds another level of indirection to the resolution process for a shortened link, which adds another browser round-trip, which might slow the whole experience down no matter how fast GAE hosting ends up being.

    It turns out that Namecheap (my registrar for offers "URL redirection" where their server will respond to a particular hostname and redirect the browser somewhere else. It can also be configured to retain URL path information, so would redirect to This would completely take my own server out of the loop, hopefully avoiding any more problems like those last week.
    friday, november 6, 2009
    12:20 am
    earthquake widget update
    Over the last couple of days, a few people have emailed me about my Earthquakes Widget which displays the most recent earthquakes on a small world map on your desktop (it's a Yahoo! Widget). It seems that it had stopped displaying earthquakes due to an error loading the data file from the server.

    It turns out that it was due to a URL change on the server, and the widget framework's XMLHttpRequest object doesn't seem to automatically follow the redirect. So I changed the link and it all seems to work again. I sent a new widget to the people who had emailed me to make sure that it solves their problem, and if it's all okay then I'll submit it to Yahoo! again (they approve widgets manually, so it pays to make sure it's right before submitting).

    If you're looking for the latest version (1.5) before it gets submitted to Yahoo!, see the Earthquakes Widget page which has a direct download link.

    I've uploaded the quake-widget repository to GitHub.
    wednesday, october 28, 2009
    8:51 pm
    mandelbrot set viewer
    A couple of years ago I had the idea to build a Mandelbrot set fractal viewer in the spirit of Google Maps. Google Maps itself has a method to display your own map data by implementing a "tile" server, but it seems specifically oriented toward geographical data and doesn't have features like arbitrary zoom depth (which you'd clearly want for fractals). I looked around for other similar generic solutions and didn't find anything at the time.

    Back then, I implemented a drag-scrollable zoomable fractal viewer in Javascript from first principles. This involved doing all the crufty cross-browser compatibility work myself, which was really annoying and very nearly took all the fun out of the project. Recently, after playing with jQuery a bit, I thought it would be good to go back to the fractal viewer and reimplement it using jQuery.

    You can see the results of my efforts at my Mandelbrot set viewer.

    The controls are pretty much what you'd expect. Panning and dragging work as expected (but sometimes you have to wait for the tile server to keep up). The zoom control zooms in 2x for each step, with an effective upper bound limited only by the floating point precision of Javascript. Double-clicking zooms in 2x on a particular point on the plane. The currently-unlabeled button below the zoom control allows you to generate a larger (desktop-sized) snapshot of your current view.
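
    The arithmetic behind a tile scheme like this can be sketched as follows. None of this is taken from the viewer's code: the function names, the 4x4 starting region with its lower-left corner at -2-2i, and the iteration limit are assumptions chosen for the example. Each zoom step halves the side length of a tile, which is why floating point precision eventually becomes the limit.

```python
def tile_bounds(zoom, tx, ty, base_size=4.0, origin=(-2.0, -2.0)):
    """Return (left, bottom, right, top) of tile (tx, ty) at a zoom level.

    At zoom 0 a single tile covers the whole base region; every further
    zoom level halves the tile's side length (a 2x zoom).
    """
    size = base_size / (2 ** zoom)
    left = origin[0] + tx * size
    bottom = origin[1] + ty * size
    return (left, bottom, left + size, bottom + size)


def escape_time(c, max_iter=100):
    """Count iterations of z = z*z + c before |z| exceeds 2."""
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return max_iter  # treated as "inside the set"


print(tile_bounds(0, 0, 0))  # (-2.0, -2.0, 2.0, 2.0)
print(tile_bounds(1, 1, 1))  # (0.0, 0.0, 2.0, 2.0)
```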

    I was talking to Phil today who suggested I look at OpenLayers. Sure enough, in their Gallery there is already a Fractal Browser which does almost exactly the work that I've already done!

    Once again, somebody out there on the internet has already thought of the same idea. Nevertheless, I'm happy with my viewer (even though it's not everything I had envisioned yet) and I think I'll leave it as is and move along to something else. Finally, because this is Open Source Wednesday, you can find the code in ghewgill/mandelbrot on GitHub. Enjoy!
    saturday, october 24, 2009
    9:38 pm
    skiing around the world
    The ski season in New Zealand runs from Queen's Birthday (first Monday in June) through to Labour Day (last Monday in October, this weekend). Unfortunately, Amy and I didn't manage to go skiing at all this season. While we were in Iceland last April, we stopped in Akureyri and drove up to the local ski hill on the slopes of Hlíðarfjall (which is no more than about 10 minutes drive from town).

    We had a hot chocolate at the lodge, and seriously considered spending the day skiing. I said at the time that although it would be fun, I wouldn't be disappointed if we chose not to ski. We decided to press onward (since we only had a limited number of days left) and ended up spending our spare day in Stykkishólmur instead.

    However, I now think it would have been awesome to go skiing in Iceland. It was a gorgeous day (see above), not too cold, we had enough time, and Akureyri is a charming little town with an excellent hot pool. And all the red traffic lights in Akureyri are heart-shaped!

    At night, you could also see lights across the fjord Eyjafjörður that form a huge red heart. We didn't spend much time there, but I think Akureyri was one of my favourite places in Iceland.
    wednesday, october 21, 2009
    11:09 pm
    xearth tropical storm overlay
    Two new things for xearth this week. First, there is a new overlay for current tropical storm activity. The source data is the Tropical Storms, Worldwide data set from the University of Hawaii. I wrote a script to download the current tropical storm file (which is a flat ASCII file), some Python code to convert that to an SVG file, and then converted that to a transparent PNG for the overlay.

    The other new thing is xearth now supports multiple overlay files. Here's an example showing all map and overlay features (the Living Earth base map, and earthquakes, storms, and clouds overlays).

    I'm going to improve the presentation of the tropical storm map so the labels are more visible, especially when combined with the cloud overlay (white on white is not an ideal colour combination).