Greg Hewgill (ghewgill) wrote,

kindle 3 and chinese text

A few weeks ago I started a beginner level Chinese (Mandarin) evening class at the local community college. We're about four weeks in and I'm really enjoying it; I will definitely do more.

I thought it would be an interesting project to load a Chinese-English dictionary onto my Kindle for reference. I've already played with kindlegen, which takes a collection of HTML files along with some additional metadata and creates a MobiPocket .mobi format file for the Kindle. (I've got several ideas for reference documents that could be loaded onto the Kindle, more on that later.)
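As a rough illustration of the kind of HTML input involved, here is a minimal Python sketch that turns a list of dictionary entries into a single UTF-8 HTML page. The sample entries, the function name `build_dictionary_html`, and the page layout are all assumptions made for this example; the metadata that kindlegen also needs is not shown.

```python
import html

# Hypothetical sample entries: (headword, pinyin, English gloss).
ENTRIES = [
    ("他", "tā", "he"),
    ("她", "tā", "she"),
]

def build_dictionary_html(entries, title="Chinese-English Dictionary"):
    """Return an HTML document string with one paragraph per entry."""
    rows = [
        "<p><b>{}</b> ({}) {}</p>".format(
            html.escape(word), html.escape(pinyin), html.escape(gloss)
        )
        for word, pinyin, gloss in entries
    ]
    return (
        "<html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        "<title>{}</title></head><body>\n{}\n</body></html>\n"
    ).format(html.escape(title), "\n".join(rows))

page = build_dictionary_html(ENTRIES)
print(page)
```

The resulting string would need to be saved with UTF-8 encoding before being handed to kindlegen along with its metadata.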

Anyway, I've now got a prototype Chinese dictionary loaded onto the Kindle. However, although many of the characters are displayed correctly, quite a few are just empty boxes (in the form of I ⬜ Unicode). I couldn't find any particular pattern to the missing characters; for example, 他 (he) displayed correctly but 她 (she) did not.
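One way to look for a pattern is to list the code point of each character and check which Unicode block it falls in. The sketch below is only an illustration (the helper name `describe` is made up here); it shows that both example characters sit in the basic CJK Unified Ideographs block, so the block alone does not explain why one is displayed and the other is not.

```python
import unicodedata

def describe(chars):
    """Return (char, code point, Unicode name, in basic CJK block?) tuples."""
    report = []
    for ch in chars:
        cp = ord(ch)
        # U+4E00..U+9FFF is the basic CJK Unified Ideographs block.
        in_basic_cjk = 0x4E00 <= cp <= 0x9FFF
        report.append((ch, "U+%04X" % cp, unicodedata.name(ch, "?"), in_basic_cjk))
    return report

for row in describe("他她"):
    print(row)
```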

Some research with Google showed that some people had hacked the Kindle and found a way to load different font files onto the device. I didn't try doing that, because I thought there should be a better solution. After all, Chinese text is displayed correctly in the Kindle web browser! Furthermore, if I copy a UTF-8 format file directly to the Kindle for display, it will have the missing character problem. But if I email the file to Amazon, where they automatically reformat it to a .azw book file and deliver it to the Kindle, the characters show up correctly. If I convert to a MobiPocket book file myself, the characters go missing again.

After laboriously paging through dozens of forum threads (typical internet "forum" software really is awful), I finally found a solution. It uses an undocumented debug feature of the Kindle. Press <Home> <Search> and enter the following commands:

    ;debugOn <Enter>
    ~setLocale zh-CN <Enter>

That's it; that is all that was required. I have no idea what this actually does or how it changes the Kindle's interpretation of UTF-8 documents (UTF-8 is supposed to be an unambiguous encoding of Unicode). But I now have a very basic Chinese-English dictionary in my Kindle.
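The point about UTF-8 being unambiguous can be checked with a small Python sketch (the helper name `utf8_hex` is made up for this example). Each character maps to exactly one byte sequence and decodes back to itself regardless of locale, so whatever the locale setting changes on the Kindle, it is presumably not the decoding step; that last part is a guess and not something confirmed here.

```python
def utf8_hex(ch):
    """Return the UTF-8 bytes of a single character as a hex string."""
    encoded = ch.encode("utf-8")
    # Decoding must give back exactly the character we started with.
    assert encoded.decode("utf-8") == ch
    return encoded.hex()

for ch in "他她":
    print(ch, "U+%04X" % ord(ch), utf8_hex(ch))
```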
