EuroClojure 2012 impressions

I just got back from the EuroClojure 2012 conference. I won't try to summarize all the talks here, just convey some general impressions that I got:

  • The Clojure community is awesome. People are incredibly nice to each other. Ask a question on the mailing list, and you'll get a number of replies from people much smarter and more experienced than you. The same holds at the conference: you can approach anyone and expect to get great advice even if the competence gap between you and the other person is comparable in size to the Grand Canyon.

  • A lot of the talks focused on how to approach difficult and complex real-life problems better. Those weren't talks about syntactic sugar, "best practices", or new "features" the language should have. Instead, speakers presented results of months of thinking and experimenting: new architectures, new approaches, new ways to think about problems. If semicolons were discussed, it was a discussion about how to preserve them while doing source-code transformations. This is incredibly important: you can't overestimate the value of listening to smart people talk about ideas they thought hard about and developed for many months. I had several "aha!" moments where I suddenly saw that the architectures I developed were a poor-man's subset of a more general solution.

  • There was a focus on building real systems. I'd say that hobbyists and academics were in the minority: most people were there to learn how to better build code that makes money. It was also interesting to see the range of sizes of companies that use Clojure: from one-person consultancies through startups and small web-development shops, all the way to large financial institutions.

  • The median age of participants was probably between 35 and 40 years. This is clearly not a bunch of teenagers, but rather a group that gained significant experience in various languages and then moved on to Clojure. I think this has a lot to do with my first point — the community is both incredibly nice and mature, which often go together.

Overall, the conference was a success, more so than I expected. The organization was nearly flawless (not an easy task, I know, so hats off to Marco Abis). And it is now very clear that a European conference about Clojure is necessary. I'm looking forward to EuroClojure 2013!

Fixing a Tektronix 2246A oscilloscope

I bought an (obviously used) Tektronix 2246A oscilloscope at an auction site. It wasn't expensive, it looked fairly good, and the seller was nice and provided a start-up warranty.

After getting the scope it turned out that channel 3 was dead. Channels 1, 2 and 4 worked fine, but channel 3 would just draw a flat line, no signals registered.

I negotiated a discount with the seller and set out to repair the thing. Thank heavens for the extremely detailed service manuals! The manual for the 2246A Mod A wasn't difficult to find. I opened the enclosure and, after performing the standard self-test procedures suggested by the manual, started following the channel 3 signal path with an old scope I had.

Tektronix 2246A main board

I quickly discovered that the input signal was fine right until it entered the preamp IC (U230), a custom Tektronix chip. The inputs were fine, the output was flat. I was rather disappointed, as a failed IC would mean I'd need to find a replacement, which would be neither easy nor inexpensive.

But -- I then supplied the same signal to channel 4 and started comparing the two paths. According to the schematics, they should be identical. When I got to the preamp IC for channel 4, I started comparing the preamp ICs for both channels pin by pin. And… it turned out that the enable signal for U230 wasn't there!

One of these four solder joints is unlike the others...

The enable signals are generated in the U600 ("slow logic ic") chip, at the back of the main board. This is where things got weird, as the service manual I had diverged from the board I had in front of me. Not sure why, but clearly my main board was very different, with U600 and U602 placed in completely different locations and rotated. But, with some detective work I managed to figure out which chip is the U600 "slow logic ic". And the enable signal for channel 3 preamp was clearly there on the chip's pin. So I started following the signal and bingo! One of the resistors on the CH 3 EN signal path wasn't soldered at all!

Now, I'm not sure how this is possible -- would a cold solder joint produce such an effect? But clearly the connection was broken. Fortunately, this was easily fixable with a soldering iron, and a couple of minutes later I had my scope with all 4 channels working just fine.

Given that I paid about $200 for the scope, I'm quite happy with the end result.

[Posts like this one are written for search engines: one day someone might be looking for repair tips and find this page. I hope it helps someone.]

Making your Targus Bluetooth Presenter actually usable

Here's a tip that will come in useful if you'd like to use a Targus Bluetooth Presenter (AMP11US or AMP11EU) with your Mac.

It seems that the Targus wireless presenter remote (an otherwise nice device) was designed by a committee of morons, none of whom ever actually gave a Keynote presentation.

Apparently someone at Targus said that the buttons are supposed to be for "Next Slide" and "Previous Slide", which other people took literally, so the buttons just jump over to the next slide, skipping any builds or transitions that you might have in place. All you can have is flat slides. Goodbye builds, goodbye special effects, goodbye bullet points, goodbye movies. The buttons generate "Shift-CursorDown" and "Shift-CursorUp", forcibly skipping over anything that isn't a full slide.

Am I being unnecessarily harsh calling the designers "morons"? I don't think so. If you design a device whose only purpose is to facilitate presenting, and then you create a version specifically for the Mac (I quote from the Targus web page: "the only wireless presenter dedicated to Mac users") to be used with Keynote — is it too much to ask that you design it so that the two keys on the device actually perform useful functions? I mean, seriously — two keys, next step and previous step, how hard is that?

It's also rather clear that most "reviews" that you can find online are junk and the "reviewers" haven't actually used the device to perform presentations.

Fortunately, there is a solution. There is a small, free utility called KeyRemap4MacBook. Download it, install it (requires a restart), then go into your Mac OS X System Preferences, open the KeyRemap4MacBook panel, and from within its last settings pane open the private.xml file that stores custom key mappings.

Once you get there, enter the following:


<?xml version="1.0"?>
<root>
  <list>
    <item>
      <name>Targus Wireless Presenter Keynote Fix</name>
      <identifier>private.targus_wireless_presenter_keynote_fix</identifier>
      <autogen>--KeyToKey-- KeyCode::CURSOR_DOWN, VK_SHIFT, KeyCode::CURSOR_RIGHT</autogen>
      <autogen>--KeyToKey-- KeyCode::CURSOR_UP, VK_SHIFT, KeyCode::CURSOR_LEFT</autogen>
    </item>
  </list>
</root>

Save the file, go back to the first pane of the KeyRemap configuration and click "Reload XML". You might also want to check the box that says "Don't remap an internal keyboard".

And there you go. This remaps the useless key combinations that the Targus Presenter generates to plain "cursor right" and "cursor left", which do the right thing in Keynote.

Enjoy!

Check your provider's spam reputation before signing up

Before you choose a hosting provider, always check their reputation with Spamhaus.

I rent virtual servers with Bluemile (formerly also Fivebean). This morning I was greeted with bouncing E-mail, and a quick check showed:

Ref: SBL99441 68.68.16.0/20 is listed on the Spamhaus Block List (SBL) 30-Nov-2010 08:26 GMT | SR02

bluemilenetworks.com (escalation)

BLUEMILENETWORKS.COM ignores spam complaints, hosts spammers including known spam operations (ROKSO), assigns non-SBL'd IPs to spammers who get their assigned IPs listed in SBL, provides snowshoe spam configurations, fails to provide rwhois information as required by ARIN (thus providing anonymity for spammers) and generally acts like a network unconcerned with its mailing reputation. Spamhaus thereby treats it accordingly.

Here's a link to the actual updated Spamhaus page, which might be different when you look it up. Notice that it's a whole /20 block that is being blocked, so I can't do anything about it!
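
To get a feel for how wide a /20 listing is, here is a small sketch using Python's standard `ipaddress` module. Only the block itself comes from the Spamhaus notice quoted above; the sample address `68.68.20.10` is made up for illustration and does not refer to any real server.

```python
import ipaddress

# The block from the SBL listing quoted above.
listed = ipaddress.ip_network("68.68.16.0/20")

# A hypothetical address inside that range (an assumption, for illustration only).
sample_server = ipaddress.ip_address("68.68.20.10")

print(listed.num_addresses)         # how many addresses the listing covers
print(listed[0], "-", listed[-1])   # first and last address of the block
print(sample_server in listed)      # whether the sample address is caught by it
```

Every customer with an address anywhere in that range shares the listing, whether or not they ever sent a single spam message.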

I reported this to Bluemile support, who were completely unconcerned. They would deal with it once an engineer comes to work in the morning (business hours). Well, several hours have passed, then several business hours have passed, and there are no results to be seen. Meanwhile, almost all my outgoing e-mail keeps bouncing, which I have zero control over.

I am not a spammer. I take extensive care to make sure my servers never relay any E-mail. And yet here I am, listed in the SBL because I didn't check my provider's reputation.

I am really angry. And for those of you who suggest changing providers: sure, but moving mail and DNS servers is not that easy. It takes time and effort.

One thing is sure: the next time I look for a hosting provider, I will look up their IP ranges and check with Spamhaus (and other lists, possibly) to see if they are a spammer haven. I don't want to have anything to do with providers that are.
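
Such a check can be scripted. The sketch below assumes the usual DNSBL convention (reverse the IPv4 octets, append the list's zone, and do an ordinary DNS lookup) and the public `zen.spamhaus.org` zone. The function names are hypothetical, and the "127.0.0.x means listed" rule reflects my understanding of Spamhaus return codes, so check their current documentation before relying on it. Public resolvers may also be refused, in which case the answer falls outside 127.0.0.x and is treated as "not listed" here.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the DNSBL query name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str = "zen.spamhaus.org") -> bool:
    """Return True if the list answers with a 127.0.0.x code for this address."""
    try:
        answer = socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such name: the address is not on the list (or DNS is unavailable).
        return False
    return answer.startswith("127.0.0.")


# Building the name needs no network access; is_listed() performs the lookup.
print(dnsbl_query_name("68.68.16.1"))
```

Running `is_listed()` over a few addresses sampled from a provider's published ranges before signing up takes a minute and would have saved me this whole episode.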

Drobo and DroboShare — a review

Executive summary: don't buy it.

Convinced by people on podcasts (mostly TWiP and This Week in Tech) raving about how great the Drobo (from Data Robotics) storage device is, I decided to budget two into a project I'm working on. Expectations were high — Drobo marketing pushes the devices as easy to use, reliable and flexible. Being a Mac user, I expected an "Apple experience": plug it in and forget it's even there.

Nothing could be farther from the truth.

To begin with, the Drobo is Loud. Not just "loud", but REALLY LOUD. And it isn't the drives, it's the fan that cools the whole thing. To give you an idea of what I mean by Loud, one single Drobo with ultra-quiet WD Green drives spun down is louder than my 8-core Mac Pro with 4 drives and an army of fans in it. It's that loud. To make matters worse, the fan in the Drobo turns on very frequently, even when the drives have been spun down for hours. I don't know why, as the drives are very cool to the touch.

You won't want to have a Drobo under your desk, or anywhere in your vicinity, trust me. And that means the fancy fast FireWire-800 interface that you just paid for is pretty much useless. I used a DroboShare to set up my Drobo in a remote location where I can't hear it.

The DroboShare comes with Gigabit Ethernet, as the marketing will point out. What they won't point out is that it connects to your Drobo with a USB cable, which (together with SMB) pretty much limits your transfer speeds to about 5-8MB/s. That's about 6 times slower than when connected via FireWire-800.

What you should also know is that using the DroboShare brings its own annoyances. As an example, I found it impossible to create a sparsebundle disk image for use with SuperDuper on the Drobo. Go figure. SMB introduces other annoying problems, too: I couldn't copy my music collection onto the Drobo, because some filenames had non-ASCII characters in them.

But all of the above are merely inconveniences. The real issue is with reliability. I bought the Drobo so that I can trust it with my data and forget about failing drives and losing data. Which is why I was slightly miffed when Drobo Dashboard kept crashing on me and reporting unreliable data, annoyed when it hung in the middle of the night when doing my first real backup, slightly angry when support told me my Drobo is defective and needs to be replaced, and really pissed off when the second unit I got corrupted my volume and lost data (when connected to a DroboShare). And then Data Robotics support asked me... whether I have a backup. Or a copy of DiskWarrior.

I have so far been through TWO Drobo replacements. Despite my asking, Data Robotics was unwilling to provide an upgraded (better) unit.

What's worse is that now I don't trust the Drobo at all. I looked closer: the DroboShare seems to use the plain Linux support for HFS+ that is known to be shaky. There is NO FSCK (Filesystem Check) program for HFS+ at all! Data Robotics will tell you that you can switch your Drobo between a Mac and DroboShare and you will be ok — but that seems to be exactly what resulted in my data corruption problems.

Then there is Data Robotics support. When you make "reliable data storage devices", you really need to have support that cares about customers, reads their emails and responds instantly. Responding after one business day is not enough. Given that support people will forget what was written before, or begin by asking what your address is and when you bought your Drobo, it will easily take a week before you get to the real issue.

What you should also realize is that when your Drobo unit fails, there is no way for you to read data off the drives. You need a working Drobo unit to do that, and it has to recognize the filesystem and mount it.

I bought a Drobo so that I can have reliable data storage without worrying about reliable data storage. The net effect was that I got an unreliable solution that I have to manage, worry about and spend time and money on. That's a failure in my book. I will never buy another Drobo unit again.

[... the above was drafted, and then 3 months passed ...]

So, today my volume (a Drobo mounted via a DroboShare) unexpectedly disappeared from my Mac. Investigation of the DroboShare logs shows:


MOUNT HFS+ : s_id = [sda1]
scsi: unknown opcode 0xea
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105544
Buffer I/O error on device sda1, logical block 566638188
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105552
Buffer I/O error on device sda1, logical block 566638189
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105560
Buffer I/O error on device sda1, logical block 566638190
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105568
Buffer I/O error on device sda1, logical block 566638191
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105576
Buffer I/O error on device sda1, logical block 566638192
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105584
Buffer I/O error on device sda1, logical block 566638193
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105592
Buffer I/O error on device sda1, logical block 566638194
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105600
Buffer I/O error on device sda1, logical block 566638195
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105608
Buffer I/O error on device sda1, logical block 566638196
usb 1-1: USB disconnect, address 2
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105616
Buffer I/O error on device sda1, logical block 566638197
[...]

Buffer I/O error on device sda1, logical block 270838
scsi2 (0:0): rejecting I/O to dead device
Buffer I/O error on device sda1, logical block 270838
scsi2 (0:0): rejecting I/O to dead device
Buffer I/O error on device sda1, logical block 276472
scsi2 (0:0): rejecting I/O to dead device
Buffer I/O error on device sda1, logical block 276472
scsi2 (0:0): rejecting I/O to dead device
Buffer I/O error on device sda1, logical block 422806275
Buffer I/O error on device sda1, logical block 422806276
Buffer I/O error on device sda1, logical block 422806277
scsi2 (0:0): rejecting I/O to dead device
scsi2 (0:0): rejecting I/O to dead device
scsi2 (0:0): rejecting I/O to dead device

Drobo Dashboard doesn't launch, and the console shows me crash logs for the ddserviced daemon, which crashes every 10 seconds or so. Reinstalling Drobo Dashboard doesn't help.

I am so tired. I bought the Drobo so that I can save time, not so that I can run around and service it all the time, jumping through hoops set up by "support" from Data Robotics. I can already see how I'll have to spend several hours debugging the problems, dealing with support, reinstalling things.

I am posting this so that people are warned. Hopefully people will google for "Drobo" before buying it and I will save someone the hassle and frustration.

Will I lose data again this time?

Don't buy a Drobo.

Who's the sheep?

Cory Doctorow tells us we shouldn't buy iPads. Others join him, whining about how the iPad makes us all consumers, sheep, or worse, and how we are headed for a future similar to the one in Idiocracy, where we won't be able to do much except consume digital media.

To all those who complain about how un-hackable the iPad is: what have you hacked recently? Have you actually modified any hardware? Written interesting new software for an existing device? Released anything as open-source perhaps?

Well guess what: I have. I have been using Linux for 15 years, on desktops, laptops, handhelds, servers, tablets and embedded devices. I compiled software, fixed bugs, wrote drivers, improved things. I took to my HP-48G with a soldering iron and expanded the memory. I struggled with Linux on a Sharp Zaurus because I believed in an open device. I had to reverse engineer a Fujitsu tablet and write a Linux driver for a microcontroller that serviced the keys and orientation sensor, just so that I could use Linux on that tablet.

And you know what -- life is too short. I'll be buying an iPad so that I can work on more interesting things than making my hardware work properly. I'll use the device to jot down ideas, read articles, write notes, create presentations, sketch diagrams.

I'm not "losing" anything by buying and using the iPad. Just as I don't have to tinker with the jet engine of the airplane that will take me to London, I don't have to tinker with the internals of the iPad. If I want to tinker and hack, I can build a model airplane or an ultralight. In the computer world, there is Arduino, OpenMoko, and many other similar projects. Tinker and hack to your heart's delight and get educated about how electronics and software work on every level.

But I wonder -- why aren't you hacking and tinkering? Where are those "cool ideas from the creative universe" that you need so badly to give to me to run on my hardware?

More importantly, why aren't you designing something better?

Look at you: you have to actively convince people not to buy iPads. This means the product is so good and people want it so badly, that you have to fight the trend. So why hasn't anyone invented and designed a product that is this good and ships with full schematics and has this all-open architecture you crave?

Why haven't you?

If you actually wanted to write software for your iPad, instead of writing lengthy articles complaining about stuff, nothing prevents you from doing so. Just download the SDK and off you go. Yes, you will need Apple's acceptance to sell your app in the App Store, but it's all about ideals, isn't it, so no worries.

I know. It's easier to complain. But who's the sheep now?

Dear American Website Owner

You live in the United States of America. You design all your forms to have a mandatory "State" field. And then you decide it might just be a good idea to sell to the other 95.4% of the world. But you know, most of the world does not use the concept of a "State" all that much.

The moment you put a "country" field in your form, two things should happen:

  • you should remove the State field if the country isn't set to U.S.A. or at least make it optional
  • you should stop insisting on NANP-formatted phone numbers (NNN-NNN-NNNN)

I write this after wrestling with a number of unbelievably stupid web forms, all of which required me to provide a "State" name (I don't have one), choose from a list of states, or provide a fake phone number just to satisfy a stupid validator routine.
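
The two rules above fit in a few lines of validation code. This is only a sketch: the field names (`country`, `state`, `phone`), the two-letter country codes and the deliberately loose international phone pattern are all assumptions made for illustration, and for brevity only the US and Canada are treated as NANP countries.

```python
import re

# Countries whose numbers follow the North American Numbering Plan.
# Deliberately incomplete: just enough for this sketch.
NANP_COUNTRIES = {"US", "CA"}
NANP_PHONE = re.compile(r"^\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}$")
# Everywhere else: an optional "+", then digits with common separators.
GENERIC_PHONE = re.compile(r"^\+?[\d ().-]{5,20}$")


def validate_address(form: dict) -> list:
    """Return a list of error messages; an empty list means the form is fine."""
    errors = []
    country = form.get("country", "").strip().upper()
    if not country:
        errors.append("country is required")
    # The State field is mandatory only for U.S. addresses.
    if country == "US" and not form.get("state", "").strip():
        errors.append("state is required for U.S. addresses")
    # Apply the strict NANP pattern only where it actually applies.
    phone = form.get("phone", "").strip()
    pattern = NANP_PHONE if country in NANP_COUNTRIES else GENERIC_PHONE
    if phone and not pattern.match(phone):
        errors.append("phone number format not recognized")
    return errors


print(validate_address({"country": "PL", "phone": "+48 22 123 45 67"}))
print(validate_address({"country": "US", "phone": "555-123-4567"}))
```

The first call passes without a state; the second is rejected only because a U.S. address really does need one.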

HTML5, H.264 and Free Software: it's the wrong game!

Two important articles appeared in the last few days, both elaborating on why Mozilla is reluctant to adopt the H.264 video codec. Both are well thought out, but Mozilla is playing the wrong game here.

The implied conclusion is that we should all switch to Theora, since that is unencumbered with patents. Well guess what — pretty much every algorithm used in modern video compression is patented. And there are only so many ways you can slice and 2D-DCT a macroblock. There is no reason to believe that Theora is somehow designed "around" all those patents. It might very well be impossible to create a video codec that doesn't infringe on something. This article has a much more realistic approach to the issue at hand.

The game to play is to either abolish the patent system altogether (it has outlived its usefulness), or to make patent claims on algorithms void and unenforceable. Simply avoiding H.264 just because the licensing situation there is sorted out won't get us anywhere. We'll end up adopting something else (be it Theora or On2 VPwhatever) and finding out about patent claims years later, once the codec becomes popular.

Mozilla, HTML5 and H.264

Robert O’Callahan, Mozilla Hacker, wrote an interesting article about why he believes Mozilla should not support the H.264 format.

Other issues aside, I don’t understand why supporting a proprietary Flash plugin from a single vendor is better than opening support for a standardized (albeit similarly patent-encumbered) video format with open-source implementations.

x86 assembly encounter

Every couple of years I have an encounter with assembly programming. It's funny how rules that applied years ago are useless now. The most recent encounter lasted about two weeks and resulted in a 600x speedup in a critical function. But, all wasn't nice and rosy: it was more difficult than I initially planned, took more time and provided a few surprises.

Key takeaway points, so that I can remember them and so that people googling for answers may find them:

  • If you're looking for the PSRLB (parallel shift right logical bytes) SSE instruction, it isn't there. But there are two ways around it: you can either shift words using PSRLW and then mask out the higher bits, or for shifts with a count of one, use (xmm14 contains 1 in every byte and xmm15 is 0):
   psubusb xmm0, xmm14
   pavgb xmm0, xmm15
  • If you need to "horizontally" sum 16 bytes in an XMM register, you will find that the PHADDB instruction doesn't exist, either. There is PHADDW and you could use that in combination with PMADDUBSW (multiply-add bytes to words), but the resulting sequence of instructions is far from optimal. Fortunately, there is a trick: use PSADBW. This computes the sum of absolute differences, which if you use 0 as the source parameter will correspond to your sum, and stores it in two quadwords, which gets you halfway there. In my case, I simply accumulated the results using two quadwords per register, and combined them at the end.
  • There is a nice PMOVMSKB instruction which converts a byte mask to a bit mask. But why, oh why isn't there an instruction which does the opposite? Extracting a 16-bit mask to a 16-byte register turns out to be painful.
  • The last time I programmed in x86 assembly was using a Pentium 4 with the infamous NetBurst architecture. It was an ugly, unpredictable beast, where a mispredicted branch could cost you a fortune in performance terms. It seems that with the newer Nehalem chips Intel really got things right -- latencies for most instructions are small and predictable and overall performance is more consistent across the board. There are fewer traps. And unaligned data accesses aren't penalized as badly as before!
  • LOOP is slower than
   sub rcx, 1
   jnz .loop

Go figure.

  • Thank God and AMD for FINALLY adding registers. Back in the P4 days it was ridiculous: having a 3GHz processor with only 6 usable general-purpose registers and 8 SIMD ones sounded like a joke.
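
The byte-shift workarounds and the PSADBW trick from the list above can be checked with a small Python model of the instruction semantics. This is not the assembly from the function described in this post, just a lane-by-lane emulation. The helper names mirror the instructions (`psrlb_by_one` is a made-up name for the two-instruction sequence), and 16-element lists of byte values stand in for XMM registers.

```python
def psubusb(a, b):
    """Unsigned saturating byte subtract: max(a - b, 0) in every lane."""
    return [max(x - y, 0) for x, y in zip(a, b)]


def pavgb(a, b):
    """Unsigned rounding byte average: (a + b + 1) >> 1 in every lane."""
    return [(x + y + 1) >> 1 for x, y in zip(a, b)]


def psrlb_by_one(a):
    """The two-instruction shift-by-one: subtract 1 with saturation, average with 0."""
    return pavgb(psubusb(a, [1] * len(a)), [0] * len(a))


def psrlw_then_mask(a, count):
    """Shift 16-bit words right, then mask off bits that leaked across bytes."""
    mask = 0xFF >> count
    out = []
    for i in range(0, len(a), 2):
        word = (a[i] | (a[i + 1] << 8)) >> count  # little-endian word
        out += [word & 0xFF & mask, (word >> 8) & mask]
    return out


def psadbw_with_zero(a):
    """Sum of absolute differences against zero: one sum per 8-byte half."""
    return [sum(a[:8]), sum(a[8:])]


# Exhaustive check of the shift-by-one trick over every possible byte value.
for value in range(256):
    assert psrlb_by_one([value] * 16) == [value >> 1] * 16

data = list(range(240, 256))
print(psrlw_then_mask(data, 3) == [x >> 3 for x in data])
print(sum(psadbw_with_zero(data)) == sum(data))
```

The saturating subtract is what makes the averaging trick exact: without it, PAVGB's rounding would turn odd values into (x + 1) >> 1 instead of x >> 1.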

And the final observation: just as several years ago, the state of x86 assemblers is a sad, sad affair. To use a construction industry metaphor, an average x86 assembler has the complexity and usefulness of a hammer, while the DSP world is using high-speed mag-rail blast-o-matic nail guns with automatic feeders and superconducting magnets. I mean, seriously, do I really have to manually track register allocations?! Manually reorder instructions and measure performance to see which arrangement is faster (hoping not to break any dependencies)? Manually update stack pointer offsets after pushing something onto the stack? Write prologs and epilogs for C-linkable functions myself?

If anybody is thinking about writing or improving an x86 assembler, take a look at what Texas Instruments provides for their DSPs. See how you can write "linear assembly" and have your compiler schedule VLIW execution units for you. See how you don't need a piece of paper with a huge table detailing which registers are used in which part of your code.

I find it ridiculous that the most popular computing platform in the world does not have a decent assembler. What's even worse, from the discussions I've seen on the net, people are mostly interested in how fast the assembler is (?!) rather than how much time it saves the programmer.

Anyway, the net result of this encounter is a function that is about 600x faster than the original C implementation. It is about 4x slower than the theoretical limit (calculated assuming only arithmetic ops, no overhead, no memory accesses, and 16 ops per cycle), which I'm very happy with.

x86 assembly, see you in several years!

UPDATE (22.12.2009): I wrote this post hoping that it would help people searching for the nonexistent PSRLB instruction -- and it worked -- I can already see it in the logs!

Folder actions on Mac OS X: usable now?

AppleScript release notes for Snow Leopard (Mac OS X 10.6):
Folder Actions now attempts to delay calling “files added” actions on files until they are done being copied. Previous versions called “files added” actions on new files as soon as they appeared in the file system. This was a problem for files that were being copied into place: if the file was sufficiently large or coming over a slow server link, the file might appear several seconds before it was done being copied, and running the folder action before it finished copying would behave badly. Folder Actions now watches if the file is changing size: if it stays the same size for more than three seconds, it is deemed “done”, and the action is called.

My experience with folder actions was that they are one big race condition waiting to bite you. It’s something all the tutorials conveniently glossed over. I kept wondering why Apple kept them if they are so obviously unreliable.

Hopefully that change, while far from correct, will make them usable.
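
The heuristic described in the release notes boils down to polling the file size. Here is a minimal sketch in Python that assumes nothing about how Folder Actions is really implemented. `wait_until_stable` and its parameters are hypothetical names, and the size getter and sleep function are injectable only so that the logic can be exercised without real files or real waiting.

```python
import os
import time


def wait_until_stable(path, settle_seconds=3.0, poll_seconds=0.5,
                      get_size=os.path.getsize, sleep=time.sleep):
    """Return the file size once it has stopped changing for settle_seconds.

    This is only a heuristic: a stalled copy looks exactly like a finished one.
    """
    last_size = get_size(path)
    stable_for = 0.0
    while stable_for < settle_seconds:
        sleep(poll_seconds)
        size = get_size(path)
        if size == last_size:
            stable_for += poll_seconds
        else:
            # The file grew (or shrank), so start counting again.
            last_size = size
            stable_for = 0.0
    return last_size


# Simulated copy: the size grows twice, then stays put.
fake_sizes = iter([10, 20, 30, 30, 30, 30])
final = wait_until_stable("ignored.bin", settle_seconds=3, poll_seconds=1,
                          get_size=lambda _path: next(fake_sizes),
                          sleep=lambda _seconds: None)
print(final)
```

The sketch also shows why the fix is far from correct: a transfer that pauses for longer than the settle time over a slow link is indistinguishable from one that has finished.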

Adobe applications on Macs

Merlin Mann:

Because, with Adobe apps, everything from installation through activation through re-activation through software updates through more re-re-reactivations through (HEY! more updates!) is like a giant rectal exam. That I paid for.

Couldn’t have phrased it better myself.

Why I will steal music

Dear Music Industry Executives,

This is to explain why I will “steal” music using BitTorrent, eDonkey and any other easily available means.

I will not do it because I want to save money or because I’m cheap. Far from it. My 600-or-so CD collection packaged into boxes and stored in my basement should attest to it.

I’ve been trying to pay for music online. I really have. I wanted to use the iTunes store, but it doesn’t sell music or movies in my country. I tried to register as a US customer, but a US-based credit card is required to do that.

I managed to buy several albums from Amazon MP3 right when it opened, before it told me my money was not welcome (“Please note that AmazonMP3.com is currently only available to US customers”). Hulu told me its video library can only be streamed from within the United States.

I own a Sonos system, so I tried to get a Rhapsody subscription. But they didn’t want my money (“The Rhapsody MP3 Store is currently only available inside the United States”). Pandora didn’t want me either (“We are deeply, deeply sorry to say that due to licensing constraints, we can no longer allow access to Pandora for listeners located outside of the U.S.”).

Spotify was a glimmer of hope (it isn’t in the US!), until it told me that “Unfortunately, due to licensing restrictions we are not yet available in your country.”

So, in spite of my best efforts over the past several years, I have been unable to pay for music online. And frankly, I’m tired of trying. What difference does it make which country I’m in? Is my credit card any different from any other one? Are my dollars/euros of lesser value?

You have been playing your silly regional games and you think you can keep playing them forever. Make these people wait. Release the album here, see if it gets traction, then price it higher there. Regionalize DVDs to control releases and pricing. Well, the game is over.

From now on, I will have no qualms about downloading digital music. I will continue to buy from sources that want my money (Magnatune, artists like Ronald Jenkees). For everything else, I will just download it. It only takes a couple of minutes anyway.

So, next time you wonder about why your sales and profits are declining, remember — it’s because you didn’t want my money. And perhaps instead of complaining about P2P, hiring hordes of lawyers or buying expensive ad campaigns it is easier to simply start SELLING your stuff to people who want to pay for it.

GTD apps for the Mac: a subjective review

Having tried all major GTD apps for the Mac I thought I’d summarize my thoughts. While many people try to compare features, I would like to concentrate on a more subjective review. After all, a GTD app is something you use on a daily basis, so it isn’t just tables with features that matter.

Since I used all major GTD apps on the Mac extensively (i.e. I moved my entire life into each of them in turn), I think I’m qualified to form an opinion.

There are three major contenders in the Mac OS native application GTD arena:
* OmniFocus
* Things
* The Hit List

There used to be iGTD as well — but it has been discontinued now that its developer joined Cultured Code and works on Things. I used iGTD a long time ago, but found it too heavy on features and too crash-prone.

I should also probably mention TaskPaper, which while cool, isn’t really a full-blown GTD app.

Let’s go through each of the three in turn.

OmniFocus (The Omni Group) is the most mature of the apps. It was clearly developed with lots of user feedback. It is quite complex, with lots of user interface. However, I found that I was spending lots of time on the mechanics of managing tasks instead of actually doing stuff. There is lots of clicking, tabbing and cursoring around to be had in OmniFocus. Plus there is that ubiquitous Omni inspector thing, which some people love and some people hate. I fall in the second category. I don’t like multiple-window apps.

Things is carefully designed to look nice, which scores it a lot of marketing points. It also seems simple to use. I jumped onto it with enthusiasm, also buying the Things Touch iPhone app. But after several weeks problems became apparent. First, Things forces a structure upon you and that structure isn’t very well designed. There are projects, areas and „focuses”, which don’t really complement each other. In theory, Projects are for ordered, sequential lists of tasks, Areas for single-shot tasks and Focuses cut across them, letting you see which tasks you have to do immediately and which can wait. But if this is so, why can’t I schedule a task in a project to be done in the future?

The biggest problem with Things might seem inconsequential unless you realize this happens dozens of times a day. Let’s say I have a task in my Inbox. I know it belongs to a project and I need to start it today. I can either drag it to a project or drag it to „Today”, but in either case the task will disappear from my Inbox. I then have to hunt it down again, searching for it. This is a complete showstopper problem.

Until very recently Things also had no keyboard support at all — even the tabbing order seemed wrong. This has been improved in recent versions, but it is clear the developers never use the app without a mouse.

Things Touch was nice until I filled it with tasks. Then it became so slow that it was virtually useless. Unreliable syncing didn’t help either.

I then tried The Hit List, and after an hour I moved my life into it and never looked back. It isn’t perfect, but it gets most things right. Here’s what I really like about the app, all of it in contrast to the others:
* In the Inbox, you can drag things to „Today” and they still remain in the Inbox, which lets you then assign them to projects.
* There are lists and folders. You can use these lists as projects, areas, shopping lists, anything you want. No artificial distinction into „Areas” and „Projects”.
* Smart Folders let you organize tasks your own way (I have a „Stale tasks” smart folder that picks up untouched stale things for review).
* Insanely great keyboard support. Navigate to a task, press „F”, and then type several letters from any of your project names, press enter and your task gets moved. Similarly for jumping to projects, use „G” and type any subsequence of characters of your project’s name. I wish all apps had this nailed down so well.
* Great interface for repeating tasks. Press „Cmd-R” on a task, type „every week” and the task becomes a repeating one.
* Tabs that let you keep frequently used views easily accessible.
* Auto-suggested tags that really work (surprisingly).

My overall feeling after several weeks of use was that I was on top of things. I could manage my tasks easily without spending too much time on the mechanics of it.

The Hit List seems to contain everything I wanted from OmniFocus, but with a much better interface. I just hope the author will keep improving it very carefully, without implementing every feature people ask for. In GTD apps, streamlined interface and usability are more important than features!

Roughly quoting Merlin Mann (43Folders.com): „asking which GTD app is better is like asking if mustard is better than ketchup”. Those are subjective choices, hence my subjective review.

Experiments with parallel genetic programming in Clojure

I’ve been experimenting with genetic programming, learning Clojure as I go. I came to the point where I wanted to make my program parallel.

First of all, I am amazed at how readable and concise the code turns out to be. As an example, take a look at this function:

(defn choose-reproducing-parents [individuals]
  (take 2 
    (sort-by :fitness > 
      (select-randomly 5 individuals))))

It doesn’t get more readable than this!
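For readers who want to try this at the REPL, here is a minimal sketch. The post does not show select-randomly, so the version below is only a guess (it picks without replacement using shuffle), and sample-population is made-up data.

```clojure
;; Hypothetical helper: not shown in the post, so this is an assumption.
(defn select-randomly
  "Returns n items picked at random (without replacement) from coll."
  [n coll]
  (take n (shuffle coll)))

(defn choose-reproducing-parents [individuals]
  (take 2
    (sort-by :fitness >
      (select-randomly 5 individuals))))

;; Made-up population: ten individuals with fitness 0..9.
(def sample-population
  (mapv (fn [i] {:id i :fitness i}) (range 10)))

;; Returns two individuals, the fitter one first.
(choose-reproducing-parents sample-population)
```

This is a small tournament selection: five random candidates compete and the two fittest become parents.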

But the real joy came when I started to parallelize my code. Normally, the process would involve extra libraries, lots of fussing around with locks, and hours spent debugging deadlocks. So let’s look at an example function I needed to parallelize. Generating random code takes quite a bit of CPU time, especially if one needs to generate code for thousands of individuals. There is a function for that:

(defn create-random-individuals [number code-generator]
  (map create-individual-from-code (take number (repeatedly code-generator))))

And here’s the reworked parallel version:

(defn create-random-individuals [number code-generator]
  (pmap create-individual-from-code (take number (repeatedly code-generator))))

Can you spot the difference? Yes, that’s it. The little letter ‘p’ is all it takes for the work to be spread among all of my CPU cores.
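A quick way to see the effect for yourself is the sketch below. slow-square is a hypothetical stand-in for an expensive step, and it sleeps instead of burning CPU, so it only illustrates that the calls overlap, not real multicore speedup.

```clojure
;; Hypothetical stand-in for an expensive step such as random code generation.
(defn slow-square [x]
  (Thread/sleep 50)
  (* x x))

;; doall forces the lazy result, so time measures the actual work.
(time (doall (map slow-square (range 16))))   ; one call after another
(time (doall (pmap slow-square (range 16))))  ; calls overlap on several threads
```

pmap returns its results in the same order as map, so the rest of the program does not need to change. In a standalone script, call (shutdown-agents) at the end so the JVM exits promptly.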

Other functions required more work (again, notice how concise and readable the code is):

(defn produce-offspring [population number]
  (take number
        (repeatedly #(reproduce
                      (choose-reproducing-parents population)))))

For those unfamiliar with Clojure, repeatedly produces a lazy sequence whose elements are produced by the function (in this case an anonymous function) supplied as the argument. take simply takes the first number elements of a sequence.
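A tiny sketch of how the two fit together. The names calls and next-value are made up for the example; the counter is only there to show how many times the generator function actually ran.

```clojure
;; A counter lets us see how often the generator function was really called.
(def calls (atom 0))

(defn next-value []
  (swap! calls inc))

;; repeatedly describes an endless lazy sequence; nothing has been computed yet.
(def endless (repeatedly next-value))

;; take limits it to three elements; doall forces them to be computed.
(def first-three (doall (take 3 endless)))

first-three
;; => (1 2 3)
```

Because the sequence is lazy, only the elements that take asks for are ever produced.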

And now for the parallel version:

(defn produce-offspring [population number]
  (pmap reproduce
        (take number
              (repeatedly #(choose-reproducing-parents population)))))

Encouraged by this, I then moved on to parallelize the most time-consuming step of all GP programs: fitness evaluations. I’ll spare you the boring details of extended parallelization work I did on the function (but see examples above). The result was:

(defn test-generation [population fitness-function]
  (pmap #(set-fitness % (fitness-function %)) population))

After fitness was evaluated for all individuals, the next generation was produced and the fitness evaluations were run in parallel again.

This worked fine, but I thought there should be a better way. pmap limits me to a single multicore machine. This is fine for now, but in the future I plan to move to a distributed cluster, where the synchronous nature of map would be limiting. So I tried to write an asynchronous implementation.

First, I defined my pool: a collection where individuals are gathered once their fitness is evaluated:

(def pool (agent (vector)))

The pool is a Clojure agent. Agents are a synchronization primitive: you can send them actions (functions), which will be queued and executed in order.
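If you have not used agents before, this small sketch shows the basic cycle of sending actions and reading the state. demo-pool is a throwaway name for the example, not the pool from my program.

```clojure
(def demo-pool (agent []))

;; Each send queues an action. The agent applies queued actions one at a time,
;; and actions sent from the same thread run in the order they were sent.
(send demo-pool conj :first)
(send demo-pool conj :second)

;; await blocks until the actions sent so far have been applied.
(await demo-pool)

@demo-pool
;; => [:first :second]
```

Note that send returns immediately; the action itself runs later on a thread pool managed by Clojure.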

As you can see, the pool initially starts as an empty vector. So how do individuals get to the pool? Their fitness needs to be evaluated, and then they need to be added to the pool. It all starts with this function:

(defn run-individuals [individuals]
  (dorun
   (map #(send % test-individual *fitness-function*)
        (map #(add-watcher (agent %) :send pool fitness-tested) individuals))))

First, we make each individual an agent, so that we can add a watcher to it. Watchers are a cool Clojure feature: they let you watch for state changes. Using add-watcher we add a watcher to each individual, telling it to send a fitness-tested action to the pool (which is also an agent, remember?). Then, once we’ve set up watching for state changes, we send a test-individual action to each individual, giving it the fitness function as a parameter. test-individual is a really simple function: all it does is call the fitness evaluation function and return the new state of the individual.

The dorun is necessary, because we’re dealing with lazy sequences and discarding the result (sending agent actions is a side effect). If the dorun wasn’t there, the entire sequence would never get evaluated and actions would never get sent.
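The same wiring can be sketched in a self-contained form. Note the assumptions: this uses add-watch, the API that later Clojure versions provide in place of add-watcher, and the names results, report-to-pool and run-all are made up for the example. Multiplying by 10 stands in for the fitness evaluation.

```clojure
;; Assumption: a newer Clojure where add-watch has replaced add-watcher.
(def results (agent []))

(defn report-to-pool
  "Watch function: forwards the watched agent's new state to the results agent."
  [_key _ref _old-state new-state]
  (send results conj new-state))

(defn run-all [values]
  (let [workers (mapv (fn [v] (doto (agent v) (add-watch :done report-to-pool)))
                      values)]
    ;; map is lazy, so dorun is needed to make the sends actually happen.
    (dorun (map #(send % * 10) workers))
    workers))

(def workers (run-all [1 2 3]))

;; Wait for the workers, then for the results agent to process what they sent.
(apply await workers)
(await results)

(sort @results)
;; => (10 20 30)
```

The results arrive in whatever order the workers finish, which is why the example sorts them before looking.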

Let’s see what happens once the pool is notified of a state change in an individual:

(defn fitness-tested [watched-population individual]
  (let [population (conj watched-population @individual)]
    (if (>= (count population) target-population-size)
      (let [new-population (prune-population population)]
        (run-individuals
         (produce-offspring new-population
                            (- target-population-size (count new-population))))
        new-population)
      population)))

First, we add the new individual to the pool. If we haven’t gathered enough individuals, we simply return the pool with the individual added — this updates the global pool.

The fun part begins when we have enough individuals to produce the next generation. Then, we prune the population, deleting the poorest individuals and produce a new batch of individuals, letting them go using the previously described run-individuals.

If you’ve done any parallel programming, you’ll probably worry about multiple threads modifying (pruning) the population simultaneously. Not to worry: Clojure agents behave like monitors, so you are guaranteed that only one action will execute on a given agent at a time.
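Here is a small sketch of that guarantee. The names counter and senders are made up; eight threads hammer one agent with increments, and no update is lost even though there is not a single lock in sight.

```clojure
(def counter (agent 0))

;; Eight threads each send 1000 increments to the same agent.
(def senders
  (doall (for [_ (range 8)]
           (future (dotimes [_ 1000] (send counter inc))))))

;; Wait for every sender to finish queueing, then for the agent to drain its queue.
(doseq [f senders] @f)
(await counter)

@counter
;; => 8000
```

With a plain mutable variable and no locking, a test like this would typically lose updates; the agent's queue makes the increments happen one after another.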

We now have a fully asynchronous parallel GP implementation. Notice how there aren’t any queues, locks, thread pools to manage. All we have is a single global variable and a couple of simple functions. We don’t need any new data structures! The beauty of this solution is that because we’re using agents and watchers, Clojure does the queueing for us. Look, ma, no queues!

I am very happy with how easy and clean this solution turned out to be. I can now see why people keep raving about Clojure. Somebody finally did some serious thinking and implemented a new approach to parallel programming, not just a rehash of old ideas.

In less than an hour I went from a sequential implementation to a parallel asynchronous one. I’d say that’s impressive. And most importantly, the same code still runs on a single-CPU machine, with minimal performance impact. I am very impressed.

Clojure performance revisited

Since many people asked me about this, here are some additional notes about Clojure performance.

First, something that came as a surprise to me: the single biggest performance jump I got with my application was achieved by switching from Java 5 to Java 6 (64-bit, Mac OS X). The jump was huge: from interpreting around 850,000 instructions per second right up to 1,300,000 instr/s. That’s an improvement of more than 50% that required ZERO work on my part. Two clicks in Java Preferences.

Invoking a function is expensive. I am back to old Common Lisp techniques of using macros instead of functions in many places.

Watch out for var lookups (yes, I mentioned this before, but this is important).

The other things I did were application-specific, so there isn’t much point in describing them here.

And if you’re interested in how the JIT performs, here’s a sample run of the application. As you can see, it takes about a dozen runs until the rate stabilizes at around 1.5 million interpreted instructions per second. The improvement is dramatic: roughly 277% from the first run to the last one.


Executed 616154 instructions in 1.543867 seconds, instruction rate: 399097.84 inst/s
Executed 616154 instructions in 0.653465 seconds, instruction rate: 942902.9 inst/s
Executed 616154 instructions in 0.522443 seconds, instruction rate: 1179370.8 inst/s
Executed 616154 instructions in 0.492671 seconds, instruction rate: 1250639.9 inst/s
Executed 616154 instructions in 0.482119 seconds, instruction rate: 1278012.2 inst/s
Executed 616154 instructions in 0.424934 seconds, instruction rate: 1449999.2 inst/s
Executed 616154 instructions in 0.424169 seconds, instruction rate: 1452614.4 inst/s
Executed 616154 instructions in 0.416273 seconds, instruction rate: 1480168.1 inst/s
Executed 616154 instructions in 0.420429 seconds, instruction rate: 1465536.4 inst/s
Executed 616154 instructions in 0.421797 seconds, instruction rate: 1460783.2 inst/s
Executed 616154 instructions in 0.421114 seconds, instruction rate: 1463152.5 inst/s
Executed 616154 instructions in 0.4115 seconds, instruction rate: 1497336.5 inst/s
Executed 616154 instructions in 0.410837 seconds, instruction rate: 1499753.0 inst/s
Executed 616154 instructions in 0.411064 seconds, instruction rate: 1498924.8 inst/s
Executed 616154 instructions in 0.410936 seconds, instruction rate: 1499391.6 inst/s
Executed 616154 instructions in 0.410301 seconds, instruction rate: 1501712.1 inst/s
Executed 616154 instructions in 0.410638 seconds, instruction rate: 1500479.8 inst/s
Executed 616154 instructions in 0.408832 seconds, instruction rate: 1507108.0 inst/s
Executed 616154 instructions in 0.410466 seconds, instruction rate: 1501108.5 inst/s
Executed 616154 instructions in 0.410113 seconds, instruction rate: 1502400.5 inst/s
Executed 616154 instructions in 0.409741 seconds, instruction rate: 1503764.5 inst/s

Clojure performance tuning

Having done some actual coding in Clojure I can post notes that will hopefully help others. I have code that gets executed a lot (it’s a stack-based language interpreter) and needed to bring it up to reasonable performance. Here are some notes from the process:

  1. You really should know the difference between a seq and a list. If you use any of the seq functions (such as drop or take), your list will no longer have an O(1) count operation. Instead, it will become a LazySeq and count will become O(n). If count is something you call frequently (I do), you will want to avoid this. In my case, my stacks were implemented as lists and count gets called very frequently, so this was a serious and surprising problem. Surprising, because most tutorials skim over the difference, only emphasizing how general seqs are. So if your count performance takes a dive, see if you can replace (drop 2 mylist) with (pop (pop mylist)). The latter will keep the PersistentList structure.
  2. I implemented my stacks as both lists and vectors. There was almost no performance difference between the two (lists were actually slightly faster). I found this surprising, as I expected vectors to be faster. I still think vectors might have lower memory requirements, but I don’t know how to check.
  3. For vector access, (v n) is faster than (nth v n) which is faster than (get v n). This is not something I expected. I think this will get ironed out in the future, as Clojure matures.
  4. Attempts to replace Clojure vectors containing integers with primitive Java int arrays produced no performance gains. In fact, they managed to hurt performance because of some necessary conversions.
  5. As expected, there are very few places where type declarations improve performance. But if you have loops that get executed a lot, you might want to check. Always measure, as the results might be unexpected, and remember that less is more.
  6. When measuring anything running on top of the JVM, let it run for a while before you draw any conclusions. The Hotspot JIT compiler does really cool things with the code, but it does them after a while. In my case, I run code for at least 10-20 seconds and watch the results. I take them into account only after they stabilize. It is common to see a 2x improvement between the first run and the last one.
  7. Accessing vars costs cycles. Clojure has no constants, so many people use *global-parameters*. This is not a good idea in performance-sensitive code. There are two ways around it:
    1. Define macros that expand to constants:
      (defmacro global-parameter [] *global-parameter*)
      and use (global-parameter) in your code.
    2. Enclose your function definitions in let forms, rebinding global constants to lexical variables:
      (let [global-parameter *global-parameter*] (defn my-function [] ...))
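To make the first note concrete, here is a small sketch you can paste into a REPL. The name stack is made up; counted? reports whether a collection can answer count in constant time.

```clojure
(def stack '(1 2 3 4 5))

(counted? stack)              ; true: a PersistentList knows its own size
(counted? (take 2 stack))     ; false: a lazy seq has to be walked to be counted
(counted? (pop (pop stack)))  ; true: pop hands back a PersistentList

;; Dropping two elements or popping twice yields the same elements;
;; only the concrete type (and therefore the cost of count) differs.
(= (drop 2 stack) (pop (pop stack)))
```

If count shows up in your hot path, checking counted? on the values you pass around is a cheap way to find out where a list quietly turned into a seq.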

As for profiling tools, the old and tried method of actually measuring wall time worked best. I tried the YourKit profiler, but wasn’t impressed, and at $499 it is way overpriced. If they come out with a Clojure edition (drop some features) at $79, I’ll consider it. I also tried JVisualVM, but it turned out to be buggy on a Mac, and the profiler doesn’t work. I hope this will be fixed in the future.
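Measuring wall time by hand can be as simple as the sketch below. This is not the harness I used; time-runs and workload are made-up names, and workload is just a placeholder for whatever code you want to measure.

```clojure
(defn time-runs
  "Calls (f) n times and returns a vector of wall-clock durations in seconds."
  [f n]
  (vec (for [_ (range n)]
         (let [start (System/nanoTime)]
           (f)
           (/ (- (System/nanoTime) start) 1e9)))))

;; Hypothetical workload standing in for the interpreter loop.
(defn workload []
  (reduce + (map #(* % %) (range 200000))))

(def timings (time-runs workload 20))

;; Judge performance by the later runs, after HotSpot has had time to warm up.
(println "first run:" (first timings) "s, last run:" (last timings) "s")
```

Running the same workload many times in one JVM and looking only at the stable tail of the timings is what note 6 above recommends.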

PostgreSQL transactions

I’ve been doing some work with PostgreSQL recently. I have a table that stores user ratings, with a UNIQUE constraint:

UNIQUE (author_id, talk_id)

Think of it as a key-value store, just using PostgreSQL. Each user can store only one rating per talk (the latest one). It didn’t make sense for me to use a dedicated key-value database, since PostgreSQL was already used by the application.

Now, how do we deal with storing ratings, so that the UNIQUE constraint is satisfied? Easy! We do a DELETE and then an INSERT, all inside a transaction. The row effectively gets replaced. Nice and easy, right?

Well, not really. When testing, PostgreSQL promptly complained:

Error 23505 / duplicate key value violates unique constraint “ratings_author_id_key” has occurred.

How is that possible, I thought — we were supposed to be in a transaction after all!

Well, it turns out the default transaction isolation level in PostgreSQL is “Read Committed”, not “Serializable” as the SQL standard mandates. So it is entirely possible that in the middle of a transaction someone else will insert a row, and that will cause our INSERT to violate the UNIQUE constraint.

The solution is to either use a MERGE statement, which PostgreSQL still does not support, or set the transaction isolation level to SERIALIZABLE. This, apart from performance implications, means you will get aborted transactions:

Error 40001 / could not serialize access due to concurrent update has occurred.

…so you have to deal with those in your application and retry. A major pain in the rear.
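The retry part can be sketched as a small wrapper. This is only an illustration: with-serialization-retry is a made-up name, and the function you pass in is assumed to run the whole DELETE plus INSERT transaction at the SERIALIZABLE level. To keep the example runnable without a database, flaky-transaction fakes a transaction that fails twice with SQLState 40001 and then succeeds.

```clojure
(import 'java.sql.SQLException)

(defn with-serialization-retry
  "Calls (f), retrying up to max-attempts times when it throws an
   SQLException with SQLState 40001 (serialization failure)."
  [max-attempts f]
  (loop [attempt 1]
    ;; recur is not allowed inside try/catch, so the catch returns a marker.
    (let [result (try
                   {:value (f)}
                   (catch SQLException e
                     (if (and (= "40001" (.getSQLState e))
                              (< attempt max-attempts))
                       ::retry
                       (throw e))))]
      (if (= ::retry result)
        (recur (inc attempt))
        (:value result)))))

;; Stand-in for the real transaction: fails twice, then succeeds.
(def attempts (atom 0))

(defn flaky-transaction []
  (if (< (swap! attempts inc) 3)
    (throw (SQLException. "could not serialize access" "40001"))
    :stored))

(with-serialization-retry 5 flaky-transaction)
;; => :stored
```

Any other SQL error, or a serialization failure on the last allowed attempt, is rethrown to the caller. A real application would probably also want a short pause between attempts.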

Problem is, most people assume they can simply do a DELETE and then an INSERT inside a transaction, and since the default isolation level in PostgreSQL is not SERIALIZABLE they introduce a race condition. I would venture a guess that there are hundreds of applications online with a race condition of this kind. I’ve seen a Drupal bug report mentioning a “small race condition” of this kind. A “small race condition” is like being “slightly pregnant”, if you ask me.

The cross-platform browser fallacy

As I watch my browser sluggishly scroll a page, I can’t help but think: with all this hardware, why can’t we have decent performance in our browsers?

I have 320 processors[1] on my graphics card, together with half a gigabyte of memory. Four cores crunch instructions at over 3GHz, storing data in four gigabytes of main memory. Surely that is enough for smooth page scrolling? I mean, seriously — my 1MHz Commodore 64 could scroll things smoothly![2]

The problem with today’s browsers is that everyone assumes that a browser has to be cross-platform. A new browser obviously has to rule the world, so it has to run on all the major platforms. And since those have wildly different APIs, a compatibility layer is needed on top of them. Performance, you ask? Oh, it got lost along the way.

Let’s list the benefits that multi-platform browsers bring to users:

  • slow rendering and scrolling
  • code bloat resulting in low performance
  • large memory consumption
  • bugs resulting from overly complicated code

For all the unified experience talk, not a single browser can even sync bookmarks across platforms![3] Don’t believe the hype; there is exactly zero benefit from the multi-platformness of our browsers.

It used to be that for economic and political reasons it was necessary to have a unified multi-platform solution. Mozilla/Firefox had to do it to fight the IE monopoly and the misconception that Internet==IE (also to get money from Google for directing searches to them). Safari and Chrome had to be multi-platform because of Apple’s and Google’s world domination goals. IE didn’t have to be multi-platform, but curiously enough was designed as such anyway, so even though it only runs on Windows now, it carries a load of platform abstraction cruft.

Today, we live in a world where it is generally accepted that there can be many browsers and one has to stick to standards when designing web pages. World domination goals can safely be dropped: you don’t have to rule the world anymore.

Every major platform out there exposes the GPU to applications. There are also platform-specific approaches to threading, networking and disk I/O, providing great performance.

I would like to see a Mac-specific browser[4], a Linux-specific browser and a Windows-specific one. Each one should be light and trimmed, make the best possible use of platform-specific features, use the native GUI and integrate with the OS.

Stop building multi-platform monsters; users don’t need them.

 

[1] Give or take a hundred.

[2] …and do it in the overscan area where supposedly it was impossible to display anything!

[3] Foxmarks/Xmarks finally came along and did the job right.

[4] Safari 4 is close, but it could do better.

The Javascript trap I willingly step into

A response to Richard Stallman’s The Javascript trap

We used to live in a world where software was the essential value in computing. It would take years for companies to develop software, which then would be sold. Users would buy software, install it and run it.

This is the world that Richard Stallman decided to change. Thanks to him Free Software[1] exploded in popularity. Because of the growing number of people contributing to Free Software, its quality became comparable and then superior to that of proprietary software. Companies began to take notice.

Richard Stallman won this battle. The scales have been tipped: with a few exceptions[2], it now makes more economic sense for companies to release much of their work as Free Software (or at the very least give some freedoms to users) than to keep the cards to themselves. There are benefits: aside from the good press, they can tap into a developer community, get improvements, coexist with other software.

However, in recent years, value has shifted from software to services. Software itself has less and less value, as it becomes easier to write, and as more of it gets created. It used to be that people would pay for a web server — today they have a number of free choices and no one in their right mind considers developing yet another web server. Many software component types become commodities.

Since it is easier to get lots of good software, we tend to do more with our computers. Our time is more limited, so we place value on things which didn’t matter ten years ago. Convenience and time are major considerations. Another important factor is software maintenance: there is so much software being developed so rapidly that keeping up becomes an issue.

That’s why these days we see more and more online applications. It isn’t because they are better than desktop applications: they aren’t. It’s because they are more convenient to use. You don’t have to install, you don’t have to update and most importantly, you gain additional functionality only available online. It could be storage, backup, online synchronization, data feeds or processing, but without it the application loses a lot of its appeal, or doesn’t make sense at all. So, many of these online applications are really services with an application frontend.

Many people don’t really care if they are able to modify the Javascript that is being shipped to their browser and run there. The reason for this is economic — it doesn’t make sense to invest time and effort into modifying the code. The reason we’re using someone else’s code and service in the first place is to save time and gain convenience. And in the online world it isn’t the application that has value: it’s the combination of application and service.

When considering Richard Stallman’s point of view, you should think about where to draw the line. Do you require your service providers to use Free Software exclusively? If so, you might agree with Richard and you might want to require that all Javascript shipped to your browser be free[3].

But for some people the Javascript in the browser is really a part of the service (or part of the service provider’s infrastructure), and just as they don’t require the provider to only use Free Software, they don’t necessarily require all the Javascript to be under a Free Software License. And more importantly, there is no point in replacing or modifying this code. The whole point of using the code in the first place was to get the service from someone else, instead of writing code oneself.

I have nothing against the conventions proposed by Stallman for free Javascript programs (although I am not a fan of GPLv3 and I do have reservations about all versions of the GPL). However, I think the issue is much less important than it is made out to be. I willingly step into “The Javascript trap”, because I do not, and never will, want to modify parts of my service provider’s code. If I become unhappy with it, I will simply stop using the service.

[1] “Free” as in “freedom”.

[2] Large and/or specialized applications are the exception here: packages like Mathematica and MATLAB, firmware for embedded devices, software for designing airplanes, etc.

[3] Javascript libraries, large bodies of code shared across many sites, should be treated differently. They are still “software” in the classical sense. But they are also usually Free Software, so Richard’s article does not apply to them.