Jan Rychter: Blog [EN]

Cloud server CPU performance comparison

Jan Rychter — Thu, 12 Dec 2019 01:00:00 +0100

Alternate titles: "The cloud makes no sense", "Intel Xeon processors are slow", "The Great vCPU Heist".

I recently decided to try to move some of my CPU-intensive workload from my desktop into the "cloud". After all, that's supposedly what those big, strong and fast cloud servers are for.

I found that choosing a cloud provider is not obvious at all, even if you only want to consider raw CPU speed. Operators do not post benchmarks, only vague claims about "fastest CPUs". So I decided to do my own benchmarking and compiled the results into a very unscientific, and yet revealing comparison.

I was aiming for the fastest CPUs. Most of my needs are for interactive development and quick builds, in terms of wall clock performance. Which means CPU performance matters a lot. Luckily, that's what all cloud providers advertise, right?

I decided to write up my experiences because I wish I could have read about all this instead of doing the work myself. I hope this will be useful to other people.

Providers tested

In alphabetical order:

Amazon AWS (c5.xlarge, c5.2xlarge, c5d.2xlarge, z1d.xlarge)
Digital Ocean (c-8, c-16)
IBM Softlayer (C1.8x8)
Linode (dedicated 8GB 4vCPU, dedicated 16GB 8vCPU)
Microsoft Azure (F4s v2, F8s v2)
Vultr (404 4vCPU/16GB, 405 8vCPU/32GB)

Why those? Well, those are the ones I could quickly find and sign up for without too much hassle. Also, those are the ones that at least promise fast CPUs (for example, Google famously doesn't much care about individual CPU speed, so I didn't try their servers).

Setting up and differences between cloud providers

Signing up and trying to run the various virtual machines offered by cloud operators was very telling. In an ideal world, I would sign up on a web site, get an API key, put that into docker-machine and use docker-machine for everything else.

Sadly, this is only possible with a select few providers. I think every cloud operator should contribute their driver to docker-machine, and I don't understand why so few do. You can use Digital Ocean, AWS and Azure directly from within docker-machine. The other drivers are non-existent, flaky or limited, so one has to use vendor-specific tools. This is rather annoying, as one has to learn all the cute names that the particular vendor has invented. What do they call a computer, is it a server, plan, droplet, size, node, horse, beast, or a daemon from the underworld?

One thing I quickly discovered is that what the vendors advertise is often not available. As a new user, you get access to the basic VM types, and have to ask your vendor nicely so that they allow you to spend more money with them. This process can be quick and painless with smaller providers, but can also explode into a major time sink, like it does with Azure. There was a moment when I was spending more time dealing with various tiers of Microsoft support than testing. I find this to be rather silly and I don't understand why in the age of global cloud computing I still have to ask and specify which instances I'd like to use in which particular regions before Microsoft kindly allows me to.

Assuming you can actually get access to VM instances, there is a big difference in how complex the management is. With Digital Ocean, Vultr or Linode you will be up and running in no time, with simple web UIs that make sense. With AWS or Azure, you will be spending hours dealing with resources, resource groups, regions, availability sets, ACLs, network security groups, VPCs, storage accounts and other miscellanea. Some configurations will be inaccessible due to weird limitations and you will have no idea why. A huge waste of time.

The benchmark

I used the best benchmark I possibly could: my own use case. A build task that takes about two and a half minutes on my (slightly overclocked) i7-6700K machine at home. I started signing up at various cloud providers and running the task.

After several tries, I decided to split the benchmark into two: a sequential build and a parallel build. Technically, both builds are parallel and use multiple cores to a certain extent, but the one called "parallel" uses "make -j2" to really load up every core the machine has, so that all cores are busy nearly all of the time.

The build is dockerized for easy and consistent testing. It mounts a volume with the source code, where output artifacts go, too. It does require a fair bit of I/O to store the resulting files, but I wouldn't call it heavily I/O-intensive.

Methodology

A single test consisted of starting a cloud server, provisioning it with Docker (both were sometimes done automatically by docker-machine), copying my source code to the server, pulling all the necessary docker images, and performing a build.

The total wall clock time for the build was measured. The smaller the better. I always did one build to prime the caches and discarded the first result.
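A measurement loop along these lines can be sketched as follows. This is a minimal illustration, not the script used for these tests: the `time_builds` helper and the placeholder command are assumptions, and it runs the builds back to back, whereas the actual builds were spread over multiple days.

```python
import subprocess
import sys
import time


def time_builds(command, runs=6):
    """Run `command` runs+1 times and return wall-clock seconds per run.

    The first run only primes the caches, so its time is discarded.
    """
    timings = []
    for attempt in range(runs + 1):
        start = time.perf_counter()
        subprocess.run(command, check=True)
        elapsed = time.perf_counter() - start
        if attempt > 0:  # skip the cache-priming run
            timings.append(elapsed)
    return timings


if __name__ == "__main__":
    # Placeholder command; substitute the real (dockerized) build invocation.
    placeholder = [sys.executable, "-c", "pass"]
    for seconds in time_builds(placeholder, runs=2):
        print(f"{seconds:.2f} s")
```

Using `time.perf_counter()` measures elapsed wall-clock time rather than CPU time, which is the quantity that matters for interactive work.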

I tried to get six builds done, over the course of multiple days, to check if there is variance in the results. And yes, there is very significant variance, which was a surprise.

For some cloud providers (Linode and IBM) the build times were so abysmal that I decided to abandon the effort after just two builds. No point in torturing old rust.

I also threw in results for my own local build machine (a PC next to my desk), with no virtualization (but the build was still dockerized), and a dedicated EX62-NVMe server from Hetzner.

Results

I first created rankings for average build times, but then realized that with so much variance, these averages make little sense. What I really care about is the worst build time, because with all the overbooking and over-provisioning going on, this is what I really get. I might get better times if I'm lucky, but I'm paying for the worst case.
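To illustrate why the worst case can rank servers differently than the average, here is a small sketch. The server names and timings are made up for illustration and are not measured results.

```python
from statistics import mean


def summarize(build_times):
    """Return the worst, best and mean wall-clock time for one server type."""
    return {
        "worst": max(build_times),
        "best": min(build_times),
        "mean": mean(build_times),
    }


# Made-up timings in seconds, for illustration only (not measured results).
samples = {
    "steady-server": [200, 202, 198, 201, 199, 200],
    "noisy-server": [150, 155, 160, 290, 152, 158],
}

# Rank by the worst case, which is what you can actually count on.
ranking = sorted(samples, key=lambda name: summarize(samples[name])["worst"])
for name in ranking:
    print(name, summarize(samples[name]))
```

In this made-up data the noisy server has the better average, yet it ranks last once you sort by the worst build.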

The error bars indicate how much better the best case can be. As you can see, in some cases the differences are very significant.

These are the worst-case results for "sequential" builds (see "The benchmark" above for a description of what "sequential" means):

These are the worst-case results for "parallel" builds:

And this is the best case you can possibly get using a "sequential" build, if you are lucky:

The ugly vCPU story

What cloud providers sell is not CPUs. They invented the term "vCPU": you get a "virtual" CPU with no performance guarantees, while everybody still pretends this somehow corresponds to a real CPU. Names of physical chips are thrown around.

Those "vCPUs" correspond to hyperthreads. This is great for cloud providers, because it lets them sell 2x the number of actual CPU cores. It isn't so great for us. If you try hyperthreading on your machine, you will see that the benefits are on the order of 5-20%. Hyperthreading does not magically double your CPU performance.
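A back-of-the-envelope sketch of what that means for a fully loaded VM. It assumes two hyperthreads per physical core, that your vCPUs are sibling threads on the same cores, and the 5-20% gain quoted above; `core_equivalents` is a hypothetical helper, not a vendor formula.

```python
def core_equivalents(vcpus, ht_gain):
    """Estimate throughput in 'full cores' for `vcpus` fully loaded hyperthreads.

    Assumes two hyperthreads per physical core and that running both
    yields (1 + ht_gain) times the throughput of a single thread.
    """
    physical_cores = vcpus / 2
    return physical_cores * (1 + ht_gain)


for gain in (0.05, 0.20):
    print(f"8 vCPUs at {gain:.0%} gain: {core_equivalents(8, gain):.1f} cores")
```

Under these assumptions, 8 "vCPUs" are worth somewhere between 4.2 and 4.8 real cores, not 8.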

If you wondered why everybody was so worried about hyperthreading-related vulnerabilities, it wasn't because of performance loss. It was because if we pressured the cloud providers, they would have to disable hyperthreading, and thus cut the number of "vCPUs" they are selling by a factor of two.

In other words, we now have a whole culture of overselling and overbooking in the cloud, and everybody accepts it as a given. Yes, this makes me angry.

Now, you might get lucky, and your VMs might have neighbors who do not use their "vCPUs" much. In that case, your machines will run at full (single-core) performance and your "vCPUs" will not be much different from actual CPU cores. But that is not guaranteed, and I found that most of the time you will actually get poor performance.

Intel® Xeon® processors are slow

There. I've said it. These processors are slow. Dog slow, in fact. We've been told over the years that the Intel® Xeon® chips are the powerhouses of computing, the performance champions, and cloud providers will often tell you which powerful Xeon® chips they are using. The model numbers are completely meaningless at this point, which I think is intentional confusion, so that even a 6-year-old chip branded with the Xeon® name appears to be powerful.

Fact is, Xeon® processors are indeed very good, but for cloud providers. They let them pack lots of slow cores onto a single CPU die, put that into a server, and then sell twice that number of cores as "vCPUs" to us.

Now, if your workload is batch-oriented and embarrassingly parallel, and if you can make 100% use of all the cores, then Xeon® processors might actually make sense. For other, more realistic workloads, they are completely smoked by desktop chips with lower core counts.

Of course, if this were the case, then everybody would buy desktop chips. Which is why Intel intentionally cripples those, removing ECC RAM support, thus making them less reliable. And desktop chips are inconvenient for cloud providers, because you can't get as many "vCPUs" from a single physical server. Still, there are providers where you can get servers with desktop chips (Hetzner, for example), and these servers come out at the very top of my performance charts, at a fraction of the cost.

In other words, what we actually buy when we order our "Powerful compute-oriented Xeon®-powered VM" is a hyperthread on a dog-slow processor.

Enterprise shmenterprise

But, I can hear you say, this is wrong! Intel® Xeon® processors are for ENTERPRISE workloads! The serious stuff, the real deal, the corporate enterprisey synergistic large-mass cloud computing workloads that Real Enterprises use!

Well, my build is mostly Java execution and Java AOT compilation. Dockerized. That enterprisey enough? There is also some npm/grunt (it's a modern enterprise), with a bunch of I/O. It can make use of multiple cores, although not perfectly. I'd say it's the ideal "enterprise" use case.

Seriously, Xeon® chips are just plain slow. The benchmarks show it, especially in the single-threaded CPU performance part. They still rank relatively well in the multi-threaded benchmarks, but remember, a) your code is not embarrassingly parallel most of the time, b) you will be renting 4-8 "vCPUs" (hyperthreads), not 16 actual cores that you're looking at in the GeekBench results.

Takeaways

If you want to spin up a relatively fast developer-friendly cloud server for software development, I'd say that Vultr and Digital Ocean are the top picks.

Digital Ocean is by far the most user- and developer-friendly. If you have little time, just go with them. Things are simple, make sense, and are fun to use. As an example, Digital Ocean lets you configure firewall rules and apply them to servers based on server tags. Any server deployed with a certain tag will then use those firewall rules. Simple, makes sense, quick and easy to use. Now go and try doing the same in Azure, let us know in a week how things are going.

Vultr has some rough edges, but is a very promising provider. Almost as user-friendly as Digital Ocean (but no docker-machine driver!). If you want to use attached storage, you will run into problems (attaching storage reboots the machine, which their support tells me is expected behavior).

You can get slightly faster machines at AWS if you pay a lot more. The z1d instances are advertised as fast. My testing shows them to be only slightly faster, which probably isn't worth the price increase over a c5.2xlarge.

Buying more "vCPUs" often gets you better performance, even for the sequential build case. This is a bit surprising, until you realize that you are buying hyperthreads on an over-provisioned machine. If you buy more hyperthreads, you push out the neighbors and "reserve" more of the real CPU cores for yourself.

The best performance comes from… desktop-class Intel processors. My old i7-6700K is near the top of the charts, so is Hetzner's EX62-NVMe server with an i9-9900K. The EX62-NVMe is 64€/month, so for development it might make sense to just rent one or two and not bother with on-demand cloud servers at all.

Apart from Hetzner's desktop CPU offerings, there seems to be no way to get a cloud server with fast single-core performance.

Another conclusion from these benchmarks is that I decided to buy an iMac for my development machine, not an iMac Pro. Sure, I would like to have the improved thermal handling of the iMac Pro, as well as better I/O, but I do not want the dog-slow Xeon® processor. Perhaps it makes sense if you load all cores with video encoding/processing, but for interactive development it most definitely does not, and a desktop-class Intel CPU is a much better fit.

Spark E-mail app: why I don't use it anymore

Jan Rychter — Fri, 20 Jul 2018 02:00:00 +0200

I've been using the Spark iOS E-mail app for almost as long as it has existed. Better in every way than Apple's built-in Mail and nicely designed, it was a joy to use.

A couple of months ago Readdle announced Spark 2, "an e-mail experience built for teams". I am not a team, but I wished Readdle all the luck with the new business model. However, all this talk of team functionality made me slightly suspicious, as it seemed such features would be hard to implement without Readdle reading my E-mail server-side.

But I'm not a team, and I didn't sign up for anything new, and the app didn't ask me whether I will allow Readdle servers to access and read my E-mail, right?

As it turns out, if you enter your E-mail login and password into Spark 2 on an iOS device, that login and password will be sent to Readdle's servers, stored there, and used to access your E-mail.

So, what's the problem?

E-mail credentials are the keys to the kingdom. If you want to seriously disrupt somebody's life, get access to their E-mail. Most sites do not implement 2-factor authentication and will happily allow an E-mail password reset, so E-mail access gives any attacker instant access to most online accounts.

A confirmation E-mail is used when signing up for new services. Receipts are stored in E-mail archives. Lots of personal information is in E-mail. Nearly all E-mail is unencrypted and unsigned, and many people will trust an E-mail that they receive without question.

What's more, if my mobile device has my E-mail password, there are certain limits on what it can do. It probably won't train a machine-learning model on all 20GB of my archives, or extract all image attachments to get geo-positioning data from them. But there are no such limits server-side. If Readdle's servers have my password, they are free to download, read and process as much of my E-mail as they want to, whenever and however they want to.

I trusted Spark on iOS with my E-mail password, expecting that the app would keep it to itself on the device. iOS devices are reasonably secure, and there are limits to what a mobile app can do, so it was a compromise I was willing to make.

I never agreed to my password being sent to online servers, stored there and used to access my E-mail. That's an entirely different implied contract, and I'm not happy with it.

It's worth noting that guarding my password suddenly becomes much more difficult when it's stored on servers, and I think the risk of a breach is too high.

Clarity in communication

Since the app never asked me if I'm fine with my password being sent to and stored on their servers, I looked into the Privacy Policy. Here are the relevant parts:

Email address: As an email client, the core functionality of our Product is based on providing you with the ability to manage your email. For this reason, Spark services access your email account when you start using the App. […]

That sounds entirely reasonable. I don't know what "Spark services" are (they aren't defined in the policy), but I assume they must be parts of the E-mail app that run on iOS, right?

OAuth login or mail server credentials: Spark requires your credentials to log into your mail system in order to receive, search, compose and send email messages and other communication. Without such access, our Product won’t be able to provide you with the necessary communication experience. In order for you to take full advantage of additional App and Service features, such as “send later”, “sync between devices” and where allowed by Apple – “push notifications” we use Spark Services. […]

This also sounds reasonable and doesn't indicate that my credentials are being sent anywhere, right?

Except if you substitute "Spark services" with "online servers in the cloud". Oh, wait.

I do not know if it was Readdle's intention to hide the fact that "Spark services" are really "servers in the cloud". I do not suspect them of ill will, but I consider all this to be a serious lapse in judgment.

Here is what I would expect:

Do not force non-team users to share their credentials server-side. There is no reason to.
Ask clearly for permission to "SEND AND STORE YOUR PASSWORD ON OUR ONLINE SERVERS WHICH WILL ACCESS YOUR E-MAIL". It should be very clear to the user what is happening. The wording and presentation should make it difficult to accidentally agree. The user takes on additional risks by agreeing, so be clear about those risks.
Replace "Spark services" with "our online servers in the cloud" in the Privacy Policy.

As for me, I stopped using Spark immediately and deleted it from all my devices. I do not trust it anymore. I miss it (Readdle makes really good apps), but trust is important.

Leaving Squarespace

Jan Rychter — Mon, 28 Sep 2015 02:00:00 +0200

After several years the time has come to move my blogs from Squarespace. It was a strange relationship: I run my own servers and I'm certainly capable of implementing my own blogging solution, but using Squarespace was just easier. I could never find the time to do something of my own. So, even though I wasn't entirely happy with how Squarespace worked, I kept paying to have my pages hosted there.

The proverbial straw that broke the camel's back came recently. As I was leaving on vacation, I got an E-mail from Squarespace about them being unable to charge my credit card for another month. Not surprising, as my credit card had expired a couple of weeks earlier, so I had a new expiration date and CVV code. What was surprising, though, was that Squarespace immediately proceeded to turn off my blogs and pages. I gave them my new expiration date and CVV code, but they said I had to register again. I asked customer support for one week of grace period, as I was on vacation with poor internet connectivity, but the answer was a definitive "No". My pages went 403.

Think about it. This is a company that takes pride in customer support. I have been a loyal customer for several years. And now, they are unable to give me one week of grace period? They begin with taking everything offline and responding with a 403?

This is not intended to be a Squarespace review — but if you're considering hosting your blog/pages with them, you should take these points into account:

There is no „relationship“: the moment your credit card can't be charged, Squarespace will take your pages offline. As in "HTTP 403 Forbidden" offline.
Customer support, while very responsive and polite, is only useful as an intelligent manual. They will help you with finding settings, but anything that would result in changes to the code is off-limits. As an example, I've been asking for years to make a change to the code that generates URLs for blog posts. They remove characters with diacritics, instead of replacing them with ASCII lookalikes (so „łódź-2014“ gets transformed into „d-2014“ instead of „lodz-2014“). This is an eyesore and a disaster for SEO, and yet I could never get them to fix it. And I first reported it in March 2010.
If your site is multi-lingual, or even non-English, you will have a rough road ahead of you.
Squarespace will lose some of your data over time. Migration from Squarespace 5 to Squarespace 6, for example, lost high-resolution versions of my images. Only the thumbnails made it through. Some of the formatting was lost, too. It is up to you to write CSS to correct the more glaring problems.
Your data is held hostage. The export functionality is poor, broken and Squarespace has no interest in fixing it. While trying to write an importer for their XML export, I encountered a number of issues and reported them. After two months I finally got a definitive answer: the issues will not be fixed (more on this coming soon in a separate blog post). Only one issue got fixed: non-ASCII characters in exported comments are no longer lost (!).
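The URL problem described in the list above is easy to illustrate. This is a minimal sketch of one way to produce ASCII lookalikes; it is not Squarespace's code, the function names are hypothetical, and the `EXTRA` table is illustrative rather than exhaustive.

```python
import unicodedata

# Letters such as "ł" have no Unicode decomposition, so they need an
# explicit mapping. This table is illustrative, not exhaustive.
EXTRA = str.maketrans({"ł": "l", "Ł": "L", "ø": "o", "Ø": "O", "ß": "ss"})


def ascii_slug(text):
    """Replace accented letters with ASCII lookalikes instead of dropping them."""
    decomposed = unicodedata.normalize("NFKD", text.translate(EXTRA))
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII is dropped as a last resort.
    return stripped.encode("ascii", "ignore").decode("ascii")


def naive_slug(text):
    """Drop every non-ASCII character, mimicking the result described above."""
    return text.encode("ascii", "ignore").decode("ascii")


print(naive_slug("łódź-2014"))
print(ascii_slug("łódź-2014"))
```

The naive version yields "d-2014", while the decomposing version yields "lodz-2014".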

This blog (and all my other pages) has been moved to my own server. I find it disappointing that I have to write my own blogging software (it's 2015 after all), but I'm getting used to it: I recently had to do the same thing to have private photo galleries for sharing with my family.

Goodbye, Squarespace.

System perspective: channels do not necessarily decouple

Jan Rychter — Wed, 20 May 2015 10:27:38 +0200

Clojure's core.async channels provide a great metaphor for joining components in a system. But I noticed there is a misconception floating around: that channels fully decouple system components, so that components do not need to know anything about one another. As a result, it's easy to go overboard with core.async. I've seen people overuse channels, putting them everywhere, even when a cross-namespace function call would do just fine.

What channels do provide is asynchronous processing. The "degree" of asynchronicity can be controlled — we may choose to always block the caller until someone reads from a channel (thus providing a rendezvous point), or we may choose to introduce asynchronicity, letting the caller proceed without worrying about when the value gets processed.

Since you can put anything onto a channel, it's easy to forget that this "anything" is part of the protocol. Just as functions have an argument signature, channels have value signatures, or protocols: the common data format that all parties agree on.

It isn't true that channels fully decouple components, so that "they don't need to know anything about one another". You still need a wire protocol, just as with directly calling functions in another component. Channels do decouple things in time: you are not forced to a synchronous execution model and can control when things are being processed. But they are not a magic component decoupling wand, so don't use them when a simple synchronous function call will do.
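The point can be shown with any queue-like construct. Below is a rough Python stand-in using `queue.Queue` rather than core.async; the message shape (`order_id`, `amount`) is made up for illustration. Note that the producer and the consumer never call each other, yet both depend on the same message format.

```python
import queue
import threading

# The "wire protocol": every message is a dict with these keys.
# Both sides must agree on it, exactly as with a function signature.
REQUIRED_KEYS = {"order_id", "amount"}


def producer(channel):
    channel.put({"order_id": 1, "amount": 25})
    channel.put({"order_id": 2, "amount": 40})
    channel.put(None)  # sentinel meaning "no more messages" (also protocol)


def consumer(channel, totals):
    while True:
        message = channel.get()
        if message is None:
            break
        # The consumer breaks if the producer changes the message shape.
        missing = REQUIRED_KEYS - set(message)
        if missing:
            raise ValueError(f"malformed message, missing {missing}")
        totals.append(message["amount"])


channel = queue.Queue(maxsize=1)  # small buffer: put() blocks while it is full
totals = []
worker = threading.Thread(target=consumer, args=(channel, totals))
worker.start()
producer(channel)
worker.join()
print(sum(totals))
```

The queue decouples the two sides in time, but renaming `amount` on the producer side would still break the consumer.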

Hard Drive Encryption, revisited

Jan Rychter — Tue, 03 Mar 2015 11:45:13 +0100

Several years ago I made a comment on Hacker News (full discussion) about full-disk encryption performed by the hard drives themselves. Basically, the idea is that you give your hard drive a password/key and hope that it transparently encrypts your data before it hits the platters (or flash memory for SSDs).

I wrote:

That kind of encryption is useless, because I can't audit it. How do I know my data really IS encrypted and the key isn't just stored on the drive itself?

Now, Hacker News has a number of well-known people, who have a following. Opposing their opinions is not popular. Notice how my to-the-point response to tptacek gets downvoted.

Anyway — I feel somewhat vindicated by the recent revelations of hard drive firmware hacking by the NSA. I was right: you can’t and shouldn’t trust your hard drives. If you care about encryption at all, your drives should see the data already encrypted.

I2C driver for Freescale Kinetis microcontrollers

Jan Rychter — Wed, 17 Dec 2014 18:44:28 +0100

I wrote a tiny driver that allows you to access I2C (IIC, I²C, or I squared C) on Freescale Kinetis microcontrollers. It works asynchronously (interrupt-driven), supports repeated start (restart) and does not depend on any large software framework.

The code is on Github and the driver has its own page, too, with a README.

Even though it isn't fully-featured or well tested, I have a good reason for publishing the code. I wrote this and then had to put my Kinetis-related projects on hold. After several months I forgot having written this driver and started searching online for one… only to finally find it using Spotlight, on my hard drive. This is what happens if you work on too many projects.

To avoid this happening in the future, I now intend to publish most things I write as soon as they are reasonably complete, so that I can find them online when I need them.

Fablitics Launch

Jan Rychter — Tue, 04 Nov 2014 13:53:49 +0100

We have just launched Fablitics — our friendly business intelligence and E-commerce analytics solution.

The driving idea behind Fablitics is that only meaningful numbers and graphs should be shown in business intelligence software. We tried to understand which numbers are important and can help decision making, and which bring little value and only confuse.

Early on we noticed that many analytics-type products show lots of data, but most of that data isn’t related to actual business.

Let’s take an example: page views and visits in an online store. They are related to the business, but very remotely. Estimating performance based on page-views could be compared to estimating the performance of a supermarket by the number of cars in the parking lot. Sure, that number is correlated with supermarket sales. Yes, probably the more cars there are in the lot, the better. Yes, the days when there is more traffic will likely bring in more revenue. But the relationship is too weak to allow meaningful conclusions.

So, when designing Fablitics, we decided to focus on fundamental business concepts: customers, products, sales. Instead of showing the number of page views, we count customers that visit the store. We determine which customers enter the store for the first time, and which are returning. We know how much each customer purchased, and we also know how the customer was referred to us, so we can put a monetary revenue value on advertisement campaigns.

All this is based on a rethinking of what analytics software should do. In our opinion, as long as the purpose is to improve the business, it should be strongly rooted in business concepts.

If you run an online store, you can sign up now for a free trial at http://fablitics.com/ (no credit card required).

Lsquaredc: accessing I²C in Linux

Jan Rychter — Tue, 20 May 2014 09:40:01 +0200

It might seem that writing I2C libraries is my favorite activity, but it really isn't. This library is not something I expected to write, but since I had to, I'm releasing it in the hope that it will save others time and frustration.

Lsquaredc is a tiny Linux library that allows you to access I2C (IIC, I²C, or I squared C) from userspace without causing excessive pain and suffering.

When I tried accessing I2C on my BeagleBone Black, I was sure it would be obvious. Well, I was wrong. It turned out that the usual way is to read/write data using read() and write() calls, which doesn't support restarts. And if one needs repeated start, there is an ioctl with an entirely different and difficult to use interface.
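For a sense of what that ioctl interface involves, here is a sketch of a write-then-read transfer with a repeated start, built with ctypes. It assumes the ioctl in question is the kernel's `I2C_RDWR` request from `<linux/i2c-dev.h>`; it is not Lsquaredc's API, and the helper names, device path and addresses are hypothetical.

```python
import ctypes
import fcntl
import os

# Constants from <linux/i2c.h> and <linux/i2c-dev.h>.
I2C_M_RD = 0x0001   # message flag: this segment is a read
I2C_RDWR = 0x0707   # ioctl request: combined transfer

BytePtr = ctypes.POINTER(ctypes.c_uint8)


class I2CMsg(ctypes.Structure):
    """Mirrors struct i2c_msg."""
    _fields_ = [
        ("addr", ctypes.c_uint16),
        ("flags", ctypes.c_uint16),
        ("len", ctypes.c_uint16),
        ("buf", BytePtr),
    ]


class I2CRdwrIoctlData(ctypes.Structure):
    """Mirrors struct i2c_rdwr_ioctl_data."""
    _fields_ = [
        ("msgs", ctypes.POINTER(I2CMsg)),
        ("nmsgs", ctypes.c_uint32),
    ]


def build_register_read(addr, register, count):
    """Build a two-segment transfer: write the register number, then read.

    The kernel issues a repeated start between the two segments.
    """
    write_buf = (ctypes.c_uint8 * 1)(register)
    read_buf = (ctypes.c_uint8 * count)()
    msgs = (I2CMsg * 2)(
        I2CMsg(addr, 0, 1, ctypes.cast(write_buf, BytePtr)),
        I2CMsg(addr, I2C_M_RD, count, ctypes.cast(read_buf, BytePtr)),
    )
    data = I2CRdwrIoctlData(ctypes.cast(msgs, ctypes.POINTER(I2CMsg)), 2)
    # Return the buffers too, so they stay alive while the ioctl runs.
    return data, read_buf, (write_buf, msgs)


def read_register(device_path, addr, register, count):
    """Perform the transfer on a bus device such as a /dev/i2c-N node."""
    data, read_buf, _keepalive = build_register_read(addr, register, count)
    fd = os.open(device_path, os.O_RDWR)
    try:
        fcntl.ioctl(fd, I2C_RDWR, data)
    finally:
        os.close(fd)
    return bytes(read_buf)
```

Compare that with a plain `read()` or `write()` on the device file, and it becomes clear why a small wrapper library is welcome.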

For more information, see the Lsquaredc page and its Github repository.

Designing a High Voltage Power Supply for Nixie Tube Projects

Jan Rychter — Sun, 04 May 2014 22:07:47 +0200

PCB layout for the switch-mode HV PSU

I've posted a page describing the design of a HV PSU (High-Voltage Power Supply) that generates up to 220V from a 12V input. In addition to that, it also provides 2*Vout (so, up to 440V, for dekatrons), and two outputs for powering digital logic: 5V and 3.3V. The primary HV boost circuit reaches 88% efficiency when going from 12V to 185V at 55mA, with a 3% output ripple.
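As a quick sanity check on those figures, the input current implied by the quoted efficiency can be estimated as below. This is only arithmetic on the numbers above; `boost_input_current` is a hypothetical helper, and it ignores the other outputs (2*Vout, 5V, 3.3V).

```python
def boost_input_current(v_in, v_out, i_out, efficiency):
    """Input current of a converter, from output power and efficiency."""
    p_out = v_out * i_out          # power delivered to the load, in watts
    p_in = p_out / efficiency      # power drawn from the supply, in watts
    return p_in / v_in


i_in = boost_input_current(v_in=12.0, v_out=185.0, i_out=0.055, efficiency=0.88)
print(f"input current: {i_in:.2f} A")
```

That works out to a little under 1A drawn from the 12V input for roughly 10W delivered at 185V.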

The version I'm posting online is not perfect, but works quite well in a number of my projects. I decided I'd rather publish it as it is now rather than keep it locked forever.

It is published as Open-Source Hardware, to be used however one likes. All source design files are provided. It's my way of paying back: I learned a lot from looking at other designs and by asking questions, so now it's time to give back.

Dollhouse built from laser-cut plywood

Jan Rychter — Thu, 01 May 2014 20:45:09 +0200

I wanted a dollhouse for my daughter. But, as often happens, I couldn't find anything I liked. I wanted it to be tall, with multiple floors connected with stairs. I wanted every room easily accessible, with few external walls. And I also wanted it built in a way that would allow four kids to comfortably play together.

I ended up designing my own dollhouse in a CAD program, then laser-cutting it in 5mm plywood. The structure is held together by mortise-tenon joints, with just a little glue so that it doesn't fall apart when picked up. It's amazing how precisely you can cut plywood with a laser.

This is my second design in laser-cut plywood (the first was an Art-Deco inspired Nixie clock) and I feel I've learned a lot. My main discoveries so far:

Plywood will warp. Not as much as wood, but expect large surfaces to eventually warp. You can either ignore it (it might not matter), or design additional support structures.
The laser cutter is incredibly precise, but your plywood often isn't. You can't rely on the "official" thickness. I found 5mm plywood to be anything from 4.75mm to 5.25mm (and that is supposedly pretty good). Measure your particular batch and design your structure for the measured thickness. It really helps to use a parametric modelling CAD program, so that you can change the thickness anytime.
It is easy to design a structure, but more difficult to design a structure that you can assemble. I discovered the hard way that some designs simply can't be assembled (parts block one another and there is no order of putting things together that will allow you to complete the structure).
Your mortise-tenon joints will fit even if you make both the hole (mortise) and the peg (tenon) the same size. The width of the laser cut itself provides enough room. Still, unless you have perfect quality plywood, it is better to offset your hole edges and make holes slightly larger.

I'm quite happy with the results so far and will certainly use this method for other projects.

I2C using USI on the MSP430

Jan Rychter — Thu, 09 Jan 2014 18:27:09 +0100

I've released a tiny library that implements I2C master functionality for MSP430 chips that have the USI module (MSP430G2412 or MSP430G2452 are ones that I use often).

The code is on GitHub and is MIT-licensed, so you can do whatever you want with it.

From the README:

Features

Small.
Works.
Reads and writes.
Implements repeated start.
Uses the Bus Pirate convention.

Rationale

I wrote this out of frustration. There is lots of code floating around, most of which I didn't like. TI supplies examples which seem to have been written by an intern and never looked at again. The examples are overly complex, unusable in practical applications, ugly and badly formatted, and sometimes even incorrect.

The MSP430G2xx2 devices are tiny and inexpensive and could be used in many applications requiring I2C, but many people avoid them because it is so annoyingly difficult to use I2C with the USI module.

This code is very, very loosely based on the msp430g2xx2usi16.c example from TI, but if you compare you will notice that:

the state machine is different (simpler): see doc/usi-i2c-state-diagram.pdf for details,
it actually has a useful interface,
it is smaller.

Limitations

This is a simple I2C master that needs to fit on devices that have 128 bytes of RAM, so scale your expectations accordingly. There is no error detection, no arbitration loss detection, only master mode is implemented. Addressing is fully manual: it is your responsibility to shift the 7-bit I2C address to the left and add the R/W bit.
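The manual addressing amounts to the following arithmetic, shown here in Python purely for illustration (the library itself targets the MSP430 and does not provide this helper). The device address 0x48 is just an example.

```python
I2C_WRITE = 0
I2C_READ = 1


def address_byte(address_7bit, rw_bit):
    """Return the first byte of a transfer: 7-bit address shifted left, R/W in bit 0."""
    if not 0 <= address_7bit <= 0x7F:
        raise ValueError("I2C address must fit in 7 bits")
    return (address_7bit << 1) | rw_bit


print(hex(address_byte(0x48, I2C_WRITE)))
print(hex(address_byte(0x48, I2C_READ)))
```

For a device at 0x48, the byte to send is 0x90 for a write and 0x91 for a read.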

Have fun!

Bus Pirate Reference Card

Jan Rychter — Thu, 14 Nov 2013 11:38:02 +0100

The Bus Pirate from Dangerous Prototypes is a really useful tool. I use it regularly to bring up and test new I2C devices, or as a cheap protocol analyzer for I2C and SPI.

Unfortunately, the supplied cables aren't clearly labeled (only color-coded), and I found the reference cards available on the net lacking: they were usually not clearly readable and didn't help much. So I created my own.

You can print it on A4 paper and laminate it (which is what I did), or cut away just the upper color-coding portion and make a smaller laminated card.

Enjoy — Bus Pirate Reference Card (PDF)

Kinetis K and L Series Power Consumption

Jan Rychter — Wed, 13 Nov 2013 12:09:39 +0100

While considering a new microcontroller to use, I looked at the power consumption figures for the Freescale Kinetis K and L Series. Having some experience with various MSP430 MCUs, I am used to shaving off microamps and running systems on battery power, sometimes for years.

While the real answer can only be gotten with a real setup, one can get some preliminary information from the datasheets. And I found some surprises lurking there.

I was mostly interested in comparing three MCUs: KL05, KL25 (basically a KL05+USB) and the K20. The K10 which appears in the figures can be thought of as the K20 without USB, and I didn't expect its power figures to be any different from the K20.

The ARM Cortex M0+ has a range of power modes, so comparisons aren't easy. I left out those that basically amount to a full power-down (losing the contents of the RAM). The numbers do not include analog supply current, and are taken at 3V, 25C. The RUN values are at full core frequency (48MHz) with a while(1) loop executing from flash. VLPR and RUN values are with peripheral clocks enabled. Let's look at the first four modes:

As expected, the Cortex M4 core draws significantly more current when running at full speed. About three times more than Cortex M0+, in fact. Note that this doesn't necessarily mean the final product will use more power — if you draw three times more current, but get your computations done three times as fast, you're basically even.
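
The break-even argument can be made concrete with a small calculation. This is only a sketch: the currents and durations below are placeholders chosen to illustrate the "three times the current, three times as fast" case, not datasheet values. The charge used by one burst of computation is current multiplied by time.

```python
def charge_per_task_uc(current_ma: float, duration_ms: float) -> float:
    """Charge drawn by one burst of computation, in microcoulombs.

    mA * ms gives microcoulombs (1e-3 A * 1e-3 s = 1e-6 C).
    """
    return current_ma * duration_ms


# Placeholder numbers, not taken from any datasheet.
slow_core = charge_per_task_uc(current_ma=5.0, duration_ms=30.0)   # lower current, slower
fast_core = charge_per_task_uc(current_ma=15.0, duration_ms=10.0)  # 3x current, 3x faster

print(slow_core, fast_core)  # both come out to 150.0 microcoulombs
```

Whether a real workload actually finishes three times faster on the bigger core is a separate question that only measurement can answer.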

Note what begins to happen in STOP mode, though. The power draw for all chips is nearly the same. Continuing on into the lower-power modes (note the change of scale to 5µA max):

Apart from the strange discrepancy with KL25 in VLPS mode, all numbers are nearly identical.

This is not something I expected — although in retrospect, it makes sense: a stopped core and stopped peripherals consume little to no current, and you can stop nearly everything on all MCUs. Still, this has some real-life implications.

My takeaway from this is:

if your application spends most time in deep sleep modes, then it doesn't really matter which MCU you choose, from the power consumption point of view they are nearly the same,
if you use the CPU a lot, the choice isn't at all clear, and will really depend on the application and its computation patterns.
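
A rough way to see why the sleep current dominates is to estimate the average current of a duty-cycled application. The sketch below uses placeholder figures (a 2µA sleep current, run currents of 5mA and 15mA, and a device that is awake for about one second per day); none of them are taken from the datasheets.

```python
def average_current_ua(run_ma: float, sleep_ua: float, duty_cycle: float) -> float:
    """Average current in microamps for a device that runs for a fraction
    `duty_cycle` of the time and sleeps for the rest."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return run_ma * 1000.0 * duty_cycle + sleep_ua * (1.0 - duty_cycle)


# Placeholder figures: awake roughly one second per day (duty cycle of 1e-5).
small_core = average_current_ua(run_ma=5.0, sleep_ua=2.0, duty_cycle=1e-5)
big_core = average_current_ua(run_ma=15.0, sleep_ua=2.0, duty_cycle=1e-5)

print(f"{small_core:.2f} uA vs {big_core:.2f} uA")
```

With these assumed numbers the two averages differ by only about 0.1µA, even though the run currents differ by a factor of three. Raise the duty cycle and the run current quickly takes over, which is the second case in the list above.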

For comparison, let's look at the MSP430F5510, a chip comparable (from the application point of view) with the KL25, though the CPU is significantly less powerful. Values are in mA, taken at 3V, 25C, RUN values at full frequency (25MHz) running from flash:

The interesting fact here is that the F5510 at 25MHz running code consumes about as much current as the KL05 or KL25 running at 48MHz. ARM Cortex M0+ really is a very power efficient core. It makes sense: the MSP430 is a rather old design. But note how years of experience allow TI to achieve the impressive LPM0 number: 83µA. LPM0 is the lightest sleep mode on the MSP430, roughly comparable to WAIT mode (core stopped, peripheral clocks active, ability to wake up on any interrupt). When you get down to deep-sleep modes (LPM3), the numbers become more comparable.

Take all this with a grain of salt, as it is only based on datasheet numbers. Still, I found the data interesting and worth sharing.

TI MSP430 vs Freescale Kinetis: a price comparison

Jan Rychter — Wed, 06 Nov 2013 10:02:17 +0100

I've been using MSP430 microcontrollers for a while now, and I wanted to look into something more powerful, though not necessarily much more expensive. I believe component cost is a major factor when designing electronics (for many reasons).

I found the Freescale Kinetis line of devices, which looks really good capability-wise. So let's try to compare the pricing of (some) MSP430 devices and (some) Kinetis K and L series devices, with a Tiva-C (Stellaris) thrown in for comparison.

Here's the table of prices compiled from Farnell. The prices are in PLN, but it really doesn't matter that much, I cared mostly about the relative pricing. Divide by 3 to get USD if you need to. All prices are from the first price break (usually qty 10), and I only chose devices that make sense to me in my projects.

Things I noted:

The low-end Kinetis KL05 devices (MKL05Z32VFK4, MKL05Z32VFM4, MKL05Z32VLC4 and MKL05Z32VLF4) are almost the same price as the tiny MSP430G2412 (~$1.60-$2 or so). That is amazing, considering they are modern 48MHz 32-bit ARM processors, have 4 times the flash memory, 16 times the RAM, and many more peripherals (G2412 has no ADC).
It makes little sense to use the MSP430G2553 unless you already have designs using it (I do).
The cheapest USB-capable device in the Kinetis line is the KL25 (MKL25Z128VFM4). It is slightly more expensive than the USB-capable MSP430F5502, but has much more flash and RAM. Farnell doesn't sell the cheaper KL25 devices; I suspect they would end up being the lowest-cost USB solution.
The first K10 (MK10DN32VFM5) at roughly $2.50 is a 48MHz Cortex-M4 core with DSP instructions.
The first K20 (MK20DN32VLF5) is only 3 times as expensive (at $3.65) as the tiniest MSP430 I use. That is also amazing.
Tiva-C (TM4C123AH6PMI) is only price-competitive if you need large memory sizes and floating-point. Sadly, it makes zero sense to me as a hobbyist — also because I could not find Code Composer licensing terms for Tiva-C devices. For MSP430 there is a 16kB memory limit.

My key takeaways from this:

I will switch to Freescale Kinetis (KL05, KL25 and K20) for most of my new projects. The chips are good, inexpensive, and the tools are free (CodeWarrior is free up to 128kB, which covers all low-end devices I might want to use).
I will continue to use MSP430 in legacy projects, for really simple projects, or for designs where every microamp counts (another blog post on power consumption of MSP430 vs Kinetis is coming).
Sadly, the Tiva-C (formerly Stellaris) which I had high hopes for, isn't an option at all. It's way too expensive plus it isn't clear if developer tools will be free.

Home-made reflow oven for SMD

Jan Rychter — Mon, 07 Oct 2013 11:52:23 +0200

I wanted to build electronic devices with SMD components. From what I can see, many people shy away from SMD, trying to solder QFN components with a soldering iron and fighting the trend. I have no idea why.

I find SMD components easy to work with, cheap and small. If you restrict yourself to components 0603 or larger (I use mostly 0603 in my designs) and don't try to use BGA components, you'll be fine. Even the dreaded QFN packages aren't a problem at all.

For reflowing boards you can just get solder paste in a syringe, apply it manually, place your components manually, and then reflow the board. For tiny boards even a hot-air soldering station is enough, for larger ones, it turns out you can get decent results from a cheap assembly involving a tiny oven, a thermocouple (to measure the temperature), a thermocouple interface chip (I used the MAX31855), an SSR (solid-state relay) and a TI MSP430 Launchpad to control it all.
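
As an illustration of the thermocouple interface, here is a sketch of decoding the 32-bit frame that the MAX31855 returns over SPI, following the bit layout in its datasheet. It is written in Python for readability and is not the firmware running on the Launchpad. The function name is hypothetical, and the SPI read that produces `raw` is left out.

```python
def decode_max31855(raw: int) -> dict:
    """Decode one 32-bit MAX31855 frame.

    Bits 31..18: thermocouple temperature, signed, 0.25 degC per count
    Bit  16:     fault flag (set when any of the three fault bits is set)
    Bits 15..4:  cold-junction temperature, signed, 0.0625 degC per count
    Bits 2..0:   short to VCC, short to GND, open circuit
    """
    if not 0 <= raw <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit frame")

    tc = (raw >> 18) & 0x3FFF
    if tc & 0x2000:        # sign bit of the 14-bit value
        tc -= 0x4000
    cj = (raw >> 4) & 0x0FFF
    if cj & 0x0800:        # sign bit of the 12-bit value
        cj -= 0x1000

    return {
        "thermocouple_c": tc * 0.25,
        "cold_junction_c": cj * 0.0625,
        "fault": bool(raw & (1 << 16)),
        "short_to_vcc": bool(raw & 0x4),
        "short_to_gnd": bool(raw & 0x2),
        "open_circuit": bool(raw & 0x1),
    }


# Example frame: thermocouple and cold junction both at 25 degC, no fault.
reading = decode_max31855(0x01901900)
```

A controller would typically compare `thermocouple_c` against the current setpoint of the reflow profile and switch the SSR accordingly, refusing to heat whenever `fault` is set.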

See the graph: it's a plot of temperature vs time. Sure, it isn't perfect, but it's good enough for amateur work.

EuroClojure 2012 impressions

Jan Rychter — Mon, 28 May 2012 11:12:57 +0200

I just got back from the EuroClojure 2012 conference. I won't try to summarize all the talks here, just convey some general impressions that I got:

The Clojure community is awesome. People are incredibly nice to each other. Ask a question on the mailing list, and you'll get a number of replies from people much smarter and more experienced than you. Same at the conference: you can approach anyone and expect to get great advice even if the competence gap between you and the other person is comparable in size to the Grand Canyon.
A lot of the talks focused on how to approach difficult and complex real-life problems better. Those weren't talks about syntactic sugar, "best practices", or new "features" the language should have. Instead, speakers presented results of months of thinking and experimenting: new architectures, new approaches, new ways to think about problems. If semicolons were discussed, it was a discussion about how to preserve them while doing source-code transformations. This is incredibly important: you can't overestimate the value of listening to smart people talk about ideas they thought hard about and developed for many months. I had several "aha!" moments where I suddenly saw that the architectures I developed were a poor-man's subset of a more general solution.
There was a focus on building real systems. I'd say that hobbyists and academics were in the minority: most people were there to learn how to better build code that makes money. It was also interesting to see the range of sizes of companies that use Clojure: from one-person consultancies through startups and small web-development shops, all the way through large financial institutions.
The median age of participants was probably between 35 and 40 years. This is clearly not a bunch of teenagers, but rather a group that gained significant experience in various languages and then moved on to Clojure. I think this has a lot to do with my first point — the community is both incredibly nice and mature, which often go together.

Overall, the conference was a success, more so than I expected. The organization was nearly flawless (not an easy task, I know, so hats off to Marco Abis). And it is now very clear that a European conference about Clojure is necessary. I'm looking forward to EuroClojure 2013!

Fixing a Tektronix 2246A oscilloscope

Jan Rychter — Thu, 26 Jan 2012 12:13:40 +0100

I bought an (obviously used) Tektronix 2246A oscilloscope at an auction site. It wasn't expensive, it looked fairly good and the seller was nice and provided start-up warranty.

After getting the scope it turned out that channel 3 was dead. Channels 1, 2 and 4 worked fine, but channel 3 would just draw a flat line, no signals registered.

I negotiated a discount with the seller and set out to repair the thing. Thank heavens for the extremely detailed service manuals! The manual for the 2246A Mod A wasn't difficult to find. I opened the enclosure and, after performing the standard self-test procedures suggested by the manual, started following the channel 3 signal path with an old scope I had.

Tektronix 2246A main board

I quickly discovered that the input signal was fine right until it entered the preamp IC (U230), a custom Tektronix chip. The inputs were fine, the output was flat. I was rather disappointed, as a failed IC would mean I'd need to find a replacement, which would be neither easy nor inexpensive.

But -- I then supplied the same signal to channel 4 and started comparing the two paths. According to the schematics, they should be identical. When I got to the preamp IC for channel 4, I started comparing the preamp ICs for both channels pin by pin. And… it turned out that the enable signal for U230 wasn't there!

One of these four solder joints is unlike the others...

The enable signals are generated in the U600 ("slow logic ic") chip, at the back of the main board. This is where things got weird, as the service manual I had diverged from the board I had in front of me. Not sure why, but clearly my main board was very different, with U600 and U602 placed in completely different locations and rotated. But, with some detective work I managed to figure out which chip is the U600 "slow logic ic". And the enable signal for channel 3 preamp was clearly there on the chip's pin. So I started following the signal and bingo! One of the resistors on the CH 3 EN signal path wasn't soldered at all!

Now, I'm not sure how this is possible -- would a cold solder joint produce such an effect? But clearly the connection was broken. Fortunately, this was easily fixable with a soldering iron, and a couple of minutes later I had my scope with all 4 channels working just fine.

Given that I paid about $200 for the scope, I'm quite happy with the end result.

[Posts like this one are written for search engines: one day someone might be looking for repair tips and find this page. I hope it helps someone.]

Making your Targus Bluetooth Presenter actually usable

Jan Rychter — Tue, 08 Mar 2011 23:12:11 +0100

[Updated 10.10.2014]

Here's a tip that will come in useful if you'd like to use a Targus Bluetooth Presenter (AMP11US or AMP11EU) with your Mac.

It seems that the Targus wireless presenter remote (an otherwise nice device) was designed by a committee of morons, none of whom ever actually gave a Keynote presentation.

Apparently someone at Targus said that the buttons are supposed to be for "Next Slide" and "Previous Slide", which other people took literally, so the buttons just jump over to the next slide, skipping any builds or transitions that you might have in place. All you can have is flat slides. Goodbye builds, goodbye special effects, goodbye bullet points, goodbye movies. The buttons generate "Shift-CursorDown" and "Shift-CursorUp", forcibly skipping over anything that isn't a full slide.

Am I being unnecessarily harsh calling the designers "morons"? I don't think so. If you design a device whose only purpose is to facilitate presenting, and then you create a version specifically for the Mac (I quote from the Targus web page: "the only wireless presenter dedicated to Mac users") to be used with Keynote — is it too much to ask that you design it so that the two keys on the device actually perform useful functions? I mean, seriously — two keys, next step and previous step, how hard is that?

It's also rather clear that most "reviews" that you can find online are junk and the "reviewers" haven't actually used the device to perform presentations.

Fortunately, there is a solution. There is a small, free utility called KeyRemap4MacBook. (UPDATE 10.10.2014: the utility has since been renamed to Karabiner). Download it, install it (requires a restart), then go into your Mac OS X Preferences, access the KeyRemapper panel, and from within its last settings pane access the private.xml file that stores custom key mappings.

Once you get there, enter the following:

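<?xml version="1.0"?>
<root>
  <item>
    <name>Targus Wireless Presenter Keynote Fix</name>
    <identifier>private.targus_wireless_presenter_keynote_fix</identifier>
    <autogen>--KeyToKey-- KeyCode::CURSOR_DOWN, VK_SHIFT, KeyCode::CURSOR_RIGHT</autogen>
    <autogen>--KeyToKey-- KeyCode::CURSOR_UP, VK_SHIFT, KeyCode::CURSOR_LEFT</autogen>
  </item>
</root>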

Save the file, go back to the first pane of the KeyRemap configuration and click "Reload XML". You might also want to check the box that says "Don't remap an internal keyboard".

And there you go: this remaps the useless key combinations that the Targus Presenter generates to simple "cursor right" and "cursor left", which do the right thing in Keynote.

Enjoy!

Check your provider's spam reputation before signing up

Jan Rychter — Tue, 30 Nov 2010 17:12:01 +0100

Before you choose a hosting provider, always check their reputation with Spamhaus.

I rent virtual servers with Bluemile (formerly also Fivebean). This morning I was greeted with bouncing E-mail, and a quick check showed:

Ref: SBL99441 68.68.16.0/20 is listed on the Spamhaus Block List (SBL) 30-Nov-2010 08:26 GMT | SR02

bluemilenetworks.com (escalation)

BLUEMILENETWORKS.COM ignores spam complaints, hosts spammers including known spam operations (ROKSO), assigns non-SBL'd IPs to spammers who get their assigned IPs listed in SBL, provides snowshoe spam configurations, fails to provide rwhois information as required by ARIN (thus providing anonymity for spammers) and generally acts like a network unconcerned with its mailing reputation. Spamhaus thereby treats it accordingly.

Here's a link to the current Spamhaus page, which might be different when you look it up. Notice that it's a whole /20 block that is being blocked, so I can't do anything about it!

I reported this to Bluemile support, who were completely unconcerned. They would deal with it once an engineer comes to work in the morning (business hours). Well, several hours have passed, then several business hours have passed, and there are no results to be seen. Meanwhile, almost all my outgoing e-mail keeps bouncing, which I have zero control over.

I am not a spammer. I take extensive care to make sure my servers never relay any E-mail. And yet here I am, listed in the SBL because I didn't check my provider's reputation.

I am really angry. And for those of you who suggest changing providers: sure, but moving mail and DNS servers is not that easy. It takes time and effort.

One thing is sure: the next time I look for a hosting provider, I will look up their IP ranges with Spamhaus (and possibly other lists) to see if they are a spammer haven. I don't want to have anything to do with providers that are.
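
For a quick check of a single address, a DNS blocklist can be queried directly: reverse the octets of the IPv4 address, append the list's zone, and look the name up. An answer in 127.0.0.0/8 means the address is listed; no such name means it is not. Below is a minimal sketch using only the Python standard library. The function names are mine, the choice of the `zen.spamhaus.org` zone is an assumption about which list you want to consult, and Spamhaus may refuse queries that arrive through large public resolvers, so treat the result as a hint rather than a verdict.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Return the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str = "zen.spamhaus.org") -> bool:
    """True if the blocklist answers with a listing code for this address."""
    try:
        answer = socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False  # no such name (or the lookup failed): treated as not listed
    # Assumption: answers in 127.255.255.x are error codes (for example a
    # blocked resolver), not listings, so they are not counted here.
    return answer.startswith("127.") and not answer.startswith("127.255.255.")


# An address inside the /20 from the listing above maps to this query name:
print(dnsbl_query_name("68.68.16.1"))  # 1.16.68.68.zen.spamhaus.org
```

Calling `is_listed()` on a few addresses from a provider's ranges before signing up takes seconds, which is rather less than the time it takes to move mail and DNS servers afterwards.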

Drobo and DroboShare — a review

Jan Rychter — Fri, 18 Jun 2010 11:02:28 +0200

Executive summary: don't buy it.

Convinced by people on podcasts (mostly TWiP and This Week in Tech) raving about how great the Drobo (from Data Robotics) storage device is, I decided to budget two into a project I'm working on. Expectations were high — Drobo marketing pushes the devices as easy to use, reliable and flexible. Being a Mac user, I expected an "Apple experience": plug it in and forget it's even there.

Nothing could be farther from the truth.

To begin with, the Drobo is Loud. Not just "loud", but REALLY LOUD. And it isn't the drives, it's the fan that cools the whole thing. To give you an idea of what I mean by Loud, one single Drobo with ultra-quiet WD Green drives spun down is louder than my 8-core Mac Pro with 4 drives and an army of fans in it. It's that loud. To make matters worse, the fan in the Drobo turns on very frequently, even when the drives have been spun down for hours. I don't know why, as the drives are very cool to the touch.

You won't want to have a Drobo under your desk, or anywhere in your vicinity, trust me. And that means the fancy fast FireWire-800 interface that you just paid for is pretty much useless. I used a DroboShare to set up my Drobo in a remote location where I can't hear it.

The DroboShare comes with Gigabit Ethernet, as the marketing will point out. What they won't point out is that it connects to your Drobo with a USB cable, which (together with SMB) pretty much limits your transfer speeds to about 5-8MB/s. That's about 6 times slower than when connected via FireWire-800.
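
To put those transfer speeds in perspective, here is a back-of-the-envelope sketch of how long a bulk copy takes. The 1TB data set is a hypothetical example, and the rates are assumed to be sustained for the whole copy.

```python
def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move `size_gb` gigabytes at a sustained rate,
    using decimal units (1 GB = 1000 MB)."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_gb * 1000.0 / rate_mb_per_s / 3600.0


# Hypothetical 1TB backup at the two ends of the observed range.
for rate in (5.0, 8.0):
    print(f"{rate:.0f} MB/s: {transfer_hours(1000, rate):.1f} h")
# prints:
# 5 MB/s: 55.6 h
# 8 MB/s: 34.7 h
```

In other words, a first full backup over the DroboShare is a matter of days, not hours.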

What you should also know is that using the DroboShare brings its own annoyances. As an example, I found it impossible to create a sparsebundle disk image for use with SuperDuper on the Drobo. Go figure. SMB introduces other annoying problems, too: I couldn't copy my music collection onto the Drobo, because some filenames had non-ASCII characters in them.

But all of the above are merely inconveniences. The real issue is with reliability. I bought the Drobo so that I can trust it with my data and forget about failing drives and losing data. Which is why I was slightly miffed when Drobo Dashboard kept crashing on me and reporting unreliable data, annoyed when it hung in the middle of the night when doing my first real backup, slightly angry when support told me my Drobo is defective and needs to be replaced, and really pissed off when the second unit I got corrupted my volume and lost data (when connected to a DroboShare). And then Data Robotics support asked me... whether I have a backup. Or a copy of DiskWarrior.

I have so far been through TWO Drobo replacements. Despite my asking, Data Robotics was unwilling to provide an upgraded (better) unit.

What's worse is that now I don't trust the Drobo at all. I looked closer: the DroboShare seems to use the plain Linux support for HFS+ that is known to be shaky. There is NO FSCK (Filesystem Check) program for HFS+ at all! Data Robotics will tell you that you can switch your Drobo between a Mac and DroboShare and you will be ok — but that seems to be exactly what resulted in my data corruption problems.

Then there is Data Robotics support. When you make "reliable data storage devices", you really need to have support that cares about customers, reads their emails and responds instantly. Responding after one business day is not enough. Given that support people will forget what was written before, or begin by asking what your address is and when you bought your Drobo, it will easily take a week before you get to the real issue.

What you should also realize is that when your Drobo unit fails, there is no way for you to read data off the drives. You need a working Drobo unit to do that, and it has to recognize the filesystem and mount it.

I bought a Drobo so that I can have reliable data storage without worrying about reliable data storage. The net effect was that I got an unreliable solution that I have to manage, worry about and spend time and money on. That's a failure in my book. I will never buy another Drobo unit again.

[... the above was drafted, and then 3 months passed ...]

So, today my volume (a Drobo mounted via a DroboShare) unexpectedly disappeared on my Mac. Investigation of the DroboShare logs shows:


MOUNT HFS+ : s_id = [sda1]
scsi: unknown opcode 0xea
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105544
Buffer I/O error on device sda1, logical block 566638188
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105552
Buffer I/O error on device sda1, logical block 566638189
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105560
Buffer I/O error on device sda1, logical block 566638190
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105568
Buffer I/O error on device sda1, logical block 566638191
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105576
Buffer I/O error on device sda1, logical block 566638192
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105584
Buffer I/O error on device sda1, logical block 566638193
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105592
Buffer I/O error on device sda1, logical block 566638194
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105600
Buffer I/O error on device sda1, logical block 566638195
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105608
Buffer I/O error on device sda1, logical block 566638196
usb 1-1: USB disconnect, address 2
SCSI error : <2 0 0 0> return code = 0x70000
end_request: I/O error, dev sda, sector 4533105616
Buffer I/O error on device sda1, logical block 566638197
[...]

Buffer I/O error on device sda1, logical block 270838
scsi2 (0:0): rejecting I/O to dead device
Buffer I/O error on device sda1, logical block 270838
scsi2 (0:0): rejecting I/O to dead device
Buffer I/O error on device sda1, logical block 276472
scsi2 (0:0): rejecting I/O to dead device
Buffer I/O error on device sda1, logical block 276472
scsi2 (0:0): rejecting I/O to dead device
Buffer I/O error on device sda1, logical block 422806275
Buffer I/O error on device sda1, logical block 422806276
Buffer I/O error on device sda1, logical block 422806277
scsi2 (0:0): rejecting I/O to dead device
scsi2 (0:0): rejecting I/O to dead device
scsi2 (0:0): rejecting I/O to dead device

Drobo Dashboard doesn't launch, and Console shows crash logs for the ddserviced daemon, which crashes every 10 seconds or so. Reinstalling Drobo Dashboard doesn't help.

I am so tired. I bought the Drobo so that I can save time, not so that I can run around and service it all the time, jumping through hoops set up by "support" from Data Robotics. I can already see how I'll have to spend several hours debugging the problems, dealing with support, reinstalling things.

I am posting this so that people are warned. Hopefully people will google for "Drobo" before buying it and I will save someone the hassle and frustration.

Will I lose data again this time?

Don't buy a Drobo.