Bunch of Yahoos

Having a great set of developer tools can help make your platform ubiquitous and loved. When Microsoft first launched its Developer Network, it revolutionized the way programmers got access to the company’s operating systems, tools, and documentation. Microsoft successfully migrated that set of resources to the web, and it remains invaluable for Windows developers. If you’ve ever set up access to a Google API, or deployed a set of EC2 resources on Amazon’s AWS cloud infrastructure, you know how impactful a clean, functional web interface is.

By the same token, a clunky, dysfunctional interface can make a platform loathed and avoided. Take Yahoo’s Developer Network and their BOSS premium APIs, for example.

We’re working on a system that needs to geolocate placenames in blocks of free text. This isn’t a trivial problem. There’s been a lot of work done on it, and we’ve explored most of it. During that exploration we wanted to try Yahoo’s PlaceSpotter API. It’s a pay service, but if it works well the cost could be reasonable, and just because we have built our system on free and open-source components doesn’t mean we won’t pay for something if it improves our business.

With that in mind I set out to test it, just as I had previously set out to test Google’s Places API. In that experiment I simply created a Google application under my user name, grabbed the creds, and wrote a Python wrapper in about five minutes to submit text queries and print out the results. That’s my idea of a test.

In order to test Yahoo’s PlaceSpotter I needed access to the BOSS API. To get access to the BOSS API I needed to create a developer account. Ok, that’s not an issue. I will happily create a developer account. Creating a developer account, it turns out, requires a bunch of personal info, including an active mobile number. Ok, I’ll do that too, albeit not quite as happily, because all I want to do is figure out if this thing is worth exploring.

I should note that there is a free way to get to the same Geo data that BOSS uses, and the same functionality, through YQL queries. Maybe I was shooting myself in the foot right from the beginning, but I had no experience with YQL, I needed to move quickly and make some decisions, and I just wanted an API I could fling HTTP requests at. Since the billing is per 1000 queries I had no problem paying for the first 1000 to test with. Not that big a deal.

After creating the account, during which I had to change the user name four times because of the cryptic message that it was “inappropriate” (no, I was not trying to use b1tch as a user name, or anything else objectionable), I finally ended up on a control panel-ish account dashboard. There I could retrieve my OAuth key (ugh) and other important stuff, and activate access to the BOSS API.

I clicked the button to activate the API, and the panel changed to display another button labelled “BOSS Setup.” Next to that was a red rectangle stating that access to BOSS was not enabled because billing had not been set up. It wasn’t obvious to me that in order to set up billing you have to click “BOSS Setup.” I assumed billing would be at the account level. Well, there are billing options at the account level! They’re not the ones you want, and unless the verbiage triggers some warnings, as it did for me, you might just go ahead and set up your credit card there, only to find it didn’t help.

Not to be deterred, I googled a bit and found that, indeed, I had to click “BOSS Setup.” It would have been nice if they had mentioned that in the red-colored billing alert. So I clicked, entered my login again because, you know, I was using the account control panel and so obviously might be an impostor, and ultimately found the place to enter my payment information. Once that was done, submitted, and authorized I received a confirmation and invoice in my email. Now, I could finally toss a few queries at the API.

Except no. When I returned to the account dashboard the same red-colored billing alert appeared. No access. I am a patient man, some of the time. Maybe their systems are busy handshaking. I waited. Nope. I waited some more. Nope. I gave up and waited overnight, and checked again this morning. Nope. Ok, dammit, I’ll click the “BOSS Setup” button again. I do that and what they show me is the confirmation page for my order again, with an active submit button. But wait… I got an invoice? Was I charged? Will I be charged again? Should I resubmit, or email Yahoo, or call my bank?

Maybe I should just not use the API. Oh, and did I mention that they have a “BOSS Setup” tutorial? It’s a download-only PDF. And 2/3 of it is about setting up ads.

Adding swap to an EBS-backed Ubuntu EC2 instance

Another one of those recipes I need to capture for future use. We were running a bunch of memory-intensive processes on a medium (m2) Ubuntu instance on EC2 yesterday, and things were not going well. It looked like some processes were dying and being restarted. Poking around in the kernel messages we came across several events reading:

"Out of memory: Kill process ... blah blah"

Well, damn. There is less than 4GB of usable RAM on a medium, but it should be swapping, right? Wrong. We checked free and there was no swap file. That was my screw-up, since I set up the instance and did not realize it had no swap. Turns out that what Amazon considers to be memory-constrained instances (smalls and micros, for example) get some swap in the default config, but apparently mediums and larges do not. We decided to rebuild the instance as a large to get more RAM and also another processing unit, and add some swap at the same time.

I wasn’t able to figure out how to configure swap space in the instance details prior to launch, so I searched around and put together this recipe for adding it post-launch. This applies to Ubuntu EBS-backed instances. It should work generally for any Debian-based distro, I think. If your instance is not EBS-backed you can use many of the same techniques, but you’ll have to figure out the deltas because we don’t use any non-EBS instances. Anyway, to the details.

EBS-backed instances have their root volume on EBS, which is what EBS-backed means. But you don’t want to put swap space on EBS. EBS use incurs I/O charges, and although they are very low, they aren’t nothing. If you were to locate swap on EBS then there would at least be some chance of some process going rogue and causing a lot of swapping and associated costs. Not good.

Fortunately, EBS-backed instances still get so-called “ephemeral” or “instance storage” at launch. Unlike EBS this storage is physically attached to the system, and also unlike EBS it goes away when the instance is stopped (but not when it is rebooted). That’s why it is called “ephemeral.” Something else that is ephemeral is the data stored in a swap file, so it seems like instance storage and swap files are made for each other. Good news: every EBS-backed instance on EC2 still gets ephemeral storage by default. It consists of either a 32GB SSD, or two 320GB magnetic disks, depending on your choices. For instances that do not get swap allocated by default, this storage is mounted at /mnt after startup (either the SSD, or the first of the magnetic disks; if you want the second disk you need to mount it manually).

So before you start make sure your config matches what I’ve described above, i.e. you have an EBS-backed instance with ephemeral storage mounted at /mnt (which you can confirm with the lsblk command), and no swap space allocated (which you can confirm with the swapon -s command).
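Those two checks can be rolled into a small preflight function if you want something scriptable. This is only a sketch: the names has_swap and preflight are mine, it assumes the mountpoint utility is installed, and it reads /proc/swaps, which is the same table that swapon -s prints.

```shell
# has_swap: succeeds (exit 0) if the swaps table lists at least one swap area.
# The table has one header line followed by one line per active swap area.
# The optional argument is a path to such a table; it defaults to /proc/swaps.
has_swap() {
    swaps=${1:-/proc/swaps}
    awk 'NR > 1 { found = 1 } END { exit !found }' "$swaps"
}

# preflight: check the two assumptions this recipe depends on.
# Hypothetical helper; /mnt is the mount point described in the text.
preflight() {
    if ! mountpoint -q /mnt; then
        echo "/mnt is not a mount point; look at lsblk" >&2
        return 1
    fi
    if has_swap; then
        echo "swap is already active; look at swapon -s" >&2
        return 1
    fi
    echo "ok: /mnt is mounted and no swap is active"
}

# Usage (on the instance):
#   preflight && echo "safe to continue"
```

If preflight complains, stop and work out how your instance differs before going any further.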

First thing to do is create a swapfile in /mnt. The one thing I am not going to do is opine on what size it should be, because there is a lot of info out there on the topic. In this example I made the swap 2GB.

# sudo dd if=/dev/zero of=/mnt/swapfile bs=1M count=2048

This command just writes two gigs’ worth of zeros to the file ‘swapfile’ on /mnt. Next make sure the permissions on this new file are set appropriately.

# sudo chown root:root /mnt/swapfile
# sudo chmod 600 /mnt/swapfile

Next use mkswap to turn the file full of zeros into an actual Linux swap file, and swapon to enable it.

# sudo mkswap /mnt/swapfile
# sudo swapon /mnt/swapfile

The first command formats the file as swap space (I actually have no idea what it does at a low level, so ‘format’ might be a wildly incorrect term to use), and the second sets it up as the system swap file. Now you should update fstab so that this swap space gets mounted and used when the system is rebooted. Use your preferred editor and add the following line to /etc/fstab:

/mnt/swapfile swap swap defaults 0 0
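If you would rather not hand-edit fstab, something like the following appends the line only when it is not already there, so running it twice does no harm. It is a sketch rather than a recommendation: add_swap_entry is a name I made up, and it has to run as root to write to /etc/fstab.

```shell
# add_swap_entry: append the swap line to an fstab file unless it is already present.
# The optional argument is the fstab path; it defaults to /etc/fstab.
add_swap_entry() {
    fstab=${1:-/etc/fstab}
    entry='/mnt/swapfile swap swap defaults 0 0'
    # -F: match a fixed string, -x: match the whole line, -q: no output.
    if ! grep -qxF "$entry" "$fstab"; then
        printf '%s\n' "$entry" >> "$fstab"
    fi
}

# Usage (as root):
#   add_swap_entry
```

Taking the path as an argument also means you can try it on a copy of fstab first.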

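Lastly, turn on swapping for everything listed in fstab. Swap areas that are already active, like the one we just enabled, are silently skipped.

# sudo swapon -a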

You can now use swapon -s to confirm that the swap space is in use, and the free or top commands to confirm the amount of swap space. One thing to note is that this swap space will not survive an instance stop/start. When the instance stops the ephemeral storage will be destroyed. On restart the changes made in fstab and elsewhere will still be there, because the system root is on EBS, but the actual file we created at /mnt/swapfile will be gone, and will need to be recreated.
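Since the file has to be recreated after every stop/start, it may be worth wrapping the steps above in a function that only does the work when the file is missing. Here is a minimal sketch, assuming the same path and size as in this example. The names run, ensure_swapfile, and DRY_RUN are my own inventions, and the DRY_RUN switch is only there so you can see what would be executed before letting it loose.

```shell
# run: execute a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# ensure_swapfile: recreate and enable the swap file if it does not exist.
# Arguments: path (default /mnt/swapfile) and size in MB (default 2048).
ensure_swapfile() {
    swapfile=${1:-/mnt/swapfile}
    size_mb=${2:-2048}
    if [ -e "$swapfile" ]; then
        echo "$swapfile already exists, nothing to do"
        return 0
    fi
    # Same steps as the recipe above, in the same order.
    run dd if=/dev/zero of="$swapfile" bs=1M count="$size_mb"
    run chown root:root "$swapfile"
    run chmod 600 "$swapfile"
    run mkswap "$swapfile"
    run swapon "$swapfile"
}

# Usage (as root):
#   DRY_RUN=1 ensure_swapfile    # show the commands only
#   ensure_swapfile              # actually create and enable the file
```

One way to use it would be to call ensure_swapfile as root from something that runs at boot, for example /etc/rc.local if your image still honors it. After a plain reboot the file is still there, so the function does nothing and the fstab entry takes care of the rest.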