Categories
Programming

Automating the hell out of it

Even before the 4-Hour Work Week made me more serious about this, I really enjoyed automating tasks that I would otherwise have to remember to do, or that would be troublesome to do by hand. This frees up a lot of time, keeps a bunch of problems away, and it is actually quite fun when the information comes to me instead of me going to it.

By now I have automated checking my bank account and credit card balances, updating the dynamic IP of a server, tracking ebook sales numbers, and synchronizing the network clock. I first summarize some general ideas, then give an intro to each of those scripts.

Banking script

Tools

Most of my scripts are written in bash, because it’s relatively straightforward to hammer out simple stuff, and it is surprisingly simple to do a lot of things once I have thought enough about a problem. The Advanced Bash-Scripting Guide is always on my reading list, but I usually get to check only the parts that are relevant to the given problem. You can get quite far with a few simple constructs.

The most common parts I seem to come across:

  • if-then-else constructs: if [ -d "directory" ]; then echo "Found!"; fi
  • for loops: for f in *.png; do optipng "$f"; done
  • loading the results of a command into a variable: VAR=$(command)

For most other problems with a little keyword-fu there’s always an answer on StackOverflow or on the web.

Another group of scripts uses Python, when a bit more data manipulation is needed, like web scraping or JSON parsing. Actually, all of the scripts could be rewritten in Python for consistency, and it would probably be simpler too, which is something for the future.

As a general tip, most of these scripts need tweaking, and all of them are sort of alpha-beta quality code. To facilitate hacking and reduce the heartache of mangled clever code, I keep everything in git repos. I share those repos online, so I have to make sure there are no secrets checked in, ever. It helps to use .gitignore strategically, keep the secrets in separate files, and include an example of how that secrets file should look inside.
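The secrets-file idea can be sketched in Python roughly like this. It is only an illustration, not code from any of the repos: the file name secrets.ini, the [bank] section and the key names are all made up for the example. The real file would be listed in .gitignore, while a template with placeholder values is checked in next to it.

```python
import configparser

# Content of the checked-in template (hypothetical name: secrets.ini.example).
# The real secrets.ini has the same layout and is listed in .gitignore.
EXAMPLE_SECRETS = "[bank]\nusername = your-login\npassword = your-password\n"


def load_secrets(path, section="bank"):
    """Read credentials from an INI-style file kept out of version control."""
    # interpolation=None so that a '%' inside a password is taken literally
    parser = configparser.ConfigParser(interpolation=None)
    with open(path, encoding="utf-8") as handle:
        parser.read_file(handle)
    return dict(parser[section])
```

A script would then call load_secrets("secrets.ini") at startup instead of carrying the login details in its own source or in the crontab line.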

Most of these scripts are run periodically by cron, so it is worth having some basic knowledge about how to schedule it.

Some scripts send me emails under specific circumstances (some after every run, some when new information appears), and for good delivery I have set up postfix to use Gmail as an SMTP relay. This way I’m sure to receive the emails and receive them quickly.

Scripts

These are the scripts I use most often and have used the longest. Still, many of them are under development, and I adjust them whenever I learn how to do things better. I list the links to all their repos, where they can be improved.

Banking account balances

My two main bank accounts are queried once a day for available balance and I’m notified by email. Both accounts needed quite a bit of web scraping (I got them done at two different OpenHack Taipei events). The banks’ websites are pretty awfully organized (iframes within iframes within iframes; no CSS classes or ids), though it doesn’t have to be good for me, it has to be good for the bank.

Cathay United Bank

The cathaycheck (click for repo) script queries the available balance at Cathay United Bank by logging in with curl, and parsing the final page with Beautiful Soup. The script can be a skeleton for any other website where one has to log in and then navigate over a series of pages to get the information. The required HTML variable names can be extracted with the help of the Inspect Element tools in Chrome.

At the moment the credentials are stored in the crontab command, which is not really ideal. I should rewrite it to use a secrets file, though given that it runs on a server where I’m the only user (and root), for me there’s no practical difference at the moment. I have set it up so that I receive an email at the end of the day with the current balance.

ANZ Taiwan credit card

The anzcheck (click for repo) script queries my spending with the ANZ Taiwan credit card. Again bash for logging in and Beautiful Soup for parsing the final page. It needs a bit more logic for extracting information from a table, because the website’s developers added no classes or ids to the items to make it easier to understand – or for them to style, but that’s not my problem.

I just recently updated it so that it extracts the spending items added to my balance on a given day, so I will never be caught by surprise again (hopefully). Since many of my charges go to companies that have Chinese names, I quickly ran into the problem of having to tell my Heirloom Mailx (that I use to send emails on my ArchLinux box) that the text I want to mail is plain text, not an attachment. With some hacking, the solution was to add a few more parameters to “mail” so it knows that the text is UTF-8. From “sendthatmail.sh” in the repo, the parameters needed are:
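To give a feel for pulling data out of a table that has no classes or ids, here is a minimal sketch. Note that it swaps Beautiful Soup for the standard library’s html.parser so that it runs without extra packages, and that the sample table, its columns and its values are invented for the illustration; the real page layout is not shown here.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html):
    """Return the table content as a list of rows (lists of strings)."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample, only to show the shape of the result
SAMPLE = """<table>
<tr><td>Date</td><td>Merchant</td><td>Amount</td></tr>
<tr><td>2013/07/01</td><td>全聯福利中心</td><td>1,250</td></tr>
</table>"""
```

With no class names to hook into, the position of a cell within its row is the only handle, so the rest of the logic ends up being things like “the third cell of every row after the header is the amount”.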

-S sendcharsets=utf-8 -S ttycharset=utf-8 -S encoding=8bit
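If the mailing part ever gets ported to Python, the same intent (plain text body, UTF-8, sent as 8bit) can be expressed with the standard email package. This is just a sketch of that idea, not what sendthatmail.sh does; the function name and the addresses used with it are placeholders.

```python
from email.message import EmailMessage


def build_report(subject, body, sender, recipient):
    """Build a plain-text message whose body is declared as UTF-8, 8bit."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # Same idea as the mailx parameters: UTF-8 charset, 8bit transfer encoding
    msg.set_content(body, charset="utf-8", cte="8bit")
    return msg
```

The resulting message could then be handed to smtplib, or piped to the local sendmail binary, whichever the box already uses.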

I could still extract some more information from the bank’s website, though nothing really urgent.

No-IP address updater

At the Taipei Hackerspace we have a handful of servers running, but the residential internet connection provided by Chunghwa Telecom only gives us a dynamic IP address. Applying for a static IP seems to be pretty troublesome, so in the meantime I’m using a script on one of the servers to update the IP address associated with our dynamic tpehack.no-ip.biz address.

The no-ip-bash-updater (click for repo) script is forked originally from elsewhere, but I have rewritten it quite a bit so that it

  • needs no extra file to store the current IP address, but compares external IP with a DNS query
  • stores no secrets in the file

It uses a pretty straightforward API call with HTTP authentication, the only real logic in there is to check when that call actually needs to be made.
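The “only call when needed” logic can be sketched like this. The actual script is bash; this Python version is only meant to show the comparison, and the function names are hypothetical. The update call itself is passed in as a function, so nothing about the provider’s API is assumed here.

```python
import socket


def resolve_host(hostname):
    """Return the IPv4 address that DNS currently reports for hostname."""
    return socket.gethostbyname(hostname)


def update_if_changed(external_ip, dns_ip, send_update):
    """Call send_update(external_ip) only when the DNS record is out of date.

    Returns True if an update was sent, False if DNS was already correct.
    """
    if external_ip == dns_ip:
        return False
    send_update(external_ip)
    return True
```

Because the DNS answer itself acts as the stored state, there is no extra file that could get out of sync with reality.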

E-book sales

Recently I helped a friend to publish an ebook version of How to Start a Business in Taiwan on Leanpub, and of course I want to know when any sales are made (disclaimer: I don’t get a cut of the sales, all goes to the author). The leanpubsales (click for repo) script is written in Python, because using JSON there is easier than it would be with bash. The call is otherwise quite simple: just keep an external file around to check whether the sales number has increased or not, and if yes, send an email. To send an email conditional on the output of the script, the “ifne” command from moreutils is very useful (meaning: “if input is not empty”).

The query is run periodically, and it is lovely to receive the results. I will surely set up a script when I get my own book ideas published on Leanpub.
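The “external file plus empty output” pattern looks roughly like the sketch below. It is not the code from the repo: the function name, the state file and the message wording are made up, and fetching the current total from Leanpub is left out entirely.

```python
from pathlib import Path


def report_new_sales(current_total, state_file):
    """Return a message if sales went up since the last run, else ''.

    The last seen total is kept in state_file between runs.
    """
    path = Path(state_file)
    previous = int(path.read_text().strip()) if path.exists() else 0
    if current_total <= previous:
        return ""  # nothing to print, so ifne will not send a mail
    path.write_text(str(current_total))
    return f"New sales: {current_total - previous} (total {current_total})"
```

The script prints whatever this returns, and ifne only passes it on to the mail command when there was something to print.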

RTC correction

As a physicist in atomic physics, which is the area of science very much concerned with keeping precise time, I keep all my servers’ times synchronized with network time protocol (NTP) using chrony. One difficulty is that the real-time clock (RTC) of those computers is pretty crappy and drifts away. It wouldn’t be a problem if I never restarted them, but it is a pain if I do: after a restart the clock can be tens of seconds off until the time is synchronized again.

Chrony can sync NTP and the RTC, but it doesn’t do that automatically, so I would have to trigger it manually. Instead I have written up an rtccorrect (click for repo) script that is run every 2 hours or so (could be done just once a day, actually), and eliminates the drift of the RTC.

Server backup

For backing up data between servers rsync has proven invaluable. I have a couple of scripts that do just that, though those are among my oldest ones and at that time I hadn’t separated out personal information (way too easy to inline every credential, email, login, and all that), so I need to sanitize that. A couple of ideas about these backup scripts:

  • sometimes higher transfer speed can be achieved by messing with the ssh algorithms, e.g. passing "-e 'ssh -c arcfour'" to rsync
  • more often there’s even better performance when there’s an rsync daemon running on the remote computer (though with Raspberry Pi, both cases are still frustratingly slow)
  • can exclude some files if there is no need to transfer them, e.g. "--filter='- *.part'"
  • when using rsync not just to transfer but to mirror, "--delete" (delete at target if it doesn’t exist at origin) and "--archive" are pretty useful
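Put together, the options above could be assembled like this if the backup were driven from Python. This is only a sketch with a made-up helper name; it builds the argument list and leaves the actual running to subprocess.run.

```python
def build_rsync_command(source, target, mirror=False, excludes=(), ssh_cipher=None):
    """Assemble an rsync argument list from the options discussed above."""
    command = ["rsync", "--archive"]
    if mirror:
        # remove files at the target that no longer exist at the origin
        command.append("--delete")
    for pattern in excludes:
        # "- pattern" is an exclude rule in rsync's filter syntax
        command.append(f"--filter=- {pattern}")
    if ssh_cipher:
        command += ["-e", f"ssh -c {ssh_cipher}"]
    command += [source, target]
    return command
```

Passing the result to subprocess.run as a list sidesteps the nested shell quoting that the command-line versions need.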

For these backups I also use the Dead Man’s Snitch to know when things didn’t work out, e.g. by having a command like this in the cron list, where backup.sh is my script’s name and xxxxxxx is the snitch ID from my account:

backup.sh && curl -s https://nosnch.in/xxxxxxx > /dev/null

This way I got to know when my backup server was dying all the time because of a bad heatsink, or when my host server went down because of a flaky hosting company.
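The same “ping only on success” idea as the cron line above, sketched in Python. The helper names are made up; the ping function is a parameter, so the check-in URL is whatever the snitch service gave you.

```python
import subprocess
import urllib.request


def ping_url(url):
    """Default check-in: a plain GET request, response discarded."""
    with urllib.request.urlopen(url, timeout=30) as response:
        response.read()


def run_and_report(command, snitch_url, ping=ping_url):
    """Run command; check in at snitch_url only if it exited with status 0."""
    result = subprocess.run(command)
    if result.returncode != 0:
        return False  # stay silent, the missing check-in raises the alarm
    ping(snitch_url)
    return True
```

The important design point is that failure means doing nothing: the alert comes from the check-in not arriving, which also covers the case when the whole machine is down.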

Afterword

I guess there will be just more automation in the future, and maybe many of these scripts can be ported onto a common base so new ones are made much easier. What else do you guys automate?

Categories
Maker Programming

Electronic check-in at the Taipei Hackerspace

One issue we have frequently at the Taipei Hackerspace is that people don’t know when we are open. Our basic rule is simple: whenever a keyholder member is in the Hackerspace, anyone/everyone can come. In practice people never really know if anyone’s there.

They could give a call to the space, or even send an email to the mailing list, while the people I know usually end up asking me directly – hey, anyone’s at the space at the moment? Since I don’t always know the answer, the search was on for a better – maybe more technological or hackish solution: let’s build an electronic check-in/out system that will show the current status on our website, so people can check right there.

I had the following idea in the back of my mind for a few weeks and even got the hardware acquired, but one of our co-founders had to call me out by name on the mailing list a few days ago for me to swing into action. So now here it is, kinda working, ready for real usage.

The main idea is that in Taipei pretty much everyone has an EasyCard, a 13.56MHz RFID card that is used for all public transport in the city and a lot more. The RC522 card-antenna module seems to be able to read the card pretty well, and all I need to get off it is the ID number, which is pretty straightforward (after digging the Arduino forums for source code).

The project in a nutshell is:

  • Use Arduino Mega with an RC522 board to get the ID number of a given EasyCard
  • Use switches to get whether the person is checking in or out
  • Use LEDs to provide some feedback and basic user interface for the hardware
  • Node.js server to communicate with the Arduino, interface the check-in/out database, and provide API and realtime access to the data
  • Create a bit of interface on the website to display the check-in status

Now let me dig into the different parts in detail.

RFID

The RC522 module has 8 pins, and Arduino can use the SPI library to communicate with it. I used Arduino Mega ADK, because the SPI pins are conveniently accessible, unlike e.g. the Leonardo, for which I would have had to make some new cables or headers. The RC522(pin number)->Mega(pin number) connections are done such that:

  • SDA(1) -> SS(53)
  • SCK(2) -> SCK(52)
  • MOSI(3) -> MOSI(51)
  • MISO(4) -> MISO(50)
  • (5) not connected
  • GND(6) -> GND
  • RST(7) -> (any digital pin)
  • +3.3V(8) -> +3.3V
Photo of the electronics
RFID-RC522, with blank card and pins

The source code to talk to the card is from a blog, and originally from a tech shop in China, I guess (based on the big bunch of Simplified Chinese comments).

Switches and Visual Feedback

I wanted to make as simple an interface for the card reader as possible, so I added this pair of switches and LEDs (D1 being green, and D2 being red). After the Arduino receives a card ID from the reader, the LEDs are blinked to prompt people to press either the Check In or the Check Out button. If they press either of them, the corresponding LED is blinked very brightly for a bit, and the card ID and check-in/out event are sent to the connected computer via serial connection.

The (very basic) circuit for the check-in/out buttons and visual feedback LEDs. “Pins” refer to the Arduino pins used in the current version

If no button press occurs within 10 seconds or so, the reading is discarded and the card reader goes back to listening mode.
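The real logic lives in the Arduino sketch, but the flow of “wait for a button, give up after about 10 seconds” can be illustrated in a few lines of Python. Everything here is a hypothetical stand-in: read_button represents polling the two switches and returns "in", "out" or None.

```python
import time


def wait_for_choice(read_button, timeout=10.0, clock=time.monotonic, pause=time.sleep):
    """Poll read_button until it returns 'in' or 'out', or until timeout.

    Returns the choice, or None if nothing was pressed in time, in which
    case the card reading should be discarded.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        choice = read_button()
        if choice in ("in", "out"):
            return choice
        pause(0.01)  # short wait between polls
    return None
```

The clock and pause arguments are only there so the timeout can be tried out without actually waiting ten seconds.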

Webserver

Node.js is very useful to make quick web services, and its library support is not too bad at all, although it’s not all smooth sailing: the documentation is often scarce at best. Nevertheless it was the fastest one to get things up and running, since I had used almost all the required components before.

The server communicates with the Arduino via the serialport library. I’m more used to Python’s pyserial, though in this case it was very handy that serialport can emit read events, thus the server can just wait until there’s something to read and run some functions on the incoming data. In my experience, serialport wouldn’t be good for every corner case I came across in serial-land, but in this setup it works beautifully.

I chose SQLite3 to store the data, using the sqlite3 library. There are a bunch of others, had to look around which one is still being developed. This particular library is not too bad, though I found myself fighting the lack of documentation and asynchronicity quite a bit. The resulting code is pretty ugly I’m sure, in some places inefficient because I didn’t know how to get to the result I wanted in a less roundabout way, still it seems to work and that is what matters for a prototype.
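For illustration, here is how the check-in/out bookkeeping could look with Python’s built-in sqlite3 module. The real server is Node.js and its schema is not described here, so the table layout, column names and function names below are all assumptions made for the sketch.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and make sure the (hypothetical) events table exists."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " card_id TEXT NOT NULL,"
        " action TEXT NOT NULL CHECK (action IN ('in', 'out')),"
        " created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return db


def record(db, card_id, action):
    """Store one check-in ('in') or check-out ('out') event."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO events (card_id, action) VALUES (?, ?)",
            (card_id, action),
        )


def checked_in(db):
    """Return the card IDs whose most recent event is a check-in."""
    rows = db.execute(
        "SELECT card_id FROM events AS e"
        " WHERE id = (SELECT MAX(id) FROM events WHERE card_id = e.card_id)"
        " AND action = 'in'"
        " ORDER BY card_id"
    )
    return [row[0] for row in rows]
```

Keeping every event and deriving the current state with a query means the history stays available for later, at the cost of a slightly more roundabout “who is here now” lookup.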

First I made a simple REST API to query the currently checked-in people, and later added (real-time) push updates via socket.io, to make it nicer. It’s brilliant that without any polling, all clients can be updated once someone signs in or out.

Since this code is running on a different computer than our main web server, I had to play around with the Access-Control-Allow-Origin header, and adjust the settings of our router to make it accessible from the web correctly.

Tried to add a pretty-much self-contained script that the front-end can load, and it handles everything, just need an appropriate HTML span or div element to display the information.

Photo showing the circuit used for the check-in system
Hardware setup for checking in/out: Arduino Mega, RFID-RC522 circuit, and some switches and LEDs.

The result is pretty good, as long as the card-reader does not crash. Originally the results were displayed in a table, but wanted to make it more human, so here’s the format I ended up with:

Website screenshot showing two people checked in
Screenshot of the homepage with one particular check-in situation.

There can also be people with no name, they just show up something like “Right now there are three people checked in the Hackerspace: Greg, and two other people.”
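The sentence building can be sketched as below. Only the three-people example quoted above comes from the actual site; the wording for the other cases (nobody there, a single person, only unnamed cards) is my guess at a reasonable phrasing, and the function names are made up.

```python
NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"]


def count_word(number):
    """Spell out small numbers, fall back to digits for larger ones."""
    return NUMBER_WORDS[number] if number < len(NUMBER_WORDS) else str(number)


def status_sentence(names):
    """names has one entry per checked-in card; None marks a card with no name."""
    total = len(names)
    if total == 0:
        return "Right now nobody is checked in the Hackerspace."
    named = [name for name in names if name]
    anonymous = total - len(named)
    verb, noun = ("is", "person") if total == 1 else ("are", "people")
    sentence = f"Right now there {verb} {count_word(total)} {noun} checked in the Hackerspace"
    if not named:
        return sentence + "."
    parts = list(named)
    if anonymous:
        other = "person" if anonymous == 1 else "people"
        parts.append(f"{count_word(anonymous)} other {other}")
    if len(parts) == 1:
        listing = parts[0]
    else:
        listing = ", ".join(parts[:-1]) + ", and " + parts[-1]
    return f"{sentence}: {listing}."
```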

It lives!

Here’s a quick demonstration video of how it works:

So you can check out our website at http://tpehack.no-ip.biz/ for the live results, and if you are in the neighbourhood, drop in when there’s anyone in the ‘space.

The whole source code is shared in a Github repository: the Arduino sketch, the server script, and any additional files. I’m sure there are a lot of things that could be improved about it.

Categories
Programming Taiwan

Barometric recording of Typhoon Soulik

It all started a few weeks ago with Sparkfun having a “20%-off” day, when I got myself (among other things) a BMP085 barometric pressure sensor. When it arrived, I soldered some pins on it, and set it up with an Arduino Nano to get the readings off it easily.

View of the circuit
BMP085 barometric pressure sensor breakout board from Sparkfun

Originally all I wanted was just some laid-back pressure recording, so maybe I could use that to predict the weather a bit. “Pressure falls: bad weather comes, pressure rises: things will clear up”. I was recording for about a week, and nothing really noteworthy came out of that.

Then came the news that the year’s first typhoon was on the way to Taiwan, and it was supposed to be a big one. It was obvious that I would try to record the barometric pressure pattern of its passing, but I wanted to make it more interesting and informative, more visual than just the time series plot of pressures.

The Japan Meteorological Agency (JMA) is a good place to watch for information about typhoons. They list the path prediction, typhoon properties like strength, wind speeds, and central pressure, and they have satellite imagery. Putting these together, two days before the typhoon arrived, I set up a script to download the satellite imagery as it became available.

Satellite picture of Typhoon Soulik and location of Taiwan on 2013-07-12 morning
The morning before the typhoon arrived

The JMA usually publishes 2 satellite images an hour for our North Western Quadrant (at :00 and :30); one of them covers the whole area, the other covers just the top 80% or so, leaving a dark band on the bottom. Nevertheless, matching up the pressure reading with the satellite pictures would be a good little project for this time.

Friday came and the government gave the afternoon off, though it turned out that no landfall happened until everyone was supposed to be off anyway; there was just a bit of on-and-off rain. People stocked up on convenience store food (I now have a good supply of instant noodles :) and water, taped over their glass windows, and took in their plants and BBQ equipment from outside – well, those who had planned ahead.

Around 10pm the big rain arrived; here’s a video of how it looked from my window. I went to sleep later, and got woken up around 3:30am by the rain having changed into pretty darn big wind. Here’s another video of the violent part of the typhoon at that time in the morning, which doesn’t even really do it justice. The houses around here are pretty tall, and I wonder if they protected us from the wind, or acted as artificial canyons channeling it. Some things got broken, though not as much as I expected – which is a very good thing.

In the meantime, by the power of the Internet, I checked how the pressure reading was going a few miles away in the Taipei Hackerspace, where I had left the barometric pressure sensor (the geolocation is 25.052993,121.516981).

Here’s the entire recording of the approximately 2 days of typhoon. It was pretty okay weather at the start and end of the plot.

Plot of pressure readings
Pressure reading during the passing of Typhoon Soulik, recorded at the Taipei Hackerspace

The readings have been corrected to sea level (from about 20m height, where the Taipei Hackerspace is), should be good within 1hPa or less.
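For reference, a common way to do such a correction is the international barometric formula, p0 = p / (1 - h/44330)^5.255, with h in metres. I am not claiming this is exactly what the plotting script does; the sketch below just shows how big the correction is at about 20m.

```python
def sea_level_pressure(pressure_hpa, altitude_m):
    """Convert a pressure measured at altitude_m (metres) to sea level.

    Uses the international barometric formula; result is in the same
    unit as the input (hPa here).
    """
    return pressure_hpa / (1.0 - altitude_m / 44330.0) ** 5.255
```

At 20m this adds roughly 2.4hPa to a 1000hPa reading, so a couple of metres of uncertainty in the height only shifts the result by a fraction of a hPa.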

The pressure was indeed dropping like a rock, and the dip on the graph coincided with the most violent wind that woke me up. According to the JMA, the central area of the typhoon had pressures down to 950hPa. With readings below 958hPa here, the core must have passed pretty close, though probably not directly overhead, as the pressure didn’t stay down there for long.

I made a video syncing up the pressure reading and the satellite picture. The red dot on the video marks the recording location. (Watching it in full screen and HD makes it clearer.)
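The syncing step boils down to finding, for each satellite image timestamp, the pressure reading closest in time. A minimal sketch of that lookup is below; it is not the script from the gist, and the function name is made up.

```python
from bisect import bisect_left


def nearest_reading(times, values, moment):
    """Return the value whose timestamp is closest to moment.

    times must be sorted ascending and non-empty, with values[i]
    recorded at times[i].
    """
    index = bisect_left(times, moment)
    if index == 0:
        return values[0]
    if index == len(times):
        return values[-1]
    before, after = times[index - 1], times[index]
    # pick whichever neighbour is closer in time
    if after - moment < moment - before:
        return values[index]
    return values[index - 1]
```

Since the pressure log is much denser than the two images per hour, every frame gets a reading taken within a short time of the picture.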

I wonder what the flat part in the readings was while the typhoon was leaving. Maybe a sign of it changing direction, by the look of it.

Either way, this was fun to do, and I am glad that only a few people got hurt here, much fewer than even during the less powerful typhoons. Maybe getting people scared a little (like with this “super typhoon” stuff that went on) helps them keep safe? Just don’t use it too often.

Extra material

I put almost all material used here into a gist: the satellite imagery download script, the plotting, the movie frame generation, the movie generation script, and the complete barometric recording. Because this last part is pretty big (5MB), Github truncated the rest of the scripts. I guess it’s still okay to check it out. I will add the Arduino sketch to read the sensor and the logging script later.

The satellite imagery weighs about 60MB, so I won’t put it online, but if anyone wants it, let me know.

Keep safe!

Categories
Programming

I got my Pebble, now let’s play

I just received my Pebble smartwatch yesterday, a bit more than a year after the Kickstarter campaign ended. Its pitch is being a watch with an e-paper display, Bluetooth communication with Android and iOS, a Cortex-M3 ARM processor, and the ability to run programs that change the watch’s look (“watchfaces”) or provide extra functionality (“watchapps”).

I had almost forgotten that I supported it, until my friends on Facebook started to receive theirs, and I recently ran into a person wearing one (the red coloured one; mine is gray and that was manufactured later). Even if I’m not really a wristwatch person – haven’t had one for years – I like watches in general, and clever new tech is always welcome with me. (I wonder if any of these will be in pocket watch form some day.)

Pebble watch showing the time, 2:45
Pebble watch in action. Or just chilling and showing the time anyways.

Setting it up was pretty easy, the Pebble app on Android does pretty much everything. Setting up notifications for emails, Facebook messages, calls and SMS is one of the first things to do. Until I get overwhelmed by them and turn them off, probably. The app also sets things up such that when my phone downloads a file with the right extension (.pbw), it is automatically sent to the connected Pebble. The updates are pretty easy and quick too. I got a bunch of different watchfaces from My Pebble Faces, an unofficial repository that has much more than the official one. I maxed out around 10 or so, then had to use the app on my phone to remove some of them. Easy, though now I really feel it all depends on the phone I use as well: the watch is mostly as clever as the accompanying phone app.

Fooled around with it a bit more, checking the alarms, the automatic Runkeeper integration (pretty neat, if only I could switch from “pace” display to “speed” for the watch), some small watchapps. In the case of watchfaces, I switched back to the original “Text Watch” face, which is pretty neat and unique for non-smart-watches. If only it could be modified to write things like “ten oh seven” for “10:07” instead of “ten seven”, it would feel more natural. There’s a watchface called Words o’Date that does that, except that it also adds the date, making the front look a bit too crowded for me. So keeping the original at the moment.

With this playing around I noticed one possible strength of the display: almost everything can be animated, and that animation is pretty smooth. [Update: it’s actually e-paper, not e-ink, so my bad, the following comparison is not really valid. Thanks for the comment.] From my limited experience with e-ink in the form of reading on my Kindle Touch, Pebble is really agile. Looking in the Software Development Kit (SDK) documentation, there is a lot of functionality dedicated to animation.

After the whole day of usage I was thinking that when I was young(er), I would start hacking on interesting things right away, would have a lot of ideas, put in loads of effort, and oftentimes stay up late to make something new work. These days it’s less like that, even if I have many more toys lying around waiting to be hacked on. Around dinner time I decided that it cannot continue like that, so before going to sleep, I will have my very own watchapp done.

Magic 8-Ball

Didn’t have many ideas, though, and haven’t looked yet too much, just wanted something simple. A Magic 8-Ball came up as a possibility: an app that gives you answers to your yes/no questions. It’s pretty easy in the core: a bunch of stock answers (20 in the original case), a random number generator to choose from them, and display it. Can do just bare text for the first time.
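The core really is that small. The watchapp itself is written in C against the Pebble SDK; the Python sketch below only illustrates the logic (stock answers plus a random pick), using the 20 answers of the classic toy. The function name is made up.

```python
import random

# The 20 stock answers of the classic Magic 8-Ball
ANSWERS = [
    "It is certain", "It is decidedly so", "Without a doubt",
    "Yes definitely", "You may rely on it", "As I see it, yes",
    "Most likely", "Outlook good", "Yes", "Signs point to yes",
    "Reply hazy, try again", "Ask again later", "Better not tell you now",
    "Cannot predict now", "Concentrate and ask again",
    "Don't count on it", "My reply is no", "My sources say no",
    "Outlook not so good", "Very doubtful",
]


def shake(rng=random):
    """Pick one of the stock answers at random."""
    return rng.choice(ANSWERS)
```

Passing in a seeded random.Random makes the “random” answer repeatable, which is handy when checking the display code.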

To set up my Linux box for the development, I got the latest Pebblekit from Github, which is the SDK and examples and all the tools together. Unfortunately everything inside relies on “python” being Python 2.x, and on my ArchLinux this is not the case. The quickest workaround I found was to use Python’s virtualenv. I also needed a different version of GCC that compiles for the Pebble’s ARM processor, the arm-none-eabi-gcc, which is fortunately in the ArchLinux User Repository (AUR). It just took a looooong time to compile. These two steps put me right into business, and I was able to compile the example watchapps and watchfaces.

I don’t remember much of my C skills (never had that much), and they are mostly kept alive just by Arduino programming. Fortunately the Magic 8-Ball is a simple enough program that by looking at the examples, Stack Overflow, and Github, I could find all the pieces I needed (array of strings, random number generator in C, text display).

The Magic 8-Ball app is showing its result to my question: ask again later.
My first watchapp, the Magic 8-Ball

By around 1am I had a working version, and it’s totally fine. It’s even fun, even if very simple. People can get it directly from its page on My Pebble Faces, and apparently while I was writing this, 3 people already did, sweet!

In the future, I would like to keep up this creative hacking. For the Magic 8-Ball I could include some graphics, maybe neater transitions, make it less an “example” app and more an app. It’s really cool, that so many of the uploaded watchapps and watchfaces on My Pebble Faces also share their source code (50% of the 500 items, can filter for it in the search form), so I can learn from that. How much better Android apps would be if they’d do the same? My code is also on Github, naturally, in the magic8 repo.

For Pebble in general, communicating with the phone over Bluetooth can be very useful, if a useful Android app complements it. Smart clothing and accessories are just going to be more prevalent. There are already “next generation” smartwatches out there, like the Agent, that will be cool in a different way (I haven’t supported it yet, they got 10x funding anyways).

Also, I think I should put some effort into compiling Libpebble for my laptop, that would make it possible to cut out the phone as a middleman and make things more future proof.

And as a first priority, I should be getting used to wearing a watch again, it’s been a while.

Categories
Admin

Switched to SPDY and now Google’s confused

Out of interest, I recently switched this site to SPDY, partly because I like to try out new things, and partly because I want to make things better and faster. So far it’s a mixed experience, with some puzzling changes that I cannot make heads or tails of.

The first step for the switch was bringing everything onto HTTPS, which I have done with a free SSL certificate from StartSSL. I redirected everything from HTTP to the secure connection with the 301 HTTP status code, so I thought Google would be able to follow it well and replace the addresses in their index. Then I enabled the SPDY module in Nginx, and checking the result, it looked like I was in business.

Some time has passed, and a scary graph started to manifest itself in Google Analytics:

Google Analytics impression count, the site has changed around May 8.

Right after I have made the changes, my impression count on Google dropped like a brick, now being exactly 0. That’s not really the change I wanted to see. Digging more into it, though, it looks like I still have a constant stream of visitors from Google Search:

Visitor numbers from Google Search, same time interval as the impression count.

How can I have zero impressions, but still a half a dozen visitors from Search? The results in the Webmaster Tools mirror things: dropping impression count, no crawl errors, same or even better indexed count, and relatively good stats:

Google Crawler stats, with a big spike when switched over HTTPS/SPDY when needed to reindex everything

The crawl seemed to have gotten a bit slower (the bottom plot of the three), but more consistent.

I wonder what the change could be: does the impression count depend on the method of access (http/https)? Or did I make some breaking changes? If so, then why the conflicting information?

Being a scientist, my main concern is not actually the raw value of any visitor count, but understanding the reactions to my actions, and consistency of the “experimental results”.  I wonder what kind of technique I could use to debug all this?

Update 2013/May/28: 

Following some recommendations from the comments, it looks like the https:// version of my URL has to be added to the Webmaster Tools separately. Now there’s a http://gergely.imreh.net and a https://gergely.imreh.net section as well. In the latter section, I can see that there are some impressions reported. Some weird things still exist: the sum of impressions from both is less than how many visitors I reportedly get from Google Search; the crawl stats are shared between the two sections (i.e. the https version reports a lot of crawl stats even from the time when https wasn’t enabled), while most other data is separate for the two sections (e.g. impressions, search queries, sitemaps). Still, this is probably on the right path.

The impression count after adding a https version of my site’s records to the Webmaster Tools

After the Webmaster Tools changes, I have just switched the Google Analytics association from one WMT property to the other. Hopefully this will freak me out less, though it will likely take some days to see the changes in the result.