
Refreshing Airplane Tracking Software With and Without AI

A bit like last time, this post is about a bit of programmer hubris, a bit of AI, a bit of failure… though this time I also took away more lessons about software engineering, with or without fancy tools. It's about rabbit-holing myself into an old software project with very little know-how to go on…

The story starts with me rediscovering a DVB-T receiver USB stick that I've had for probably close to a decade. It's been "barnacled" by time spent in the Taiwanese climate, so I wasn't sure if it still worked, but it's such a versatile tool that it was worth trying to revive it.

When these receivers work, they can receive digital TV (that's the DVB-T), but also FM radio and DAB, and they can act as a software defined radio (SDR). This last part lets them receive all kinds of transmissions that are immediately quite high on the fun level, in particular airplane (ADS-B transmissions) and ship (AIS) tracking. Naturally, there are websites for both if you just want to see it (for example Flightradar24 and MarineTraffic, respectively, are popular aggregators for that data, but there are tons), but doing your own data collection opens doors to all kinds of other use cases.

So off I went, trying to find what software tools people use these days with these receivers. Mine is a pretty simple one (find out everything about it by following the "RTL-SDR" keyword wherever you like to do that :), and I remembered there being many tools. But time has passed, I've forgotten most of what I knew, and new projects have come and gone in the meantime.

ADSBox

While searching, I found the adsbox project, which was interesting on two counts: it kinda worked straight out of the box for me, while also having last been updated some 9 years ago, so it's an old code base that tickles my "let's maintain all the things!" drive…

The GitHub repo information for adsbox: the last commits were some 9 years ago, and there are very few of them overall.

The tool is written mostly in C, and it also hosts its own server for a web interface, listing flights and (back in the day) supporting things like Google Maps and Google Earth.

The ADSBox interface showing a bunch of airplane information.
The adsbox plane listing interface.

Both the Google Maps and Earth parts seem completely broken: Maps has changed a lot since, as I've also had to update my Taiwan WWII Map Overlays project over time (the requirement of using API keys to even load the map, changes to the JavaScript API…). Earth I haven't tried, but I'm thinking that went the way of the dodo on the desktop?

So in the spur of the this-is-the-weekend-and-I-have-energy-to-code moment, I started to think through the options:

  • could I fix up the map, either adapting to the Google Maps changes, or bringing in some other map?
  • the project has barely any readme, and I mainly managed to make it work by looking at old articles from the time when adsbox was new; could I fix that up?
  • compilation spews loads of warnings that seem to call for some "better quality" coding; let's fix stuff until -Werror (making all warnings errors) passes too! That would be a learning experience
  • I’m sure I can find other tasks to do as well, like an error message here, a strange behaviour there…

Here's the kicker though: I don't really know C. I spend most of my time in Python-land, and haven't done a C project in anger yet. Is it worth trying to dig in, when there are other ADS-B projects that a) work better, and b) are in languages that I'm more keen to learn, such as Rust?

There was an additional drive of curiosity, just like in my last post: can I use Large Language Models (LLMs) to complement me where I'm lacking, such as in knowledge of the exact programming language at hand?

With this I thought, let's dig in, and specifically into the C code: that seemed immediately tractable and more limited in scope, and thus would (hopefully) help build up some successes while I learn my way around the codebase.

On the LLM side I have GitHub Copilot – though it seems somewhat crippled in my open source Code Server installation of VS Code compared to the official VS Code; in particular the context menus and Copilot Chat seem to be missing, so it was only communicating with me through TAB-completions, with me adding comments to guide or suggest. That's not very practical, so I didn't push it too far for the explanation and exploration of options that I actually wanted to do.

I also have Claude that I can chat with. If I wasn't working on my 13-year-old Lenovo ThinkPad X201, I'd probably set up Ollama, but that's just excruciating with even the smallest models on this machine (until I upgrade to something newer, or run the questions on my work M1 MacBook). So Claude it is for now.

Hello Fixes

I guess it's one sign of hubris (or unlimited optimism) to jump into fixing compilation warnings without knowing much of anything about the codebase yet. This started in areas where the airplane tracking interacts with SQLite; for example, there were warnings about casts between pointers and integers of different size while shuffling around SQLite query results:

int * t = (int *) sqlite3_value_int(argv[0]);

This was also part of a larger code section (formatting integers into hexadecimal or octal strings, for example for the ICAO codes…), and thus I had to play around with how much context to give Claude to actually get something useful.

Chatting with Claude about the code snippet, my question and their answer shown in part.
A segment from a discussion with Claude.

A bit of mucking around there seemed to have worked, and while I should have asked more software architecture & best-practices questions, I probably knew just enough to be dangerous, and left it at that for the time being.

Having said that, after this change it turned out that part of the interface was now displaying things differently: the 24-bit ICAO airplane registration codes were padded with useless leading zeros to 8 hex digits, rather than the expected 6 digits – since the fix was done without this context. Here we go, manually adapting to this regression.
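In hindsight, the shape of the combined fix is roughly this (a minimal sketch with a hypothetical function name, not the actual adsbox code): take the value as a plain integer, no pointer cast, and format the 24-bit ICAO address as exactly six hex digits.

#include <sqlite3.h>
#include <stdio.h>

/* Hypothetical sketch, not the adsbox function: read the value as a
 * plain int instead of casting it to a pointer, then format the
 * 24-bit ICAO address as exactly six hex digits. */
static void icao_hex(sqlite3_context *ctx, int argc, sqlite3_value **argv)
{
    (void)argc;
    int icao = sqlite3_value_int(argv[0]);               /* no (int *) cast */
    char buf[8];
    snprintf(buf, sizeof(buf), "%06X", (unsigned)(icao & 0xFFFFFF)); /* 6 digits, not 8 */
    sqlite3_result_text(ctx, buf, -1, SQLITE_TRANSIENT);
}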

Then there were cases where "sprintf may write a terminating nul past the end of the destination", as the code writes its data back into the same buffer it reads from:

sprintf(data->avr_data, "*%s", data->avr_data + 13);

This ended up being again about a much bigger context (iterative reading, processing, and passing on of recorded ADS-B packets), where based on Claude's suggestions I couldn't really get to anything useful. The real point was always one step further:

  • instead of the line, look at the nearby lines
  • instead of the nearby lines, look at the whole function
  • instead of the whole function, look at the wider codebase with its configuration

These are of course no-brainers. However, Claude with just a chat interface cannot really do that digging, and Copilot without its chat interface cannot do it either. Catch-22? In the end I admitted to myself (for the nth time) that I need to understand the purpose of the code better before "fixing" it. And given the lack of comments in the codebase plus my lack of natural intuition for the built-in C functions' behaviour, I've just left these warnings as they were for now, since the code does work.
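Just for the record: if the intent of that sprintf line really is "keep everything from offset 13 onwards and put a '*' in front of it", then something like the following sketch (hypothetical helper name, not a patch I actually applied) would avoid the problem, since memmove explicitly allows its source and destination to overlap.

#include <string.h>

/* Hypothetical sketch, not applied to adsbox: shift the tail of the
 * buffer into place with memmove (overlap is allowed), then add the
 * '*' prefix, instead of an overlapping sprintf. */
static void prepend_star_to_tail(char *buf, size_t tail_offset)
{
    size_t tail_len = strlen(buf + tail_offset);
    memmove(buf + 1, buf + tail_offset, tail_len + 1); /* +1 copies the nul too */
    buf[0] = '*';
}

But whether that is actually what the code should be doing is exactly the question I couldn't answer yet.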

From here I turned to other parts. The web server was not serving some files with the correct MIME type, due to its hand-rolled file extension extraction (splitting filenames at the first . rather than the last); this was easy to fix – with a bit of StackOverflow this time, rather than asking Claude.
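The gist of the idea (a minimal sketch, not the exact code in the repo) is to look for the last dot rather than the first:

#include <string.h>

/* Sketch of the idea (not the exact adsbox code): take the extension
 * after the *last* dot, so "jquery.min.js" yields ".js", not ".min.js". */
static const char *file_extension(const char *filename)
{
    const char *dot = strrchr(filename, '.');
    return dot ? dot : "";
}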

Then there was an issue with the tool apparently not playing back recorded packet data, which I fixed with a combo of regular ol' debug printouts, StackOverflow, and just thinking about how it could work. The culprit was the code not explicitly filling in the daylight saving field of the relevant tm struct – tm_isdst – and thus, IMHO, hitting regular "undefined" behaviour: in this case the first timestamp jumped ahead by an hour, so the playback (which follows the real timing of the packets) would only have started actually replaying once an hour had passed and it caught up. It's still weird that only the first packet's time was shifted, and could I do a more solid fix than setting the field once, given that the code never seems to overwrite it? These are the questions that call for more C knowledge, or for better practices in the code's overall structure…
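For context, the textbook approach, as far as I understand it (a sketch with a hypothetical helper name, not necessarily how adsbox should structure it), is to mark the DST field as "unknown" before every mktime() call, so the C library works it out itself instead of reading an uninitialised field:

#include <time.h>

/* Sketch: set tm_isdst to -1 ("unknown") before calling mktime(), so
 * the library determines daylight saving itself, rather than leaving
 * the field uninitialised (which is what shifted the first timestamp
 * by an hour). */
static time_t packet_timestamp(struct tm *t)
{
    t->tm_isdst = -1;
    return mktime(t);
}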

Finally, I've started on replacing Google Maps with OpenFreeMap and got as far as displaying the map (which is the easy step :). The whole replacement will likely be a lot more work, also given the amount of barely documented JavaScript code in the project – but hopefully I have more working knowledge of JS than of C.

Lessons Learned

The first lesson is that I likely have a "saviour complex", trying to fix up every piece of code I see being imperfect in some way, whether or not I'm capable of doing it. This is something to meditate on further, for sure.

When using LLMs for code work, they are about as useful as another mid-level coder without much context – that is, almost not at all. The context of the code is always relevant, so either the LLM has to gather it itself, or the person pairing with the LLM has to provide it. Thus the work is always there, it's just not always possible to do.

It's very nice that I can do things in programming languages that I don't really understand, but that's only the case if I either spend a lot of time actually getting to know things, so I can start to judge whether the changes even have a chance of being correct; or if I don't care whether they are correct at all (but is that really an option?)

Overall, LLMs need the same things as humans do to do a good job, and we cannot pretend they can really work without these (even if they might appear to manage without them for some time):

  • good comments in the code, so the intention can be ascertained as well
  • tests that show what the correct behaviour should be, and catch regressions or unintentional breakages
  • domain knowledge, to form better mental models of what should happen

The first two weren't present in this project. The last point is likely where LLMs are ahead in cases like this (having been trained on "all the Internet's data"), though that wouldn't hold for niche or internal work projects.

The LLMs' suggestions are still very much localised, thus they cannot really fix up the structure of the code too much – or maybe I'm not using the right tools, of course. And this is where my future big ask would lie: don't just tell me how to fix this line, rather tell me that the entire block is no longer needed / could be merged with another part of the code / could be broken out to its own module that would help over there… Of course, this is moving the goalposts a bit on what LLM programmers look like, though I also think that the current "fix this line" level is something I most definitely want to have enough practice with that I don't really need to ask (though it could still suggest good practices I haven't picked up yet).

Where do I go from here?

This adsbox project is mostly obsolete, as I've found a bunch of other tools that are better and better supported these days (adsb_deku, tar1090), but surprisingly it still has things that are better here than in the other tools (the planes' status icons, some data displayed here that isn't shown elsewhere, showing what sort of packets – what Downlink Format or DF numbers – were received for each aircraft, etc.). So there might still be value in using it occasionally.

Even if I get a kick out of it, it's likely useful to keep things time-boxed or constrained to some topics: change the map; add comments as I figure things out; fix issues if they arise; package it up for ArchLinux. That's about it, but these should be generally useful (e.g. using OpenFreeMap for other projects in the future, or rewriting the aforementioned Taiwan WWII Map project to use it).

My current fixes live in my fork on GitHub at imrehg/adsbox, with no guarantees. Since the project also doesn't have a license (just a note of "free for non-commercial use", which doesn't cover modifications), I'm probably keeping it simple for now.

I also got the hang of software defined radio again, and there’s just so much fun to have…

What's most useful is seeing in practice what software needs to stay maintainable almost a decade later, and what's missing in most projects: explanatory comments to understand what is being done and why, and tests to know whether things are running correctly or not. And maybe then my future self, my colleagues, and any potential AI pair programmer would all have a better chance of succeeding at "maintain all the things!"


Adventures into Code Age with an LLM

It’s a relaxed Saturday afternoon, and I just remembered some nerdy plots I’ve seen online for various projects, depicting “code age” over time: how does your repository change over the months and years, how much code still survives from the beginning till now, etc… Something like this made by the author of curl:

Curl’s code age distribution

It looks interesting and informative. And even though I don't have codebases that have been around this long, there are plenty of fast-moving codebases around me, so month-level (or in some cases week-level) cohorts could be interesting.

One way to take this challenge on is to actually sit down and write the code. Another is to take a Large Language Model, say Claude, and try to get that to make it. Of course the challenge is different in nature. For this case, let's put myself in the shoes of someone who says

I am more interested in the results than the process, and want to get to the results quicker.

Let's see how far we can get with this attitude, and where it breaks down (probably no spoiler: it breaks down very quickly).

Note on the selection of the model: I’ve chosen Claude just because generally I have good experience with it these days, and it can share generated artefacts (like the relevant Python code) which is nice. And it’s a short afternoon. :) Otherwise anything else could work as well, though surely with varying results.

Version 1

Let’s kick it off with a quick prompt.

Prompt: How would you generate a chart from a git repository to show the age of the code? That is when the code was written and how much of it survives over time?

Claude quickly picked it up and made me a Python script, which is nice (that being my day-to-day programming language). I guess that’s generally a good assumption these days if one does data analytics anyways (asking for another language is left for another experiment).

The result is this code. I've skimmed it to check that it doesn't just delete my whole repo or do something completely batshit, but otherwise saved it into a repo that I have at hand. To make it easier on myself, I added some inline metadata with the dependencies:

# /// script
# dependencies = [
#   "pandas",
#   "matplotlib",
# ]
# ///

and from there I can just run the script with uv.

First it checked too few files (my repository is a mixture of Python and SQL scripts managed by dbt), so I had to go in and expand those filters.

Then the thought struck me to remove the filter altogether (since it only checks files that are tracked by git anyway, it should be fine) – but then it broke on a step that reads each file as text to count the lines. I guess there could be a better way of filtering (say "do not read binary files", if there's a way to do that), but I just went with catching the problems:

# ....
    for file_path in tracked_files:
        try:
            timestamps = get_file_blame_data(file_path)
            for timestamp in timestamps:
                blame_data[timestamp] += 1
                total_lines += 1
        except UnicodeDecodeError:
            print(f"Error reading file: {file_path}")
            continue
#....

(Hence I know that a favicon PNG was causing that UnicodeDecodeError hubbub in earlier runs.) Now we are getting somewhere, and we have a graph like this:

Version 1

This is already quite fun to see. There are the sudden accelerations of development, there are the plateaus of me working on other projects, and generally it feels like "wow, productive!" (with no facts backing that feeling 😂). Also pretty good ROI on maybe 15 mins of effort.

Having said that, this is still far from what I wanted.

Version 2

Prompt: Could we change the code to have cohorts of time, that is configurable, say monthly, or yearly cohorts, and colour the chart to see how long each cohort survives?

This came back with another set of code. Adding the metadata, skimming it (it has the filter on the file extensions again, never mind), and running it once more to see the output, I get this:

Version 2

Because of the file extension filter being in place again, the numbers obviously don't align with the above, but it does something. What exactly that something is remains a bit unclear, but it feels like progress, so let's give it the benefit of the doubt and just make one more change.

Version 3

Prompt: Now change this into a cumulative graph, please.

One more time, Claude came back with this code. Adding the metadata again, same drill. Running it failed with errors in numpy, though:

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Now this needed some debugging. It turns out a column the code is trying to plot is actually numbers as strings rather than numbers as, you know, say floats…

# my "fix"
        df['cumulative_percentage'] = df['cumulative_percentage'].astype(float)
# end

        # Plot cumulative area
        plt.fill_between(df.index, df['cumulative_percentage'],
                        alpha=0.6, color='royalblue',
                        label='Cumulative Code')

It didn't take too many tries, but it was confusing at first – why wouldn't it be, when I didn't actually read the code, just skimmed it…

The result is then like this:

Version 3

Sort of meh; it feels like it's not going in the right direction overall.

But while debugging the above issues, I first tried to ask Claude about the error (maybe it can fix it itself), but it came back with "Your message exceeds the length limit. …" (for free users, that is). So I kinda stopped here for the time being.

Lessons learned

The first lesson is very much re-learned:

Garbage in, garbage out.

If I cannot express what I really want, it's very difficult to make it happen. And my prompts were by no means expressing my wishes correctly, so no wonder Claude wasn't really hitting the mark. Whether a human engineer would have fared better, I don't know. I do know, however, that this kind of "tell me exceedingly clearly what your idea is" exchange is an everyday conversation for me as an engineer (being on both ends of it).

The code provided by the model wasn't really far off from a working solution, so that was fun! On the other hand, when it hit any issues, I really had to have domain and language knowledge to fix things. This seems like an interesting place to be:

  • the results are quick and on the surface good-enough for a non/less technical person, probably
  • but they would also be the ones who couldn’t do anything if something goes wrong.

Even I feel that it would be hard to support code as a software engineer if it was just generated like this. But that's also a strange thought: so many times I have to support (debug, extend, explain, refactor) code that I had nothing to do with before.

It seems to me that since Claude comes across as an eager junior engineer, writing decent code that always needs some adjustments, the trade-off is really along the dimension of spending time getting better at prompting vs. getting better at coding.

If there's a person with some amount of programming skill, mostly interested in the results rather than the process, and doubling down on prompting: they could likely get loads further than I did here. Good quality prompts and a small amount of code adjustment would be the sweet spot for them.

For others who have more programming expertise, and are maybe more interested in the process – spending time on getting better at programming rather than at prompting – keeping to smaller snippets might be the sweet spot, or learning new languages… Something to use as a starting point for digging in, a seed, is what this process can help with.

Future

Given the above notes on how this generated code is like a new codebase that I suddenly need to support, here's a different, fun exercise 💡 to actually improve engineering skills:

Take AI generated code that is "good enough" for a small problem and refactor, extend, and productionise it.

I'm not sure if this would work, or whether it would get me into bad habits, but if I wanted to have some quick ways of doing deliberate practice – not Exercism, LeetCode, or similar, but rather something that can be custom made – then this seems a way to get started.

Also, now that I've gotten even more interested in the problem, I'll likely just dig into how to actually define the chart I was looking for and what kind of data I would need to get from git to make it happen. The example code made me pretty confident that "all I need is Python", really, even though while prepping for this I found other useful tools, like one that lets you write SQL queries against your repo, which might be a further way to expand my understanding.

Either way, it’s just fun to mess with code on a lazy Saturday.


Git login and commit signing with security

Doing software engineering (well-ish) is pretty hard to imagine without working in version control, which most of the time means git. In a practical setup of git there's the question of how I get access to the code it stores — how do I "check things out"? — and optionally how others can verify that it was indeed me who made the changes — how do I "sign" my commits? Recently I've changed my mind about what's a good combination for these two aspects, and about what tools I'm using for them.

Access Options

In broad terms, git repositories can be checked out either through the HTTP protocol or through the SSH protocol. Both have pros and cons.

Having two-factor authentication (2FA) made HTTP access more secure, but it also meant more setup (no more direct username/password usage, but rather needing to create extra access keys used in place of passwords). Credentials were still stored in plain text (as far as I know) on the machine, in some git config files.

The SSH setup was in some sense the more practical one (creating keys on your own machine, and just passing along the public key portion), though there were still secrets in plain text on my machine (as I don't think the majority of people used password-protected SSH keys, due to their user experience). This is what I've used for years: add a new SSH key for each new machine that I'm working on, check code out through ssh+git, and work away.

Recently, though, I came across the git-credential-manager tool, which is supposed to make HTTP access nicer (for various git servers and services) and get rid of plain text secrets. Of course this is not the first or only tool that handles git credentials, but being made by GitHub, it has some more clout. This made me re-evaluate what options I have on the SSH side as well, for similar security improvements.

I've found that both 1Password and KeePassXC (the two main password managers I use) have ssh-agent integration, and thus can store SSH keys and give access to them as needed. No more plain text (or password-protected) private keys on disk with these either!

Now it seems there are two good, new options to evaluate, and for the full picture I looked at how the code signing options work in this context as well.


ZFS on a Raspberry Pi

I have a little home server, just like many other geeks / nerds / programmers / technical people… It can be useful and a learning experience, as well as a real chore; most of the time the balance is shifting between these two ends. Today I'm taking notes here on one aspect of that home server that swings wildly between those two extremes.

When I say I have a home server, that might be too generous a description of the status quo: I have a pretty banged up Raspberry Pi 3B. It's running ArchLinux ARM, the 64-bit AArch64 version, looking a bit more retro on the hardware front while pushing for more modernity on the software side – a mix that I find fun.

There are a handful of services running on the device — not that many, mostly limited by its *gulp* 1GB of memory; plenty of things I'd love to run just don't co-locate well in such a tiny compartment. Besides the memory, it's also limited by storage: the Raspberry Pi runs off an SD card, and those are both fragile and limited in size. If one wants to run a home file server, say using a handful of other SD cards lying around to expand the available storage, that will get awkward very soon. To make that task less awkward (or to replace one kind of awkward with a more interesting one), I've set out to set up a ZFS storage pool, using OpenZFS.

The idea

Why ZFS? In big part, to be able to credibly answer that question.

But also with a single, more concrete reason: being able to build a more solid and expandable storage unit. ZFS can combine different storage units

  • in a way that combats data errors, e.g. mirroring: this addresses the SD cards' fragility
  • in a way that data can spread across all of them in a single file system: this addresses the SD cards' size limitations

This sounds great in theory, and after a bit of trial-and-error I've made the following setup, relying on dynamic kernel modules for flexibility, and a hodgepodge of drives at hand for the storage.

The file system support is provided by the zfs-dkms package as a dynamic kernel module (DKMS), which means the kernel module required to manage that file system is recompiled for each new Linux kernel version as it's updated. This is handy in theory, as I can keep using the main kernel packages provided by the ArchLinux ARM team.

For storage, I started off with two SD cards in mirror mode (going for data integrity first). Later I found — and invested in — some large capacity USB sticks that bumped the storage size quite a bit. With these, the current ZFS pool looks like this:

Terminal screenshot of the 'zpool status' command.

It already saved me — or rather my data — once when an SD card was acting up, though that's par for the course. One very large benefit is that the main system card is being used less, so hopefully it will last longer.

The complications

Of course, it's never this easy… With non-mainline kernel modules and with DKMS, every update is a bit of a gamble that can suddenly not pay off. That's exactly what happened last year, when the module suddenly didn't compile anymore on a new kernel version, and thus all that storage was sitting dumb and inaccessible. After digging into the issue, it came down to:

  1. the OpenZFS project being under the Common Development and Distribution License (CDDL)
  2. the Linux kernel deliberately breaking non-GPL licensed code by starting to withhold certain floating point capabilities, because "this is not expected to be disruptive to existing users".

This wasn't great, leaving me as a user pretty much between a rock and a hard place, even if this is a hobby and not, strictly speaking, a production use case on my side.

Nonetheless, I kept things working by downgrading to a working version and skipping updates to the kernel packages.

Then, based on a suggestion, I patched the zfs-dkms package (rewriting the license entry in the META file) to make it look like it's a GPL-licensed module — which is fair game for someone doing it on their own machine. This is hacky, or let's call it pragmatic.

--- META.prev   2024-02-28 08:42:21.526641154 +0800
+++ META        2024-02-28 08:42:36.435569959 +0800
@@ -4,7 +4,7 @@
 Version:       2.2.3
 Release:       1
 Release-Tags:  relext
-License:       CDDL
+License:       GPL
 Author:        OpenZFS
 Linux-Maximum: 6.7
 Linux-Minimum: 3.10

Now, with the current 2.2.3 version, it seems like there's an official fix-slash-workaround that lets the module compile, even if it's not a full fix. From the linked merge request message I'm not fully convinced that this isn't a fragile status quo, but it's at least front of mind – good going for wider ARM hardware usage bringing out people's willingness to fix things!

Future development

Some while back, while working at an IoT software deployment & management company, I naturally had a lot of interesting hardware at hand to build things with (or wrestle with…). Nowadays I have things I'd best describe as spare parts, and thus loads of things are more fragile than they need to be, plus gosh-it-takes-a-long-time to compile things on a Raspberry Pi 3 – making every kernel update some half an hour longer!

Likely the best move would be to upgrade to a (much more powerful) Raspberry Pi 5 and use an external NVMe drive, where I'd have much less need for ZFS, at least for the original reasons. It would likely still be useful for other aspects (such as snapshotting, sending/receiving the drive data, compression, deduplication, etc…), shifting the learning path away from multi-device support and towards the file system features.

If I wanted to use more storage in the existing system, I could also get rid of the mirrored SD cards and just use 4 large USB sticks (maybe in a RAIDZ setup), a poor man's NAS, I guess. Though there I'd worry a bit about needing sticks of the same size for this to work (unlike pooling, which has no same-size requirements), given the differences between supposedly same-sized products from different companies (likely locking me into having the same brand and model across the board).

I also feel like I'm not using ZFS to its full potential. I know just enough to be dangerous… maybe that's the generalist's natural habitat?


Making a USB Mute Button for Online Meetings

I use Google Meet every day for (potentially hours of) online meetings at work, so it's very easy to notice when things change and, for example, new features become available. Recently I found a new "Call control" section in the settings that promised a lot of fun: connecting USB devices to control my calls.

Screenshot of the Google Meet Settings menu during calls, showing the call control menu and a call-out to connect my USB device.
Google Meet Settings menu during a call, with the Call control section

As someone who enjoys (or is drawn to, or is sort-of obsessed with) hacking on hardware, this was a nice call to action: let's cobble together a custom USB button that can do some kind of call control: say muting myself in the call, showing mute status, hanging up, etc.

This kicked off such a deep rabbit hole that I barely made it back up to the top, but one that seeded a crazy amount of future opportunities.