
Refreshing Airplane Tracking Software With and Without AI

Bug fixing with LLMs won’t get you far if you don’t speak the (programming) language, so better use your head.

A bit like last time this post is about a bit of programmer hubris, a bit of AI, a bit of failure… Though I also took away more lessons this time about software engineering, with or without fancy tools. This is about rabbit-holing myself into an old software project that I had very little knowhow to go on…

The story starts with me rediscovering a DVB-T receiver USB stick that I’ve had for probably close to a decade. It’s been “barnacled” by time spent in the Taiwanese climate, so I wasn’t sure if it still worked, but it’s such a versatile tool that it was worth trying to revive it.

When these receivers function, they can receive digital TV (that’s the DVB-T), but also FM radio and DAB, and they can act as a software defined radio (SDR). This last thing makes them able to receive all kinds of transmissions that are immediately quite high on the fun level, in particular airplane (ADS-B) and ship (AIS) tracking. Naturally, there are websites to do both if you just want to see it (for example Flightradar24 and MarineTraffic, respectively, are popular aggregators for that data, but there are tons), but doing your own data collection opens doors to all kinds of other use cases.

So on I go, trying to find what software tools people use these days with these receivers. Mine is a pretty simple one (find out everything about it by following the “RTL-SDR” keywords wherever you like to do that :) and I remembered there being many tools. However, time has passed, I forgot most of what I knew, and there were new projects coming and going.

ADSBox

While I was searching, I found the adsbox project, which was interesting on two counts: it kinda worked straight out of the box for me, and it was last updated some 9 years ago, so it’s an old code base that tickles my “let’s maintain all the things!” drive…

The GitHub repo information of ADSBox: the last commits were 9 years ago, and there are very few of them.

The tool is written mostly in C, while it also hosts its own server for a web interface, for listing flights, and (back in the day) supporting things like Google Maps and Google Earth.

The ADSBox interface showing a bunch of airplane information.
The adsbox plane listing interface.

Both the Google Maps and Earth parts seem completely broken: Maps has changed a lot since, as I learned when I had to update my Taiwan WWII Map Overlays project over time too (the requirement of using API keys to even load the map, changes to the JavaScript API…). Earth I haven’t tried, but I’m thinking that went the way of the dodo on the desktop?

So in the spur of the this-is-the-weekend-and-I-have-energy-to-code moment, I started to think of the options:

  • could fix up the map, either with the Google Maps changes, or bring in some other map?
  • the project has barely any readme, and I mainly managed to make it work by looking at old articles from the time when adsbox was new, could fix those up?
  • during the compilation, loads of warnings happened that seem to call for some “better quality” coding, so let’s fix stuff until -Werror (making all warnings errors) passes too! This would be a learning experience
  • I’m sure I can find other tasks to do as well, like an error message here, a strange behaviour there…

Here’s the kicker though: I don’t really know C. I spend most of my time in Python-land, and haven’t done a C project in anger yet. Is it worth trying to dig in, while there are other ADS-B projects that a) work better, b) are in languages that I’m more looking to learn, such as Rust?

There was an additional drive of curiosity, just like in my last post: can I use Large Language Models (LLMs) to complement me on things I lack, such as knowledge of the exact programming language at hand?

With this I thought let’s dig in, and let’s dig into the C code: that seemed immediately tractable, more limited in scope, and thus would help build up (hopefully) some successes and I’ll learn my way around the codebase better.

On the LLM side I have GitHub Copilot – though it seems somewhat crippled in my open source Code Server installation of VS Code compared to the official VS Code. In particular the context menus and Copilot Chat seem to be missing, and thus it was only communicating with me through TAB-completions and me adding comments to guide or suggest. That’s not very practical, so I didn’t push it too far for the relevant tasks of explanation and exploration of options that I wanted to do.

I also have Claude that I can chat with. If I wasn’t working on my 13-year-old Lenovo ThinkPad X201, I’d probably set up Ollama, but that’s just excruciating with even the smallest models on this machine (until I upgrade to something newer, or run the questions on my work M1 MacBook). So Claude it is for now.

Hello Fixes

I guess it’s one sign of hubris (or unlimited optimism) to jump into fixing compilation warnings without knowing much of anything about the codebase yet. This started in areas where the airplane tracking interacts with SQLite. For example, there were warnings about casting pointers to integers of different size while shuffling around SQLite query results:

int * t = (int *) sqlite3_value_int(argv[0]);

This was also part of a larger code section (formatting integers into hexadecimal or octal strings, for example for the ICAO codes…), so I had to play around with how much context to give to Claude to actually get something useful.

Chatting with Claude about the code snippet, my question and their answer shown in part.
A segment from a discussion with Claude.

A bit of mucking around there seemed to have worked, and while I should have asked more software architecture & best practices questions, I probably knew enough about it to be dangerous, and left it as it was so far.

Having said that, after this change it turned out that some part of the interface was now displaying stuff differently: the 24-bit ICAO airplane registration codes had useless leading zeros, padded to 8 hex digits rather than the expected 6 digits – since the fix was done without this context. Here we go, manual adaptation on this regression.
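To make the 6-versus-8 digit point concrete, here is a minimal sketch, not the actual adsbox code: it assumes the value is kept as a plain integer (rather than being cast to a pointer) and formatted with an explicit width. The helper name format_icao is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not from adsbox): format a 24-bit ICAO address
 * as exactly 6 uppercase hex digits.
 * Returns 0 on success, -1 if the output buffer is too small. */
int format_icao(uint32_t icao, char *out, size_t out_size)
{
    assert(out != NULL);
    /* Keep only the lower 24 bits; "%06X" zero-pads to 6 digits,
     * whereas "%08X" would produce the two extra leading zeros. */
    int n = snprintf(out, out_size, "%06X",
                     (unsigned int)(icao & 0xFFFFFFu));
    /* snprintf returns the length it wanted to write, so a value
     * >= out_size means the result was truncated. */
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With a buffer of at least 7 bytes, format_icao(0xABC, buf, sizeof buf) leaves "000ABC" in buf: six digits, zero-padded, and no more.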

Now there were cases when “sprintf may write a terminating nul past the end of the destination“, as the code seems to write its data back into the same place it reads from, like this:

sprintf(data->avr_data, "*%s", data->avr_data + 13);

This ended up being again about a much bigger context (iterative reading, processing, and passing on of recorded ADS-B packets), where based on Claude’s suggestions I couldn’t really get to anything useful. The real point was always one step further:

  • instead of the line, look at the nearby lines
  • instead of the nearby lines, look at the whole function
  • instead of the whole function, look at the wider codebase with its configuration

These are of course no-brainers. However, Claude with its chat interface cannot really do that, while Copilot without its chat interface also cannot do this digging. Catch-22? In the end I admitted to myself (for the nth time) that I need to understand the purpose of the code better before “fixing” it. Then, due to the lack of comments in the codebase plus my lack of natural intuition for the behaviour of the built-in C functions, I’ve just left these lines as they were for now, since they do work.
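For reference, calling sprintf with a source that overlaps its destination is undefined behaviour in C, even when it appears to work. A common way to do this kind of in-place rewrite is memmove, which is defined for overlapping regions. The following is only a sketch under assumptions: the helper name is hypothetical, it is not how adsbox does it, and whether it is the right fix depends on the wider context described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from adsbox): replace the first `skip`
 * bytes of a NUL-terminated string with a single '*', in place.
 * Same intent as sprintf(buf, "*%s", buf + skip), without relying
 * on overlapping sprintf arguments.
 * Returns 0 on success, -1 if `skip` is 0 or beyond the string. */
int replace_prefix_with_star(char *buf, size_t skip)
{
    assert(buf != NULL);
    size_t len = strlen(buf);
    if (skip == 0 || skip > len) {
        return -1; /* skip == 0 would need the string to grow */
    }
    /* memmove handles overlap; the +1 also copies the terminator. */
    memmove(buf + 1, buf + skip, len - skip + 1);
    buf[0] = '*';
    return 0;
}
```

Because at least one byte is dropped for the one byte added, the result never gets longer than the original string, so no extra buffer space is needed.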

From here I turned to other parts. The webserver was not serving some files with the correct MIME type, due to its hand-rolled file extension extraction (splitting filenames at the first . rather than the last). This was easy to fix – with a bit of StackOverflow this time, rather than asking Claude.
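The first-dot versus last-dot difference can be sketched with the standard strrchr function, which finds the last occurrence of a character. The function name file_extension is hypothetical and this is not the code from the adsbox webserver, just an illustration of the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from adsbox): return the extension after
 * the LAST '.' of the final path component, or "" if there is none. */
const char *file_extension(const char *path)
{
    assert(path != NULL);
    /* Only look at the file name, so a dot in a directory name
     * (e.g. "v1.2/README") is not mistaken for an extension. */
    const char *slash = strrchr(path, '/');
    const char *name = slash ? slash + 1 : path;
    const char *dot = strrchr(name, '.');
    /* No dot, or only a leading dot (".htaccess"): no extension. */
    if (dot == NULL || dot == name) {
        return "";
    }
    return dot + 1;
}
```

For a name like jquery.min.js this returns "js", where splitting at the first dot would have produced "min.js" and missed the MIME type lookup.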

Then there was an issue with the tool apparently not playing back the recorded packet data, which I fixed with a combo of regular ol’ debug printouts, StackOverflow, and just thinking about how it could work. It turned out to be about explicitly filling in the daylight saving field (tm_isdst) of the relevant tm struct: without that, IMHO it’s regular “undefined” behaviour. In this case it jumped the first timestamp’s time ahead by an hour, so the playback (which follows the real timing of the packets) would have needed to wait an hour to catch up before actually starting to replay. It’s still weird why only the first packet’s data was shifted, and could I do a more solid fix than setting it once, given that the code never seems to overwrite it? These are the questions that are more about C knowledge or potential best practices for the code’s overall structure…
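As a sketch of the tm_isdst point: mktime reads every field of the struct tm it is given, including tm_isdst, where a positive value means daylight saving is in effect, zero means it is not, and a negative value asks mktime to work it out itself. The helper below is hypothetical (not the adsbox fix) and assumes the timestamps are local times; it zeroes the struct and sets tm_isdst to -1 so nothing is left uninitialised.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not from adsbox): convert a broken-down local
 * time to time_t without leaving any struct tm field uninitialised. */
time_t local_to_epoch(int year, int month, int day,
                      int hour, int min, int sec)
{
    assert(month >= 1 && month <= 12);
    struct tm tm;
    memset(&tm, 0, sizeof tm);  /* start from a known state */
    tm.tm_year = year - 1900;   /* struct tm counts years from 1900 */
    tm.tm_mon = month - 1;      /* and months from 0 */
    tm.tm_mday = day;
    tm.tm_hour = hour;
    tm.tm_min = min;
    tm.tm_sec = sec;
    tm.tm_isdst = -1;           /* negative: let mktime decide on DST */
    return mktime(&tm);         /* (time_t)-1 signals failure */
}
```

An uninitialised tm_isdst that happens to be positive in a timezone without active daylight saving is one plausible way to end up an hour off, which would match the symptom described above, though that is a guess rather than something confirmed from the codebase.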

Finally I’ve started on replacing Google Maps with OpenFreeMap and got as far as displaying the map (which is the easy step:). The whole replacement would likely be a lot more, also given the amount of barely documented JavaScript code in the project – but hopefully I have more working knowledge of JS than C.

Lessons Learned

First lesson is that I likely have a “saviour complex”, trying to fix up every piece of code I see being imperfect in some way, whether or not I am capable of doing it. This is something to meditate further on for sure.

When using LLMs for code work, they are just as useful as another mid-level coder without much context, that is, almost not at all. The context of code is always relevant, so either the LLM would have to get it itself, or the person pairing with the LLM would have to provide it. Thus the work is always there, just not always possible.

It’s very nice that I can do things in programming languages that I don’t really understand, but that’s only the case if I either spend much-much time actually getting to know things so I can start to judge whether the changes even have a chance to be correct or not; or I don’t care whether they are correct or not (but is this really an option?)

Overall the LLMs need the same things as humans to do a good job, and cannot pretend that they really can do work without these (even if they might appear to be able to do without these for some time):

  • good comments in the code so the intention can be ascertained as well
  • tests that show what the correct behaviour should be, and catch regressions or unintentional breakages
  • domain knowledge to form better mental models about what should happen

The first two weren’t true in this project. The last point is likely where LLMs are ahead in cases like this (having been trained on “all the Internet’s data”), though it wouldn’t be the same for some niche or work-internal projects.

The LLM’s suggestions are still very much localised, thus they cannot really fix up the structure of the code too much – or maybe I’m not using the right tools, of course. And this is where my future big ask would lie: don’t just tell me how to fix this line, rather tell me that the entire block is no longer needed / could be merged with another part of the code / could be broken out to its own module that would help over there… Of course, this is moving the goalposts a bit on what LLM programmers look like, though I also think that the current “fix this line” is something I most definitely want to have enough practice with that I don’t really need to ask (though it could suggest if there are good practices I haven’t picked up yet).

Where do I go from here?

This adsbox project is mostly obsolete, as I’ve found a bunch of other tools that are better, and better supported now (adsb_deku, tar1090), but surprisingly it still has stuff that is better here than in other tools (the plane’s status icons, some data displayed here that is not in others, showing what sort of packets (what Downlink Format or DF numbers) were received for the aircraft, etc.). So there might still be value in using it occasionally.

Even if I could get a kick out of it, it’s likely useful to keep things time-boxed or constrained to some topics: change the map; add comments as I find them; fix issues if they arise; package it up for ArchLinux. That’s about it, but these should be generally useful (e.g. using OpenFreeMap for other projects in the future, or rewriting the aforementioned Taiwan WWII Map project to use that).

My current fixes live in my fork in GitHub imrehg/adsbox, with no guarantees. Since the project also doesn’t have a license (just a note of “free for non-commercial use”, which doesn’t cover modifications), I’m probably keeping it simple for now.

I also got the hang of software defined radio again, and there’s just so much fun to have…

What’s most useful is seeing in practice what software needs to be maintainable almost a decade later, and what’s missing in most projects: explanatory comments to understand what is being done and why, and tests to know whether things are running correctly or not. And maybe then my future self, my colleagues, and any potential AI pair programmer would all have a better chance of succeeding at “maintain all the things!”
