
Adventures into Code Age with an LLM

It’s a relaxed Saturday afternoon, and I just remembered some nerdy plots I’ve seen online for various projects, depicting “code age” over time: how does your repository change over the months and years, how much code still survives from the beginning till now, etc… Something like this made by the author of curl:

Curl’s code age distribution

It looks interesting and informative. And even though I don’t have codebases that have been around this long, there are plenty of codebases around me that are fast-moving, so month-level (or in some cases week-level) cohorts could be interesting.

One way to take this challenge on is to actually sit down and write the code. Another is to take a Large Language Model, say Claude, and try to get that to make it. Of course the challenge is different in nature. For this case, let me put myself in the shoes of someone who says

I am more interested in the results than the process, and want to get to the results quicker.

Let’s see how far we can get with this attitude, and where it breaks down (probably no spoiler: it breaks down very quickly).

A note on the selection of the model: I’ve chosen Claude just because I generally have good experience with it these days, and it can share generated artefacts (like the relevant Python code), which is nice. And it’s a short afternoon. :) Otherwise anything else could work as well, though surely with varying results.

Version 1

Let’s kick it off with a quick prompt.

Prompt: How would you generate a chart from a git repository to show the age of the code? That is when the code was written and how much of it survives over time?

Claude quickly picked it up and made me a Python script, which is nice (that being my day-to-day programming language). I guess that’s generally a good assumption these days if one does data analytics anyway (asking for another language is left for another experiment).

The result is this code. I’ve skimmed it to check that it doesn’t just delete my whole repo or do something completely batshit, but otherwise saved it in a repo that I have at hand. To make it easier on myself, I added some inline metadata with the dependencies:

# /// script
# dependencies = [
#   "pandas",
#   "matplotlib",
# ]
# ///

and from there I can just run the script with uv.
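Concretely, running it is a one-liner (assuming the script is saved as code_age.py – a name of my choosing); uv reads the inline metadata and installs the dependencies into an ephemeral environment:

uv run code_age.py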

First it checked too few files (my repository is a mixture of Python and SQL scripts managed by dbt), so I had to go in and expand those filters.
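The expanded filter ended up as something along these lines (a hypothetical sketch, not the generated code verbatim – str.endswith conveniently takes a tuple):

# extensions expanded to cover the repo's mix of Python, SQL, and dbt files
CODE_EXTENSIONS = ('.py', '.sql', '.yml', '.md')
tracked_files = [f for f in tracked_files if f.endswith(CODE_EXTENSIONS)]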

Then the thought struck me to remove the filter altogether (it already checks only files tracked by git, so it should be fine) – but then it broke on a step where it reads a file as text to count its lines. I guess there could be a better way of filtering (say “do not read binary files”, if there’s a way to do that), but I just went with catching the problems:

# ....
    for file_path in tracked_files:
        try:
            timestamps = get_file_blame_data(file_path)
            for timestamp in timestamps:
                blame_data[timestamp] += 1
                total_lines += 1
        except UnicodeDecodeError:
            print(f"Error reading file: {file_path}")
            continue
#....
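For the record, if I did want to filter binary files up front, one rough heuristic (my own sketch, not something Claude generated) is to treat any file with a NUL byte in its first block as binary:

def is_probably_binary(file_path, blocksize=8192):
    # files with NUL bytes in the first block are almost certainly binary
    with open(file_path, 'rb') as f:
        return b'\0' in f.read(blocksize)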

(Hence I know that a favicon PNG was causing that UnicodeDecodeError hubbub in earlier runs.) Now we are getting somewhere, and we have a graph like this:

Version 1

This is already quite fun to see. There are the sudden accelerations of development, there are the plateaus of me working on other projects, and generally it feels like “wow, productive!” (with no facts backing that feeling 😂). Also pretty good ROI on maybe 15 mins of effort.

Having said that, this is still far from what I wanted.

Version 2

Prompt: Could we change the code to have cohorts of time that are configurable, say monthly or yearly cohorts, and colour the chart to see how long each cohort survives?

This came back with another set of code. Adding the metadata, skimming it (it has the filter on the file extensions again, never mind), and running it once more to see the output, I get this:

Version 2

Because of the file extension filter being back in place, the numbers obviously don’t align with the above, but it does something. What that something is, is a bit unclear, but it feels like progress, so let’s give it the benefit of the doubt and just make one more change.

Version 3

Prompt: Now change this into a cumulative graph, please.

One more time Claude came back with this code. Adding the metadata again, same drill. Running it failed with errors in numpy, though:

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Now this needed some debugging. It turns out the column the code is trying to plot holds numbers as strings rather than numbers as, you know, floats…

# my "fix"
        df['cumulative_percentage'] = df['cumulative_percentage'].astype(float)
# end

        # Plot cumulative area
        plt.fill_between(df.index, df['cumulative_percentage'],
                        alpha=0.6, color='royalblue',
                        label='Cumulative Code')

It didn’t take too many tries, but it was confusing at first – why wouldn’t it be, when I didn’t actually read the code, just skimmed it…

The result is then like this:

Version 3

Sort of meh; it feels like it’s not going in the right direction overall.

But while debugging the above issues, I first tried to ask Claude about the error (maybe it could fix it itself), but it came back with “Your message exceeds the length limit. …” (for free users, that is). So I kinda stopped here for the time being.

Lessons learned

The first lesson is very much re-learned:

Garbage in, garbage out.

If I cannot express what I really want, it’s very difficult to make it happen. And my prompts were by no means expressing my wishes precisely, so no wonder Claude wasn’t really hitting the mark. Whether a human engineer would have fared better, I don’t know. I do know, however, that this kind of “tell me exceedingly clearly what your idea is” exchange is an everyday conversation for me as an engineer (and I’ve been on both ends of it).

The code provided by the model wasn’t really far off from some working solution, so that was fun! On the other hand, whenever it hit an issue, I really had to have domain and language knowledge to fix things. This seems like an interesting place to be:

  • the results are quick and, on the surface, probably good enough for a non- or less-technical person
  • but such a person would also be the one who couldn’t do anything when something goes wrong.

Even I feel that it would be hard to support this code as a software engineer if it was just generated like this. But that’s also a strange thought: so many times I have to support (debug, extend, explain, refactor) code that I had nothing to do with before.

Since Claude comes across as an eager junior engineer, writing decent code that always needs some adjustment, it seems to me the real trade-off is between spending time getting better at prompting vs. getting better at coding.

A person with some amount of programming skill, mostly interested in the results rather than the process, who doubles down on prompting could likely get loads further than I did here. Good-quality prompts plus a small amount of code adjustment would be their sweet spot.

For others who have more programming expertise, and are maybe more interested in the process – spending time getting better at programming rather than at prompting – keeping to smaller snippets might be the sweet spot, or learning new languages… What this process can help with is a starting point for digging in, a seed.

Future

Given the above notes on how this generated code is like a new codebase that I suddenly need to support, here’s a different, fun exercise 💡 to actually improve engineering skills:

Take AI-generated code that is “good enough” for a small problem and refactor, extend, productionise it.

I’m not sure if this would work, or whether it would get me into bad habits, but if I wanted some quick ways of doing deliberate practice – not Exercism, LeetCode, or similar, but something that can be custom-made – then this seems like a way to get started.

Also, now that I’ve gotten even more interested in the problem, I’ll likely just dig into how to actually define the chart I was looking for and what kind of data I would need to get from git to make it happen. The example code made me pretty confident that “all I need is Python”, really, even though while prepping for this I found other useful tools, like one that lets you write SQL queries against your repo, which might be a further way to expand my understanding.
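For reference, the raw data extraction itself really doesn’t seem to need much: git blame with --line-porcelain emits a committer-time epoch timestamp for every surviving line, so a sketch like this (my own reconstruction, not the generated code) would get per-line ages:

import subprocess

def get_line_timestamps(file_path):
    # one 'committer-time <epoch>' entry per surviving line of the file
    out = subprocess.run(['git', 'blame', '--line-porcelain', file_path],
                         capture_output=True, text=True, check=True).stdout
    return [int(line.split()[1]) for line in out.splitlines()
            if line.startswith('committer-time ')]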

Either way, it’s just fun to mess with code on a lazy Saturday.


Programming challenge: Protohackers 3

Protohackers is a server programming challenge where various network protocols are set as problems. It started not so long ago, and challenge No. 3 was just released yesterday, aiming at creating a simple (“Budget”) multi-user chat server. I thought I’d sacrifice a decent part of my weekend to give it an honest try. This is the short story of trying, failing, then getting more knowledge out of it than I expected.

I definitely wanted to tackle it using Python, as that’s my current utility language, the one I want to know most about. Since the aim of Protohackers, I think, is to build things from scratch, I set out to use only the standard library. With some poking around the documentation I ended up choosing socketserver as the basis of the work. It seemed suitable, but there was a severe dearth of non-dummy example code and deeper explanation. In a couple of hours I did make some progress, though, which already felt exciting:

  • Figured out (to some extent) the purpose of the server / handler parts in practice
  • Made things multi-user with data shared across connections
  • Grokked the lifecycle of requests a bit, though definitely not fully – especially not how disconnections happen.
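To illustrate the shape this took, here is a minimal sketch of the pattern (not my actual solution – the real Budget Chat protocol also needs name registration, join/leave announcements, and so on):

import socketserver
import threading

clients = set()                      # writer objects shared across all connections
clients_lock = threading.Lock()

class ChatHandler(socketserver.StreamRequestHandler):
    def handle(self):
        with clients_lock:
            clients.add(self.wfile)
        try:
            for line in self.rfile:          # one chat message per line
                with clients_lock:
                    for writer in clients:
                        if writer is not self.wfile:
                            writer.write(line)   # relay to everyone else
        finally:
            with clients_lock:
                clients.discard(self.wfile)  # clean up on disconnect

with socketserver.ThreadingTCPServer(('0.0.0.0', 10000), ChatHandler) as server:
    server.serve_forever()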

Still, it worked to some extent: I could make a server that functioned, for a certain definition of “functioned”, as the logs attest:

Server logs from trying my Budget Chat Server implementation

Creating a Prometheus metrics exporter for a 4G router

Recently I began working fully remotely from home, with the main network connectivity provided by a 4G mobile router. Very soon I experienced patchy connectivity – not the greatest thing when you are on video calls for half of each day. What does one do then (short of just replacing the whole setup with a wired ISP, if that’s possible), other than monitor what’s going on and try to debug the issues?

The router I have is a less-common variety, an Alcatel LinkHub HH41 (I can’t even properly link to it on the manufacturer’s site, just in online retail stores). At least, as it should, it does have a web front-end that one can poke around in and gather metrics from – of course in an automated way.

The Alcatel HH41 LinkHub router that I had at hand to use

Looking at the router’s web interface and checking the network activity through the browsers’ network monitor (Firefox, Chrome), the frontend’s API calls showed up, so I could collect a few that requested different things like radio reception metrics, bandwidth usage, uptime, and so on… From here we are off to the races setting up our monitoring infrastructure along pretty standard lines.
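The first piece of that is the exporter itself. A minimal sketch of the idea (the endpoint path, JSON field, and metric name are made-up stand-ins for the calls seen in the network monitor; prometheus_client is the standard Python client library):

import time
import requests
from prometheus_client import Gauge, start_http_server

# hypothetical metric – the router reports many more
signal_strength = Gauge('router_signal_strength', 'Signal strength reported by the router')

def poll(router_url='http://192.168.1.1'):
    # made-up endpoint and JSON shape, standing in for the frontend's real API calls
    data = requests.get(f'{router_url}/api/system_status').json()
    signal_strength.set(data['signal_strength'])

if __name__ == '__main__':
    start_http_server(9100)   # Prometheus scrapes metrics from this port
    while True:
        poll()
        time.sleep(30)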


Taiwan WWII Map Overlays

A while ago I came across the Formosa (Taiwan) City Plans, U.S. Army Map Service, 1944-1945 collection, in the Perry-Castañeda Library Map Collection of the University of Texas in Austin. I’m a sucker for maps, enjoy learning about history a lot, and I have a lot of interest in my current home, Taiwan – so you can call this a magic mix of cool stuff.

There are 26 maps in the collection, made by the US Army by flying over different parts of the island and, I guess, mostly stitching together aerial photographs. The maps themselves were not that easy to check in an image viewer, since there’s no context, zoom is clumsy, and I had no idea where about half the places should be located. Instead, I thought it would be great to have them as overlays on top of current maps and satellite imagery in Google Maps.

The result is Taiwan City Maps overlays, which does exactly that. Feel free to click the link and explore right now! In the rest of this post, I’ll first show how that page was made, and then share some history lessons I gained by making it.


Automating the hell out of it

Even before the 4-Hour Work Week made me more serious about this, I really enjoyed automating tasks that benefit from not having to remember to do them, or that would be troublesome to do otherwise. This frees up a lot of time, keeps a bunch of problems away, and it is actually quite fun when the information comes to me instead of me going to it.

Now I have automated checking my bank account and credit card balances, updating the dynamic IP of a server, tracking ebook sales numbers, and synchronizing network clocks. Below I summarize some general ideas first, then give an intro to each of those scripts.

Banking script

Tools

Most of my scripts are written in bash, because it’s relatively straightforward to hammer out simple stuff, and it is surprisingly simple to do a lot of things once I have thought enough about a problem. The Advanced Bash-Scripting Guide is always on my reading list, but I usually get to check only the parts that are relevant to the given problem. You can get quite far with a few simple constructs.

The most common parts I seem to come across:

  • if-then-else constructs: if [ -d 'directory' ]; then echo "Found!"; fi
  • for loops: for f in *.png; do optipng "$f"; done
  • loading the results of a command into a variable: VAR=$(command)

For most other problems with a little keyword-fu there’s always an answer on StackOverflow or on the web.

Another group of scripts uses Python, when a bit more data manipulation is needed, like web scraping or JSON parsing. Actually, all of the scripts could be rewritten in Python for consistency, and it would probably be simpler too, which is something for the future.

As a general tip, most of these scripts need tweaking, and all of them are sort of alpha/beta-quality code. To facilitate hacking and reduce the heartache of mangled clever code, I keep everything in git repos. I share those repos online, so I have to make sure there are no secrets checked in, ever. It helps to use .gitignore strategically, keep secrets in separate files, and include an example of what that secrets file should look like inside.

Most of these scripts are run periodically by cron, so it is worth having some basic knowledge of how to schedule them.
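For example, an entry like this in the crontab runs a script every day at 18:00 (the path is made up):

0 18 * * * /home/me/scripts/cathaycheck.sh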

Some scripts send me emails under specific circumstances (some after every run, some when new information appears), and for good delivery I have set up postfix to use Gmail as an SMTP relay. This way I’m sure to receive the emails, and receive them quickly.
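The relevant bits of such a relay setup in postfix’s main.cf typically look something like this (a sketch; the referenced sasl_passwd map then holds the actual Gmail credentials):

relayhost = [smtp.gmail.com]:587
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
smtp_tls_security_level = encrypt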

Scripts

These are the scripts I use most often and have used the longest. Still, many of them are under development, and I adjust them whenever I learn how to do things better. I list links to all their repos, where they can be improved.

Banking account balances

My two main bank accounts are queried once a day for the available balance, and I’m notified by email. Both accounts needed quite a bit of web scraping (and I got them done at two different OpenHack Taipei events). The banks’ websites are pretty awfully organized (iframes within iframes within iframes; no CSS classes or ids), though it doesn’t have to be good for me, it has to be good for the bank.

Cathay United Bank

The cathaycheck (click for repo) script queries the available balance at Cathay United Bank by logging in with curl and parsing the final page with Beautiful Soup. The script can serve as a skeleton for any other website where one has to log in and then navigate through a series of pages to get the information. The required HTML variable names can be extracted with the help of the Inspect Element tools in Chrome.

At the moment the credentials are stored in the crontab command, which is not really ideal – I should rewrite it to use a secrets file – though given that it runs on a server where I’m the only user (and root), there’s no practical difference for me at the moment. I have set it up so that I receive an email at the end of the day with the current balance.

ANZ Taiwan credit card

The anzcheck (click for repo) script queries my spending on the ANZ Taiwan credit card. Again bash for logging in and Beautiful Soup for parsing the final page. It needs a bit more logic to extract information from a table, because the website’s developers added no classes or ids to the items to make them easier to understand – or for them to style, but that’s not my problem.
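With no classes or ids to hook into, the extraction has to go by position, along these lines (an illustrative sketch – the table index and file name are made up):

from bs4 import BeautifulSoup

html = open('statement.html').read()   # page previously fetched with curl
soup = BeautifulSoup(html, 'html.parser')
rows = soup.find_all('table')[2].find_all('tr')[1:]   # third table on the page, skip the header row
for row in rows:
    date, desc, amount = [td.get_text(strip=True) for td in row.find_all('td')[:3]]
    print(date, desc, amount)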

Just recently I updated it to extract the spending items added to my balance on a given day, so I will never be caught by surprise again (hopefully). Since many of my charges go to companies that have Chinese names, I quickly ran into the problem of having to tell Heirloom Mailx (which I use to send emails on my Arch Linux box) that the text I want to mail is plain text, not an attachment. With some hacking, the solution was to pass a few more options to “mail” so it knows that the text is UTF-8. From “sendthatmail.sh” in the repo, the parameters needed are:

-S sendcharsets=utf-8 -S ttycharset=utf-8 -S encoding=8bit

I could still extract some more information from the bank’s website, though nothing really urgent.

No-IP address updater

At the Taipei Hackerspace we have a handful of servers running, but the residential internet connection provided by Chunghwa Telecom only gives us a dynamic IP address. Applying for a static IP seems to be pretty troublesome, so in the meantime I’m using a script on one of the servers to update the IP address associated with our dynamic tpehack.no-ip.biz address.

The no-ip-bash-updater (click for repo) script was originally forked from elsewhere, but I have rewritten it quite a bit so that it

  • needs no extra file to store the current IP address, but compares external IP with a DNS query
  • stores no secrets in the file

It uses a pretty straightforward API call with HTTP authentication; the only real logic in there is checking when that call actually needs to be made.
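The update itself boils down to a single authenticated request against No-IP’s update endpoint, something like this (the variable names are made up):

curl -s -u "$NOIP_USER:$NOIP_PASS" "https://dynupdate.no-ip.com/nic/update?hostname=tpehack.no-ip.biz&myip=$NEW_IP"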

E-book sales

Recently I helped a friend publish an ebook version of How to Start a Business in Taiwan on Leanpub, and of course I want to know whenever any sales are made (disclaimer: I don’t get a cut of the sales; it all goes to the author). The leanpubsales (click for repo) script is written in Python, because handling JSON there is easier than it would be in bash. The call is otherwise quite simple: just keep an external file around to check whether the sales number has increased, and if it has, send an email. To send an email conditional on the output of the script, the “ifne” command from moreutils is very useful (meaning: “if input is not empty”).
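In practice that pipeline looks something like this (the exact file name and address are made up) – ifne only runs mail when the script actually printed something:

./leanpubsales.py | ifne mail -s "Leanpub sales update" me@example.com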

The query runs periodically, and it’s lovely to receive the results. I will surely set up a similar script when I get my own book ideas published on Leanpub.

RTC correction

As a physicist working in atomic physics, an area of science very much concerned with keeping precise time, I keep all my servers’ clocks synchronized with the network time protocol (NTP) using chrony. One difficulty is that the real-time clock (RTC) of those computers is pretty crappy and drifts away. This wouldn’t be a problem if I never restarted them, but it’s a pain when I do: after a restart the clock can be tens of seconds off until the time is synchronized again.

Chrony can sync the RTC to NTP as well, but it doesn’t do so automatically; I have to trigger it manually. So I have written an rtccorrect (click for repo) script that runs every 2 hours or so (could be done just once a day, actually) and eliminates the drift of the RTC.
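The heart of such a script is presumably a single chrony command (this assumes chrony is configured with the rtcfile directive, so it tracks the RTC’s error):

chronyc trimrtc    # step the RTC to match the NTP-synchronized system time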

Server backup

For backing up data between servers, rsync has proven invaluable. I have a couple of scripts that do just that, though they are among my oldest ones, and at the time I hadn’t yet separated out personal information (it’s way too easy to inline every credential, email, login, and all that), so I need to sanitize them. A couple of ideas about these backup scripts:

  • sometimes higher transfer speed can be achieved by messing with the ssh algorithms, e.g. passing -e 'ssh -c arcfour' to rsync
  • more often there’s even better performance when an rsync daemon is running on the remote computer (though with a Raspberry Pi, both cases are still frustratingly slow)
  • some files can be excluded if there’s no need to transfer them, e.g.: --filter='- *.part'
  • when using rsync not just to transfer but to mirror, --delete (delete at the target if it doesn’t exist at the origin) and --archive are pretty useful; a combined example follows below
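Putting those together, a mirroring command looks something like this (the paths are made up; note that arcfour is an old cipher that newer OpenSSH versions have dropped):

rsync --archive --delete --filter='- *.part' -e 'ssh -c arcfour' /data/ backup@remote:/backup/data/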

For these backups I also use Dead Man’s Snitch to know when things didn’t work out, e.g. by having a command like this in the cron list, where backup.sh is my script’s name and xxxxxxx is the snitch ID from my account:

backup.sh && curl -s https://nosnch.in/xxxxxxx > /dev/null

This way I got to know when my backup server kept dying because of a bad heatsink, or when my host server did because of a flaky hosting company…

Afterword

I guess there will be even more automation in the future, and maybe many of these scripts can be ported onto a common base so that new ones are easier to make. What else do you guys automate?