It’s a relaxed Saturday afternoon, and I just remembered some nerdy plots I’ve seen online for various projects, depicting “code age” over time: how does your repository change over the months and years, how much code still survives from the beginning till now, etc… Something like this made by the author of curl:
[Image: code age chart from the curl project]
It looks interesting and informative. And even though I don’t have codebases that have been around this long, there are plenty of fast-moving codebases around me, so month-level (or in some cases week-level) cohorts could be interesting.
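To make the cohort idea a bit more concrete, here is a minimal sketch of how line timestamps could be bucketed by month or by week. The function names `cohort_label` and `count_lines_by_cohort` are my own hypothetical helpers, and I'm assuming each surviving line comes with a Unix timestamp of when it was written (for example the `author-time` field that `git blame --line-porcelain` prints); this is an illustration, not the script Claude produced.

```python
from collections import Counter
from datetime import datetime, timezone


def cohort_label(timestamp: int, granularity: str = "month") -> str:
    """Map a Unix timestamp (in seconds) to a cohort label.

    Hypothetical helper: "month" gives labels like "2024-01",
    "week" gives ISO week labels like "2024-W05".
    """
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    if granularity == "month":
        return moment.strftime("%Y-%m")
    if granularity == "week":
        iso = moment.isocalendar()  # (ISO year, ISO week, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    raise ValueError(f"unknown granularity: {granularity!r}")


def count_lines_by_cohort(line_timestamps, granularity="month"):
    """Count how many lines fall into each cohort.

    Assumption: line_timestamps holds one timestamp per surviving line.
    """
    return Counter(cohort_label(ts, granularity) for ts in line_timestamps)
```

Calling `count_lines_by_cohort(timestamps, "week")` on the same data switches to week-level cohorts, which is the knob that matters for fast-moving repositories.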
One way to take this challenge on is to actually sit down and write the code. Another is to take a Large Language Model, say Claude, and try to get that to write it. Of course, the challenge is different in nature. For this case, let me put myself in the shoes of someone who says:
I am more interested in the results than the process, and want to get to the results quicker.
Let’s see how far we can get with this attitude, and where it breaks down (probably no spoiler: it breaks down very quickly).
Note on the selection of the model: I’ve chosen Claude just because I’ve generally had good experiences with it these days, and it can share generated artefacts (like the relevant Python code), which is nice. And it’s a short afternoon. :) Anything else could work as well, though surely with varying results.
Version 1
Let’s kick it off with a quick prompt.
Prompt: How would you generate a chart from a git repository to show the age of the code? That is when the code was written and how much of it survives over time?
Claude quickly picked it up and made me a Python script, which is nice (that being my day-to-day programming language). I guess that’s generally a good assumption these days if one does data analytics anyways (asking for another language is left for another experiment).