<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Claude Archives - ClickedyClick</title>
	<atom:link href="https://gergely.imreh.net/blog/tag/claude/feed/" rel="self" type="application/rss+xml" />
	<link>https://gergely.imreh.net/blog/tag/claude/</link>
	<description>Life in real, complex and digital.</description>
	<lastBuildDate>Thu, 30 Jan 2025 02:59:00 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	
	<item>
		<title>Adventures into Code Age with an LLM</title>
		<link>https://gergely.imreh.net/blog/2024/11/adventures-into-code-age-with-an-llm/</link>
					<comments>https://gergely.imreh.net/blog/2024/11/adventures-into-code-age-with-an-llm/#respond</comments>
		
		<dc:creator><![CDATA[Gergely Imreh]]></dc:creator>
		<pubDate>Sat, 09 Nov 2024 09:50:20 +0000</pubDate>
				<category><![CDATA[Computers]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Claude]]></category>
		<category><![CDATA[llm]]></category>
		<category><![CDATA[python]]></category>
		<guid isPermaLink="false">https://gergely.imreh.net/blog/?p=2910</guid>

					<description><![CDATA[<p>It&#8217;s a relaxed Saturday afternoon, and I just remembered some nerdy plots I&#8217;ve seen online for various projects, depicting &#8220;code age&#8221; over time: how does your repository change over the months and years, how much code still survives from the beginning till now, etc&#8230; Something like this made by the author of curl: It looks [&#8230;]</p>
<p>The post <a href="https://gergely.imreh.net/blog/2024/11/adventures-into-code-age-with-an-llm/">Adventures into Code Age with an LLM</a> appeared first on <a href="https://gergely.imreh.net/blog">ClickedyClick</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>It&#8217;s a relaxed Saturday afternoon, and I just remembered some nerdy plots I&#8217;ve seen online for various projects, depicting &#8220;code age&#8221; over time: how does your repository change over the months and years, how much code still survives from the beginning till now, etc&#8230; Something like <a href="https://daniel.haxx.se/blog/2024/10/31/curl-source-code-age/">this made by the author of curl</a>:</p>



<figure class="wp-block-image size-large is-style-default"><img fetchpriority="high" decoding="async" width="1024" height="644" src="https://gergely.imreh.net/blog/wp-content/uploads/2024/11/screenshot-2024-10-31-at-13-39-29-gnuplot-1024x644.png" alt="" class="wp-image-2914" srcset="https://gergely.imreh.net/blog/wp-content/uploads/2024/11/screenshot-2024-10-31-at-13-39-29-gnuplot-1024x644.png 1024w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/screenshot-2024-10-31-at-13-39-29-gnuplot-500x314.png 500w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/screenshot-2024-10-31-at-13-39-29-gnuplot-768x483.png 768w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/screenshot-2024-10-31-at-13-39-29-gnuplot-1536x966.png 1536w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/screenshot-2024-10-31-at-13-39-29-gnuplot-2048x1288.png 2048w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/screenshot-2024-10-31-at-13-39-29-gnuplot-1200x754.png 1200w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/screenshot-2024-10-31-at-13-39-29-gnuplot-1980x1245.png 1980w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Curl&#8217;s code age distribution</figcaption></figure>



<p>It looks interesting and informative. And even though I don&#8217;t have codebases that have been around this long, there are plenty of codebases around me that are fast moving, so something like a month (or in some cases week) level cohorts could be interesting.</p>



<p>One way to take this challenge on is to actually sit down and write the code. Another is to take a <a href="https://en.wikipedia.org/wiki/Large_language_model">Large Language Model</a>, say <a href="https://claude.ai/">Claude</a> and try to get that to make it. Of course the challenge is different in nature. For this case, let&#8217;s put myself in the shoes of someone who says</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>I am more interested in the results than the process, and want to get to the results quicker.</p>
</blockquote>



<p>Let&#8217;s see how far we can get with this attitude, and where it breaks down (probably no spoiler: it breaks down very quickly).</p>



<p>Note on the selection of the model: I&#8217;ve chosen Claude just because generally I have good experience with it these days, and it can share generated artefacts (like the relevant Python code) which is nice. And it&#8217;s a short afternoon. :) Otherwise anything else could work as well, though surely with varying results. </p>



<h3 class="wp-block-heading">Version 1</h3>



<p>Let&#8217;s kick it off with a quick prompt.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p><strong>Prompt</strong>: How would you generate a chart from a git repository to show the age of the code? That is when the code was written and how much of it survives over time?</p>
</blockquote>



<p>Claude quickly picked it up and made me a Python script, which is nice (that being my day-to-day programming language). I guess that&#8217;s generally a good assumption these days if one does data analytics anyways (asking for another language is left for another experiment).</p>



<span id="more-2910"></span>



<p>The result is <a href="https://claude.site/artifacts/d7c4b1f6-e6c4-4d74-82dd-97e0d40d0c5e">this code</a>. I&#8217;ve skimmed it to check that it doesn&#8217;t just delete my whole repo or do something completely batshit, but otherwise just saved it in a repo that I have at hand. To make it easier on myself, I added some <a href="https://packaging.python.org/en/latest/specifications/inline-script-metadata/#inline-script-metadata">inline metadata</a> with the dependencies:</p>



<pre class="wp-block-code"><code class=""># /// script
# dependencies = [
#   "pandas",
#   "matplotlib",
# ]
# ///</code></pre>



<p>and from there I can just run the script with <a href="https://docs.astral.sh/uv/">uv</a>.</p>



<p>First it checked too few files (my repository is a mixture of Python and SQL scripts managed by <a href="https://www.getdbt.com/">dbt</a>), so I had to go in and expand those filters.</p>



<p>Then the thought struck me to remove the filter altogether (it already checks only files that are tracked in git, so it should be fine), but then it broke on a step where it reads a file as if it was text to find the line counts. I guess there could be a better way of filtering (say &#8220;do not read binary files&#8221;, if there&#8217;s a way to do that), but I just went with catching the problems:</p>



<pre class="wp-block-code"><code class=""># ....
    for file_path in tracked_files:
        try:
            timestamps = get_file_blame_data(file_path)
            for timestamp in timestamps:
                blame_data[timestamp] += 1
                total_lines += 1
        except UnicodeDecodeError:
            print(f"Error reading file: {file_path}")
            continue
#....</code></pre>



<p>Hence I know that a favicon PNG was causing the <code>UnicodeDecodeError</code> hubbub in earlier runs. Now we are getting somewhere, and we have a graph like this:</p>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="512" src="https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_age_distribution-1024x512.png" alt="" class="wp-image-2911" srcset="https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_age_distribution-1024x512.png 1024w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_age_distribution-500x250.png 500w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_age_distribution-768x384.png 768w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_age_distribution.png 1200w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Version 1</figcaption></figure>
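
<p>On the &#8220;do not read binary files&#8221; idea: one common heuristic, and as far as I know roughly what git itself does, is to look for a NUL byte in the first few kilobytes of a file. A minimal sketch of that, where <code>looks_binary</code>, <code>text_files_only</code> and the 8000 byte sniff size are illustrative names and choices of this example, not part of the generated script:</p>

<pre class="wp-block-code"><code class="">SNIFF_BYTES = 8000  # assumption: only inspect the start of the file


def looks_binary(path, sniff_bytes=SNIFF_BYTES):
    """Heuristic: treat a file as binary if its first bytes contain a NUL."""
    with open(path, "rb") as handle:
        chunk = handle.read(sniff_bytes)
    return b"\x00" in chunk


def text_files_only(paths):
    """Keep only the paths that do not look binary."""
    return [p for p in paths if not looks_binary(p)]</code></pre>

<p>This is only a heuristic: a UTF-16 text file would also trip it, so keeping the <code>try</code>/<code>except</code> around as a backstop still seems sensible.</p>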



<p>This is already quite fun to see. There are the sudden accelerations of development, there are the plateaus of me working on other projects, and generally it feels like &#8220;wow, productive!&#8221; (with no facts backing that feeling <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f602.png" alt="😂" class="wp-smiley" style="height: 1em; max-height: 1em;" />). Also pretty good ROI on maybe 15 mins of effort.</p>



<p>Having said that, this is still far from what I wanted.</p>



<h3 class="wp-block-heading">Version 2</h3>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p><strong>Prompt:</strong> Could we change the code to have cohorts of time, that is configurable, say monthly, or yearly cohorts, and colour the chart to see how long each cohort survives?</p>
</blockquote>



<p>This came back with another <a href="https://claude.site/artifacts/34b65023-9ba4-4abb-b214-1c10d82eacfd">set of code</a>. Adding the metadata, skimming it (it has the filter on the file extensions again, never mind), and running it once more to see the output, I get this:</p>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="542" src="https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_survival_monthly-1024x542.png" alt="" class="wp-image-2912" srcset="https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_survival_monthly-1024x542.png 1024w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_survival_monthly-500x265.png 500w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_survival_monthly-768x406.png 768w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_survival_monthly-1536x813.png 1536w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_survival_monthly-2048x1083.png 2048w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_survival_monthly-1200x635.png 1200w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_survival_monthly-1980x1048.png 1980w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Version 2</figcaption></figure>



<p>Because of the file extension filter in place, the numbers are obviously not aligning with the above, but it does something. The something is a bit unclear, but it <em>feels</em> like progress, so let&#8217;s give it the benefit of the doubt, and just change it once more. </p>
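
<p>For reference, the cohort idea on its own is small. A minimal sketch, assuming each surviving line comes with a Unix timestamp in seconds (that is an assumption about the data, not something checked against the generated script); <code>cohort_label</code> and <code>lines_per_cohort</code> are hypothetical helper names:</p>

<pre class="wp-block-code"><code class="">from collections import Counter
from datetime import datetime, timezone


def cohort_label(timestamp, period="monthly"):
    """Map a Unix timestamp to a cohort label such as '2024-11' or '2024'."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    if period == "monthly":
        return moment.strftime("%Y-%m")
    if period == "yearly":
        return moment.strftime("%Y")
    raise ValueError(f"unsupported period: {period}")


def lines_per_cohort(line_timestamps, period="monthly"):
    """Count surviving lines per cohort, given one timestamp per line."""
    return Counter(cohort_label(ts, period) for ts in line_timestamps)</code></pre>

<p>This only counts what survives today. To see how long a cohort survives, the same count would presumably have to be taken at several points in the repository&#8217;s history and then stacked.</p>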



<h3 class="wp-block-heading">Version 3</h3>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p><strong>Prompt:</strong> Now change this into a cumulative graph, please.</p>
</blockquote>



<p>One more time Claude came back with <a href="https://claude.site/artifacts/a391c5dc-743c-439f-aa09-70ec9b86a19a">this code</a>. Adding the metadata again, same drill. Running this failed with errors in <code>numpy</code>, though:</p>



<pre class="wp-block-code"><code class="">TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''</code></pre>



<p>Now this needed some debugging. It turns out a column the code is trying to plot is actually numbers as strings rather than numbers as, you know, say floats&#8230;</p>



<pre class="wp-block-code"><code class=""># my "fix"
        df['cumulative_percentage'] = df['cumulative_percentage'].astype(float)
# end

        # Plot cumulative area
        plt.fill_between(df.index, df['cumulative_percentage'],
                        alpha=0.6, color='royalblue',
                        label='Cumulative Code')</code></pre>



<p>It didn&#8217;t take too many tries, but it was confusing at first &#8211; and why wouldn&#8217;t it be, if I didn&#8217;t actually <em>read</em> the code, just skimmed it&#8230;</p>
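
<p>The failure mode can be reproduced in isolation. A small sketch, assuming the problem column behaves like a <code>numpy</code> object array holding strings (the values here are made up for illustration):</p>

<pre class="wp-block-code"><code class="">import numpy as np

# Numbers stored as strings end up in an object array, much like a pandas
# column that was built from text.
as_text = np.array(["12.5", "40.0", "100.0"], dtype=object)

try:
    np.isfinite(as_text)
    failed = False
except TypeError:
    # Same family of error as in the traceback above.
    failed = True

# Converting to floats first makes the finiteness check work.
as_numbers = as_text.astype(float)
all_finite = bool(np.isfinite(as_numbers).all())</code></pre>

<p>As far as I can tell, the plotting call runs a finiteness check like this on its inputs, which would explain why the error only shows up at plot time.</p>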



<p>The result is then like this:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="543" src="https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_growth_monthly-1024x543.png" alt="" class="wp-image-2913" srcset="https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_growth_monthly-1024x543.png 1024w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_growth_monthly-500x265.png 500w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_growth_monthly-768x407.png 768w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_growth_monthly-1536x814.png 1536w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_growth_monthly-2048x1085.png 2048w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_growth_monthly-1200x636.png 1200w, https://gergely.imreh.net/blog/wp-content/uploads/2024/11/code_growth_monthly-1980x1049.png 1980w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Version 3</figcaption></figure>



<p>Sort of <em>meh</em>, it feels like it&#8217;s not going in the right direction overall.</p>



<p>But while debugging the above issues, I first tried to ask Claude about the error (maybe it can fix it itself), but it came back with &#8220;Your message exceeds the <a href="https://support.anthropic.com/en/articles/7996848-how-large-is-claude-s-context-window">length limit</a>. &#8230;&#8221; (for free users, that is). So I kinda stopped here for the time being.</p>



<h2 class="wp-block-heading">Lessons learned</h2>



<p>The first lesson is very much re-learned:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Garbage in, garbage out.</p>
</blockquote>



<p>If I cannot express what I really want, it&#8217;s very difficult to make it happen. And my prompts were by no means expressing my wishes correctly, so no wonder Claude wasn&#8217;t really hitting the mark. Whether or not a human engineer would have fared better, I don&#8217;t know. I do know, however, that this kind of &#8220;tell me exceedingly clearly what&#8217;s your idea&#8221; is an everyday conversation for me as an engineer (being on both ends of the convo).</p>



<p>The code provided by the model wasn&#8217;t really far off for <em>some</em> solution, so that was fun! On the other hand, when it hit any issues, I really had to have domain and language knowledge to fix things. This seems like an interesting place to be:</p>



<ul class="wp-block-list">
<li>the results are quick and on the surface good-enough for a non/less technical person, probably</li>



<li>but they would also be the ones who couldn&#8217;t do anything if something goes wrong.</li>
</ul>



<p>Even I feel that, as a software engineer, it would be hard to support the code if it was just generated like this. But that&#8217;s also a strange thought: so many times I have to support (debug, extend, explain, refactor) code that I haven&#8217;t had anything to do with before.</p>



<p>It seems to me that since Claude comes across as an eager junior engineer, writing decent code that always needs some adjustments, the trade-off is really in the dimension of spending time to get better at prompting vs better at coding. </p>



<p>If there&#8217;s a person with some amount of programming skills, mostly interested in the results not the process, and doubling down on prompting: they likely could get loads further than I did here. Good quality prompts and a small amount of code adjustments would be the sweet spot for them.</p>



<p>For others who have more programming expertise, are maybe more interested in the process, and would rather spend time on getting better at programming than on getting really good at prompting: keeping to smaller snippets might be the sweet spot, or learning new languages, &#8230; Something as a starting point for digging in, a seed, is what this process can help with.</p>



<h2 class="wp-block-heading">Future</h2>



<p>Given the above notes on how this generated code is like a new codebase that I suddenly need to support, here&#8217;s a different, fun exercise <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> to actually improve engineering skills:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Take AI generated code that is &#8220;good enough&#8221; for a small problem and refactor, extend, productionise it.</p>
</blockquote>



<p>I&#8217;m not sure if this would work, or would get me into wrong habits, but if I wanted to have some quick ways of doing <a href="https://en.wikipedia.org/wiki/Peak:_Secrets_from_the_New_Science_of_Expertise">deliberate practice</a> &#8211; and not <a href="https://exercism.org/">Exercism</a>, <a href="https://gergely.imreh.net/blog/2023/08/doing-the-easy-problems-on-leetcode/">LeetCode</a>, or similar, but rather something that can be custom made &#8211; then this seems a way to get started.</p>



<p>Also, now that I&#8217;ve gotten even more interested in the problem, I&#8217;ll likely just dig into how to actually define that chart I was looking for and what kind of data I would need to get from <code>git</code> to make it happen. The example code made me pretty confident that &#8220;all I need is Python&#8221; really, even though while prepping for this I found other useful tools like <a href="https://github.com/mergestat/mergestat-lite">one allowing you to write SQL queries for your repo</a>, which might be some further way to expand my understanding.</p>
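
<p>As a possible starting point for the data side: <code>git blame --line-porcelain</code> prints a header block for every line of a file, including an <code>author-time</code> entry in seconds since the epoch. A minimal sketch of pulling those out with plain Python; the function names are hypothetical, and this is not necessarily how the generated scripts collect their data:</p>

<pre class="wp-block-code"><code class="">import subprocess
from datetime import datetime, timezone


def parse_author_times(porcelain_text):
    """Extract one author timestamp per blamed line from --line-porcelain output."""
    times = []
    for line in porcelain_text.split("\n"):
        # Source lines are prefixed with a tab, so they never match here.
        if line.startswith("author-time "):
            epoch = int(line.split(" ", 1)[1])
            times.append(datetime.fromtimestamp(epoch, tz=timezone.utc))
    return times


def blame_author_times(path, repo_dir="."):
    """Run git blame on one tracked file (needs git on the PATH)."""
    result = subprocess.run(
        ["git", "blame", "--line-porcelain", "--", path],
        cwd=repo_dir, capture_output=True, text=True,
        errors="replace", check=True,
    )
    return parse_author_times(result.stdout)</code></pre>

<p>Keeping the parsing separate from the <code>git</code> call means the parser can be tried on a saved piece of output without needing a repository at hand.</p>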



<p>Either way, it&#8217;s just fun to mess with code on a lazy Saturday.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gergely.imreh.net/blog/2024/11/adventures-into-code-age-with-an-llm/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
