Protohackers is a server programming challenge in which various network protocols are set as problems. It started not so long ago, and challenge No. 3 was released just yesterday, aiming at creating a simple (“Budget”) multi-user chat server. I thought I'd sacrifice a decent part of my weekend to give it an honest try. This is the short story of trying, failing, then getting more knowledge out of it than I expected.
I definitely wanted to tackle it using Python, as that’s my current utility language and the one I want to know most about. Since the aim of Protohackers, I think, is to go from scratch, I set out to use only the standard library. After some poking around the documentation I ended up choosing SocketServer as the basis of the work. It seemed suitable, but there was a severe dearth of non-dummy code and deeper explanation. In a couple of hours I did make some progress, though, that already felt exciting:
- Figured out (to some extent) the purpose of the server / handler parts in practice
- Made things multi-user with data shared across connections
- Grokked a bit of the lifecycle of the requests, but definitely not fully, especially not how disconnections happen.
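To make the server / handler split a bit more concrete, here is a minimal sketch of how I understand it, not my actual submission. It assumes Python 3, where the module is spelled `socketserver`. The names `ChatServer`, `LifecycleHandler` and `demo` are made up for illustration. The server object holds the shared data, and each connection gets its own handler instance, which goes through `setup()`, `handle()` and `finish()`.

```python
import socket
import socketserver
import threading


class ChatServer(socketserver.ThreadingTCPServer):
    """Hypothetical server subclass owning the state shared by all handlers."""

    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address, handler_class):
        super().__init__(address, handler_class)
        self.clients = set()          # sockets of currently connected peers
        self.lock = threading.Lock()  # handlers run in separate threads


class LifecycleHandler(socketserver.BaseRequestHandler):
    """One instance per connection: setup() -> handle() -> finish()."""

    def setup(self):
        # Runs first; self.server is the ChatServer instance.
        with self.server.lock:
            self.server.clients.add(self.request)

    def handle(self):
        while True:
            data = self.request.recv(1024)
            if not data:  # an empty read means the peer closed the connection
                break
            with self.server.lock:
                count = len(self.server.clients)
            self.request.sendall(f"{count} connected\n".encode())

    def finish(self):
        # Runs after handle() returns, even if handle() raised.
        with self.server.lock:
            self.server.clients.discard(self.request)


def demo():
    """Start the server on a free port, connect once, return the reply line."""
    with ChatServer(("127.0.0.1", 0), LifecycleHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        with socket.create_connection(server.server_address, timeout=5) as client:
            client.sendall(b"hello\n")
            with client.makefile("rb") as stream:
                reply = stream.readline()
        server.shutdown()
    return reply
```

Calling `demo()` should return `b"1 connected\n"`, since only one client is registered at that point. Note that `recv()` hands back whatever bytes happen to have arrived, not whole messages.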
Still, it was working to some extent. I could make a server that functioned for a certain definition of “functioned”, as the logs attest:
On the other hand, I ended up in a relative dead end. Some message ordering issues kicked in and reliably failed the test here, and I didn't yet know what to try next:
Since it’s a learning exercise and definitely not a competition on my part, I started to procrastinate. It wasn't long before I looked at the status of the leaderboard. Funnily enough, the top entries were linking to the repositories where their solutions were! 💡
Shoulders of Giants
Here’s where my surprise and delight started, though. Within the first 7 entries there were 3 Python implementations that included code! Even better, they actually covered 3 completely different ways of solving the task. Jackpot, really!
- The first solution used pure sockets, which is quite versatile if I’d want to go all-in on low-level networking in the future. It had quite a lot of helper code, though, which makes it look like a pretty decent effort to duplicate.
- The second solution went with SocketServer just like I had tried, and that is nice to dig into a bit more, given how small the whole code is. The main thing here was that I should have understood from the problem description that this is a streaming TCP connection case. It looks like streaming is the part that takes care of a lot of details, including the connection/disconnection handling that plagued me. Bam!
- The third solution then used asyncio, to take it in a different direction again. It’s amazing how simple it all is when the relevant components and abstractions are understood.
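As a rough illustration of the streaming idea from the second bullet (my own sketch, not the code from that repository): `socketserver.StreamRequestHandler` wraps the socket in the file-like objects `rfile` and `wfile`, so the handler can read whole lines however the bytes were split on the wire, and an empty read signals a disconnect. `LineHandler`, `LineServer` and `demo` are hypothetical names.

```python
import socket
import socketserver
import threading


class LineHandler(socketserver.StreamRequestHandler):
    """rfile/wfile wrap the socket so whole lines can be read and written."""

    def handle(self):
        while True:
            line = self.rfile.readline()
            if not line:  # b"" means the peer disconnected
                break
            text = line.rstrip(b"\r\n")
            self.wfile.write(b"got: " + text + b"\n")


class LineServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True


def demo():
    """Send two lines in awkwardly split chunks and collect the replies."""
    with LineServer(("127.0.0.1", 0), LineHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        with socket.create_connection(server.server_address, timeout=5) as client:
            client.sendall(b"hel")            # first half of a line
            client.sendall(b"lo\nworld\n")    # rest of it, plus another line
            with client.makefile("rb") as stream:
                replies = [stream.readline(), stream.readline()]
        server.shutdown()
    return replies
```

Here `demo()` should return `[b"got: hello\n", b"got: world\n"]`: the handler sees complete lines even though the first one arrived in two pieces.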
Which one is the most tempting solution to follow (and/or learn from)? Pure sockets are likely just a fallback option when there’s nothing else. On the SocketServer vs asyncio front, however, there was some useful StackOverflow discussion, even if a bit dated, coming from 2016. It pointed at the different use of threading and event loops. I guess this makes the answer a bit unsatisfying, but quite realistic: learn both and know when either is applicable for your use case.
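For the event loop side of that comparison, here is a small sketch using asyncio streams, again mine and not the third solution's code. Instead of one thread per connection, each connection is a coroutine scheduled on a single event loop, and every `await` is a point where other connections get a turn. `handle_client` and `demo` are hypothetical names.

```python
import asyncio


async def handle_client(reader, writer):
    """Coroutine run once per connection on the single event-loop thread."""
    while True:
        line = await reader.readline()
        if not line:  # b"" signals EOF, i.e. the peer disconnected
            break
        writer.write(b"got: " + line.rstrip(b"\r\n") + b"\n")
        await writer.drain()
    writer.close()
    await writer.wait_closed()


async def demo():
    """Start a server on a free port, send one line, return the reply."""
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(b"hello\n")
        await writer.drain()
        reply = await reader.readline()
        writer.close()
        await writer.wait_closed()
    return reply
```

Running `asyncio.run(demo())` should give back `b"got: hello\n"`. Shared state needs no lock here as long as it is only touched between `await` points, which is one practical difference from the threaded approach.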
What did we learn?
In the end I haven’t finished my code yet. Reading the existing solutions influences me, and just adapting what others did and submitting it would feel like cheating (to myself). The way to resolve this is to set my own goals on top of the original challenge. Here I picked the following, and achieving these would complete things for me:
- Use proper project structure and try out PDM
- Figure out how to set up the project & code to be testable with pytest (basically grok testing of programs that run servers)
The combination of these focuses on something akin to “going to production”, besides obviously writing the actual code, which is very much relevant to my interests.
So far I haven’t seen many examples of testing SocketServer, though there’s Python’s own test suite that could be a starting place. It has a lot of super useful helper functions (such as finding an unused port to run the server on), but overall it seems like a lot of boilerplate too. For asyncio I haven’t looked around yet. Since it is “cooler”, there might be more discussion around it, but it’s by no means a given. It would be interesting to combine this with a Basic Chat client as well.
Another impression from today’s effort is that Python modules are documented to very varying levels. Their complexity definitely jumps when I try to go from dummy stuff to anything useful, for example when trying to understand the proper role and interaction of the Server and Handler parts in this multi-user environment.
I’m also acutely aware that my networking knowledge is very patchy despite doing networking-adjacent stuff for decades. It’s a very useful frontier to tackle when I have a chance.
Finally, ngrok is still a very cool tool. It's nice to be able to sit in a cafe and safely expose a server to the internet.
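One possible shape for such a test, as a sketch under my own assumptions rather than something taken from Python's test suite: bind to port 0 so the operating system picks an unused port, run `serve_forever()` in a background thread, and talk to the server with a plain socket. `UpperHandler` stands in for the real handler, and `BackgroundServer` and `running_server` are hypothetical helpers. The test function is a plain function that pytest would collect by its `test_` prefix, and the context manager could be wrapped in a pytest fixture.

```python
import contextlib
import socket
import socketserver
import threading


class UpperHandler(socketserver.StreamRequestHandler):
    """Hypothetical stand-in for the real handler under test."""

    def handle(self):
        for line in self.rfile:  # iteration ends when the peer disconnects
            self.wfile.write(line.upper())


class BackgroundServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True


@contextlib.contextmanager
def running_server(handler_class):
    """Serve on an OS-assigned port in a background thread; yield (host, port)."""
    with BackgroundServer(("127.0.0.1", 0), handler_class) as server:
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            yield server.server_address
        finally:
            server.shutdown()  # stop serve_forever() before closing the socket
            thread.join()


def test_lines_are_uppercased():
    with running_server(UpperHandler) as address:
        with socket.create_connection(address, timeout=5) as client:
            client.sendall(b"hello\n")
            with client.makefile("rb") as stream:
                reply = stream.readline()
    assert reply == b"HELLO\n"
```

The client-side timeout is there so that a misbehaving server fails the test instead of hanging it.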