
Language of the Month: Rust, the results

Every now and then I do a “Language of the Month” feature where I spend one month learning a new programming language. This last month, November, I’ve spent with Rust, and it’s time to take stock. I’ll look at the impressions I gathered in this short time, show one project that I got done in Rust, and share some ideas about what I’d like to do with Rust in the future!

Experience

According to my time log, I have spent about 20 hours this month learning Rust. That’s way too little to have a good understanding, but definitely enough to make some educated guesses (and feel excitement, and horror, as appropriate). This time I’m generally very impressed, as Rust comes across as indeed a very modern and smart language, although that modernness is mostly in the tooling and the non-essential parts. It is also changing very quickly, for good and bad. Here is an admittedly subjective and incomplete list of observations: Good is what I like, Bad is what’s less nice IMHO, and Ugly is what’s imperfect or confusing (at this stage of my Rust learning):

Good

It’s great to see that documentation is not an afterthought but a core part of the language, one that makes use of a lot of modern development experience. Having a standard way to include example code in docstrings, with actual tests run on it to make sure the examples stay up to date with the code, is a very cool concept. Also, being able to auto-generate HTML documentation from the code is probably going to be standard in most new languages (I think Go does that too, and other languages have gained similar optional tools).
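
For illustration, a minimal sketch of what such a documented function might look like (the crate and function names are made up by me): cargo test compiles and runs the example inside the doc comment, so it cannot silently rot.

/// Adds one to the given number.
///
/// # Examples
///
/// ```
/// let answer = my_crate::add_one(41);
/// assert_eq!(answer, 42);
/// ```
pub fn add_one(x: i32) -> i32 {
    x + 1
}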

It’s also great to see that testing is not an afterthought either. There are a bunch of different ways of including tests (in the same file, in a separate test file, in a tests directory, in documentation…), though most of the time there’s probably one way that fits the project best, and you can use that.
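
As a sketch of the in-file variant (the function and test names are made up), the test module lives next to the code and is only compiled and run by cargo test:

pub fn double(x: u64) -> u64 {
    x * 2
}

#[cfg(test)]
mod tests {
    use super::double;

    #[test]
    fn doubles_small_numbers() {
        assert_eq!(double(21), 42);
    }
}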

Rust has some very interesting foundations, mostly built on the memory model and traits, that take quite a bit of time to wrap one’s head around (I’m not there yet), but they go a very long way towards eliminating programming by accident (or programming by permutation): it makes it much harder to write something, see whether it works as hoped, and if not, change something small and try again. That approach does not really work anymore, as there are way too many things that have to be correct for the code to run, and the best way to get there is to actually understand how the code should work.

Traits are intriguing; they remind me a bit of Python’s duck typing, though they are only superficially similar. This also ties into generics, or parametric polymorphism. Composed types are also the bomb – once you understand them and stop trying permutations of them just to get the code to compile. :)
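
As a rough sketch of how traits and generics fit together (the trait and the types are made up for illustration): a trait names a capability, and a generic function can accept any type that implements it.

use std::f64::consts::PI;

// A made-up trait with one required method.
trait HasArea {
    fn area(&self) -> f64;
}

struct Circle { radius: f64 }
struct Square { side: f64 }

impl HasArea for Circle {
    fn area(&self) -> f64 { PI * self.radius * self.radius }
}

impl HasArea for Square {
    fn area(&self) -> f64 { self.side * self.side }
}

// A generic function: works for any type implementing HasArea.
fn print_area<T: HasArea>(shape: &T) {
    println!("area = {}", shape.area());
}

fn main() {
    print_area(&Circle { radius: 1.0 });
    print_area(&Square { side: 2.0 });
}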

The use of semicolons is interesting, in particular that omitting the semicolon on the last expression in a function returns that value (a bit like Matlab, where the semicolon decides which results get printed on the console and which run silently).
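
A tiny sketch of what I mean (the function name is made up):

// The last expression without a trailing semicolon is the return value.
fn square(x: i32) -> i32 {
    x * x
}

fn main() {
    println!("{}", square(7)); // prints 49
}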

Having a way to mark unimplemented sections (as part of the standard library) is very pragmatic; I’m sure it came out of a lot of engineering experience.
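
A minimal sketch using the standard unimplemented!() macro (the function name is made up): the code compiles and type-checks, and only panics if the unfinished path is actually reached.

fn decode_header(_bytes: &[u8]) -> u64 {
    unimplemented!()
}

fn main() {
    let parse_now = false;
    if parse_now {
        // This branch would panic with "not implemented" at runtime.
        println!("{}", decode_header(&[]));
    }
    println!("builds fine even though decode_header has no real body yet");
}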

Strings being UTF-8 by default is a great choice and probably necessary nowadays, even if it makes them more complex. (Python goes a long way here, but is not quite there yet.)

I’m glad to see that lists can have a trailing comma after the last element. That’s a little thing, but boy, I cannot stand JavaScript lists because they lack this leniency.

Macros are a whole different level of power, if and when understood. It feels strange sometimes to mix “normal” functions and methods with something that is just “sort of like a function”, but this is all part of the mental model. I see that they are useful, though they also come across as necessary mainly because of the design principles used in the language (and its relative verbosity). That also means that advanced Rust probably makes more use of macros than beginner’s Rust.
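
A minimal macro_rules! sketch (the macro name is made up); note that familiar things like println! and vec! are themselves macros from the standard library.

macro_rules! say_hello {
    ($name:expr) => {
        println!("Hello, {}!", $name)
    };
}

fn main() {
    say_hello!("Rust");              // expands into the println! call above
    println!("{:?}", vec![1, 2, 3]); // vec! is a standard-library macro
}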

Enums are data types containing a list of variants. Coverage checking in enum matching (“did I handle all possible cases?”) shows again that plenty of lessons learned were put to use in making Rust.
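
A sketch of what that coverage checking buys you (the enum is made up): leaving any variant out of the match is a compile-time error, not a silent gap.

enum Direction {
    North,
    South,
    East,
    West,
}

fn describe(d: Direction) -> &'static str {
    // Omitting any of the four variants here would not compile.
    match d {
        Direction::North => "up the map",
        Direction::South => "down the map",
        Direction::East => "to the right",
        Direction::West => "to the left",
    }
}

fn main() {
    println!("{}", describe(Direction::East));
}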

Loop labels are cool too; they can come in handy with nested loops to handle things clearly with less boilerplate and fewer extra variables.
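
A small sketch of a labeled loop: breaking out of both loops at once, without an extra “found” flag variable.

fn main() {
    let grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]];
    let target = 5;

    'outer: for row in &grid {
        for &value in row.iter() {
            if value == target {
                println!("found {}", value);
                break 'outer; // leaves both loops in one step
            }
        }
    }
}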

Cargo is a very interesting package manager. It has separate concepts for binaries and libraries and for debug and release builds, plus commands for running tests, linting, generating documentation, setting up cross-compiling, and so on. I can also see from bits and pieces that there’s a lot to learn here too to enable really cool use cases.

Bad

Most of the difficulty I had stemmed from the combined fact that Rust is pretty new and changes quickly. Because of this, others’ projects where I found answers to my questions could already be obsolete, the docs don’t go into enough detail to actually understand things without a lot of practical trial and error, I could find a programming pattern for a problem only to learn it had already been removed from the language, and (I think) the nightly builds are recommended because they have the latest features enabled? This of course is not all bad: I feel that being a Rust developer now, one could really leave a mark and contribute, learn a lot in uncharted territory, and teach a lot of people, because too few teachers are out there. It’s only troublesome when one just wants to “solve a problem”, but then maybe another language is a better fit.

I would love to see more documentation (and actual practical examples) exploring the memory model, references and borrowing, traits, and macros – the most powerful and useful parts of the language. This is connected to the previous point, and I think it’s up to us users to use them more, boil the examples down through experience, and write things up…

Ugly

Type declarations are required for functions, but for variable assignments they are inferred. The docs say it’s a sweet spot, but it strikes me as a bit schizophrenic. I often find that adding an explicit type to assignments makes things work a lot better here.
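
A small sketch of the split (the names are made up): function signatures are always spelled out, local bindings are usually inferred, and an explicit annotation sometimes resolves an ambiguity.

// Function signatures must declare their types...
fn add(a: u64, b: u64) -> u64 {
    a + b
}

fn main() {
    // ...while local bindings are usually inferred.
    let x = add(40, 2);

    // Here an annotation is needed: parse() alone would not know
    // which numeric type to produce.
    let port: u16 = "8080".parse().unwrap();

    println!("{} {}", x, port);
}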

I found the docstrings hard to get going with, e.g. making them work for generating the library docs from inline documentation. I might have done something wrong, but even the example code in the Rust book does not seem to play with doc generation very well. Add example code and test code to the docstrings and the confusion amplifies.

The difference in method syntax between associated functions and method calls is probably a necessary one, but it still feels like a bit of a mental burden to carry (at least in the beginning; I’m sure it’s a beginner’s issue).
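
A tiny sketch of the two syntaxes side by side:

fn main() {
    let s = String::from("hello"); // associated function, called on the type with ::
    let n = s.len();               // method, called on the value with a dot
    println!("{} has {} bytes", s, n);
}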

Diverging functions are an interesting beast too: there’s a separate declaration for functions that never return. I kinda see how doing things this way can fit Rust’s model of doing things, but is it really necessary to create a whole new concept and different patterns for it? I guess this case could be handled conceptually in many ways and this is just one…
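
A small sketch of what the declaration looks like (the function is made up): the return type ! marks a function that never returns, which lets it stand in wherever some other type is expected.

// A diverging function: the ! return type says it never returns.
fn fail_with(message: &str) -> ! {
    panic!("fatal: {}", message)
}

fn main() {
    let input: Option<u64> = Some(42);
    // Because fail_with never returns, its arm still type-checks against u64.
    let value = match input {
        Some(v) => v,
        None => fail_with("no value provided"),
    };
    println!("{}", value); // prints 42
}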

You need to import types from the standard library to be able to use some of their features, while some other features work automatically. This feels like a halfway solution; it might be necessary to keep the resulting binaries small (by not including what’s not needed), but that could probably be achieved more automatically?
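
A small sketch of what I mean: write_all() on a Vec<u8> only compiles once the std::io::Write trait is in scope. (As a commenter explains below, this is really about traits and avoiding name conflicts.)

use std::io::Write; // without this import, the write_all call below does not compile

fn main() {
    let mut buf: Vec<u8> = Vec::new();
    buf.write_all(b"hello").unwrap();
    println!("{} bytes written", buf.len());
}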

Attributes are flags in the code that (sort of) tell Rust how to handle the different code sections: enabling debug mode, adding info about tests (that the next test should actually fail for it to count as a pass, etc.), and a lot more. They are very powerful, so they could be considered Good, but there are so many of them, and they are so sparsely discussed most of the time, that they belong for me more in this section. More documentation is needed, I believe.
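
A small sketch with a few attributes I ran into (the struct and the test are made up): derived trait implementations, a lint override, and a test that is expected to panic.

#[derive(Debug, Clone)]   // ask the compiler to generate these impls
struct Point {
    x: i32,
    y: i32,
}

#[allow(dead_code)]       // silence a specific lint for this item
fn unused_helper() {}

#[test]
#[should_panic]           // this test passes only if the code panics
fn indexes_out_of_bounds() {
    let values: Vec<i32> = vec![];
    let _ = values[0];    // indexing an empty Vec panics
}

fn main() {
    println!("{:?}", Point { x: 1, y: 2 }.clone());
}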

Project: Bitmessage Proof-of-Work library

I wanted to actually do a project in Rust for practical experience. To do something short but still somewhat useful, I decided to create a Bitmessage proof-of-work (PoW) library. I’ve already worked on this for a bit in other projects (trying to make OpenGL- and Parallella-accelerated versions), so it was a familiar and tractable task for the month.

The Bitmessage proof of work in pseudocode is something like:

initialHash = hash(dataToCheck)
resultHash = hash(hash( nonce || initialHash ))
trialValue = the first eight bytes of resultHash converted to an integer

where “hash” is SHA512, and the trialValue is compared to a target calculated from the time-to-live of the message to be sent.

I took most of the inspiration from the Python version of this in the reference client, which reads:

trialValue, = unpack('>Q',hashlib.sha512(hashlib.sha512(pack('>Q',nonce) + initialHash).digest()).digest()[0:8])

Hard to beat a practical one-liner for clarity, but there’s a bit to say about performance.

I ended up creating bmpow-rust, which runs a multithreaded calculation of this algorithm using the rust-crypto crate. I’ve been through quite a bit of trial and error, both for the PoW algorithm and for the threading.

For the PoW, the hardest part was casting the data into suitable types so that the different functions could pass them around correctly: how to get data from a caller into a shared library, how to set the endianness, how to pass it to the hashing function, and then how to get the data out again. I know the current version is not great, especially as some modifications to the code that I think should not change the outcome actually do end up changing it.

Basic threading is dead simple, as Rust promises it to be, but the issue I had was how to send data out of the different threads when one finds the right nonce, and how to signal the other threads to stop working. Channels are there to help (and they are brilliant), but the memory model (borrowing and referencing) didn’t fit how I thought it should work. I tried out 2-3 other libraries that implement similar messaging methods between threads, but in the end it turned out that channels are indeed the right choice – the trick was that I didn’t need to explicitly shut down threads, as they were killed when the calling function finished, making the whole setup much simpler. I know I should add some more error checking to the code for niceness, though I think it’s not strictly required by the problem being solved (and race conditions don’t matter: if multiple threads find nonces, any of them is suitable).

The result is this (likely horrifyingly bad) code to find the right nonce (so it does more than the pseudocode above):

// Uses the rust-crypto and byteorder crates (declared with extern crate at the crate root).
use std::io::Cursor;
use std::sync::mpsc::Sender;

use byteorder::{BigEndian, ReadBytesExt, WriteBytesExt};
use crypto::digest::Digest;
use crypto::sha2::Sha512;

fn bmpow(target: u64, hash: [u8; 64], starter: u64, stepsize: u64, chan_out: Sender<u64>) {
    let mut nonce: u64 = starter;
    let mut algoresult;

    loop {
        let mut wtr = vec![];
        let mut result: [u8; 64] = [0; 64];
        let mut hasher_inner = Sha512::new();
        let mut hasher_outer = Sha512::new();

        nonce += stepsize;
        // Serialize the nonce as big-endian bytes, as the protocol expects
        match wtr.write_u64::<BigEndian>(nonce) {
            Ok(_) => {},
            Err(e) => { println!("error writing endian: {}", e) },
        }
        // Inner hash: SHA512(nonce || initialHash)
        hasher_inner.input(&wtr);
        hasher_inner.input(&hash);
        hasher_inner.result(&mut result);

        // Outer hash: SHA512(inner hash)
        hasher_outer.input(&result);
        let mut r2 = vec![0; 64];
        hasher_outer.result(&mut r2);

        // Read the first eight bytes of the outer hash as a big-endian u64,
        // converting to the endianness of the system
        let mut rdr = Cursor::new(r2);
        algoresult = rdr.read_u64::<BigEndian>().unwrap();
        if algoresult < target {
            // Report the winning nonce to the caller and stop this worker
            chan_out.send(nonce).unwrap();
            return;
        }
    }
}
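
For context, here is a minimal sketch (not the actual library code; a dummy check stands in for the double SHA-512) of how such workers can be driven from the calling side with standard threads and a channel, each worker scanning its own slice of the nonce space:

use std::sync::mpsc::channel;
use std::thread;

fn main() {
    let num_threads = 4;
    let (tx, rx) = channel::<u64>();

    for i in 0..num_threads {
        let tx = tx.clone();
        thread::spawn(move || {
            // starter = i, stepsize = num_threads, like bmpow above.
            let mut nonce = i as u64;
            loop {
                nonce += num_threads as u64;
                // Dummy "proof of work" condition instead of the real hashing.
                if nonce % 1_000_003 == 0 {
                    // Ignore the error: the receiver may be gone if another
                    // worker already reported a result.
                    let _ = tx.send(nonce);
                    return;
                }
            }
        });
    }

    // Block until the first worker reports a result; the remaining threads
    // die with the process, as described above.
    let winning_nonce = rx.recv().unwrap();
    println!("found nonce {}", winning_nonce);
}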

The resulting library performs pretty well: the multithreaded Python solution clocks in at ~400,000 checks/s on my X201, while the Rust version does 800,000+ checks/s on a single thread. When using multiple threads it does not scale linearly, though, and is much more variable: I’m getting 800,000 – 1,800,000 hashes/s, different on each run. I guess spreading the work between cores is not very consistent in Rust yet? I will need to investigate more, as the higher performance would be very nice to have reliably.

In the meantime, the library is available on Github. Feel free to send me a bitmessage at BM-NBooR8MZhawaba2hW6nwPHvNiQKrTVCB .

Future

I’ll definitely explore Rust more; it’s a very interesting language, and even through the difficulties of understanding some parts, I felt a strange happiness working with it. It feels good, even though I don’t quite know why yet.

Some projects that I’ve noted down to do (eventually):

An IPython kernel for Rust. I’m not sure whether it really makes sense, or is even possible, but when I see how many languages are listed as having kernels for IPython/Jupyter, it has to be!

Rust is aimed at being a systems programming language, and it’s fun to see how people use it to cross-compile for different architectures. Combining inspiration from Bare Metal Rust and another language’s demo of cross-compiling for the Pebble (I lost the link), I’m thinking about making Rust bindings for the Pebble SDK to build watchfaces and apps in Rust. This is not a fully formed idea, and there are others who have already taken the first steps (e.g. a Pebble app written in Rust), but I’m sure there’s a lot more of the path to map.

These are just samples; more ideas will likely come as I take on more projects (boy, do I already have a lot of projects that I want to do…)

Links of Note

Some links that might be useful for others as well:

Similar writeups to this:

Also, thanks to This Week in Rust for mentioning the previous post of this Language of the Month feature, and to all the people who left a comment there to educate me about Rust, really appreciated!

7 replies on “Language of the Month: Rust, the results”

Hey there! Glad you’re having fun with Rust. I wanted to comment on one or two things…

> I might have done something wrong, but even the example code in the Rust book does not seem to play with doc generation very well.

Please don’t hesitate to file bugs or post on users.rust-lang.org if something isn’t working! Even if it’s just a misunderstanding, it can help us make explanations more clear.

> I kinda see how doing things this way can fit Rust’s model of doing things, but is it really necessary to create a whole new concept and different patterns for it?

Yes, it’s important for type checking. You might find this interesting: https://users.rust-lang.org/t/what-s-the-difference-between-functions-with-no-return-value-and-diverging-functions/3874

> Need to import types from the standard library to be able to use some of their features, but some other features work automatically.

It’s not that, it’s about not causing conflicts. If we automatically imported traits, then, for example, if library A defines a trait with a `foo()` function, and so does library B, you couldn’t use them together, as their `foo()`s would conflict. By requiring you to import, you can choose which trait gets used, and in which scope.

Anyway, thanks for the great post!

Brilliant, thanks for the insights, Steve!

Often I feel it would help a lot if I had the CS basics down better, as I could then appreciate the choices that go into language design a lot more (self-taught programmer…). Rust really made me want to dig deeper, which is a great feeling. I will be learning to get things less wrong. :)

Will definitely file bugs and send patches whenever I figure out something!

The rationale for how Rust handles explicit vs. implicit typing is:

1. Excessively explicit typing is bothersome and doesn’t really do much.

2. Nothing is stopping you from adding explicit type annotations. (See examples of how verbose it looks when you explicitly provide lifetime annotations everywhere that they’d otherwise be inferred.)

3. The Rust compiler will only infer a type in an unambiguous situation.

4. Function signatures must be explicitly typed because they are part of the API and ABI. (Otherwise, you end up with the massively chained/nested C++ type signatures that slow the compilation process and make errors obtuse and difficult to understand… and nobody wants that.)

Thanks for the write up, it is a very well written post.

I have been focusing a lot on unit tests lately in C for embedded. As you have said, it is very nice to see that unit test cases are treated the same as the code that you write. Do you know if the Rust compiler will execute these tests automatically when a build is made? If a test fails, the build is stopped, analogous to getting a compile-time error.

Good question. As far as I can see, you need to run “cargo test” separately from “cargo build” at the moment. If you want to run them at the same time, you could possibly put them together in a Makefile? But I’m sure there are even better solutions that I don’t know about yet :)
