Unterstanding Rust’s Vec and its capacity for fast and efficient programs

You want your programs to be fast and memory efficient. Rust’s built-in collections give you both speed and efficiency.

The most important collection is std::vec::Vec (just called Vec below), which is a dynamically growing array similar to std::vector in C++ or ArrayList in Java (but unlike the persistent Vector in Scala).

Vec is part of std::collection and probably the collection you’re are using the most in Rust. Actually, the Rust docs recommend that in most cases you just use Vec if you need a dynamically growing array (list) or a HashMap in case you need a map.

Vec is built on top of alloc::raw_vec::RawVec. This data structure itself is not meant to be used by Rust programmers. It has a lot of unsafe code in it (“unsafe” meaning the Rust language keyword. See the Rust language documantation for details).

The good news is that by using Vec you’ll never have to deal with unsafe code yourself. It has all been taken care of by the Rust core developers and you can use Vec with all the guarantees the Rust compiler is giving you (like memory safety, no data races using threads, etc.).

In order to get the most out of Vec you need to understand what capacity means.

A Vec stores it’s data in a buffer that can grow dynamically when you add elements. You can create a new vector that has an empty buffer or you can allocate a vector which already has a certain buffer size. That is known as the vector’s capacity.

As with everything in life and in programming there are pros and cons. Dynamically growing a vector involves allocating new memory and potentially copying the data to a new memory location. That has costs.

Let’s say you start with a Vec with a capacity for holding 5 integers that are 32 bit wide (i32).

That allows you to add 5 elements to the Vec.
If you add a 6th element the Vec (or to be more specific the internal RawVec) needs to allocate more memory internally.

If you start with a small Vec and add lots of elements it has to reallocate several times. That could result in bad performance.
In order to avoid too many reallocations Vec does allocate more than just the space for one element once it notices that a reallocation is necessary. That is because Vec assumes that you’re probably going to add even more elements and adding those will be much cheaper if there is already enough space allocated.
Of course if you don’t add any further elements then allocating more memory than what is strictly necessary will wasted memory resources.

That is the trade-off when using a dynamically growing data structure. Rust allows you to tune this in certain ways if you have an idea of the amount of elements you’re going to store.

For example if you know that your Vector will need space for 15-20 elements but not more you can allocate a Vec with an initial capacity of 20 and you’ll never have to allocate new memory.

Sometimes you’ll only know during runtime that you need to store more elements in a Vec. This is were the reserve and reserve_exact methods come in.

Below we will look at a lot of examples that explain in detail how the capacity of the Vec is affected by the different methods.

1) Empty Vec, reserve, clear and shrink_to_fit

In this example we create an empty Vec. An empty Vec does not allocate capacity and therefore it is 0. When we reserve space for 8 elements the internal RawVec allocates space for 2x as many elements resulting in a capacity of 16.
clear() doesn’t affect the capacity at all, it would just remove any element in the Vec. shrink_to_fit() frees any allocated memory that is not used, in our case the Vec is back at a capacity of 0.

2) Empty Vec, vec! and pushing elements

Here we create an empty Vector with a capacity of 0. When we add 1 element the Vec allocates a capacity of 4. This is a special case for empty Vecs. Normally the capacity would be doubled but for empty Vecs a space of 4 elements is allocated once the first the element is added.

When we create a Vec with vec! and only 1 element in it it has a capacity of 1. After adding a 2nd element the capacity is not increased to 4 (as when adding an element to an empty Vec) but it is just doubled to 2.

Similarly when we create an Vec with vec! and 4 initial elements. Then the capacity is doubled to 8.

3) Empty Vec, pushing one element, reserve() and shrink_to_fit()

Here we create another empty Vec and push 1 element. This results in a capacity of 4. When reserving 8 more the resulting capacity is 18.
Why is that? The Vec adds the number of elements (here 1) and the reserved capacity of 8 and then multiplies the value with 2.
In this case this is (1 + 8 ) * 2 = 18.
When we call shrink_to_fit() it will reduce the capacity back to 1.

4) Vec with initial capacity

If you already know how many elements your Vec will contain (either exactly or even just approximately) you can create a Vec with an an initial capacity. Then the Vec can already allocate all the necessary memory and doesn’t have to reallocate memory once you add elements – unless you add more than the initial capacity.
Here we create a Vec with an initial capacity of 5 and add a few elements. As long as we don’t add more than 5 elements, no resizing happens and the capacity stays at 5. If we add a 6th element the capacity doubles to 10.

5) Vec with initial capacity and a later call to reserve()

This is another interesting example. When we create a Vec with an initial capacity of 5 reserving 4 elements doesn’t affect the capacity.
When we reserve 6 elements it allocates space for twice that much (6 * 2) resulting in a capacity of 12.

6) Vec with initial capacity, push() and a later call to reserve()

This is a slightly different version of the one before. Here we push 2 elements to a Vec of 5. When we later reserve space for 4 more elements the Vec resizes to 12. It adds the number of elements already in it and the number of newly reserved elements (2 + 4) and multiplies them with 2. That results in a capacity of 12 ( (2 + 4) * 6 ). Another call to reserve(6) has no effect as the Vec already has a capacity of 12 but only 2 elements in it.
If we had called push() 5 more times after the first call to reserve than the 2nd call to reserve(6) would have had resulted in a reallocation.
Why? Because there would have been 7 elements in the Vec (2 pushed before the 1st call to reserve() and 5 pushed after that). In that case reserve() would have added the elements in the Vec (7) to the newly reserved capacity (6) and multiplied that with 2.
That would have resulted in a capacity (7 + 6) * 2, that is 26. Try it out for yourself to see the effect!.

7) Vec with initial capacity and reserve_exact()

reserve_exact() is important when you want to reserve extra space but already know how much you’re going to need. This is especially important when you want to reserve a lot of extra memory, let’s say for 500,000 new elements. If you just call reserve() the allocated memory might be twice as big (see examples above) and you’ll end up with memory allocated for 1,000,000 elements.
If you only need memory for 500K elements then you would waste a lot of memory. If this is in server code and you have a lot of requests calling your function at the same time, you might run out of memory. Rust immediately frees memory when no longer needed (remember, it has no GC) but if the Vecs stay around for some time (e.g. because your code is waiting in a blocking call to a database) it might cost too much and your program might stop working or crash.
In this example reserve_exact(3) has no effect on the Vec‘s capacity as it already has a capacity of 5 and only 2 elements in it.

8) Vec with initial capacity and reserve_exact(), 2nd example

Here we have a very similar example to the one above but now we are reserving exactly 4 elements in a Vec that has a capacity of 5 and already 2 elements in it. Because 2+4 does not fit into a capacity of 5 extra space is allocated. In this case the new capacity is 6 (2+4).

I hope all these examples shed some light on the Vec.

You can find all the examples of this article on Github:
https://github.com/MarkusJais/rust-collection-examples/blob/master/src/bin/vec_capacity.rs

Other data structures building on Vec

Several other data structures build on top of Vec. These include the VecMap (a Map based on Vec), BinaryHeap (a Priority Queue) or VecDeque.
I will cover those data structures in more detail in upcoming articles.

A final note about Performance

In many cases tweaking the capacity and reallocation of memory for the Vector might not make a lot of difference in your application.
Good architecture (e.g. avoiding needless network calls, batching of requests, etc) might have a much more noticeable impact.
That said, in some cases every bit of improved performance is better, e.g. when you’re writing a new NoSQL database, a big data processing tool or other software that needs to deal with a lot of data or a lot of requests or both. If you use Vec a lot in that kind of code, knowing how it works internally is crucial.
BTW, Rust is a great language for writing high-performance NoSQL databases or big data processing tools and I expect to see more than one in the future).

The Rust team has put a lot of effort into making Vec work as good as it gets. Don’t try to reinvent the wheel. It is a fantastic data structure.
Only come up with something new if you’ve done tests and really, really (I mean really!!!) know that Vec is not good enough for your requirements.

This might change

The code was tested with the stable Rust 1.13 version. The exact algorithms for reserve() are implementation details, so this might change in future versions of Rust.

New book: Fluent Python by Luciano Ramalho

O’Reilly has just published a new and very interesting book about Python called Fluent Python – Clear, Concise, and Effective Programming and is written by Luciano Ramalho.

It is targeted at Python 3.4 but might also be useful for someone stuck with the old 2.7 version.

An advanced book, not for Python beginners. It looks like THE most advanced book for Python and is huge with more than 700 pages.

Although today I very much prefer statically typed languages with powerful compilers like Rust or Scala (not everything can be checked with unit tests) I still like Python. It is my favorite non statically typed language and fun to use.

A review will be published when I am done reading it – which might take a while because there is so much stuff about Rust I want to read :-)

Book review: Rust Essentials

Rust is a very exciting new programming language developed by Mozilla. In Mai 2015 Packt Publishing published the first print book, also available as an ebook incl. in PDF format (DRM-free!!!) about Rust. It is called Rust Essentials and it is a rather short book covering the basics of Rust so that an experienced developer can start using Rust quickly.

Overall I liked the book and I’ve learned a few things I hadn’t already known or missed when reading the online documentation.
As the title says this book covers just the essentials. It often refers to Rust’s excellent online documentation for further details. So this book will not turn you into a Rust guru but that is not the goal of the book.

Rust Essentials covers most of the core ideas and concepts of Rust incl. the type system, control structures, how Rust achieves memory safety and how to use the pointer types in Rust. The module system, basic concurrency and even macros are also covered to some extent.

My favorite chapters where chapter 5 about “Generalizing Code with Higher-order Functions and Parametrization” and chapter 6 about “Pointers and memory safety”. Examples where well chosen to explain the concepts and I liked the idea of a computer game (although I do never play games mysefl). In the 6th chapter about pointers chapter I missed a deeper coverage of types like Ref and RefCell and I definitely missed at least a short introduction to the Cow (clone-on-write) pointer type. But maybe these types are too advanced for an “Essentials” book and belong into a bigger book (there is at least one in the making, see below).

The writing style the book is good and easy to follow. There are a few typos but nothing serious (I hope the publisher will update the ebooks for all customers who bought the electronic versions when the errata becomes available).
The code examples are easy to follow (note to Publisher: Please use syntax highlighting in the PDF!) and is also available on Github.

The question is: Is the book necessary if you’ve already read the online documentation? For some, probably not. Other, though, learn better when reading more than one explanation about the same topic and for them reading Rust Essentials in addition to the online book is a good idea.

Summary: In and of itself a good and easy to read introduction to Rust. If you want extra reading material in addition to the online docs, this book is a good choice.

Publishers website for the book:
Rust Essentials

Alternatives:
At the time of this writing there is not other Rust book available. One has already been announced and that is Programming Rust written by Jim Blandy and will be published by O’Reilly. It is currently scheduled for November and will have 400 pages (according to the book’s website).

Book review: Scala Cookbook

Disclaimer: I received this book as part of the O’Reilly Blogger Review program. The opinion about the book below is only my own!

Since O’Reilly published the Perl Cookbook many years ago I have become a huge fan of that style of books. Back then – when I was still using Perl – this book was a tremendous help for all kind of things I wanted to do with Perl. Later the Ruby, Python and Java cookbooks were equally helpful.
Now O’Reilly has pubished a Scala cookbook and I am extremely happy with the result. The book is pretty big with more than 700 pages (and an extra chapter on Play is available from the author’s website because it didn’t make it into the book).

The book covers many things you will encounter daily in your Scala projects from creating classes, using Scala’s amazing collections, reading files to working with actors (using Akka).

Each recipe has a problem, shows and solution and then goes into detail explaining the solution. If you already know other O’Reilly cookbooks, you will be very familiar with the book’s style.

My favorite chapters are the ones on functional programming and the Scala collection library.

The chapters on functional programming are very pragmatic without much theory. You won’t find advanced discussions on Monads (there are other books for this and many websites) but many very useful tips on how to create your own functions and use them (as well as the existing ones from the Scala standard library) effectively in everyday projects.

The two chapters on the Scala collection library are about 130 pages long and explain in great detail the amazing features of what I consider the best collection library for any programming language out there.

The writing is clear and very easy to follow and the sample code is easy to read. Because it is all in small recipes you don’t need to read through lot’s of theory to figure out how to open a file or write a partial function.

Summary:
If you are a beginning or intermediate Scala developer the Scala Cookbook is a must have. Even advanced Scala developers should save time looking up solutions from the book – you just can’t have everything in your head 🙂
This book would also make a very good second Scala book to deepen your knowledge after you’ve read an introductionary book (Scala for the Impatient by Cay Horstmann is my favorite)

More information from the publisher:
Scala Cookbook

Book review: MongoDB – The Definitive Guide, 2nd edition

Disclaimer: I got a free copy of this book as part of the O’Reilly Blogger Review program. The opinion here is not influenced by that.
It is only my own and honest opinion of this product.

MongoDB is one of the stars in the NoSQL market (surpassed in popularity maybe only by Cassandra) and several books are now available.
One of the most popular ones has been MongoDB: The Definitive Guide of which a 2nd edition is now available .

The 2nd edition is about twice as big as the first edition and has over 400 pages.

The book covers everything from the basics to replication, sharding, server administration and schema design.
I liked the examples chosen. They are not too complicated and as a reader I could concentrate on the MongoDB stuff and didn’t have to invest a lot of time to understand the business logic of the examples (some books come up with way to complicated examples and then have a hard time explaining what’s really important to the reader).

The text is easy to read and never gets boring. I like the fact that the PDF ebook has syntax highlighting in the code examples, that makes the examples easier to read.

I would have liked to see a short comparison with some other NoSQL databases like Riak or Cassandra but one can find this information online via a Google search easily.

The book is not intended as a guide to all the different drivers for different languages. But when you understand the content of the book, it shouldn’t be too hard to use it from any programming language that has a decent driver for MongoDB like Scala, Java, Python or Ruby.

Summary:
If you are a developer or system administrator working with MongoDB and want a comprehensive, easy to read book, I highly recommend “MongoDB: The Definitive Guide, 2nd Edition”.

More information:

O’Reilly book website:
MongoDB – The Definitive Guide

Video review: Functional Thinking by Neal Ford, O’Reilly Media

Disclaimer: I got a free copy of this video as part of the O’Reilly Blogger Review program. The opinion here is not influenced by that.
It is only my own and honest opinion of this product.

I’ve liked Neal Ford’s freely available talks about functional programming and I’ve attended one of his talks last year at WJAX in Munich. He is a great communicator and has a vast technical knowledge.

Functional Programming is one the rise and in my opinion something every good developer should learn. This new video Functional Thinking is a good way to get started.

In the first part Neal explains why functional programming is something that every developer should now care about. He makes a very good point about why and how functional programming can enrich your toolbox. For example he mentions that Clojure is to concurrency what Java was to garbage collection. While in Java you don’t have to worry much about memory leaks, in Clojure (and of course other functional languages) you don’t have to worry about low level concurrency details like locks.

Neal does a great demonstration showing the beauty of functional code when he transform a verbose Java method from Apache Commons into some beautiful Clojure code and then explains how much better and more general the Clojure version is.

He explains very clearly some of the main concepts of functional programming like laziness, strict evaluation or high-order functions and closures (and it makes you realize how “noisy” Java (before Java 8) is compared to more functional languages like Scala).

Java is used for many examples and when Java doesn’t offer what is needed Neal switches to Groovy most of the time but also shows Scala and Clojure versions. I personally would have preferred more Scala and less Groovy but this is just a personal preference. Functional libraries like Functional Java or totallylazy are also introduced.

I’ve found only one error in the videos: When Option is presented the slide and Neal say that the code is Scala but in fact it is Java.

I really liked Neal’s recommendation to look at the concept of new languages like Clojure instead of just comparing them with Java or C++ or looking at the syntax. Too often programmers ignore the newer languages because the don’t feel familiar at first.

All the code samples are easy to follow and Neal is really great at explaining new concepts to OO programmers who come from a language like Java, Ruby or C++.

Conclusion:
This video is a fantastic introduction to functional programming if you come from an object oriented language and now want to learn more about functional programming or just see what it is all about and why so many people are now talking about it.

Even if you’ve already using functional programming concepts and languages for a while you can probably learn something new or get a new perspective on some functional concepts.
So unless you’ve just written your own Haskell compiler, I highly recommend watching this video to get started with functional programming.

The talk also shows how badly Java needs lambda expressions and a more functional collection library! 🙂

After watching that video it is a good idea to download Clojure or Scala and play with it. Neal Ford (together with Stuart Halloway) has recently also released a new almost 6 hour video about Clojure. And Martin Odersky’s free course about Functional Programming Principles in Scala is available online for free (if you don’t want to do the exercises, you can just download the videos).

You can find the video Functional Thinking by Neal Ford here:
Functional Thinking by Neal Ford

Scala: Put a long one line operation on collections in it’s own method

Scala collections are extremely powerful and once you get used to all the methods and different collections you don’t want to get back to other programming languages that don’t offer all that power.
Often you can write rather complex operations in just one line instead of many loops and tempory variables thanks to methods like filter, map, foldLeft, reduceLeft, etc.

But while it is easy to chain such method calls for powerful transformations it can take a while for someone new to Scala to understand your code:
Let’s take the following code as an example. It is a contrived example about how to build a URL query string from a Map[String, String].

The code is a “one liner” even though it is better to write it over several lines as I did here to make it more readable.
What the code does is build a query string for a URL from keys and values in a Map.
The code is not really difficult.
In the first line uses the filterKeys method to filter out only the books that belong to the fantasy category. The second line does the URL encoding using UTF-8, the builds the query string using foldLeft and at the end calls dropRight to get rid of the last “&”.

This code is quite easy to understand when you have some experience with Scala. You could come up with a similar code in Ruby or even Java 8 using the new lambdas.

But I recommend putting code like this in it’s own method with a good name. This makes it more readable for people new to your code – and to you too when you get back to the code after few months.

The following code shows how this might look like:

Calling the method makes the code easier to read and everyone who reads your code does not need to know the details of how you build the query string.

Always try to make the code as easy to read as possible.

Note that I prefer to add the return type to the method declaration. This is not necessary because the Scala compiler can figure it out automatically but it makes the code easier to read and when you change it later and make a mistake and return a different data type the compiler will tell you immediately.

Book review: Confessions of a Public Speaker by Scott Berkun

I like to go to developer conferences. It is great to meat new people and learn something about new technologies like Clojure, Scala or Akka.
Unfortunately sometimes the talks I listen to during conferences are terrible. Very boring speaker, many bullet points, monotone voice and other things make it hard to stay awake after 5 minutes.

Giving good talks is not easy and I have been guilty of giving bad and boring talks myself.

The good news is that public speaking can be learned. Not everybody will be able to speak like Steve Jobs or Barack Obama but everybody can improve. (If you want to learn how Job and Obama do or did it, I recommend The Presentation Secrets of Steve Jobs: How to Be Insanely Great in Front of Any Audience and Say it Like Obama and Win!: The Power of Speaking with Purpose and Vision, Revised and Expanded Third Edition).

The most important thing to do is to actually practice public speaking by giving talks. For next year I plan to give at least one talk a month. All this helps to practice. But practice is only good when you practice how to do it correctly. If you practice bad things you will just get better at doing bad things.

There are many ways to learn how to be a better speaker incl:

  • Listening to talks on youtube
  • Going to great talks and ask the speaker for advice
  • Going to a Toastmasters club near you. (I will do that next year)
  • reading a good book about public speaking

Reading a good book is always a good idea and Confessions of a Public Speaker by Scott Berkun is one of the best books I’ve read so far on public speaking.

It is a much more personal book than most other English and German books I’ve read (or plan to read) on public speaking.

Scott writes about many things a public speaker (from a professional speaker to anyone who wants to give a talk) should know incl:

  • How to deal with your fear of public speaking
  • How to work a tough room
  • How to prepare
  • How to talk on television
  • How not to be boring
  • How realistic it is to make $30,000 an hour
  • How to teach people something
  • Funny and scary stories by other speakers and what went wrong during their speeches

Scott not only explains what went well but also what went wrong during his own speeches.

Scott’s book is not only a great source of information about public speaking, it is a joy to read and often very funny because Scott is also a great writer (check out this blog)

Confessions of a Public Speaker is great for you if you want to give a talk at work or during your free time. No matter what you want to talk about, this book will be helpful. Highly recommend!
Note: Just reading through the book once is a good start but to actually get a lot out of it, it is important to check your talks against the advice in the book and see how you can improve your speeches and presentations!

Author’s website:
www.speakerconfessions.com

Scala Cookbook announced

The cookbooks from O’Reilly are among the most popular books among developers. I was very happy this morning when I discovered that there is now an upcoming Scala Cookbook. I am sure it will be a great book for all Scala developers.
It is written by Alvin Alexander.

Once available, I will publish a review of the book. It is currently schedule for April 2013, according the the publishers website.

Order it from amazon:

New book “Akka in Action” announced

The Akka framework is one of the stars in the Scala and Java world. It was only a matter of time until books would appear.
Now the “Akka in Action” book has been announced. It is scheduled for final publication in Spring 2013 but 2 chapters are already available.
The “in action” serious of Manning has been very good so far and I have several very good books published by Manning about different topics like C++, Java, Scala or Ruby.

More information about the upcoming Akka book can be found here:
Akka in Action

By continuing to use the site, you agree to the use of cookies. more information

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.

Close