BitSets in Scala are much more fun than in Java

Recently I played with BitSets in Java because I needed an efficient way to store huge amounts of long values.
For Java there is the java.util.BitSet class. It is a very efficient implementation when you only need to store bit values.

Here is a trivial example on how to use it:

It is very easy to understand and use. But it seemed to require quite a lot of code and using methods like “and” doesn’t seem as natural as using “&” to get the intersection of two BitSets or “&~” to get the difference. In a language like C++ you can use operator overloading to get a more natural way of dealing which those operations.

After playing with the Java code, I looked up BitSets in Scala and with Scala using BitSets is much more fun.
See here for a simple example:

Scala BitSets have many advantages over Java BitSets:

  • In Scala you can add values to a BitSet in the Constructor. In Java you can only create a empty BitSet (optionally specifying it’s size) and then you have to use the set method to add the individual elements.
  • Scala allows you to define methods like “&” “&~” or “+” which allows for a more concise and natural code. Some more examples:

  • In Scala BitSet comes in two versions: scala.collection.immutable.BitSet and scala.collection.mutable.BitSet. They are almost identical but the mutable version changes the bits in place. This is the same behaviour as in the Java java.util.BitSet class and is slightly faster than the immutable one (no copying required). I prefer the immutable one when performance is not an issue (make sure to profile to see if you really need the mutable one) because immutable data structures are much better for concurrency.
  • In Scala the BitSet classes are part of the Scala collection framework and give all the great many methods available for other collections.
    Here are a few examples:

    In Java you can’t use the features of the collection framework like the new style for loop introduced with Java 5 because java.util.BitSet is not part of the collection framework. There are methods to iterate over the java.util.BitSet but it is not as convenient as using a java.util.List.

  • The Scala code also allows you to use other collections like Sets to work the BitSets and methods like “&” because those methods are overloaded to work collections of type scala.collection.Set.

As you can see from just a few simple examples, the Scala BitSets are much more powerful, require less code and result in easier to read and more natural code then the Java BitSet.
This is just one example on how you can use Scala to express ideas much more concise and easier to understand than in Java. If you haven’t yet looked at Scala, I highly recommend giving it a try – even if you never use BitSets 🙂

To see all the different BitSet implementations and available methods see the Scala Docs:
http://www.scala-lang.org/api/current/scala/collection/BitSet.html

1 Comment

  1. alex says: Reply

    Hi

    More fun for sure, but speed is yet lacking I’m afraid.

    I made some benchmark of both implementation at http://alexandre-masselot.blogspot.com/2013/01/sparse-bitset-in-scala-benchmarking-5.html

    I would love to hear your opinion
    alex

Leave a Reply

This blog is kept spam free by WP-SpamFree.