Monthly Archives: June 2011

Using DirectoryStreams in Java 7

Java 7 comes with lots of new stuff for IO. The new interfaces and classes added to the java.nio packages (such as java.nio.file) contain lots of useful functionality for working with files and other things like asynchronous IO.

Here I want to show you a little bit about the interface DirectoryStream which is very useful when you want to work with the content of a directory (e.g. the files it contains).

In order to create a new DirectoryStream you use one of the methods in the new utility class java.nio.file.Files. This class contains tons of useful methods for working with files and directories. A few let you create a new DirectoryStream. This interface can be used in a for-each loop because it extends Iterable. That is, you can use it in a for like this:
for (Path p : directoryStream)
which is very convenient.

Below are a few methods that show you how to use a directory stream:

First we define a simple utility method:

private static void printStreamInfo(DirectoryStream<Path> dirStream) {
    for (Path path : dirStream) {
        System.out.println("Filename: " + path.getFileName());
    }
}

This method just iterates over a DirectoryStream and prints out the filename. The interface Path is also new in Java 7 and represents a file system path like a file or a directory. It also contains a huge amount of useful methods. Make sure to check it out.

Here is a method that prints the file name for all files in a given directory. It does not walk into subdirectories; for this you will need the FileVisitor interface, which I will describe in an upcoming post.

private static void printInfoForAllFiles(Path dirPath) {
    try (DirectoryStream<Path> dirStream = Files.newDirectoryStream(dirPath)) {
        printStreamInfo(dirStream);
    } catch (IOException ex) {
        //no valid exception handling for production code!!!
        System.out.println("Cannot open dir " + dirPath + ": " + ex.getMessage());
    }
}

This method is very simple. It just creates a new DirectoryStream and then calls the method printStreamInfo shown above. The code uses the try-with-resources feature that comes with Java 7. You can also catch a DirectoryIteratorException, which is a RuntimeException thrown when there is an error iterating over the DirectoryStream.
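To make the remark about DirectoryIteratorException concrete, here is a minimal sketch. The class DirectoryListing and the method listFileNames are hypothetical names that are not part of the original example; the sketch collects the names into a list instead of printing them and handles both kinds of failure separately.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, only for illustration.
class DirectoryListing {

    // Collects the names of all entries in dirPath.
    // On failure it reports the problem and returns what was read so far.
    static List<String> listFileNames(Path dirPath) {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> dirStream = Files.newDirectoryStream(dirPath)) {
            for (Path path : dirStream) {
                names.add(path.getFileName().toString());
            }
        } catch (DirectoryIteratorException ex) {
            // Thrown by the iterator; the underlying IOException is the cause.
            System.out.println("Error while iterating " + dirPath + ": " + ex.getCause().getMessage());
        } catch (IOException ex) {
            // Thrown when the directory cannot be opened or closed.
            System.out.println("Cannot open dir " + dirPath + ": " + ex.getMessage());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(listFileNames(Paths.get(".")));
    }
}
```

As in the other examples, printing the error is not valid exception handling for production code.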

Here is another method, very similar to printInfoForAllFiles:

private static void printInfoForFilesWithPattern(Path dirPath, String globPattern) {
    try (DirectoryStream<Path> dirStream = Files.newDirectoryStream(dirPath, globPattern)) {
        printStreamInfo(dirStream);
    } catch (IOException ex) {
        //no valid exception handling for production code!!!
        System.out.println("Cannot open dir " + dirPath + ": " + ex.getMessage());
    }
}

The difference here is the glob pattern. You can use it to show only files that match a particular pattern. For example, to list only the Ruby files in your directory, use "*.rb". To show both Python and Ruby files, you can use "*.{rb,py}". There are many more patterns available. See here for detailed documentation:
http://download.oracle.com/javase/tutorial/essential/io/fileOps.html

You can call the above method like this:

 // only java files
 printInfoForFilesWithPattern(Paths.get("/home/markus/temp/"), "*java");
        
 // all files ending with "a", for example all java and scala files
 printInfoForFilesWithPattern(Paths.get("/home/markus/temp/"), "*a");
        
 // all files starting with "L"
 printInfoForFilesWithPattern(Paths.get("/home/markus/temp/"), "L*");
        
 // all Ruby and Python files (or other files ending with "rb" or "py")
 printInfoForFilesWithPattern(Paths.get("/home/markus/temp/"), "*.{rb,py}");
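If you want to try out glob patterns like the ones above without touching a real directory, a PathMatcher can help. The following is a small sketch; the class GlobCheck and the method matches are hypothetical names. It assumes that the patterns behave the same way here as in Files.newDirectoryStream, which the javadoc describes as using the glob syntax of getPathMatcher applied to the file name.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical helper class, only for illustration.
class GlobCheck {

    // Returns true if the plain file name matches the glob pattern.
    static boolean matches(String globPattern, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + globPattern);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        String[] names = {"script.rb", "tool.py", "Main.java", "herb", "License.txt"};
        String[] patterns = {"*.{rb,py}", "*rb", "*.rb", "L*"};
        for (String pattern : patterns) {
            for (String name : names) {
                System.out.println(pattern + " / " + name + " -> " + matches(pattern, name));
            }
        }
    }
}
```

The run shows why the dot matters: "*rb" also matches a file called "herb", while "*.rb" does not.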

The last example uses the DirectoryStream.Filter interface to list only files that are larger than a given number of kilobytes.
This interface defines only the method accept(T entry), which returns true if an entry conforms to the rules specified by the filter.

private static void printInfoForLargeFiles(Path dirPath, final int sizeInKB) {
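    // Java 7 does not allow the diamond operator on anonymous classes, so the type is spelled out.
    DirectoryStream.Filter<Path> largeFileFilter = new DirectoryStream.Filter<Path>() {

        @Override
        public boolean accept(Path path) throws IOException {
            // 1024L forces long arithmetic so that large sizeInKB values cannot overflow
            return Files.size(path) > (sizeInKB * 1024L);
        }
    };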
        
    try (DirectoryStream<Path> dirStream = Files.newDirectoryStream(dirPath, largeFileFilter)) {
        printStreamInfo(dirStream);
    } catch (IOException ex) {
        //no valid exception handling for production code!!!
        System.out.println("Cannot open dir " + dirPath + ": " + ex.getMessage());
    }
}
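As a side note, DirectoryStream.Filter has only one abstract method, so on Java 8 or later (an assumption, the examples in this post target Java 7) the filter can be written as a lambda. The sketch below uses the hypothetical names LargeFiles and largeFileNames, returns the names instead of printing them, and additionally restricts the result to regular files.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, only for illustration. Requires Java 8 or later.
class LargeFiles {

    // Returns the names of regular files in dirPath that are larger than sizeInKB kilobytes.
    static List<String> largeFileNames(Path dirPath, int sizeInKB) throws IOException {
        // The lambda may throw IOException because Filter.accept declares it.
        DirectoryStream.Filter<Path> largeFileFilter =
                path -> Files.isRegularFile(path) && Files.size(path) > sizeInKB * 1024L;

        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> dirStream = Files.newDirectoryStream(dirPath, largeFileFilter)) {
            for (Path path : dirStream) {
                names.add(path.getFileName().toString());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        try {
            System.out.println(largeFileNames(Paths.get("."), 10));
        } catch (IOException ex) {
            System.out.println("Cannot list dir: " + ex.getMessage());
        }
    }
}
```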

As you can see, using the new DirectoryStream and DirectoryStream.Filter is very simple. All the new methods, classes and interfaces added to Java 7 for file IO make working with files very easy and convenient. If you already use Java 7 and have to work with files, make sure to have a close look.

Much more detailed documentation can be found in the Java Tutorial for Java 7 on the Oracle website:
File I/O (Featuring NIO.2)

How to make Java classes immutable

Immutable classes have been a hot topic lately. The rise of functional languages, which operate mostly on immutable data, and the advantage of immutable data when using multiple threads (correctly implemented immutable classes are thread safe) have also created a new interest in immutable data in Java.
While some languages like Scala encourage (but do not enforce) a programming style using immutable classes, in the Java world this is less common, but it can be done nonetheless. In fact, many classes in the JDK are immutable, for example Long, Integer and Double. BigInteger, BigDecimal and of course String are immutable as well.

Immutable classes are also more secure. If a class is mutable or can be subclassed, an attacker could change the members of your objects and do bad stuff with them. For example, an attacker could subclass your class and, from one of the overridden methods, send an email containing private data.

In this article, I show you how to turn an ordinary mutable class into an immutable one.

A mutable class

Let’s imagine your boss wants a new class that represents a bill for an online shop. Here is a first example of a mutable class called Bill. This is of course not a realistic example of a real online shop. :-)


import java.util.Date;

public class Bill {
    
    private int amount;
    private Date date;

    public Bill(int amount, Date date) {
        this.amount = amount;
        this.date = date;
    }

    public int getAmount() {
        return amount;
    }

    public void setAmount(int amount) {
        this.amount = amount;
    }

    public Date getDate() {
        return date;
    }

    public void setDate(Date date) {
        this.date = date;
    }
}

In this version of the class, all the members can be changed after instances of the class have been created.

An immutable class

Let’s make this class immutable.


import org.joda.time.DateTime;

public final class Bill {
	 
    private final int amount;
    private final DateTime dateTime;
 
    public Bill(int amount, DateTime dateTime) {
        this.amount = amount;
        this.dateTime = dateTime;
    }
 
    public int getAmount() {
        return amount;
    }
 
    public DateTime getDateTime() {
        return dateTime;
    }
}

In this example, several changes have been made to make the class immutable:

  • The class is final. That means no subclasses can be created. A subclass of an immutable class can be made mutable again. As noted above, an attacker could use that to get to confidential data.
  • The variables are all final and cannot be changed after construction.
  • The constructor takes an org.joda.time.DateTime instead of a java.util.Date. DateTime (from the Joda-Time library) is the better choice here because it is immutable. Using a java.util.Date would be dangerous, as it is a mutable class and we can’t control the calling code (which might modify the date after passing it in).
  • There are no setter methods for the members.
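If Joda-Time is not an option, a common alternative is to keep java.util.Date and make defensive copies. This is a sketch of that technique, not the approach used in this article; the class name DefensiveBill is hypothetical.

```java
import java.util.Date;

// Hypothetical variant of Bill that keeps java.util.Date but copies it defensively.
final class DefensiveBill {

    private final int amount;
    private final Date date;

    DefensiveBill(int amount, Date date) {
        this.amount = amount;
        // Copy on the way in: later changes to the caller's Date do not leak in.
        this.date = new Date(date.getTime());
    }

    int getAmount() {
        return amount;
    }

    Date getDate() {
        // Copy on the way out: callers cannot modify the internal Date.
        return new Date(date.getTime());
    }

    public static void main(String[] args) {
        Date original = new Date(1000L);
        DefensiveBill bill = new DefensiveBill(42, original);
        original.setTime(2000L);        // modifies only the caller's object
        bill.getDate().setTime(3000L);  // modifies only a copy
        System.out.println(bill.getDate().getTime());
    }
}
```

The rest of this article continues with the Joda-Time version of Bill.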

This version of the Bill class is immutable. Now imagine your boss calls you and tells you that you need to implement another method which increases the amount of the bill after the Bill object was already created. At first you try to explain that changing the state of the class is not possible, but your boss insists on this change.
You think a little bit and come up with this design for the new method:


 public Bill addAmount(int amount) {
        // returns a new Bill; the current object stays unchanged
        return new Bill(this.amount + amount, dateTime);
 }

This does the trick. Instead of changing the internal state of the Bill object and using a void method, you create a completely new Bill object and return it. The caller of the addAmount method will have to use this new object if he wants to use the correct bill. This is similar to methods like replace of the String class. They don’t really change the string on which you called the method but return a new String object.
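A typical mistake with such methods is to call them and ignore the result. The following sketch shows the effect with a stripped-down, hypothetical class SimpleBill that only has an amount (the date is left out to keep the example free of external libraries).

```java
// Hypothetical, reduced version of Bill with only the amount field.
final class SimpleBill {

    private final int amount;

    SimpleBill(int amount) {
        this.amount = amount;
    }

    int getAmount() {
        return amount;
    }

    // Returns a new object; the current one is never modified.
    SimpleBill addAmount(int extra) {
        return new SimpleBill(this.amount + extra);
    }

    public static void main(String[] args) {
        SimpleBill bill = new SimpleBill(10);

        bill.addAmount(5);                        // result ignored: bill still holds 10
        SimpleBill increased = bill.addAmount(5); // correct: keep the returned object

        System.out.println(bill.getAmount());
        System.out.println(increased.getAmount());
    }
}
```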

Using immutable collections in immutable classes

Now your boss comes again and tells you that the Bill object must also keep a list of orders.
In order to do that, first you have to make an immutable Order object.


public final class Order {
    
    private final int id;

    public Order(int id) {
        this.id = id;
    }

    public int getId() {
        return id;
    }
}

The new version of the Bill object now looks like this:


import org.joda.time.DateTime;

import com.google.common.collect.ImmutableList;

public final class Bill {

	private final int amount;
	private final DateTime dateTime;
	private final ImmutableList<Order> orders;

	public Bill(int amount, DateTime dateTime, ImmutableList<Order> orders) {
		this.amount = amount;
		this.dateTime = dateTime;
		this.orders = orders;
	}

	public ImmutableList<Order> getOrders() {
		return orders;
	}

	public int getAmount() {
		return amount;
	}

	public DateTime getDateTime() {
		return dateTime;
	}

	public Bill addAmount(int amount) {
		return new Bill(this.amount + amount, dateTime, orders);
	}

	public Bill addOrder(Order newOrder) {
		ImmutableList<Order> newOrderList = new ImmutableList.Builder<Order>()
				.addAll(orders).add(newOrder).build();
		return new Bill(this.amount, dateTime, newOrderList);
	}
}

This version uses com.google.common.collect.ImmutableList, which is part of the Google Guava library. Guava offers many different immutable collections (see here for more details: ImmutableCollectionsExplained).

The addOrder method creates a new com.google.common.collect.ImmutableList and then creates a new Bill object, similar to the addAmount method.
The caller of the addOrder method will have to use the newly returned object to use the correct Bill instance.

Note: com.google.common.collect.ImmutableList implements the java.util.List interface but I normally use the com.google.common.collect.ImmutableList in the type declaration to make it clear that I want this object to be immutable.
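If Guava is not on your classpath, something similar can be approximated with the JDK alone by copying the list and wrapping the copy with Collections.unmodifiableList. This is only a sketch under that assumption; the class OrderBook is hypothetical and stores plain order ids instead of Order objects to stay self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical class, only for illustration; it stores order ids instead of Order objects.
final class OrderBook {

    private final List<Integer> orderIds;

    OrderBook(List<Integer> orderIds) {
        // Copy first, then wrap. The wrapper alone would still reflect
        // later changes made to the caller's list.
        this.orderIds = Collections.unmodifiableList(new ArrayList<Integer>(orderIds));
    }

    List<Integer> getOrderIds() {
        return orderIds; // safe to hand out: add/remove throw UnsupportedOperationException
    }

    // Returns a new OrderBook, similar to addOrder in the Bill class.
    OrderBook addOrder(int newOrderId) {
        List<Integer> copy = new ArrayList<Integer>(orderIds);
        copy.add(newOrderId);
        return new OrderBook(copy);
    }

    public static void main(String[] args) {
        OrderBook book = new OrderBook(Arrays.asList(1, 2));
        OrderBook bigger = book.addOrder(3);
        System.out.println(book.getOrderIds());
        System.out.println(bigger.getOrderIds());
    }
}
```

Unlike ImmutableList, the declared type here is just List, so the immutability is not visible in the type declaration.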

What about performance?

You may wonder about performance. Creating new objects, as methods like addAmount or addOrder do, is more expensive than modifying an existing object of a mutable class. In some situations this can be a disadvantage of your immutable classes, but in most projects this probably won’t matter. To be sure, you should of course profile and test your application.

If possible, immutable classes are preferred for thread safety and security. If you come from functional languages like Haskell or Lisp, this will feel very natural to you anyway. If you’ve been using mostly mutable classes, this may require some new thinking but could greatly improve your code. Of course you always have to decide for each class you develop whether it makes sense.

Why Scala seems difficult but really isn’t

When I learned C++ and Java a long time ago, I loved Bruce Eckel’s books Thinking in C++ (two volumes) and Thinking in Java.
They had very clear and detailed explanations and I learned a lot from them. So I value his opinion. Bruce also often wrote very positively about Python, a language I like a lot.

A few days ago, he published a great article about Scala:

“Scala: The Static Language that Feels Dynamic”

It is a very interesting article which shows that Scala is not complex – at least not more complex than Java. He writes “… Scala should be a lot easier than learning Java!”.

I agree. When you use the subset of Scala that lets you do with Scala what you can do with Java, it is not more difficult at all, probably even easier than Java.

Here are a few reasons why I think Scala seems more difficult:

1) Programmers think they can master Scala within a few days or weeks

Programmers are used to Java, and learning something new is always an effort that not everyone is willing to make. Truly mastering Java takes years of practice. Yet some programmers who have used Java for 10 years or more expect to become as proficient in Scala within a few weeks. That’s not how it works, no matter how easy a language is. You can write simple or even somewhat complex programs in Scala after a few days of studying it, but truly mastering it takes a lot of time and effort. This can be very rewarding, though. I am still relatively new to Scala and don’t claim to be an expert or even to understand everything about the language. But whenever I play with it or read something about it, I have fun and learn something, and I know that becoming an expert takes time. If you just started with Scala, give it time. When you are stuck, ask on the mailing lists or try a different website or book with explanations. Don’t give up. Even if you end up not liking Scala very much, you will definitely become a better programmer by learning it.

2) Programmers don’t know anything or not much about functional programming

Java is not a functional programming language. If a programmer has been using mostly Java and C++ for the last few years, he may have created great software with Java in an object-oriented way. This works great with Java and there is absolutely nothing wrong with that. I love Java and OOP is a great way of building software. But when people start playing with Scala they are also confronted with functional programming. You can do Scala in a pure OOP way, but you will get more out of the language if you also master functional programming (FP). Because most programmers don’t have much experience with it, it seems very difficult and strange at first. But like I wrote above, you will have to give it some time to really understand it. It will be worth it, and you will even think differently about your Java code when you’ve played more with higher-order functions, closures and recursion. FP is not always better or worse than OOP. It is another tool that is good to have in your toolbox. Sometimes FP is a better fit, sometimes OOP and sometimes a combination. This is why Scala supports both.

3) Many Scala websites and blogs can seem intimidating to beginners

When someone is really good at a language, he or she often wants to show it. This is why Ruby sometimes looks very complicated when a Ruby guru writes a blog post with 50 lines of code and 10 different metaprogramming techniques within them.
The same is true for Scala. In many blog posts and websites, you find stuff about monads, advanced FP, very concise but not necessarily very readable code and other things that you won’t need in your daily business and that are way too confusing for a beginner.
This can be and sometimes is very intimidating for beginners who just want to read a file with Scala.
That doesn’t mean I don’t like those blogs, but I think the Scala community should publish more simple stuff.

4) Lack of a good cookbook for Scala

Many programmers learn by examples of how to do everyday tasks like opening files, sending an email or building a socket server. For many languages there are great cookbooks with hundreds of recipes about how to do that. Such a book does not yet exist for Scala. I think it could really help to bring even more people to this wonderful language.

Conclusion: Scala is not difficult – just keep on learning

These are just a few reasons why I think Scala sometimes seems more difficult than it really is. If you don’t know Scala yet, I highly recommend reading Bruce Eckel’s article mentioned above.
If you already know some Scala but struggle a bit to grasp some of its concepts, keep on learning. As I wrote, you cannot become a Scala expert within 3 weeks. Keep pushing until you are comfortable with Scala and you will very likely really love it. And if you don’t, you will nonetheless have learned a lot. (Btw, this holds true for all languages.)

File system events with Java 7

In the last post, I showed how to listen to Linux file system events using C, Ruby and Python.
In this post, we look at Java 7. Java 7 has several new classes in the java.nio.file package that let you listen to file system events. The set of available events is not as extensive as in the C, Ruby and Python examples.
If you want to use the inotify mechanism directly in Java, look at the following libraries:
JNotify
inotify-java

In this example, we only look at the features of the new classes that come with Java 7. For this test, I used jdk-7-ea-bin-b144 64 bit on a Linux machine. It should work exactly like that when Java 7 is final.

Here is the source code:


package com.markusjais;

import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchEvent.Kind;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

// Simple class to watch directory events.
class DirectoryWatcher implements Runnable {

    private Path path;

    public DirectoryWatcher(Path path) {
        this.path = path;
    }

    // print the events and the affected file
    private void printEvent(WatchEvent<?> event) {
        Kind<?> kind = event.kind();
        if (kind.equals(StandardWatchEventKinds.ENTRY_CREATE)) {
            Path pathCreated = (Path) event.context();
            System.out.println("Entry created:" + pathCreated);
        } else if (kind.equals(StandardWatchEventKinds.ENTRY_DELETE)) {
            Path pathDeleted = (Path) event.context();
            System.out.println("Entry deleted:" + pathDeleted);
        } else if (kind.equals(StandardWatchEventKinds.ENTRY_MODIFY)) {
            Path pathModified = (Path) event.context();
            System.out.println("Entry modified:" + pathModified);
        }
    }

    @Override
    public void run() {
        try {
            WatchService watchService = path.getFileSystem().newWatchService();
            path.register(watchService, StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_MODIFY, StandardWatchEventKinds.ENTRY_DELETE);

            // loop forever to watch directory
            while (true) {
                WatchKey watchKey;
                watchKey = watchService.take(); // this call is blocking until events are present

                // poll for file system events on the WatchKey
                for (final WatchEvent<?> event : watchKey.pollEvents()) {
                    printEvent(event);
                }

                // if the watched directory gets deleted, get out of run method
                if (!watchKey.reset()) {
                    System.out.println("No longer valid");
                    watchKey.cancel();
                    watchService.close();
                    break;
                }
            }

        } catch (InterruptedException ex) {
            System.out.println("interrupted. Goodbye");
            return;
        } catch (IOException ex) {
            ex.printStackTrace();  // don't do this in production code. Use a logging framework
            return;
        }
    }
}

public class FileEventTest {

    public static void main(String[] args) throws InterruptedException {
        Path pathToWatch = FileSystems.getDefault().getPath("/tmp/java7");
        DirectoryWatcher dirWatcher = new DirectoryWatcher(pathToWatch);
        Thread dirWatcherThread = new Thread(dirWatcher);
        dirWatcherThread.start();
        
        // interrupt the program after 10 seconds to stop it.
        Thread.sleep(10000);
        dirWatcherThread.interrupt();

    }
}


This is a simple example of how to use the new classes. I created a new Thread that listens in an infinite loop for changes in the directory “/tmp/java7”. For each event (when a file is created, modified or deleted), the event and the file name are printed to stdout. Note that this also works when creating or deleting directories.

Basically you create a WatchService, register the directory to watch (with the events to watch for) and loop forever. In each iteration you take a WatchKey from the WatchService, poll it for events, then go over the events and do something with them, like printing as in this example. When done processing the events, reset the WatchKey so that it can receive new events.
The method reset returns true if the WatchKey is still valid. When you delete the watched directory, it returns false and in this example, the code breaks out of the while loop and terminates.
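Two details of the event handling are easy to miss: the context of an ENTRY_* event is a path relative to the watched directory, and a special OVERFLOW event can be delivered even though it was not registered, in which case events may have been lost. The sketch below shows one way to deal with both; the class WatchEventDescriber and the method describe are hypothetical names, not part of the example above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;

// Hypothetical helper class, only for illustration.
class WatchEventDescriber {

    // Builds a message for one event. watchedDir is the directory that was registered,
    // kind and context are the values of event.kind() and event.context().
    static String describe(Path watchedDir, WatchEvent.Kind<?> kind, Object context) {
        if (kind == StandardWatchEventKinds.OVERFLOW) {
            // No usable context here; the directory should be rescanned if completeness matters.
            return "OVERFLOW: events may have been lost";
        }
        // The context is relative to the watched directory, so resolve it to get the full path.
        Path fullPath = watchedDir.resolve((Path) context);
        return kind.name() + ": " + fullPath;
    }

    public static void main(String[] args) {
        Path dir = Paths.get("/tmp/java7");
        System.out.println(describe(dir, StandardWatchEventKinds.ENTRY_CREATE, Paths.get("a.txt")));
        System.out.println(describe(dir, StandardWatchEventKinds.OVERFLOW, null));
    }
}
```

Inside the loop of the DirectoryWatcher, such a method could be called as describe(path, event.kind(), event.context()).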

Note that in a real production system, you would probably not use System.out.println but do something else, like updating the directory view in a file manager or sending an email (for example, when watching a directory for activities that are not allowed).

In this example, I interrupt the program after 10 seconds. This is just to show you how to end watching a directory.

To test it, create the directory “/tmp/java7” and then create, modify and delete a few files in it. To see the reset method in action, remove the directory. If you want to play longer than 10 seconds, just remove the call to interrupt at the end of the main method.

For more information, see the javadoc of Java 7:
http://download.oracle.com/javase/7/docs/api/java/nio/file/WatchService.html

Linux file system events with C, Python and Ruby

Some applications (like file managers, monitoring tools, etc) need to know about events in the file system, for example when a file was created, opened or deleted.

With Linux, you can use the inotify mechanism to react to those events (with kernel 2.6.13 or above). In this article, I show you examples of its usage with C, Python and Ruby. In an upcoming post, we use Java 7 to monitor file events.

The C version


#include <stdio.h>
#include <sys/inotify.h>
#include <stdlib.h>
#include <limits.h>
#include <unistd.h>   // for read()


// hard coded directory and file to watch. don't do this in production code
#define DIR_TO_WATCH "/tmp/notify-dir"
#define FILE_TO_WATCH "/tmp/notify-dir/notify-file.txt"

#define EVENT_SIZE (sizeof (struct inotify_event))

// define large enough buffer
#define EVENT_BUFFER_LENGTH (1024 * EVENT_SIZE + NAME_MAX + 1)

void print_event(struct inotify_event *event) {

    if (event->mask & IN_CREATE)
        printf("file created in directory\n");
    if (event->mask & IN_DELETE)
        printf("file deleted in directory\n");
    if (event->mask & IN_ACCESS)
        printf("file accessed\n");
    if (event->mask & IN_CLOSE)
        printf("file closed after reading or writing \n");
    if (event->mask & IN_OPEN)
        printf("file opened\n");

    if (event->len)
        printf("name: %s\n", event->name);

}

int main(int argc, char** argv) {

    int notify_fd;
    int watch_fd;
    long input_len;
    char *ptr;
    char buffer[EVENT_BUFFER_LENGTH];
    struct inotify_event *event;

    notify_fd = inotify_init();
    if (notify_fd < 0) {
        perror("cannot init inotify");
        exit(EXIT_FAILURE);
    }

    watch_fd = inotify_add_watch(notify_fd, DIR_TO_WATCH, IN_CREATE | IN_DELETE);
    if (watch_fd < 0) {
        perror("cannot add directory");
        exit(EXIT_FAILURE);
    }
    watch_fd = inotify_add_watch(notify_fd, FILE_TO_WATCH, IN_ACCESS | IN_CLOSE | IN_OPEN);
    if (watch_fd < 0) {
        perror("cannot add file");
        exit(EXIT_FAILURE);
    }

    while (1) {
        input_len = read(notify_fd, buffer, EVENT_BUFFER_LENGTH);
        if (input_len <= 0) {
            perror("error reading from inotify fd");
            exit(EXIT_FAILURE);
        }

        ptr = buffer;
        while (ptr < buffer + input_len) {
            event = (struct inotify_event *) ptr;
            print_event(event);
            ptr += sizeof (struct inotify_event) + event->len;
        }
    }

    exit(EXIT_SUCCESS);
}

This code is relatively straightforward. You start the mechanism with inotify_init(), add watches with inotify_add_watch, read the events from the inotify file descriptor and call the print_event function to print some information on stdout. Of course, depending on your software, you will do something completely different for each event.
The meaning of the events should be clear from the names of the constants. IN_CLOSE is used for closing a file after reading it or writing to it. You can also use two different events for those types of closing (IN_CLOSE_WRITE, IN_CLOSE_NOWRITE, see Python example below).
The inotify mechanism supports many more events than used in this example. See the man page for inotify for details.

If you use C++, you may want to have a look at this:
inotify C++ interface

The Ruby version



require "rb-inotify"

DIR_TO_WATCH = "/tmp/notify-dir"

notifier = INotify::Notifier.new

notifier.watch(DIR_TO_WATCH, :create, :delete) do |event|
  puts "Create event for: #{event.name}" if event.flags.include?(:create)
  puts "Delete event for: #{event.name}" if event.flags.include?(:delete)
end

notifier.run

This uses the rb-inotify Ruby library. In this example, Ruby 1.9 was used.
To keep the example short, the Ruby version only watches events for a given directory (when a file is created or deleted). That should be enough to show you how the library works.
If you watch only one event, you don’t need the if behind the puts. I added this because I watch for several events but wanted different output for each event.
In order to watch for file events like in the C version, do the same thing for a file and use different events.
More information about the Ruby rb-inotify library can be found here:
http://rdoc.info/projects/nex3/rb-inotify

The Python version


import pyinotify

DIR_TO_WATCH="/tmp/notify-dir"
FILE_TO_WATCH="/tmp/notify-dir/notify-file.txt"

wm = pyinotify.WatchManager()

dir_events = pyinotify.IN_DELETE | pyinotify.IN_CREATE 
file_events = pyinotify.IN_OPEN | pyinotify.IN_CLOSE_WRITE | pyinotify.IN_CLOSE_NOWRITE 

class EventHandler(pyinotify.ProcessEvent):
    def process_IN_DELETE(self, event):
        print("File %s was deleted" % event.pathname) #python 3 style print function
    def process_IN_CREATE(self, event):
        print("File %s was created" % event.pathname)
    def process_IN_OPEN(self, event):
        print("File %s was opened" % event.pathname)
    def process_IN_CLOSE_WRITE(self, event):
        print("File %s was closed after writing" % event.pathname)
    def process_IN_CLOSE_NOWRITE(self, event):
        print("File %s was closed after reading" % event.pathname)

event_handler = EventHandler()
notifier = pyinotify.Notifier(wm, event_handler)

wm.add_watch(DIR_TO_WATCH, dir_events)
wm.add_watch(FILE_TO_WATCH, file_events)

notifier.loop()

The Python example uses Python 3 (version 3.2 on my machine), as you can see by the way print is used (as a function).
In the Python example I used different handlers for IN_CLOSE_WRITE (used after a file was closed after writing something to it) and IN_CLOSE_NOWRITE (used after a file was closed after just reading the content).
You could write only one callback method process_IN_CLOSE to handle both events, but I wanted different output messages. And sometimes it is better to write a little more code to make it clearer.

The pyinotify module is available for both Python 2 and Python 3 and is very easy to use. More information can be found here:
https://github.com/seb-m/pyinotify/wiki

Conclusion

As you can see, listening to different file system events on Linux is not difficult using C, C++, Python or Ruby. The inotify mechanism is also available for other languages (for example, see here for a Haskell Version).

I prefer to use the Ruby or Python version over the C version, as the source code is considerably shorter and easier to understand (as is often the case with shorter code).
Of course it depends on your project. If you use C for your project, you have to go with the C version. If you just need a short script, for example for monitoring, I recommend going with the Ruby or Python solution (or a Perl implementation).

These examples only show some basic functionality of the inotify mechanism. For example, the Python version has different notifiers, for example a ThreadedNotifier. For more details, check the man pages and the documentation listed above. I hope this example serves as a starting point for your own programs.

Those modules and libraries are specific to Linux and won’t work on other operating systems. If you are a Java developer, you can use inotify-java, but this won’t be platform independent (which is often a goal for Java software).
In an upcoming posting, I will show you how to use the latest features of Java 7 to monitor file system events.