String Joiner using StringUtils and Streams

Posted by Abhay on December 18, 2018

Several times we are in a dilemma with the approach to join collection of Strings using a particular separator in java. In this post I’m going to demonstrate and compare two ways of doing this, it will be a simple function to convert a collection to Comma Separated Values. This might seem like a very naive topic with the advent of Streams in Java, but trust me you will shocked to see the performance difference in the following CSV utility.

String Joiner Using Streams

public static String getCSV(Collection<String> coll) {
        if (CollectionUtils.isEmpty(coll)) {
            return null;
        }
        return coll.stream().collect(Collectors.joining(","));
    }

String Joiner Using StringUtils

public static String getCSV(Collection<String> coll) {
        if (CollectionUtils.isEmpty(coll)) {
            return null;
        }
        return StringUtils.join(coll, ",");
    }

Let's Benchmark!

Now that we have seen the implementation using both the ways, let’s try and benchmark both the implementations and see which is fast and is more suited for a production environment.

public class Main {
    private static final int listSize = 100;
    private static final int collSize = 10000;
    private static final int WARMUP_ITERATIONS = 10000;
    public static void main(String[] args) {
        jvmWarmup();
        List<Collection<String>> randomSets = Lists.newArrayListWithCapacity(listSize);
        for (int i = 0; i < listSize; i++) {
            Collection<String> tempSet = Sets.newHashSet();
            for (int j = 0; j < collSize; j++) {
                tempSet.add(UUID.randomUUID().toString());
            }
            randomSets.add(tempSet);
        }
        //Using Streams Joiner
        long total = 0;
        for (Collection<String> collection : randomSets) {
            long start = System.nanoTime();
            String out = StreamsUtility.getCSV(collection);
            total += (System.nanoTime() - start);
        }
        System.out.println("Using Streams Joiner : " + total / listSize);
        //Using StringUtils Joiner
        total = 0;
        for (Collection<String> collection : randomSets) {
            long start = System.nanoTime();
            String out = StringUtilsUtility.getCSV(collection);
            total += (System.nanoTime() - start);
        }
        System.out.println("Using StringUtils Joiner : " + total / listSize);
    }
    private static void jvmWarmup() {
        for (int i = 0; i < WARMUP_ITERATIONS; i++) {
            UUID.randomUUID().toString();
        }
    }
}

Benchmark Results

The above code was ran with different collection sizes, and run times were noted down. The code was run using 2.2 GHz Intel Core i7 CPU with 16 GB 1600 MHz DDR3 memory and -Xmx3g -Xms3g -XX:+PrintCompilation -verbose:gc jvm args. Please note the time unit is nano seconds.

Collection Size StringUtils Streams
10 56365 679153
100 117285 922402
1000 513523 1125027
10000 918946 2369459
100000 11471419 19090185

As you can see, a StringUtils join outperforms Streams API by far! With growing collection size, the difference starts coming closer but still the StringUtils join beats Streams join by a huge margin.

Wondering why such a huge difference?! A simple answer to that is a lot of objects as well as lambdas are created and StringUtils uses native code like iterators and StringBuilder which has been highly optimized over the years.

Check out the Code for more info.