Java Stream Collectors

Java 8 introduced a new abstraction called Stream, which lets us process data in a declarative way. Streams can also leverage multi-core architectures without you having to write a single line of multithreaded code.

Collectors is a utility class that provides implementations of the Collector interface for various useful reduction operations, such as accumulating elements into collections and summarizing elements according to various criteria.

Using Collectors

To demonstrate the usage of stream Collectors, let me define a class to hold my data:

class Employee {
    private String empId;
    private String name;
    private Double salary;
    private String department;

    public Employee(String empId, String name, Double salary, String department) {
        this.empId = empId;
        this.name = name;
        this.salary = salary;
        this.department = department;
    }

    // getters and toString
}

 

Now, let me create a list of employees:

Employee john = new Employee("E123", "John Nhoj", 200.99, "IT");
Employee south = new Employee("E223", "South Htuos", 299.99, "Sales");
Employee reet = new Employee("E133", "Reet Teer", 300.99, "IT");
Employee prateema = new Employee("E143", "Prateema Rai", 300.99, "Benefits");
Employee yogen = new Employee("E323", "Yogen Rai", 200.99, "Sales");

List<Employee> employees = Arrays.asList(john, south, reet, prateema, yogen);
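The snippets in this article call collector factories such as averagingDouble() and toList() without a class qualifier, which suggests they rely on static imports from java.util.stream.Collectors (the max example in section 1 would additionally need Comparator.comparingDouble). Here is a minimal, self-contained sketch of such a setup. The class name CollectorsDemo, the getters, and the toString() format (inferred from the outputs printed later) are assumptions, not the original source.

```java
import static java.util.stream.Collectors.*;

import java.util.Arrays;
import java.util.List;

public class CollectorsDemo {

    // Assumed completion of the article's Employee class.
    static class Employee {
        private final String empId;
        private final String name;
        private final Double salary;
        private final String department;

        Employee(String empId, String name, Double salary, String department) {
            this.empId = empId;
            this.name = name;
            this.salary = salary;
            this.department = department;
        }

        String getEmpId() { return empId; }
        String getName() { return name; }
        Double getSalary() { return salary; }
        String getDepartment() { return department; }

        // Format guessed from the grouping output shown in the article.
        @Override
        public String toString() {
            return "{empId='" + empId + "', name='" + name + "', salary=" + salary
                    + ", department='" + department + "'}";
        }
    }

    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("E123", "John Nhoj", 200.99, "IT"),
                new Employee("E223", "South Htuos", 299.99, "Sales"),
                new Employee("E133", "Reet Teer", 300.99, "IT"),
                new Employee("E143", "Prateema Rai", 300.99, "Benefits"),
                new Employee("E323", "Yogen Rai", 200.99, "Sales"));
    }

    public static void main(String[] args) {
        List<Employee> employees = sampleEmployees();
        // The static import lets us write counting() instead of Collectors.counting().
        Long headCount = employees.stream().collect(counting());
        System.out.println(headCount + " employees, first: " + employees.get(0));
    }
}
```

If you prefer explicit imports, each collector can be imported individually, for example import static java.util.stream.Collectors.counting.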

 

1. Calculating statistical values

Finding average salary

Double averageSalary = employees.stream().collect(averagingDouble(Employee::getSalary));
// 260.79

Similarly, averagingInt(ToIntFunction<? super T> mapper) and averagingLong(ToLongFunction<? super T> mapper) compute the average of int- and long-valued properties; like averagingDouble, both return a Double.
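Since Employee has no integer field, here is a small sketch of averagingInt() that averages the length of the employee names instead. Using the name length as the int-valued property and the class name AveragingIntDemo are assumptions made purely for illustration.

```java
import static java.util.stream.Collectors.averagingInt;

import java.util.Arrays;
import java.util.List;

public class AveragingIntDemo {

    // Averages an int-valued property; the result is still a Double.
    static Double averageLength(List<String> names) {
        return names.stream().collect(averagingInt(String::length));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "John Nhoj", "South Htuos", "Reet Teer", "Prateema Rai", "Yogen Rai");
        // Lengths are 9, 11, 9, 12 and 9, so the average is 50 / 5.
        System.out.println(averageLength(names)); // 10.0
    }
}
```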

 

Finding total salary

Double totalSalary = employees.stream().collect(summingDouble(Employee::getSalary));
// 1303.95

summingInt(ToIntFunction<? super T> mapper) and summingLong(ToLongFunction<? super T> mapper) are available for summing Integer and Long types.

Finding max salary

Double maxSalary = employees.stream().collect(collectingAndThen(maxBy(comparingDouble(Employee::getSalary)), emp -> emp.get().getSalary()));
// 300.99

The collectingAndThen() function is declared as:

Collector<T,A,RR> collectingAndThen(Collector<T,A,R> downstream, Function<R,RR> finisher)

The finisher function can be used to format the final result of the Collector, for example:

String avgSalary = employees.stream()
        .collect(collectingAndThen(averagingDouble(Employee::getSalary), new DecimalFormat("'$'0.000")::format));
// $260.790
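One thing to keep in mind with the max salary example is that maxBy() produces an Optional, so calling get() in the finisher throws NoSuchElementException when the stream is empty. A possible alternative, sketched below with a trimmed-down Employee (name and salary only, an assumption for brevity), keeps the Optional and lets the caller decide on a default.

```java
import static java.util.Comparator.comparingDouble;
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.maxBy;

import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class MaxSalaryDemo {

    // Trimmed-down stand-in for the article's Employee class.
    static class Employee {
        private final String name;
        private final Double salary;

        Employee(String name, Double salary) {
            this.name = name;
            this.salary = salary;
        }

        String getName() { return name; }
        Double getSalary() { return salary; }
    }

    // The finisher maps the Optional<Employee> to an Optional<Double> instead of calling get().
    static Optional<Double> maxSalary(List<Employee> employees) {
        return employees.stream().collect(
                collectingAndThen(
                        maxBy(comparingDouble(Employee::getSalary)),
                        best -> best.map(Employee::getSalary)));
    }

    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("John Nhoj", 200.99),
                new Employee("South Htuos", 299.99),
                new Employee("Reet Teer", 300.99));
    }

    public static void main(String[] args) {
        System.out.println(maxSalary(sampleEmployees()).orElse(0.0)); // 300.99
        // An empty list yields an empty Optional rather than an exception.
        System.out.println(maxSalary(Collections.<Employee>emptyList()).isPresent()); // false
    }
}
```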

 

Calculating statistics in one shot

DoubleSummaryStatistics statistics = employees.stream().collect(summarizingDouble(Employee::getSalary));
System.out.println("Average: " + statistics.getAverage() + ", Total: " + statistics.getSum() + ", Max: " + statistics.getMax() + ", Min: "+ statistics.getMin());
// Average: 260.79, Total: 1303.95, Max: 300.99, Min: 200.99

Similarly, summarizingInt(ToIntFunction<? super T> mapper) and summarizingLong(ToLongFunction<? super T> mapper) are available for Integer and Long types.
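As a rough illustration of summarizingLong(), the sketch below summarizes the sample salaries after converting them to whole cents. The conversion helper toCents() and the class name SummarizingLongDemo are hypothetical and not part of the original example.

```java
import static java.util.stream.Collectors.summarizingLong;

import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;

public class SummarizingLongDemo {

    // Hypothetical helper: converts a salary to whole cents.
    static long toCents(double salary) {
        return Math.round(salary * 100);
    }

    // Collects count, sum, min, max and average of the long values in one pass.
    static LongSummaryStatistics salaryStatsInCents(List<Double> salaries) {
        return salaries.stream().collect(summarizingLong(SummarizingLongDemo::toCents));
    }

    public static void main(String[] args) {
        List<Double> salaries = Arrays.asList(200.99, 299.99, 300.99, 300.99, 200.99);
        LongSummaryStatistics stats = salaryStatsInCents(salaries);
        System.out.println("Total: " + stats.getSum() + ", Max: " + stats.getMax()
                + ", Min: " + stats.getMin() + ", Count: " + stats.getCount());
        // Total: 130395, Max: 30099, Min: 20099, Count: 5
    }
}
```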

2. Mapping and Joining Streams

Mapping only employee names

List<String> employeeNames = employees.stream().collect(mapping(Employee::getName, toList()));
// [John Nhoj, South Htuos, Reet Teer, Prateema Rai, Yogen Rai]

 

Joining employee names

String employeeNamesStr = employees.stream().map(Employee::getName).collect(joining(","));
// John Nhoj,South Htuos,Reet Teer,Prateema Rai,Yogen Rai

The joining() function has an overloaded version that takes a prefix and a suffix:

Collector<CharSequence,?,String> joining(CharSequence delimiter, CharSequence prefix, CharSequence suffix)

So, if you want to collect the employee names in a specific format, you can do:

String formattedEmployeeNames = employees.stream().map(Employee::getName).collect(joining(", ", "Employees = {", "}"));
// Employees = {John Nhoj, South Htuos, Reet Teer, Prateema Rai, Yogen Rai}

 

3. Grouping Elements

Grouping employees by Department

groupingBy() takes a classifier Function:

Collector<T,?,Map<K,List<T>>> groupingBy(Function<? super T,? extends K> classifier)

So, grouping of employees by department is:

Map<String, List<Employee>> deptEmps = employees.stream().collect(groupingBy(Employee::getDepartment)); 

// {Sales=[{empId='E223', name='South Htuos', salary=299.99, department='Sales'}, {empId='E323', name='Yogen Rai', salary=200.99, department='Sales'}], Benefits=[{empId='E143', name='Prateema Rai', salary=300.99, department='Benefits'}], IT=[{empId='E123', name='John Nhoj', salary=200.99, department='IT'}, {empId='E133', name='Reet Teer', salary=300.99, department='IT'}]}

Counting employees per Department

There is an overloaded version of groupingBy() that also takes a downstream collector, which is applied to the elements of each group:

Collector<T,?,Map<K,D>> groupingBy(Function<? super T,? extends K> classifier, Collector<? super T,A,D> downstream)

So, counting of employees per department would be:

Map<String, Long> deptEmpsCount = employees.stream().collect(groupingBy(Employee::getDepartment, counting()));
// {Sales=2, Benefits=1, IT=2}
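Any collector can serve as the downstream, not just counting(). As a sketch, the mapping() collector from section 2 can be combined with groupingBy() to list only the employee names per department. The trimmed-down Employee (name and department only) and the class name NamesPerDepartmentDemo are assumptions for brevity.

```java
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.toList;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class NamesPerDepartmentDemo {

    // Trimmed-down stand-in for the article's Employee class.
    static class Employee {
        private final String name;
        private final String department;

        Employee(String name, String department) {
            this.name = name;
            this.department = department;
        }

        String getName() { return name; }
        String getDepartment() { return department; }
    }

    // Groups by department, then maps each employee in a group to its name.
    static Map<String, List<String>> namesByDepartment(List<Employee> employees) {
        return employees.stream().collect(
                groupingBy(Employee::getDepartment, mapping(Employee::getName, toList())));
    }

    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("John Nhoj", "IT"),
                new Employee("South Htuos", "Sales"),
                new Employee("Reet Teer", "IT"),
                new Employee("Prateema Rai", "Benefits"),
                new Employee("Yogen Rai", "Sales"));
    }

    public static void main(String[] args) {
        Map<String, List<String>> names = namesByDepartment(sampleEmployees());
        System.out.println(names.get("IT"));    // [John Nhoj, Reet Teer]
        System.out.println(names.get("Sales")); // [South Htuos, Yogen Rai]
    }
}
```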

Calculating average salary per Department with sorted Department name

Another overloaded version of groupingBy() also takes a map factory:

Collector<T,?,M> groupingBy(Function<? super T,? extends K> classifier, Supplier<M> mapFactory, Collector<? super T,A,D> downstream)

A TreeMap can be used as the map factory to keep the groups sorted by department name:

Map<String, Double> averageSalaryDeptSorted = employees.stream().collect(groupingBy(Employee::getDepartment, TreeMap::new, averagingDouble(Employee::getSalary)));
// {Benefits=300.99, IT=250.99, Sales=250.49}

There is also a concurrent version, groupingByConcurrent(), which collects into a ConcurrentMap and is meant to leverage multi-core architectures when used with a parallel stream.

Map<String, Long> deptEmpCount = employees.stream().collect(groupingByConcurrent(Employee::getDepartment, counting())); 
// {Sales=2, IT=2, Benefits=1}
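The snippet above runs on a sequential stream; the concurrent collector is mainly of interest when the stream is parallel. A minimal sketch using parallelStream() follows. To keep it short it groups plain department names rather than Employee objects, which is an assumption of this example, as is the class name ConcurrentGroupingDemo.

```java
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingByConcurrent;

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

public class ConcurrentGroupingDemo {

    // Counts how often each department name occurs, using a parallel stream.
    static ConcurrentMap<String, Long> countByDepartment(List<String> departments) {
        return departments.parallelStream()
                .collect(groupingByConcurrent(Function.identity(), counting()));
    }

    public static void main(String[] args) {
        List<String> departments = Arrays.asList("IT", "Sales", "IT", "Benefits", "Sales");
        ConcurrentMap<String, Long> counts = countByDepartment(departments);
        // Key order of the printed map is not guaranteed, so look up each key.
        System.out.println(counts.get("IT"));       // 2
        System.out.println(counts.get("Sales"));    // 2
        System.out.println(counts.get("Benefits")); // 1
    }
}
```

For a list this small the parallel version is unlikely to be faster; it only illustrates the API.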

 

4. Partitioning Elements

partitioningBy() takes a predicate and partitions the result into a Map with two keys: true for the elements that meet the predicate and false for those that do not:

Collector<T,?,Map<Boolean,List<T>>> partitioningBy(Predicate<? super T> predicate)

Finding the employees with a salary greater than the average salary:

Map<Boolean, List<Employee>> partitionedEmployees = employees.stream().collect(partitioningBy(e -> e.getSalary() > averageSalary));
// {false=[{empId='E123', name='John Nhoj', salary=200.99, department='IT'}, {empId='E323', name='Yogen Rai', salary=200.99, department='Sales'}],
// true=[{empId='E223', name='South Htuos', salary=299.99, department='Sales'}, {empId='E133', name='Reet Teer', salary=300.99, department='IT'}, {empId='E143', name='Prateema Rai', salary=300.99, department='Benefits'}]}

You can use the overloaded version of this method to further reduce each partition with a downstream collector:

Collector<T,?,Map<Boolean,D>> partitioningBy(Predicate<? super T> predicate, Collector<? super T,A,D> downstream)
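For instance, passing counting() as the downstream yields the number of employees on each side of the threshold instead of the employees themselves. This is only a sketch: the trimmed-down Employee, the class name PartitionCountDemo, and passing the threshold as a parameter are assumptions made for this example.

```java
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.partitioningBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class PartitionCountDemo {

    // Trimmed-down stand-in for the article's Employee class.
    static class Employee {
        private final String name;
        private final Double salary;

        Employee(String name, Double salary) {
            this.name = name;
            this.salary = salary;
        }

        String getName() { return name; }
        Double getSalary() { return salary; }
    }

    // Partitions by the predicate, then counts the elements in each partition.
    static Map<Boolean, Long> countAboveThreshold(List<Employee> employees, double threshold) {
        return employees.stream()
                .collect(partitioningBy(e -> e.getSalary() > threshold, counting()));
    }

    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("John Nhoj", 200.99),
                new Employee("South Htuos", 299.99),
                new Employee("Reet Teer", 300.99),
                new Employee("Prateema Rai", 300.99),
                new Employee("Yogen Rai", 200.99));
    }

    public static void main(String[] args) {
        // 260.79 is the average salary computed in section 1.
        Map<Boolean, Long> counts = countAboveThreshold(sampleEmployees(), 260.79);
        System.out.println(counts); // {false=2, true=3}
    }
}
```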

 

Conclusion

The Collectors class has many utility functions for operating over a stream and extracting a meaningful result.

All the source code for the examples above is available on GitHub.
