In this guide we will learn in detail how to perform a reduce operation on a Java stream.
Most developers find the reduce operation difficult to grasp at first.
Understanding the core concepts is important before we jump into the details.
Basics
A reduction operation takes a sequence of input elements and combines them into a single summary result.
For example, take a list of numbers and return their sum.
Or take a list of strings and concatenate them into a single string.
The reduce operation takes up to three arguments, of which the first two are the most commonly used.
List of arguments
- Identity
- Accumulator
- Combiner
Identity
The first argument is the identity, which is the initial value of the summary result. It is also the value returned when the stream is empty.
If we want the sum of numbers, the initial value is 0.
If we want to join strings, the initial value is the empty string "".
Let us take a simple example where we have a list of numbers and want the sum of those numbers.
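A minimal sketch of such a sum is shown here. The class name ReduceSumExample and the method name sum are our own choices for illustration; only stream() and reduce() come from the Java API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReduceSumExample {

    // Sums the numbers, starting from the identity value 0.
    static int sum(List<Integer> numbers) {
        return numbers.stream()
                .reduce(0, (subtotal, nextElement) -> subtotal + nextElement);
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(10, 20, 30, 40, 50));
        System.out.println(sum(numbers)); // prints 150
    }
}
```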
In this example we create a stream from an ArrayList of numbers and call the reduce() method. The first argument is the identity, 0, which is nothing but the initial value of the sum.
Accumulator
The accumulator is the second argument of the reduce method. It consists of the accumulator arguments and the accumulator implementation.
It contains the logic that takes values one by one from the stream and combines them into a single value.
In the example below we simply take the values from the stream and add them up.
(subtotal,nextElement) -> subtotal+nextElement
(subtotal,nextElement) : These are the accumulator arguments
subtotal+nextElement : This is the accumulator implementation
subtotal stores the partial sum of the integer values up to the current step
nextElement is the next element processed by the stream
Steps
Stream —> 10,20,30,40,50
Step 1: subtotal=initialValue(0) so subtotal=0
nextElement=10
Returns subtotal=subtotal+nextElement=0+10=10
Step 2: subtotal=10
nextElement=20
Returns subtotal= subtotal+nextElement=10+20=30
Step 3: subtotal=30
nextElement=30
Returns subtotal=subtotal+nextElement=30+30=60
Step 4: subtotal=60
nextElement=40
Returns subtotal=subtotal+nextElement=60+40=100
Step 5: subtotal=100
nextElement=50
Returns subtotal=subtotal+nextElement=100+50=150
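The steps above can be reproduced with a small sketch that records every accumulator call. The class ReduceTraceExample and the method traceSum are hypothetical names used only for illustration, and writing to a list from inside the lambda is done here only to make the steps visible on a sequential stream. It is not something to do in real code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReduceTraceExample {

    // Returns one text line per accumulator call, followed by the final total.
    static List<String> traceSum(List<Integer> numbers) {
        List<String> steps = new ArrayList<>();
        int total = numbers.stream().reduce(0, (subtotal, nextElement) -> {
            int result = subtotal + nextElement;
            // Record the step (side effect for demonstration only).
            steps.add(subtotal + "+" + nextElement + "=" + result);
            return result;
        });
        steps.add("total=" + total);
        return steps;
    }

    public static void main(String[] args) {
        for (String step : traceSum(Arrays.asList(10, 20, 30, 40, 50))) {
            System.out.println(step);
        }
    }
}
```

On a sequential stream built from a list, we would expect the recorded lines to follow the same order as the steps listed above.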
Similarly, we can combine a list of strings into one string.
In this case the identity is the empty string "".
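A short sketch of the string case, with the hypothetical names ReduceConcatExample and join chosen only for illustration:

```java
import java.util.Arrays;
import java.util.List;

public class ReduceConcatExample {

    // Concatenates all strings, starting from the empty string as identity.
    static String join(List<String> words) {
        return words.stream()
                .reduce("", (partial, nextWord) -> partial + nextWord);
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("a", "b", "c"))); // prints abc
    }
}
```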
Combiner
The combiner is used with parallel streams, or when the argument types of the accumulator do not match, even in sequential streams.
Parallel Stream
We can perform parallel operations on streams.
In such cases reduce operations are executed in parallel, taking advantage of multi-core hardware.
Parallel streams are worth considering when we work with large streams and need to perform expensive operations.
Let us take a simple example that uses a parallel stream.
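A minimal sketch of a parallel sum follows. ParallelSumExample and parallelSum are hypothetical names, and the list is far too small to benefit from parallelism; it only shows the call.

```java
import java.util.Arrays;
import java.util.List;

public class ParallelSumExample {

    // Same reduction as before, but on a parallel stream.
    static int parallelSum(List<Integer> numbers) {
        return numbers.parallelStream()
                .reduce(0, (subtotal, nextElement) -> subtotal + nextElement);
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(Arrays.asList(10, 20, 30, 40, 50))); // prints 150
    }
}
```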
Instead of calling stream() on the list we call parallelStream(). In this case the reduce() operation runs in parallel on parts of the stream, and each part produces an intermediate result.
To combine these intermediate results we use the combiner, which is the third argument. Here the accumulator and the combiner are the same, because both accumulator arguments have the same type as the result, so we can skip the third argument.
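To illustrate that the third argument is optional in this case, the sketch below computes the same sum once without and once with an explicit combiner. The class and method names are assumptions made for the example; Integer::sum is the standard static method that adds two ints.

```java
import java.util.Arrays;
import java.util.List;

public class ExplicitCombinerExample {

    // Two-argument form: the accumulator also merges the partial results.
    static int sumWithoutCombiner(List<Integer> numbers) {
        return numbers.parallelStream()
                .reduce(0, (subtotal, nextElement) -> subtotal + nextElement);
    }

    // Three-argument form: the combiner is spelled out explicitly.
    static int sumWithCombiner(List<Integer> numbers) {
        return numbers.parallelStream()
                .reduce(0, (subtotal, nextElement) -> subtotal + nextElement, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(10, 20, 30, 40, 50);
        System.out.println(sumWithoutCombiner(numbers)); // prints 150
        System.out.println(sumWithCombiner(numbers));    // prints 150
    }
}
```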
Let us consider another example where the argument types in the accumulator are different.
Here we take a list of strings and need the sum of the lengths of all the strings.
The arguments to the accumulator are (Integer, String): the partial sum and the next string. Hence we need to provide a combiner, which merges the results of the partial reductions that run in parallel.
Here we cannot skip the third argument, and the combiner is executed to merge the partial results.
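A sketch of this case, assuming a hypothetical class ParallelLengthExample with a method totalLength:

```java
import java.util.Arrays;
import java.util.List;

public class ParallelLengthExample {

    // Accumulator: (Integer partial sum, String element) -> Integer.
    // Combiner: adds two partial sums produced by different parts of the stream.
    static int totalLength(List<String> words) {
        return words.parallelStream()
                .reduce(0,
                        (subtotal, word) -> subtotal + word.length(),
                        Integer::sum);
    }

    public static void main(String[] args) {
        // "java" (4) + "stream" (6) + "reduce" (6)
        System.out.println(totalLength(Arrays.asList("java", "stream", "reduce"))); // prints 16
    }
}
```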
Arguments in the accumulator do not match for sequential streams
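Let us consider a scenario where we take a stream of string values and want the sum of the lengths of all the strings.
If we try this with the two-argument reduce method and an int identity, for example reduce(0, (subtotal, element) -> subtotal + element.length()), we get the compilation error “argument mismatch; int cannot be converted to java.lang.String”.
The two-argument reduce method is declared as T reduce(T identity, BinaryOperator<T> accumulator), so on a stream of strings it accepts two String arguments and returns a String.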
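To make this use case work we need the reduce method that takes three arguments.
The three-argument reduce method is declared as follows:
<U> U reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator<U> combiner)
With this method the accumulator can take arguments of different types: the partial result of type U and a stream element of type T. The combiner merges two partial results of type U.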
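A sketch of the sequential version follows. The names SequentialLengthExample, totalLength and combinerCalls are our own. The counter exists only to observe whether the combiner runs, and the count of 0 reflects what we expect from the OpenJDK implementation rather than a documented guarantee.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class SequentialLengthExample {

    // Counts combiner invocations (for demonstration only).
    static final AtomicInteger combinerCalls = new AtomicInteger();

    static int totalLength(List<String> words) {
        // The two-argument form does not compile here, because T is String:
        // words.stream().reduce(0, (subtotal, word) -> subtotal + word.length());
        return words.stream()
                .reduce(0,
                        (subtotal, word) -> subtotal + word.length(),
                        (left, right) -> {
                            combinerCalls.incrementAndGet();
                            return left + right;
                        });
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("java", "stream", "reduce"))); // prints 16
        System.out.println(combinerCalls.get()); // expected 0 on a sequential stream
    }
}
```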
In a sequential stream the combiner is never invoked, but we still have to specify it, because the three-argument reduce method requires it.
In conclusion:
- For a sequential stream, if the arguments of the accumulator are of different types we need to provide a combiner. The combiner is not executed here.
- For a parallel stream, if the arguments of the accumulator are of different types we need to provide a combiner. If the arguments are of the same type we can skip the combiner, since the combiner and the accumulator are the same. The combiner is executed here to merge the partial results.