DataWeave - The flatMap Function

Use cases for DataWeave's flatMap function.

Introduction

When I first saw the flatMap function introduced in DataWeave 2.0, I was a bit confused. At its core it's very simple: it's a convenient way to map over an Array and flatten the result using a single function instead of two. I assumed it was added to the language for good reason, but I couldn't think of a single use case! Well, that time has finally come, so I wanted to share. The purpose of this post is to briefly describe what flatMap is by exploring increasingly complex use cases.

Use Case 1: Refactoring

This use case is rather simple: if you ever see flatten wrapping the result of map, eliminate the flatten call and change map to flatMap. Here's an example that pairs every number with its double and flattens the pairs into a single Array:

%dw 2.0
output application/json

var data = [1, 2, 3, 4, 5]
---
flatten(
  data map (n) -> [n, n * 2]
)

// Returns: [1,2,2,4,3,6,4,8,5,10]

This simplifies to:

data flatMap (n) -> [n, n * 2]

Try to keep this in mind if you ever get confused about how flatMap works. It's just sending the result of map to flatten.
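
As a quick sanity check of that equivalence, here is a minimal sketch that runs both forms side by side on a small made-up Array (the key names viaFlatten and viaFlatMap are just illustrative labels):

%dw 2.0
output application/json

var data = [1, 2, 3]
---
{
  // Two-step version: map first, then flatten the nested result
  viaFlatten : flatten(data map (n) -> [n, n * 2]),
  // One-step version: flatMap does both
  viaFlatMap : (data flatMap (n) -> [n, n * 2])
}

// Returns: { "viaFlatten": [1, 2, 2, 4, 3, 6], "viaFlatMap": [1, 2, 2, 4, 3, 6] }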

Use Case 2: Adding data between existing Array items

flatMap is also useful when you want to add data to existing Array items. You can do this with operators like + and ++, but they only append to the end of the Array. What if you wanted to add values in between the existing values? For example:

Input  : [1,2,3]
Output : [1, <new_val>, 2, <new_val>, 3, <new_val>]

flatMap provides a very convenient way to do this. Use flatMap to map into an Array with multiple values:

%dw 2.0

var arr = [1,2,3,4,5]
---
arr flatMap [$, isEven($)]

// Returns: [1, false, 2, true, 3, false, 4, true, 5, false]
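
The $ in the script above is shorthand for the current item. The same idea can be sketched with named lambda parameters; like map, flatMap passes the item's index as the second argument. The letters variable and its values are made up for illustration:

%dw 2.0
output application/json

var letters = ["a", "b", "c"]
---
// Put each item's index in front of the item itself
letters flatMap (item, index) -> [index, item]

// Returns: [0, "a", 1, "b", 2, "c"]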

If you're more of a visual learner, flatMap works like this:

[Image: diagram of how flatMap works]

While this example is good for illustrating what flatMap can do, it doesn't do a good job of illustrating how this particular function might be useful. Let's come up with a business use case that lends itself to using flatMap.

Practical Use Cases

Imagine that you rent out part of your house on Airbnb. At the end of each month you get an itemized list of your Airbnb income in the form of a CSV. Each line of the CSV represents a reservation for your Airbnb. It contains an ID, the date the reservation started on, the amount that was charged to the tenant for the entire reservation, and the amount that was charged to the tenant as a cleaning fee. Here's an example:

id,startDate,reservation,cleaningFee
1,2020-01-01,201.23,20
2,2020-01-09,100.05,20
3,2020-01-11,301.40,20
4,2020-01-20,407.80,20
5,2020-01-30,101.40,0

You want to import this file into your accounting system as an invoice. You can import invoices into your accounting system by uploading a CSV where each line of the CSV is a line item, or a single charge. Aside from the different headers necessary, the main problem is that the CSV you're receiving from Airbnb has multiple line items on a single CSV line (both reservation amount and cleaning fee). You need an output like this to accommodate what your accounting system is expecting:

invoiceNo,lineItem,itemAmount
10000,Reservation - 2020-01-01,201.23
10000,Cleaning Fee - 2020-01-01,20
10000,Reservation - 2020-01-09,100.05
10000,Cleaning Fee - 2020-01-09,20
10000,Reservation - 2020-01-11,301.40
10000,Cleaning Fee - 2020-01-11,20
10000,Reservation - 2020-01-20,407.80
10000,Cleaning Fee - 2020-01-20,20
10000,Reservation - 2020-01-30,101.40
10000,Cleaning Fee - 2020-01-30,0

This is a great use case for flatMap because you want to explode each input record into multiple records. That's pretty much the crux of when you'd want to use flatMap: if you want to explode single Array items into multiple Array items, use flatMap.

Here's how we can accomplish the task:

%dw 2.0
output application/csv

var invoiceNo = 10000
---
payload flatMap (line) -> [
  {
    invoiceNo  : invoiceNo,
    lineItem   : "Reservation - " ++ line.startDate,
    itemAmount : line.reservation
  },
  {
    invoiceNo  : invoiceNo,
    lineItem   : "Cleaning Fee - " ++ line.startDate,
    itemAmount : line.cleaningFee
  }
]

Notice that we're mapping each line into an Array of two values. The first value contains the reservation line item, and the second one contains the cleaning fee line item.

Note also that the number of items in the lambda's Array (right side of ->) dictates the number of values the original Array item will explode into. If that Array contains 5 values, flatMap will explode every item in the input Array into 5 values in the output Array. What if the lambda's Array only contains 1 value? 1 value "explodes" into 1 value. Sounds like map, right? It is, with more steps! So if you find yourself in a situation where you're using flatMap to explode one value into one value, just use map instead.
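
To make the one-into-one case concrete, here is a small sketch with made-up data showing that wrapping a single value in an Array and using flatMap gives the same result as a plain map (the key names are illustrative only):

%dw 2.0
output application/json

var data = [1, 2, 3]
---
{
  // Each item "explodes" into a one-item Array, which is then flattened away
  viaFlatMap : (data flatMap (n) -> [n * 2]),
  // Plain map produces the same Array with less ceremony
  viaMap     : (data map (n) -> n * 2)
}

// Returns: { "viaFlatMap": [2, 4, 6], "viaMap": [2, 4, 6] }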

Practical Use Cases (Cont.)

Being able to explode an item in an Array into N items is useful, but what if you need to explode some items and not explode others? Perhaps you need to explode certain items into 2 items, and others into 3.

Let's check out a concrete example based on what we accomplished previously. In the data above there is a cleaning fee of $0. It doesn't make sense to add that as a line item, so we'd like to remove it from the output. We can still use flatMap for this, but we'll need to leverage another DW language feature: conditional Array items (https://docs.mulesoft.com/mule-runtime/4.2/dataweave-types#conditional_elements_array).

Conditional Array items allow you to construct an Array (e.g. [1,2,3]) while using Boolean expressions to determine if a particular item should appear in the resulting Array. To do this, you wrap the item in parentheses, and append an if <boolean_expression> afterwards. Here's an example:

%dw 2.0
output application/json
---
[
  (1) if isOdd(1),
  (2) if isEven(2),
  (3) if isEven(3)
]
// Returns: [1, 2] 

You can put any valid DW expression in between the parentheses, but you MUST have parentheses surrounding the expression, even if it's just a single value like a Number or String.
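
As a further illustration, here is a sketch that mixes an unconditional item with conditional ones, including a function call inside the parentheses. The includeDebug flag is a hypothetical variable introduced only for this example:

%dw 2.0
output application/json

// Hypothetical flag, used only for illustration
var includeDebug = false
---
[
  "always",                           // no condition: always present
  ("debug") if includeDebug,          // dropped, because the flag is false
  (upper("shown")) if !includeDebug   // kept, and the expression is evaluated
]
// Returns: ["always", "SHOWN"]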

We can leverage this feature in combination with flatMap to get what we're looking for:

%dw 2.0
output application/csv

var invoiceNo = 10000
---
payload flatMap (line) -> [
  {
    invoiceNo  : invoiceNo,
    lineItem   : "Reservation - " ++ line.startDate,
    itemAmount : line.reservation
  },
  ({
     invoiceNo  : invoiceNo,
     lineItem   : "Cleaning Fee - " ++ line.startDate,
     itemAmount : line.cleaningFee
  }) if line.cleaningFee > 0
]

Which outputs:

invoiceNo,lineItem,itemAmount
10000,Reservation - 2020-01-01,201.23
10000,Cleaning Fee - 2020-01-01,20
10000,Reservation - 2020-01-09,100.05
10000,Cleaning Fee - 2020-01-09,20
10000,Reservation - 2020-01-11,301.40
10000,Cleaning Fee - 2020-01-11,20
10000,Reservation - 2020-01-20,407.80
10000,Cleaning Fee - 2020-01-20,20
10000,Reservation - 2020-01-30,101.40

Now the $0 cleaning fee for 2020-01-30 is no longer in the output. As you can see, flatMap and conditional Array items work well together to accomplish many use cases associated with exploding Array items into multiple items without requiring a lot of code.
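
If you'd like to try the pattern without a CSV file, here is a stripped-down sketch that uses a hypothetical inline Array in place of the payload and emits only the line item descriptions. The amounts here are Number literals; values read from a CSV typically arrive as Strings, so depending on your setup you may want an explicit coercion such as "as Number" before comparing:

%dw 2.0
output application/json

// Hypothetical inline sample standing in for the parsed CSV payload
var reservations = [
  { startDate: "2020-01-20", reservation: 407.80, cleaningFee: 20 },
  { startDate: "2020-01-30", reservation: 101.40, cleaningFee: 0 }
]
---
reservations flatMap (line) -> [
  "Reservation - " ++ line.startDate,
  // Only present when a cleaning fee was actually charged
  ("Cleaning Fee - " ++ line.startDate) if line.cleaningFee > 0
]

// Returns: ["Reservation - 2020-01-20", "Cleaning Fee - 2020-01-20", "Reservation - 2020-01-30"]

The first reservation explodes into two items and the second into one, which is the uneven explosion described above.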

Considerations for Code Organization

When you're using flatMap as described above, it's often true that you're doing the same kind of mapping for every item in the lambda Array, with just slightly different inputs. When you find that this is the case, break out the mapping into a separate function like this:

%dw 2.0
output application/csv

var invoiceNo = 10000
    
fun lineItem(invoiceNo, itemDesc, amount) =
  {
    invoiceNo  : invoiceNo,
    lineItem   : itemDesc,
    itemAmount : amount
  }
---
payload flatMap (line) -> [
  lineItem(invoiceNo, 
           "Reservation - " ++ line.startDate, 
           line.reservation),
  (lineItem(invoiceNo,
            "Cleaning Fee - " ++ line.startDate,
            line.cleaningFee)) if line.cleaningFee > 0
]

While not obvious in this example, this technique can be particularly useful if the data mapping is large or complex.

Conclusion

This post explained the flatMap function in DataWeave. We went over how to use it for simplifying code that uses flatten and map together, as well as how we can use it to add data in between Array items. When we turned to practical examples, we found that flatMap is useful for exploding each Array item into multiple items without having them nested in the end result. Finally, we saw that we're not limited to a 1:N explosion for each item in the Array (i.e. each item explodes into the same number of items); we can leverage conditional Array items to get a 1:? explosion (i.e. each item explodes into a different number of items depending on the results of some predicates). Can you think of other use cases for flatMap? If so, please share them below in the comments!