DataWeave - How to Modify All Values of an Element

An iterative approach to developing a utility function that modifies all the values of an element according to a function.

Recently, I was pulling data from a stored procedure that returned a large amount of whitespace in some fields but not in others, and sometimes those fields were null. I wanted to strip that whitespace. Below I document how I got to the final iteration of the code, using more and more functional programming concepts as I went. Here was my first go at it:

%dw 1.0
%output application/java

%function removeWhitespace( v ) trim v when v is :string otherwise v
---
payload map {
  field1: removeWhitespace( $.field1 ),
  field2: removeWhitespace( $.field2 )
  ...
}
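
To see what this first version does with padded strings, nulls, and non-strings, here is a minimal sketch that swaps the stored procedure result for a couple of inline rows. The rows variable and its sample values are made up for illustration, and the output is set to JSON only to make the result easy to read:

%dw 1.0
%output application/json

%function removeWhitespace( v ) trim v when v is :string otherwise v

%var rows = [
  { field1: "  padded   ", field2: null },
  { field1: "clean",       field2: 42 }
]
---
rows map {
  field1: removeWhitespace( $.field1 ),
  field2: removeWhitespace( $.field2 )
}

The padded string should come back trimmed, while the null and the number pass through unchanged, because the type check guards the call to trim.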

This works, but it has two problems. First, it obscures the mapping by adding a repetitive function call to every field. Second, if you don't want to wrap every field, you have to identify individually which fields carry extra whitespace and which don't. That is time-consuming and potentially impossible, since it could change from row to row in the database. So this solution may work today, but it's incredibly brittle if things change. Here's my second iteration:

%dw 1.0
%output application/java

%function removeWhitespaceFromValues(obj) 
  obj mapObject { ( $$ ): trim $ when $ is :string otherwise $ }

%var object = payload map removeWhitespaceFromValues( $ );
---
object map {
  field1: $.field1,
  field2: $.field2
  ...
}

Awesome! We no longer have to identify individually which fields have a bunch of whitespace, because the function doesn't care. It trims every value that is a string, which is exactly what we want. But could it be better? What if someone wanted to use this code in the future to do the same thing to a JSON object with nested objects and lists? The second iteration can't do that: mapObject only visits the top-level values, so strings inside nested objects and arrays are never trimmed. Let's take a stab at it:

%dw 1.0
%output application/java

%function removeWhitespaceFromValues( e )
  e match {
    :array  -> $ map removeWhitespaceFromValues( $ ),
    :object -> $ mapObject { ( $$ ): removeWhitespaceFromValues( $ ) },
    default -> trim $ when $ is :string otherwise $
  } 

%var object = removeWhitespaceFromValues( payload );
---
...
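
To make the recursion concrete, here is a small sketch that runs the function over a nested structure. The nested variable and its contents are hypothetical sample data, not part of the original flow, and the output is set to JSON purely for readability:

%dw 1.0
%output application/json

%function removeWhitespaceFromValues( e )
  e match {
    :array  -> $ map removeWhitespaceFromValues( $ ),
    :object -> $ mapObject { ( $$ ): removeWhitespaceFromValues( $ ) },
    default -> trim $ when $ is :string otherwise $
  }

%var nested = {
  name: "  Ada  ",
  address: { city: " London   ", zip: null },
  tags: [ "  a", "b  ", 3 ]
}
---
removeWhitespaceFromValues( nested )

Strings at every depth (name, address.city, and the entries of tags) should come back trimmed, while the null and the number pass through untouched. The flat mapObject version from the second iteration would only have trimmed name, since address and tags are not strings themselves.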

Cool. Now we have a function that removes the whitespace from every string value in an element, including deeply nested ones. You might think we're done, but we can do a lot better using something called higher-order functions: functions that take other functions as arguments, return functions as results, or both. In other words, we're going to pass a function into our existing function to specify exactly how we want it to work. This works because functions are first-class citizens in DataWeave, much like objects are in Java. You can assign them to variables, pass them to other functions, return them from functions, and so on. Java had no direct equivalent for functions before Java 8. This gives functional programmers a very powerful means of abstraction, like classes do for object-oriented programmers. Try doing the same in Java with a similar amount of code (hint: anonymous classes, or Java 8 lambdas). Here’s the final iteration of the code:

%dw 1.0
%output application/java

%function applyToValues( e, fn )
  e match {
    :array  -> $ map applyToValues( $, fn ),
    :object -> $ mapObject { ( $$ ): applyToValues( $, fn ) },
    default -> fn( $ )
  } 

%function trimWhitespace( v )
  trim v when v is :string otherwise v

%var object = applyToValues( payload, trimWhitespace );
---
...
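
Since functions are values, the function handed to applyToValues does not have to be declared with %function; it can just as well live in a variable. A minimal sketch, assuming the same applyToValues as above. The names trimIfString and sample, and the sample data, are made up for illustration, and the lambda body is wrapped in parentheses to keep the when/otherwise grouping unambiguous:

%dw 1.0
%output application/json

%function applyToValues( e, fn )
  e match {
    :array  -> $ map applyToValues( $, fn ),
    :object -> $ mapObject { ( $$ ): applyToValues( $, fn ) },
    default -> fn( $ )
  }

%var trimIfString = ( v ) -> ( trim v when v is :string otherwise v )

%var sample = { name: "  Ada  ", tags: [ " a ", "b  " ], age: 36 }
---
applyToValues( sample, trimIfString )

The traversal logic never looks inside trimIfString; it only calls whatever it was given on each leaf value, which is what makes it reusable.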

This effectively makes the act of looping through every value in an element completely generic (applyToValues). From here, we can define exactly what we want to happen for each value in the element (trimWhitespace). What if we wanted to do something different for every value in the object? Just change the function you pass in. Maybe you want to trim the value if it's a string, and increment it if it's a number. Let's see what that would look like:

%dw 1.0
%output application/java

%function applyToValues( e, fn )
  e match {
    :array  -> $ map applyToValues( $, fn ),
    :object -> $ mapObject { ( $$ ): applyToValues( $, fn ) },
    default -> fn( $ )
  } 

%function trimOrIncrement( v )
  v match {
    :string -> trim v,
    :number -> v + 1,
    default -> v
  }

%var object = applyToValues( payload, trimOrIncrement );
---
...

Notice the most important thing here: the applyToValues function did not need to change at all. The only thing we changed was the function we passed into it. One last point: we don't even need to give our function a name. We can create the second argument to applyToValues on the spot using a lambda, or anonymous function. Here we use a lambda to increment the value if it's a number:

%dw 1.0
%output application/java

%function applyToValues( e, fn )
  e match {
    :array  -> $ map applyToValues( $, fn ),
    :object -> $ mapObject { ( $$ ): applyToValues( $, fn ) },
    default -> fn( $ )
  } 

%var object = applyToValues( payload, ( ( v ) -> v + 1 when v is :number otherwise v ) );
---
...