Introduction

Removing nested values seems like it would be simple, but there are a few things about the design of the DataWeave language itself that make it a bit more difficult than one might initially think. If you remember my post about coming to DataWeave from a Java background, or have seen a talk of mine, you might remember me saying that immutable data is great for most cases, but there are some areas where it can actually require code that is more complicated than you might expect. Removing nested key:value pairs is one of those cases. I'll start by discussing how we typically remove elements from flat collections, why we can't apply those techniques directly to nested collections, and finally, how to remove values from nested collections. Let's get into it.

Removing Elements from Flat Collections

Removing elements with DataWeave is actually fairly simple. You can typically get away with using filter, filterObject or -, depending on your situation.

With Arrays

If you have an array, you can use filter if there is a condition you want to test on every element, or - if you know the value you want to remove:

[1,2,3,4,5] filter mod($, 2) == 0
// Returns: [2,4]

[1,2,3,4,5] - 2
// Returns: [1,3,4,5]

Note that - matches on the element's value, not its position. You can view it as a kind of shorthand for filtering by value, so the following is functionally equivalent to the last piece of code:

[1,2,3,4,5] filter $ != 2
// Returns: [1,3,4,5]

If you want to remove by position instead, remember that with filter, $$ refers to the index of the current iteration:

[1,2,3,4,5] filter $$ != 2
// Returns: [1,2,4,5]

With Objects

If you have an object, the syntax is very similar, but you would need to use filterObject instead of filter. You can still use - to remove values because it is polymorphic (i.e., it works over multiple types):

{goodbye:"world", hello: "space"} filterObject $$ == "hello"
// Returns {"hello": "space"}

{goodbye:"world", hello: "space"} - "goodbye"
// Returns {"hello": "space"}

As an aside, you do get some additional functionality when using - with objects. You can match on key AND value at the same time (let's assume the following eventually evaluates to XML so you can forgive my duplicate keys):

{goodbye:"world", goodbye:"space"} - {goodbye: "space"}
// Returns {"goodbye": "world"}

Removing Elements from Nested Collections

As I mentioned previously, removing elements from nested collections provides a new set of challenges, and unfortunately we can't just rearrange our previous tools to overcome them. What if we only wanted to remove the secondary: "Big Ern" pair from the object associated with the "nickNames" key below:

var obj = {
  name: "Joshua Erney", 
  nickNames: {
    primary: "Jerney", 
    secondary: "Big Ern"
  }
}

An Initial Approach

For our initial approach we might try to remove it by key like this:

var obj = ...
---
obj.nickNames - "secondary"

Will that work? Unfortunately not. The obj.nickNames selector evaluates to {primary: "Jerney", secondary: "Big Ern"}, so applying ... - "secondary" to that result returns {primary: "Jerney"}. The name field is gone because we're doing nothing to retain the other elements in the obj data. You could try to use filterObject, but you'll run into the same problem because of the obj.nickNames selector.

Our first taste of success might look like this:

var obj = ...
---
obj mapObject (v, k) -> 
  if ((k as String) != "nickNames")
    {(k): v}
  else
    {(k): v - "secondary"} // You could also use filterObject here

And this will return what we want:

{
  name: "Joshua Erney", 
  nickNames: {
    primary: "Jerney"
  }
}

Is our solution adequate? In certain situations, absolutely. But if we might receive either an array or an object, or if we'd just like to provide a generic library function for the rest of our organization to use, this script will fail when it tries to call mapObject on an array. We could use pattern matching to provide different functionality based on type, and this would give us some flexibility. Here's an abbreviated example:

e match {
  case is Object -> e mapObject ...
  case is Array  -> e map ...
}
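
Filled in, that might look something like the sketch below. The name removeSecondary is made up for this example, and it assumes that any arrays we receive contain objects shaped like obj:

// Hypothetical helper: the "nickNames" and "secondary" keys are hardcoded on purpose
fun removeSecondary(e) =
  e match {
    case is Array  -> e map removeSecondary($)
    case is Object -> e mapObject (v, k) ->
                        if ((k as String) == "nickNames")
                          {(k): v - "secondary"}
                        else
                          {(k): v}
    else           -> e
  }

Calling removeSecondary(obj) or removeSecondary([obj, obj]) no longer fails on the array case, because each type is routed to the function that can handle it.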

However, we would still be hardcoding both the key associated with the data we want to work with, and the key of the key:value pair that we'd like to remove. Our current code works, but it's extremely brittle and offers no hope of reusability.

A Generic Solution

As it turns out, using pattern matching in combination with recursion can get us what we need. If you read my posts on applyToValues, and applyToValuesWhenKey, this pattern should look very familiar:

fun removePair(e, key) =
  e match {
    case is Array  -> e map removePair($, key)
    case is Object -> e mapObject (v, k) ->
                        if ((k as String) == key)
                          {}
                        else
                          {(k): removePair(v, key)}
    else           -> e
  }

Note: Thank you to my fellow MuleSoft Ambassador Manik Magar for pointing out and correcting the error in this function!

This function will crawl the data structure, rebuilding it UNLESS it finds a key that matches what gets passed for the key parameter. In that case, it will not rebuild that particular pair, effectively removing it from what is passed in as e. Notice also that we could pass in an object, an array, a single value (like a number), or any combination of the 3, and our function will perform without crashing.
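
As a quick illustration of that last point, here is a minimal sketch that runs removePair over an array of objects. The people variable and its contents are made up for this example, and the function is repeated so the script stands on its own:

%dw 2.0
output application/json

fun removePair(e, key) =
  e match {
    case is Array  -> e map removePair($, key)
    case is Object -> e mapObject (v, k) ->
                        if ((k as String) == key)
                          {}
                        else
                          {(k): removePair(v, key)}
    else           -> e
  }

// Hypothetical sample data: only the first object contains the key we remove
var people = [
  {name: "Joshua Erney", nickNames: {primary: "Jerney", secondary: "Big Ern"}},
  {name: "Jane Doe", nickNames: {primary: "JD"}}
]
---
removePair(people, "secondary")
// Returns:
// [
//   {name: "Joshua Erney", nickNames: {primary: "Jerney"}},
//   {name: "Jane Doe", nickNames: {primary: "JD"}}
// ]

The second object passes through unchanged, since the function only drops pairs whose key matches.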

Here's how we might tackle our initial problem, now. Functions like removePair are extremely general, so I like to wrap them in a very specific API that defines what they're supposed to do:

fun removePair(e, key) = ... // Same as above

fun removeSecondaryNickname(e) =
  removePair(e, "secondary")
  
var obj = ... // Same as above
---
removeSecondaryNickname(obj)

Shortcomings of this Function

While this function is much more versatile than what we originally developed, it's not completely generic and suffers from a few problems:

  1. If the key appears more than once in the data, and we don't want to remove ALL matches, this function will not accommodate that.
  2. We can only remove key:value pairs in objects, not single array elements.
  3. We can only match by directly comparing two strings.

While addressing 1 and 2 will require you to write additional functions to handle those cases, we can address 3 right away by making removePair a higher-order function. Instead of taking in a string that names the key to remove, it will take in a function that decides which keys to remove. That function should take in a single parameter, which is a key, and return true if the key:value pair associated with that key should be removed, and false otherwise. Functions that return true or false are called predicates, and that's how we will refer to them in the code:

fun removePair(e, predicate) =
  e match {
    case is Array  -> e map removePair($, predicate)
    case is Object -> e mapObject (v, k) ->
                        if (predicate(k))
                          {}
                        else
                          {(k): removePair(v, predicate)}
    else           -> e
  }

By doing this, we've opened up an infinite number of possible ways to determine what keys to remove. We could remove all keys that have more than 3 characters, for example:

fun removeKeysLongerThan3Chars(e) =
  removePair(
    e,
    ((k) -> sizeOf(k as String) > 3)
  )

And of course, we have our original functionality still intact just by passing removePair a different function:

fun removeSecondaryNickname(e) =
  removePair(
    e,
    ((k) -> (k as String) == "secondary")
  )
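
The same predicate idea could be turned toward the second shortcoming from the list above. What follows is only a rough sketch, not part of removePair: removeElements is a hypothetical name, and it assumes you want the predicate applied to every array element at every depth:

// Hypothetical companion to removePair: drops array elements instead of key:value pairs
fun removeElements(e, predicate) =
  e match {
    case is Array  -> (e filter ((item) -> !predicate(item)))
                        map ((item) -> removeElements(item, predicate))
    case is Object -> e mapObject (v, k) -> {(k): removeElements(v, predicate)}
    else           -> e
  }

// Example: strip nulls out of every array in the structure
// removeElements({scores: [1, null, 3], nested: [[null, 2]]}, (item) -> item == null)
// Returns: {scores: [1, 3], nested: [[2]]}

Note that this filters each array before recursing into the elements that remain, so the predicate sees whole elements (including nested arrays and objects), not just leaf values.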

Conclusion

Removing nested key:value pairs presents challenges to DataWeave programmers due to the language's full embrace of immutable data structures. While I've demonstrated that it is possible to create a relatively generic solution that works across all data structures, it has a pretty big limitation in that it cannot be used to remove elements in arrays. It also requires knowledge of recursion and pattern matching.

I'd love to hear your thoughts! Feel free to leave a comment below, or comment on social media.