Tips on Debugging DataWeave Code

Is the Mule debugger just not cutting it for your DataWeave code? Try some of these alternatives.

DataWeave code can be difficult to debug for those who are new to the language, and especially for those who are also new to the functional programming paradigm. You probably know by now that you cannot use the Mule debugger to step into DW code and run it line by line, which makes the problem even worse. Luckily, a few properties inherent in functional languages like DW make them easier to debug and reason about, and we can use those properties to break up our DW scripts so they are easier to debug and understand. Lastly, the DW language comes with some utilities that make the lack of a debugger not such a big deal.

Refactoring using %var and %function

Most DW code that I read uses none of the organizational features of the language. Just two of these features, %var and %function, are powerful tools for refactoring your code so that it is less repetitive, more easily understood by others, and easier to debug. Here's a piece of code that I see quite often:

%dw 1.0
%output application/java
---
payload map {
	firstName: $.firstName unless $.firstName == null otherwise "",
	lastName:  $.lastName  unless $.lastName  == null otherwise ""
}

We have a repetitive pattern here: <field> unless <field> == null otherwise "". If we ignore the fact that we can get this functionality with the default operator (i.e. <field> default ""), we can refactor this repetitive code into a function so we can DRY up the code and get consistent behavior whenever we want to do a null check and default a value. We also get the advantage of naming the behavior (i.e. naming the function) so that its intent is more obvious:

%dw 1.0
%output application/java
%function defaultIfNull(field, defaultValue) field unless field == null otherwise defaultValue
---
payload map {
	firstName: defaultIfNull($.firstName, ""),
	lastName:  defaultIfNull($.lastName,  "")
}

There are two good uses I find for %vars in DW code. The first is to give meaning to things that don't currently have a descriptive name. For example, in the DW code above we have payload. When I see code like this I ask myself: payload of what? What does the payload represent? This is probably more of a personal preference, but I find that, like using %function to name behavior, giving meaning to values that don't currently have a descriptive name makes the intent of the code more obvious. I'd write the above code like this instead:

%dw 1.0
%output application/java
%var people = payload
%function defaultIfNull(field, defaultValue) field unless field == null otherwise defaultValue
---
people map {
	firstName: defaultIfNull($.firstName, ""),
	lastName:  defaultIfNull($.lastName,  "")
}
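
As an aside, the default operator mentioned earlier covers this particular null check without a helper function. A minimal sketch of the same mapping, reusing the people variable from the script above:

%dw 1.0
%output application/java
%var people = payload
---
people map {
	// default substitutes "" only when the field is null
	firstName: $.firstName default "",
	lastName:  $.lastName  default ""
}

A named function like defaultIfNull still earns its keep once the check grows beyond what default can express.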

Taking Advantage of Referential Transparency

The second use is debugging: %vars are great for storing intermediate values during longer computations. For example, let's say we have the following DW script:

%dw 1.0
%output application/java
---
(flatten (payload.products map $.availabilities)) map $.region filter $.id == 1

It can be a bit hard to understand what's going on here, and it's going to be even more difficult to determine how small changes in the code will affect the output. We can use %vars to break up this long chain of expressions into smaller expressions. We can then use the debugger or a logger after the DW transformer to view what these expressions return.

%dw 1.0
%output application/java
%var availabilities = flatten (payload.products map $.availabilities)
%var regions        = availabilities map $.region
%var wantedRegions  = regions filter $.id == 1
---
{
	// Test if availabilities returns what we expected
	availabilities: availabilities,
	
	// Test if regions returns what we expected
	regions: regions,

	// Test if wantedRegions returns what we expected
	wantedRegions: wantedRegions
}

With the code this way, it is much easier to determine what part of the code might be incorrect, as each piece is isolated and named. If you're paying close attention, or are familiar with other functional languages, you may have noticed that the expressions (flatten (payload.products map $.availabilities)) map $.region and availabilities map $.region return the same thing when availabilities in the latter expression is set to the appropriate value (i.e. what is returned from the expression flatten (payload.products map $.availabilities)). With functional languages like DW, you're free to make these substitutions because all DW expressions are referentially transparent. All of the repercussions of this are outside the scope of this article, but for now it is enough to know that "an expression is said to be referentially transparent if it can be replaced with its corresponding value without changing the program's behavior" (according to Wikipedia at the time of writing of this article). Use this principle to your advantage when debugging and writing DW code.
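
To see the substitution in action, here is a small self-contained sketch. The samplePayload variable and its contents are hypothetical data invented for this illustration, standing in for the real payload. The output format is switched to JSON only so that the result is easy to read.

%dw 1.0
%output application/json
%var samplePayload = {
	products: [
		{ availabilities: [ { region: { id: 1, name: "US" } }, { region: { id: 2, name: "EU" } } ] },
		{ availabilities: [ { region: { id: 1, name: "US" } } ] }
	]
}
%var availabilities = flatten (samplePayload.products map $.availabilities)
---
{
	// samplePayload is made-up data for illustration only

	// The full expression, written inline
	inlined: (flatten (samplePayload.products map $.availabilities)) map $.region,

	// The same expression with the named intermediate value substituted in
	named: availabilities map $.region
}

Both keys should hold the same list of three regions (ids 1, 2, and 1), because replacing the inline sub-expression with its named value does not change what the script computes.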

Using log to Trace Execution

DW has a function called log (see the DataWeave documentation) that takes a string and a value or expression as input. It logs the string to the console with the value or the result of the expression appended, and returns that value or result. Here's a simple example:

%dw 1.0
%output application/java
---
log("1 + 2 = ", 1 + 2)

This script will set the payload of the message to 3 and will log the prefix followed by the value to the console, along the lines of 1 + 2 =  - 3 (log separates the two with a dash, as the console output later in this article shows). In the last example of the previous section, we could have used log instead of setting the payload to determine if the values were correct. Here's a refactored example:

%dw 1.0
%output application/java
%var availabilities = log("availabilities is: ", flatten (payload.products map $.availabilities))
%var regions        = log("regions is: ",        availabilities map $.region)
%var wantedRegions  = log("wantedRegions is: ", regions filter $.id == 1)
---
wantedRegions

This logs each intermediate value to the console when the script executes.
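
Because log returns the value it is given, it can also wrap an expression in place without introducing a variable. A small sketch, reusing the people mapping from the first section (the log prefixes are arbitrary labels chosen for this example):

%dw 1.0
%output application/java
%var people = payload
---
people map {
	// log returns its second argument, so each field gets the same value as before
	firstName: log("firstName is: ", $.firstName) default "",
	lastName:  log("lastName is: ",  $.lastName)  default ""
}

The mapping result is the same with or without the log calls; only the console output changes.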

In the above example, the use of log was pretty much optional. You could have gone with our initial example, or the one refactored for log, and been happy either way. But this isn't always the case. You've probably noticed that DW doesn't have imperative looping constructs like Java's for, enhanced for, and while loops. Looping in functional languages is typically done with recursion, and DW is no exception. Let's say you need a function that takes in a character sequence and repeats it n times. One way to solve this is with recursion. Here's a working solution:

%dw 1.0
%output application/java

%function repeatChar(char, n, acc=[])
  acc joinBy ""
    when ((sizeOf acc) == n)
    otherwise repeatChar(char, n, acc + char)
---
repeatChar("*", 3)

This script returns "***". When I was developing this solution, I had a very difficult time getting the recursion to work the way I wanted it to. While I was modifying the code, I would sometimes get back null and other results that didn't point to an obvious issue. Short of tracing the solution on paper, which I really didn't want to do, I didn't see a lot of options for debugging my code. After all, DW is not imperative: you can't just throw a System.out.println() call into the middle of the function like you can in Java. It turns out that log was the solution to this problem. Here's the refactored recursion example, using log to output intermediate values as it builds up the final array.

%dw 1.0
%output application/java

%function repeatChar(char, n, acc=[])
  acc joinBy ""
    when ((sizeOf acc) == n)
    otherwise repeatChar(
                log("char is ",    char), 
                log("n is ",       n), 
                log("new acc is ", acc + char))
---
repeatChar("*", 3)

The console output looked like this, which let me easily see where things were going wrong in the recursive loop:

char is  - "*"
n is - 3
new acc is - [
  "*"
]
char is  - "*"
n is  - 3
new acc is - [
  "*",
  "*"
]
char is - "*"
n is  - 3
new acc is - [
  "*",
  "*",
  "*"
]

I hope this article gave you some good insights into how you can organize your DataWeave, take advantage of referential transparency, and use log to make debugging DataWeave code a lot easier.