Explore the hidden potential of Excel's GROUPBY and PIVOTBY functions by integrating custom lambdas for dynamic and flexible data aggregation.
Among the latest additions to the Excel family of dynamic array functions are two aggregation tools: GROUPBY and PIVOTBY. While useful in their basic form, they truly shine when paired with custom Lambda functions, enabling users to summarize data in ways beyond what predefined Eta lambdas offer. This guide will walk you through using these advanced features effectively.
What is Lambda function?
Before creating custom lambdas for PIVOTBY, GROUPBY, and other Excel aggregation functions, let's take a moment to understand the basic terms and concepts.
In simple terms, LAMBDA is an anonymous function that allows you to build your own functions using other functions. Think of it as a shortcut for creating a custom function for a specific task without needing to formally define and name it.
For instance, the below lambda adds 1 to a number x:
=LAMBDA(x, x + 1)
Lambda functions are useful because they let you encapsulate complex logic in a simple, reusable way. This can make your formulas more readable and your calculations more flexible.
Microsoft Excel has its own LAMBDA function, enabling users to create custom functions that can be called by user-friendly names and reused throughout a workbook.
What is Eta Lambda?
Eta lambda is a shorthand term for Eta-reduced lambda, also known as η-conversion or η-reduction.
The purpose of eta-reduction is to simplify a function by eliminating unnecessary abstractions when the function has no further operations to perform on its argument.
Simple example
Imagine you have two functions that do the same thing but look different. Eta reduction helps you determine that these functions are essentially the same.
Suppose you have a simple function 𝑓 that adds 1 to a number:
𝑓(x) = x + 1
And another function g that takes a number x and then applies 𝑓 to x:
g(x) = 𝑓(x)
Using eta conversion, you can see that g is really just 𝑓 in disguise because g does exactly what 𝑓 does. So, the equation can be simplified to:
g = 𝑓
In other words, Eta reduction shows that two functions are essentially the same as they give the same result for any input.
Eta-reduced lambdas in Excel
In Microsoft Excel, the term "eta lambda" generally refers to a simpler syntax of aggregation functions (like SUM, AVERAGE, COUNT, etc.) used within new dynamic array functions such as GROUPBY, PIVOTBY, BYROW, BYCOL, and other functions that accept lambdas.
For example, to group data in B2:B30 and sum the values in C2:C30, the full syntax would be:
=GROUPBY(B2:B30, C2:C30, LAMBDA(x, SUM(x)))
In this formula, x is just a dummy variable, or a placeholder, for the values in each row of C2:C30. The GROUPBY function iterates through each row in that range and applies the LAMBDA function, which in turn calls SUM.
While the full syntax works fine, it's unnecessarily complex for such a simple operation. With an eta lambda, you can remove the LAMBDA wrapper and call the SUM function directly by its name:
=GROUPBY(B2:B30, C2:C30, SUM)
This not only improves formula readability but makes it super easy and intuitive to use. In fact, you don't need to know anything about lambdas – simply select the required function from the list:
Note. The eta-reduced lambda syntax is only possible for functions that accept a single argument like SUM, MIN, MAX, COUNT, etc. It cannot be used for a function that requires two or more arguments.
Understanding how the inbuilt eta-lambdas operate within Excel aggregation functions, let's look at how you can construct your custom lambdas to perform calculations that extend beyond the capabilities of predefined eta-reduced lambdas.
Examples of using custom lambdas within Excel aggregation functions
These examples assume you are familiar with the syntax and basic uses of Excel's GROUPBY and PIVOTBY functions. If you're not, we recommend starting with the above-linked tutorials to build a solid foundation before diving into the advanced techniques discussed below.
Example 1: GROUPBY with lambda for custom aggregations
Using the dataset below, suppose you want to group data in column B by project type and aggregate the numbers in column C to show 10% of the revenue. This could be useful for calculating tax, bonuses, or other similar purposes.
To achieve this, you can use the following lambda function that calculates 10% of the sum of a given range. More precisely, the function takes a range x as input and returns the sum of the range multiplied by 10 percent:
LAMBDA(x, 10%*SUM(x))
Use this lambda for the function argument of GROUPBY:
=GROUPBY(B2:B30, C2:C30, LAMBDA(x, 10%*SUM(x)))
After entering the formula in, say E3, Excel will display the grouped data with a new column showing 10% of the revenue for each project type, exactly as needed for your analysis:
Example 2: GROUPBY with lambda to return strings with custom delimiter
Among the list of functions available with GROUPBY is a new ARRAYTOTEXT function that converts an array of values into text strings. By default, the values are separated with a comma and a space, as shown in this example that demonstrates the standard usage of the function.
However, if you wish to use a different delimiter, you can create your own Lambda function using TEXTJOIN and specify the desired character in its delimiter argument. For instance, to group data by the manager names in column A and return all projects associated with each manager as strings separated with a vertical bar surrounded by spaces on both sides, you would use the following formula:
=GROUPBY(A3:A30, B3:B30, LAMBDA(x, TEXTJOIN(" | ", TRUE, x)), 0, 0)
As a result, you will receive a neatly organized list where each manager's projects are grouped together and separated by the delimiter of your choosing, providing a clear and readable summary.
Additionally, you can arrange the joined values alphabetically, by nesting the SORT function within TEXTJOIN like this:
=GROUPBY(A3:A30, B3:B30, LAMBDA(x, TEXTJOIN(" | ", TRUE, SORT(x))), 0, 0)
By doing so, you ensure that the projects for each manager are not only grouped together but also sorted from A to Z, enhancing the clarity and organization of your data.
Example 3: GROUPBY with lambda to aggregate unique values
In case your dataset contains multiple occurrences of the text values you are aggregating – for example, certain projects generating revenue in more than one region – you can use the UNIQUE function within your lambda to list only unique projects for each manager. Optionally, you can also use the SORT function to arrange the returned values in ascending or descending order. Here's how you can do it:
=GROUPBY(B3:B30, C3:C30, LAMBDA(x, TEXTJOIN(" | ", TRUE, SORT(UNIQUE(x)))), 0, 0)
By using this formula, you will get each manager's projects presented in the form of a string, where values are separated by the specified delimiter, listed only once, and sorted alphabetically:
Example 4: PIVOTBY with custom lambda function
Using a custom lambda with the PIVOTBY function is similar to using one with GROUPBY.
Let's consider a dataset below with separate columns for regions (A), product names(C), and annual sales (D).
To get an average of annual sales for each product-region category, you can use a built-in eta lambda AVERAGE – this will work nicely as column C lists annual sales:
=PIVOTBY(C2:C30, A2:A30, D2:D30, AVERAGE)
In case you want the quarterly sales average for each group, you need to divide the annual sales figures by 4. This can be accomplished using a custom lambda function:
=PIVOTBY(C2:C30, A2:A30, D2:D30, LAMBDA(x, AVERAGE(x)/4))
Hopefully, these examples have given you insights into how powerful and flexible Excel's new dynamic array functions can be when combined with custom lambda functions. In a similar manner, you can aggregate data in various ways using other functions that accept lambdas, such as BYROW, BYCOL, etc.
Practice workbook for download
Custom lambda with GROUPBY and PIVOTBY - examples (.xlsx file)