Why does the DAX syntax require the original name for a filtered table?
0
votes
1
answer
38
views
I am learning DAX and reading Russo & Ferrari's Definitive Guide to Dax (2nd ed).
I understand how
SUM ( Sales[Quantity] )
is really SUMX ( Sales, Sales[Quantity] )
, which iterates over the rows of the Sales
table and sums the values in the Quantity
column. If I'm not mistaken, even though the book says it's against the standard formatting (so its clear the column is not a measure), in SUMX
you could remove the table name Sales
from the argument Sales[Quantity]
, since SUMX
already knows to look in Sales
.
I took some issue with the syntax of how SUMX
(and other iterators) is combined with FILTER
, based around the table name being used in the column reference.
On page 58 an example code is given:
Sales Amount Multiple Items :=
VAR
MultipleItemSales = FILTER ( Sales, Sales[Quantity] > 1 )
RETURN
SUMX (
MultipleItemSales,
Sales[Quantity] * Sales[Unit Price]
)
My initial understanding was that MultipleItemSales
is an entirely new object in memory, which is a filtered version of the Sales
table - therefore I thought the second argument should be MultipleItemSales[Quantity] * MultipleItemSales[Unit Price]
, since SUMX
is iterating over MultipleItemSales
and not Sales
. But apparently this would give an error.
I think the question becomes *what is DAX doing internally to make this syntax valid?*
Let me give another example:
VAR
MultipleItemSales = FILTER ( Sales, Sales[Quantity] > 1 )
VAR
SingleItemSales = FILTER ( Sales, Sales[Quantity] = 1 )
VAR
SalesAmountMultipleItems =
SUMX (
MultipleItemSales,
Sales[Quantity] * Sales[Unit Price]
)
VAR
SalesAmountAllItems =
SUMX (
Sales,
Sales[Quantity] * Sales[Unit Price]
)
VAR
SalesAmountSingleItems =
SUMX (
SingleItemSales,
Sales[Quantity] * Sales[Unit Price]
)
VAR
AvgSalesMultipleItems =
AVERAGEX (
MultipleItemSales,
Sales[Quantity] * Sales[Unit Price]
)
RETURN
....
In each of the SUMX
functions, a different version of the table (filtered or unfiltered) is being iterated, yet Sales
is always used as the table name for the columns. So, how does it know which version of Sales
to use?
I then thought, rather than creating an entirely new object in memory, FILTER
might actually filter the object which Sales
points to in-place. So in SalesAmountMultipleItems
, the first argument to SUMX
tells the system to apply some filtering to Sales
, before the looking down the Sales[Quantity]
and Sales[Unit Price]
columns. Then, when escaping the SUMX
function scope, whatever filtering done to Sales
gets undone, allowing SalesAmountAllItems
to access the unfiltered version Sales
.
But then I learned about lazy evaluation. In my previous argument, when evaluating AvgSalesMultipleItems
, the filtering applied by MultipleItemSales
would have to be applied a second time. However, to save CPU and memory, it's my understanding that the filtering wouldn't need to happen a second time if it's already been carried out once.
So, my current understanding is more like the follwing:
FILTER
does indeed create a new object in memory - so MultipleItemSales
is a filtered version of Sales
which can be referenced multiple times but need only be created once. But then to allow the continued use of the name Sales
, the effect of the first argument of SUMX
being MultipleItemSales
must be to **temporarily change which object in memory the name Sales
is pointing at**. At the end of the scope, the name Sales
reverts back to pointing to the original, unfiltered version of the table, allowing SalesAmountAllItems
to access it.
Is this the correct understanding?
Asked by jonnybolton16
(101 rep)
Jan 15, 2025, 09:57 AM
Last activity: Mar 2, 2025, 01:04 PM
Last activity: Mar 2, 2025, 01:04 PM