Sample Header Ad - 728x90

Why does the DAX syntax require the original name for a filtered table?

0 votes
1 answer
38 views
I am learning DAX and reading Russo & Ferrari's Definitive Guide to Dax (2nd ed). I understand how SUM ( Sales[Quantity] ) is really SUMX ( Sales, Sales[Quantity] ), which iterates over the rows of the Sales table and sums the values in the Quantity column. If I'm not mistaken, even though the book says it's against the standard formatting (so its clear the column is not a measure), in SUMX you could remove the table name Sales from the argument Sales[Quantity], since SUMX already knows to look in Sales. I took some issue with the syntax of how SUMX (and other iterators) is combined with FILTER, based around the table name being used in the column reference. On page 58 an example code is given:
Sales Amount Multiple Items :=
VAR
    MultipleItemSales = FILTER ( Sales, Sales[Quantity] > 1 )
RETURN
    SUMX (
        MultipleItemSales,
        Sales[Quantity] * Sales[Unit Price]
    )
My initial understanding was that MultipleItemSales is an entirely new object in memory, which is a filtered version of the Sales table - therefore I thought the second argument should be MultipleItemSales[Quantity] * MultipleItemSales[Unit Price], since SUMX is iterating over MultipleItemSales and not Sales. But apparently this would give an error. I think the question becomes *what is DAX doing internally to make this syntax valid?* Let me give another example:
VAR
    MultipleItemSales = FILTER ( Sales, Sales[Quantity] > 1 )
VAR
    SingleItemSales = FILTER ( Sales, Sales[Quantity] = 1 )
VAR
    SalesAmountMultipleItems =
        SUMX (
            MultipleItemSales,
            Sales[Quantity] * Sales[Unit Price]
        )
VAR
    SalesAmountAllItems =
        SUMX (
            Sales,
            Sales[Quantity] * Sales[Unit Price]
        )
VAR
    SalesAmountSingleItems =
        SUMX (
            SingleItemSales,
            Sales[Quantity] * Sales[Unit Price]
        )
VAR
    AvgSalesMultipleItems =
        AVERAGEX (
            MultipleItemSales,
            Sales[Quantity] * Sales[Unit Price]
        )
RETURN
    ....
In each of the SUMX functions, a different version of the table (filtered or unfiltered) is being iterated, yet Sales is always used as the table name for the columns. So, how does it know which version of Sales to use? I then thought, rather than creating an entirely new object in memory, FILTER might actually filter the object which Sales points to in-place. So in SalesAmountMultipleItems, the first argument to SUMX tells the system to apply some filtering to Sales, before the looking down the Sales[Quantity] and Sales[Unit Price] columns. Then, when escaping the SUMX function scope, whatever filtering done to Sales gets undone, allowing SalesAmountAllItems to access the unfiltered version Sales. But then I learned about lazy evaluation. In my previous argument, when evaluating AvgSalesMultipleItems, the filtering applied by MultipleItemSales would have to be applied a second time. However, to save CPU and memory, it's my understanding that the filtering wouldn't need to happen a second time if it's already been carried out once. So, my current understanding is more like the follwing: FILTER does indeed create a new object in memory - so MultipleItemSales is a filtered version of Sales which can be referenced multiple times but need only be created once. But then to allow the continued use of the name Sales, the effect of the first argument of SUMX being MultipleItemSales must be to **temporarily change which object in memory the name Sales is pointing at**. At the end of the scope, the name Sales reverts back to pointing to the original, unfiltered version of the table, allowing SalesAmountAllItems to access it. Is this the correct understanding?
Asked by jonnybolton16 (101 rep)
Jan 15, 2025, 09:57 AM
Last activity: Mar 2, 2025, 01:04 PM