I have received a couple of questions about Aggregations in Power BI and whether they can be used to cover time series calculations such as Year to Date, Quarter to Date, and Month To Date. The answer is yes. Since time series calculations break down into calculations over a series of days, an aggregation table defined at day level with the basic summarization methods (min, max, sum, count) and the right relationship with a Date dimension, can answer Year to Date, Quarter to Date, and Month To Date calculations.
Let’s take a quick look at one such calcualtion and how it can be covered with an aggration. I am going to use the same version of AdventureWorks sample database and Power BI model that I used in my previous blog post on aggregations, with a few changes.
My goal is to cover the following two queries that both return Year to Date Sales Amount by Date so that they can be answered from an Aggregation table vs. the main Sales table:
The first query uses the DATESYTD function to generate a set of dates to sum Sales Amount over.
EVALUATE SUMMARIZECOLUMNS ( 'Date'[Date], "Sales Amount", CALCULATE(SUM ( 'Internet Sales'[Sales Amount] ) ), "YTD Sales Amount", CALCULATE(SUM ( 'Internet Sales'[Sales Amount] ), ALL('Date'), DATESYTD('Date'[Date]) ) ) ORDER BY 'Date'[Date]
The above query can also be written using the TotalYTD function instead and returns the same results.
EVALUATE SUMMARIZECOLUMNS ( 'Date'[Date], "Sales Amount", CALCULATE(SUM ( 'Internet Sales'[Sales Amount] ) ), "YTD Sales Amount", TOTALYTD(SUM ( 'Internet Sales'[Sales Amount] ), 'Date'[Date]) ) ORDER BY 'Date'[Date]
If you were to graph these results, you would see something like this:
Creating the Aggregation table
I created an aggregation table at Day level in AdventueWorks database using the following SQL code:
SELECT fs.DateKey, SUM([Sales Amount]) AS [Sales Amount] INTO dbo.Sales_Agg4_By_Day FROM [dbo].[vFactInternetSales] fs GROUP By fs.DateKey
This table has 1,024 rows which is much smaller than the original sales table with 60,398 rows.
Power BI Model
I imported this table into the model and connected it to the Date table on DateKey. Notice how in the model below, this relationship is “1 to many” with the aggregation table on the many side. I will explain this in more details in the next section.
Relationship between the Sales_Agg4_By_Day and Date tables
When I first connected the Sales_Agg4_By_Day and Date tables, Power BI created a 1-to-1 relationship between the two tables with a bi-directional cross filter. I had to change this before I could configure the aggregation table, otherwise the interface did not allow it and gave a series of warning messages, including:
An aggregation table cannot have a bidirectional filter.
An aggregation table cannot be referenced by a one to one relationship.
Table storage modes
I explained table storage modes in detail in my previous blog post on Aggregations. Here everything is the same as it was in that post with the exception of the Date table:
- The Internet Sales and Sales Territory tables are in Direct Query storage mode.
- The Product tables are in Dual storage mode.
- The Sales_Agg4_By_Day table has Import storage mode.
The Date dimension table is in Dual storage mode which is a change from my previous post where it was left in Direct Query. Here, its storage mode must be set to Dual; otherwise, even after the Sales_Agg4_By_Day table is configured as an aggregation, it won’t be used to cover queries. This is because the relationship between the Sales_Agg4_By_Day and Date tables is considered to be a weak relationships:
One side in Import, the other side in Direct Query.
It is interesting that in this case, Power BI won’t stop you from configuring the aggregation, it just won’t be used. If you use DAX Studio or another tool to check the usage of the aggregation, you will see a failed attempt at using the aggregation table and the only message you get is that no mapping was found for the DateKey column even though you have a relationship in place. Even though there is no mention of a weak relationship in the reason for the miss, the weak relationship is the main culprit here.
It is important to remember:
Agggreation hits based on relationships are only considered when the tables involved in the query have the same storage mode or one side is at least set to Dual where it can change its nature on demand.
Configuring the Sales_Agg4_By_Day table
Configuring the Sales_Agg4_By_Day table is easy. The following picture shows the details:
Since there is already a relationship between the Date and Sales_Agg4_By_Day tables, there is not need to fill out the row for DateKey in the aggregation properties window.
Profiling the sample queries
I used DAX Studio to connect to the Power BI model. I ran the sample queries after configuring the aggregation with the Server Timings option selected so that I can see the details of how the query was answered. (Alternatively, you could use SQL Server Profiler/Extended Events and run a trace against your PowerBI model to monitor the same event but that is a lot more work.)
Both queries were able to use the aggregation table instead of the main sales table. The following shows the details of the “Aggregate Rewrite Attempt” event. It shows that Power BI was smart enough to use the aggregation after it broke down the TotalYTD and DatesYTD functions into a series of dates.
Time series functions that break down to a series of dates, can be covered by an aggregation table defined at day level.