In the penultimate blog in this sequence, I wrote about how to plan with Sentinel data lake. In my final blog I’ll explain how you can save money, too. This entire set of blogs is based on a series called A little slice of… published by Jon Shectman, Principal Program Manager for Security at Microsoft.

I’ll run you through an example that could save 64% of your data ingestion costs for just one table. For this, let’s use the AADNonInteractiveSignInLogs table.

This process was essentially built into the product in the following SOC Optimization recommendation:

So, what if you want to save money on conditional access evaluation logs, while retaining the data for future use? Now you can! In this example, I’ve put the verbose columns into the data lake, while retaining the others in analytics.

You can do this in three steps:

Step 1: Determine Potential Cost Savings (Build on SOC Optimization)

In the first step you’ll drop what you can from the high-cost tier, but won’t lose what you might need.

AADNonInteractiveSignInLogs is an excellent place to start. Use this process whenever you can.

To get started, let’s work out exactly how the table’s cost breaks down between the verbose columns and the rest. Here’s a query that does it:
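A minimal sketch of such a query, assuming the verbose data sits in the ConditionalAccessPolicies column, can use the built-in _BilledSize column together with the estimate_data_size() function:

```kusto
// Estimate how much of the table's billed volume the verbose column accounts for
AADNonInteractiveSignInLogs
| where TimeGenerated > ago(30d)
| summarize TotalGB   = sum(_BilledSize) / exp10(9),
            VerboseGB = sum(estimate_data_size(ConditionalAccessPolicies)) / exp10(9)
| extend VerbosePct = round(100.0 * VerboseGB / TotalGB, 1)
```

Note that estimate_data_size() is an approximation of in-memory size rather than billed size, so treat the percentage as indicative.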

Which gives us:

We see that the verbose, less valuable information comprises around 13 GB, or a rather significant 64% of our total cost for this table. Using the Azure pricing calculator with East US as our example, here’s the monthly breakdown of keeping those columns in the analytics tier versus the data lake tier:

You can see the good news immediately. You’d get huge savings – $1,677 versus $19.50. That’s a 98.84% saving month over month.
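As a quick arithmetic check on that percentage, here’s a sketch using the pricing-calculator figures quoted above:

```python
# Monthly cost of the verbose columns in each tier, taken from the
# Azure pricing calculator example above (East US).
analytics_monthly = 1677.00  # analytics tier
data_lake_monthly = 19.50    # data lake tier

saving = analytics_monthly - data_lake_monthly
saving_pct = 100 * saving / analytics_monthly
print(f"${saving:,.2f} saved per month ({saving_pct:.2f}%)")
# → $1,657.50 saved per month (98.84%)
```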

Here are a few important points to note:

  • You need to check with your teams before moving any data (tables or columns). Threat hunters, SOC analysts, detection engineers, SOC engineers, IT, compliance, third-party vendors and other teams might be using this data in one way or another.
  • Table management in Sentinel only allows you to move this entire table, not columns. To move columns, you’ll have to use a workspace data collection rule (DCR).

Step 2: Use Data Lake for Retention

This is well documented in Sample data collection rules (DCRs) in Azure Monitor. Essentially, you’ll create multiple data flows in the DCR with some custom JSON, outputting to each table as shown in the diagram we’ve put together.

To create your DCR as illustrated, you’ll use the following JSON, which we’ll step you through. The top part defines what stays in the analytics tier. The bottom part pushes the verbose columns to the data lake.
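As a sketch, the dataFlows section of such a DCR could look like this (the destination name LAWorkspace is illustrative, and the verbose data is assumed to sit in the ConditionalAccessPolicies column):

```json
"dataFlows": [
  {
    "streams": [ "Microsoft-Table-AADNonInteractiveSignInLogs" ],
    "destinations": [ "LAWorkspace" ],
    "transformKql": "source | project-away ConditionalAccessPolicies"
  },
  {
    "streams": [ "Microsoft-Table-AADNonInteractiveSignInLogs" ],
    "destinations": [ "LAWorkspace" ],
    "transformKql": "source | project-keep TimeGenerated, ConditionalAccessPolicies",
    "outputStream": "Custom-newAccessPolicies_CL"
  }
]
```

The first flow trims the original table; the second routes only the verbose column (plus TimeGenerated) into the custom table.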

In the top box, the key portion is the “project-away” operator. You’re keeping the entire table in analytics except for the projected-away piece – the verbose conditional access policies.

In the bottom box, the key portion is “project-keep”. Here, you’re declaring a new custom table called newAccessPolicies_CL and populating it with the conditional access policies from the original table. This is documented in Add or delete tables and columns in Azure Monitor Logs.

Please bear in mind that it can take a few minutes for the custom table to show up, and a bit longer for the DCR. Also, the “Custom-” prefix of the table name is automatically dropped.

Once everything finishes, navigate to the UI to confirm that both tables exist. At this point both still sit in the analytics tier (because we haven’t moved the custom one yet), so don’t stop now or you won’t realise any savings from your work.

Let’s confirm that both tables exist, and are populated:

We could summarise count() by Type or something similar:
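A sketch of such a check, using the custom table created above:

```kusto
// Confirm both tables are receiving rows
union AADNonInteractiveSignInLogs, newAccessPolicies_CL
| where TimeGenerated > ago(1h)
| summarize Rows = count() by Type
```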

You might want to dig in further to ensure the new table holds the desired columns.
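One way to do that is with getschema, which lists the columns the custom table ended up with:

```kusto
newAccessPolicies_CL
| getschema
| project ColumnName, ColumnType
```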

Step 3: Validate & Monitor

Now, it’s just a matter of firing up the Table management GUI and moving the new custom table to the data lake.

Move newAccessPolicies_CL to the data lake tier.

To summarise, this example shows how to save 64% on one table while continuing to retain queryable data. Hopefully, you can see that this is a repeatable process that you can apply throughout your environment.

N.B.: At the time of writing, Microsoft are refining the prices mentioned (and the meters they use). The process above is correct, but the savings and the data lake results in the pricing calculator are subject to change.

