You’ve no doubt seen the exciting news about the launch of the new Microsoft Sentinel data lake. This year, as part of my role as a Solutions Director at Quorum Cyber, I’ve been heavily involved in helping to develop and test the Sentinel data lake. And, on 1st October, Quorum Cyber announced that it’s a proud participant in the Microsoft Sentinel partner ecosystem. 

To help educate the cyber security community and get as many people as possible up to speed, I’ve also collaborated with Jon Shectman, Principal Program Manager for Security at Microsoft, on a series of blogs called A little slice of… Here’s a summary of our blogs about planning your data lake strategy.

Evaluating tables for Sentinel data lake

The first step in identifying good candidates for the data lake is to review the Table Management feature. Once you understand its capabilities, the next priority is to track and analyse your table usage. This helps determine the most appropriate storage tier, for example analytics or data lake.
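
Before reaching for workbooks, a quick Kusto Query Language (KQL) query against the built-in Usage table gives a rough picture of which tables drive most of your ingestion. This is a minimal sketch only; the Usage table reports Quantity in megabytes, so the result is converted to gigabytes:

  // Summarise billable ingestion per table over the last 30 days
  Usage
  | where TimeGenerated > ago(30d)
  | where IsBillable == true
  | summarize IngestedGB = round(sum(Quantity) / 1024, 2) by DataType
  | order by IngestedGB desc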

You have several tools within your Sentinel toolkit to support this process, including SOC Optimization.

Using SOC Optimization

SOC Optimization is a valuable resource for research and cost management. You can use the Microsoft Sentinel Optimization Workbook, available in the Content Hub, to start making data-driven decisions.

When evaluating long-term storage requirements, remember:

  • Retention needs may differ by table, particularly where compliance requirements dictate
  • Risk tolerance varies. Would you accept losing out-of-the-box detections if a table moves to the data lake? Are you comfortable relying on ad hoc detections or hunting?
  • Detection and MITRE coverage strategies should align with your security goals and, where relevant, with your managed security service provider’s (MSSP’s) recommendations
  • Budget constraints may require trade-offs between coverage and cost.

Example: AADNonInteractiveUserSignInLogs

This is often a prime candidate for cost optimisation due to its size. First, confirm whether it’s eligible for the data lake. In this case, the table supports both Total retention and Data Lake tiers, confirming its eligibility.
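
To put a figure on that size, you can query the table directly using the built-in _BilledSize column (reported in bytes). This is an illustrative sketch; adjust the time window to match your own review period:

  // Approximate the billable volume of this table over the last 30 days
  AADNonInteractiveUserSignInLogs
  | where TimeGenerated > ago(30d)
  | summarize BilledGB = round(sum(_BilledSize) / 1024 / 1024 / 1024, 2)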

However, moving it requires careful consideration. If analytic rules or hunting queries depend on this data, removing it from the Analytics tier may introduce risk. Detections and advanced hunting rules do not run in the data lake. While ad hoc queries are possible, assuming the same security coverage would be risky.

This is where stakeholder engagement becomes essential – review your plan with detection teams, SOC engineers, or other key decision makers before proceeding.

Identifying table dependencies

To research which tables are used in detections, you have several options:

  • The Advanced Hunting page includes a “Data Source” column, though it is often incomplete
  • The Log Source & Analytic Rules Coverage Workbook (available in the Content Hub) offers a more reliable view, though it isn’t always 100% complete.

This workbook allows you to:

  • Review which analytic rules reference a given table
  • Determine whether those detections are critical
  • Access and customize the underlying Kusto Query Language (KQL) queries for deeper investigation.

Also consider the Last Modified Date – a stale rule may no longer be relevant if coverage is provided elsewhere (for example, by an XDR tool). Importantly, this workbook includes both Microsoft-provided and custom analytics rules. Don’t forget to also review Advanced Hunting and saved searches within the same workbook.

Based on this data, you may conclude that AADNonInteractiveUserSignInLogs should remain in the Analytics tier.

A different case: AADManagedIdentitySignInLogs

By contrast, AADManagedIdentitySignInLogs has only a single analytic rule associated with it, offering limited detection value. The same query can easily be run in the data lake, allowing you to allocate budget to higher-value tables.

In this scenario, moving the table to the data lake makes sense, freeing resources for detections that truly matter.
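
As an illustration, here is the kind of ad hoc query you could run on demand once the table sits in the data lake. It is a sketch only; column names such as ServicePrincipalName, ResultType and ResourceDisplayName follow the standard Entra ID sign-in log schema, so validate them against your own data:

  // Ad hoc hunt: summarise failed managed identity sign-ins over the last 7 days
  AADManagedIdentitySignInLogs
  | where TimeGenerated > ago(7d)
  | where ResultType != "0"
  | summarize Failures = count() by ServicePrincipalName, ResultType, ResourceDisplayName
  | order by Failures desc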

On a final note, it’s important to remember that table placement is a balance of compliance, risk tolerance, detection coverage, and budget. Always validate dependencies before moving data to the lake.

Costs

At the time of writing, the Azure Pricing Calculator is still showing preview pricing, but we expect it to be updated shortly to reflect the final figures. It’s worth looking at this preliminary pricing to see why the data lake should matter to you. Head over to Pricing Calculator | Microsoft Azure and select “Microsoft Sentinel”.

Using UK South and GBP (£), one gigabyte of storage in the Analytics (Logs) tier costs £120.43 per month versus £1.40 in the data lake (preview pricing, though GA pricing is expected to be similar). This makes the data lake compelling for many.

Let’s also factor in query costs in the data lake, taken from the same calculator. Assuming you run 30 queries per month and each query scans about a gigabyte of data, you’re looking at about £140 per month.

But what if you run a very active SOC and you’re worried about analysts running queries that rack up the bill? There are two new cost management features at your disposal (note that you need the Billing and Security Admin roles to even see these features).

1. Usage-based alerts: You can set usage-based alerts on specific meters to monitor and manage costs.

2. In-product reports: You can use in-product reports to gain insights into usage trends over time, helping you identify cost drivers and optimize your data retention and processing strategies.

In the final blog of this series, I explain how to save money with Sentinel data lake.
