Archive Microsoft 365 Defender logs with Sentinel Archive and the new custom logs ingestion API [part 2/2]
Update
I recently wrote an article about storing Microsoft 365 Defender data in Azure Data Explorer (ADX). Because of the versatility of ADX, you might want to reconsider using Sentinel Basic logs. I’ve built a fully automated solution to set up the ADX environment, called ‘ArchiveR’ 🤖, and here you can read all about it.
Welcome back!
I recently wrote an article about Microsoft Sentinel Basic and Archive logs and the new custom log ingestion API with Data Collection Endpoints. That was the first part in a series of two.
In this second part we’re going to dive deeper into leveraging the new custom log ingestion API to archive Microsoft 365 Defender logs. This way we can make use of Basic logs and keep the costs as low as possible.
- Part 1 | Introduction to new log tiers
  - Basic and Archive log tier details
  - New custom log ingestion method with Data Collection Endpoints
- Part 2 | Archiving Microsoft 365 Defender logs [📍 you are here]
  - Downsides and limitations of the integrated M365 data connector
  - Use Logic App to ingest 365 Defender data as custom logs into Basic table
If you’re not already familiar with the concepts described in the first part, you might want to read part #1 first and then return here afterwards.
Microsoft 365 Defender data connector
In the previous part I introduced the idea of archiving 365 Defender logs to Sentinel, why you would want to do this and how you can achieve it.
Microsoft Sentinel now comes with a fully functional data connector for Microsoft 365 Defender (it now supports ingestion for all underlying products). This is by far the easiest and shortest way of collecting and storing your logs long-term. You can store your data for up to two years in Log Analytics and are able to extend this to up to seven years with Archive logs.
But as pointed out earlier, we’re going to deviate from this path and show a different approach, for when you don’t need all the features that come with the native data connector (e.g. analytics rule correlation) and want to optimize the costs that come with ingesting data.
Preparations
To get you started quickly, I’ve prepared an ARM template deployment which will deploy all necessary Azure resources. And it’s fully automatic! No manual deployments needed at all. Everything will be set up and configured to get started right away. Try it out in your demo/lab environment and take a closer look at how all the intricate details work together as a whole. If you’re satisfied, you can always re-deploy it into production, with or without any adjustments you see fit.
You can also visit my GitHub repository and download the template yourself.
Logic App walkthrough
Before we continue with a breakdown of the deployed solution, please make sure the streaming API within Microsoft 365 Defender is configured to stream all events to the newly created storage account.
Navigate to https://security.microsoft.com → Settings → Microsoft 365 Defender → Streaming API → Add
This might take some time and in the meantime we can check out how everything works.
First we need to declare quite a few variables:
Some of these variables are purely here because of the automated deployment. It’s much easier to put an ARM template parameter or variable into a Logic App variable than to cram it in between the code somewhere.
- `storageAccountName` contains the name of the storage account where all the blob containers reside which are going to be filled with data coming from Microsoft 365 Defender. The deployed storage account comes standard with a 7-day retention policy, so it will clean up old data for you.
- Variables `year`, `month` and `day` are going to represent the blob container structure for each event type and are based on the current date/time minus one day, to make sure the Logic App will retrieve yesterday’s logs.
- Both `dceUri` (Data Collection Endpoint ingestion URI) and `dcrPrefix` (Data Collection Rule prefix) are here to simplify the automated deployment so that the values can be dynamically updated.
- Then there’s the `customTableSuffix`, which I’ve set to `_archive_CL`, but you might want to choose a different one for your tables.
- The `dataCollectionRules` variable is an interesting one because it contains all of the Data Collection Rule names that were deployed, as well as the related immutableIds for each respective DCR. Again, dynamically filled in during deployment because the immutableIds are globally unique. And although it might look like an array, its datatype is still a string. The next step will parse the list so that we can select an immutableId from it later (a sketch of this structure follows below).
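To make this a bit more concrete, here is a minimal sketch of what that string value might look like and how it gets parsed. The DCR names, field names and immutableIds below are purely hypothetical placeholders, not the values the deployment generates.

```python
import json

# Hypothetical contents of the 'dataCollectionRules' Logic App variable: it looks
# like an array, but its datatype is a plain string until the next step parses it.
data_collection_rules = '''[
  {"dcrName": "dcr-archive-alertinfo",  "immutableId": "dcr-00000000000000000000000000000000"},
  {"dcrName": "dcr-archive-deviceinfo", "immutableId": "dcr-11111111111111111111111111111111"}
]'''

parsed = json.loads(data_collection_rules)  # equivalent of the 'Parse JSON' step
print(parsed[0]["immutableId"])
```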
Data Collection Rules
- The DCR will determine where the data will be stored once you send logs to its immutableId, which in turn is part of the URI you’ll post data to. More on this later…
- It is linked to both a table and a Data Collection Endpoint, as demonstrated by Microsoft in their tutorial. I learned the hard way that you can’t go on creating table after table and use the same DCR for all of them. There’s a limit of 10 tables (streams) within a DCR, so I figured I might as well create a separate DCR for every table.
- It also contains the transformation query used to transform your logs on-the-fly. In our particular instance the transformation query is the same for all tables and DCRs, and quite simple. The log definitions are already fine and for this purpose we also want to keep them intact and original. The only thing that’s missing is a `TimeGenerated` column, which is mandatory. Data coming from 365 Defender does contain a `Timestamp` column, which I could’ve used, but I’ve opted for `TimeGenerated` to be the ingestion time in the workspace:
```kusto
source
| extend TimeGenerated = now()
```
Tables
As of right now, my lab environment only has 17 different tables in Microsoft 365 Defender, and therefore I also have an equivalent number of blob containers and Data Collection Rules.
These tables and DCRs are already part of the automated deployment you’ve used. But it might very well be possible that Microsoft will add more tables to the product as time goes by. Additional tables and Data Collection Rules are then required in Azure as well.
1. Go into the Log Analytics workspace → Tables → Create → DCR-based.
   For the table name you can look at the name of the blob container and use the string after the last hyphen, followed by `_archive_CL`. For example: the blob container for `alertinfo_archive_CL` is named “insights-logs-advancedhunting-alertinfo”.
2. Create a new Data Collection Rule and select the already existing Data Collection Endpoint from the list.
3. When asked to upload a sample log file, you can retrieve a blob from the blob container on your storage account.
   The storage account, which was part of the automated deployment, comes with network ACLs, so you have to temporarily whitelist your IP address to get access.
   Do note that in order to keep the original schema of the table intact, you need to alter the blob a bit before it can serve as a proper sample file. This process is automated within the Logic App, but needs to be performed manually here once to create a proper `sample.json` file (see the sketch after this list).
4. Find the Data Collection Rule you created in step 2 and note down its name and immutableId. You’ll need to add these to the “Initialize ‘dataCollectionRules’ variable” step in the Logic App, as pointed out above.
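For reference, here is a small Python sketch of both conventions described above: deriving the custom table name from a blob container name, and trimming one line of a PT1H.json blob down to its properties payload so it can serve as a sample file. The function names are mine, and the exact shape of your blobs may differ slightly.

```python
import json

def table_name(container_name: str, suffix: str = "_archive_CL") -> str:
    # "insights-logs-advancedhunting-alertinfo" -> "alertinfo_archive_CL"
    return container_name.rsplit("-", 1)[-1] + suffix

def make_sample(pt1h_line: str) -> str:
    # Keep only the 'properties' payload so the table schema stays identical to the
    # original advanced hunting schema; the mandatory TimeGenerated column is added
    # later by the DCR transformation (source | extend TimeGenerated = now()).
    record = json.loads(pt1h_line)["properties"]
    return json.dumps([record], indent=2)

print(table_name("insights-logs-advancedhunting-alertinfo"))  # alertinfo_archive_CL
```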
Retrieving blobs
So, now we can start connecting to the storage account and retrieve all blob container names from the root folder. Next, the Logic App will cycle through each of the blob containers.
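As an illustration of this step outside the Logic App, here is a minimal sketch using the Azure SDK for Python; the storage account name is a placeholder, and the container name filter is an assumption based on the naming shown earlier.

```python
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

# List the blob containers the Defender streaming API writes into.
service = BlobServiceClient(
    account_url="https://<storageAccountName>.blob.core.windows.net",
    credential=DefaultAzureCredential(),
)
for container in service.list_containers():
    if container.name.startswith("insights-logs-advancedhunting-"):
        print(container.name)
```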
For each blob container it will collect and construct a couple of variables needed for communicating with the Data Collection Endpoint:
- Define the correct table name.
- Determine the correct `dcrName` to look up the immutableId for in the `dataCollectionRules` variable.
- Retrieve the immutableId from the variable we parsed earlier; this is done with a JavaScript step (a rough equivalent is sketched below).
JavaScript can be used within a Logic App workflow once an Integration Account is deployed in Azure as well. The deployment currently uses a ‘free’ pricing tier, but check the pricing details before using this in production.
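The inline JavaScript step essentially performs a lookup in that parsed list; roughly the following, shown here as a Python sketch with the same hypothetical field names used earlier.

```python
def find_immutable_id(parsed_rules: list, dcr_prefix: str, container_name: str) -> str:
    # "insights-logs-advancedhunting-alertinfo" -> "dcr-archive-alertinfo" (for example)
    dcr_name = dcr_prefix + container_name.rsplit("-", 1)[-1]
    for rule in parsed_rules:
        if rule["dcrName"] == dcr_name:
            return rule["immutableId"]
    raise LookupError(f"No Data Collection Rule found for {dcr_name}")
```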
Next, it will browse through the year/month/day folder structure until it reaches yesterday’s container, and then cycle through all of the ‘hour’ containers in there.
Within each ‘hour’ container resides a single `PT1H.json` blob. But this can (and probably will) consist of many log rows, all defined as separate items in a single object. There’s also another challenge: the appropriate object definition brackets (`[` and `]`) and item delimiters (`,`) are missing!
Composing a proper ‘body’
- “Compose body…” will fix the JSON structure by concatenating in these required characters so that the results can be parsed.
- “Parse string…” will make sure only the `properties` section is kept (to remove unwanted columns, as shown earlier above). The output is a new object which can be processed item by item (data element).
Splitting the object into separate items and processing them one by one is important, so we don’t hit the Data Collection Endpoint body limit of 1 MB.
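Put together, the “fix and parse” logic is roughly equivalent to this Python sketch (my own rendering of the Logic App steps, not the workflow’s actual code):

```python
import json

def parse_pt1h(raw: str):
    # The blob contains one JSON object per line, without the surrounding [ ] brackets
    # or , delimiters; concatenate those in so the content parses as a JSON array.
    fixed = "[" + ",".join(line for line in raw.splitlines() if line.strip()) + "]"
    for item in json.loads(fixed):
        # Keep only the 'properties' payload, mirroring the "Parse string..." step,
        # and yield item by item so each POST stays well below the 1 MB limit.
        yield item["properties"]
```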
Sending logs to the Data Collection Endpoint
For every item/data element we’ll be `POST`ing the contents of `properties` as the `Body` of the web request:
Here it all comes together:
- Data Collection Endpoint URI
- The JavaScript result containing the correct immutableId for that specific table, based on the originating blob container
- And of course the correct table name
Authentication
To make sure the Logic App can authenticate to both the Storage Account and the Data Collection Endpoint, a system-assigned Managed Identity is used.
Once enabled, the Logic App has its own identity which you can authorize on Azure resources. In this particular case two permissions are required:
- Monitoring Metrics Publisher is required on the Data Collection Rule
- Storage Blob Data Contributor is required on the storage account
But the automated deployment already took care of this by assigning both permissions to the resource group containing all relevant resources.
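For reference, here is what that POST looks like outside of a Logic App, as a hedged Python sketch: it assumes the common ‘Custom-&lt;table&gt;’ stream naming, the preview api-version that was current at the time of writing, and a credential that holds the Monitoring Metrics Publisher role on the DCR.

```python
import json
import requests
from azure.identity import DefaultAzureCredential

def post_to_dce(dce_uri: str, immutable_id: str, table: str, records: list) -> None:
    # Token for the Azure Monitor ingestion audience.
    token = DefaultAzureCredential().get_token("https://monitor.azure.com/.default").token
    url = (f"{dce_uri}/dataCollectionRules/{immutable_id}"
           f"/streams/Custom-{table}?api-version=2021-11-01-preview")
    response = requests.post(
        url,
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        data=json.dumps(records),  # must be a JSON array, even for a single record
    )
    response.raise_for_status()    # a successful call returns 204 with an empty body
```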
Some lessons learned
During the creation of this whole solution I went through some extensive trial and error, mostly because the new custom logs ingestion approach is still in public preview and a lot is still undocumented or unclear due to the lack of proper feedback.
Here are some pitfalls I ran into so you won’t have to:
- This Logic App relies heavily on “foreach” loops and has multiple loops within loops. (Logic Appception!) It’s not possible to “initialize” a variable inside a loop, but you should also avoid “setting” a variable inside one. Otherwise you won’t be able to use parallelism, which is quite necessary to get acceptable running times. Use “compose” instead! I started off without parallelism, but workflow runtimes decreased by 80% once I enabled it again.
- To “fix” the JSON structure of the blob contents I made use of `concat()` and `replace()` to replace carriage return and new line characters. When you enter this in the graphical designer view, the underlying code gets messed up and additional escape characters (`\`) are added, which leads to undesired results. Go into code view and fix this by removing them. Replace:
  `"value": "@{replace(items('...')['...'],'\\r\\n',' ')}"`
  with:
  `"value": "@{replace(items('...')['...'],'\r\n',' ')}"`
- Data Collection Rules can only be assigned to 10 different “streams” (destinations/tables)
- The Data Collection Endpoint will not process a `Body` larger than 1 MB in size (see the batching sketch after this list):
```json
{
  "error": {
    "code": "ContentLengthLimitExceeded",
    "message": "Maximum allowed content length: 1048576 bytes (1 MB). Provided content length: 3101601 bytes."
  }
}
```
- Other than that, the API is very silent when it comes to error handling. You’ll always receive a response code 204 on your request with an empty response message. So you won’t know if everything went well until you see data coming into the workspace (which can also be delayed by up to 15 minutes on the very first push).
- The same goes for a proper `Body` notation. Without the square brackets (`[` and `]`), for example, the data won’t be processed, but you’ll still receive that status 204 from the API.
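If a single blob yields more data than you want to send item by item, here is a hedged sketch of batching records so that every request body remains a proper JSON array below the 1 MB limit (sizes are approximated on the serialized JSON):

```python
import json

def batches(records, limit: int = 1_000_000):
    # Group records into chunks whose serialized size stays under the 1 MB
    # Data Collection Endpoint limit; each chunk is a proper JSON array.
    batch, size = [], 2                            # 2 bytes for the surrounding [ ]
    for record in records:
        encoded = len(json.dumps(record)) + 1      # +1 for the , delimiter
        if batch and size + encoded > limit:
            yield batch
            batch, size = [], 2
        batch.append(record)
        size += encoded
    if batch:
        yield batch
```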
Conclusion
We’ve reached the end of the line and part #2 of my first multi-part article!
By “abusing” the new custom logs ingestion capabilities, together with the Basic logs tier, we were able to archive M365 data and cut down costs significantly compared to the out-of-the-box data connector. Granted, we’re also losing some features, like triggering analytics rules within Sentinel. And you might argue that this solution is “a bit” more involved. But it works! So, depending on your needs, this might very well be a viable solution.
I really like the new custom logs ingestion capabilities, especially the on-the-fly transformation and filtering. This comes in handy with all of those syslog sources out there, where you previously had to use something like Logstash or Fluentd to achieve similar results. I’m looking forward to seeing how Microsoft will integrate this into the Azure Monitor Agent and what other features and capabilities will be added along the way.
I hope you’ve found this informative and clearly explained. If you have any additional questions, never hesitate to reach out to me!
— Koos