Collecting Office365 & AzureAD audit logs using Office Audit Collector

Hi everyone,

I was asked to write an article on a tool I made for Graylog that collects Office365 and AzureAD audit logs. In the article below I’ll explain how it works and how to set it up. It’s open source and free to use, and I hope it can be useful to you. If you have any questions, feel free to ask me here or create an issue on the GitHub repo.

Collecting Office365 & AzureAD audit logs using Graylog and Office Audit Collector

Monitoring and archiving audit logs is an essential aspect of security. This is especially true of Office365/Azure audit logs, as they expose so much useful data: Azure (failed) logins, Data Loss Prevention events, access to sensitive documents, prevented phishing attempts, etc.

That is why I built an audit log collector (in Rust and Python) that can output all these events to Graylog. It’s an executable that runs on both Windows and Linux. By archiving the audit trail in Graylog, we own the data; it does not disappear after 90 days, and we can take advantage of Graylog’s great dashboards.

In this article I will give a quick overview of how to use the Office365 Audit Log Collector tool to get your full Office365/AzureAD audit trail into Graylog by scheduling the tool to run regularly as a cron job or scheduled task. It was designed to be easy to use and should not take long to get running. To collect audit logs you need an Azure App Registration with the appropriate permissions; this onboarding process is also easy and takes only around five minutes. I describe it at the end of the article, so you can get an impression of the tool first.

How the tool works: Office365 management APIs

This section gives a brief explanation of the APIs; if you just want to get the tool running you can skip it.

This tool works by accessing the Office365 Management APIs using Rust (the collector engine) and Python (everything else). This is the only stable way to retrieve large amounts of logs: with Microsoft’s own PowerShell commands you cannot reliably retrieve the sometimes millions of logs that even a small or medium-sized tenant might generate, let alone a larger one. Retrieving the logs through the APIs is a cumbersome process, which is why I wrote a tool to automate it in the first place. For anyone interested, I described the technical aspects of this process in a Stack Overflow answer.
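
To give an impression of what the tool handles for you, here is a minimal Python sketch of the first step: requesting an access token for the Management APIs with the App Registration’s client credentials. The placeholder values are just examples, and this is a simplified illustration rather than the collector’s actual code:

import requests

# Placeholders: use the values from your own App Registration (see the onboarding section).
TENANT_ID = "your-tenant-id"
CLIENT_ID = "your-client-id"
CLIENT_SECRET = "your-secret-key"

def get_access_token():
    """Request an OAuth2 token for the Office 365 Management APIs (client credentials flow)."""
    url = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token"
    response = requests.post(url, data={
        "grant_type": "client_credentials",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "scope": "https://manage.office.com/.default",
    })
    response.raise_for_status()
    return response.json()["access_token"]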

The APIs expose five audit log feeds, which you would normally have to subscribe to before you could access them (the collector can handle the subscription for you):

  • General
  • Azure Active Directory
  • Exchange
  • Sharepoint
  • DLP

The API allows you to retrieve logs from the last 7 days, so I recommend collecting them every few hours with a cron job or scheduled task and storing them in Graylog, where you’ll own the data permanently. The API only allows you to collect logs in 24-hour time spans, but the collector can retrieve all 7 days at once if you need to (by automatically splitting the period into 24-hour windows and merging the results).
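
To illustrate that splitting, here is a small Python sketch (not the collector’s actual code, which does this in Rust) that breaks a look-back period into windows of at most 24 hours:

from datetime import datetime, timedelta, timezone

def split_into_windows(hours_to_collect, window_hours=24):
    """Split the look-back period into windows of at most 24 hours (the API's maximum span)."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours_to_collect)
    windows = []
    while start < end:
        window_end = min(start + timedelta(hours=window_hours), end)
        windows.append((start, window_end))
        start = window_end
    return windows

# For example, split_into_windows(168) yields seven consecutive 24-hour windows
# covering the last 7 days.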

Each audit log feed contains pages of content blobs, and each content blob contains multiple logs. So to retrieve logs manually you would need to access a feed, page through the listings of content blobs, download each content blob, extract the logs from it, and send them to an output. This is what the collector does for you automatically.
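
To make that flow concrete, here is a hedged Python sketch of collecting one feed for one 24-hour window: it lists the content blobs, follows the NextPageUri header until all pages have been read, downloads each blob and gathers the audit records. It assumes the subscription for the content type has already been started (the collector’s autoSubscribe option takes care of that), and it is an illustration rather than the collector’s actual implementation:

import requests

def collect_feed_window(token, tenant_id, start, end, content_type="Audit.General"):
    """Collect all audit records for one feed within one time window (at most 24 hours)."""
    headers = {"Authorization": f"Bearer {token}"}
    url = f"https://manage.office.com/api/v1.0/{tenant_id}/activity/feed/subscriptions/content"
    params = {
        "contentType": content_type,
        "startTime": start.strftime("%Y-%m-%dT%H:%M:%S"),
        "endTime": end.strftime("%Y-%m-%dT%H:%M:%S"),
    }
    records = []
    while url:
        response = requests.get(url, headers=headers, params=params)
        response.raise_for_status()
        for blob in response.json():          # each entry describes one content blob
            blob_data = requests.get(blob["contentUri"], headers=headers)
            blob_data.raise_for_status()
            records.extend(blob_data.json())  # a blob is a JSON list of audit log records
        url = response.headers.get("NextPageUri")  # follow pagination until no pages remain
        params = None                              # NextPageUri already contains the query string
    return records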

Running the Office365 Audit Log Collector executable:

I’ll now explain how to use the tool. Combined with the Azure onboarding section at the end of the article, this should be enough to get it running in your own environment if you’re interested.

Downloading the executable:

After choosing where to run the tool from (preferably an always-on environment where you can schedule regular execution) you can download the executable (check for the latest release) and place it in its own folder somewhere. All dependencies are baked into the executable, so this is all you need to get started apart from creating an Azure App Registration (see end of the article).

For any of these steps, it makes no difference whether you use Windows or Linux, other than OS syntax. In every release you’ll find a Linux- and a Windows executable to download:

  • LINUX-OfficeAuditLogCollector
  • WIN-OfficeAuditLogCollector.exe

Creating a Graylog input:

The Graylog input that receives the audit data from the executable is a simple Raw/Plaintext TCP input, which we can create with default values (though you don’t have to keep them). For this example I’ve only changed the name and the port:

Graylog > New Input

On your new input, click “manage extractors”. We need to create an extractor that pulls all the fields out of the JSON that Microsoft sends us. Click “import extractor” and paste the following simple JSON extractor:

{
  "extractors": [
    {
      "title": "Audit Log Extractor",
      "extractor_type": "json",
      "converters": [],
      "order": 0,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "flatten": true,
        "list_separator": ", ",
        "kv_separator": "=",
        "key_prefix": "",
        "key_separator": "_",
        "replace_key_whitespace": false,
        "key_whitespace_replacement": "_"
      },
      "condition_type": "none",
      "condition_value": ""
    }
  ],
  "version": "4.2.9"
}
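
If you want to check the input and extractor before running the collector, you can push a single test message into the Raw/Plaintext TCP input yourself. Below is a small Python sketch; the address, port and fields are just example values, chosen to resemble the shape of an audit record:

import json
import socket

# Example values: use the address and port of your own Raw/Plaintext TCP input.
GRAYLOG_ADDRESS = "172.16.1.1"
GRAYLOG_PORT = 5010

# A made-up record with an audit-log-like shape; the JSON extractor should turn
# every key into its own field on the resulting Graylog message.
test_record = {"Operation": "TestEvent", "UserId": "test@example.com", "Workload": "Exchange"}

with socket.create_connection((GRAYLOG_ADDRESS, GRAYLOG_PORT)) as sock:
    # A Raw/Plaintext TCP input treats each newline-terminated line as one message.
    sock.sendall((json.dumps(test_record) + "\n").encode("utf-8"))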

Creating our config file:

We now have the executable to send logs and a Graylog input to receive them. How logs are collected and where they are sent is defined by a YAML config file used by the executable. In the GitHub repo you can find a ConfigExamples folder to show you the ropes. Especially useful is “fullConfig.yaml”, which contains all possible options with explanatory comments. Let’s first look at the config file we will be using for this example:

log:  # Define the executable's own log settings
  path: 'collector.log'
collect:  # Define how to collect audit logs
  autoSubscribe: True 
  skipKnownLogs: True
  hoursToCollect: 24 
  contentTypes:
    Audit.General: True
    Audit.AzureActiveDirectory: True
    Audit.Exchange: True
    Audit.SharePoint: True
    DLP.All: True
output:  # Define outputs to send audit logs to
  graylog:
    enabled: True
    address: 172.16.1.1
    port: 5010

Since the collector strives to use sane defaults, we can keep our config file fairly short! There are three sections: log, collect and output, which define how our collector will run. I will describe the parameters that are not self-explanatory:

  • autoSubscribe:
    Before you can collect audit logs it is necessary to subscribe to audit log feeds using the Office Management APIs. By using this parameter, this is done for you automatically.
  • skipKnownLogs:
    To prevent duplicates, the content ID of each log is saved to a file so we can avoid sending it to Graylog twice. Microsoft provides a content expiration for each ID, so the file will not grow indefinitely; each ID is kept for a week at most (see the sketch after this list).
  • hoursToCollect:
    The amount of hours to look back and collect logs for. Setting this to 24 hours and running the tool regularly (every few hours at least) will ensure that there are multiple chances to download each log. That way, if Microsoft has any delay in exposing certain logs, we will pick them up later.
  • contentTypes:
    Choose which audit log feeds to collect logs from. Usually you collect all five.
  • graylog:
    We choose Graylog as our output. This is where you enter your Graylog address and port. There are many more outputs supported and you can use more than one at the same time if you need.
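
To illustrate the idea behind skipKnownLogs, here is a simplified Python sketch (not the collector’s implementation): every content blob returned by the API carries a contentId and a contentExpiration, so known IDs only need to be remembered until they expire. The file name is hypothetical:

import json
from datetime import datetime, timezone

KNOWN_IDS_FILE = "known_content_ids.json"  # hypothetical file name, for this sketch only

def load_known_ids():
    """Load previously seen content IDs and drop the ones whose expiration has passed."""
    try:
        with open(KNOWN_IDS_FILE) as f:
            known = json.load(f)
    except FileNotFoundError:
        known = {}
    now = datetime.now(timezone.utc)
    return {cid: exp for cid, exp in known.items()
            if datetime.fromisoformat(exp.replace("Z", "+00:00")) > now}

def is_new(blob, known_ids):
    """Return True if this content blob has not been collected before, and remember it."""
    if blob["contentId"] in known_ids:
        return False
    known_ids[blob["contentId"]] = blob["contentExpiration"]
    return True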

Now that you understand the config file, you can copy the text and place it in a file on your machine (you only have to change your Graylog address and port). It can be anywhere and have any name, but for convenience place it in the same folder as the executable, and call it “config.yaml”.

Our folder now looks like this:

Executable folder

Running the tool:

Everything is now in place to run the tool. We will first run it manually to verify that it works, and to see which parameters to use. Then if we are happy with the results we will schedule regular execution to start archiving our audit trail. Here’s the help message for the executable:

Executable help

While there are quite a few parameters, most belong to outputs we aren’t using. We only need the required parameters at the bottom:

  • Tenant_id:
    The ID of your tenant; when you onboard you can find this on the overview page of your App Registration.
  • Client_key:
    The ID of your Azure App Registration; when you onboard you can find this on the overview page of your App Registration.
  • Secret_key:
    The password of your Azure App Registration. You will create this when onboarding (see the onboarding section).
  • --config:
    The path to our config file.

To run the tool I will first define some environment variables (not mandatory):

Env

Now we will run the tool using the following command:

.\WIN-OfficeAuditLogCollector-V2.0.exe $env:tenant_id $env:client_id $env:secret_key --config .\config.yaml

Running the tool

The collector retrieves the audit logs and starts sending them to Graylog:

Logs starting to come in

We can verify that the extractor is working by viewing the messages of our new input and checking the ‘fields’ tab of Graylog. It should show us a lot of new fields:

New fields

The first run always takes the longest, as all data that is retrieved is new. With ‘skipKnownLogs’ enabled, all subsequent runs will only retrieve new content.

Scheduling regular execution:

The final step is to schedule the tool to run regularly, preferably every few hours at least. You can do this with cron, the Windows Task Scheduler, or any other method you prefer. Simply run the command from the previous section on a schedule:

.\WIN-OfficeAuditLogCollector-V2.0.exe $env:tenant_id $env:client_id $env:secret_key --config .\config.yaml
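
For example, on Linux a crontab entry that runs the collector every four hours could look like the line below. The installation path is just an example, and you would substitute your own tenant ID, client key and secret key (or read them from a protected environment file):

0 */4 * * * /opt/officeauditlogcollector/LINUX-OfficeAuditLogCollector <tenant_id> <client_key> <secret_key> --config /opt/officeauditlogcollector/config.yaml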

Onboarding:

If you would like to start using the Audit Log Collector, you will need an Azure App Registration. This is easy to create; here is a short summary of how to do it:

  1. Make sure auditing is turned on for your tenant (it’s probably already enabled):
    1.1 Use Microsoft’s instructions to check or enable it. If you had to turn it on, it may take a few hours to take effect.
  2. Create App registration:
    2.1. Azure AD > ‘App registrations’ > ‘New registration’
    2.2. Choose any name for the registration
    2.3. Choose “Accounts in this organizational directory only (xyz only - Single tenant)”
    2.4. Hit ‘register’
  3. Save the ‘Tenant ID’ and ‘Application (Client) ID’ from the overview page of the new registration; you will need them to run the collector.
  4. Create an app secret: Azure AD > ‘App registrations’ > click your new app registration > ‘Certificates and secrets’ > ‘New client secret’.
    4.1. Choose any name and expiry date and hit ‘Add’.
    4.2. The actual key is only shown once, upon creation; store it somewhere safe. You will need it to run the collector.
  5. Grant the new app registration ‘application’ permissions to read the Office 365 Management APIs:
    Azure AD > ‘App registrations’ > Click your new app registration > ‘API permissions’ > ‘Add permissions’ > ‘Office 365 Management APIs’ > ‘Application permissions’
    5.1. Enable ‘ActivityFeed.Read’
    5.2. Enable ‘ActivityFeed.ReadDlp’
    5.3. Click ‘Add permissions’
  6. You now have a valid ‘tenant ID’, ‘application (client) ID’ and ‘secret key’ to run the collector with!

That’s it!

Both onboarding and scheduling the tool to retrieve logs are fairly simple and quick, but the benefits of having all Office365 and AzureAD audit data in Graylog are (in my opinion) huge. A good next step would be to create dashboards for your new data, but that is outside the scope of this article.

Lastly: any and all contributions / pull requests are appreciated. This tool is distributed under the MIT license.

I hope this can be useful to you!
