Good Afternoon:
I currently use, and benefit greatly from, @ddbnl's Office365/Azure Collector.
Unfortunately (for me), the extractor they provide also stores a lot of extraneous information, including one field that causes roughly 2,000 errors per 24 hours. Example:
OpenSearchException[OpenSearch exception [type=mapper_parsing_exception, reason=failed to parse field [ListBaseType] of type [long] in document with id '09fb8989-436d-11ee-bb9b-9acc4b3b621e'. Preview of field's value: 'DocumentLibrary']]; nested: OpenSearchException[OpenSearch exception [type=illegal_argument_exception, reason=For input string: "DocumentLibrary"]];
I understand a better approach would be to create a pipeline with rules. Unfortunately, pipelines are well outside of my comfort zone.
So my question is whether anyone could provide a few breadcrumbs on how to extract just certain fields from a message. For example: “Operation”, “RecordType”, “Device Properties”, etc. from a message like:
{
"AzureActiveDirectoryEventType": 1,
"gl2_remote_ip": "192.168.128.117",
"gl2_remote_port": 41596,
"UserKey": "<redacted>",
"ActorIpAddress": "<redacted>",
"source": "192.168.128.117",
"Operation": "UserLoginFailed",
"OrganizationId": "<redacted>",
"gl2_source_input": "<redacted>",
"ExtendedProperties": "{Name=ResultStatusDetail, Value=UserError}, {Name=UserAgent, Value=Windows-AzureAD-Authentication-Provider/1.0}, {Name=UserAuthenticationMethod, Value=262144}, {Name=RequestType, Value=OAuth2:Token}",
"IntraSystemId": "d0b9c2a4-ee31-4130-b2f8-03fb7ed56600",
"Target": "{ID=<redacted>, Type=0}",
"RecordType": 15,
...
}
If there is an easier way, like continuing to use the JSON extractor but with the ability to “ignore” certain fields, I am all ears.
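To make the goal concrete, here is a minimal Python sketch of the behaviour I am after. It is not Graylog pipeline or extractor code, only an illustration: the helper names keep_fields and drop_fields are made up, and the sample message is trimmed from the one above, with ListBaseType added from the error message as an assumption about what the offending document contains.

```python
import json

# Trimmed sample message. "ListBaseType" is taken from the error above and is
# an assumption about what the offending document looks like.
RAW = """{
  "AzureActiveDirectoryEventType": 1,
  "Operation": "UserLoginFailed",
  "RecordType": 15,
  "ListBaseType": "DocumentLibrary"
}"""


def keep_fields(message, wanted):
    """Return a copy of message containing only the keys listed in wanted."""
    return {key: value for key, value in message.items() if key in wanted}


def drop_fields(message, unwanted):
    """Return a copy of message without the keys listed in unwanted."""
    return {key: value for key, value in message.items() if key not in unwanted}


message = json.loads(RAW)

# Option 1: store only the fields I care about.
kept = keep_fields(message, {"Operation", "RecordType"})

# Option 2: store everything except the field that breaks indexing.
trimmed = drop_fields(message, {"ListBaseType"})

print(kept)
print(trimmed)
```

Either behaviour would work for me: an allow-list of fields to store, or a deny-list of fields to skip.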
As always, thank you!