Updated to 6.3.1 and my Office365 Input (RAW TCP) no longer writes to index

accidentaladmin · August 6, 2025, 4:36pm

Greetings. As stated above, I recently upgraded to Graylog 6.3.1. It has not been a smooth update, and I have been ironing out a lot of little issues for the past week or so.

However, the issue I can’t seem to resolve is with the Office365 Audit Log input. Ever since I upgraded, the input no longer writes to the index.

I have created new inputs and streams, added extractors, and removed extractors, and haven’t had any luck. There are no immediately apparent errors in either the Graylog server logs or the OpenSearch cluster logs.

I do know Graylog receives the logs because:

But:

Any thoughts?

Thank you!

Edit: Graylog Diagram
Opensearch:

Manager 0
Manager 1
Manager 2
HotData 0
HotData 1
HotData 2
ColdData 0

Mongo:
mongodb 0
mongodb 1
mongodb 2

Graylog:
graylog server 0

Joel_Duffield · August 6, 2025, 8:01pm

What happens if you look at the input diagnosis screen for that input (it’s on the “more options” drop-down next to the input)? Do you see anything under message errors, etc.?

accidentaladmin · August 6, 2025, 8:43pm

No errors and, oddly enough, now this, a huge message dump (despite the remote data provider running every 15 minutes):

Joel_Duffield · August 6, 2025, 9:36pm

How long ago did you start the input? Could this actually be a timestamp issue, where the messages are being ingested but show up at an incorrect time?

accidentaladmin · August 6, 2025, 9:50pm

The input has been running for months and months, but the issue only occurred recently. I’ve been thinking a timestamp issue would explain it, but I’m just not sure where the issue would arise. It’s a RAW TCP input, and Graylog doesn’t modify the timestamp as far as I know. At minimum, it should have logs spread out across time even if the timestamps were wrong.

Joel_Duffield · August 7, 2025, 1:15am

On raw the timestamps should be fine. Are all other logs okay? Have you tried turning off all processing of these messages and seeing what happens?

Wine_Merchant · August 7, 2025, 8:13am

In recreating the stream, is it for sure now writing to the m365 index?

Try recalculating the index range under maintenance.

accidentaladmin · August 7, 2025, 12:47pm

Now that you mention it, it looks like all my logs are 4 hours behind.

For instance, something that happens at 08:45 EST shows in Graylog as 04:45.
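
One way a uniform four-hour shift like this can arise (an assumption, not something confirmed in this thread) is a timestamp with no zone information being stored as UTC and then rendered in a UTC-4 display time zone. A minimal Python sketch of that arithmetic, using a fixed UTC-4 offset as a stand-in for US Eastern time in August; the function name is hypothetical:

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-4 offset: US Eastern daylight time in August (assumed display zone).
EASTERN_SUMMER = timezone(timedelta(hours=-4))

def displayed_time(naive_local: datetime, display_tz: timezone = EASTERN_SUMMER) -> datetime:
    """Treat a zone-less wall-clock time as UTC, then render it in display_tz."""
    stored_as_utc = naive_local.replace(tzinfo=timezone.utc)
    return stored_as_utc.astimezone(display_tz)

event = datetime(2025, 8, 7, 8, 45)  # 08:45 local wall-clock time, no zone attached
shown = displayed_time(event)
print(shown.strftime("%H:%M"))  # 04:45
```

If this is what is happening, the fix would be at whichever step parses or assigns the timestamp, not in the search view.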

accidentaladmin · August 7, 2025, 5:19pm

Okay, I’ve dug deeper and have a HUGE problem: Graylog is using neither its Input Buffer nor its Output Buffer, but the Processing Buffer is pegged at 100%. Further, I have approximately 4.7 million unprocessed messages. Graylog runs in an LXC container, and I have maxed it out at 16 vCPUs and 32 GB of memory. Heap and garbage collection are temporarily set to:

GRAYLOG_SERVER_JAVA_OPTS="-Xms20g -Xmx24g -server -XX:+UseG1GC -XX:-OmitStackTraceInFastThrow -Djavax.net.ssl.trustStore=/etc/graylog/graylog.jks"

and my buffers are:

processbuffer_processors = 14
outputbuffer_processors = 8
processor_wait_strategy = blocking
ring_size = 262144

inputbuffer_ring_size = 262144
inputbuffer_processors = 6
inputbuffer_wait_strategy = blocking

Any suggestions?
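
As a rough rule of thumb (not an official Graylog diagnostic), which buffer is full hints at which stage is slow: a full output buffer usually points at the indexer, while a full process buffer with an empty output buffer points at message processing itself. A minimal Python sketch of that reasoning follows; the function name and the 90% threshold are hypothetical choices for illustration:

```python
def likely_bottleneck(input_pct: float, process_pct: float, output_pct: float,
                      full: float = 90.0) -> str:
    """Guess which stage is slow from buffer utilization percentages."""
    # Check the last stage first: a stalled output backs up everything upstream.
    if output_pct >= full:
        return "output: the indexer (OpenSearch) is probably not keeping up"
    if process_pct >= full:
        return "processing: extractors, pipeline rules, or lookups are probably slow"
    if input_pct >= full:
        return "input: the handoff from the inputs is probably the slow step"
    return "no buffer is saturated"

# The situation described above: input and output idle, processing pegged.
verdict = likely_bottleneck(input_pct=0, process_pct=100, output_pct=0)
print(verdict)
```

With the numbers from this post, the sketch points at processing, which suggests looking at extractors, pipelines, and lookup tables before resizing buffers or heap.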

Joel_Duffield · August 7, 2025, 5:50pm

How many messages per second are you ingesting, and how much processing are you doing? A machine of that size should easily process 10-20k messages per second, BUT some really intensive pipeline rules can kill that number very fast.

accidentaladmin · August 7, 2025, 6:51pm

So I disabled all pipelines and re-jiggered a PTR data adapter and was able to catch up quickly (i.e. no backlog). So now I suppose the next step is to figure out which pipeline is causing so many issues for Graylog.

accidentaladmin · August 8, 2025, 11:44am

For anyone who comes here with a similar issue: I believe the solution for me was that my PTR lookup data adapter was pointed at some stale DNS servers. Two of my largest streams, Fortigate Firewall and my M365 Audit Log, were using that adapter. I believe the backlog of 4 million plus messages was the result of each and every attempt to use the PTR data adapter timing out. It does not take long for an unwatched stream to back up.
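
To see why timeouts alone can produce a backlog of this size, here is a back-of-the-envelope Python sketch. It assumes one blocking lookup per message and that every lookup runs to its full timeout. The processor count comes from the config earlier in the thread; the timeout and ingest rate are hypothetical values, not figures from this thread:

```python
def max_throughput(processors: int, seconds_per_message: float) -> float:
    """Upper bound on messages/second when every message blocks for a fixed time."""
    return processors / seconds_per_message

def seconds_to_backlog(backlog: int, ingest_rate: float, throughput: float) -> float:
    """Time for a backlog to build when ingest outpaces processing."""
    surplus = ingest_rate - throughput
    if surplus <= 0:
        raise ValueError("processing keeps up; no backlog builds")
    return backlog / surplus

PROCESSORS = 14          # processbuffer_processors from the config earlier in the thread
LOOKUP_TIMEOUT_S = 5.0   # hypothetical DNS timeout per lookup, not from the thread
INGEST_RATE = 500.0      # hypothetical messages/second, not from the thread

rate = max_throughput(PROCESSORS, LOOKUP_TIMEOUT_S)
hours = seconds_to_backlog(4_700_000, INGEST_RATE, rate) / 3600
print(f"{rate:.1f} msg/s ceiling; 4.7M backlog after about {hours:.1f} h")
```

Under those assumed numbers the processing ceiling drops to a few messages per second, so even a modest ingest rate fills the journal within hours.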

Joel_Duffield · August 8, 2025, 10:26pm

Glad you found it. Yes, lookup adapters are powerful tools, but if you have them working on a lot of messages, their performance can be very important. Caching often helps, but caches tend not to work well on IP-type lookups because there are so few repeats to benefit from.
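
The caching point can be illustrated with a small simulation. This is a sketch using Python's `functools.lru_cache` as a stand-in for a lookup cache, not Graylog's own cache implementation; the key counts and cache size are made-up values:

```python
from functools import lru_cache
import random

def hit_rate(keys, cache_size: int) -> float:
    """Fraction of lookups served from an LRU cache of the given size."""
    @lru_cache(maxsize=cache_size)
    def lookup(key):
        return key  # stand-in for an expensive PTR lookup

    for k in keys:
        lookup(k)
    info = lookup.cache_info()
    return info.hits / (info.hits + info.misses)

rng = random.Random(0)  # fixed seed so the run is repeatable
few_hosts = [rng.randrange(50) for _ in range(10_000)]       # a small set of hosts that repeat often
many_hosts = [rng.randrange(2**32) for _ in range(10_000)]   # wide spread of source IPs, few repeats

print(f"few distinct hosts:  {hit_rate(few_hosts, 1000):.2%}")
print(f"many distinct hosts: {hit_rate(many_hosts, 1000):.2%}")
```

When the same few addresses recur, nearly every lookup is a cache hit; when addresses rarely repeat, almost every message still pays the full lookup cost.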

system · August 22, 2025, 10:26pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
