Dashboard not populating when filter is less than 8 hours from now

Thanks for the info @tmacgbay. I am not sure why my previous command gave that older version; maybe I used the wrong command (I was looking at the apt-cache version).

Running the command you provided, I am on:

elasticsearch 7.10.2
mongo 4.4.10
mongo-database-tools 100.5.1

Under System/Overview, what is your time configuration? You mentioned server and user time zones but not the web browser time zone; just double checking. If you have more than one Graylog server, you should also check the time on each under System/Nodes. Is the newest data still only updating to 6 hours ago, which would suggest Elastic is still receiving?


It does look like they all match there; here are the times from the config page:

This is super odd because it worked on 4.0, and it still works now but seems broken-ish. (I tried to add more screenshots as examples, but as a new user I cannot add more than one image to a post.)

What the stream looks like when filtering under 8 hours until now:

What the stream looks like when filtering 8 hours ago until now (notice it now populates, but it cuts off and does not display current log entries, even though entries are still coming in, as you can see from the messages below):

How about the “process-buffer dump” under the System/Nodes Actions button?

Mine show idle, but I have seen things in there when the server was hung up. How does input/process/output buffer processing look for all nodes? Anything hanging up in the buffers? What is the utilization at? Possibly six hours back on processing?

EDIT: If you search absolute time 6-8 hours in the future, do you get current messages?

EDIT2: What are the results of $ sudo timedatectl? (From reading through here… old, but may still be relevant.)


Thanks again for the ideas. I know just enough about Graylog to set it up and make it look pretty, but I have not really had to do much troubleshooting until now…

Looks like mine are all idle as well.

Utilization is typically under 1%.

Buffers are at 0%.

sudo timedatectl returns correct values for local time, UTC, and RTC, as well as the proper time zone; the NTP service is active, and the system clock is in sync.
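
For anyone following along, healthy output looks roughly like this (the times here are illustrative, not my actual values):

               Local time: Tue 2021-11-16 08:15:42 PST
           Universal time: Tue 2021-11-16 16:15:42 UTC
                 RTC time: Tue 2021-11-16 16:15:42
                Time zone: America/Los_Angeles (PST, -0800)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no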

Interesting: if I search for an absolute range of a few days, it returns emptiness. Setting refresh to every 1 second just scrolls emptiness each second.

This is bizarre; it should not be acting this way, and maybe it has nothing to do with date/time settings and more to do with something else… I do see some indexer failures in the overview section. They are all the same, which looks suspect, although I am not sure why it would only break things when filtered under 8 hours:

ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [HighResolutionTimestamp] of type [date] in document with id… Preview of field's value: 'VoSDvb']]; caused by: illegal_argument_exception, reason=failed to parse date field with format [strict_date_optional_time||epoch_millis]: failed to parse with all enclosed parsers.

EDIT: Looks like these are just messages that it cannot parse due to junk mis-aligning the fields in the stream, and they happened 7 hours ago. Hard to tell what character threw it off.

Well… you are having time errors in Elastic and time errors in searching against Elastic… that is a place to start. What extractors or pipeline rules do you have that have anything to do with time or the [Tt]imestamp of a given message? Post them along with a relevant message you are parsing… and use the </> forum tool to make them look nice.
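
For reference, a time-handling pipeline rule generally looks something like this (the field name, pattern, and time zone here are illustrative; swap in whatever your messages actually carry):

rule "normalize PAN-OS HighResolutionTimestamp"
when
    has_field("HighResolutionTimestamp")
then
    // parse the raw string into a proper date, pinning the time zone explicitly
    let ts = parse_date(
        value: to_string($message.HighResolutionTimestamp),
        pattern: "yyyy/MM/dd HH:mm:ss.SSS",
        timezone: "America/Los_Angeles");
    set_field("timestamp", ts);
end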

Hello

I agree with @tmacgbay

Just out of curiosity, are you using any custom index mappings?
I'm not sure if it will solve your issue, but you might also try rotating the index/indices.

EDIT: I forgot to mention, you could also check the field “HighResolutionTimestamp” shown in the error (reason=failed to parse field [HighResolutionTimestamp] of type [date]) and post it here.

Example:

curl -X GET "localhost:9200/index_name/_mapping/field/HighResolutionTimestamp?pretty"
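
If the field is mapped, the reply should come back looking roughly like this (the index name here is illustrative):

{
  "graylog_5" : {
    "mappings" : {
      "HighResolutionTimestamp" : {
        "full_name" : "HighResolutionTimestamp",
        "mapping" : {
          "HighResolutionTimestamp" : {
            "type" : "date"
          }
        }
      }
    }
  }
}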

The extractor set I am using is the PANOS 10.x set from here: PANOSGraylogExtractor/10.0.json at master · jamesfed/PANOSGraylogExtractor · GitHub

The only pipeline rule I have is for the GeoIP lookup.

I also cleared the indexer failures from Mongo to get a fresh look. That was several hours ago, and I have not seen any new errors yet, but I still see the issue of dashboards not populating unless the filter is set to anything >= 8 hours.

I have also attempted to remove the HighResolutionTimestamp from the equation by telling Graylog not to process that field (this may be why I am no longer seeing the index errors), but ruling that out as the culprit is worth a try. I am hoping to still see no indexer errors for that field when I check again later, but I don't think that resolves the original issue here…

Last week I did rotate through my indexes (I am only running with 5, at 4 shards each), but that did not make any difference in the behavior I am seeing.

Looking at the extractors, I see that the field appears 3 times and is processed by the extractor when conditions are met (log type is TRAFFIC, THREAT, or SYSTEM). Currently, because I have removed that field from being processed, the mapping query just comes up with “mappings” : { }.

Since I already have other date/time fields in these logs, I do not even need the HighResolutionTimestamp field, so I will likely just monitor to make sure I do not get any more indexer errors on that field and leave it off.
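
In case anyone needs to do the same, a pipeline rule along these lines should also strip the field before it ever reaches the index (a sketch, assuming you have a pipeline connected to the stream):

rule "drop HighResolutionTimestamp"
when
    has_field("HighResolutionTimestamp")
then
    // remove the field so Elasticsearch never tries to parse it as a date
    remove_field("HighResolutionTimestamp");
end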

Removing it (with no new errors since) does not seem to be resolving the issue of Graylog not populating when <8 hours is used for a filter, though, so while this does seem to be a good exercise, I am still left scratching my head as to what the cause could be.

For anyone curious, this is a post I made 5 months ago when running Graylog 4.0, and you can see the dashboards properly populating and showing data using a “5 minutes ago until now” filter. This issue started after the update to Graylog 4.1 (along with the Elastic and Mongo updates required for it); everything else was the same (same OS, same server, same device sending logs, etc.)

Using Graylog with Palo Alto Networks Firewall running PANOS 10.x : graylog (reddit.com)

I keep coming back to the idea that the Graylog browser thinks it's in a different time zone. The times on the actual messages are correct, but Graylog is displaying in UTC, and since you are 7 hours back… What about your web browser time zone? Have you tried a different web browser to access Graylog? So frustrating!

Also, I looked at your PA dashboards and they look great; I will likely snag some ideas from there. We use PA as well, but I am always looking for ideas for clearer information! I didn't use the plugin; I ended up building out my own pipeline. PA changed around the log formatting associated with VPN users, and it was easier to keep up with my own work than wait for the plugin maintainer. :slight_smile:

I did try this on Firefox on Linux, Chrome on ChromeOS and Windows 11, and Edge, but I get the same results all around :frowning:

Yep, one of the things I am assuming is that eventually they will update the log format and break everything. I have been running 10.x since it was in beta, and the Graylog dashboards have been super handy for getting quick and timely information. I have also made dashboard tabs for specific endpoints and unknown hosts (kind of a cheap NAC).

Let me know if you think of anything else I could look into; I am going to continue just poking around and researching.

P.S. After removing the trouble fields from being indexed, I have been indexer-error free, and I think that issue is squished.

Niggling in the back of my brain: iterate through all the different time zone settings, from Graylog to your browser, change them out one by one to something else, and test to see if the results are as expected…

Maybe it is as simple as the system needing to be jostled, or you have a stray spelling/formatting error in a conf file somewhere?
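
The server-side piece lives in server.conf if you want to check or change that one directly (the path and value below are examples; adjust for your install):

# /etc/graylog/server/server.conf
# time zone Graylog falls back to when a user has no explicit zone set
root_timezone = America/Los_Angeles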

@orthonovum

This is weird, if all devices and Date/Time configurations are correct and you are still unable to see messages under 8 hours.

I see you are running UTC/GMT -8 hours (Pacific Standard Time). Correct me if I'm wrong, but it just seems odd that your time zone offset is reflected in this issue of no messages under 8 hours.

To sum it up:

  • All remote devices have the correct Date/Time
  • Graylog configuration is set to America/Los_Angeles
  • Graylog Time Configuration on the Web UI shows all three Date/Times correct
  • User Time Zone is correct
  • Your date/time stamp in the raw logs is correct
  • All Date/Times within your environment reflect America/Los_Angeles

This may not solve your issue, because I'm also running out of ideas, but did you upgrade MongoDB? If so, I was wondering if you checked for feature compatibility. For example, if you open the mongo shell and execute the following command:

db.adminCommand( { getParameter: 1, featureCompatibilityVersion: 1 } )

you should see the same version of Mongo as your installed version.
I installed MongoDB 4.4, and you can see the featureCompatibilityVersion is the same.
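
On a 4.4 install, the output should look roughly like this (illustrative):

> db.adminCommand( { getParameter: 1, featureCompatibilityVersion: 1 } )
{ "featureCompatibilityVersion" : { "version" : "4.4" }, "ok" : 1 }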

EDIT: You said you rotated the indices. Have you recalculated the index ranges?
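
If not, it can be triggered from System/Indices → Maintenance, or via the REST API with something like this (hostname and credentials are placeholders):

curl -u admin:password -H 'X-Requested-By: cli' -X POST 'http://graylog.example.com:9000/api/system/indices/ranges/rebuild'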


Yep, that summarizes things perfectly. I ran the command in Mongo and got the same results as your screenshot there.

I may just reinstall and rebuild from scratch at this point :frowning:

To be honest, I think your system might be fine. I believe the modification you implemented may be the culprit, and the upgrade to 4.1 broke it.
The best way to test this is to backtrack.
For example: create a syslog UDP input and send some data (a quick way to do that is sketched below), then wait 2-8 hours. Try to remove any or all modifications. When I have an issue like yours, I revert back to basics; if that does not work, then I know for a fact I have messed up my system. If it does work, then I know that whatever I configured or added before is the source of my issues.
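
A quick way to throw test data at a syslog UDP input (hostname and port are placeholders for your environment):

# -d sends the message as a UDP datagram
logger -n graylog.example.com -P 1514 -d "test message for Graylog"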

UPDATE:

They released Graylog 4.2.1 which includes the following fixes that resolved my issue:

  • Fix issue where the first day on the traffic graph didn’t show the full day traffic value
  • Add timezone setting to PaloAlto inputs to fix timestamp parsing

I updated, rebooted, and BOOM, everything looks like it should, just like it did on Graylog 4.0.

I knew it was a bug… (>.>) (^_^)/


Nice, glad you found the fix :smiley: and thanks for posting the resolution to this issue.