Messages counted on an Input of a new Graylog2 node are duplicated


(Joel B. Hilke) #1

I recently installed a 2-node Graylog v2.4 system on Ubuntu 16.04 with the recommended versions of Elasticsearch (v5.6) and MongoDB (v3.6.3) as an improvement over the existing single Graylog v2.1 system (production) built from the OVA. My plan is eventually to convert the existing system into a 3rd node, but right now the old Graylog 2.1 is using a GELF output to forward all messages to the 2nd new node. This seems to work well: the 2 new nodes see each other and data is replicated with Elasticsearch. Searching messages on either node works well.

However, when sending logs from a firewall directly to either of the 2 new nodes (using the Plain/Rawtext UDP or Syslog UDP input), I seem to end up with duplicated messages. I am comparing the same time periods across nodes, and when I look at the new nodes via this Input, the count of messages is double that of either the original node or a Stream which captures all of these messages. Here is a summary of the setup:

Graylog - original OVA server running v2.1
Graylog2a - new server Node 1 running v2.4 built manually
Graylog2b - new server Node 2 running v2.4, cloned from 2a, with various Graylog and Elasticsearch conf files then modified to give it a unique node name and to point it to MongoDB on Graylog2a.

Here is a summary of sample stats for 5 minutes for this firewall source:

Graylog 93,000 messages from firewall Input and quick value stats for “loguid” field all show a count of 1 (showing unique ID from messages is unique)

Graylog2a 186,000 messages from firewall Input and quick value stats for “loguid” field all show a count of 4 (showing unique ID quadrupled), but query of the Stream showing these messages shows count of 93,000 (showing unique ID doubled)

Graylog2b 186,000 messages from firewall Input and quick value stats for “loguid” field all show a count of 4 (showing unique ID quadrupled), but query of the Stream showing these messages shows count of 93,000 (showing unique ID doubled)

I’ve looked through numerous logs and cannot see any obvious errors. Any idea why this would happen, and what else to check?


(Jan Doberstein) #2

Do you have multiple indices defined?

Check if you have the messages that are duplicated in multiple streams that are saved in different indices.


(Joel B. Hilke) #3

I should have noted that in my first post. Yes that’s another reason we moved to the latest version of Graylog in order to get index sets. I created a short time length index set just for the firewall logs but have also checked the box to remove them from the all messages stream. So to my knowledge they should not be in more than one Index.

When I search via defined stream for firewall logs, the overall message count seems accurate, but when viewing via their specific Input, the count is doubled.

- Joel

(Jan Doberstein) #4

OK, let's clarify that:

  • when you search in one stream, you see each message only once
  • when you search as admin (or as a user with admin rights) via the search page, which searches over all messages/all streams/all indices, you see each message twice

Is that correct?

If you expand one of the doubled messages (to get the message details), you can see which input received the message, where it is stored, and in which streams it is displayed. Compare those details between the selected message and its duplicate. That should show you where the duplication happens.
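The comparison described above can also be scripted against exported search results. The following is only a minimal sketch: it assumes the hits are available as plain dictionaries, that `loguid` identifies a message (as in the first post), and that the details of interest are stored under the keys `gl2_source_input`, `index` and `streams`. The sample values are hypothetical.

```python
from collections import defaultdict


def diff_duplicates(messages, id_field="loguid",
                    compare=("gl2_source_input", "index", "streams")):
    """Group messages by id_field and report which compared fields differ
    between the copies of each duplicated message."""
    groups = defaultdict(list)
    for msg in messages:
        groups[msg[id_field]].append(msg)

    report = {}
    for key, copies in groups.items():
        if len(copies) < 2:
            continue  # not duplicated, nothing to compare
        differing = {}
        for field in compare:
            values = [copy.get(field) for copy in copies]
            if any(value != values[0] for value in values[1:]):
                differing[field] = values
        report[key] = differing
    return report


# Hypothetical sample: "a1" is stored twice, "b2" only once.
sample = [
    {"loguid": "a1", "gl2_source_input": "in-1",
     "index": "firewall_3", "streams": ["firewall", "node-2a"]},
    {"loguid": "a1", "gl2_source_input": "in-1",
     "index": "graylog_12", "streams": ["firewall", "node-2a"]},
    {"loguid": "b2", "gl2_source_input": "in-1",
     "index": "firewall_3", "streams": ["firewall"]},
]

result = diff_duplicates(sample)
print(result)
```

In this made-up sample the two copies of `a1` come from the same input and list the same streams, and differ only in the index they are stored in, which would point at stream and index set routing rather than at the input.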


(Joel B. Hilke) #5

Problem solved. By looking at two doubled messages I could see they were included in another Stream I had created to show all messages from a particular Graylog node. This stream was set in the default longer length index, so they were getting stored in 2 different index sets. I modified the GL Node stream to exclude these messages, and the message count dropped in half. Thank you!
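The effect can be illustrated with a small model. This is a sketch under the assumption, consistent with the behaviour described in this thread, that a message is stored once per distinct index set among the streams it is routed to. The stream and index set names are hypothetical.

```python
def stored_copies(message_streams, stream_index_set):
    """Number of stored copies of one message: one per distinct index set
    among the streams the message is routed to (assumed model)."""
    return len({stream_index_set[stream] for stream in message_streams})


# Hypothetical routing: the firewall stream uses its own short-retention
# index set, the per-node stream uses the default index set.
stream_index_set = {
    "firewall": "firewall-short",
    "gl-node": "default",
}

# Before the fix: firewall messages match both streams.
before = stored_copies(["firewall", "gl-node"], stream_index_set)
# After the fix: the per-node stream excludes firewall messages.
after = stored_copies(["firewall"], stream_index_set)

print(before, after)
print(93_000 * before, 93_000 * after)
```

With 93,000 incoming messages, two copies per message gives the 186,000 seen on the Input, and one copy per message gives 93,000 again.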

