Email notification, no backlogs

After upgrading to the latest Graylog version, 4.2.4, some of my backlog messages are not sent with my alert notifications.

I have multiple alerts set up; some send the backlog, some don’t. All of these were working prior to the upgrade, and nothing in my notification alerts was changed after I upgraded.

Below are my notification templates for one that is working and one that isn’t; they are identical. Both event definitions have the “Message backlog” checkbox enabled and set to 1.

Message alert that works:
--- [Event Definition] ---------------------------
Title: ${event_definition_title}
Description: ${event_definition_description}
Type: ${event_definition_type}
--- [Event] --------------------------------------
Timestamp: ${event.timestamp}
Message: ${event.message}
Source: ${event.source}
${if backlog}
--- [Backlog] ------------------------------------
Last messages accounting for this alert:
${foreach backlog message}
Source of alert: ${message.source}
Real message: ${message.message}
${message}
${end}
${end}

Message alert that doesn’t work:
Bad Email Connection Attempts

Timestamp: ${event.timestamp}
Message: ${event.message}
${if backlog}
${foreach backlog message}
Source of alert: ${message.source}
Real message: ${message.message}
${end}
${end}

Here are examples of the emails that I receive:
--- [Event Definition] ---------------------------
Title: FTP connection failure
Description: Email when a connection failure occurs via FTP
Type: aggregation-v1
--- [Event] --------------------------------------
Timestamp: 2021-12-21T11:36:29.000Z
Message: FTP connection failure
Source: syslog

--- [Backlog] ------------------------------------
Last messages accounting for this alert:

Source of alert: ftp
Real message: (?@35.195.93.98) [WARNING] Authentication failed for user [anonymous]
{index=graylog_187, message=(?@35.195.93.98) [WARNING] Authentication failed for user [anonymous], timestamp=2021-12-21T11:36:29.000Z, source=ftp, stream_ids=[611c1f9f42c6cb7725ef4d63, 000000000000000000000001], fields={gl2_accounted_message_size=219, application_name=pure-ftpd, level=4, gl2_remote_ip=10.1.1.1, gl2_remote_port=49236, facility_num=11, gl2_message_id=01FQEC337S9VRGXVYNRTC19X84, gl2_source_node=db8a38f9-c646-41ba-8dfc-8a04a70a387b, gl2_source_input=5f4fe62742c6cb17792cba7e, facility=FTP}, id=5c13d780-6251-11ec-9964-be7fe1bcfd20}

For the one that does not work, it’s as if everything within the ${if backlog} block fails to send:

Bad Email Connection Attempts

Timestamp: 2021-12-21T15:55:52.980Z
Message: Email Error Report: sendmail - count(application_name)=3.0
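
One thing I may try, to confirm whether the backlog list itself is arriving empty rather than the template failing, is an ${else} branch; this is only a minimal sketch, assuming the ${else} syntax of the underlying template engine (JMTE) is honored here:

${if backlog}
${foreach backlog message}
Source of alert: ${message.source}
Real message: ${message.message}
${end}
${else}
(no backlog messages were attached to this event)
${end}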

It would help if you edit or repost and use the forum </> tool to make your code readable. Highlight the code and use </> to format it as a code block. Without proper formatting it is very difficult to find issues or be sure of the details.

Sorry about that.
Working code:

--- [Event Definition] ---------------------------
Title: ${event_definition_title}
Description: ${event_definition_description}
Type: ${event_definition_type}
--- [Event] --------------------------------------
Timestamp: ${event.timestamp}
Message: ${event.message}
Source: ${event.source}
${if backlog}
--- [Backlog] ------------------------------------
Last messages accounting for this alert:
${foreach backlog message}
Source of alert: ${message.source}
Real message: ${message.message}
${message}
${end}
${end}

Non-working code

Bad Email Connection Attempts

Timestamp: ${event.timestamp}
Message: ${event.message}
${if backlog}
${foreach backlog message}
Source of alert: ${message.source}
Real message: ${message.message}
${end}
${end}

I can’t see any notable differences - not that I am surprised. What do you see for error messages? Are they coming in the GUI and/or the Graylog log files? You can watch the logs with this command as you test:

tail -f /var/log/graylog-server/server.log
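
If the log is noisy, a rough variant that narrows the output (assuming GNU grep; adjust the pattern to whatever you are hunting for):

tail -f /var/log/graylog-server/server.log | grep --line-buffered -iE "error|warn"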

Just this:

2021-12-21T15:46:01.181-05:00 ERROR [PivotAggregationSearch] Aggregation search query returned an error: Search type returned error:

Fielddata is disabled on text fields by default. Set fielddata=true on [message] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.
ElasticsearchException{message=Search type returned error:

Fielddata is disabled on text fields by default. Set fielddata=true on [message] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead., errorDetails=[Fielddata is disabled on text fields by default. Set fielddata=true on [message] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.]}
at org.graylog.storage.elasticsearch6.jest.JestUtils.specificException(JestUtils.java:135)
at org.graylog.storage.elasticsearch6.views.ElasticsearchBackend.doRun(ElasticsearchBackend.java:255)
at org.graylog.storage.elasticsearch6.views.ElasticsearchBackend.doRun(ElasticsearchBackend.java:69)
at org.graylog.plugins.views.search.engine.QueryBackend.run(QueryBackend.java:83)
at org.graylog.plugins.views.search.engine.QueryEngine.prepareAndRun(QueryEngine.java:164)
at org.graylog.plugins.views.search.engine.QueryEngine.lambda$execute$6(QueryEngine.java:104)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Though it should be noted that this message was occurring long before I made the upgrade, and things worked fine then. Some of my notifications are still working, and nothing else comes through the logs when testing.

More of a bug?

If you take the working code and replace the code on the broken alert, will it work then? I am wondering if there is a hidden expectation of a particular field.

It made no difference; it printed everything up to the ${if backlog} block:

--- [Event Definition] ---------------------------
Title: Email Error Report
Description: Email to report on errors and connections from both relay and mailserver
Type: aggregation-v1
--- [Event] --------------------------------------
Timestamp: 2021-12-21T21:54:52.980Z
Message: Email Error Report: sendmail - count(application_name)=3.0
Source: syslog

The only thing I haven’t tried is deleting the current event definition and its notification and recreating everything. Even if that works, it doesn’t necessarily prove what stopped it from working in the first place.

Hello,

I did a test in my lab on this issue.

This is my notification template, and I’m using an Email Notification in my setup.

  • Test #1
--- [Event Definition] ---------------------------
Title:       ${event_definition_title}
Description: ${event_definition_description}
Type:        ${event_definition_type}
--- [Event] --------------------------------------
Timestamp:            ${event.timestamp}
Message:              ${event.message}
Source:               ${event.source}
Key:                  ${event.key}
Priority:             ${event.priority}
Alert:                ${event.alert}
Timestamp Processing: ${event.timestamp}
Timerange Start:      ${event.timerange_start}
Timerange End:        ${event.timerange_end}
Fields:
${foreach event.fields field}  ${field.key}: ${field.value}
${end}
${if backlog}
--- [Backlog] ------------------------------------
Last messages accounting for this alert:
${foreach backlog message}
Source of alert: ${message.source}
${message.message}
${end}
${end}
  • Results #1 (screenshot)

  • Test #2, using Real message: ${message.message} in the template
--- [Event Definition] ---------------------------
Title:       ${event_definition_title}
Description: ${event_definition_description}
Type:        ${event_definition_type}
--- [Event] --------------------------------------
Timestamp:            ${event.timestamp}
Message:              ${event.message}
Source:               ${event.source}
Key:                  ${event.key}
Priority:             ${event.priority}
Alert:                ${event.alert}
Timestamp Processing: ${event.timestamp}
Timerange Start:      ${event.timerange_start}
Timerange End:        ${event.timerange_end}
Fields:
${foreach event.fields field}  ${field.key}: ${field.value}
${end}
${if backlog}
--- [Backlog] ------------------------------------
Last messages accounting for this alert:
${foreach backlog message}
Source of alert: ${message.source}
Real message: ${message.message}
${end}
${end}
  • Results #2 (screenshot)

  • Results #2, using HTML (screenshot)

Hope that helps

EDIT: I probably should have shown these settings.

Thanks for the info. My event notification screen is similar to yours, though I have fewer backlog messages set: 3 instead of 8.

I’ll try and remove the event and notification and see if that clears anything out.

I’ll report back.

Deleting the notification/event did nothing.

It still only shows the Event Definition and Event data, with no Backlog data. It’s like Graylog is ignoring the ${if backlog} macro for whatever reason.

UPDATE:

In my event definition I had enabled the “group by fields” option so that it would catch specific errors from one application, sendmail, and not accidentally group other applications in with it. I didn’t want three or more matches from sendmail/imap showing up together.

I removed my group-by entry, and that allowed the backlog messages to be displayed again.

I’ve looked through my other events, and only two are using this feature. I had also stopped receiving messages from those, so I’ll go ahead and disable the feature on them as well.
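
For anyone who wants to keep “group by fields” enabled, the grouped values should still be visible on the event itself; a minimal sketch using the placeholders from the test template above (assuming the key and fields are populated when grouping is in use):

Key:    ${event.key}
Fields:
${foreach event.fields field}  ${field.key}: ${field.value}
${end}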

I’ll provide another update sometime next week on if everything has continued to work as expected.

Thanks for the help @gsmith and @tmacgbay, it’s much appreciated.

Glad you found it!! Mark your post as the answer so that future searchers can find it! 🙂

Nice, glad you solved your issue. Keep us posted 🙂

UPDATE:

This has been working for the past week. It’s safe to say the solution was to stop using the “group by fields” feature.

Thanks again to those who helped.

@drwt30
Thanks for the update 🙂
