Hello,
I don’t receive an Alert nor it is being displayed in Alerts & Events list.
Filter preview in Event Definition displays the missing event during the edit if switched to search within last 24 hours.
Searching for the filter search query displays missing events.
Here’s the part of log that is being sent from the remote instance to Graylog via filebeat:
Mar 28 19:26:31 mail postfix/submission/smtpd: disconnect from ec2-18-214-89-240.compute-1.amazonaws.com[18.214.89.240] commands=0/0
Mar 28 19:26:34 mail postfix/submission/smtpd: warning: ec2-18-214-89-240.compute-1.amazonaws.com[18.214.89.240]: SASL LOGIN authentication failed: UGFzc3dvcmQ6
Mar 28 19:26:34 mail postfix/submission/smtpd: lost connection after AUTH from ec2-18-214-89-240.compute-1.amazonaws.com[18.214.89.240]
Mar 28 19:26:34 mail postfix/submission/smtpd: disconnect from ec2-18-214-89-240.compute-1.amazonaws.com[18.214.89.240] ehlo=2 starttls=1 auth=0/1 commands=3/4
Mar 28 21:27:13 mail postfix/submission/smtpd: lost connection after UNKNOWN from unknown[103.45.246.23]
Mar 28 21:27:13 mail postfix/submission/smtpd: disconnect from unknown[103.45.246.23] unknown=0/2 commands=0/2
Mar 29 03:27:21 mail postfix/submission/smtpd: SSL_accept error from <redacted>: lost connection
Mar 29 03:27:21 mail postfix/submission/smtpd: disconnect from <redacted> commands=0/0
Mar 29 03:27:21 mail postfix/smtpd: disconnect from <redacted> commands=0/0
Mar 29 03:27:21 mail postfix/submission/smtpd: disconnect from <redacted> commands=0/0
Mar 29 06:10:41 mail postfix/submission/smtpd: lost connection after UNKNOWN from unknown[168.121.97.40]
Mar 29 06:10:41 mail postfix/submission/smtpd: disconnect from unknown[168.121.97.40] ehlo=1 auth=0/1 unknown=0/2 commands=1/4
Mar 29 13:25:25 mail postfix/smtpd: warning: non-SMTP command from 233.150.216.162.bc.googleusercontent.com[162.216.150.233]: GET / HTTP/1.1
Mar 29 13:25:25 mail postfix/smtpd: disconnect from 233.150.216.162.bc.googleusercontent.com[162.216.150.233] unknown=0/1 commands=0/1
Those events were not delivered via Notifications, they are also missing in “Alerts & Events” [1]
Mar 28 21:27:13 mail postfix/submission/smtpd: lost connection after UNKNOWN from unknown[103.45.246.23]
Mar 28 21:27:13 mail postfix/submission/smtpd: disconnect from unknown[103.45.246.23] unknown=0/2 commands=0/2
Mar 29 06:10:41 mail postfix/submission/smtpd: lost connection after UNKNOWN from unknown[168.121.97.40]
Mar 29 06:10:41 mail postfix/submission/smtpd: disconnect from unknown[168.121.97.40] ehlo=1 auth=0/1 unknown=0/2 commands=1/4
When editing the Event Definition, if I set Filter settings (“search within the last” and “execute search every”) to display last 24 hours, I can clearly see the missing events. [2]
Searching manually with the same Search Query as used in Event Definition returns missing messages. [3]
For the reference, I’m attaching also the Event Definition summary as it is configured and used (with no altering). [4]
Environment:
Debian 10 host with Docker 26.0.0
Graylog 5.2.5
Elasticsearch 7.10.2
Mongo 5.0.13
All three running with Docker
docker-compose.yml
version: '3'
services:
mongo:
image: mongo:5.0.13
container_name: graylog_mongo
restart: always
volumes:
- mongo_data:/data/db
networks:
- graylog
ports:
- 127.0.0.1:27017:27017/tcp
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2
container_name: graylog_es
restart: always
volumes:
- es_data:/usr/share/elasticsearch/data
environment:
- http.host=0.0.0.0
- transport.host=localhost
- network.host=0.0.0.0
- "ES_JAVA_OPTS=-Dlog4j2.formatMsgNoLookups=true -Xms512m -Xmx512m"
ulimits:
memlock:
soft: -1
hard: -1
deploy:
resources:
limits:
memory: 1g
networks:
- graylog
graylog:
image: graylog/graylog:5.2.5
container_name: graylog_graylog
restart: always
volumes:
- graylog_data:/usr/share/graylog/data
- ./certs:/opt/certs:ro
- ./graylog.conf:/usr/share/graylog/graylog.conf
- ./graylog-plugin-telegram-notification-2.4.0.jar:/usr/share/graylog/plugin/graylog-plugin-telegram-notification-2.4.0.jar
environment:
- GRAYLOG_PASSWORD_SECRET=<redacted>
- GRAYLOG_ROOT_PASSWORD_SHA2=<redacted>
- TZ=Europe/Prague
- GRAYLOG_ROOT_TIMEZONE=Europe/Prague
entrypoint: /usr/bin/tini -- wait-for-it elasticsearch:9200 -- /docker-entrypoint.sh
networks:
- graylog
depends_on:
- mongo
- elasticsearch
ports:
- 127.0.0.1:9000:9000/tcp
- 5044:5044/tcp
networks:
graylog:
driver: bridge
volumes:
mongo_data:
driver: local
es_data:
driver: local
graylog_data:
driver: local
graylog.conf:
is_leader = true
node_id_file = /etc/graylog/server/node-id
bin_dir = bin
data_dir = data
plugin_dir = plugin
http_publish_uri = http://172.0.0.1:9000/
http_enable_tls = true
http_tls_cert_file = /opt/certs/cert.pem
http_tls_key_file = /opt/certs/privkey.pem
stream_aware_field_types=false
allow_leading_wildcard_searches = true
allow_highlighting = false
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = data/journal
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://localhost/graylog
mongodb_max_connections = 1000
Graylog container has no logs around the time of missing messages. (I’ve fixed those warnings and errors by removing problematic Custom Fields at the time of gathering information for this topic.)
2024-03-28 19:26:53,533 WARN : org.graylog.events.processor.EventProcessorEngine - Key spec <[message]> for event <EventImpl{eventId=01HT3516WNKT1SMJ3P93CEDZRR, eventDefinitionType=aggregation-v1, eventDefinitionId=65e50cfceba8cf7317beded8, originContext=urn:graylog:message:es:graylog_0:bc909871-ed30-11ee-8201-0242ac190004, eventTimestamp=2024-03-28T18:26:37.279Z, processingTimestamp=2024-03-28T18:26:53.461Z, timerangeStart=null, timerangeEnd=null, streams=[], sourceStreams=[000000000000000000000001], message=New mail issue, source=3fa06454e9e7, keyTuple=[], priority=2, alert=true, fields={message=FieldValue{dataType=ERROR, value=}}, groupByFields={}, replayInfo=EventReplayInfo{timerangeStart=2024-03-28T18:26:17.585Z, timerangeEnd=2024-03-28T18:26:47.585Z, query=source:mail.<redacted> AND ("does not resolve to" OR "non-SMTP command" OR "authentication failed" OR "Relay access denied" OR "Domain not found" OR "warning: TLS library problem" OR "Service unavailable" OR "SSL_accept error" OR "failed to parse" OR "too many errors" OR "commands=0/0" OR "auth=0/1" OR "auth=0/2" OR "unknown=0" OR "SERVFAIL" OR "lost connection after AUTH" OR "lost connection after EHLO" OR "lost connection after UNKNOWN" OR "lost connection after STARTTLS"), streams=[000000000000000000000001]}}> cannot be fulfilled
2024-03-29 03:27:48,439 ERROR: org.graylog.events.fields.providers.TemplateFieldValueProvider - No value found for variable "message" in template "${message}"
2024-03-29 03:27:48,482 WARN : org.graylog.events.processor.EventProcessorEngine - Key spec <[message]> for event <EventImpl{eventId=01HT40HSHM9D974AT22V5VB714, eventDefinitionType=aggregation-v1, eventDefinitionId=65e50cfceba8cf7317beded8, originContext=urn:graylog:message:es:graylog_0:e67281c1-ed73-11ee-8201-0242ac190004, eventTimestamp=2024-03-29T02:27:23.833Z, processingTimestamp=2024-03-29T02:27:48.404Z, timerangeStart=null, timerangeEnd=null, streams=[], sourceStreams=[000000000000000000000001], message=New mail issue, source=3fa06454e9e7, keyTuple=[], priority=2, alert=true, fields={message=FieldValue{dataType=ERROR, value=}}, groupByFields={}, replayInfo=EventReplayInfo{timerangeStart=2024-03-29T02:27:17.585Z, timerangeEnd=2024-03-29T02:27:47.585Z, query=source:mail.<redacted> AND ("does not resolve to" OR "non-SMTP command" OR "authentication failed" OR "Relay access denied" OR "Domain not found" OR "warning: TLS library problem" OR "Service unavailable" OR "SSL_accept error" OR "failed to parse" OR "too many errors" OR "commands=0/0" OR "auth=0/1" OR "auth=0/2" OR "unknown=0" OR "SERVFAIL" OR "lost connection after AUTH" OR "lost connection after EHLO" OR "lost connection after UNKNOWN" OR "lost connection after STARTTLS"), streams=[000000000000000000000001]}}> cannot be fulfilled
2024-03-29 13:25:47,865 ERROR: org.graylog.events.fields.providers.TemplateFieldValueProvider - No value found for variable "message" in template "${message}"
Elasticsearch has no logs around the time of missing messages:
{"type": "server", "timestamp": "2024-03-28T18:25:22,016Z", "level": "INFO", "component": "o.e.m.j.JvmGcMonitorService", "cluster.name": "docker-cluster", "node.name": "0e6ecec725cb", "message": "[gc][272490] overhead, spent [273ms] collecting in the last [1s]", "cluster.uuid": "qbsNRgP8SaWhxjWwTjrIog", "node.id": "6sI6NmX1Q5eN0khzul73jg" }
{"type": "server", "timestamp": "2024-03-28T23:00:10,963Z", "level": "INFO", "component": "o.e.m.j.JvmGcMonitorService", "cluster.name": "docker-cluster", "node.name": "0e6ecec725cb", "message": "[gc][288968] overhead, spent [270ms] collecting in the last [1s]", "cluster.uuid": "qbsNRgP8SaWhxjWwTjrIog", "node.id": "6sI6NmX1Q5eN0khzul73jg" }
{"type": "server", "timestamp": "2024-03-28T23:00:10,963Z", "level": "INFO", "component": "o.e.m.j.JvmGcMonitorService", "cluster.name": "docker-cluster", "node.name": "0e6ecec725cb", "message": "[gc][288968] overhead, spent [270ms] collecting in the last [1s]", "cluster.uuid": "qbsNRgP8SaWhxjWwTjrIog", "node.id": "6sI6NmX1Q5eN0khzul73jg" }
{"type": "server", "timestamp": "2024-03-29T15:36:12,362Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "0e6ecec725cb", "message": "version[7.10.2], pid[7], build[oss/docker/747e1cc71def077253878a59143c1f785afa92b9/2021-01-13T00:42:12.435326Z], OS[Linux/4.19.0-26-amd64/amd64], JVM[AdoptOpenJDK/OpenJDK 64-Bit Server VM/15.0.1/15.0.1+9]" }
Mongo has no logs around the time f missing messages - last log entry dated 23.03.2024 and is obviously unrelated.
Maybe the “Key spec for event […] cannot be fulfilled” error was causing problems, that I don’t know at this time. Will keep watching for the issue to happen again
Any contributions highly appreciated! I hope someone can point me into right direction.
__
Four screenshots has been merged into single one due to limitation of “new users”