Not receiving all logs in Graylog


#1

Hi,

I appreciate your priceless help and support with the questions we post. Currently, my configuration takes the logs from Filebeat and sends them to Elasticsearch. In the Graylog UI I am receiving logs from Filebeat, but not all of them. For instance, I cannot receive the last log line from the Tomcat container, which is from Monday, March 11th:

2019-03-11 06:22:48 [Thread-4            ] DEBUG:   ca.bc.gov.WEB.dbpool.WEBConnectionCacheMonitor Connection cache monitor in thread: Thread-4 shutting down for pool: WEB

On Filebeat the log is processed:

2019-03-14T16:18:50.377-0700    DEBUG   [publish]       pipeline/processor.go:308       Publish event: {
  "@timestamp": "2019-03-14T23:18:45.376Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.6.0"
  },
  "host": {
    "name": "tomcat",
    "architecture": "x86_64",
    "os": {
      "codename": "Core",
      "platform": "centos",
      "version": "7 (Core)",
      "family": "redhat",
      "name": "CentOS Linux"
    },
    "id": "6aaed308aa5a419f880c5e45eea65414",
    "containerized": true
  },
  "source": "/app/logs/WEB/WEB-rest-api/WEB-rest-api.log",
  "log": {
    "file": {
      "path": "/app/logs/WEB/WEB-rest-api/WEB-rest-api.log"
    }
  },
  "message": "2019-03-11 06:22:48 [Thread-4            ] DEBUG:   ca.bc.gov.WEB.dbpool.WEBConnectionCacheMonitor Connection cache monitor in thread: Thread-4 shutting down for pool: WEB",
  "beat": {
    "name": "tomcat",
    "hostname": "tomcat",
    "version": "6.6.0"
  },
  "offset": 6771071,
  "prospector": {
    "type": "log"
  },
  "input": {
    "type": "log"
  },
  "meta": {
    "cloud": {
      "instance_name": "tomcat",
      "machine_type": "Standard_D8s_v3",
      "region": "CanadaCentral",
      "provider": "az",
      "instance_id": "6452bcf4-7f5d-4fc3-9f8e-5ea57f00724b"
    }
  }
}

In elasticsearch I am not getting anything related to the log from filebeat:

[2019-03-15T11:45:11,884][INFO ][o.e.g.GatewayService     ] [D6DChHc] recovered [3] indices into cluster_state
[2019-03-15T11:45:12,277][INFO ][o.e.c.r.a.AllocationService] [D6DChHc] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[graylog_1][0]] ...]).
[2019-03-15T11:45:14,140][INFO ][o.e.c.m.MetaDataIndexTemplateService] [D6DChHc] adding template [filebeat-6.6.0] for index patterns [filebeat-6.6.0-*]
[2019-03-15T11:45:14,265][INFO ][o.e.c.m.MetaDataCreateIndexService] [D6DChHc] [filebeat-6.6.0-2019.03.15] creating index, cause [auto(bulk api)], templates [filebeat-6.6.0], shards [3]/[1], mappings [doc]
[2019-03-15T11:45:14,650][INFO ][o.e.c.m.MetaDataMappingService] [D6DChHc] [filebeat-6.6.0-2019.03.15/QITBikbzRISYq7QkkXXpGQ] update_mapping [doc]

In graylog I am getting this while processing the logs in filebeat:

2019-03-15 11:58:44,120 INFO : org.graylog2.inputs.InputStateListener - Input [GELF UDP/5c88357e389808ada1e8c2cd] is now STARTING
2019-03-15 11:58:44,232 INFO : org.graylog2.inputs.InputStateListener - Input [GELF UDP/5c88357e389808ada1e8c2cd] is now RUNNING
2019-03-15 11:58:44,234 WARN : org.graylog2.inputs.transports.UdpTransport - receiveBufferSize (SO_RCVBUF) for input GELFUDPInput{title=GELF UDP Input, type=org.graylog2.inputs.gelf.udp.GELFUDPInput, nodeId=c3201551-f5d9-41c4-8b89-20b4c843c2ca} (channel [id: 0xaf61abfd, L:/0:0:0:0:0:0:0:0%0:12201]) should be 262144 but is 425984.
2019-03-15 11:58:44,234 WARN : org.graylog2.inputs.transports.UdpTransport - receiveBufferSize (SO_RCVBUF) for input GELFUDPInput{title=GELF UDP Input, type=org.graylog2.inputs.gelf.udp.GELFUDPInput, nodeId=c3201551-f5d9-41c4-8b89-20b4c843c2ca} (channel [id: 0xfae76cc0, L:/0:0:0:0:0:0:0:0%0:12201]) should be 262144 but is 425984.
2019-03-15 11:58:44,235 WARN : org.graylog2.inputs.transports.UdpTransport - receiveBufferSize (SO_RCVBUF) for input GELFUDPInput{title=GELF UDP Input, type=org.graylog2.inputs.gelf.udp.GELFUDPInput, nodeId=c3201551-f5d9-41c4-8b89-20b4c843c2ca} (channel [id: 0xc9ab7125, L:/0:0:0:0:0:0:0:0%0:12201]) should be 262144 but is 425984.
2019-03-15 11:58:44,235 WARN : org.graylog2.inputs.transports.UdpTransport - receiveBufferSize (SO_RCVBUF) for input GELFUDPInput{title=GELF UDP Input, type=org.graylog2.inputs.gelf.udp.GELFUDPInput, nodeId=c3201551-f5d9-41c4-8b89-20b4c843c2ca} (channel [id: 0x918ee7ff, L:/0:0:0:0:0:0:0:0%0:12201]) should be 262144 but is 425984.

I also tried changing the configuration to send the output from Filebeat to Logstash instead of sending it directly to Elasticsearch. With this change, I could see the same log line, as processed by Filebeat, in the Logstash log:

[2019-03-15T10:32:25,982][DEBUG][logstash.outputs.gelf    ] Sending GELF event {:event=>{"short_message"=>["2019-03-11 06:22:48 [Thread-4            ] DEBUG:   ca.bc.gov.WEB.dbpool.WEBConnectionCacheMonitor Connection cache monitor in thread: Thread-4 shutting down for pool: WEB", " Connection cache monitor in thread: Thread-4 shutting down for pool: WEB"], "full_message"=>"2019-03-11 06:22:48 [Thread-4            ] DEBUG:   ca.bc.gov.WEB.dbpool.WEBConnectionCacheMonitor Connection cache monitor in thread: Thread-4 shutting down for pool: WEB, Connection cache monitor in thread: Thread-4 shutting down for pool: WEB", "host"=>"{\"name\":\"tomcat\",\"os\":{\"name\":\"CentOS Linux\",\"version\":\"7 (Core)\",\"codename\":\"Core\"}}", "_source"=>"/app/logs/WEB/WEB-rest-api/WEB-rest-api.log", "_class"=>"ca.bc.gov.WEB.dbpool.WEBConnectionCacheMonitor, %{JAVACLASS}", "_tags"=>"beats_input_codec_plain_applied", "_beat_hostname"=>"tomcat", "_beat_name"=>"tomcat", "_meta_cloud"=>{}, "_log_file"=>{"path"=>"/app/logs/WEB/WEB-rest-api/WEB-rest-api.log"}, "level"=>6}}

However, I am not getting this log line in Graylog. I am confused about why I receive some logs but not others. It is worth noting that in Graylog I am getting the INFO-level logs from this class at similar times, but not the DEBUG-level ones.

What are your ideas about this issue? Please let me know if you need the details of any configuration.

Thanks a lot


(Jan Doberstein) #2
[2019-03-15T11:45:14,140][INFO ][o.e.c.m.MetaDataIndexTemplateService] [D6DChHc] adding template [filebeat-6.6.0] for index patterns [filebeat-6.6.0-*]
[2019-03-15T11:45:14,265][INFO ][o.e.c.m.MetaDataCreateIndexService] [D6DChHc] [filebeat-6.6.0-2019.03.15] creating index, cause [auto(bulk api)], temp

This looks to me like you are sending directly from Filebeat to Elasticsearch, so you bypass Graylog.


#3

Thanks Jan,

I have changed Filebeat to send its output to Logstash, which forwards it to Graylog as GELF, but I am still not able to retrieve the latest logs. The Elasticsearch logs have changed and now show the graylog_0 index being created:

[2019-03-18T14:36:44,647][INFO ][o.e.c.m.MetaDataIndexTemplateService] [D6DChHc] adding template [kibana_index_template:.kibana] for index patterns [.kibana]
[2019-03-18T14:36:44,756][INFO ][o.e.c.m.MetaDataIndexTemplateService] [D6DChHc] adding template [kibana_index_template:.kibana] for index patterns [.kibana]
[2019-03-18T14:37:00,573][INFO ][o.e.c.m.MetaDataDeleteIndexService] [D6DChHc] [graylog_0/K_FFV007RTqKzrchNxSbTQ] deleting index
[2019-03-18T14:37:03,759][INFO ][o.e.c.m.MetaDataIndexTemplateService] [D6DChHc] adding template [graylog-internal] for index patterns [graylog_*]
[2019-03-18T14:37:03,766][INFO ][o.e.c.m.MetaDataCreateIndexService] [D6DChHc] [graylog_0] creating index, cause [api], templates [graylog-internal], shards [1]/[0], mappings [message]
[2019-03-18T14:37:03,874][INFO ][o.e.c.r.a.AllocationService] [D6DChHc] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[graylog_0][0]] ...]).
[2019-03-18T14:37:28,853][INFO ][o.e.c.m.MetaDataMappingService] [D6DChHc] [graylog_0/f1hQYCDMQOmwrxZMDULGog] update_mapping [message]
[2019-03-18T14:37:28,893][INFO ][o.e.c.m.MetaDataMappingService] [D6DChHc] [graylog_0/f1hQYCDMQOmwrxZMDULGog] update_mapping [message]

In the Filebeat logs I can see that almost all events are published, except the events of the latest day. Of course, to process the logs again I reset the offset to 0 and deleted the indices in Elasticsearch before starting Filebeat.

The configuration in filebeat is as follows:

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /apps/logs/WEB/web-api/web-api.log
    
  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  # Ignore files which were modified more then the defined timespan in the past
  # Time strings like 2h (2 hours), 5m (5 minutes) can be used.
  ignore_older: 0

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  multiline.negate: true

  # Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
  multiline.match: after


#============================= Filebeat modules ===============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

#==================== Elasticsearch template setting ==========================

setup.template.settings:
  index.number_of_shards: 3
  #index.codec: best_compression
  #_source.enabled: false

#================================ General =====================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging


#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here, or by using the `-setup` CLI flag or the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

#============================== Kibana =====================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
#setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

#============================= Elastic Cloud ==================================

# These settings simplify using filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

#================================ Outputs =====================================

# Configure what output to use when sending the data collected by the beat.

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["serverlog.dev.ca:9200"]

  # Enabled ilm (beta) to use index lifecycle management instead daily indices.
  #ilm.enabled: false

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["serverlog.dev.ca:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  ssl.certificate_authorities: ["/etc/pki/tls/certs/logstash.crt"]

  # Certificate for SSL client authentication
  ##ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  ##ssl.key: "/etc/pki/client/cert.key"

#================================ Processors =====================================

# Configure processors to enhance or manipulate events generated by the beat.

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~

#================================ Logging =====================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

#============================== Xpack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#xpack.monitoring.enabled: false

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well. Any setting that is not set is
# automatically inherited from the Elasticsearch output configuration, so if you
# have the Elasticsearch output configured, you can simply uncomment the
# following line.
#xpack.monitoring.elasticsearch:

In Logstash, I have the following configuration, which sends the output to Graylog as GELF:

input {
  beats {
    port => 5044
  }  
}

filter {
        grok {
            match => {
                "message" => "%{TIMESTAMP_ISO8601}(?:%{SPACE})%{SYSLOG5424SD}(?:%{SPACE})%{LOGLEVEL}*(?:%{SPACE}):*(?:%{SPACE})%{WORD}*(?:%{SPACE})%{JAVACLASS:class}%{GREEDYDATA:message}"
            }
        }
}

output {
    gelf {
      host => "10.99.2.19"
      port => 12201
    }
}

I keep having problems getting some lines of the logs. I do not understand how some log entries are sent by Filebeat and ingested by Logstash, but not displayed in Graylog. How can I ensure that a record Logstash sends as a GELF event really ends up in the Graylog index?

[2019-03-18T16:53:13,073][DEBUG][logstash.outputs.gelf    ] Sending GELF event {:event=>{"short_message"=>["2019-03-16 15:58:24 [https-jsse-nio-8020-exec-6] DEBUG: ASCLP4482D226FF02  ca.bc.gov.nrs.common.rest.resource.BaseResource etag=null", " etag=null"], "full_message"=>"2019-03-16 15:58:24 [https-jsse-nio-8020-exec-6] DEBUG: ASCLP4482D226FF02  ca.bc.gov.nrs.common.rest.resource.BaseResource etag=null, etag=null", "host"=>"{\"os\":{\"name\":\"CentOS Linux\",\"version\":\"7 (Core)\",\"codename\":\"Core\"},\"name\":\"d1tomcat\"}", "_log_file"=>{"path"=>"/apps_ux/logs/AS/as-as-api/as-as-api.log"}, "_source"=>"/apps_ux/logs/AS/as-as-api/as-as-api.log", "_beat_name"=>"d1tomcat", "_beat_hostname"=>"d1tomcat", "_class"=>"ca.bc.gov.nrs.common.rest.resource.BaseResource", "_tags"=>"beats_input_codec_plain_applied", "_meta_cloud"=>{}, "level"=>6}}

Please let me know if I should provide additional information.

Thanks a lot for your help


(Jan Doberstein) #4

Why did you put Logstash in?

Filebeat (output: logstash) -> Graylog (input: Beats)

Read the files with Filebeat and, in Filebeat, use the output named logstash. Send to a Graylog instance that has a Beats input configured.

Clear?


#5

Thanks Jan,

Not really clear. Breaking it down, I have the following:

In filebeat.yml:

#=========================== Filebeat inputs =============================
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /apps/logs/APP/app-api/app-api.log
	
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["log1.cgi-dev.ca:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  ssl.certificate_authorities: ["/etc/pki/tls/certs/logstash.crt"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

In logstash.conf I am sending the output from Filebeat to Graylog:

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash.crt"
    ssl_key => "/etc/pki/tls/private/logstash.key"
  }
  
}

output {
    gelf {
      host => "10.99.2.19"
      port => 12201
    }
}

In Graylog, I created a GELF UDP input.

But I understand that I should change from the GELF UDP input to a Beats input.

Is that correct?

Thanks Jan


(Jan Doberstein) #6

No, that is not the way:

Use Filebeat and send directly to Graylog. No Logstash is needed.

In Filebeat, use the output named logstash to send messages to a Beats input on Graylog.
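
The setup described above could look roughly like the following filebeat.yml. This is a minimal sketch, not a tested configuration: the log path and multiline settings are copied from the earlier posts, the host 10.99.2.19 is assumed to be the Graylog node (it is the address used in the GELF output earlier in the thread), and port 5044 is assumed to be the port the Graylog Beats input listens on.

```yaml
# filebeat.yml (sketch): ship straight to a Graylog Beats input, with no Logstash in between.
filebeat.inputs:
- type: log
  enabled: true
  paths:
    # Path taken from post #5.
    - /apps/logs/APP/app-api/app-api.log
  # Multiline settings taken from post #3: lines that do not start with a
  # date are appended to the preceding line that does.
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after

# Filebeat allows only one enabled output, so output.elasticsearch must stay
# commented out or removed.
output.logstash:
  # Assumption: 10.99.2.19 is the Graylog node and its Beats input listens on 5044.
  hosts: ["10.99.2.19:5044"]
  # Assumption: TLS is not enabled on the Graylog Beats input. Add ssl.*
  # settings here only if that input is configured with a matching certificate.
```

With this in place, the Logstash instance and the GELF UDP input are no longer part of the path. Any parsing that the grok filter did would have to be recreated in Graylog, for example with extractors or pipeline rules.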


#7

Thanks Jan, I have created a Beats input using the same port as Logstash and I was able to receive all the logs. I understand that Graylog takes over the tasks that were done by Logstash, through its extractors and rules.

Thanks for all your help