'Failed to report collector status to server' message


(Eugene Gordienko) #1

I see this on a few collectors:
Apr 11 00:45:21 collector-39 /usr/bin/graylog-collector-sidecar[125650]: time="2017-04-11T00:45:21Z" level=error msg="[UpdateRegistration] Failed to report collector status to server: "

What are possible causes to investigate?

Thanks,
Eugene


(Jan Doberstein) #2

What version of Collector and Graylog are you using?


(Eugene Gordienko) #3

We use
Graylog 2.2.2+691b4b7
sidecar from https://github.com/Graylog2/collector-sidecar/releases/download/0.1.0-alpha.1/collector-sidecar-0.1.0-1.x86_64.rpm

We plan to upgrade to the latest Graylog and sidecar.


(Jochen) #4

Please upgrade to a non-alpha (!) version of the Graylog Collector Sidecar.


(Eugene Gordienko) #5

What is the right way to upgrade the sidecar without data loss or duplication? Installing the rpm
over the existing one doesn’t work, at least on CentOS.
If we uninstall and then install the new version of the sidecar, it may result in duplicated log
data, since we lose the cursor that marks where we stopped collecting.


(Jochen) #6

What doesn’t work exactly?


(Eugene Gordienko) #7

OK, we upgraded to the latest sidecar:
rpm -qa |grep side
collector-sidecar-0.1.1-1.x86_64

and we have Graylog Version:
2.2.3+7adc951, codename Stiegl

and we still see this message every 5-15 seconds:
level=error msg="[UpdateRegistration] Failed to report collector status to server: "

What can we do to help investigate?


(Jochen) #8

Please post the configuration of the Graylog Collector Sidecar instances showing this error and check the logs of your Graylog node(s) for warning and error messages (or post them here).


(Eugene Gordienko) #9

Hi

Below are the collector version and configuration for a host where I see
/usr/bin/graylog-collector-sidecar[4742]: time="2017-05-14T17:24:50Z" level=error msg="[UpdateRegistration] Failed to report collector status to server: "

I don’t see any server-side log messages related to this issue. The only one I have is dated 05/01, and I
have restarted both the server and the collector quite a few times since then.

/usr/bin/filebeat -version
filebeat version 5.1.1 (amd64), libbeat 5.1.1

/etc/graylog/collector-sidecar/collector_sidecar.yml:

server_url: https://192.168.10.22:9000/api/
update_interval: 10
tls_skip_verify: true
send_status: true
list_log_files:
node_id: collector-192.168.10.22/
collector_id: file:/etc/graylog/collector-sidecar/collector-id
log_path: /var/log/graylog/collector-sidecar
log_rotation_time: 86400
log_max_age: 604800
tags:
    - sidecar-template-tag
backends:
    - name: nxlog
      enabled: false
      binary_path: /usr/bin/nxlog
      configuration_path: /etc/graylog/collector-sidecar/generated/nxlog.conf
    - name: filebeat
      enabled: true
      binary_path: /usr/bin/filebeat
      configuration_path: /etc/graylog/collector-sidecar/generated/filebeat.yml

/etc/graylog/collector-sidecar/generated/filebeat.yml

filebeat:
  prospectors:
  - document_type: log
    encoding: plain
    fields:
      collector_node_id: collector-192.168.10.22
      gl2_source_collector: 72143a4c-f79d-4064-b296-8f7f4fc9ac11
      harvester_buffer_size: "65536"
      publish_async: "true"
    ignore_older: 0
    input_type: log
    multiline:
      match: after
      negate: true
      pattern: ^[0-9]{4}-[0-9]{2}-[0-9]{2}
    paths:
    - /var/log/*.log
    - /var/log/messages
    - /var/log/secure
    - /tmp/pipeline_etl.log
    scan_frequency: 5s
    tail_files: false
output:
  logstash:
    hosts:
    - 192.168.10.10:5044
    - 192.168.10.11:5044
    - 192.168.10.12:5044
    loadbalance: true
    ssl:
      certificate: /etc/graylog/collector-sidecar/secrets/cert.pem
      key: /etc/graylog/collector-sidecar/secrets/pkcs8-plain.pem
      verification_mode: none
path:
  data: /var/cache/graylog/collector-sidecar/filebeat/data
  logs: /var/log/graylog/collector-sidecar
tags:
- sidecar-template-tag

(Eugene Gordienko) #10

Anyway, this is what I had on the server:

102396 2017-05-01T20:16:13.915-04:00 ERROR [NettyTransport] Error in Input [Beats/58f975d51d3c4a371b7305b5] (channel [id: 0x2ad66cb8, /192.168.10.22:55593 => /192.168.10.12:5044])
102397 org.jboss.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 32570000000532430000776a785eecbdeb92e3b69220ecefdb17e1ea8ffb440814c19b486d54c4b4bbca76efe9db7455cf9933530e054882122cde4c5075e989f378fb5cbb912025811409a9d4a5b23d333fdcaeee029089bc2191c84c9afffbbbefbe       fbffbefbeefac37f8cfe29a51589484546b3ff18558f051dcd4649be188d470125d568368a5942034aaad13fc6239eafcb10464cee483949f2c5846424ccb388e8f51c96       15eb6ade5a2666348938acbe24e51de5152de7c13a8e6939e7ec2b2ce63a8ee58ec6a3621d248c2fe7843f66e16836aaca351d8d478bc49c7301781ee64942c32a2f47b3       d1d4c4b645ec10c5533f42b6e1da28307d1779f134b6e3d02721c6a37f8c4729e59c2ce86836329c9961cc1c63ec625f7bfbe1c78f9aa669336d45cb8c265a98a729c922       2d61199d692c6355195db08c5565a4b374a155f4a1d24a92468cafe69c7da517d8300c8d1517d87675ac63cfd0eda996d12a257c75613a8e6e3a8e6e3ab66e680b52d17b       f278816d57c73af6b06e3ab6b6e211bd6321bda029d656fc625955c56c32c1aea9db531d9b3a36272b3e0ff32c668b491aa1202211bd43b68b562c5cf18a9495f6c3c78f       37f3b7ef5fff74757197262c5b7fd56eb3dbcc70668631738cb18b7deded871f3f6a9aa6cdb48a3e545a9a47548bf332a491169779aa856994b08cde668633338c995393       e8f2ea872f3f89592525d1075abdcde25c9b546931e1966f64b4d2b2bcd2e27c9d45638d923279d44a5aadcb6c10fc465cb43b5a7296671ab67413eb2676b43cd31e3c77       eeda1aaf4859b16c719b1935c7acf1d470b41d3ad7e48e650b2dcda375423556dcb95b800786f290b3390be2ea4913823cafe6fc91c7fcc86945c88b5579e4601a45478e       e4c7928447f3343f76d5b00ce7153622161f8b46f994d5a3324f8f5c9809f6546171e4f88405a74e3912428d5049325ee4653517d08e9dfadb9af0e5d1221396243d7af0       9a07735ee52559d0416c8a320f404b8235a72db9355d4995ee09ab60549c97da9294d13d29a956e51acb58c548c2beee004cc7d8b6a5a9fd00a663eccaba7a2c003c33f0       d8b21d6d679817b412b8ad58d8d8ba98250d3ef570471e4e53ac2d09d71296adc6da9ac3b658a3e8f570d795568f7218b05d5bd7758d370059a5ad1b196c26fad2be4958       b13b22308be81d0ba946535cb301cf0c6b6c1b860406f63f8ff372ce6212d2f966729e69af4cc730ff32d3768b68cd6f69a3bacd72a6b41c50404bf290542ccf66dab2aa     
  8ad9134f8c16aaf2da15c8794ccb1208f3ad2be3a984f586b0eb42db125c46a335783b428bf332045cc4d10b87d6e1393f9679faa54c06079694af938a6b79acadcb445b       f1b1f87fdf6edfd0acfa78ed3af5a9d45ad193f656958f02c75c4bf37556696f2e371c9d44f46ec24b
102398         at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:857) ~[graylog.jar:?]
102399         at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425) ~[graylog.jar:?]
102400         at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) ~[graylog.jar:?]
102401         at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) ~[graylog.jar:?]
102402         at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [graylog.jar:?]
102403         at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [graylog.jar:?]
102404         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [graylog.jar:?]
102405         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [graylog.jar:?]
102406         at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [graylog.jar:?]
102407         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [graylog.jar:?]
102408         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [graylog.jar:?]
102409         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [graylog.jar:?]
102410         at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [graylog.jar:?]
102411         at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [graylog.jar:?]
102412         at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [graylog.jar:?]
102413         at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176) [graylog.jar:?]
102414         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
102415         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
102416         at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]

(Eugene Gordienko) #11

And the server version I run is:
Version:
2.2.3+7adc951, codename Stiegl


(Eugene Gordienko) #12

It looks like the error message comes from this code:
} else if err != nil && err != io.EOF { // err is nil for GL 2.2 and EOF for 2.1 and earlier
    log.Error("[UpdateRegistration] Failed to report collector status to server: ", err)
}
What err value is expected in such a case?


(marius) #13

The error message from the underlying HTTP library.

How did you set up HTTPS for the API port? Is it a proxy that does the TLS termination, or do you use the Graylog built-in keystore/TLS implementation?

What happens when you manually call the API from the Sidecar host with: curl -v -k -u admin:admin https://1.2.3.4:9000/api/plugins/org.graylog.plugins.collector/collectors


(João Ciocca) #14

Hi there - hope 12d isn’t long enough to be considered a gravedigger =p
Anyway, I started testing Graylog and was following both this post and the documentation step by step, but I still fail at something. I’m getting two errors:

time="2017-06-03T14:18:30-03:00" level=error msg="[RequestConfiguration] Fetching configuration failed: Get https://172.28.97.4:443/api/plugins/org.graylog.plugins.collector/9118dac7-157f-4255-8bec-98f895f7b400?tags=%5B%22windows%22%5D: dial tcp 172.28.97.4:443: connectex: No connection could be made because the target machine actively refused it."
time="2017-06-03T14:18:30-03:00" level=error msg="[UpdateRegistration] Failed to report collector status to server: Put https://172.28.97.4:443/api/plugins/org.graylog.plugins.collector/collectors/9118dac7-157f-4255-8bec-98f895f7b400: dial tcp 172.28.97.4:443: connectex: No connection could be made because the target machine actively refused it."

I’m using the OVA virtual appliance, already updated to 2.2.3. I have already successfully installed the Threat Intel plugin and configured the pipeline.

But I can’t seem to get my sidecar to deliver the sysmon logs. Here’s collector_sidecar.yml:

server_url: https://172.28.97.4:443/api 
update_interval: 10
tls_skip_verify: true
send_status: true
list_log_files:
node_id: graylog-collector-sidecar
collector_id: file:C:\Program Files\graylog\collector-sidecar\collector-id
cache_path: C:\Program Files\graylog\collector-sidecar\cache
log_path: C:\Program Files\graylog\collector-sidecar\logs
log_rotation_time: 86400
log_max_age: 604800
tags: [windows]
backends:
    - name: nxlog
      enabled: false
      binary_path: C:\Program Files (x86)\nxlog\nxlog.exe
      configuration_path: C:\Program Files\graylog\collector-sidecar\generated\nxlog.conf
    - name: winlogbeat
      enabled: true
      binary_path: C:\Program Files\graylog\collector-sidecar\winlogbeat.exe
      configuration_path: C:\Program Files\graylog\collector-sidecar\generated\winlogbeat.yml
    - name: filebeat
      enabled: false
      binary_path: C:\Program Files\graylog\collector-sidecar\filebeat.exe
      configuration_path: C:\Program Files\graylog\collector-sidecar\generated\filebeat.yml

One thing I had to do differently from the step-by-step guide was download Beats to get some files, namely “winlogbeat.template.json”, “winlogbeat.template-es2x.json” and “winlogbeat.template-es6x.json”, and put them in the same folder as winlogbeat.exe. I also needed a winlogbeat.yml in the generated folder; otherwise all I got in winlogbeat_stderr.log was errors pointing to those files.

At least these were the files that the errors in the log identified as missing. I installed the sidecar using the installer from Graylog_Sysmon on GitHub, properly configured for my Graylog server.

Almost forgot! This is the result of running that curl command on the Windows machine where the sidecar is installed:

c:\Program Files\curl-7.54.0-win64-mingw\bin>curl -v -k -u admin:admin https://172.28.97.4:9000/api/plugins/org.graylog.plugins.collector/collectors
*   Trying 172.28.97.4...
* TCP_NODELAY set
* Connected to 172.28.97.4 (172.28.97.4) port 9000 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 172.28.97.4:9000
* stopped the pause stream!
* Closing connection 0
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 172.28.97.4:9000

(Eugene Gordienko) #15

Sorry for the late response. I was reassigned to a different project and had no access to
the GL environment.

In our case another team has set up a proxy, so:

$ curl -v -k -u admin:admin https://1.2.3.4:9000/api/plugins/org.graylog.plugins.collector/collectors
* About to connect() to proxy ourproxy.domain.com port 4080 (#0)
*   Trying 1.2.3.4... Connection timed out
* couldn't connect to host
* Closing connection #0
curl: (7) couldn't connect to host

That explains everything, I guess.
But what is the recommended way to set up sidecar connections when there is a proxy in between?

Thanks,
Eugene


(João Ciocca) #16

That made me realize that we also have a proxy… my user is one of the few
that can bypass it.

I guess I’ll have the same problem, then.