GL Cluster installation with settings using nginx

Hi,
I am very new to installing Graylog, and I could not find enough information for the setup that I have drawn in this topic.
I have many questions, and the documentation does not seem to answer most of them, so I need help.

1_ How should I configure Filebeat so that it can collect any log? (I even tried the Sidecar, but it does not get my logs into Graylog either.)
2_ Somehow the Graylog nodes appear in the Elasticsearch cluster, so my Elasticsearch cluster shows 6 nodes, but I cannot see any indices in the Graylog screens (though when I look at my Elasticsearch cluster with the ElasticHQ plugin, I have two indices).
3_ Why are there no step-by-step examples for some simple configurations to make this easier?
4_ How should I configure nginx so that HTTP on port 9000 works for the Sidecar and Filebeat to communicate back and forth with the other side?

I would really appreciate any help.
Thank you

Create a Beats input in Graylog and configure Filebeat to use a logstash output to send data to the Graylog node(s).

Up to version 2.2.3, Graylog joins the Elasticsearch cluster as a client node (i.e. not master eligible, doesn’t store data), which is why it is shown in the Elasticsearch cluster state.

There are step-by-step guides in the documentation and the Graylog Collector Sidecar aims to help users to configure all sorts of log collectors such as NXLOG, Filebeat, and Winlogbeat.

See http://docs.graylog.org/en/2.2/pages/configuration/web_interface.html#nginx for a working nginx configuration.

Filebeat (or any other beat) doesn’t communicate over HTTP, though. You’ll need to create a Beats input in Graylog for them.
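
As a rough sketch of what that means on the Filebeat side, the output section would point at the Beats input rather than at the port of the web interface or REST API. The hostname below is a placeholder, and 5044 assumes the Beats input is created with its default port:

```yaml
# Sketch only: graylog.example.org is a placeholder for a Graylog node
# (or for a TCP load balancer in front of the Beats inputs).
# 5044 assumes the Beats input listens on its default port.
output:
  logstash:
    hosts:
      - "graylog.example.org:5044"
```

When Filebeat is managed by the Graylog Collector Sidecar, this file is generated, so the host would be set in the collector configuration in the Graylog web interface rather than edited by hand.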

I would like to go step by step.
How should I configure my Sidecar so that it uses Filebeat to harvest logs and send the data to nginx? Here is my "collector_sidecar.yml":

server_url: http://89.22.104.82:9000/api/
update_interval: 10
tls_skip_verify: false
send_status: true
list_log_files:
node_id: graylog-collector-sidecar
collector_id: file:/etc/graylog/collector-sidecar/collector-id
cache_path: /var/cache/graylog/collector-sidecar
log_path: /var/log/graylog/collector-sidecar
log_rotation_time: 86400
log_max_age: 604800
tags:
    - linux
backends:
    - name: filebeat
      enabled: true
      binary_path: /usr/bin/filebeat
      configuration_path: /etc/graylog/collector-sidecar/generated/filebeat.yml

The configuration of the Graylog Collector Sidecar looks correct (if http://89.22.104.82:9000/api/ is the address of the Graylog REST API), although be aware that Filebeat doesn’t send data to nginx (or the Graylog REST API), but to a Beats input in Graylog which you’ll have to create on the System / Inputs page in the Graylog web interface.

.82 is the nginx IP.
But there is something I did not understand.
When I start the Sidecar, it connects to Graylog, asks for its configuration, receives it, and starts Filebeat.
What are the steps of this process?
For example:
sidecar --> graylog rest (protocol: http)
graylog rest --> sidecar (protocol: tcp)…
I know that's not right, but knowing the correct version would help.

I have a Beats input set up in Graylog (though I can't be sure it's correct), but I cannot send data to it yet, because the traffic needs to pass through nginx first.

The Graylog Collector Sidecar connects to the Graylog REST API (HTTP) to send its ping and fetch the intended configuration.
After that, the Graylog Collector Sidecar configures the actual log shipper (Filebeat in your case) accordingly and starts it.
The actual log shipper (Filebeat) then starts and sends data to the Graylog Beats input (port 5044/tcp by default).
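
To map these steps onto the Sidecar configuration, here is an annotated sketch that reuses the settings posted above. The hostname is a placeholder, and the comments describe the assumed flow rather than anything the Sidecar itself prints:

```yaml
# collector_sidecar.yml, reduced to the settings involved in the HTTP leg.

# Step 1: the Sidecar sends its ping to this REST API URL over HTTP and
# fetches its configuration from it. graylog.example.org is a placeholder;
# an HTTP reverse proxy could sit in front of this URL.
server_url: http://graylog.example.org:9000/api/

# Step 2: the Sidecar writes the fetched configuration to configuration_path
# and starts the Filebeat binary with it.
backends:
    - name: filebeat
      enabled: true
      binary_path: /usr/bin/filebeat
      configuration_path: /etc/graylog/collector-sidecar/generated/filebeat.yml

# Step 3 is not configured in this file: Filebeat opens its own TCP
# connection to the Beats input (5044/tcp by default), independent of
# the REST API connection above.
```

The point of the split is that the HTTP connection (Sidecar to REST API) and the Beats connection (Filebeat to input) use different ports and protocols, so they are proxied or load balanced separately.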

What’s the configuration of your Beats input?

Name : access-logs
Forward to (req) : Beats-Output[filebeat]
type : [FileBeat] file input
Path to Logfile : [/home/cb/Apache-Log/*.log']
Encoding : plain
Type of Input : log
Ignore files older than : 0
Scan frequency in seconds : 10s
Exclude []
Include []
Tail Files

Here is my
/etc/graylog/collector-sidecar/generated/filebeat.yml:

filebeat:
  prospectors:
  - document_type: log
    encoding: plain
    fields:
      collector_node_id: graylog-collector-sidecar
      gl2_source_collector: ed46ff49-e9cb-42d1-9bac-bb3fe1cce285
    ignore_older: 0
    input_type: log
    paths:
    - /home/cb/Fake-Apache-Log-Generator-master/*.log'
    scan_frequency: 10s
    tail_files: true
output:
  logstash:
    hosts:
    - 89.22.104.82:9000
path:
  data: /var/cache/graylog/collector-sidecar/filebeat/data
  logs: /var/log/graylog/collector-sidecar
tags:
- linux

So when I give it the IP of any of my Graylog servers, it gets the config and starts Filebeat:

INFO[0000] Fetching configurations tagged by: [linux]
INFO[0000] Starting signal distributor
INFO[0000] [filebeat] Starting (exec driver)
INFO[0010] [filebeat] Configuration change detected, rewriting configuration file.
INFO[0010] [filebeat] Stopping
INFO[0012] [filebeat] Starting (exec driver)
server_url: http://89.22.104.80:9000/api/

This works, but when I change it to .82/api/ for nginx, it does not work. (I want load balancing and HA, so I need a load balancer.)

That’s not the configuration of the Beats input.

You can find it on the System / Inputs page in the Graylog web interface.

That's another interesting problem: I have a 3-node cluster for Graylog, but the input runs on only 2 of the nodes.

Settings of the Beats input:
Global
Title : Beats
Bind address : 127.0.0.1 (0.0.0.0 was not accepted)
Port : 9000
Receive Buffer Size : 1048576
The rest of it is empty.

127.0.0.1 is the local loopback interface and only accessible from the very same machine.

You have to use a public IP address or hostname which is accessible by Filebeat.

Additionally, port 9000 is already taken by the Graylog web interface, so that naturally won’t work. Just use the default (port 5044/tcp).

But that's the IP it is trying to bind and listen on, isn't it?
My Graylog nodes are .79, .80 and .81. Can I use any of their IPs?
Or can you give an example for it? Maybe I did not understand the purpose of that IP.

Each input can be bound to a different network interface, but usually it’s fine to simply use 0.0.0.0 for “all network interfaces”.

In your case that didn’t work because you’ve configured an already occupied port (9000/tcp) instead of the default port (5044/tcp) of the Beats input.

Maybe this little extra could be added to the documentation.
But why does it start on only 2 of the Graylog nodes instead of 3?

For that you have to check the logs of the respective services.

So I set my port to 8000 and set up nginx TCP load balancing.
This has given me the following in the Beats input:
Throughput / Metrics
1 minute average rate: 0 msg/s
Network IO: 0B 0B (total: 1.3KB 0B )
Active connections: 0 (11 total)
Empty messages discarded: 0

If I am not mistaken, this shows that my Beats are connected to Graylog through nginx. Is that correct?

Every time I start Filebeat, the number of connections increases.
When I look at the Filebeat logs, every time I get this error, one more connection is also added:
2017-07-18T11:04:23+02:00 ERR Connecting error publishing events (retrying): Get http://89.22.104.82:8000: EOF
What is the problem here? I cannot understand it.