Question about cluster

Hello friends,

I’m trying to understand how to make a graylog cluster.

It would be a VM (master) and a physical machine (slave):

MASTER    --        SLAVE
GRAYLOG   ->        GRAYLOG (REPLICA)
MONGODB   ->        MONGODB (REPLICA)
ES        ->        ES (TWO OPTIONS TO STORE) (TWO OR MORE NODES)

See, I would like to replicate the Graylog settings, but not the Elasticsearch/OpenSearch data.
And I don’t intend to have a load balancer on the front.

In short:
My idea is to have two options for storing the data when defining an “input”: NODE-A or NODE-B.

Is it possible?

Reading about it here:

https://go2docs.graylog.org/5-1/setting_up_graylog/multi-node_setup.html?tocpath=Setting%20up%20Graylog|Getting%20Started|Initial%20Configuration%20Settings|_____3

It wasn’t clear to me whether it’s possible to cluster in that shape.

Grateful!

Hey @isotecviac2022

If you’re referring to settings, a.k.a. configurations: metadata is kept in MongoDB, so replicating that database would be the way to go. I have transferred a Graylog database (Mongo) to a completely new host and started Graylog up without an issue; keep in mind the Mongo versions should match.
EDIT: As for Graylog, that is your front end, and of course ES/OS is where the documents are stored. As you know, clustering is for fault redundancy, and how big your environment is will determine your cluster setup. If you only need it for backup and this is a virtual machine, backing up the virtual drive daily or weekly would do fine; or make two nodes with Graylog, Mongo & ES/OS on each. It depends on what you want to do.

@gsmith
Thanks for your reply. Sorry for taking too long to respond.
I understand now that clustering is for data redundancy, which makes the setup safer against failures.

In my scenario, I understand that it is important for me to have a redundancy of mongodb because all the configuration metadata is there.

However, I would not like to have redundancy or data replication in ES.
I would like to know if, when I create an “input”, I can redirect it to different ES nodes without replication between them.

These drawings try to represent what I would like to do:


[image: clester_grl1]

or

[image: second diagram]


Grateful!

Hey @isotecviac2022

For ES/OS there is a setting called discovery.type: single-node

With that setting, replication between ES/OS nodes should not happen.
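For reference, a minimal sketch of what that looks like in opensearch.yml (or elasticsearch.yml); the cluster and node names here are placeholders, not required values:

```yaml
# opensearch.yml (or elasticsearch.yml) -- one standalone instance per host.
# cluster.name and node.name are placeholders; adjust to your environment.
cluster.name: graylog-storage-a
node.name: node-a
network.host: 0.0.0.0

# Run this node on its own, without discovering or joining other nodes:
discovery.type: single-node
```

With discovery.type set to single-node, each host is its own independent store, so there is nothing for data to replicate to.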

Now, as for sending data from an INPUT to a specific ES/OS node, I’m not 100% sure, but I have never done that or seen it done.

BUT, a stream has a setting called “Outputs”, so basically you can forward data to another Graylog server. It’s not forwarding to a single ES/OS instance, though.

Another idea would be using Logstash on the Graylog server and redirecting it to an ES/OS instance; that I have done.
Example:

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.
input {
  beats {
    port => 5044
    tags => [ "beat" ]
  }
}
input {
  udp {
    port => 5144
    tags => [ "syslog" ]
  }
}
input {
  http {
    port      => 12345
    tags      => [ "fluent" ]
    add_field => { "[@metadata][input-http]" => "" }
  }
}

filter {
  if [@metadata][input-http] {
    date {
      match => [ "date", "UNIX" ]
      remove_field => [ "date" ]
    }
    mutate {
      remove_field => ["headers","host"]
    }
  }
}


filter {
  if "syslog" in [tags] {
    grok {
      match     => [ "message", "%{SYSLOG5424PRI}%{GREEDYDATA:message}" ]
      overwrite => [ "message" ]
    }
    kv {
      source      => "message"
      value_split => "="
    }
  }
}

filter {
  if "syslog" in [tags] {
    mutate {
      remove_field => [ "addr","appcat","craction","crlevel","crscore","devtype","dstdevtype","dstosname","dstserver","fazlograte","freediskstorage","interface","log.syslog.priority","masterdstmac","mastersrcmac","osname","policytype","poluuid","setuprate","srchwvendor","srcserver","total","totalsession","used","user","vd" ]
    }
  }
}

output {
  if "beat" in [tags] {
    opensearch {
      hosts => [ "https://graylog-server_1:9000" ]
      auth_type => {
        type     => "basic"
        user     => "admin"
        password => "changeit"
      }
      ecs_compatibility => disabled
      ssl => true
      ssl_certificate_verification => false
      cacert => "/opt/logstash-8.6.1/root-ca.pem"
    }
  }
  if "syslog" in [tags] {
    opensearch {
      hosts => [ "https://elasticsearch_node_1:9200" ]
      auth_type => {
        type     => "basic"
        user     => "admin"
        password => "changeit"
      }
      ecs_compatibility => disabled
      ssl => true
      ssl_certificate_verification => false
      cacert => "/opt/logstash-8.6.1/root-ca.pem"
      index  => "firewall-%{+YYYY.MM.dd}"
    }
  }
  if "fluent" in [tags] {
    opensearch {
      hosts => [ "https://elasticsearch_node2:9200" ]
      auth_type => {
        type     => "basic"
        user     => "admin"
        password => "changeit"
      }
      ecs_compatibility => disabled
      ssl => true
      ssl_certificate_verification => false
      cacert => "/opt/logstash-8.6.1/root-ca.pem"
      index  => "fluent-bit-%{+YYYY.MM.dd}"
    }
  }
}

Note: I use another open-source tool attached to the single ES/OS instance to analyze my data, using Graylog as a conduit.


@isotecviac2022, I’m not clear what you mean. All of your ES/OS nodes would make up one cluster. They do not replicate between them unless you enable replicas.

Do you mean to have four separate, but not clustered, instances of ES/OS? If so, why? That would be a very unusual configuration. What is your overall goal? What are you trying to achieve?

As for MongoDB replicas, you need a minimum of three nodes for a replica set, but you can run the mongodump command on a cron job to back up Mongo as often as you like.
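As a sketch of that cron-based backup (the path, schedule, and database name here are placeholders for illustration, assuming a locally reachable mongod):

```crontab
# /etc/cron.d/graylog-mongo-backup -- hypothetical example entry.
# Dump the "graylog" database every night at 02:30 into a dated folder.
30 2 * * * root mongodump --db graylog --out /backup/mongo/$(date +\%F)
```

Restoring onto a fresh host would then be a matter of mongorestore against that dump directory, keeping in mind the Mongo-version-match caveat mentioned earlier in the thread.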


Hi @joe.gross
Thanks for interacting too.
Because I receive logs from many different IT assets, I’m starting to run out of storage space. So I think that if I had a way to store the already-processed logs in different places, it would help me.
In my case I also have few local resources; that is, I have no way to scale out with more and more VMs.
So I thought:
I can process everything in one place and store it on other servers, scaling as the volume of records grows.

Note that if by chance one of these remote ES/OS “nodes” fails, it is not a disaster for me.

I have a local server (VM) with 8 GB RAM and 8 CPU cores, but only 120 GB of storage.
I also have a physical server on the same network with 4 GB RAM and 2 TB of storage.

Grateful.

Then,
I believe this is the way to go!
Make the cluster, and in the case of ES/OS, don’t activate replicas.
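For the record, Graylog exposes the replica count for its default index set in server.conf (in newer versions this is also adjustable per index set in the web UI). A minimal sketch, assuming the stock config file layout:

```conf
# graylog.conf -- defaults for the index sets Graylog creates in ES/OS.
# elasticsearch_replicas = 0 means no replica copies: less disk used,
# but data on a failed node is lost (acceptable in this scenario).
elasticsearch_shards = 2
elasticsearch_replicas = 0
```

The shard count of 2 here is just an illustrative value; what matters for this thread is replicas = 0.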

Grateful.


Good luck @isotecviac2022.

A couple of caveats: a cluster made up of nodes located at different physical sites may struggle with latency.

Also, an OS node with only 4 GB of RAM may struggle with a lack of Java heap memory. If there is any way to add RAM to that server, it will make a big difference in performance. The rule of thumb is to assign half of system RAM to the heap, not to exceed 31 GB per node.
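Applying that rule of thumb to the 4 GB machine would look roughly like this in the node’s jvm.options (a sketch; the exact file location depends on how ES/OS was installed):

```conf
# jvm.options on the 4 GB node -- half of system RAM, per the rule of thumb.
# Keep -Xms and -Xmx equal so the heap is allocated up front.
-Xms2g
-Xmx2g
```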


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.