Help to optimize processing

I’m having performance issues with our Graylog installation. It can’t keep up with incoming messages, and I need some tips on improving throughput.

The version is 2.4.6, running in Docker on Ubuntu. I have a standard single-node setup with Graylog, Elasticsearch and MongoDB in separate Docker containers.

Processing shows 100% utilization, and memory consumption jumps up in three steps, one each second, from around 1 GB to 1.6 GB (of 1.8 GB), and then returns to 1 GB.

The process buffer is full (100%).

My processing averages 500-1,000 messages per second.

docker stats shows a CPU utilization of 90-140%.

I guess I might have to bring in more CPU power, or maybe tweak the number of processor buffers?
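For reference, the relevant knobs are the *_processors settings in server.conf; on the official Docker image they can also be passed as GRAYLOG_-prefixed environment variables. A minimal sketch of what I’d be tweaking, assuming a 4-core host (the values here are only an illustration, not a recommendation):

```
# server.conf — buffer processor counts (illustrative values for a 4-core host)
processbuffer_processors = 2
outputbuffer_processors = 1
inputbuffer_processors = 1

# Equivalent environment variables for the official Docker image:
#   GRAYLOG_PROCESSBUFFER_PROCESSORS=2
#   GRAYLOG_OUTPUTBUFFER_PROCESSORS=1
#   GRAYLOG_INPUTBUFFER_PROCESSORS=1
```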

I only have a single rule and one pipeline set up. Both are really simple and quite limited in what they do.

The AWS lookup and GeoIP resolver are disabled.

Any hints on what I can do to get better performance?

Best regards, Peter Meldgaard

You may want to consider moving your Elasticsearch to another machine so it can handle the influx from GL. I used elasticdump (https://www.npmjs.com/package/elasticdump) to do this since I don’t have the resources to start up an Elasticsearch cluster. There are also a lot of other posts on optimizing in the GL community if you search for them… here is one, though they have clusters rather than Docker instances: Status Green, all systems go. How to optimize?
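In case it helps, roughly what that migration looked like for me; the hostnames and index name (graylog_0) are placeholders, and you’d repeat (or script) it per index before pointing Graylog at the new machine:

```
# Copy one index's mapping and data to the new Elasticsearch machine
elasticdump --input=http://old-es:9200/graylog_0 --output=http://new-es:9200/graylog_0 --type=mapping
elasticdump --input=http://old-es:9200/graylog_0 --output=http://new-es:9200/graylog_0 --type=data

# Then update Graylog's server.conf to use the new machine:
#   elasticsearch_hosts = http://new-es:9200
```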

500-1k messages is not a lot…
Before you do anything (senseless…), analyze the problem.
Once you know where the problem is, solve it.
That’s simple, isn’t it? :smiley:

Check top on your machine. Is ES or GL causing the load?
Check server.conf and count the number of processors. It should be less than 3/4 of your CPU cores (count the other processes on your machine as well). But ideally, leave the original values… they work at over 15k messages/host too.
Check the Java heap size. E.g. lookup tables can eat a lot of memory.
If you don’t see any errors… and you see GL eating all of your CPUs, check your extractors and pipelines. GL exposes a lot of metrics you can use to find where the problem is. I suggest starting with your regexps. If you don’t have time for digging, do a binary search: temporarily disable your pipelines. (A rough command sketch for these checks follows below.)
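Roughly how you can run those checks on a Docker setup; the container names (graylog, elasticsearch), config path and metric name below are assumptions, so adjust them to your own compose file:

```
# 1. Which container is eating the CPU, GL or ES?
docker stats --no-stream

# 2. Host core count vs. the processor settings in server.conf
nproc
docker exec graylog grep -E '(process|output|input)buffer_processors' \
    /usr/share/graylog/data/config/graylog.conf

# 3. Java heap settings of the Graylog and Elasticsearch containers
docker exec graylog env | grep -i java_opts
docker exec elasticsearch env | grep -i java_opts

# 4. Pull a single metric from the Graylog REST API, e.g. process buffer usage
#    (credentials and metric name are placeholders)
curl -u admin:password \
  'http://localhost:9000/api/system/metrics/org.graylog2.buffers.process.usage'
```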

Do these things and share all the information you collect.

Thank you, macko003, for the suggested bullets.

I’ll certainly analyze it, but first I need to upgrade to 3.0, so this is causing a little delay.

I’ll be back with more findings as soon as we’re on 3.0 and have analyzed the performance problems.

BR Peter
