Use of gl2_accounted_message_size for measuring outgoing traffic

Hello to all,
I am currently trying to implement a dashboard for measuring the traffic processed by Graylog; the goal is to have the “Outgoing traffic” data from the Overview page, but in a table and with a history of two or three months. A sum of all outgoing traffic for a month would also be nice.
I have read some documentation and several posts in this forum about this problem and came across the field gl2_accounted_message_size. It is my understanding that this field (introduced in Graylog 3.2) contains the size of a message in bytes.
I built some dashboards, using the timestamp as a row with an interval of one day or one month and the sum of gl2_accounted_message_size as the metric. This looked good at first, but I noticed large differences from the data shown on the Overview page. So obviously, I am going wrong at some point.

  • OS Information:
    openSUSE Leap 15.4, latest patches; the environment consists of two Graylog nodes, two Elasticsearch nodes, a third node for the third MongoDB replica, and a Pacemaker cluster with HAProxy for load balancing.

  • Package Version:
    graylog-server-4.3.3-1 community edition (will be updated soon)
    elasticsearch-oss-7.10.2-1

  • Steps taken
    I read the documentation and did some research; there are some topics in this forum that deal with gl2_accounted_message_size, but at least to my understanding they did not exactly answer my question. I understand that the data in the Overview is all traffic going out to Elasticsearch, but (if I am not mistaken here) the accounted messages should also be the data going out to Elasticsearch.

So, my questions here are:
Where is my thinking wrong? (This is obviously a problem on my side, not on Graylog's.)
Is it even possible to solve this problem?
If so, how?

Hello && Welcome @oebhardt

I made a quick Widget to see if I’m on the same page. Does your table look something like this?

Hello @gsmith,
thanks for responding! Yes, my table looks like that - not as detailed, but generally that’s it. (I am at a customer’s right now, so I can’t do a screenshot.)


Hello,
I found a solution; it is not exactly elegant, but it works.
A closer look at MongoDB revealed that all the data I would like to have is actually there: Graylog writes the usage of every hour to the database. So I wrote a script that pulls the data from the database every day, sums it up and writes it to the journal. As this journal (of course…) ends up in Graylog, it can be processed there. I also do this once per month, but then I not only accumulate the data but also calculate the average use per day for that month. This is also sent to the journal and processed with Graylog.

If anyone is interested in this solution, I can post the script, the systemd units and the configuration in Graylog here.

That’s awesome and for sure I would like to take a look, sharing is caring man :+1:

Basically, this solution consists of three parts:

  1. A script collect_graylog_statistics.sh with a configuration file collect_graylog_statistics.conf
  2. Four systemd units (could also be done by cron)
  3. Dashboards in Graylog

The collector and the systemd units can run on any machine holding a replica of the MongoDB. The script makes use of the tool bc, so that has to be installed.

The collector-script

A look into the Graylog database revealed a collection “traffic”, in which the sum of traffic in bytes is saved every hour. So the general idea is to get these values and add them up. Bear in mind that I do not have much knowledge about MongoDB, so this is a combination of impatience, Google-fu and bash scripting… I’ll try to explain what I thought, but I cannot guarantee it’s always correct.
The config file is quite simple and holds everything needed to access the database:

GUSER=graylog
GPASSWORD=*************
CAFILE=/etc/pki/trust/anchors/graylogca.pem
KEYFILE=/etc/mongod.crt
DATABASE=graylog

It is used in the following script:

#!/bin/bash

# Gets the graylog outgoing traffic from mongodb and aggregates it for days and month
PATH=$PATH:/usr/bin

CONFIG=collect_graylog_statistics.conf

. $CONFIG

MONGOSH="mongosh -u $GUSER -p $GPASSWORD --tls --tlsCAFile $CAFILE --tlsCertificateKeyFile $KEYFILE --host $(hostname) $DATABASE"

Just initializations up to now. After some consideration, I decided to let the script also report the average amount per day within a month. (This would have been possible in Graylog as well; it just seemed easier to me.) So if the script is called with --monthly, it outputs the monthly data; if not, it reports the daily data.

if [ "$1" == "--monthly" ];
then
  DATE=$(date -d last-month +'%Y-%m')
  STARTDATE="$(date -d last-month +'%Y-%m')-01T00:00:00.000Z"
  DAYNUMBER=$(date -d "$(date +'%Y-%m-01') -1 day" +'%d')
  ENDDATE="$(date +'%Y-%m-01')T01:00:00.000Z"

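Here the first day of the previous month, the end of the range and the number of days in that month are determined. The first bucket counted is the one at 01:00, as, to my understanding, each traffic entry contains the data from the hour before it. The last one is the bucket at 00:00 on the first day of the following month. (The query uses $gt and $lt, so entries exactly at STARTDATE and ENDDATE are excluded.) I also pass a date with only the year and the month; this is only to make the dashboards easier.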
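The date arithmetic above can be tried in isolation. The following is a minimal sketch, assuming GNU date; the function name prev_month_range and its reference-date argument are made up for illustration and are not part of the script. It anchors every calculation on the first day of the reference month, so the result does not depend on which day of the month it is run.

```shell
# Hypothetical helper (not part of the original script); requires GNU date.
# Prints "<start> <end> <days in previous month>" for the month before $1.
prev_month_range() {
  ref="$1"                                                  # any date, e.g. 2023-03-15
  first_this=$(date -d "$ref" +'%Y-%m-01')                  # first day of that month
  first_prev=$(date -d "$first_this -1 month" +'%Y-%m-01')  # first day of previous month
  days=$(date -d "$first_this -1 day" +'%d')                # last day of previous month
  echo "${first_prev}T00:00:00.000Z ${first_this}T01:00:00.000Z $days"
}

prev_month_range "$(date -I)"
```

Called with 2023-03-15, this prints the range for February 2023 together with 28 as the number of days.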

AMB=$((a=0;echo "DBQuery.shellBatchSize=100000;db.traffic.find({bucket: {\$gt: ISODate(\"$STARTDATE\"), \$lt: ISODate(\"$ENDDATE\")}},{output: 1});" |$MONGOSH|grep Long|cut -d\" -f2|while read line; do a=$[$a+$line]; echo $a; done;)|tail -n1)

This consists of multiple parts:
echo "DBQuery.shellBatchSize=100000;db.traffic.find({bucket: {\$gt: ISODate(\"$STARTDATE\"), \$lt: ISODate(\"$ENDDATE\")}},{output: 1});" |$MONGOSH
This query gets the output values of all traffic entries between the start and the end date from MongoDB. DBQuery.shellBatchSize=100000 is needed because the shell would otherwise only print the first batch of results (20 entries by default). The output is piped into a while loop:
|grep Long|cut -d\" -f2|while read line; do a=$[$a+$line]; echo $a; done;
This greps the lines containing the values, cuts out the numbers and adds each one to the running sum of the lines processed so far. The last line of output, which is the total, goes into AMB:
AMB=$((....)|tail -n1)

As this is a database, I am quite sure there must be a much better way to do this.
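Staying in the shell, the grep/cut/while chain could be condensed into a small function. This is only a sketch: the name sum_longs is hypothetical, and it assumes that mongosh prints each value in the form Long("123"), which is what the grep Long|cut -d\" -f2 step in the script implies.

```shell
# Hypothetical helper: sums every Long("...") value read from stdin.
# Assumes mongosh prints 64-bit integers as Long("<digits>").
sum_longs() {
  total=0
  # grep -o emits one match per line, even if a line holds several values
  for n in $(grep -o 'Long("[0-9]*")' | tr -dc '0-9\n'); do
    total=$((total + n))
  done
  echo "$total"
}

# Example with two sample lines in the assumed output format:
printf '%s\n' 'output: { abc: Long("74002963") }' 'output: { abc: Long("95324686") }' | sum_longs
# prints 169327649
```

In the script, the query output would then be piped as |$MONGOSH|sum_longs, and the surrounding subshell and tail -n1 would no longer be needed.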

  AMGB=$(echo "scale=2;$AMB/1073741824"|bc|sed 's/^\./0./')
  AAMB=$(echo "scale=2;$AMB/$DAYNUMBER"|bc|sed 's/^\./0./')
  AAMBG=$(echo "scale=2;$AAMB/1073741824"|bc|sed 's/^\./0./')

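The total sum is divided by 1073741824 (2^30), so AMGB holds the amount in gigabytes (strictly speaking, gibibytes). The total is also divided by the number of days, and finally the daily average is converted to gigabytes in the same way. The sed call adds the leading zero that bc omits for values below 1.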
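For comparison, the same conversion can be done with awk instead of bc. This is a sketch, not what the script uses: the function name bytes_to_gib is hypothetical, and unlike bc with scale=2, which truncates, printf rounds to two decimals. It also prints the leading zero itself, so no sed step is needed.

```shell
# Hypothetical helper: converts bytes to units of 2^30 bytes, two decimals.
# Uses awk rather than bc; note that printf rounds, whereas bc truncates.
bytes_to_gib() {
  LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1073741824 }'
}

bytes_to_gib 536870912    # half of 2^30 bytes, prints 0.50
```

Whether rounding or truncating is preferable is a matter of taste here; the point is only that the bc dependency is optional.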

logger -t graylog_monthly_traffic "statistic_month=$DATE message_amount_byte=$AMB message_amount_gbyte=$AMGB message_amount_daily_average_byte=$AAMB message_amount_daily_average_gbyte=$AAMBG"

All data is written to the journal, using the syslog identifier graylog_monthly_traffic and key=value pairs. So in Graylog, it can easily be extracted into fields.

Getting the daily usage is much easier: a start and an end time are calculated, the date is given as year, month and day, and the rest is very similar:

else
  DATE=$(date -d yesterday -I)
  STARTDATE="$(date -d yesterday -I)T00:00:00.000Z"
  ENDDATE="$(date -I)T01:00:00.000Z"
  AMB=$((a=0;echo "DBQuery.shellBatchSize=100000;db.traffic.find({bucket: {\$gt: ISODate(\"$STARTDATE\"), \$lt: ISODate(\"$ENDDATE\")}},{output: 1});" \
    |$MONGOSH|grep Long|cut -d\" -f2|while read line; do a=$[$a+$line]; echo $a; done;)|tail -n1)
  AMGB=$(echo "scale=2;$AMB/1073741824"|bc|sed 's/^\./0./')
  logger -t graylog_daily_traffic "statistic_day=$DATE message_amount_byte=$AMB message_amount_gbyte=$AMGB"
fi

exit 0

The systemd units

I want to run this script every day shortly after 1 o’clock, and with --monthly on the first day of every month shortly after that. So the systemd timers look like this:
graylog_statistic.timer:

[Unit]
Description=regularly logging status to journal
#Requires=network.service
#After=network.service

[Timer]
OnCalendar=*-*-* 01:05:00

[Install]
WantedBy=timers.target

graylog_statistic_monthly.timer:

[Unit]
Description=regularly logging status to journal
#Requires=network.service
#After=network.service

[Timer]
OnCalendar=*-*-01 01:30:00

[Install]
WantedBy=timers.target

The corresponding services are quite simple as well:
graylog_statistic.service:

[Unit]
Description=logging status to journal

[Service]
Type=oneshot
ExecStart=/The Actual Path/collect_graylog_statistics.sh

graylog_statistic_monthly.service:


[Unit]
Description=logging status to journal

[Service]
Type=oneshot
ExecStart=/opt/capricorn/bin/collect_graylog_statistics.sh --monthly
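
As mentioned, cron could do the same job. A crontab with the same schedule might look like the following; this is an untested sketch that reuses the path from the monthly service unit above, so adjust it to wherever the script actually lives.

```
# m  h  dom mon dow  command
5    1  *   *   *    /opt/capricorn/bin/collect_graylog_statistics.sh
30   1  1   *   *    /opt/capricorn/bin/collect_graylog_statistics.sh --monthly
```

Note that the script sources its configuration file by a relative name, so whether it runs from cron or from systemd, it has to be started in the directory containing that file, or CONFIG has to be set to an absolute path.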

The Dashboards

A look at a message in graylog shows the following fields:


Those can easily be used in a dashboard; here is one for daily usage:

And here it is for the monthly usage:


Pretty freakin amazing :+1:

EDIT:
Oh I see, I must have missed that, good catch

> db.traffic.find().Prtty()
uncaught exception: TypeError: db.traffic.find(...).Prtty is not a function :
@(shell):1:1
> db.traffic.find().pretty()
{
        "_id" : ObjectId("5a501e28e433355eb03b71e3"),
        "bucket" : ISODate("2018-01-06T00:00:00Z"),
        "decoded" : {
                "c7f567fc-5ff7-459c-99f2-bf088359cd7a" : NumberLong(0)
        },
        "output" : {
                "c7f567fc-5ff7-459c-99f2-bf088359cd7a" : NumberLong(0)
        },
        "input" : {
                "c7f567fc-5ff7-459c-99f2-bf088359cd7a" : NumberLong(0)
        }
}
{
        "_id" : ObjectId("5a501fcce433355eb03b71ea"),
        "bucket" : ISODate("2018-01-06T01:00:00Z"),
        "decoded" : {
                "c7f567fc-5ff7-459c-99f2-bf088359cd7a" : NumberLong(72924857)
        },
        "output" : {
                "c7f567fc-5ff7-459c-99f2-bf088359cd7a" : NumberLong(74002963)
        },
        "input" : {
                "c7f567fc-5ff7-459c-99f2-bf088359cd7a" : NumberLong(62913272)
        }
}
{
        "_id" : ObjectId("5a502ddc706123b67ae965d4"),
        "bucket" : ISODate("2018-01-06T02:00:00Z"),
        "decoded" : {
                "c7f567fc-5ff7-459c-99f2-bf088359cd7a" : NumberLong(95324686)
        },
        "output" : {

Thanks!
As for cron (I saw that one popping up somewhere) - yes, the script could be run from cron also. I just use systemd.timer as an alternative to it.

Yeah, I did ask about using cron. I deleted that question because I'm blind :laughing: and then noticed you had already said that above.