Two questions: delay in applying configuration changes & difficulty extracting key=value pairs

1. Issue
Hello All,

I’ve been using graylog as a raw dump for some logs that expire more frequently than I’d like on some other systems, but have recently wanted to come back to it and implement some of the more advanced features. I’m having two issues:

  1. When I make changes to the configuration (e.g. change/add an input, change/add an extractor) the changes don’t seem to be reflected in a timely fashion. The delay is at least 20-30 minutes, and may be several hours. For instance, I added an extractor yesterday mid-day to copy some input from the message field to another field. Even at the end of the day, that wasn’t happening. However, when I returned this morning, it was happening. Is this normal? Is there something to do to speed it up? Where should I start troubleshooting?

  2. I have messages coming in from a WiFi controller that have a bunch of key/value pairs included in the message, in the format:
    (controllername) Core: @@236,clientInfoUpdate,“apMac”="(value)",“clientMac”="(value)",“ssid”="(value)", (and so forth - about 20 key/value pairs in total)
    I’ve tried using a regex extractor to retrieve just the “key”=“value” part of the message (which seems to work) and added the “Add Key=Value pairs as fields” converter in it, but the converter doesn’t seem to be doing anything. Is this how it should be working? Are there other ways to do this? Any tips on where to start troubleshooting?

  • Subpoint: What I’d really like is to extract about 8-10 of those key-pair values, and ignore the rest. Is there a better way to do that?

2. Environment
My Graylog instance is a everything running on a single server running Ubuntu 20.04 as a Hyper-V guest. It was installed following the Ubuntu installation directions in the Graylog documentation. It’s Graylog 4.2.5.

  • Package Version: 4.2.5+59802bf (I think this is what this means?)

  • Service logs, configurations, and environment variables: (I’m sorry - I’m not entirely sure what information is being requested here)

3. What steps have you already taken to try and solve the problem?
I have done several searches about both problems, both general internet searches as well as on the reddit and in this forum, as well as watched several graylog-101-style youtube videos. I’ve also read the graylog documentation page on extractors and converters although I’m sure I’ve missed things.

When trying to extract the key-value pairs I came across the key=value converter which seemed promising, but doesn’t seem to work the way the internet tells me it should - this could easily be a failure in my understanding. I understand a pipeline may also be a solution to my problem, but haven’t yet gone down that road.

I’ve been completely unable to find anyone suffering from the ‘changes to configurations take a long time to apply’ problem I’m having, so I’m not even sure where to begin. It doesn’t seem(??) like a resource problem based on what I can see, but am happy to do any tests y’all think would be useful.

4. How can the community help?
I’d appreciate any troubleshooting tips, strategies, or starting points. Any links to documents that you think would be useful for me to read would also be valued. And if you’ve solved or experienced either of these issues and know exactly what I need to do, so much the better :slight_smile:

Thank you in advance for your help and expertise.

Hello && Welcome @bking

I might be able to help, your issue does not seams normal.

I have only experienced this once, it was mainly because of the amount of logs being ingest but it was just a few minutes. Sometimes this may happen depending on the Index rotation strategy. Configuration/s made prior to Index rotation will kick in and this will depend on how you setup your environment.

If you could give of an example of what you want and what you tried would be helpful ( details , pictures, etc…) and maybe we can come up with a solution for you. Knowing how much logs you are ingesting daily/seconds would be nice. You mention about resource? What do you have on the single Graylog node? What version of Elasticsearch/Mongo do you have? How many Extractors do you have on your Graylog server? To be honest, this issue could be anything so we need to narrow it down.

1 Like

Hello @gsmith! Thanks for your response. I’ve done more digging since my initial post, so I’ll try to answer your questions and update you with where I think the problem may be.

The server runs only the things necessary for Graylog. I started with a fresh Ubuntu 20.04 install and followed the install instructions from the Graylog documentation. If I’ve understood you correctly, this is the version information you’re looking for:

Package: mongodb
Version: 1:3.6.9+really3.6.8+90~g8e540c0b6d-0ubuntu5.3

Package: elasticsearch-oss
Version: 7.10.2

Package: graylog-server
Version: 4.2.6-1

The server is currently ingesting logs on 514 (redirected using iptables from 1514 - important later) from 6 sources - 5 basic HP switches and 1 Wi-Fi controller. Total, it’s average 15 messages per second (between 10-20), the vast majority of which come from the Wi-Fi Controller. The switches combined produce 1-2 messages per minute.

Regarding current updated situation:

This morning I found something interesting that definitely points to some kind of delay in processing somewhere. I redirect UDP 514 to a higher port with iptables, and I haven’t gotten around to setting those permanently (I know, I know) so they reset on reboot. I had to reboot the server at 15:00 EST yesterday, and shortly after that I was checking the node page and noted no messages in any buffers. However, I was still seeing new messages actively showing up in the search timeline. When I checked this morning, new messages into the search timeline tapered off zero about 20:00 EST - 5 hours after the reboot. In theory, no new messages should have been received after the reboot, as I never re-enabled the iptables rule.

Just now I re-enabled the iptables rule, and immediately the node started processing messages (10-20 messages per second). However, only messages from our switches are appearing in the search timeline. In the 5 minutes since the Graylog server started receiving syslog messages, it should have received about 4500 messages, but only 7 (from our basic switches) show in the timeline. No messages from the Wi-Fi controller show, even though clearly they’re being received.

What happens to the messages after they go through the buffers and the journal, but before they appear in the search timeline? It seems like that’s probably where my problem is, but I have no idea what that is or why it would only affect messages from one source (albeit the most talkative one).

Once again, thanks for your help!

Sorry - just saw you’d also asked about extractors. When I was experiencing this problem yesterday, I had one extractor on the input, the one described in the initial post (regex copy to new field with the key=value converter enabled). Since then, I’ve added two more (both regex copy to new field, one to pull event ID and one to pull event description), for a total of 3.

Ok - this is definitely a timezone issue. The logs that are coming in from the WiFi controller now are being logged in the search pane as coming from 5 hours in the future, but only from the WiFi controller, not the switches. That is, if I use the default “from 5 minutes ago until now” search, I get nothing. But if I search in absolute times (e.g. from 2022-02-10 11:45:00 until 2022-02-10 20:45:00), I see all the logs I’d expect from the WiFi controller, but shifted 5 hours into the future from when they should have arrived.

However, I have no idea how to resolve it. The Wi-Fi controller is sync’d to the same NTP server as the Graylog server. Both are set to the same timezone. Both show the same UTC and local time at the command line and in the web interface. I’m obviously missing something basic, but can’t figure it out for the life of me.

I can confirm that, once I’ve accounted for messages from the future, configuration changes do seem to take place instantly. However, I’m still unable to extract key=value pairs the way I though I should be able to. Here is an exact message that I’m receiving from the controller (with formatting preserved as much as possible when redacting).

(CONTROLLER_NAME) Core: @@209,clientRoaming,"apMac"="XX:XX:XX:XX:XX:XX","clientMac"="XX:XX:XX:XX:XX:XX","ssid"="(SSID_NAME)","bssid"="XX:XX:XX:XX:XX:XX","userId"="","wlanId"="5","iface"="wlan34","tenantUUID"="XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX","apName"="(AP_NAME)","apGps"="XX.XXXXXX,-XX.XXXXXX","toRadio"="11a/n/ac","fromApMac"="XX:XX:XX:XX:XX:XX","clientIP"="XX.XX.XX.XXX","userName"="XXXX@XXXX.XXX","vlanId"="1","radio"="a/n/ac","encryption"="WPA2-AES","rssi"="41","receivedSignalStrength"="-64","Instantaneous rssi"="0","Xput"="0","fwVersion"="X.X.X.X.XXXX","model"="(MODEL)","zoneUUID"="XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX","zoneName"="(NAME)","timeZone"="EST+5EDT,M3.2.0/02:00,M11.1.0/02:00","apLocation"="(LOCATION)","apGps"="XX.XXXXXX,-XX.XXXXXX","apIpAddress"="XX.XX.XX.XXX","apIpv6Address"="","apGroupUUID"="XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX","domainId"="XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX","serialNumber"="(NUMBER)","domainName"="(NAME)","wlanGroupUUID"="XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX","idealEventVersion"="X.X.X","apDescription"="(DESCRIPTION)"

What I’d really like to get out of this is something like:
Event = clientRoaming
ClientMac = (mac address)
userID = (userid)
apName = (name)
toRadio = 11a/n/ac
clientIP = xx.xx.xx.xx
userName = (username)
radio = a/n/ac
rssi = (number)
receivedSignalStrength = (number)

plus a few other fields that aren’t present in this message, but are in other event messages. I’m ok if extra fields are there, I just don’t really need them (for instance, GPS cords or zone UUID).

What’s the ‘right’ way to go about that? Is the converter on the extractor supposed to work that way? If so, how can I troubleshoot what’s going wrong? If not, where should I start looking?

Thanks again!

Hello,

Thanks for the added details, I believe what you were explain is the direct results of time difference. Also be aware if you reboot your server, the messages that did not get processed in the journal will be processed once Elasticsearch is up and running.

Timestamp suggestions:
Check Graylog Web UI under System/Overview under Time configuration make sure there all the same date/time. Big time delays like that perhaps go to each device and make sure the date/time/time zone matches Graylog server date/time. It is suggested that Graylog have NTP install or something similar. I personal use NTP and default settings. Make sure all your Time zones are correct then you should not have a problem with delayed messages. From the amount of log you have you really should not have any problems.

INPUT Suggestions:
This would depends on how you setup you GL server, but my environment has different INPUT’s for different devices.
All my switches go to Raw Plaintext/UDP/TCP because of the need to configure exactors for those specific devices. What I noticed was if I have all my device on one INPUT, my extractors it will create my field needed but sometimes I would have random data under those fields that is not needed.
All my firewalls go to a Syslog UDP/TCP INPUT. This would create some fields but not a lot so if need be I would create a extractor for those device.
Windows/Linux device I used GELF TCP/UDP on different Inputs this does create a lot of fields and found it to be resource hog but I like it :smiley:
I haven’t noticed by adding a different INPUT/s on Graylog that its using any more resources (depending on type) then usual. On the other hand, the type of INPUT used, may create more fields and this will use up your disk volume.
So basically, I have things organized so if a change is needed, its made simple.

As for the following fields needed.

You mentioned

By chance are these Windows/Linux devices? The reason and depend on the type of input your using this may be resolves by a different INPUT, just an Idea.
Are looking into creating Regex or GROK pattern for these fields?

1 Like

Thanks for working through this with me, @gsmith.

I hadn’t realized there was a difference between Syslog & Plaintext inputs (or more accurately, didn’t realize plaintext inputs existed). That bit of info led me to discover that my WiFi controller is sending the logs with a non-timezone-marked UTC timestamp. So, at 09:39:59 EST, the WiFi controller sends a timestamp of “Feb 11 14:39:59”, despite the WiFi controller reporting itself to be in EST at the proper time, both within the GUI and from the command line. I’m guessing that these kinds of timestamps are supposed to be either sent in local time, or marked with a timezone somehow? And since Graylog was parsing the message as a syslog formatted message, it accepted the timestamp at face value as local time and promptly filed it at that time in the future. However, when it’s received as a Plaintext message it just accepts the received time as the timestamp? Is that how that works?

The devices in question are HP switches and a Ruckus Smartzone wireless controller. The HP switches seem to send completely standard syslog messages, and I have no problems with them. The Ruckus wifi controller sends lots of messages, but I’m only interested in doing processing on a subset related to wifi clients. Those messages all look similar to the formatted text snippet I posted above. I have a regex extractor that runs only on those messages based on a pattern match, and which successfully pulls out the non-key/value pair data. The key/value pair data is different depending on which type of client message it is (join, roam, update, disconnect, etc).

My ideal goal would be to specify certain keys and extract them and their values as fields within Graylog if they exist. I’d settle for just extracting all the key/value pairs that are present. I thought adding the “Add Key=Value pairs as fields” converter on the regex extractor would do that, but it doesn’t seem to.

What should the “Add Key=Value pairs as fields” converter do, exactly? And under what conditions would it run? Under what conditions would it fail?

If that converter doesn’t function as I understand it to, or if I can’t make it work, what are my alternatives? Is setting up a pipeline to run on the Ruckus client messages the way to go?

Thanks again for all your help.

Key=value doesn’t easily let you pick and choose which fields get picked up. The alternative is that you do a lot more initial work. You could build out a GROK statement that pulls only the fields you want for each type of message coming in but that would be building a GROK for each of join, roam, update… etc. Time intensive. There is a RUCKUS content pack posted in the Graylog Marketplace that has extractors that might help… although the last posting for that is 3 years old…which means if it’s incompatible, you might need to pick out the data in he JSON to use it.

I prefer pipelines since they are easier for me to think about for sequencing. You could set them up to process the message through several layers of tools getting progressively closer to what you want - like removing all parenthesis, replacing spaces with underscores, removing quotes, … GROK -or-Key=value -or- split… All of it would take some time and tuning but you certainly can get to a great end product. :smiley:

1 Like

Thanks @tmacgbay!

I had looked at that Ruckus content pack. It appears to be several regex extractors each for an individual field, which is a road I’d considered going down. Do you know if there’s a performance difference in doing a pipeline vs. multiple extractors?

Also, and I’m sorry for belaboring the point, but can someone confirm that adding a key=pair converter to an extractor should result in a bunch of new fields being added to the message, if the source field contains a string of “key”=“value”,“key”=“value”,“key”=“value” content?

I don’t mind putting in the initial work to get some nice results, but at this point I’m trying to make sure I understand the tools. It feels like there is a tool that I think should work in a certain way, and it’s definitely not working in that way, so I’m trying to find where my misunderstanding is.

As always, thanks for all your help!

Not that I have seen or know of.

If it is set up properly, this should happen - You note in previous posts you are having trouble with the extractors, and showed an example message but we (mostly @gsmith ) would have to see the extractor settings to help there.

2 Likes

Hello,

I heard extractors :smiley:

Let’s say I wanted Regex extractor on a INPUT with quotes around the data, I also want to make a field for better searches.

Example of the Log file.

<189>date=2022-02-11 time=17:04:06 devname=“mydevice” devid=“Fgahroeojjjssss” logid=“0000000013” type=“traffic” subtype=“forward” level=“notice” vd=“domain-labs” eventtime=1644620646 srcip=8.8.8.8 srcport=53928 srcintf=“VLAN1” srcintfrole=“lan” dstip=8.8.8.8 dstport=10051 dstintf=“VLAN2” dstintfrole=“lan” sessionid=4183364881 proto=6 action=“deny” policyid=0 policytype=“policy” service=“Zabbix Port/51” dstcountry=“Reserved” srccountry=“Reserved” trandisp=“noop”

`

Lets say I want to extract this from my message level="notice"

I would create regex extractor with the following configurations.

Results

I have actually a few of them on my Firewall INPUT.

I tend to use Regex instead of GROK, couple times I did have some issues with a input and bad GROK, I have also had bad issues with a incorrect Regex. I just prefer Regex nowa days.

hope that helps

1 Like

Hello,

I had time to upload your example log in your above post, replaced a couple xx.xx.xx.xx. with some IP address and UUID for testing in my lab with a regex extractor.
Here is the log I used to test extractors.

ONTROLLER_NAME) Core: @@209,clientRoaming,“XX:XX:XX:XX:XX:XX”,“clientMac”=“192.168.1.12”,“ssid”="(SSID_NAME)",“bssid”=“XX:XX:XX:XX:XX:XX”,“userId”="",“wlanId”=“5”,“iface”=“wlan34”,“tenantUUID”=“00112233-4455-6677-c899-aabbccddeeff”,“apName”="(AP_NAME)",“apGps”=“XX.XXXXXX,-XX.XXXXXX”,“toRadio”=“11a/n/ac”,“fromApMac”=“XX:XX:XX:XX:XX:XX”,“clientIP”=“10.10.10.10”,“userName”=“XXXX@XXXX.XXX”,“vlanId”=“1”,“radio”=“a/n/ac”,“encryption”=“WPA2-AES”,“rssi”=“41”,“receivedSignalStrength”="-64",“Instantaneous rssi”=“0”,“Xput”=“0”,“fwVersion”=“X.X.X.X.XXXX”,“model”="(MODEL)",“zoneUUID”=“XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX”,“zoneName”="(NAME)",“timeZone”=“EST+5EDT,M3.2.0/02:00,M11.1.0/02:00”,“apLocation”="(LOCATION)",“apGps”=“XX.XXXXXX,-XX.XXXXXX”,“apIpAddress”=“XX.XX.XX.XXX”,“apIpv6Address”="",“apGroupUUID”=“XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX”,“domainId”=“XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX”,“serialNumber”="(NUMBER)",“domainName”="(NAME)",“wlanGroupUUID”=“XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX”,“idealEventVersion”=“X.X.X”,“apDescription”="(DESCRIPTION)"

And this is what the results are. All you need is to add the field you want and your done, BTW, you could either set this extractor to copy data from the original message or cut it.

Hope that help

1 Like

Hi everyone,

Thank you for all your time and help, and apologies for the delay in responding - we had a long weekend! It’s good to know that I can manually extract multiple fields with multiple regex extractors without hurting performance too much. I feel pretty comfortable with that, and that will be my next step.

Here’s what I have currently that’s working differently than I expect it to. The extractor is set to pull the entire “key”=“value” section of the message and dump it into a field called “ruckus_client_fields”. It does this. I’ve also added a “add key=value pairs as fields” converter. I’d expect that to make a bunch of fields (one for each ‘=’ sign in the message) and populate them. It does not do that. Here are screen shots of the extractor and of a message that the extractor has run on. Is there something obvious I’m doing wrong?

EDIT 2: let’s get creative. It’s a kludge, but here is the screenshot of the extractor. And here is the screenshot of a message processed by the extractor.

EDIT: well, apparently I can’t upload screenshots? Any tips there? When I use the ‘upload’ button in the editor, it lets me select the files (I’ve tried .png and .jpg, under 100k in size) and it inserts some text about uploading a file in square brackets, but then there’s an website error “Sorry, there was an error uploading that file. Please try again.”

Hello @bking

I finished it for you :smiley: Just what you see in the screen shot.

Core: (.+?),

EDIT: just a side note, if your regex Config works , it may not be right and could be issues so be carful , This may help

I use Paint on Windows , then copy & paste what’s needed. Maybe try that.

EDIT2: I’ve must have overlooked you post, You stated you want multiple fields created at the same time with one extractor? If this is correct , a extractor may not be what you want.

If I’m reading this right you basically want to copy the whole message into another field
(i.e. duplicating ) ? If this is correct, I have done that without an extractor. Using the log/s from your post that I uploaded into my Lab Graylog server I have produced that already by using a different
INPUT GELF UDP/TCP as shown below. Maybe in your case it might not work.

If only a few fields are needed to be extracted from the message field, then yes, I would use an extractor for those.

So… I fell into a Extractor hole trying to figure this Key=value pairs :laughing:
I haven’t played with this type of settings yet, mainly because I do not have a device send those types of logs.

What I tried so far is shown below.

Create a new extractor of type “Copy Input” and select to read from the field message. (Or any other string field that containskey=valuepairs.) Configure the extractor to store the (copied) field value to the same field. In this case message. Thetrick is to add the “Key=Value pairs to fields” converter as last step. Because we use the “Copy Input” extractor, the converterwill run over the complete field you selected and convert all key=valuepairs it can find.

Tested with Syslog /UDP and RawPlaintext INPUTS.
Thinking that I needed to replace all the marks in the logs xxx.xxx.xxx.xxx and with something that match’s like IP Addresses, fake MAC address etc…
It was a no go, but I found something interesting. The log in test mode I replaced the first part of your log shown above with datetime=2019-09-10T12:36:21-0400

Example:

datetime=2019-09-10T12:36:21-0400 Core: @@209,clientRoaming,“192.168.1.234”,“clientMac”=“192.168.1.12”,“ssid”="(SSID_NAME)",“bssid”=“23:23:23:23”,“userId”="",“wlanId”=“5”,“iface”=“wlan34”,“tenantUUID”=“00112233-4455-6677-c899-aabbccddeeff”,“apName”="(AP_NAME)",“apGps”=“2.23,2.456”,“toRadio”=“11a/n/ac”,“fromApMac”=“234:23:21:12”,“clientIP”=“10.10.10.10”,“userName”=“greg.smith@domain.com”,“vlanId”=“1”,“radio”=“a/n/ac”,“encryption”=“WPA2-AES”,“rssi”=“41”,“receivedSignalStrength”="-64",“Instantaneous rssi”=“0”,“Xput”=“0”,“fwVersion”=“3.3.3.3456”,“model”="(MODEL)",“zoneUUID”=“00112233-4455-6677-c899-aabbccddeeff”,“zoneName”="(NAME)",“timeZone”=“EST+5EDT,M3.2.0/02:00,M11.1.0/02:00”,“apLocation”="(LOCATION)",“apGps”=“34.5678.”,“apIpAddress”=“8.8.8.8”,“apIpv6Address”="",“apGroupUUID”=“00112233-4455-6677-c899-aabbccddeeff”,“domainId”=“hahahw-asdf4-3453-df567”,“serialNumber”="(NUMBER)",“domainName”="(NAME)",“wlanGroupUUID”=“00112233-4455-6677-c899-aabbccddeeff”,“idealEventVersion”=“4.3.2”,“apDescription”="(DESCRIPTION)"

And resending that message back over to Graylog with the Key=value extractor setup, It did create a new field as shown below in the red box.

So this was making think, Perhaps those quotes is your problem. So therefor you either need to create one at a time or use a pipeline.
Hope that helps

in the pipeline you can have the key_value() function trim the key and value fields… that could take the quotes out…

Unless you like them. :smiley:

1 Like

Wow! Thanks for all your work on this!

I did end up going down the road of making separate regex extractors for each individual field I wanted.

That’s working great! I’m pretty happy with my extractor game at this point.

I’m still trying to play around with the key/value options, more out of sheer stubbornness than anything else at this point. I made a pipeline using the key_value() function mentioned by @tmacgbay , but am still not having much success. It works in the simulator, but not in practice. The pipelilne rule is:

rule "key_value_parser"
when
    has_field("message")
then
set_fields(
    fields:key_value(
        value: to_string($message.message),
        trim_value_chars: "\"",
        trim_key_chars:"\"",
        delimiters:",",
        kv_delimiters:"="
        )
    );
end

And when I copy a sample message from the stream into the rule simulator, it successfully creates 32 fields as I’d expect it to. However, even though the pipeline says it’s got active throughput at about the right volume and 0 errors, the real messages aren’t getting any of those fields created. I’d post some screenshots, but still can’t figure out how. Copy/paste from MS Paint results in that same ‘error uploading that file’ message.

I’m all ears if y’all have any obvious tips, but please don’t spend a lot of time on this. Like I said, I’ve got the basic stuff I wanted accomplished with regex extractors and am just trying to further my understanding at this point.

I would break up the code a bit and use the debug() function to see what is going on in there:

rule "key_value_parser"
when
    has_field("message")
then
let booya = key_value(
        value:            to_string($message.message),
        delimiters:       ",",
        kv_delimiters:    "=",
        trim_key_chars:   "\"",
        trim_value_chars: "\""
        );
debug(concat("This is full message: ",     to_string($message.message)));
debug(concat("This is the booya resuts: ", to_string(booya)));

set_fields(booya);
    
end

You can watch for results in the Graylog log:

tail -f /var/log/graylog-server/server.log

Sometimes things in Graylog need to be double escaped… trim_key_chars: "\\"", This might be that case…

1 Like

Hello,

No worries, I’m also learning and I don’t mind helping out.

@tmacgbay :+1: One day I’m going to get off my fat @$$ and go through the forum an pick all these Pipelines out and add them to Templates :smiley:

1 Like