Difficulties applying extractors using regex

Yes, I see that you are. OK, now this is strange :thinking:
Give me a few, I’ll start looking into it.




I think I may need to come up with something else. It might be the data after pc_name=, since I made the regex for one specific kind of data. Perhaps what we need is just to grab everything after pc_name= and stop at the end of the PC name.

Edit: Have you tried another section of the logs? I’m just curious about something.
Example: operation=(\w+)
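
As a rough sketch of the "grab everything after pc_name= and stop at the end of the name" idea, the Python below tries `pc_name=(\S+)` next to the `operation=(\w+)` example. The sample line is made up for illustration (the real MSSQL log format is not shown in this thread), and Python's `re` module is only a stand-in for the regex engine Graylog uses, so treat it as a way to reason about the pattern, not as proof that the extractor will behave the same.

```python
import re

# Hypothetical log line; the real MSSQL log format is not shown in this thread.
SAMPLE = "operation=LOGIN pc_name=DESKTOP-01 user=alice"

# Capture everything after "pc_name=" up to the next whitespace character.
PC_NAME = re.compile(r"pc_name=(\S+)")
# The simpler pattern suggested above: word characters after "operation=".
OPERATION = re.compile(r"operation=(\w+)")


def extract(pattern, message):
    """Return the first capture group, or None when the pattern does not match."""
    match = pattern.search(message)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract(PC_NAME, SAMPLE))
    print(extract(OPERATION, SAMPLE))
```

Using `\S+` instead of `\w+` for the PC name is a deliberate choice in this sketch: host names often contain a hyphen, which `\w` would stop at.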

Yeah, I did, but I feel that the problem isn’t the regex per se; as you can see, in the test it works.

I see, there is something funky going on.

I was thinking of something like this.

This may sound weird, but have you tried reloading the browser, or even logging off and back in?
What gets me is that you have the message right there, so it’s not the INPUT or anything like that.
I think you stated in another message that your REGEX worked, am I correct?

Yeah give it a try, doesn’t hurt anything to try.

Yeah man, it worked only with the Syslog input. Because of that, I was thinking it may be connected to the fact that this particular Beats input was getting unformatted data. Just guessing, obviously.

It’s hard to tell. I was just using a Beats input and your example logs. Even though the data after pc_name= is different, it still should have picked something up, or at least part of the data after the “=” sign.

I ran the test just now, but unfortunately the issue persists. This is just odd. I have read a lot of articles and have been trying to solve this for about a week already, lol.
As the guys from OSCP say, I need to try harder, hahaha.


Let me get someone else in here. Perhaps we can put more eyes on this.
For now I’ll do some research.

@tmacgbay :smiley:


Thank you very much for your efforts to help me. I hope to return the favor with some help someday too.
I learned a lot from your explanations, I really did, and yeah, we will keep trying to solve this issue.

All the best @gsmith!


No problem, and that would be good :+1: Just hanging around would be great.

Control testing

I created two new Graylog servers last night, each with one index and one Beats INPUT, all standard stuff. No extra configuration or anything. I sent the logs you provided to these two new Graylog servers, and all my regex configurations worked.

If possible, I would like to do a control test, starting from scratch. What needs to happen is the following:

Remove the old Beats input.

Create a new one with the following specs.

---Global box should be highlighted---
Title "FileBeats"
no_beats_prefix: false
number_worker_threads: 12
override_source: <empty>
port: 5044
recv_buffer_size: 1048576
tcp_keepalive: false
tls_cert_file: <empty>
tls_client_auth: disabled
tls_client_auth_cert_file: <empty>
tls_enable: false
tls_key_file: <empty>

Create only one REGEX extractor with the following. We are just looking to pull/create something. I know this one will work; it should create a field and pull something from that log.

Source field: message
Store as field: some_field
Extractor title: some_title

Save and exit

Restart your Filebeat service to pull those logs from the text file mssql.logs, and monitor the logs coming in from that MSSQL device.

I don’t think you will have any problems from this point.

Now look for the new field called some_field. If you don’t see it right away, give it a second.

Also check Elasticsearch/Graylog log files for anything that would pertain to this issue.

NOTE: If you cannot remove the old Beats input, then create a new one on port 5055 or something like that, reconfigure Filebeat to point to the new port, restart the service, and check for logs going through the new INPUT/port number.

To sum it up.

This will eliminate any forgotten configuration, so we’re both seeing the same output of what’s happening. Maybe something got stuck, and this might unstick it. I’m really guessing here, but for troubleshooting it would not hurt. I’m also trying to find out whether it could be a bug, so we need to be sure that only the basic stuff is configured, which would be appreciated.

Try not to use any old messages; grab the new ones being ingested.

If you have to make any other configuration besides the instructions above, please post it here.

I think this is the same issue we hit when we were working on key_value() to extract the fields, and that wouldn’t work either. We had arrived at the point where I suggested adding a few more debug messages to the pipeline. Did those show as being added to the message? From what we were seeing, the work was being done but it was not being written to Elasticsearch. That is odd behavior, and the behavior in this thread, where it doesn’t like simple regex searches, is strange as well.

Can you run one of these (depending on what OS you have):

dpkg -l | grep -E ".*(elasticsearch|graylog|mongo).*"
yum list installed | grep -E ".*(elasticsearch|graylog|mongo).*"

Also, which Java version are you using?

$ java -version
openjdk version "1.8.0_312"
OpenJDK Runtime Environment (build 1.8.0_312-8u312-b07-0ubuntu1~20.04-b07)
OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode)


Hi @tmacgbay ,

Thank you for joining this thread.

did those show as being added to the message?

I will do it right now (adding the lines you mentioned).

Here is the output of the commands you asked for:

Screenshot from 2022-04-08 10-35-19

Hi @gsmith,

I did all the steps exactly as you explained, and the issue persists.

The regex extractor still didn’t work when I hit the “Try” button (it keeps saying that the regex does not match), just like in the examples I posted before. Even if I save the configuration and generate new events from the Filebeat side, the regex extractor still doesn’t work.

I will deploy a new Graylog instance, but it seems to me that the root cause isn’t the server or the configuration. Why am I saying that? Because when the input is a Syslog input, the regex works. So, at least to me, this seems to be more related to this Sidecar/Filebeat input.


I had a hunch and tried something I had been suspecting but had not tested yet. I noticed that the file my script creates was using the encoding UTF-16 LE BOM. I just converted it to UTF-8, and boom, the field some_field that the extractor should be creating started to appear in some messages. I’m still doing more tests, but this was the first time I was able to extract a field, and it is working in the “simulator” too.
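
A small sketch of why the encoding could make a simple regex fail: UTF-16 LE stores each ASCII character as two bytes, so if the bytes are treated as single-byte text, a NUL ends up after every character and the literal `pc_name=` never appears. The log line below is made up, and the assumption that the bytes were being read as single-byte text somewhere along the way is only a guess at what happened here, not something confirmed in this thread.

```python
import re

PATTERN = re.compile(r"pc_name=(\w+)")

# Hypothetical log line, encoded the way the script's file was: UTF-16 LE with a BOM.
line = "operation=LOGIN pc_name=PC01"
raw = b"\xff\xfe" + line.encode("utf-16-le")

# Treating the bytes as single-byte text keeps the BOM bytes and leaves
# a NUL character after every ASCII character.
as_single_byte = raw.decode("latin-1")

# Decoding as UTF-16 honours the BOM and gives back the original text.
as_utf16 = raw.decode("utf-16")


def matches(text):
    """True when the pc_name pattern is found anywhere in the text."""
    return PATTERN.search(text) is not None


if __name__ == "__main__":
    print(repr(as_single_byte[:12]))
    print(matches(as_single_byte), matches(as_utf16))
```

The message can still look normal on screen in this situation, because NUL characters are usually not displayed, which would fit with the message being "right there" while the regex reports no match.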



So that explains it. When I copied your example logs to a text file and then sent them to Graylog, it worked :thinking:


Update: I added some code to the PowerShell script to convert the log file to ASCII encoding, and all the tests have been successful so far.

I will continue to do more tests, but it seems that the encoding was the root cause.
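
For anyone who lands here with the same problem, this is a minimal Python sketch of that kind of conversion. It is not the PowerShell code from my script; the function name and file paths are made up for illustration, and it writes UTF-8 rather than ASCII (for plain ASCII log lines the resulting bytes are the same).

```python
def convert_utf16_to_utf8(src, dst):
    """Re-encode a UTF-16 text file (with BOM) as UTF-8 without a BOM.

    newline="" keeps the original line endings untouched on both sides.
    """
    with open(src, "r", encoding="utf-16", newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        fout.write(text)


if __name__ == "__main__":
    # Hypothetical file names; point these at the real log file.
    convert_utf16_to_utf8("mssql_utf16.log", "mssql_utf8.log")
```

Writing to a second file and pointing Filebeat at that one keeps the original around for comparison while testing.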

Thank you @gsmith and @tmacgbay for being so helpful, you guys rock!
As I said, I have learned a lot this week from you guys. Thanks again.
