Difficulties applying extractors using regex

It's hard to tell. I was just using a Beats input and your example logs. Even though the data after pc_name= is different, it still should have picked something up, or at least part of the data after the “=” sign.

I just ran the test, but unfortunately the issue persists. This is just odd. I've read a lot of articles and have been trying to solve this for about a week already, lol.
As the OSCP guys say, I need to try harder, hahaha.

Let me get someone else in here. Perhaps we can put more eyes on this.
For now I'll do some research.

@tmacgbay :smiley:


Thank you very much for your efforts to help me. I hope to return the favor someday.
I learned a lot from your explanations, I really did, and yes, we will keep trying to solve this issue.

All the best @gsmith!

No problem and that would be good :+1: Just hanging around would be great.

Control testing

I created two new Graylog servers last night, each with one index and one Beats input, all standard stuff. No extra configuration or anything. I sent the logs you provided to these two new Graylog servers and all my regex configuration worked.

If this is possible, I would like to do a control test. Start from scratch. What needs to happen is the following:

Remove old beat input

Create a new one with the following specs.

---Global box should be highlighted---
Title "FileBeats"
no_beats_prefix: false
number_worker_threads: 12
override_source: <empty>
port: 5044
recv_buffer_size: 1048576
tcp_keepalive: false
tls_cert_file: <empty>
tls_client_auth: disabled
tls_client_auth_cert_file: <empty>
tls_enable: false
tls_key_file: <empty>

Create only one REGEX extractor with the following, just looking to pull/create something. I know this one will work. It should create and pull something from that log.

Source field: message
Store as field: some_field
Extractor title: some_title

Save and exit

Restart your FileBeat service to pull those logs from the text file mssql.logs and monitor log/s coming in from that MSSQL Device.

I don’t think you have any problems from this point.

Now look for the new field called some_field. If you don’t see it right away give it a second.

Also check Elasticsearch/Graylog log files for anything that would pertain to this issue.

NOTE: If you cannot remove the old BEAT input then create a new one with port number 5055 or something like that and reconfigure FileBeat to point to the new port, restart service and check for logs going through the new INPUT/port number.

To sum it up.

This will eliminate any forgotten configuration, so we're both seeing the same output of what's happening. Maybe something got stuck and this might unstick it. I'm really guessing here, but for troubleshooting it would not hurt. I'm also trying to find out if it could be a bug, so we need to be sure that only the basic stuff is configured.

Try not to use any old messages and grab the new ones being ingested.

If you have to make any other configuration besides the instructions above, please post it here.

I think this is the same issue where we were working on key_value() to extract the fields and that wouldn’t work either… we had arrived at the point where I had noted to add a few more debug messages into the pipeline… did those show as being added to the message? From what we were seeing the work was being done but it was not being written to Elasticsearch… That is odd behavior and the behavior in this thread where it doesn’t like simple regex searches is strange as well…
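For reference, here is roughly what a key=value split is expected to produce for one of these log lines. This is a minimal Python sketch, not Graylog's key_value() implementation; the function name parse_key_values and the sample values are made up for illustration.

```python
def parse_key_values(message: str) -> dict:
    """Split 'k1=v1 k2=v2' into a dict; tokens without '=' are skipped."""
    fields = {}
    for token in message.split():
        key, sep, value = token.partition("=")
        if sep and key:
            fields[key] = value
    return fields


# Made-up sample in the same shape as the logs discussed in this thread.
sample = "pc_name=HOST01 user_name=alice file_name=report.xlsx operation=read"
fields = parse_key_values(sample)
print(fields["pc_name"])
```

Note that this naive split breaks values containing spaces (a file name with spaces, for example), so treat it only as a picture of the expected result.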

Can you run one of these (depending on what OS you have):

dpkg -l | grep -E ".*(elasticsearch|graylog|mongo).*"
yum list installed | grep -E ".*(elasticsearch|graylog|mongo).*"

Also, which Java version are you using?

$ java -version
openjdk version "1.8.0_312"
OpenJDK Runtime Environment (build 1.8.0_312-8u312-b07-0ubuntu1~20.04-b07)
OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode)

Hi @tmacgbay ,

Thank you for joining this thread.

did those show as being added to the message?

I will do that right now (adding the lines you mentioned).

Here is the output of the commands you asked for:

Screenshot from 2022-04-08 10-35-19

Hi @gsmith,

I followed all the steps exactly as you explained them, and the issue persists.

The regex extractor still didn't work when I hit the “Try” button (it keeps saying that the regex does not match), just like in the examples I posted before. Even if I save the configuration and generate new events from the Filebeat side, the regex extractor still doesn't work.

I will deploy a new Graylog instance, but it seems to me that the root cause isn't the server or its configuration. Why do I say that? Because when the input is a Syslog input, the regex works. So this seems to be related to the Sidecar/Filebeat input, at least for me.


I had a sudden idea and tried something I had suspected but never tested. I checked the file my script creates and it was encoded as UTF-16 LE with BOM. I converted it to UTF-8 and, boom, the field some_field that the extractor should be creating started to appear in some messages. I'm still running more tests, but this is the first time I was able to extract a field, and it works in the “simulator” too.



So that explains it. When I copied your examples of logs to a text file and then sent them to Graylog it worked :thinking:
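A small illustration of why the encoding matters. This Python sketch assumes the log shipper passed the file's bytes through one character per byte, which is an assumption about what happened here and not something confirmed in this thread. The sample line and the pattern pc_name=(\S+) are made up. In UTF-16 LE every ASCII character is followed by a NUL byte, so the literal text "pc_name=" never appears contiguously in the misread message.

```python
import re

line = "pc_name=HOST01 user_name=alice"
pattern = re.compile(r"pc_name=(\S+)")  # hypothetical extractor pattern

# What a UTF-16 LE file with a BOM holds on disk.
raw = b"\xff\xfe" + line.encode("utf-16-le")

# Assumption: a reader that treats every byte as one character.
misread = raw.decode("latin-1")

# Correct decoding; the "utf-16" codec consumes the BOM.
decoded = raw.decode("utf-16")

print(pattern.search(misread))           # no match: NULs sit between the letters
print(pattern.search(decoded).group(1))  # the value after pc_name=
```

Copying the text into a fresh file by hand usually re-saves it in a single-byte-compatible encoding, which would explain why the pasted examples matched.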

Update: I added some code to the PowerShell script to convert the log file to ASCII encoding, and all the tests have been successful so far.

I will continue to do more tests, but it seems that the encoding was the root cause.

Thank you @gsmith and @tmacgbay for being so helpful, you guys rock!
As I said, I have learned a lot this week with you guys. Thanks again.


If you want to give something back, I'm very interested in your PowerShell command and the setup you did in your environment. I was wondering if you could demonstrate all this here, not only for myself but for others. Sharing is caring :smiley:


For sure, my friend. It will be a pleasure to be helpful to others.

# Credentials
$SQLServer = "\server"
$db = "dummy_data"
$user = "dummyuser"
# Renamed from $pwd, which collides with PowerShell's automatic $PWD variable
$password = "dummypass"

# Select to get the SQL data.
# The column list was missing from the post; these are the columns the loop below reads (an assumption).
$selectdata = "SELECT TOP 5 pc_name, user_name, file_name, operation
  FROM [dummy_data].[pbi].[data_security_view] ORDER BY date_time DESC"

$dump = Invoke-Sqlcmd -ServerInstance $SQLServer -Username $user -Password $password -Database $db -Query $selectdata

# Format the events, one per line.
# Note: in Windows PowerShell, >> writes UTF-16 LE by default, hence the conversion step below.
for ($count = 0; $count -lt $dump.Count; $count++) {
    echo "pc_name=$($dump[$count].pc_name) user_name=$($dump[$count].user_name) file_name=$($dump[$count].file_name) operation=$($dump[$count].operation)" >> "C:\Program Files\Management Console\Logs\noencoding_logs.log"
}

# Convert the output file to ASCII and output to a file
Get-Content "C:\Program Files\Management Console\Logs\noencoding_logs.log" | Out-File -Encoding ascii "C:\Program Files\Management Console\Logs\sql_logs.log"

It isn't pretty, as I've already said, but it works, and this will give me more time to implement something more elegant in the future.

If you have any questions, just ask.

And does this also do the text format conversion?

This line, to be more precise.

#Convert the output file to ascii and output to a file
Get-Content "C:\Program Files\Management Console\Logs\noencoding_logs.log" | Out-File -Encoding ascii "C:\Program Files\Management Console\Logs\sql_logs.log"
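The same conversion step can be sketched outside PowerShell. This is a minimal Python illustration, not part of the script above: the helper name to_utf8 and the sample bytes are made up, and it writes UTF-8 instead of ASCII so that non-ASCII characters survive (an ASCII conversion replaces them with "?").

```python
import codecs


def to_utf8(raw: bytes) -> bytes:
    """Re-encode UTF-16 input that starts with a BOM as BOM-less UTF-8.

    Anything else is returned unchanged. UTF-32 input is not handled.
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order, then drops it.
        return raw.decode("utf-16").encode("utf-8")
    return raw


# Made-up sample: one log line as a UTF-16 LE file with BOM would store it.
sample = codecs.BOM_UTF16_LE + "pc_name=HOST01".encode("utf-16-le")
converted = to_utf8(sample)
print(converted)
```

To use it on a real file, read the file in binary mode, pass the bytes through to_utf8, and write the result to the path the shipper monitors.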

Nice, and thank you. I was hoping you would write up a post here so you get credit for it.

If not I can write something up for you. This way in about 20 years this would be easy to access :smiley:

That's cool. My English is not that good, but I can try. Can I write something and send it to you to check if it's OK? I just want to know what the format should be, like a mini-tutorial or something?

Of course, and feel free to DM me if you like. Since this is Friday, I have about 6 more hours here at work, then I'm playing video games for the next two days :laughing: Actually, I have close friends from Switzerland and the UK.
