Search for a specific string in the path across various servers to show only specific logs

Hello,
I need to collect all the logs under a path that varies by client. The logs live in /var/www/html/client/site/, so I have a filebeat_log_file_path that picks up all the log files in the subfolders of the root /var/www/html/*.

My question is: how do I get only the logs from one client, and how can I make that search easy, for example with a simple variable that identifies the client string in the path?

After that, how can I get the information for one specific URL, since one client can have more than one site?

I have to perform simple searches on this data, but without having to use an extensive query.

Please help me understand the best way to handle this, and whether it is possible at all.

Can we safely assume you have a Graylog environment set up and you are receiving all the messages? It's hard to help when I don't know what we are working with.

You should be able to search against source:<hostname> to get messages from a specific host. For the URL, you could do a regex search… though that might make it complicated…
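For example, something along these lines in the search bar (web01 and the URL fragment are placeholders I made up for illustration; the /…/ regex form assumes your Graylog version passes regex terms through to the search backend):

source:web01
source:web01 AND "checkout"
source:web01 AND message:/.*\/checkout\/.*/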

sample messages to compare/contrast would help to come up with solutions…

Thanks for your answer.
Graylog is working fine; it's the way I want the information that may not be easy to get.

I have a sample message, but to build a regex I don't have the client or site information anywhere except in the path field of the log.
I'm sorry if I don't explain it very well.
Is there a way to filter by the name of the client, for example by adding a search field?
Sorry if it is a crazy question…

You could add a field to the message either on the Beats client side, or add it in a pipeline based on message characteristics… In general I think it's more efficient to add it with Beats on the client, since it's an extremely small and distributed change. Below is an example of an Exchange Sidecar configuration for Beats. There is a lot in there, but the key part is:

     fields:
        exchange_tag: rpc_http

This creates a field called exchange_tag that contains rpc_http, which is along the lines of what you want to do. I am including the whole configuration because there are other helpful goodies in there, such as excluding lines…

# Needed for Graylog
fields_under_root: true
fields.collector_node_id: ${sidecar.nodeName}
fields.gl2_source_collector: ${sidecar.nodeId}
output.logstash:
   hosts: 
   - ${user.BeatsInput}
   ssl:
     verification_mode: none
path:
  data: C:\Program Files\Graylog\sidecar\cache\winlogbeat\data
  logs: C:\Program Files\Graylog\sidecar\logs
tags:
 - windows
 - exchange
logging.metrics.enabled: false
filebeat:
  inputs:
##### find owa logon and logoff 
    - type: log
      enabled: true
      include_lines: ['auth.owa', 'logoff.owa', 'ClientDisconnect']
      exclude_lines: ['HealthMailbox','^#']
      fields:
        exchange_tag: OWA
      ignore_older: 72h
      paths:
        - C:\Program Files\Microsoft\Exchange Server\V15\Logging\HttpProxy\Owa\*.LOG
#
##### find RPC/HTTP logins 
    - type: log
      enabled: true
      include_lines: ['Exchange.asmx']
      exclude_lines: ['HealthMailbox','^#','^DateTime','AnchorMailboxHeader-SMTP','10.8.[0-9]+.[0-9]+']
      fields:
        exchange_tag: rpc_http
      ignore_older: 72h    
      paths:
        - C:\Program Files\Microsoft\Exchange Server\V15\Logging\HttpProxy\Ews\*.LOG
#
##### find Activesync logins
    - type: log
      enabled: true
#     include_lines: ['Microsoft-Server-ActiveSync','NewConnection=']
      include_lines: ['NewConnection=']
      exclude_lines: ['localhost','^#','^DateTime',',OPTIONS,']
      fields:
        exchange_tag: activesync
      ignore_older: 72h    
      paths:
        - C:\Program Files\Microsoft\Exchange Server\V15\Logging\HttpProxy\Eas\*.LOG
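
For the /var/www/html layout described in the question, a rough sketch of the same Beats-side idea could look like the following - one input per client with a static field. The client names, the glob depth, and the field name client_name are placeholders, not something this thread has confirmed:

filebeat:
  inputs:
    # hypothetical client A - adjust the glob to your real directory depth
    - type: log
      enabled: true
      fields:
        client_name: clientA
      paths:
        - /var/www/html/clientA/*/*.log
    # hypothetical client B
    - type: log
      enabled: true
      fields:
        client_name: clientB
      paths:
        - /var/www/html/clientB/*/*.log

With fields_under_root: true (as in the configuration above), client_name would land as a top-level field. The downside is that every new client needs its own input block on each server.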

I understand, but what I need is an identifier that correlates the client string in the path and adds it to the message… for example, adding the name of the client to the message so I can create a regex to find it by field… is that possible?

Without a sample message and/or fields, all I can tell you is that regex can find it… :upside_down_face:

-Tad

Thanks a lot for taking the time to try to help me.
So the message is, for example:
[2022-08-18 17:46:44] security.DEBUG: Guard authenticator does not support the request. {"firewall_key":"main","authenticator":"App\Security\PartnerAuthenticator"}

We split it into some fields with grok:

symfony_channel: security
symfony_message: Guard authenticator does not support the request. {"firewall_key":"main","authenticator":"App\Security\PartnerAuthenticator"}
symfony_severity: DEBUG
symfony_timestamp: 2022-08-18 17:46:44

So the message itself doesn't contain the client; only the path where the message is captured has it.
My question is how to add a field with the name of the client to identify where the message came from.

So, for example, from the filebeat_source:
/var/www/html/clientname/clientname-dev/var/log/development.log

we can retrieve the clientname and add it to the various messages to create a field for searching the different clients.

The clientname changes according to the log that is captured. We are searching for all logs on the servers under the path /var/www/html/*.

Is it clear now what I need?

Many thanks for your patience… I'm very grateful!

Ahh! Now THAT is the information I needed! :smiley: filebeat includes a field called log_file_path

How about a pipeline rule similar to:

rule "logfile client pull"
when
  has_field("log_file_path")
then
  
  let the_client = regex("^\\/var\\/www\\/html\\/(\\w+)", to_string($message.log_file_path));
  set_field("client_name", the_client); 

end

NOTE:

  • In the pipeline, any escape \ has to be doubled to \\ due to the nature of pipelines.
  • The ^ at the beginning of the regex anchors the match to the start of the value to keep searching to a minimum.

You can create similar rules for other log file paths.
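
For example, if you also want the site (the second path segment) as its own field, a variation along these lines might work - the site_name field and the two-group regex are only a sketch, assuming the paths really look like /var/www/html/<client>/<site>/…:

rule "logfile client and site pull"
when
  has_field("log_file_path")
then
  // group 0 = client directory, group 1 = site directory (assumed layout)
  let parts = regex("^\\/var\\/www\\/html\\/([^\\/]+)\\/([^\\/]+)", to_string($message.log_file_path));
  set_field("client_name", parts["0"]);
  set_field("site_name", parts["1"]);
end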

ALSO: when you are posting code or logs to the forum, use the </> tool to make the formatting readable.


Many thanks for your answer.
Now I have the client_name field, but it can't be searched,
and I don't think the values are parsed correctly.

The rule that I made is:

rule “logfile client pull”
when
has_field(“filebeat_log_file_path”)
then

let client = regex(“^\/var\/www\/html\/(\w+)”, to_string($message.filebeat_log_file_path));
set_field(“client_name”, client);

end

There is an error in my code - this should be:

set_field("client_name", the_client["0"]);

Because you want item zero of the regex search that results in {"0":"elixer"}



On a side note, use the forum tool </> to make your code readable… highlight the code and click on the forum tool to make it look nice… so:


rule “logfile client pull”
when
has_field(“filebeat_log_file_path”)
then

let client = regex(“^/var/www/html/(\w+)”, to_string($message.filebeat_log_file_path));
set_field(“client_name”, client[“0”]);

end


It will look like this (I did some indentation too):

EDITED:

rule "logfile client pull"
when
  has_field("filebeat_log_file_path")
then
  let client = regex("^\\/var\\/www\\/html\\/(\\w+)", to_string($message.filebeat_log_file_path));
  set_field("client_name", client["0"]); 
end

It also usually corrects bad quotes “” to good quotes "".

It didn't work…
the field doesn't appear…

All you changed was adding in the ["0"]?

Make sure all your quotes look like "" and NOT like these: “”

Question:

Does that regex have to be double \\?

Great catch! All escapes have to be doubled in the pipeline! (Original had it)

-Tad


Yes, I have the right quotes and the double backslashes, but with client["0"] the field doesn't appear.

The code is like this:

rule "logfile client pull"
when
  has_field("filebeat_log_file_path")
then
  let client = regex("^\\/var\\/www\\/html\\/(\\w+)", to_string($message.filebeat_log_file_path));
  set_field("client_name", client["0"]); 
end

I edited your post using the </> forum tool in the editing box so that the correct formatting and characters show up - please make sure you use it in the future for clarity! :smiley:

OK… I was missing what you were saying - the field shows up with the correct data, but when you pull up field data on it, it shows as client_name=unknown and it is not searchable. I missed that.

Try a field name different from client_name, maybe something like c_name, just in case that name is somehow not working where it is stored in Elasticsearch… alternatively, you could try rotating the index where it is stored to see if it starts storing the field properly.

Thanks,
but it didn't work.
I don't know what else I can do…
:frowning:

Now it seems to show up in the search box… let me see if this output is stable…
Thanks a lot!

Great! Mark the solution for future searchers! :smiley:

Many thanks!!!
Great hug!