Graylog query with regular expression to filter working hours


(Pascal Basher) #1

My objective here is to determine how many times my PCs are being rebooted but I am only interested if this happens during business hours.

My PCs all run Windows 7 and I am using NXLog to collect the logs. The log entries that tell me a PC has been rebooted look like this:

2018-09-16 02:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 08:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 01:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 02:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 10:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 11:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 13:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 19:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 23:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 10:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 16:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 02:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 22:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 17:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 21:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 15:37:16 testpc.local INFO 6006 The Event log service was stopped.
2018-09-16 10:37:16 testpc.local INFO 6006 The Event log service was stopped.

So this is my sample data.

My thought here is that I could query the logs using a regular expression to find the events during business hours. For that I came up with this regex:

\d{4}-\d{2}-\d{2} (08|09|10|11|12|13|14|15|16|17)

Which I have validated here:

Looking at the Elasticsearch documentation here:
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-regexp-query.html#regexp-syntax

it says that the regex has to match the entire field
"Lucene’s patterns are always anchored. The pattern provided must match the entire string"

So, based on this, I added ".*" to the end of my regex, which is now:

\d{4}-\d{2}-\d{2} (08|09|10|11|12|13|14|15|16|17).*
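The effect of "must match the entire string" can be reproduced outside Graylog. The following is only a sketch in Python, whose `re.fullmatch` stands in for Lucene's always-anchored matching; the names `SAMPLE`, `PREFIX_ONLY`, `WITH_TAIL` and `full_matches` are made up for the illustration. Python's regex dialect is not Lucene's (as far as I can tell, the 5.6 syntax page linked above does not list `\d` among its operators, so `[0-9]` may be the safer spelling there), so this shows the anchoring behaviour only.

```python
import re

# A few of the sample lines from the post above.
SAMPLE = [
    "2018-09-16 02:37:16 testpc.local INFO 6006 The Event log service was stopped.",
    "2018-09-16 08:37:16 testpc.local INFO 6006 The Event log service was stopped.",
    "2018-09-16 13:37:16 testpc.local INFO 6006 The Event log service was stopped.",
    "2018-09-16 19:37:16 testpc.local INFO 6006 The Event log service was stopped.",
]

# The original regex: date, a space, then an hour between 08 and 17.
PREFIX_ONLY = r"\d{4}-\d{2}-\d{2} (08|09|10|11|12|13|14|15|16|17)"
# The same regex with ".*" appended so it can consume the rest of the line.
WITH_TAIL = PREFIX_ONLY + r".*"


def full_matches(pattern, lines):
    """Return the lines that the pattern matches from first to last character."""
    return [line for line in lines if re.fullmatch(pattern, line)]


if __name__ == "__main__":
    print(len(full_matches(PREFIX_ONLY, SAMPLE)), "full matches without the tail")
    print(len(full_matches(WITH_TAIL, SAMPLE)), "full matches with the tail")
```

Without the trailing `.*` the pattern stops after the hour, so it can never cover a whole line. With it, only the 08:37 and 13:37 lines qualify.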

So in Graylog my complete query is:

/^\d{4}-\d{2}-\d{2} (08|09|10|11|12|13|14|15|16|17)/ AND "The Event log service was stopped" AND EventID:6006

But for some reason that I can't understand, this is not working for me: it does not return anything.

My Graylog version is 2.4.5

Any help will be much appreciated.


(Jan Doberstein) #2

If you do not put a field name in front of the search term, it is searched in the fields message, full_message and source, but not in timestamp.

I would simplify all of that a little by adding a processing pipeline rule that flags whether the message falls within working hours. That would look like:

rule "Between 9 and 17"
when
  to_date($message.timestamp).hourOfDay >= 9 && 
  to_date($message.timestamp).hourOfDay <= 17
then
  set_field("trigger_alert", true);
end

Then you can search for _exists_:trigger_alert AND EventID:"6006", plus whatever else you want to add to that search, which spares your query a complex regex.


(Pascal Basher) #3

Thanks a lot for your super quick response.

I also ran the query with the field specified, which in this case is the "message" field, but got the same result.

My apologies, I am not familiar with processing pipelines. Once added, will the rule apply to the data I have already captured?


(Pascal Basher) #4

I just looked through older posts and I see that it is not possible to re-process logs already stored in Graylog.

So in my particular case, where I already have the logs, adding the pipeline (thanks for the tip, it seems pretty cool) will not help me.

So I somehow need to make my regex work on the message field.

Any ideas are much appreciated.


(Pascal Basher) #5

If anyone has any idea why my regex is not working, I will appreciate your input.

For now, it seems to me that the Graylog GUI is not able to handle complex regexes, but I am not sure about that.

Thanks,


#6

Could you paste the whole message as it appears in Graylog? For example:

EventReceivedTime
2018-09-27T08:53:24.047843+02:00
FileName
some/name
SourceModuleName
file
SourceModuleType
im_file
app
app
file
filename
hostname
hostnmae
log
Here will be some log
message
{"EventReceivedTime":"2018-09-27T08:53:24.047843+02:00","SourceModuleName":"asdasd","SourceModuleType":"im_file","FileName":"asdasdasd","log":"2018-09-27 08:53:23:516asdasdasd","hostname":"asdasd"}
requestId
21323
sessionId
123213213
source
source
timestamp
2018-09-27T06:53:24.057Z


(Pascal Basher) #7

Here is the message:

AccountName
SYSTEM
AccountType
User
Category
Task engine properly shut down
Channel
Microsoft-Windows-TaskScheduler/Operational
Domain
NT AUTHORITY
EventID
318
EventReceivedTime
2018-09-27 09:49:31
EventType
INFO
Keywords
-9223372036854776000
Opcode
Stop
OpcodeValue
2
ProcessID
1076
ProviderGuid
{DE7B24EA-73C8-4A09-985D-5BDADCFA9017}
RecordNumber
236452
Severity
INFO
SeverityValue
2
SourceModuleName
windows_events_logs
SourceModuleType
im_msvistalog
SourceName
Microsoft-Windows-TaskScheduler
Task
318
TaskEngineName
S-1-5-18:NT AUTHORITY\System:Service:
ThreadID
5840
UserID
S-1-5-18
Version
0
full_message
Task Scheduler shutdown Task Engine “S-1-5-18:NT AUTHORITY\System:Service:” process.
host_ip
192.168.100.202
level
6
message
2018-09-27 09:49:30 testpc.local INFO 318 NT AUTHORITY\SYSTEM Task Scheduler shutdown Task Engine “S-1-5-18:NT AUTHORITY\System:Service:” process.
source
testpc.local
timestamp
2018-09-27T14:49:30.000Z

It is a standard Windows log message generated with NXLog

Thanks


#8

I’ve tried this on my own logs and I believe it is impossible because of the way Elasticsearch indexing works. I also tried it in Kibana, and it does not behave predictably there either.
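One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the message field is analyzed, the index holds individual terms rather than the whole line, and a regexp query is then compared against each term on its own. The Python sketch below imitates that with a deliberately crude tokenizer (`crude_tokens` is a made-up stand-in, not the real Elasticsearch analyzer) to show why a pattern spanning the date, a space and the hour would have nothing to match.

```python
import re

PATTERN = r"\d{4}-\d{2}-\d{2} (08|09|10|11|12|13|14|15|16|17).*"
MESSAGE = "2018-09-16 08:37:16 testpc.local INFO 6006 The Event log service was stopped."


def crude_tokens(text):
    """Very rough stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def matching_terms(pattern, text):
    """Return the individual terms that the pattern matches in full."""
    return [token for token in crude_tokens(text) if re.fullmatch(pattern, token)]


# Matching against the whole, unsplit field value succeeds ...
whole_field_matches = re.fullmatch(PATTERN, MESSAGE) is not None
# ... but no single term contains a date, a space and an hour together.
term_matches = matching_terms(PATTERN, MESSAGE)

if __name__ == "__main__":
    print("whole field:", whole_field_matches)
    print("first terms:", crude_tokens(MESSAGE)[:3])
    print("terms matching the pattern:", term_matches)
```

Under that assumption, the regex would only work against a field that is stored as a single unanalyzed value.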


(Pascal Basher) #9

@madi, thanks for trying this. At least I know it is not only me.

To any sysadmin who might find this post via Google: the solution I finally implemented is based on the suggestion from @jan, but the rule had to be modified because of a data type consistency issue. The rule I ended up implementing is:

rule "business hour flag"
when
  // hours are compared in UTC, so 9 to 21 is a UTC window, not local time
  to_long(to_date($message.timestamp).hourOfDay) >= 9 &&
  to_long(to_date($message.timestamp).hourOfDay) <= 21 &&
  to_long(to_date($message.timestamp).dayOfWeek) >= 1 &&
  to_long(to_date($message.timestamp).dayOfWeek) < 6
then
  set_field("work_hours", true);
end

Essentially one has to add the to_long to be able to do the comparison.
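To sanity-check the boundaries of a rule like this before deploying it, the same comparisons can be replayed offline. This is a minimal Python sketch, not Graylog code: `parse_utc` and `in_work_hours_utc` are hypothetical helper names, and it assumes `dayOfWeek` uses the same numbering as Python's `isoweekday()` (1 = Monday through 7 = Sunday).

```python
from datetime import datetime, timezone


def parse_utc(ts):
    """Parse a timestamp such as 2018-09-27T14:49:30.000Z as a UTC datetime."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)


def in_work_hours_utc(ts, first_hour=9, last_hour=21):
    """Mirror the rule: UTC hour within [first_hour, last_hour], Monday to Friday."""
    moment = parse_utc(ts)
    hour_ok = first_hour <= moment.hour <= last_hour
    weekday_ok = 1 <= moment.isoweekday() < 6  # assumed to match dayOfWeek
    return hour_ok and weekday_ok


if __name__ == "__main__":
    for ts in ("2018-09-27T14:49:30.000Z", "2018-09-16T14:49:30.000Z"):
        print(ts, in_work_hours_utc(ts))
```

Note that `<= 21` on the hour keeps everything up to 21:59:59 UTC, because only the hour component is compared.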

Still, this does not solve my base problem, as I have a ton of logs already in Graylog and I can't apply this pipeline to those, but at least new logs are processed the way I wanted.

Another problem is that Graylog does not seem to respect the time zone (in my case "America/New_York"), so I had to base the >= and <= comparisons on UTC, which is the default time zone. If I try, say:

to_long(to_date($message.timestamp).hourOfDay, "America/Toronto")

It does not like it and my rule just fails, but I will probably open a separate post for that one.
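For reference, this is the behaviour being asked for, sketched in Python rather than in the Graylog rule language: convert the UTC timestamp to a named zone first, then compare the local hour, so that daylight saving time is handled by the time zone database. The function name `in_local_business_hours` and the 9 to 17 window are assumptions for the example; it needs Python 3.9+ and an installed time zone database.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def in_local_business_hours(ts, tz_name="America/New_York", first_hour=9, last_hour=17):
    """Check a UTC timestamp against business hours in the given local zone."""
    utc_moment = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    local_moment = utc_moment.astimezone(ZoneInfo(tz_name))
    hour_ok = first_hour <= local_moment.hour <= last_hour
    weekday_ok = 1 <= local_moment.isoweekday() < 6  # Monday to Friday
    return hour_ok and weekday_ok


if __name__ == "__main__":
    # Same UTC clock time on a Monday in winter and a Monday in summer.
    print(in_local_business_hours("2018-01-15T13:30:00.000Z"))  # 08:30 local
    print(in_local_business_hours("2018-07-16T13:30:00.000Z"))  # 09:30 local
```

The two timestamps share the same UTC clock time but land on different sides of 09:00 local, which is exactly what a fixed UTC window cannot express.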


(Jan Doberstein) #10

I guess, @malexdno, that we need to document this better: Graylog works internally with UTC.

You could open a feature request for being able to work with a timezone here.


(Pascal Basher) #11

Thanks a lot, @jan, for your response. I am glad to know I am not going crazy; I have been very confused by reading the documentation and then seeing things not work. Documentation is fundamental and I am 100% sure you guys will look into that side of things.

As for the timezone, I will create a feature request. In my particular case I need timezone support plus the ability to handle daylight saving time. I took a quick look at the code and it does not seem like a trivial thing to implement, as I see a lot of other functionality is based on UTC. Probably the best way (and I might try this) is a plugin.

Thanks

