Different search results via GUI/REST API

1. Describe your incident:
I’m sending Windows Security Events to Graylog using Winlogbeat. We have to anonymize usernames in the logs, so I created a pipeline, attached it to the stream “All messages”, and created rules which replace the usernames with “ANONYMIZED”. This works fine: when I search using the Graylog GUI I see the replacements. But when I use the REST API to find messages I get the original usernames; they are not replaced there. I wonder why this happens, and how it can be changed.

2. Describe your environment:

  • OS Information:
    Ubuntu 20.04.3 LTS (GNU/Linux 5.4.0-91-generic x86_64)

  • Package Version:
    graylog-4.2-repository/stable,now 1-4 all [installed]
    graylog-integrations-plugins/stable,now 4.2.4-1 all [installed]
    graylog-server/stable,now 4.2.4-1 all [installed]

  • Service logs, configurations, and environment variables:

3. What steps have you already taken to try and solve the problem?
I really have no idea what to do.

4. How can the community help?
Find an explanation why this happens


Is it possible that the message resides in more than one stream/index, and that in either the GUI or the API you are constraining your search to show only one of them?

Thanks for your answer. I did not create any stream, I just use the “All messages” stream. Same for indices, I just use the “Default index set”. So it should not be possible that messages reside in more than one stream/index.

It still sounds to me like each search type is accessing a different index. Can you post all the pipeline rules that the message goes through (using the forum tools like </> to make the formatting nice)? Are you using the function route_to_stream() at any point? Are you constraining both the GUI search and the API search specifically to the All Messages stream?

This is the rule for stage 0 (messages satisfying at least one rule in this stage will continue to the next stage):

rule "Check if user or computer account is given"
when
    // Event 4663 where the subject name contains no "$" (computer account names end in "$")
    (not contains(to_string($message.winlogbeat_winlog_event_data_SubjectUserName),"$")) and contains(to_string($message.winlogbeat_event_code),"4663")
then
    // Overwrite the username with a fixed placeholder
    set_field("winlogbeat_winlog_event_data_SubjectUserName", "ANONYMIZED");
end

and here the rule for stage 1:

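rule "Anonymize usernames in message field"
when
    has_field("message")
then
    // Group 1 is greedy, so it runs up to the last "Account Name:" in the message;
    // the rest of that line is replaced with the placeholder.
    let regex = "([\\S\\s]*Account Name:\\s*)(.*)";
    set_field("message", regex_replace(regex,to_string($message.message),"$1ANONYMIZED"));
end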

Using the GUI I can select the stream “All messages” and get the anonymized results.
I use this to access the data using REST API:

curl -i -X POST \
-u 'EventReader:TQlCyGdrrlWuhNVy2sKM' \
-H 'Content-Type: application/json' \
-H 'Accept: text/csv' \
-H 'X-Requested-By: cli' \
'http://graylog.xxx.xxx:9000/api/views/search/messages' -d \
'{ 
  "timerange": [
    "absolute",{
      "from": "2021-12-22T00:00:00.000Z",
      "to": "2021-12-23T18:55:00.000Z"
    }
  ],
  "fields_in_order": [
    "timestamp",
    "ObjectName", 
    "SubjectUserName"
  ]
 }'

If you constrain your API call to the All Messages stream, what do you get? I have little experience with the API, but based on Jan Doberstein’s post here, it seems you could add the stream as below:

curl -i -X POST \
-u 'EventReader:TQlCyGdrrlWuhNVy2sKM' \
-H 'Content-Type: application/json' \
-H 'Accept: text/csv' \
-H 'X-Requested-By: cli' \
'http://graylog.xxx.xxx:9000/api/views/search/messages' -d \
'{
  "streams": [
    "<stream-ID-Here>"
  ],
  "timerange": [
    "absolute",{
      "from": "2021-12-22T00:00:00.000Z",
      "to": "2021-12-23T18:55:00.000Z"
    }
  ],
  "fields_in_order": [
    "timestamp",
    "ObjectName", 
    "SubjectUserName"
  ]
 }'

I added the stream to the API call:

curl -i -X POST \
-u 'EventReader:xxx' \
-H 'Content-Type: application/json' \
-H 'Accept: text/csv' \
-H 'X-Requested-By: cli' \
'http://xxx:9000/api/views/search/messages' -d \
'{
  "streams": [
    "000000000000000000000001"
  ],
  "timerange": [
    "absolute",{
      "from": "2021-12-27T00:38:00.000Z",
      "to": "2021-12-27T00:39:00.000Z"
    }
  ],
  "fields_in_order": [
    "timestamp",
    "ObjectName", 
    "SubjectUserName"
  ]
 }'

and again got the usernames, not the anonymized data.

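In the rules you are replacing winlogbeat_winlog_event_data_SubjectUserName, but in the API call you are requesting SubjectUserName. Those are two different field names, so the API output would show a field that the pipeline never touches.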

Are you creating a field somewhere called SubjectUserName?
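
One way to check this is to repeat the API call but request the exact fields the two rules write to. The following is only a sketch under assumptions: the host name and the GRAYLOG_USER / GRAYLOG_PASS variables are placeholders, the stream ID and time range are copied from the call above, and ObjectName is left out because its stored field name is not shown in this thread.

```shell
#!/bin/sh
# Sketch only. Host and credentials are placeholders, not real values.
GRAYLOG_URL='http://graylog.example.org:9000'   # hypothetical host

# Field that the stage 0 rule overwrites with "ANONYMIZED".
USER_FIELD='winlogbeat_winlog_event_data_SubjectUserName'

# Build the JSON body. fields_in_order uses the full field name from the
# rule, plus "message", which the stage 1 rule rewrites.
build_payload() {
  cat <<EOF
{
  "streams": ["000000000000000000000001"],
  "timerange": ["absolute", {
    "from": "2021-12-27T00:38:00.000Z",
    "to": "2021-12-27T00:39:00.000Z"
  }],
  "fields_in_order": ["timestamp", "message", "$USER_FIELD"]
}
EOF
}

# Send the search. Defined as a function so nothing is sent until it is called.
# GRAYLOG_USER and GRAYLOG_PASS are assumed to be set in the environment.
search_messages() {
  build_payload | curl -s -X POST \
    -u "$GRAYLOG_USER:$GRAYLOG_PASS" \
    -H 'Content-Type: application/json' \
    -H 'Accept: text/csv' \
    -H 'X-Requested-By: cli' \
    "$GRAYLOG_URL/api/views/search/messages" -d @-
}

# Print the body for inspection; call search_messages to actually send it.
payload=$(build_payload)
printf '%s\n' "$payload"
```

If the CSV returned by search_messages shows ANONYMIZED in that column, the pipeline is working and the earlier call was simply reading a different field. If SubjectUserName exists as a separate field, it would need its own set_field() rule.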
