Too Many Fields (cont)

@drewmiranda-gl
Following on from the earlier "Too many fields" topic in Graylog Central (peer support).

We’ve finally managed to get the time and paperwork in place to run the Python script. Unfortunately, when I run the script I just get some blue text showing the URL I’ve just typed in; nothing else seems to happen.

I ran the script as you demonstrated. Running it without a username/password resulted in a 401 error and a crash.

python3 es-os-count-field-usage.py --api-url https://username:password@server.domain.com:9200

I’m looking into this. My assumption is that the script naively takes the URL as given and executes its web requests against it. I’m working on getting my lab up and running with OpenSearch TLS (running into some issues that a coworker will help me with tomorrow :) )

Sorry for the delay, it took some doing to get a working OpenSearch cluster with TLS :)
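For anyone following along, here is a minimal sketch of how credentials embedded in a URL can be separated out before making requests. It is not taken from es-os-count-field-usage.py; the function name split_credentials is hypothetical and the sketch only uses the Python standard library.

```python
from urllib.parse import urlsplit, urlunsplit


def split_credentials(api_url):
    """Return (url_without_credentials, username, password).

    Hypothetical helper for illustration; not part of the actual script.
    """
    parts = urlsplit(api_url)
    # Rebuild the network location from host and port only.
    # Note: hostname is lowercased and IPv6 brackets are not restored here.
    host = parts.hostname or ""
    netloc = f"{host}:{parts.port}" if parts.port else host
    clean_url = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
    return clean_url, parts.username, parts.password


clean_url, username, password = split_credentials(
    "https://username:password@server.domain.com:9200"
)
print(clean_url)  # https://server.domain.com:9200
```

The username and password can then be passed to the HTTP client as basic auth instead of being left inside the URL.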

I added updates to the script:

  • Allow disabling TLS verification (to accept privately signed or self signed certs)
  • Allow specifying a username/password to use for HTTP basic auth
  • Suppress TLS warnings

An example using the updated arguments:

python3 es-os-count-field-usage.py --api-url https://192.168.0.2:9200/ --no-verify --username admin --password admin

Change the --api-url, --username, and --password arguments to meet your needs.
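As a rough illustration of what those three options involve, here is a standard-library sketch. It is an assumption about the general approach, not the script's actual code: the flag names come from the example command above, while parse_args and build_request are hypothetical names, and --no-verify is defined here as a plain store_true flag.

```python
import argparse
import base64
import ssl
import urllib.request


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--api-url", required=True)
    # store_true is available on every Python 3 version.
    parser.add_argument("--no-verify", action="store_true")
    parser.add_argument("--username")
    parser.add_argument("--password")
    return parser.parse_args(argv)


def build_request(args):
    """Return (request, ssl_context) without sending anything."""
    request = urllib.request.Request(args.api_url)
    if args.username is not None:
        # HTTP basic auth: base64 of "username:password".
        token = f"{args.username}:{args.password or ''}".encode("utf-8")
        request.add_header(
            "Authorization", "Basic " + base64.b64encode(token).decode("ascii")
        )
    context = ssl.create_default_context()
    if args.no_verify:
        # Accept privately signed or self-signed certificates.
        # check_hostname must be disabled before verify_mode is relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return request, context


args = parse_args([
    "--api-url", "https://192.168.0.2:9200/",
    "--no-verify", "--username", "admin", "--password", "admin",
])
request, context = build_request(args)
# To actually send it: urllib.request.urlopen(request, context=context)
```

Disabling verification is only reasonable for a lab or a cluster whose certificate you already trust by other means.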


Sorry @drewmiranda-gl, another error. It appears to be an issue with --no-verify.

line 16
AttributeError: module 'argparse' has no attribute 'BooleanOptionalAction'

I have noticed quite a lot of duplicated fields within single winlogbeat logs. It looks like a Graylog problem, as they’re not documented Beats fields, but I can’t find anything in the Graylog documentation either. I can’t fix this easily in Beats processors or pipelines, as there are just too many logs with this issue (and each is potentially slightly different).

e.g.
winlogbeat_event_code (not documented in Beats)
winlogbeat_winlog_event_id (is documented in Beats)

winlogbeat_event_provider (not documented in Beats)
winlogbeat_winlog_event_provider (is documented in Beats)

argparse.BooleanOptionalAction was added in Python 3.9, so I believe the minimum required Python is 3.9.

Alternatively, you can remove the argument and manually set the verify property in the Python file.
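If editing the file is an option, another workaround is to fall back to a pair of plain flags when BooleanOptionalAction is missing. This is a sketch only; the option name --verify and the way it is defined are assumptions, since the script's real argument definition is not shown in this thread.

```python
import argparse

parser = argparse.ArgumentParser()

# BooleanOptionalAction only exists on Python 3.9 and later.
boolean_action = getattr(argparse, "BooleanOptionalAction", None)
if boolean_action is not None:
    # One definition provides both --verify and --no-verify.
    parser.add_argument("--verify", action=boolean_action, default=True)
else:
    # Older Python: define the two flags by hand, sharing one destination.
    parser.add_argument("--verify", dest="verify", action="store_true", default=True)
    parser.add_argument("--no-verify", dest="verify", action="store_false")

args = parser.parse_args(["--no-verify"])
print(args.verify)  # False
```

Either branch leaves args.verify as True by default and False when --no-verify is passed, so the rest of the code does not need to know which Python version it is running on.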

Can you clarify what you mean by “not documented in Beats”? Beats collect log data, but there isn’t any standard or exhaustive list of every possible field.

Part of what is happening is that Graylog prepends winlogbeat_ to every field name (the prefix depends on the beat type; for Filebeat it would be filebeat_). So winlogbeat_event_code was originally event_code.
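To compare field names against what the beat originally sent, the prefix can be stripped again. A minimal sketch, assuming the prefix is simply the beat type followed by an underscore as described above; strip_beat_prefix is a hypothetical helper, not a Graylog function.

```python
def strip_beat_prefix(field_name, beat_type="winlogbeat"):
    """Remove the '<beat_type>_' prefix from a field name, if present."""
    prefix = beat_type + "_"
    if field_name.startswith(prefix):
        return field_name[len(prefix):]
    return field_name


for name in ("winlogbeat_event_code", "winlogbeat_winlog_event_id"):
    print(name, "->", strip_beat_prefix(name))
# winlogbeat_event_code -> event_code
# winlogbeat_winlog_event_id -> winlog_event_id
```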

Hope that helps.

@drewmiranda-gl we’re using Ubuntu Server 22.04, so the Python version is 3.10.

I understand how Graylog prepends things to the fields; we’ve been running Graylog for some years now.

event_code isn’t a documented Beats log field.

winlog_event_id is a documented Beats log field.

I was wondering whether event_code is a Graylog addition, as it doesn’t appear to be a winlogbeat field. I’m trying to identify where some of these duplicate fields are coming from.

By default, Graylog does not add any fields (beyond some metadata fields that start with gl2_). If fields exist, it will be for one of a couple of reasons:

  • the original message contained them
  • a pipeline rule (or extractor) created them

Assuming you don’t have any pipeline rules or extractors that are adding that field, I would bet it’s coming from the beat agent.
