What are analyzed fields?

Hi-
I am completely new to Graylog (and SIEM systems on the whole) and am trying to understand the search query language, but I am having some difficulty.
I have a few questions and issues, and I would much appreciate it if anyone could answer them:

  1. Regarding this sentence from http://docs.graylog.org/en/2.2/pages/queries.html:

Also note that message, full_message, and source are the only fields that are being analyzed by default. While wildcard searches (using * and ?) work on all indexed fields, analyzed fields will behave a little bit different. See wildcard and regexp queries for details.

I could not find out what analyzed fields are or how they work differently (I am guessing they use regex rather than wildcards, but I haven’t seen that written anywhere to confirm it).

What are analyzed fields, how do they work differently from standard fields (are there any differences other than searching), and can I add new fields to be analyzed?

  2. I am unable to use wildcards successfully. For example, this works: “Domain: DOMAINNAME” but this doesn’t: “Domain: DOMAI*”. Are searches case insensitive (other than keywords such as AND or NOT)? Because that doesn’t work either.

  3. I am collecting logs from Windows servers (using nxlog), and noticed I have some similar search fields such as ‘Domain’ and ‘AccountDomain’ (it’s actually the only similar pair I have noticed, but I assume there are more). Is there any difference between the two? Can I merge them together?

Thanks in advance for the help.

Hej @gkman

The difference between analyzed and not_analyzed is explained in the Elasticsearch index mapping documentation. You need to create a custom mapping to be able to do full-text search on fields other than message, full_message, and source.
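To make the distinction a bit more concrete, here is a minimal Python sketch. It does not use Elasticsearch at all; `analyze`, `full_text_match`, and `term_match` are hypothetical helpers written only for illustration, and the real standard analyzer does considerably more than this word split and lowercasing. The idea it tries to show: an analyzed field is broken into lowercase tokens so single words match regardless of case, while a not_analyzed field is stored as one exact term.

```python
import re


def analyze(text):
    """Very rough stand-in for an analyzer: split into word tokens and lowercase them."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def full_text_match(stored_value, query):
    """Analyzed field: stored value and query are both tokenized; a shared token is a hit."""
    return bool(set(analyze(stored_value)) & set(analyze(query)))


def term_match(stored_value, query):
    """not_analyzed field: the whole value is one term and must equal the query exactly."""
    return stored_value == query


if __name__ == "__main__":
    message = "Failed login for Admin"
    print(full_text_match(message, "admin"))   # analyzed: single word, any case
    print(term_match(message, "admin"))        # not_analyzed: needs the full exact value
    print(term_match("MYDOMAIN", "mydomain"))  # not_analyzed: case matters
    print(term_match("MYDOMAIN", "MYDOMAIN"))
```

Under this simplified model, a search on a not_analyzed field such as Domain only hits when the value is given with its exact case, which would be consistent with the case-sensitivity problem described above.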

You may also want to look at the searching documentation again to understand how searching works.

regards
Jan

Hi @jan
I’m afraid I still don’t understand.
What is the difference between full-text search and term search?

Also, is there a reason why I shouldn’t index all my fields?

@gkman,

About item 2) “Note that leading wildcards are disabled to avoid excessive memory consumption! You can enable them in your Graylog configuration file:”

allow_leading_wildcard_searches = true

Change it in your “server.conf” and restart Graylog.

I noticed that, but it’s not the case here (the example I gave is not a leading wildcard).
I found my answer later, when I realized that only indexed fields can be used with wildcards.

I am still left with two other unanswered questions:
1- What is the difference between full-text search and term search?
2- Is there a reason why I shouldn’t index all my fields?

Hi –

Did you ever find an answer for your two questions? I have the same two questions.

Any reason I shouldn’t set all fields to be analyzed?

Hej @TJgrayD

That is not easy to answer, as this might be OK in your environment but not in others. It helps to learn how Elasticsearch handles the data internally: what makes the difference, and what happens if you set a field to analyzed. The following might be a good starting point for that.

https://www.elastic.co/guide/en/elasticsearch/guide/current/analysis-intro.html#_when_analyzers_are_used

Thanks, so if I’m reading that correctly, it sounds like by default elasticsearch will analyze all string fields with the standard analyzer, unless modified by a template. So it looks like graylog specifically makes the decision via its default template to not analyze all string fields. Right?

Is this the portion of the default template which disables analyzed fields?

        "store_generic" : {
          "match" : "*",
          "mapping" : {
            "index" : "not_analyzed"

If so, what would I put in my custom template to only analyze all string fields? I want to make sure I don’t analyze non-string fields.

Thanks!

We have described in the documentation how to get the current mapping and how to set your own mapping:

http://docs.graylog.org/en/2.4/pages/configuration/elasticsearch.html#custom-index-mappings

Thanks @jan I’ve read that (several times!), but I think as someone who isn’t as familiar with elasticsearch as you are, and just using it as a means to a logging end, it would be helpful to get some context related to my question.

The examples in the doc linked only show how to change specific fields. What would I do if I want to analyze all string fields?

Thanks!
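For what it is worth, one possible direction is an Elasticsearch dynamic template that matches on the detected JSON type rather than on the field name. The Python sketch below only builds and prints such a template body; nothing in it has been tested against a Graylog installation. It assumes Elasticsearch 2.x mapping syntax (`"type": "string"` with `"index": "analyzed"`; on 5.x the equivalent type would be `text`), the `graylog_*` index pattern and `message` mapping type used in the Graylog custom mapping documentation, and a made-up template name `analyze_strings`. Whether this takes precedence over the `store_generic` entry in Graylog's default template is also an assumption that would need checking.

```python
import json


def build_template(index_pattern="graylog_*"):
    """Build an index template body that maps every detected string field as analyzed.

    Assumes Elasticsearch 2.x mapping syntax; "analyze_strings" is a hypothetical name.
    """
    return {
        "template": index_pattern,
        "mappings": {
            "message": {
                "dynamic_templates": [
                    {
                        "analyze_strings": {
                            # Only fields whose JSON value is a string are affected,
                            # so numeric and date fields keep their own mappings.
                            "match_mapping_type": "string",
                            "mapping": {"type": "string", "index": "analyzed"},
                        }
                    }
                ]
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON so it can be saved to a file and uploaded as a custom template.
    print(json.dumps(build_template(), indent=2))
```

The printed JSON could then be uploaded following the custom index mapping steps in the Graylog documentation linked above. It would only apply to indices created after the template is in place, so the active index would need to be rotated first.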
