Nxlog best practice to avoid 1000 fields in index limit


(Githubkatten) #1

Hi!
What do you think about the problem described below? I'd appreciate any suggestions!

We use nxlog to collect the Windows 7 Application, Security, Setup, and System event logs and send them to Graylog.
This creates a lot of fields in Graylog, approx. 1115.

This is the nxlog configuration we currently use; Graylog automatically maps every field it sends:

<Input eventlogIN>
    Module      im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id="0">
                <Select Path="Application">*</Select>
                <Select Path="Security">*</Select>
                <Select Path="Setup">*</Select>
                <Select Path="System">*</Select>
            </Query>
        </QueryList>
    </QueryXML>
</Input>

<Output eventlogOUT>
    Module      om_tcp
    Host        xxxxxxx
    Port        9999999
    OutputType  GELF_TCP
</Output>

<Route eventlog>
    Path        eventlogIN => eventlogOUT
</Route>

We want to avoid getting so many fields in Graylog, so that we don't hit the 1000-field limit in a single index.

Our current Graylog setup (A) is:

• 1 Graylog input for all 4 logs: Application, Security, Setup, System.
• 1 stream for all 4 logs.
• 1 index for all 4 logs.

We are considering some alternatives:

Alternative setup B:
• 1 Graylog input for all 4 logs.
• 1 stream with a stream rule for each of the 4 logs (field Channel must match exactly "Security", etc.) = 4 streams with stream rules.
• 1 index for each of the 4 streams = 4 indices.

When we try setup B, my dedicated Security stream still contains 1115 fields.

Alternative setup C:
• 1 Graylog input for each of the 4 logs = 4 separate inputs.
• 1 stream for each of the 4 logs = 1 stream per input.
• 1 index for each of the 4 logs = 1 index per input.
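
On the nxlog side, setup C might look roughly like the sketch below. It is only an illustration, not a tested configuration: it shows two of the four logs (Setup and System would follow the same pattern), the ports 12201 and 12202 are hypothetical and would have to match the GELF TCP inputs created in Graylog, and the host stays a placeholder. It also assumes the xm_gelf extension is loaded, which the GELF_TCP output type needs.

```
# Sketch only: ports and instance names are assumptions.
<Extension gelf>
    Module      xm_gelf
</Extension>

<Input applicationIN>
    Module      im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id="0">
                <Select Path="Application">*</Select>
            </Query>
        </QueryList>
    </QueryXML>
</Input>

<Input securityIN>
    Module      im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id="0">
                <Select Path="Security">*</Select>
            </Query>
        </QueryList>
    </QueryXML>
</Input>

# One output per log, each pointing at its own Graylog input.
<Output applicationOUT>
    Module      om_tcp
    Host        xxxxxxx
    Port        12201
    OutputType  GELF_TCP
</Output>

<Output securityOUT>
    Module      om_tcp
    Host        xxxxxxx
    Port        12202
    OutputType  GELF_TCP
</Output>

<Route application>
    Path        applicationIN => applicationOUT
</Route>

<Route security>
    Path        securityIN => securityOUT
</Route>
```

The trade-off compared with setup B is more configuration to maintain on every client in exchange for not needing stream rules on the Channel field.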

Any thoughts about this?
BR Andreas
============Update=============
Thanks for the input on this matter, @jochen.
We decided on “Alternative Setup B”, and it works.
No field limit warnings.
We also made a “garbage” stream and index that collects anything not captured by the 4 streams and their rules, just to be aware of anything “wrong”. We collect from about 2500 W7 nxlog clients, and it seems OK after 2 weeks.

@Magneton, yes, we will probably collect PowerShell logs too, as they might contain security-related events.
We use the Event Viewer to create queries; that's a good suggestion.
Regarding your question about whether it is best practice to have an individual index per stream: it seems like good practice, but as always it depends on the data :)


(Jochen) #2

I’d go with setup B or, if you really need or want to manage 4 inputs, setup C.

As long as the messages are sorted into the 4 separate streams and are removed from the “All messages” stream, this should work (given that the messages in each of these 4 categories have fewer than 1000 distinct field names per index).

If all else fails, you can use the Processing Pipeline to remove any fields you don’t need from the messages and reduce the number of distinct message fields that way.


(Jake Smith) #3

Githubkatten,

Why don’t you try to use filters on your NXlog configuration rather than sending everything?

This will cut down the fields and the event noise received by Graylog.

You can use event viewer in Windows to create your queries.

You are also not logging some important things, such as PowerShell logs.

Just search for “nxlog github” and you will find many examples.
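
As a rough sketch of what such filtering could look like, the Input block below narrows each Select with an XPath filter of the kind Event Viewer generates and drops two fields before sending. The chosen levels (1 to 3, i.e. critical, error, warning), the Security event IDs (4624 and 4625, logon success and failure), and the deleted fields are only illustrative assumptions; what to keep depends on your own requirements.

```
# Sketch only: levels, event IDs, and deleted fields are illustrative assumptions.
<Input eventlogIN>
    Module      im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id="0">
                <Select Path="Application">*[System[(Level=1 or Level=2 or Level=3)]]</Select>
                <Select Path="Security">*[System[(EventID=4624 or EventID=4625)]]</Select>
                <Select Path="Setup">*[System[(Level=1 or Level=2 or Level=3)]]</Select>
                <Select Path="System">*[System[(Level=1 or Level=2 or Level=3)]]</Select>
            </Query>
        </QueryList>
    </QueryXML>
    # Drop fields that are not needed before the event is sent to Graylog.
    Exec        delete($ProviderGuid);
    Exec        delete($Keywords);
</Input>
```

Filtering in the query reduces the number of events collected, while delete() reduces the number of fields per event; both help keep the field count in an index down.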

Cheers

Jake


(Jake Smith) #4

Jochen,

Sorry for hijacking the thread, but I wanted to ask:

Is it best practice to have an individual index for each stream?

Cheers Jake

