Plugin CSV function debug on pipeline


#1

Hi everyone,
I'm trying to use https://github.com/alcampos/graylog-plugin-function-csv from the marketplace. I want to create pipeline rules that extract values using this add-on. Let me explain :slight_smile:

  • I have a Filebeat input that receives CSV log files with an added “log_type” field.
  • I have 3 streams, one for each “log_type” field value.
  • Each stream has an associated pipeline with a rule (in stage 0) that uses the CSV add-on for parsing.

Here is an example of one of my 3 rules:

    rule "Logtype_BOUNCE"
when
	true
then
  let csv_fields = "type,timeLogged,timeQueued,orig,rcpt,orcpt,dsnAction,dsnStatus,dsnDiag,dsnMta,bounceCat,srcType,srcMta,dlvType,dlvSourceIp,dlvDestinationIp,dlvEsmtpAvailable,dlvSize,vmta,jobId,envId,queue,vmtaPool,header_x-id,dsnReportingMta";
  let csv_parsed = csv(csv_fields: csv_fields, csv_text: to_string($message.full_message), csv_separator: ",", dummy_value:"not_used");
  set_fields(csv_parsed);
end 
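
My understanding (based on the plugin’s README) is that csv() returns a map of field names to values, which set_fields() then writes onto the message. A tiny sketch of the behaviour I expect, with illustrative values:

rule "csv_sketch"
when
  true
then
  // Illustrative only: a three-column CSV row parsed into three message fields (a=1, b=2, c=3)
  let parsed = csv(csv_fields: "a,b,c", csv_text: "1,2,3", csv_separator: ",", dummy_value: "not_used");
  set_fields(parsed);
end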

Here is an example message (taken from the Graylog message field). It’s comma-separated CSV generated by an MTA. I anonymised the values, and some fields can be empty:

b,2018-05-28 12:30:40+0200,2018-05-25 18:06:02+0200,bounce.XXXXXXXXXXXXXXX@XXXXXXX.fr,XXXXXXXX@XXXXX.fr,,failed,5.0.0 (undefined status),smtp;554 delivery error: dd Sorry your message to XXXXXXXXXX@XXXXXXX.fr cannot be delivered. This account has been disabled or discontinued [#102]. - mta1058.mail.ir2.yahoo.com,mx-eu.mail.am0.yahoodns.net (188.125.69.79),inactive-mailbox,smtp,XXXXX (XX.XX.XX.XX),smtp,XX.XX.XX.XX,XX.XX.XX.XX,"PIPELINING,8BITMIME,SIZE,STARTTLS",10755,XXXXX,,,yahoo-eu.rollup/XXXXX,,,

For debugging:

  • Add-on installation OK: the function shows up in my pipeline rule editor
  • Nothing in the server logs
  • The streams work well, I have logs routed to each of them
  • No errors in the rules menu

Processor Status

  • 1 | GeoIP Resolver | active
  • 2 | Pipeline Processor | active
  • 3 | AWS Instance Name Lookup | active
  • 4 | Message Filter Chain | active

Pipeline Simulation:

Simulation results

These are the results of processing the loaded message. Processing took 258 µs.

    0 μs      Starting message processing
    47 μs     Message 6275ef11-6272-11e8-b3b2-0050560102ca running [Pipeline 'logtype_BOUNCE' (5b0bcbc3d68d6b035a8f6b16)] for streams [5b0bcbadd68d6b035a8f6afb]
    165 μs    Enter Stage 0
    175 μs    Completed Stage 0 for Pipeline 'logtype_BOUNCE' (5b0bcbc3d68d6b035a8f6b16), continuing to next stage
    180 μs    Exit Stage 0
    188 μs    Finished message processing

Problem:
The CSV add-on doesn’t extract any values and no fields are created.

Thanks for your help :sunny:


(Jochen) #2

I don’t see any mention of “Logtype_BOUNCE” in the simulation.


#3

I apologize, that was a mistake: I used the wrong pipeline and copy-pasted the simulation of the “bounce” stream. I’ve updated my message.


(Jochen) #4

You can try using the debug() function to print pre- and postconditions inside the rules.
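
For example, a minimal sketch (the rule name and the field being dumped are just placeholders):

rule "debug_example"
when
  true
then
  debug("before parsing");   // fixed marker, shows up as "PIPELINE DEBUG: before parsing" in the server log
  debug($message.message);   // dumps the current value of the "message" field
end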


#5

Thanks @jochen, I have nothing in the Graylog server logs (/var/log/graylog-server/server.log). Do I need to restart the Graylog service? It’s strange because I do have throughput on my pipeline.


(Jochen) #6

No, the debug() function works without a restart.

Are you sure that your pipeline function is executed at all?


#7

Hi Jochen,
here is a screenshot of my 3 pipelines. Very strange, because I do have throughput as you can see.

And here is my BOUNCE rule with the debug() calls:

rule "Logtype_BOUNCE"
when
	true
then
  let csv_fields = "type,timeLogged,timeQueued,orig,rcpt,orcpt,dsnAction,dsnStatus,dsnDiag,dsnMta,bounceCat,srcType,srcMta,dlvType,dlvSourceIp,dlvDestinationIp,dlvEsmtpAvailable,dlvSize,vmta,jobId,envId,queue,vmtaPool,header_x-id,dsnReportingMta";
debug(csv_fields);

  let csv_parsed = csv(csv_fields: csv_fields, csv_text: to_string($message.full_message), csv_separator: ",", dummy_value:"not_used");
  debug(csv_parsed);  
  
  set_fields(csv_parsed);
  
debug(csv_fields);
debug(csv_parsed);
end

Is this the right way to use the debug function?

Have a good day.


(Jochen) #8

Yes, this would print the (String) values of the arguments in the Graylog node’s logs.

Make sure that you don’t suppress logs on INFO level in your log4j2.xml configuration file.


#9

It’s a fresh install; here is the log4j2 file:

    <?xml version="1.0" encoding="UTF-8"?>
    <Configuration packages="org.graylog2.log4j" shutdownHook="disable">
        <Appenders>
            <RollingFile name="rolling-file" fileName="/var/log/graylog-server/server.log" filePattern="/var/log/graylog-server/server.log.%i.gz">
                <PatternLayout pattern="%d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX} %-5p [%c{1}] %m%n"/>
                <Policies>
                    <SizeBasedTriggeringPolicy size="50MB"/>
                </Policies>
                <DefaultRolloverStrategy max="10" fileIndex="min"/>
            </RollingFile>

            <!-- Internal Graylog log appender. Please do not disable. This makes internal log messages available via REST calls. -->
            <Memory name="graylog-internal-logs" bufferSize="500"/>
        </Appenders>
        <Loggers>
            <!-- Application Loggers -->
            <Logger name="org.graylog2" level="info"/>
            <Logger name="com.github.joschi.jadconfig" level="warn"/>
            <!-- This emits a harmless warning for ActiveDirectory every time which we can't work around :( -->
            <Logger name="org.apache.directory.api.ldap.model.message.BindRequestImpl" level="error"/>
            <!-- Prevent DEBUG message about Lucene Expressions not found. -->
            <Logger name="org.elasticsearch.script" level="warn"/>
            <!-- Disable messages from the version check -->
            <Logger name="org.graylog2.periodical.VersionCheckThread" level="off"/>
            <!-- Suppress crazy byte array dump of Drools -->
            <Logger name="org.drools.compiler.kie.builder.impl.KieRepositoryImpl" level="warn"/>
            <!-- Silence chatty natty -->
            <Logger name="com.joestelmach.natty.Parser" level="warn"/>
            <!-- Silence Kafka log chatter -->
            <Logger name="kafka.log.Log" level="warn"/>
            <Logger name="kafka.log.OffsetIndex" level="warn"/>
            <!-- Silence useless session validation messages -->
            <Logger name="org.apache.shiro.session.mgt.AbstractValidatingSessionManager" level="warn"/>
            <Root level="warn">
                <AppenderRef ref="rolling-file"/>
                <AppenderRef ref="graylog-internal-logs"/>
            </Root>
        </Loggers>
    </Configuration>


(Jochen) #10

2 posts were split to a new topic: Unknown Beats protocol


(Jochen) #11

The processing pipeline logs under org.graylog.plugins.pipelineprocessor at the INFO level, but the root log level threshold is WARN in your setup, so those messages are discarded. Either add a dedicated <Logger> entry for that package or raise the Root level to info.


#12

I changed it as follows:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration packages="org.graylog2.log4j" shutdownHook="disable">
    <Appenders>
        <RollingFile name="rolling-file" fileName="/var/log/graylog-server/server.log" filePattern="/var/log/graylog-server/server.log.%i.gz">
            <PatternLayout pattern="%d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX} %-5p [%c{1}] %m%n"/>
            <Policies>
                <SizeBasedTriggeringPolicy size="50MB"/>
            </Policies>
            <DefaultRolloverStrategy max="10" fileIndex="min"/>
        </RollingFile>

        <!-- Internal Graylog log appender. Please do not disable. This makes internal log messages available via REST calls. -->
        <Memory name="graylog-internal-logs" bufferSize="500"/>
    </Appenders>
    <Loggers>
        <!-- Application Loggers -->
        <Logger name="org.graylog2" level="info"/>
        <Logger name="com.github.joschi.jadconfig" level="warn"/>
        <!-- This emits a harmless warning for ActiveDirectory every time which we can't work around :( -->
        <Logger name="org.apache.directory.api.ldap.model.message.BindRequestImpl" level="error"/>
        <!-- Prevent DEBUG message about Lucene Expressions not found. -->
        <Logger name="org.elasticsearch.script" level="warn"/>
        <!-- Disable messages from the version check -->
        <Logger name="org.graylog2.periodical.VersionCheckThread" level="off"/>
        <!-- Suppress crazy byte array dump of Drools -->
        <Logger name="org.drools.compiler.kie.builder.impl.KieRepositoryImpl" level="warn"/>
        <!-- Silence chatty natty -->
        <Logger name="com.joestelmach.natty.Parser" level="warn"/>
        <!-- Silence Kafka log chatter -->
        <Logger name="kafka.log.Log" level="warn"/>
        <Logger name="kafka.log.OffsetIndex" level="warn"/>
        <!-- Silence useless session validation messages -->
        <Logger name="org.apache.shiro.session.mgt.AbstractValidatingSessionManager" level="warn"/>
        <Root level="info">
            <AppenderRef ref="rolling-file"/>
            <AppenderRef ref="graylog-internal-logs"/>
        </Root>
    </Loggers>
</Configuration>

The info level seems to work:

root@graylog:~# tailf /var/log/graylog-server/server.log
2018-05-30T11:08:08.499+02:00 INFO  [HttpServer] [HttpServer] Started.
2018-05-30T11:08:08.499+02:00 INFO  [JerseyService] Started REST API at <http://XX.XX.XX.XX:9000/api/>
2018-05-30T11:08:08.500+02:00 INFO  [JerseyService] Started Web Interface at <http://XX.XX.XX.XX:9000/>
2018-05-30T11:08:08.500+02:00 INFO  [ServiceManagerListener] Services are healthy
2018-05-30T11:08:08.501+02:00 INFO  [ServerBootstrap] Services started, startup times in ms: {OutputSetupService [RUNNING]=17, BufferSynchronizerService [RUNNING]=27, KafkaJournal [RUNNING]=43, InputSetupService [RUNNING]=116, ConfigurationEtagService [RUNNING]=130, JournalReader [RUNNING]=131, StreamCacheService [RUNNING]=165, PeriodicalsService [RUNNING]=167, LookupTableService [RUNNING]=320, JerseyService [RUNNING]=11286}
2018-05-30T11:08:08.504+02:00 INFO  [ServerBootstrap] Graylog server up and running.
2018-05-30T11:08:08.504+02:00 INFO  [InputSetupService] Triggering launching persisted inputs, node transitioned from Uninitialized [LB:DEAD] to Running [LB:ALIVE]
2018-05-30T11:08:08.525+02:00 INFO  [InputStateListener] Input [Beats/5b07cd46d68d6b60a6a6471c] is now STARTING
2018-05-30T11:08:08.577+02:00 WARN  [NettyTransport] receiveBufferSize (SO_RCVBUF) for input BeatsInput{title=XXXXXX-ACCT, type=org.graylog.plugins.beats.BeatsInput, nodeId=null} should be 1048576 but is 212992.
2018-05-30T11:08:08.580+02:00 INFO  [InputStateListener] Input [Beats/5b07cd46d68d6b60a6a6471c] is now RUNNING

Streams --> Pipelines are still active (I have throughput on my streams) and my pipelines are linked to my streams. But no pipeline debug info in the logs :confused:


(Jochen) #13

Try adding the following rule to a pipeline connected to one or more streams:

rule "debug-marker"
when
  true
then
  debug("******** TEST ********");
end

If the logging works, this will unconditionally print a log line with the content “******** TEST ********” in the Graylog node’s logs.

Additionally, you have to restart Graylog if you’ve modified the log4j2.xml file.


#14

Still nothing in the Graylog logs… crazy.

I have logs on my ACCT stream:

The pipeline is linked to the stream:

My ACCT pipeline, with only 1 stage and the debug rule:

The debug rule (also used in the 2 other pipelines):

My log4j config:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration packages="org.graylog2.log4j" shutdownHook="disable">
    <Appenders>
        <RollingFile name="rolling-file" fileName="/var/log/graylog-server/server.log" filePattern="/var/log/graylog-server/server.log.%i.gz">
            <PatternLayout pattern="%d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX} %-5p [%c{1}] %m%n"/>
            <Policies>
                <SizeBasedTriggeringPolicy size="50MB"/>
            </Policies>
            <DefaultRolloverStrategy max="10" fileIndex="min"/>
        </RollingFile>

        <!-- Internal Graylog log appender. Please do not disable. This makes internal log messages available via REST calls. -->
        <Memory name="graylog-internal-logs" bufferSize="500"/>
    </Appenders>
    <Loggers>
        <!-- Application Loggers -->
        <Logger name="org.graylog2" level="info"/>
        <Logger name="com.github.joschi.jadconfig" level="warn"/>
        <!-- This emits a harmless warning for ActiveDirectory every time which we can't work around :( -->
        <Logger name="org.apache.directory.api.ldap.model.message.BindRequestImpl" level="error"/>
        <!-- Prevent DEBUG message about Lucene Expressions not found. -->
        <Logger name="org.elasticsearch.script" level="warn"/>
        <!-- Disable messages from the version check -->
        <Logger name="org.graylog2.periodical.VersionCheckThread" level="off"/>
        <!-- Suppress crazy byte array dump of Drools -->
        <Logger name="org.drools.compiler.kie.builder.impl.KieRepositoryImpl" level="warn"/>
        <!-- Silence chatty natty -->
        <Logger name="com.joestelmach.natty.Parser" level="warn"/>
        <!-- Silence Kafka log chatter -->
        <Logger name="kafka.log.Log" level="warn"/>
        <Logger name="kafka.log.OffsetIndex" level="warn"/>
        <!-- Silence useless session validation messages -->
        <Logger name="org.apache.shiro.session.mgt.AbstractValidatingSessionManager" level="warn"/>
        <Root level="info">
            <AppenderRef ref="rolling-file"/>
            <AppenderRef ref="graylog-internal-logs"/>
        </Root>
    </Loggers>
</Configuration>

Should I put every subsystem in log4j at “debug” level?

Graylog restarted, and still nothing in server.log :exploding_head:


(Jochen) #15

Check the order of message processors on the System/Configurations page.
The pipeline processor has to come last if you connected the pipelines to streams (because the stream matching has to happen before).


#16

Pipeline debugging resolved, my pipeline processor was in the second position! Thanks @jochen!!

Now I can debug the CSV function. I added these debug calls to my rule, and there is a strange “NULL” value for the message. Perhaps that’s why I get no CSV processing.

My rule:

rule "Logtype_ACCT"
when
	true
then
  let csv_fields = "type,timeLogged,timeQueued,orig,rcpt,orcpt,dsnAction,dsnStatus,dsnDiag,dsnMta,bounceCat,srcType,srcMta,dlvType,dlvSourceIp,dlvDestinationIp,dlvEsmtpAvailable,dlvSize,vmta,jobId,envId,queue,vmtaPool,header_From";
  debug("*********csv_fields debug*********");
  debug(csv_fields);
  debug("************$message.full_message debug************");
  debug($message.full_message);
  
  let csv_parsed = csv(csv_fields: csv_fields, csv_text: to_string($message.full_message), csv_separator: ",", dummy_value:"not_used");

  set_fields(csv_parsed);
end 

In my server logs:

    2018-05-30T13:10:54.208+02:00 INFO  [Function] PIPELINE DEBUG: *********csv_fields debug*********
    2018-05-30T13:10:54.208+02:00 INFO  [Function] PIPELINE DEBUG: type,timeLogged,timeQueued,orig,rcpt,orcpt,dsnAction,dsnStatus,dsnDiag,dsnMta,bounceCat,srcType,srcMta,dlvType,dlvSourceIp,dlvDestinationIp,dlvEsmtpAvailable,dlvSize,vmta,jobId,envId,queue,vmtaPool,header_From
    2018-05-30T13:10:54.208+02:00 INFO  [Function] PIPELINE DEBUG: ************$message.full_message debug************
    2018-05-30T13:10:54.208+02:00 INFO  [Function] PIPELINE DEBUG: Passed value is NULL.
    2018-05-30T13:10:54.208+02:00 INFO  [Function] PIPELINE DEBUG: *********csv_fields debug*********
    2018-05-30T13:10:54.209+02:00 INFO  [Function] PIPELINE DEBUG: type,timeLogged,timeQueued,orig,rcpt,orcpt,dsnAction,dsnStatus,dsnDiag,dsnMta,bounceCat,srcType,srcMta,dlvType,dlvSourceIp,dlvDestinationIp,dlvEsmtpAvailable,dlvSize,vmta,jobId,envId,queue,vmtaPool,header_From
    2018-05-30T13:10:54.209+02:00 INFO  [Function] PIPELINE DEBUG: ************$message.full_message debug************
    2018-05-30T13:10:54.209+02:00 INFO  [Function] PIPELINE DEBUG: Passed value is NULL.
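
A quick way to check whether the field exists at all would be something like this (sketch, using the built-in has_field() function):

rule "check_field_presence"
when
  true
then
  // prints "true" or "false" depending on whether the message carries a "full_message" field
  debug(to_string(has_field("full_message")));
end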

(Jochen) #17

As far as I can see from your screenshots, there is no “full_message” field. Maybe you want to access the “message” field (via $message.message).
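
For example, a sketch of what that could look like (the field list is shortened here for illustration, and has_field() is an optional guard):

rule "Logtype_ACCT"
when
  has_field("message")   // only run when the field actually exists
then
  // use your full field list here; shortened for the sketch
  let csv_fields = "type,timeLogged,timeQueued";
  let csv_parsed = csv(csv_fields: csv_fields, csv_text: to_string($message.message), csv_separator: ",", dummy_value: "not_used");
  set_fields(csv_parsed);
end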


#18

Correct… everything works correctly now!

“$message.full_message” came from the code example on GitHub:

I will open an issue there to spare others the headache.

Thanks a lot guys!


(system) #19

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.