Plugin CSV function debug on pipeline

Hi everyone,
I'm trying to use https://github.com/alcampos/graylog-plugin-function-csv from the marketplace. I want to create pipeline rules that extract values using this add-on. Let me explain :slight_smile:

  • I have a Filebeat input that receives CSV log files with an added "log_type" field.
  • I have 3 streams, one for each "log_type" field value.
  • Each stream has an associated pipeline with a rule (on stage 0) that uses the CSV add-on for parsing.

Here is an example of one of my 3 rules:

    rule "Logtype_BOUNCE"
when
	true
then
  let csv_fields = "type,timeLogged,timeQueued,orig,rcpt,orcpt,dsnAction,dsnStatus,dsnDiag,dsnMta,bounceCat,srcType,srcMta,dlvType,dlvSourceIp,dlvDestinationIp,dlvEsmtpAvailable,dlvSize,vmta,jobId,envId,queue,vmtaPool,header_x-id,dsnReportingMta";
  let csv_parsed = csv(csv_fields: csv_fields, csv_text: to_string($message.full_message), csv_separator: ",", dummy_value:"not_used");
  set_fields(csv_parsed);
end 

Here is an example message (from the Graylog message field); it's comma-separated CSV generated by an MTA. I anonymised the values, and some fields can be empty:

b,2018-05-28 12:30:40+0200,2018-05-25 18:06:02+0200,bounce.XXXXXXXXXXXXXXX@XXXXXXX.fr,XXXXXXXX@XXXXX.fr,,failed,5.0.0 (undefined status),smtp;554 delivery error: dd Sorry your message to XXXXXXXXXX@XXXXXXX.fr cannot be delivered. This account has been disabled or discontinued [#102]. - mta1058.mail.ir2.yahoo.com,mx-eu.mail.am0.yahoodns.net (188.125.69.79),inactive-mailbox,smtp,XXXXX (XX.XX.XX.XX),smtp,XX.XX.XX.XX,XX.XX.XX.XX,"PIPELINING,8BITMIME,SIZE,STARTTLS",10755,XXXXX,,,yahoo-eu.rollup/XXXXX,,,

For debugging:

  • Add-on installation OK: the function is available in my pipeline rules editor
  • nothing in the server logs
  • streams work well; I have logs routed to each of them
  • no errors in the rules menu

Processor Status

| # | Processor                 | Status |
|---|---------------------------|--------|
| 1 | GeoIP Resolver            | active |
| 2 | Pipeline Processor        | active |
| 3 | AWS Instance Name Lookup  | active |
| 4 | Message Filter Chain      | active |

Pipeline Simulation:

Simulation results

These are the results of processing the loaded message. Processing took 258 µs.

    0 µs    Starting message processing
    47 µs   Message 6275ef11-6272-11e8-b3b2-0050560102ca running [Pipeline 'logtype_BOUNCE' (5b0bcbc3d68d6b035a8f6b16)] for streams [5b0bcbadd68d6b035a8f6afb]
    165 µs  Enter Stage 0
    175 µs  Completed Stage 0 for Pipeline 'logtype_BOUNCE' (5b0bcbc3d68d6b035a8f6b16), continuing to next stage
    180 µs  Exit Stage 0
    188 µs  Finished message processing

Problem:
The CSV add-on doesn't extract any values, and no fields are created.

Thanks for your help :sunny:

I don’t see any mention of “Logtype_BOUNCE” in the simulation.

My apologies, that was a mistake: I used the wrong pipeline and pasted the simulation of the "bounce" stream. I have updated my message.

You can try using the debug() function to print pre- and postconditions inside the rules.


Thanks @jochen, I have nothing in the Graylog server logs (/var/log/graylog-server/server.log). Do I need to restart the Graylog service? It's strange, because I do see throughput on my pipeline.

No, the debug() function works without a restart.

Are you sure that your pipeline function is executed at all?

Hi Jochen,
here is a screenshot of my 3 pipelines; it's very strange, because I have throughput, as you can see.

And here is my BOUNCE rule with the debug function:

rule "Logtype_BOUNCE"
when
	true
then
  let csv_fields = "type,timeLogged,timeQueued,orig,rcpt,orcpt,dsnAction,dsnStatus,dsnDiag,dsnMta,bounceCat,srcType,srcMta,dlvType,dlvSourceIp,dlvDestinationIp,dlvEsmtpAvailable,dlvSize,vmta,jobId,envId,queue,vmtaPool,header_x-id,dsnReportingMta";
debug(csv_fields);

  let csv_parsed = csv(csv_fields: csv_fields, csv_text: to_string($message.full_message), csv_separator: ",", dummy_value:"not_used");
  debug(csv_parsed);  
  
  set_fields(csv_parsed);
  
debug(csv_fields);
debug(csv_parsed);
end

Is this the right way to use the debug function?

Have a good day.

Yes, this would print the (String) values of the arguments in the Graylog node’s logs.

Make sure that you don’t suppress logs on INFO level in your log4j2.xml configuration file.

It's a fresh install; here is the log4j2.xml file:

    <?xml version="1.0" encoding="UTF-8"?>
    <Configuration packages="org.graylog2.log4j" shutdownHook="disable">
        <Appenders>
            <RollingFile name="rolling-file" fileName="/var/log/graylog-server/server.log" filePattern="/var/log/graylog-server/server.log.%i.gz">
                <PatternLayout pattern="%d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX} %-5p [%c{1}] %m%n"/>
                <Policies>
                    <SizeBasedTriggeringPolicy size="50MB"/>
                </Policies>
                <DefaultRolloverStrategy max="10" fileIndex="min"/>
            </RollingFile>

            <!-- Internal Graylog log appender. Please do not disable. This makes internal log messages available via REST calls. -->
            <Memory name="graylog-internal-logs" bufferSize="500"/>
        </Appenders>
        <Loggers>
            <!-- Application Loggers -->
            <Logger name="org.graylog2" level="info"/>
            <Logger name="com.github.joschi.jadconfig" level="warn"/>
            <!-- This emits a harmless warning for ActiveDirectory every time which we can't work around :( -->
            <Logger name="org.apache.directory.api.ldap.model.message.BindRequestImpl" level="error"/>
            <!-- Prevent DEBUG message about Lucene Expressions not found. -->
            <Logger name="org.elasticsearch.script" level="warn"/>
            <!-- Disable messages from the version check -->
            <Logger name="org.graylog2.periodical.VersionCheckThread" level="off"/>
            <!-- Suppress crazy byte array dump of Drools -->
            <Logger name="org.drools.compiler.kie.builder.impl.KieRepositoryImpl" level="warn"/>
            <!-- Silence chatty natty -->
            <Logger name="com.joestelmach.natty.Parser" level="warn"/>
            <!-- Silence Kafka log chatter -->
            <Logger name="kafka.log.Log" level="warn"/>
            <Logger name="kafka.log.OffsetIndex" level="warn"/>
            <!-- Silence useless session validation messages -->
            <Logger name="org.apache.shiro.session.mgt.AbstractValidatingSessionManager" level="warn"/>
            <Root level="warn">
                <AppenderRef ref="rolling-file"/>
                <AppenderRef ref="graylog-internal-logs"/>
            </Root>
        </Loggers>
    </Configuration>


The processing pipeline logs via org.graylog.plugins.pipelineprocessor on the INFO level, but the default log level threshold (your Root logger) is WARN in your setup.
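One targeted fix (a sketch; the logger name comes from the line above, and events accepted by a named logger propagate to the Root appenders in log4j2) is to add a dedicated Logger entry next to the other Application Loggers, instead of raising the Root level:

    <!-- Surface pipeline processor messages, including debug() output,
         without raising the global threshold -->
    <Logger name="org.graylog.plugins.pipelineprocessor" level="info"/>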


Changed as follows; only the Root logger level was raised from warn to info, the rest of the file is unchanged:

    <Root level="info">
        <AppenderRef ref="rolling-file"/>
        <AppenderRef ref="graylog-internal-logs"/>
    </Root>

The info level seems to work:

root@graylog:~# tailf /var/log/graylog-server/server.log
2018-05-30T11:08:08.499+02:00 INFO  [HttpServer] [HttpServer] Started.
2018-05-30T11:08:08.499+02:00 INFO  [JerseyService] Started REST API at <http://XX.XX.XX.XX:9000/api/>
2018-05-30T11:08:08.500+02:00 INFO  [JerseyService] Started Web Interface at <http://XX.XX.XX.XX:9000/>
2018-05-30T11:08:08.500+02:00 INFO  [ServiceManagerListener] Services are healthy
2018-05-30T11:08:08.501+02:00 INFO  [ServerBootstrap] Services started, startup times in ms: {OutputSetupService [RUNNING]=17, BufferSynchronizerService [RUNNING]=27, KafkaJournal [RUNNING]=43, InputSetupService [RUNNING]=116, ConfigurationEtagService [RUNNING]=130, JournalReader [RUNNING]=131, StreamCacheService [RUNNING]=165, PeriodicalsService [RUNNING]=167, LookupTableService [RUNNING]=320, JerseyService [RUNNING]=11286}
2018-05-30T11:08:08.504+02:00 INFO  [ServerBootstrap] Graylog server up and running.
2018-05-30T11:08:08.504+02:00 INFO  [InputSetupService] Triggering launching persisted inputs, node transitioned from Uninitialized [LB:DEAD] to Running [LB:ALIVE]
2018-05-30T11:08:08.525+02:00 INFO  [InputStateListener] Input [Beats/5b07cd46d68d6b60a6a6471c] is now STARTING
2018-05-30T11:08:08.577+02:00 WARN  [NettyTransport] receiveBufferSize (SO_RCVBUF) for input BeatsInput{title=XXXXXX-ACCT, type=org.graylog.plugins.beats.BeatsInput, nodeId=null} should be 1048576 but is 212992.
2018-05-30T11:08:08.580+02:00 INFO  [InputStateListener] Input [Beats/5b07cd46d68d6b60a6a6471c] is now RUNNING

Streams --> Pipelines are still active (I have throughput on my streams) and my pipelines are linked to my streams. But there is no pipeline debug info in the logs :confused:

Try adding the following rule to a pipeline connected to one or more streams:

rule "debug-marker"
when
  true
then
  debug("******** TEST ********");
end

If the logging works, this will unconditionally print a log line with the content “******** TEST ********” in the Graylog node’s logs.
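Expect something along these lines in server.log (timestamp illustrative; the "[Function] PIPELINE DEBUG:" prefix is how debug() output appears with the pattern layout above):

    2018-05-30T11:10:00.000+02:00 INFO  [Function] PIPELINE DEBUG: ******** TEST ********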

Additionally, you have to restart Graylog if you’ve modified the log4j2.xml file.
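On a package-based install managed by systemd, that would be something like (assuming the standard graylog-server service name):

    sudo systemctl restart graylog-server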


Still nothing in the Graylog logs… crazy.

I have logs in my ACCT stream:

[screenshot]

The pipeline is linked to the stream:

[screenshot]

My ACCT pipeline, with only 1 stage and the debug rule:

[screenshot]

The debug rule (also used in the 2 other pipelines):

[screenshot]

My log4j2 config:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration packages="org.graylog2.log4j" shutdownHook="disable">
    <Appenders>
        <RollingFile name="rolling-file" fileName="/var/log/graylog-server/server.log" filePattern="/var/log/graylog-server/server.log.%i.gz">
            <PatternLayout pattern="%d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX} %-5p [%c{1}] %m%n"/>
            <Policies>
                <SizeBasedTriggeringPolicy size="50MB"/>
            </Policies>
            <DefaultRolloverStrategy max="10" fileIndex="min"/>
        </RollingFile>

        <!-- Internal Graylog log appender. Please do not disable. This makes internal log messages available via REST calls. -->
        <Memory name="graylog-internal-logs" bufferSize="500"/>
    </Appenders>
    <Loggers>
        <!-- Application Loggers -->
        <Logger name="org.graylog2" level="info"/>
        <Logger name="com.github.joschi.jadconfig" level="warn"/>
        <!-- This emits a harmless warning for ActiveDirectory every time which we can't work around :( -->
        <Logger name="org.apache.directory.api.ldap.model.message.BindRequestImpl" level="error"/>
        <!-- Prevent DEBUG message about Lucene Expressions not found. -->
        <Logger name="org.elasticsearch.script" level="warn"/>
        <!-- Disable messages from the version check -->
        <Logger name="org.graylog2.periodical.VersionCheckThread" level="off"/>
        <!-- Suppress crazy byte array dump of Drools -->
        <Logger name="org.drools.compiler.kie.builder.impl.KieRepositoryImpl" level="warn"/>
        <!-- Silence chatty natty -->
        <Logger name="com.joestelmach.natty.Parser" level="warn"/>
        <!-- Silence Kafka log chatter -->
        <Logger name="kafka.log.Log" level="warn"/>
        <Logger name="kafka.log.OffsetIndex" level="warn"/>
        <!-- Silence useless session validation messages -->
        <Logger name="org.apache.shiro.session.mgt.AbstractValidatingSessionManager" level="warn"/>
        <Root level="info">
            <AppenderRef ref="rolling-file"/>
            <AppenderRef ref="graylog-internal-logs"/>
        </Root>
    </Loggers>
</Configuration>

[screenshot]

Should I put every subsystem in log4j2 at "debug" level?

Graylog restarted; still nothing in server.log :exploding_head:

Check the order of message processors on the System/Configurations page.
The pipeline processor has to come last if you connected the pipelines to streams (because the stream matching has to happen before).
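Given the processor status list posted earlier, one order that satisfies this would be (the essential point being that the Message Filter Chain, which performs the stream matching, runs before the Pipeline Processor):

| # | Processor                 | Status |
|---|---------------------------|--------|
| 1 | GeoIP Resolver            | active |
| 2 | AWS Instance Name Lookup  | active |
| 3 | Message Filter Chain      | active |
| 4 | Pipeline Processor        | active |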


Pipeline debug resolved, my pipeline processor was in the second position! Thanks @jochen!!

Now I can debug the CSV function. I put these debug calls in my rule, and there is a strange "NULL" value for the message. Perhaps that's why I have no CSV processing.

My rule:

rule "Logtype_ACCT"
when
	true
then
  let csv_fields = "type,timeLogged,timeQueued,orig,rcpt,orcpt,dsnAction,dsnStatus,dsnDiag,dsnMta,bounceCat,srcType,srcMta,dlvType,dlvSourceIp,dlvDestinationIp,dlvEsmtpAvailable,dlvSize,vmta,jobId,envId,queue,vmtaPool,header_From";
  debug("*********csv_fields debug*********");
  debug(csv_fields);
  debug("************$message.full_message debug************");
  debug($message.full_message);
  
  let csv_parsed = csv(csv_fields: csv_fields, csv_text: to_string($message.full_message), csv_separator: ",", dummy_value:"not_used");

  set_fields(csv_parsed);
end 

In my server logs:

    2018-05-30T13:10:54.208+02:00 INFO  [Function] PIPELINE DEBUG: *********csv_fields debug*********
    2018-05-30T13:10:54.208+02:00 INFO  [Function] PIPELINE DEBUG: type,timeLogged,timeQueued,orig,rcpt,orcpt,dsnAction,dsnStatus,dsnDiag,dsnMta,bounceCat,srcType,srcMta,dlvType,dlvSourceIp,dlvDestinationIp,dlvEsmtpAvailable,dlvSize,vmta,jobId,envId,queue,vmtaPool,header_From
    2018-05-30T13:10:54.208+02:00 INFO  [Function] PIPELINE DEBUG: ************$message.full_message debug************
    2018-05-30T13:10:54.208+02:00 INFO  [Function] PIPELINE DEBUG: Passed value is NULL.
    2018-05-30T13:10:54.208+02:00 INFO  [Function] PIPELINE DEBUG: *********csv_fields debug*********
    2018-05-30T13:10:54.209+02:00 INFO  [Function] PIPELINE DEBUG: type,timeLogged,timeQueued,orig,rcpt,orcpt,dsnAction,dsnStatus,dsnDiag,dsnMta,bounceCat,srcType,srcMta,dlvType,dlvSourceIp,dlvDestinationIp,dlvEsmtpAvailable,dlvSize,vmta,jobId,envId,queue,vmtaPool,header_From
    2018-05-30T13:10:54.209+02:00 INFO  [Function] PIPELINE DEBUG: ************$message.full_message debug************
    2018-05-30T13:10:54.209+02:00 INFO  [Function] PIPELINE DEBUG: Passed value is NULL.

As far as I can see from your screenshots, there is no “full_message” field. Maybe you want to access the “message” field (via $message.message).
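With that change, the csv() call in the rule becomes (only the field reference is swapped):

    let csv_parsed = csv(csv_fields: csv_fields, csv_text: to_string($message.message), csv_separator: ",", dummy_value: "not_used");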


Correct… everything works correctly now!

"$message.full_message" came from the code example on GitHub.

I will create an issue there to spare others the headache.

Thanks a lot, guys!
