AWS Kinesis/CloudWatch input throwing error in Graylog logs: Not in GZIP format

Hi All,

I have installed Graylog 3.3.0 with Elasticsearch 6.8.1 and MongoDB 3.6.17 in my environment. I have set up an AWS Kinesis/CloudWatch input, along with a Kinesis stream that receives CloudWatch events and passes them to Graylog.

I can see that events are generated in CloudWatch, and below is the message as it appears in Graylog:

aws_kinesis_stream
newstream

message
{"version":"0","id":"7683d207-b31c-41778f685865","detail-type":"CloudWatch Alarm State Change","source":"aws.cloudwatch","account":"0873467805","time":"2020-07-08T14:37:13Z","region":"us-east-1","resources":["arn:aws:cloudwatch:us-east-1:0873467805:alarm:test-demo-web cpu utilization > 1%"],"detail":{"alarmName":"test-demo-web cpu utilization > 1%","state":{"value":"OK","reason":"Threshold Crossed: 1 out of the last 1 datapoints [1.960655737704196 (08/07/20 14:32:00)] was not greater than the threshold (80.0) (minimum 1 datapoint for ALARM -> OK transition).","reasonData":"{\"version\":\"1.0\",\"queryDate\":\"2020-07-08T14:37:13.251+0000\",\"startDate\":\"2020-07-08T14:32:00.000+0000\",\"statistic\":\"Average\",\"period\":60,\"recentDatapoints\":[1.960655737704196],\"threshold\":80.0}","timestamp":"2020-07-08T14:37:13.254+0000"},"previousState":{"value":"INSUFFICIENT_DATA","reason":"Insufficient Data: 1 datapoint was unknown.","reasonData":"{\"version\":\"1.0\",\"queryDate\":\"2020-07-08T14:34:13.248+0000\",\"statistic\":\"Average\",\"period\":60,\"recentDatapoints\":,\"threshold\":80.0}","timestamp":"2020-07-08T14:34:13.252+0000"},"configuration":{"description":"CPU Utilization of test-demo-web ec2 instance > 2%","metrics":[{"id":"14b99746-cbb-67dfa817a1cc","metricStat":{"metric":{"namespace":"AWS/EC2","name":"CPUUtilization","dimensions":{"InstanceId":"i-06a9k903d6b826b6"}},"period":60,"stat":"Average"},"returnData":true}]}}}

source
aws-kinesis-raw-logs

timestamp
2020-07-08T14:37:13.649Z

I then manually changed aws_message_type from KINESIS_RAW to KINESIS_CLOUDWATCH_RAW in the input I had created and generated new events. The events show up in CloudWatch, but the Graylog log now contains the error below:

2020-07-08 13:02:44,355 INFO    [DiagnosticEventLogger] - Current thread pool executor state: ExecutorStateEvent(executorName=SchedulerThreadPoolExecutor, currentQueueSize=0, activeThreads=0, coreThreads=0, leasesOwned=1, largestPoolSize=2, maximumPoolSize=2147483647) - {}
2020-07-08 13:03:02,266 ERROR   [KinesisShardProcessorFactory$KinesisShardProcessor] - Could not read Kinesis record from stream [teststream] - {}
java.util.zip.ZipException: Not in GZIP format
        at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:165) ~[?:1.8.0_252]
        at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:79) ~[?:1.8.0_252]
        at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:91) ~[?:1.8.0_252]
        at org.graylog2.plugin.Tools.decompressGzip(Tools.java:239) ~[graylog.jar:?]
        at org.graylog2.plugin.Tools.decompressGzip(Tools.java:227) ~[graylog.jar:?]
        at org.graylog.integrations.aws.transports.KinesisPayloadDecoder.decompressCloudWatchMessages(KinesisPayloadDecoder.java:100) ~[graylog-plugin-integrations-3.3.0.jar:?]
        at org.graylog.integrations.aws.transports.KinesisPayloadDecoder.processMessages(KinesisPayloadDecoder.java:64) ~[graylog-plugin-integrations-3.3.0.jar:?]
        at org.graylog.integrations.aws.transports.KinesisShardProcessorFactory$KinesisShardProcessor.processRecords(KinesisShardProcessorFactory.java:98) [graylog-plugin-integrations-3.3.0.jar:?]
        at software.amazon.kinesis.lifecycle.ProcessTask.callProcessRecords(ProcessTask.java:200) [graylog-plugin-integrations-3.3.0.jar:?]
        at software.amazon.kinesis.lifecycle.ProcessTask.call(ProcessTask.java:141) [graylog-plugin-integrations-3.3.0.jar:?]
        at software.amazon.kinesis.lifecycle.ShardConsumer.executeTask(ShardConsumer.java:327) [graylog-plugin-integrations-3.3.0.jar:?]
        at software.amazon.kinesis.lifecycle.ShardConsumer.processData(ShardConsumer.java:313) [graylog-plugin-integrations-3.3.0.jar:?]
        at software.amazon.kinesis.lifecycle.ShardConsumer.handleInput(ShardConsumer.java:146) [graylog-plugin-integrations-3.3.0.jar:?]
        at software.amazon.kinesis.lifecycle.ShardConsumerSubscriber.onNext(ShardConsumerSubscriber.java:156) [graylog-plugin-integrations-3.3.0.jar:?]
        at software.amazon.kinesis.lifecycle.ShardConsumerSubscriber.onNext(ShardConsumerSubscriber.java:35) [graylog-plugin-integrations-3.3.0.jar:?]
        at software.amazon.kinesis.lifecycle.NotifyingSubscriber.onNext(NotifyingSubscriber.java:56) [graylog-plugin-integrations-3.3.0.jar:?]
        at software.amazon.kinesis.lifecycle.NotifyingSubscriber.onNext(NotifyingSubscriber.java:27) [graylog-plugin-integrations-3.3.0.jar:?]
        at io.reactivex.internal.util.HalfSerializer.onNext(HalfSerializer.java:45) [graylog-plugin-integrations-3.3.0.jar:?]
        at io.reactivex.internal.subscribers.StrictSubscriber.onNext(StrictSubscriber.java:97) [graylog-plugin-integrations-3.3.0.jar:?]
        at io.reactivex.internal.operators.flowable.FlowableObserveOn$ObserveOnSubscriber.runAsync(FlowableObserveOn.java:400) [graylog-plugin-integrations-3.3.0.jar:?]
        at io.reactivex.internal.operators.flowable.FlowableObserveOn$BaseObserveOnSubscriber.run(FlowableObserveOn.java:176) [graylog-plugin-integrations-3.3.0.jar:?]
        at io.reactivex.internal.schedulers.ExecutorScheduler$ExecutorWorker$BooleanRunnable.run(ExecutorScheduler.java:261) [graylog-plugin-integrations-3.3.0.jar:?]
        at io.reactivex.internal.schedulers.ExecutorScheduler$ExecutorWorker.run(ExecutorScheduler.java:226) [graylog-plugin-integrations-3.3.0.jar:?]

Please correct me if my understanding is wrong. I was expecting the message to be parsed automatically.

Let me know your thoughts.

Regards,
Ganeshbabu R

Taking a quick look into this: when the message type is KINESIS_CLOUDWATCH_RAW or KINESIS_CLOUDWATCH_FLOW_LOGS, we do attempt to decompress the payload (gzip) prior to processing, in the KinesisPayloadDecoder. We’re doing this because the AWS documentation for Using CloudWatch Logs Subscription Filters (Example 1, Step 8) says:

The Data attribute in a Kinesis record is Base64 encoded and compressed with the gzip format.

What I don’t see is us doing a Base64 decode prior to unzipping, which could be a problem. We will need to take a deeper look to see whether this lack of Base64 decoding really is the root cause of the issue.
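
One way to narrow this down is to look at the first bytes of a record's payload. The sketch below is written in Python rather than Graylog's Java, and `classify_payload` is a hypothetical helper, not part of Graylog or the AWS SDK. It assumes the payload bytes have already been read from the Kinesis record, and it relies only on the gzip magic number (0x1f 0x8b) defined in RFC 1952.

```python
import base64
import binascii
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def classify_payload(data: bytes) -> str:
    """Hypothetical helper: guess how a Kinesis record payload is encoded."""
    if data[:2] == GZIP_MAGIC:
        return "gzip"
    try:
        json.loads(data)  # plain JSON, e.g. a CloudWatch alarm event
        return "plain-json"
    except ValueError:
        pass
    try:
        # Strict decoding: reject anything that is not valid Base64.
        decoded = base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError):
        return "unknown"
    return "base64-gzip" if decoded[:2] == GZIP_MAGIC else "unknown"


if __name__ == "__main__":
    event = b'{"source":"aws.cloudwatch"}'
    for payload in (event, gzip.compress(event)):
        print(classify_payload(payload))
```

A payload that classifies as plain JSON was never compressed, so any attempt to gunzip it fails regardless of whether a Base64 step is present.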

It appears I was mistaken about the Base64 decode.

The difference between KINESIS_RAW and KINESIS_CLOUDWATCH_RAW is that when the message type is set to KINESIS_CLOUDWATCH_RAW, Graylog expects the data it pulls from Kinesis to be gzip-compressed. The CloudWatch event shown above arrives as plain JSON, so the decompression step fails with "Not in GZIP format". Graylog does not perform additional automatic parsing for the KINESIS_CLOUDWATCH_RAW message type beyond what is already done for the KINESIS_RAW message type.

If your input is producing data using the KINESIS_RAW message type, there is no reason to switch to the KINESIS_CLOUDWATCH_RAW message type.
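
To illustrate the behaviour described above, here is a minimal Python sketch. It is not Graylog's implementation: `decode_record` is a hypothetical function that only mimics the one difference discussed in this thread, namely that the CloudWatch message types gunzip the payload first while KINESIS_RAW passes it through unchanged.

```python
import gzip

# Message types that, per this thread, are decompressed before processing.
GZIPPED_TYPES = ("KINESIS_CLOUDWATCH_RAW", "KINESIS_CLOUDWATCH_FLOW_LOGS")


def decode_record(data: bytes, message_type: str) -> str:
    """Hypothetical stand-in for the decoding step, for illustration only."""
    if message_type in GZIPPED_TYPES:
        # Raises OSError (gzip.BadGzipFile on Python 3.8+) for plain JSON.
        data = gzip.decompress(data)
    return data.decode("utf-8")


if __name__ == "__main__":
    event = b'{"detail-type":"CloudWatch Alarm State Change"}'
    print(decode_record(event, "KINESIS_RAW"))
    try:
        decode_record(event, "KINESIS_CLOUDWATCH_RAW")
    except OSError as exc:
        print("decompression failed:", exc)
```

The failure in the second call is the Python counterpart of the `java.util.zip.ZipException` in the log above: the payload is uncompressed JSON, so the gzip header check fails.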