Export to CSV function in Graylog does not respect order


(Pedro Miguel Pereira Serrano Martins) #1

Background

I am logging messages in Graylog2 and I need to evaluate them in the order in which they arrive.
When I run a search, Graylog presents the results ordered by timestamp, which is very useful:

As you can see there are 2 types of timestamps marked:

  • The Timestamps on the left, in descending order, given by Graylog2
  • A unique timestamp for each message (on the right; I only marked the first one, the others don’t matter here)

Problem

So, after visualizing this data, I decided I wanted to export it to CSV format, to run a few scripts on it. I did the following:

More actions -> Export as CSV

The problem is that the resulting CSV file is completely unordered. Remember the unique timestamp on the right that I marked in yellow?

It should be in the first row, but it’s not:

It is in row 188.

Questions

  • Is this normal behavior?
  • How to export a search to a CSV keeping its filters and order intact?

(Jochen) #2

For Graylog, that’s an opaque blob in the “message” field, not a unique timestamp. If you want to further process the JSON blob from the “message” field, you’ll have to extract it using a JSON extractor or a processing pipeline rule.


(Pedro Miguel Pereira Serrano Martins) #3

Oh, I am aware of that. I don’t care about that field, I am merely using it to identify where a message should be. If in the web interface the message with that specific ID is in row 1, then in the CSV I expected the same.

As for the issue you linked to, I take it Graylog2 is using the _doc flag. I have a few questions:

  • Does it mean it is incapable of giving me a CSV in the correct order because it is “working as intended”?
  • Do I need to make a REST GET request to ElasticSearch to get the CSV file I need in the order I expect?
  • Given that the issue was created in 2016, does the fact that it is still open mean that someone is working on a fix, or at least that it is acknowledged as a bug?

From a user standpoint, I must say this behavior is rather counter-intuitive.


(Jochen) #4

Yes, correct.

Either you use the Elasticsearch HTTP API directly (although it doesn’t support CSV) or you post-process the CSV file if order is important to your use case.
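For the post-processing route, a minimal Python sketch could look like the following. It assumes the exported CSV has a header row with a column named `timestamp` holding uniformly formatted ISO 8601 UTC values (so that string order matches chronological order); both the column name and the helper `sort_csv_by_timestamp` are assumptions for illustration, so adjust them to match your actual export.

```python
import csv
import io


def sort_csv_by_timestamp(csv_text, column="timestamp", descending=True):
    """Return csv_text with its data rows sorted by the given column.

    Assumes the column holds uniformly formatted ISO 8601 UTC timestamps,
    which sort chronologically when compared as plain strings.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # sorted() is stable, so rows with equal timestamps keep their export order.
    rows = sorted(reader, key=lambda row: row[column], reverse=descending)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical export contents, newest message deliberately not first.
    sample = (
        "timestamp,message\n"
        "2017-01-01T00:00:01.000Z,b\n"
        "2017-01-01T00:00:03.000Z,a\n"
        "2017-01-01T00:00:02.000Z,c\n"
    )
    print(sort_csv_by_timestamp(sample))
```

With `descending=True` the newest message ends up in the first data row, which mirrors the order shown in the web interface. Messages that share the same timestamp cannot be disambiguated this way and simply keep their relative order from the export.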


(Pedro Miguel Pereira Serrano Martins) #5

Thanks. Could you give me some input regarding my last question?


(Jochen) #6

The behavior won’t be changed. The “documentation” tag on the issue indicates that the Graylog documentation will be improved on the subject.


(Pedro Miguel Pereira Serrano Martins) #7

Ahh, thank you for the feedback!


(Pedro Miguel Pereira Serrano Martins) #8

Well …

I could really use a hand here …
I did everything, but the CLAssistance is blocking my PR. Would you care to have a look?

