Export to CSV function in Graylog does not respect order

Background

I am logging messages in Graylog2 and I need to evaluate them in the order they arrive.
When I run a search, Graylog presents the results ordered by timestamp, which is very useful:

As you can see there are 2 types of timestamps marked:

  • The timestamps on the left, in descending order, assigned by Graylog2
  • A unique timestamp for each message (on the right; I only marked the first one, the others don’t matter here)

Problem

So, after visualizing this data, I decided I wanted to export it to CSV format, to run a few scripts on it. I did the following:

More actions -> Export as CSV

The problem here is that the CSV file I get is completely unordered. Remember the unique timestamp on the right that I marked in yellow?

It should be in the first row, but it’s not:

It is in row 188.

Questions

  • Is this normal behavior?
  • How can I export a search to CSV while keeping its filters and order intact?

For Graylog, that’s an opaque blob in the “message” field, not a unique timestamp. If you want to further process the JSON blob from the “message” field, you’ll have to extract it using a JSON extractor or a processing pipeline rule.

Oh, I am aware of that. I don’t care about that field, I am merely using it to identify where a message should be. If in the web interface the message with that specific ID is in row 1, then in the CSV I expected the same.

As for the issue you linked to, I take it Graylog2 is using the _doc flag. I have a few questions:

  • Does it mean it is incapable of giving me a CSV in the correct order because it is “working as intended”?
  • Do I need to make a REST GET request to ElasticSearch to get the CSV file I need in the order I expect?
  • Given that the issue was created in 2016, does the fact that it is still open mean that someone is working on a fix, or at least that it is acknowledged as a bug?

From a user standpoint, I must say this behavior is rather counter-intuitive.

Yes, correct.

Either you use the Elasticsearch HTTP API directly (although it doesn’t support CSV) or you post-process the CSV file if order is important to your use case.
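As a rough illustration of the post-processing option, here is a minimal Python sketch that re-sorts an exported CSV by its timestamp column. It assumes the export has a header row with a column named "timestamp" whose values are ISO 8601 strings in a uniform format, so that sorting them as plain text matches chronological order; the function name sort_csv_rows and the sample rows are made up for the example, so adjust the column name to whatever your export actually contains.

```python
import csv
import io


def sort_csv_rows(csv_text, column="timestamp", descending=True):
    """Return csv_text with its data rows sorted by the given column.

    Assumption: the values in `column` are uniformly formatted ISO 8601
    strings, so comparing them as text gives chronological order.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # sorted() is stable: rows with equal timestamps keep their export order.
    rows = sorted(reader, key=lambda row: row[column], reverse=descending)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical sample standing in for an unordered export.
sample = (
    "timestamp,message\n"
    "2018-01-01T10:00:00.000Z,b\n"
    "2018-01-01T10:00:02.000Z,a\n"
    "2018-01-01T10:00:01.000Z,c\n"
)
print(sort_csv_rows(sample))
```

For a real export you would read the file contents into a string (or pass the open file to csv.DictReader directly) and write the result back out. Descending order is the default here because that is how the web interface lists the messages.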

Thanks. Could you give me some input regarding my last question?

The behavior won’t be changed. The “documentation” tag on the issue indicates that the Graylog documentation will be improved on the subject.

Ahh, thank you for the feedback!

Well …

I could really use a hand here …
I did everything, but the CLAssistance is blocking my PR. Would you care to have a look?
