Rolling back from 3.2.2 to 3.1.4

I’m looking to roll back from 3.2.2 to 3.1.4 following multiple usability and functionality issues with the 3.2.x releases.

Creating a topic to see if there are any caveats or gotchas to this reversion as I’m unable to find much information elsewhere.


@rtanay, as @Ponet pointed to a fairly deep discussion about the usability of this release, it would be nice if you could share the experience that led you to roll back.

Thanks in advance.

- Konrad

Absolutely, I’ll break it down into two categories:

Functionality

  • System load during search - This is possibly related to the new behavior where editing the search query triggers a new search, or to a lack of sufficient debounce on that behavior. I’ve had a single user running a search for a single string across a single day’s worth of logs in a single stream pin the CPUs on all 6 of my nodes. I cannot think of a time prior to 3.2 when simply searching, across any range, had any noticeable impact on load. A system setting to disable auto-searching on query changes would be a band-aid fix.
  • Processing lag - This is what prompted this topic about reversion. Yesterday around 3pm all 6 nodes simultaneously started processing incoming logs extremely slowly. The backlog per node maxed out around 3,500,000 unprocessed messages, resulting in a log lag of over 2 hours at its worst. While we’ve experienced log lag before under 3.1, that was only during massive log spikes. During this lag period, message volume was normal. Additionally, during previous lag/log spike incidents, all nodes’ CPUs were pinned as they worked overtime to process the backlog. During this event, while processed messages fell further and further behind, load on all 6 nodes remained minimal. Load did not appreciably increase, nor did the backlog begin to see serious processing, until 6:40pm.
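
To illustrate what I mean by "sufficient debounce" in the first point: below is a minimal sketch of a trailing-edge debounce, modeled as a pure function over a keystroke timeline. This is not Graylog's actual code; the function name, the 500 ms quiet period, and the sample timeline are all assumptions made up for illustration.

```python
from typing import List, Tuple

def debounced_searches(edits: List[Tuple[int, str]], quiet_ms: int) -> List[Tuple[int, str]]:
    """Return the (fire_time_ms, query) pairs a trailing-edge debounce would run.

    edits: (timestamp_ms, query_text) pairs, sorted by timestamp.
    A search fires only once quiet_ms pass with no further edit.
    """
    fired = []
    for i, (ts, query) in enumerate(edits):
        is_last = i == len(edits) - 1
        # A pending search survives only if the next edit arrives
        # after the quiet period has fully elapsed.
        if is_last or edits[i + 1][0] - ts >= quiet_ms:
            fired.append((ts + quiet_ms, query))
    return fired

# Hypothetical timeline: typing "error", pausing, then refining the query.
edits = [(0, "e"), (80, "er"), (160, "err"), (240, "erro"), (320, "error"),
         (2000, "error AND"), (2100, "error AND host:web1")]
searches = debounced_searches(edits, quiet_ms=500)
print(searches)  # [(820, 'error'), (2600, 'error AND host:web1')]
```

With these made-up numbers, seven query edits result in two searches instead of seven. Whether 3.2 debounces at all, and with what interval, is something I have not verified.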

Usability

  • Missing ‘show surrounding messages’ option - No fewer than 5 of my teammates have asked me what happened to this feature post upgrade. It was quite useful and is sorely missed.
  • Auto-searching on absolute date change - Trying to search for a specific period in the past brings the system to a crawl as selecting the start date immediately starts a search, while the end date is still the current date. The previous functionality where you selected both start and end periods and had to manually press enter in the search bar to run a query again eliminated unnecessarily running the query against a huge chunk of logs. Again, being able to disable this in settings would be good.

Of these issues the inexplicable log lag is the most troubling. Graylog has been an excellent tool up until this point, but several hours of “flying blind” without current logs for no apparent reason is unacceptable.


@rtanay Thanks a lot for your detailed feedback.

I can say for certain that the Usability points will be addressed, either in the next patch release or in the next major version.

About functionality I can make no promises, but we plan to make some performance optimizations.

I hope you will be able to upgrade to the next major version then.

Best regards,
Konrad
