A few observations on upgrading from 2.5 to 3.0

So here are a few fun observations that I ran into when upgrading:

1: If you’re using the Beats input, you can no longer use PKCS8 keys. The “deprecated” older Beats input won’t recognise them and will merrily start throwing errors all over the place when something connects.
2: Settings we use for ES (768 max conns, 128 per route, output batch of 4096) that worked absolutely fine in 2.5 are now causing bulk indexing timeout exceptions; the settings had to be tweaked down to achieve similar performance.
3: Graylog now runs as an unprivileged user (graylog) which meant it couldn’t read certificates from our certificate store since that user isn’t in the group allowed read access, so you’ll have to pay attention to that.

Other than that it was relatively smooth sailing :slight_smile: Nice work!

Wait what?

Aww, heck no. Dang. That’s gonna hurt. A lot. For people with clusters, let alone large clusters.

But what you’re saying is that it does not apply to the Graylog server itself? So now the two follow separate paths for keys and certs? Time to dig into the manuals.

2: Settings we use for ES (768 max conns, 128 per route, output batch of 4096) that worked absolutely fine in 2.5 are now causing bulk indexing timeout exceptions; the settings had to be tweaked down to achieve similar performance.

Which direction did you tweak them into? :slight_smile:

3: Graylog now runs as an unprivileged user (graylog) which meant it couldn’t read certificates from our certificate store since that user isn’t in the group allowed read access, so you’ll have to pay attention to that.

You weren’t running it unprivileged to begin with?

:heart:

Graylog now runs as an unprivileged user (graylog) which meant it couldn’t read certificates from our certificate store since that user isn’t in the group allowed read access, so you’ll have to pay attention to that.

True for the Docker Release - we will include that in the 2.5 images soon too.

This is actually the deb package (I should add :D). On my servers we have a specific group that can read private keys from /etc/ssl/private - root obviously can do whatever, but for unprivileged users we have to add the user to that group. Since Graylog ran fine before, I didn’t see it until I noticed the input had been at 0 msg/sec for about 10 minutes. A bit of log spelunking showed that Beats insisted the cert didn’t exist, so it generated a self-signed one.
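
For anyone hitting the same thing, here is a rough Python sketch of the classic Unix permission check that decides whether an unprivileged user can read a key under /etc/ssl/private. It is only an approximation: it ignores ACLs, SELinux and capabilities, and the uid, gid and mode values are made up for illustration (check yours with `ls -l` and `id graylog`).

```python
def class_bits(mode, owner_uid, owner_gid, uid, gids):
    """Return the (read, write, execute) bits that apply to this user.

    Classic Unix rule: owner bits if the user owns the file, else group
    bits if the user is in the file's group, else the "other" bits.
    """
    if uid == owner_uid:
        shift = 6
    elif owner_gid in gids:
        shift = 3
    else:
        shift = 0
    bits = (mode >> shift) & 0o7
    return bool(bits & 4), bool(bits & 2), bool(bits & 1)


def can_read_key(dir_mode, key_mode, owner_uid, owner_gid, uid, gids):
    """A key is readable only if its directory is searchable (x) and the file is readable (r)."""
    if uid == 0:
        return True  # root bypasses the mode bits
    _, _, dir_x = class_bits(dir_mode, owner_uid, owner_gid, uid, gids)
    key_r, _, _ = class_bits(key_mode, owner_uid, owner_gid, uid, gids)
    return dir_x and key_r


# Hypothetical IDs and modes, for illustration only.
KEY_GROUP_GID = 110   # the group allowed to read private keys
GRAYLOG_UID = 999     # the unprivileged graylog user
GRAYLOG_GID = 999     # its primary group

# Directory 0710 and key 0640, both owned by root:<key group>.
before = can_read_key(0o710, 0o640, 0, KEY_GROUP_GID, GRAYLOG_UID, {GRAYLOG_GID})
after = can_read_key(0o710, 0o640, 0, KEY_GROUP_GID, GRAYLOG_UID, {GRAYLOG_GID, KEY_GROUP_GID})
print("readable before adding the group:", before)
print("readable after adding the group:", after)
```

Note that the directory needs the execute bit for the group as well; a readable key inside a directory the user cannot enter is still unreachable.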

Correct, the beats input can be assigned its own set of certs - I’ve had this issue before (around filebeat 5.x) where it didn’t want a PEM key, but insisted on a PKCS8 key - some time after, it stopped giving a shit and could deal with both.

Graylog beats input initially didn’t want to eat a PEM key, but did fine on a PKCS8 one, and that now seems to have reversed itself. Only tested this on the deprecated Beats input (the one that came with 2.5) because I’ve got too many systems still assuming fields_under_root is false (for the new Beats input it needs to be true).

Down. 64 max conns, 8 per route, batch size 1024 - seems to work fine so far :slight_smile:

Nah, didn’t see the need for it - and I was lazy :slight_smile: And usually if something doesn’t run unprivileged by default, there’s a reason for it that at the time I didn’t want to try to get into :smiley:

Laziness by the developer? :stuck_out_tongue: Not saying that’s the case here by the way…

Same reason people demand that you disable SELinux to run their software. *shudder*

Here’s one more, probably subjective: it feels like the web UI has gotten sluggish. It will often take a few seconds to load, for example, the search page (even though the search itself completes in a few ms), and the same often happens on configuration pages (that pull data out of Mongo). There are no errors in any logs, and MongoDB is working fine.

This is actually the deb package (I should add :D). On my servers we have a specific group that can read private keys from /etc/ssl/private - root obviously can do whatever, but for unprivileged users we have to add the user to that group.

Actually, we did not change anything on that front - but it’s good to keep that in mind if a customer comes around with the issue.

We already fixed that - upcoming bug fix releases will include a faster UI.

I have to admit that setting a larger heap size actually made this work much better - I forgot to alter the defaults, so I was running on a 1 GB heap, which seems to have slowed things down some.

Could you show me the error you are seeing?
Does this only appear on the old Beats input?

I would but my Graylog server logs have rotated away so I can’t find the exact error message. I only tested this on the “deprecated” Beats input; I’ve used the PKCS8 key on it with Graylog 2.5 (generated by taking a PEM key and doing a PKCS8 conversion with openssl). When I started this same setup on 3.0 it insisted there was no valid key in the PKCS8 file (not sure about the exact error message).

When I replaced that key with the normal PEM version, everything started working.
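
In case it helps someone compare key files: both the “normal” and the PKCS8 variants are usually PEM-encoded text, and the BEGIN line tells you which structure is inside. Below is a minimal Python sketch that classifies a key by that label. It only looks at the first PEM block and does not parse the key itself; the function name and the description strings are my own, and it says nothing about how Graylog reads the file.

```python
import re

# Standard PEM labels for private keys and the structure each one denotes.
PEM_LABELS = {
    "RSA PRIVATE KEY": "PKCS#1 (traditional RSA)",
    "EC PRIVATE KEY": "SEC1 (traditional EC)",
    "PRIVATE KEY": "PKCS#8 (unencrypted)",
    "ENCRYPTED PRIVATE KEY": "PKCS#8 (encrypted)",
}


def key_format(pem_text):
    """Classify a key by the label of the first PEM block in the text."""
    match = re.search(r"-----BEGIN ([A-Z0-9 ]+)-----", pem_text)
    if match is None:
        return "not PEM (possibly DER, or an empty file)"
    label = match.group(1)
    return PEM_LABELS.get(label, "unknown PEM label: " + label)


# The base64 body is elided here; only the header line matters for this check.
sample = "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
print(key_format(sample))
```

To check a real file, pass in its contents, for example `key_format(open(path).read())` with `path` pointing at your key.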

Oh hang on! I think I misunderstood you… Do you mean using the old BEATS input on an upgraded-to-3.0 Graylog? So apparently the official 3.0 BEATS input has no issues with the PKCS#8 keys, right?

Yes. And don’t know :smiley: Didn’t try the “new” beats input yet :slight_smile:

ACK! That makes life a little easier then, because it means my work instructions for setting up certificates and keys are still fully valid for 3.0. You just need to use the new BEATS input.

This happens the first time you visit a certain page after a restart.
I’ve fixed this in master and we’re likely gonna backport the fix to 3.0.1.

FWIW, I just tested the deprecated Beats input with both a PKCS#1 and a PKCS#8 key, and it worked fine.
It would’ve surprised me to see only one of the inputs fail, because they both share the same transport code that is responsible for handling TLS.

Huh… that is ridiculously weird then. Either it was due to something else I messed up during the upgrade, or I’m a super special snowflake with super special snowflake issues :smiley: Anyway, thank you for looking, I guess I was wrong about PKCS8 no longer working.

Since you were wondering about the back and forth about supporting PKCS#1 keys…
JCE does not support PKCS#1 by itself, but having Bouncy Castle as an additional crypto provider makes it work.

In the past there were some releases that contained BC already, but it got reverted at some point.
I’ve reintroduced it in 3.0 and it’s meant to stay :slight_smile:
