Implementing GDPR Data Pseudonymization

alexhenman · May 30, 2018, 4:51pm

Hi there,

I wonder if anyone might be able to tell me whether it’s currently possible to implement data pseudonymization with Graylog. Specifically, what we’d like to do is retain log messages containing personally identifiable information for a set period, say 6 months, and then pseudonymize the data after that.

I saw this thread, where someone seems to have the same question, but I don’t think it answers whether it’s possible to hash the logging data after a set period (only that it is possible to hash the data as it comes in). As far as I can tell, this would not be possible with pipelines?

Would appreciate any advice from people dealing with the same issues at the moment.

eduardohki · May 30, 2018, 5:20pm

Hi,

Unfortunately, it is not possible to change data after it has been indexed in Elasticsearch.

However, you have two options: anonymize the logs at ingestion time (as stated in the thread you mentioned), or duplicate the data into two different indexes: one holding the raw data with a short retention time, and another with a long retention time where you store the anonymized logs.
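To illustrate the first option, here is a minimal sketch of hashing PII fields before a message is stored. It is plain Python rather than Graylog pipeline rule syntax, and the field names, the key, and the `pseudonymize` helper are all hypothetical. It assumes a keyed hash (HMAC-SHA256) is acceptable as the pseudonymization method, which is a choice made for this example and not something confirmed in this thread.

```python
import hashlib
import hmac

# Hypothetical field names; which fields count as PII depends on your logs.
PII_FIELDS = ("username", "source_ip")


def pseudonymize(message: dict, key: bytes, fields=PII_FIELDS) -> dict:
    """Return a copy of `message` with the listed fields replaced by keyed hashes."""
    result = dict(message)  # shallow copy, the raw message is left untouched
    for field in fields:
        if field in result:
            value = str(result[field]).encode("utf-8")
            # HMAC-SHA256: the same input and key always give the same token,
            # so events can still be correlated without exposing the raw value.
            result[field] = hmac.new(key, value, hashlib.sha256).hexdigest()
    return result


# Example message and key are made up for illustration.
raw = {"username": "alice", "source_ip": "10.0.0.5", "action": "login"}
masked = pseudonymize(raw, key=b"example-secret-key")
```

Using a keyed hash instead of a bare hash means that someone without the key cannot simply hash a list of known usernames or IP addresses and compare the results.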

What I’m not sure about is how to send both copies of the logs to separate streams inside the pipeline.

Regards,
Eduardo

alexhenman · May 31, 2018, 9:00am

Thanks for the reply, Eduardo. For anyone else interested, I think this is the page of the docs that covers time-based index retention.

It sounds like all the tools are there (multiple indexes with different retention strategies, pipelines and hash functions), but I’m not entirely sure yet how to piece them together in the right way. Let me know if you work it out, as it sounds like you’ve got pretty much the same problem to solve.
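As a rough model of how those pieces could fit together, the sketch below simulates the two-index idea in plain Python: every message is written twice, and each copy expires according to its own retention period. Nothing here is Graylog configuration or API. The index names, the 2-year period for the anonymized copy, and the `fan_out`, `surviving_copies`, and `drop_username` helpers are assumptions made up for illustration; only the 6-month raw retention comes from the original question.

```python
from datetime import datetime, timedelta, timezone

# Assumed retention periods: roughly 6 months for raw data (from the question),
# 2 years for the anonymized copy (an arbitrary illustrative value).
RETENTION = {
    "raw": timedelta(days=180),
    "anonymized": timedelta(days=730),
}


def fan_out(message: dict, anonymize) -> dict:
    """Produce one document per index: the raw message and an anonymized copy."""
    return {"raw": dict(message), "anonymized": anonymize(message)}


def surviving_copies(copies: dict, ingested_at: datetime, now: datetime) -> dict:
    """Keep only the copies whose index retention period has not elapsed."""
    age = now - ingested_at
    return {name: doc for name, doc in copies.items() if age <= RETENTION[name]}


def drop_username(message: dict) -> dict:
    """Placeholder anonymizer: masks one hypothetical PII field."""
    return {**message, "username": "<redacted>"}


ingested = datetime(2018, 5, 30, tzinfo=timezone.utc)
copies = fan_out({"username": "alice", "action": "login"}, drop_username)

after_one_month = surviving_copies(copies, ingested, ingested + timedelta(days=30))
after_one_year = surviving_copies(copies, ingested, ingested + timedelta(days=365))
```

In this model, both copies exist after one month, while after one year only the anonymized copy remains. The stored data is never modified in place; the raw copy simply ages out.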

system · June 14, 2018, 9:01am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
