CSV lookup tables cannot be refreshed

I’m using a number of simple CSV lookup tables; some have the in-memory cache enabled, some have no cache at all. So far the lookup tables work fine and I can use them in my pipeline rules as expected. However, I’ve noticed that after adding values to one of the CSV files, the new values are unavailable when querying the corresponding lookup table. Of course I made sure to wait until the refresh interval had expired. The server log file is crowded with Java exceptions indicating that refreshing the CSV files failed. I have no idea how to resolve this, since the files are read properly on service start; only refreshing them fails.

The CSV file content is as simple as it gets:

"key","value"
"TextA","TextB"

The CSV files are stored in /etc/graylog/tables/. I left 'plugin_dir' and 'integrations_scripts_dir' untouched in server.conf and just added 'allowed_auxiliary_paths = /etc/graylog/tables'.
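For reference, the relevant part of server.conf then contains just the one added line (a sketch of the excerpt; the other two options stay commented out at their defaults):

```
# plugin_dir and integrations_scripts_dir left at their defaults
allowed_auxiliary_paths = /etc/graylog/tables
```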

An example of the Java exceptions:

ERROR [LookupDataAdapter] Couldn't refresh data adapter <storcli_drive_state/61d5d338c04995689f509cd6/@5c3e97ee>
java.lang.NullPointerException: null
        at org.graylog2.lookup.adapters.CSVFileDataAdapter.doRefresh(CSVFileDataAdapter.java:125) ~[graylog.jar:?]
        at org.graylog2.plugin.lookup.LookupDataAdapter.refresh(LookupDataAdapter.java:109) ~[graylog.jar:?]
        at org.graylog2.lookup.LookupDataAdapterRefreshService.lambda$schedule$0(LookupDataAdapterRefreshService.java:142) ~[graylog.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_312]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_312]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_312]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_312]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_312]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_312]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_312]

Ubuntu 20.04LTS 5.11.0-46-generic
Graylog 4.2.5+59802bf
Java 1.8.0_312

Any hints on this?

Many thanks ahead,
Elix

Hello,

I need to ask you a couple of questions. Is /etc/graylog/tables owned by the Graylog user?
What have you done so far to try to resolve this issue?

Hi gsmith,

graylog:graylog 664 on /etc/graylog/tables/ and all files below.

I’ve totally run out of ideas, because the files can be read properly when the Graylog service starts. So my settings, like permissions, file content, file location and so on, can’t be completely wrong. The question is: what’s the difference between reading the CSV files upon service start and reading them in order to refresh? Please correct me if I’m wrong.

Many thanks,
Elix

Hi there,

I’m having a similar issue. Graylog version is 4.2.5, and this is happening on both the prod and test instances with exactly the same CSV file, which I re-created a few times, making sure it was UTF-8, etc.

I ran several searches and the lookup seems to work OK. You can see the content of the file here

I have removed port ranges and made them individual ports because I thought that could have caused some issues.

2022-01-26T09:42:41.528+13:00 ERROR [LookupDataAdapter] Couldn't refresh data adapter port-numbers-to-service-names-data/61ef6b1929ff6428d1cd74c3/@43673d6e
java.lang.NullPointerException: null
        at org.graylog2.lookup.adapters.CSVFileDataAdapter.doRefresh(CSVFileDataAdapter.java:125) ~[graylog.jar:?]
        at org.graylog2.plugin.lookup.LookupDataAdapter.refresh(LookupDataAdapter.java:109) ~[graylog.jar:?]
        at org.graylog2.lookup.LookupDataAdapterRefreshService.lambda$schedule$0(LookupDataAdapterRefreshService.java:142) ~[graylog.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_312]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_312]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_312]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_312]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_312]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_312]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_312]

I have 4 other lookup tables, and 2 of those 4 are CSV too; they work well, without any errors in the logs. These files have the same permissions, so I don’t think it’s a permissions or SELinux issue. As I mentioned, lookups run from the UI seem to work, and the dashboards are displaying the replaced values, so it’s almost as if something causes the exception when the adapter reloads, but I can’t spot what.

Any thoughts?

Check that you don’t have any blank lines or special characters at the end of the file. Java seems to be complaining about getting NULL when it expects something. Also worth double checking all the data to make sure formatting is as expected. All guessing, not from experience :smiley:


Hello @Elix

Good question :+1: . I wouldn’t think there was any difference. Not sure if there was an order.

By chance what does your data adaptor look like that has failed to refresh ?

<storcli_drive_state/61d5d338c04995689f509cd6/@5c3e97ee>

When you restart the Graylog service and tail -f /var/log/graylog-server/server.log, do you see anything that may pertain to this?

It’s really odd. I deleted the CSV file entirely and used vi to create a new one with just the following:

"port","service"
"22","ssh"

I suspect the error is printed when it looks up a “port” that is not in the CSV.
If I do a manual lookup from the UI for a port that is not in the CSV, I get null, which is OK, but maybe Java doesn’t handle the null value.

In my case it’s just noise in the log file; the table works OK and does the job.

You can set it to default to something…

lookup_value(lookup_table: string, key: any, [default: any])
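As a sketch, a pipeline rule using that default could look like this (the table name, field names, and default value are made up for illustration; the third argument is what lookup_value() returns when the key is missing):

```
rule "lookup service name with default"
when
  has_field("dst_port")
then
  // "unknown" is returned instead of null when the port is not in the table
  set_field("service_name",
            lookup_value("port-to-service", to_string($message.dst_port), "unknown"));
end
```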

Thanks @tmacgbay, that didn’t make the error go away, but I have added it to the rule anyway, as it seems a sensible thing to have :smiley:
I set a single default value under the table settings, and that stopped the errors from showing up in the logs. However, it only seems to accept string, number, boolean and null.
Is it possible to assign the lookup $key somehow? The error seems to be triggered before the pipeline processor runs.

Hi gsmith, hi tmacgbay,

thanks again for taking care of my issue! @tmacgbay, you did it once again :slight_smile: Indeed, refreshing fails when there’s a blank line at the end of the file. I removed it and there are no more error messages in the server log.
Nonetheless, I think this is an error that should be caught by Graylog. It’s common for such a file to end with a blank line, because other tools expect a line break at the end of each line and may otherwise drop the last one.

BR,
Elix
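The failure mode described above can be illustrated with a minimal Python analogue of a strict key/value loader (this is not Graylog’s actual code, which does its parsing in Java): a trailing blank line becomes a row with too few columns, and the reload blows up.

```python
def load_lookup(text):
    """Naive CSV-to-dict loader: assumes every data line has exactly
    two quoted columns, like the key/value layout in this thread."""
    table = {}
    for line in text.splitlines()[1:]:      # skip the header row
        key, value = line.split(",")        # a blank line raises ValueError
        table[key.strip('"')] = value.strip('"')
    return table

good = '"key","value"\n"TextA","TextB"\n'
bad = good + "\n"                           # same file plus a trailing blank line

print(load_lookup(good))                    # loads fine
try:
    load_lookup(bad)
except ValueError:
    print("refresh failed: blank line at end of file")
```

A more forgiving loader would simply skip empty lines before splitting, which is the kind of guard the thread suggests Graylog could add.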


@jpobeda - I’m not sure what you mean by assigning the lookup $key… What did you want lookup_value() to accept as the default?

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.