Hello,
I have an Input that collects nginx access logs sent in JSON format. I’ve been following the official Graylog guide on setting up the extractor: How to use a JSON Extractor | Graylog
As suggested in the guide, I have two extractors: one that parses the message into a json field and one that extracts the individual fields from it.
Here’s an example message before parsing into a proper json field (data changed for privacy):
MyHost nginx: { “timestamp”: “1658474614.043”, “remote_addr”: “x.x.x.x.x”, “body_bytes_sent”: 229221, “request_time”: 0.005, “response_status”: 200, “request”: “GET /foo/bar/1999/09/sth.jpeg HTTP/2.0”, “request_method”: “GET”, “host”: “www…somesite.com”,“upstream_cache_status”: “”,“upstream_addr”: “x.x.x.x.x:xxx”,“http_x_forwarded_for”: “”,“http_referrer”: “https:////www.somesite.com/foo/bar/woo/boo/moo”, “http_user_agent”: “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36”, “http_version”: “HTTP/2.0”, “nginx_access”: true }
And it is successfully extracted into a json field by a regex extractor: nginx:\s+(.*)
{ “timestamp”: “1658474614.043”, “remote_addr”: “x.x.x.x.x”, “body_bytes_sent”: 229221, “request_time”: 0.005, “response_status”: 200, “request”: “GET /foo/bar/1999/09/sth.jpeg HTTP/2.0”, “request_method”: “GET”, “host”: “www…somesite.com”,“upstream_cache_status”: “”,“upstream_addr”: “x.x.x.x.x:xxx”,“http_x_forwarded_for”: “”,“http_referrer”: “https://www.somesite.com/foo/bar/woo/boo/moo”, “http_user_agent”: “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36”, “http_version”: “HTTP/2.0”, “nginx_access”: true }
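To rule out the payload itself as the cause, the two steps can be reproduced outside Graylog. The following is a minimal Python sketch, not Graylog's implementation: it applies the same regex and then a strict JSON parse. The helper names `extract_fields` and `is_valid_json` are hypothetical, the sample is a shortened version of the message above, and it assumes the real log lines use plain ASCII double quotes (the typographic quotes shown in this post may only be forum rendering).

```python
import json
import re

# Pattern copied from the first (regex) extractor.
PAYLOAD_RE = re.compile(r"nginx:\s+(.*)")


def extract_fields(raw_message):
    """Hypothetical helper: regex step followed by a strict JSON parse."""
    match = PAYLOAD_RE.search(raw_message)
    if match is None:
        return None  # the regex step found no payload
    return json.loads(match.group(1))


def is_valid_json(text):
    """Return True if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Shortened sample, assuming ASCII double quotes in the real log line.
sample = (
    'MyHost nginx: { "timestamp": "1658474614.043", '
    '"response_status": 200, "upstream_cache_status": "", '
    '"nginx_access": true }'
)

fields = extract_fields(sample)
for key, value in fields.items():
    print(f"{key}: {value!r}")

# The same payload with typographic quotes is rejected by a strict parser.
curly_payload = '{ \u201ctimestamp\u201d: \u201c1658474614.043\u201d }'
print("curly quotes parse:", is_valid_json(curly_payload))
```

If the real payload parses cleanly here, including the empty strings and the boolean, the message content is probably fine and the problem is more likely in the extractor configuration.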
After that, the message goes to the second extractor, which fails completely. Not only is the preview incorrect (some fields are shown without a value):
remote_addr: x.x.x.x
request: GET /sth.dat HTTP/1.1
response_status: 301
upstream_addr: (no value shown)
body_bytes_sent: 162
http_version: HTTP/1.1
request_method: GET
nginx_access: (no value shown)
http_user_agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36
request_time: 0
upstream_cache_status: (no value shown)
host: sth
http_x_forwarded_for: (no value shown)
http_referrer: (no value shown)
timestamp: 1658475023.035
The extractor also keeps missing on actual messages; it doesn’t extract anything at all.
The second extractor’s configuration is left at its defaults, except that the “flatten structure” option is turned on.
I kindly request your help and wish you a good day!