Greetings. Forgive me if this has been covered before, but I did my best with search terms and didn’t find a relevant topic.
I am using Graylog 2.4.6+ceaa7e4 on RHEL7 with Java 1.8.0_171 and Elasticsearch 5.6.9. If I search within a particular stream, the results include matches from every stream that contains the matching record. For example, if I search for “humongous,” I get 3 identical entries: one for the stream I am searching in, and the other 2 for other streams where that same message resides. The “Stored in index” value for each of the 3 messages confirms this.
If I then go to More actions -> Show query, I do indeed see this:
"query": "streams:5b855004c883c0c29c90ca9a"
I can add that criterion to my search, and it still shows me duplicates when the same message exists in different streams.
Is this expected behavior? I think I am seeing this in all cases where one message hits multiple streams, so it’s consistent. If it’s expected, is there a way to turn that off? Or is there a search string that can deduplicate the output? I’d really like a search within a stream to show only messages from the indices that stream is tied to.
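As a stopgap on the client side, duplicates in exported results could be collapsed by message ID. This is only a minimal sketch, not a Graylog feature: it assumes each exported message is a dict, that copies of the same message share an identical ID, and that the ID lives in a field named `_id`. The field names and sample values below are hypothetical.

```python
def dedup_messages(messages, id_field="_id"):
    """Keep the first copy of each message, keyed on its ID field.

    Assumption: copies of one message stored in several indices
    carry the same ID. Messages without an ID are kept as-is.
    """
    seen = set()
    unique = []
    for msg in messages:
        msg_id = msg.get(id_field)
        if msg_id is None:
            unique.append(msg)
        elif msg_id not in seen:
            seen.add(msg_id)
            unique.append(msg)
    return unique


# Hypothetical sample: one message stored in three indices, plus another.
sample = [
    {"_id": "a1", "index": "cxlprodwl-_32", "message": "humongous"},
    {"_id": "a1", "index": "graylog_110", "message": "humongous"},
    {"_id": "a1", "index": "winlogon-_184", "message": "humongous"},
    {"_id": "b2", "index": "cxlprodwl-_32", "message": "something else"},
]
unique = dedup_messages(sample)
print(len(unique))
```

This only hides the duplicates after the fact; it does not change which indices the search reads.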
Not sure what the admin search is, but I am navigating from the top Streams menu, then selecting the stream I want to see. When the new page loads, I type my search string into “Type your search query here and press enter.” Right under the box where I type my query, it shows the name of the stream I selected.
I forgot a potentially crucial piece of information! This only happens for events from yesterday and earlier. Anything from today works as I’d expect. “Today” here refers to the moving time window. So, “Search in the last 1 day” appears to show proper results back to around midnight, then duplicates before midnight.
Ok… The point at which it switches from showing just that stream’s index to looking in all indices is when the time range extends past the current index. So right now, for example, if I search in the last 8 hours, no duplicates. I see this in the stream page:
CXL-Prod-Weblogic
Found 425,612 messages in 126 ms, searched in 1 index.
If I change the search to the past day, I see this:
CXL-Prod-Weblogic
Found 899,915 messages in 81 ms, searched in 5 indices.
As you can see, it flat out claims it looked in 5 indices. Here’s the list it claimed to include:
Used indices
Graylog is intelligently selecting the indices it needs to search upon based on the time frame you selected. This list of indices is mainly useful for debugging purposes.
Indices used for this search:
cxl-_117
cxlprodwl-_32
cxlprodwl-_33
graylog_110
winlogon-_184
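To make it easier to see how many distinct index sets that list spans, the names can be grouped by prefix. A minimal sketch, assuming the names follow a `<prefix>_<number>` pattern; the function name `group_by_prefix` is mine, and which prefix actually belongs to this stream is not something the list itself tells you.

```python
from collections import defaultdict


def group_by_prefix(index_names):
    """Group index names by the part before the trailing _<number>."""
    groups = defaultdict(list)
    for name in index_names:
        prefix, sep, suffix = name.rpartition("_")
        if sep and suffix.isdigit():
            groups[prefix].append(name)
        else:
            # Name does not match the assumed pattern; keep it on its own.
            groups[name].append(name)
    return dict(groups)


# Index names copied from the "Used indices" list above.
used = ["cxl-_117", "cxlprodwl-_32", "cxlprodwl-_33", "graylog_110", "winlogon-_184"]
grouped = group_by_prefix(used)
for prefix, names in grouped.items():
    print(prefix, names)
```

Any prefix in the output that is not tied to the stream being searched would be a place where the duplicates could be coming from.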
Again, this is all from a stream page. Am I doing something wrong? Is there a configuration setting I am missing?