Count unique field values using API (4.0.1)

Hi, we have been using Graylog and it’s been working really well for us.

We are in the process of upgrading from 3.1 to 4.0.1 and have found the API for searching has changed significantly. We have resolved everything we are aware of except for getting a count of unique values of a field via the API.

We currently use something along the lines of:

curl 'http://hptb032.dev.hou.compute.pgs.com:9000/api/search/universal/relative/terms?field=container_id&query=*&range=30'

The part we care about in the result is:

  "terms": {
    "a7e59aed4309e57823b4cba53409ed043e6b9150f3d3f4b045eef60ad7c2626f": 3,
    "5530c8589ec51dde61ed7ca5c1d51003880c14336a34604d7d0c624cc19d23bc": 1,
    "d34895780edc14af512d005d3e05d42c36b91c4feb01bf37520db8512dbf3376": 6,
    "8b6a1b6b68a728c10309a26c5e5ef246e663cef5371e795e05d4677042ec5680": 1,
    "0cd9553dd183b6ead34cdc96142f47c7fd4cffc9a74c852bc023f448739d0952": 30,
    "764264d1b906ee40cf8201a7280c0b320db32f7b987214db80f05b141088f51e": 1,
    "278a14dcf709ebd88c4dd6abd747db524d2314dd49b0d5124aaba90d36dcb4cc": 1,
    "4111ec2210bf9cec0cd7657975e5b48594ec47643e5f79b0c0513986bc79f2f2": 1
  },
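
For context, here is a minimal sketch of how a "terms" mapping like the one above can be consumed once it has been parsed from JSON. The helper name summarize_terms is made up for illustration, and the sample holds only the first three entries from the response above.

```python
# First three entries of the "terms" mapping shown above.
terms = {
    "a7e59aed4309e57823b4cba53409ed043e6b9150f3d3f4b045eef60ad7c2626f": 3,
    "5530c8589ec51dde61ed7ca5c1d51003880c14336a34604d7d0c624cc19d23bc": 1,
    "d34895780edc14af512d005d3e05d42c36b91c4feb01bf37520db8512dbf3376": 6,
}


def summarize_terms(terms):
    """Return the number of unique values and the (value, count) pairs, most frequent first."""
    ranked = sorted(terms.items(), key=lambda item: item[1], reverse=True)
    return len(terms), ranked


unique_count, ranked = summarize_terms(terms)
print(f"{unique_count} unique values")
for value, count in ranked:
    # Print a shortened container id next to its message count.
    print(count, value[:12])
```

The number of unique values is simply the number of keys, so anything that returns the same value-to-count mapping would be a drop-in replacement for the old endpoint.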

What I am trying to figure out is how to construct a similar query in Graylog 4 that lists the unique values of a field along with their counts.

I have found plenty of responses using the terms endpoint, but so far nothing that helps for newer versions. Any pointers would be much appreciated.

@chriss4242
Hello,
If I understand you right you have fields with “unique values” and would like the “count” of each?

My message fields.

Let’s say I want the “unique values” of the field called “Protocol” and their count. I would do this…

image

Results would be…

image

From there you can save your “Search” or add it to a “Dashboard”.
Hope that helps

@gsmith, thanks for the response. That is the information I am after. However, I need to get it via the REST API so it can be used programmatically by an application.

So what I am trying to figure out is how to get it via a query to one of the REST API endpoints.

@chriss4242
Before version 4 we did the same thing, but it no longer works for us. We have been using some metrics with a bash script to get our information sent to another application.
Maybe someone else will reply later. Sorry I could not be more help.

Ah ok, that’s fine, thanks for looking into it. I am pretty sure it is possible, because the GUI is able to display that data and I presume the GUI talks directly to the API to get the data it displays.

Ok I have worked out a solution if it helps anyone down the track.

I traced the queries being made by the GUI (mitmproxy) and used some help from this link to figure out how to deal with the id field (listed as optional, but required as it turns out): Graylog 3.3.9 Search API

The code we were using for Graylog 3.1 is as follows. The endpoint used here has been dropped:

def get_unique_logs(field, query):
    """Returns unique values of field for the query.  Also returns number of times each value is logged."""
    import requests
    
    params = {
        "query": query,
        "keyword": "last 30 days",
        "filter": f"streams:{spark_streamid}",
        "field": field,
        "batch_size": 10000
    }
    http_request = f"{graylog_url}/api/search/universal/keyword/terms"
    headers = {"Accept": "application/json"}
    # Access tokens are sent as the username, with the literal password "token".
    auth = (graylog_token, "token")

    r = requests.get(http_request, params=params, auth=auth, headers=headers)
    if r.status_code == 200:
        return r.json()["terms"]
    return None

To work with Graylog 4 and the views/* endpoints, it changed to:

def get_unique_logs_401(field, query):
    """Returns unique values of field for the query.  Also returns number of times each value is logged."""
    import requests
    import json
    
    # Graylog rejects POST requests that lack an X-Requested-By header; its value is arbitrary.
    headers = {"Content-Type": "application/json", "X-Requested-By": "jupyter"}
    auth = (graylog_token, "token")
    payload = {
      "queries": [
        {
          "id": "?",
          "timerange": {
              "type": "keyword",
              "keyword": "last 30 days"
           },
          "query": {
            "type": "elasticsearch",
            "query_string": query
          },
          "search_types": [
              {
                  "id": "?",
                  "column_groups": [],
                  "filter": None,
                  "name": "chart",
                  "query": None,
                  "rollup": True,
                  "row_groups": [
                      {
                          "field": field,
                          "limit": 15,
                          "type": "values"
                      }
                  ],
                  "series": [
                      {
                          "field": None,
                          "id": "count()",
                          "type": "count"
                      }
                  ],
                  "sort": [],
                  "streams": [],
                  "timerange": None,
                  "type": "pivot"
              }
          ]
        }
      ]
    }
    http_request = f"{graylog_url}/api/views/search/sync"
    r = requests.post(http_request, data=json.dumps(payload), auth=auth, headers=headers)
    if r.status_code == 200:
        return r.json()
    return None
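
Unlike the old terms endpoint, the function above returns the whole response rather than a value-to-count mapping. Below is a rough sketch of flattening it back into the old "terms" shape. The response layout it walks (results, then search_types, then rows with "key", "values" and "source") is an assumption that this thread does not confirm, the helper name pivot_to_terms is made up, and the sample is hand-written rather than real server output.

```python
def pivot_to_terms(response):
    """Flatten a pivot response into {field value: count}.

    Assumed layout: results -> <query id> -> search_types -> <search type id>
    -> rows, where each row has "key", "values" and "source".
    """
    terms = {}
    for query_result in response.get("results", {}).values():
        for search_type in query_result.get("search_types", {}).values():
            for row in search_type.get("rows", []):
                # Keep only per-value rows; the rollup total has an empty key.
                if row.get("source") != "leaf" or not row.get("key"):
                    continue
                counts = row.get("values", [])
                if counts:
                    terms[row["key"][0]] = counts[0]["value"]
    return terms


# Hand-written sample in the assumed layout, not real Graylog output.
sample = {
    "results": {
        "?": {
            "search_types": {
                "?": {
                    "type": "pivot",
                    "rows": [
                        {"key": ["tcp"], "source": "leaf",
                         "values": [{"key": ["count()"], "value": 30}]},
                        {"key": ["udp"], "source": "leaf",
                         "values": [{"key": ["count()"], "value": 6}]},
                        {"key": [], "source": "non-leaf",
                         "values": [{"key": ["count()"], "value": 36}]},
                    ],
                }
            }
        }
    }
}

print(pivot_to_terms(sample))
```

If the real response differs, printing the JSON returned by get_unique_logs_401 once should show which keys to adjust. Note also that the payload above caps the row group at 15 values, so the mapping will hold at most that many entries unless "limit" is raised.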