How to connect non-contiguous multiline messages

I have some Coldfusion servers that make HTTP requests. We need to log these events to track API performance of third party vendors, and help us predict problems with their service. Because of the way the logging works, each event ends requires two lines. (One for start, one for finish.) The two lines are often interspersed with lines from other events that are running simultaneously.

Is it possible to connect the “Starting” line with the “Completed” line in order to gather useful stats?

How Coldfusion logs HTTP calls

When Coldfusion makes an HTTP call, it logs it to http.log containing "Starting HTTP request and some data".

When the HTTP transaction completes, it sends another log message containing "HTTP request completed and some data"

These two operations can be correlated with one another by the process ID, which will be unique for the length of the operation, but will be reused throughout the day.

Ideal behavior

I would love to have this data logged from each correlated pair of event lines:

  • Date Start
  • Time Start
  • URL
  • Method
  • Status Code
  • Time Taken

Example

Start of query

  • One of our servers makes an API call to http://api.vendorproduct.com/json_gen/json_page/index
  • When the HTTP request starts, it logs this line in http.log:
    • "Information","ajp-bio-8014-exec-17270","03/30/21","12:34:22",,"Starting HTTP request {URL='http://api.vendorproduct.com/json_gen/json_page/index?', method='post'}"

End of query

  • The Operation completes
  • This line is logged to http.log:
    • "Information","ajp-bio-8014-exec-17270","03/30/21","12:34:22",,"HTTP request completed {Status Code=200 ,Time taken=415 ms}"

What connects them
The start and end of the operation are connected by the process ID ajp-bio-8014-exec-17270

The full snippet of log entries:
Including other events that were going on at the same time, making extraction difficult.

"Information","ajp-bio-8014-exec-17299","03/30/21","12:34:22",,"Starting HTTP request {URL='https://www.myproduct.com/pro/admin/events/vendorproduct/vendorproduct_import_events_daily_realtime30days_myserver1.exc', method='post'}"
"Information","ajp-bio-8014-exec-17270","03/30/21","12:34:22",,"Starting HTTP request {URL='http://api.vendorproduct.com/json_gen/json_page/index?', method='post'}"
"Information","ajp-bio-8014-exec-17287","03/30/21","12:34:22",,"Starting HTTP request {URL='https://www.myproduct.com/scheduled_tasks/operation.exc?idnum=846', method='get'}"
"Information","ajp-bio-8014-exec-17270","03/30/21","12:34:22",,"HTTP request completed  {Status Code=200 ,Time taken=415 ms}"
"Information","ajp-bio-8014-exec-17287","03/30/21","12:34:22",,"HTTP request completed  {Status Code=200 ,Time taken=294 ms}"
"Information","ajp-bio-8014-exec-17289","03/30/21","12:34:22",,"HTTP request completed  {Status Code=200 ,Time taken=29853 ms}"
"Information","ajp-bio-8014-exec-17289","03/30/21","12:34:22",,"Starting HTTP request {URL='https://www.myproduct.com/pro/admin/events/vendorproduct/vendorproduct_import_events_daily_realtime30days_myserver1.exc', method='post'}"
"Information","ajp-bio-8014-exec-17273","03/30/21","12:34:22",,"Starting HTTP request {URL='http://api.vendorproduct.com/json_gen/json_page/index?', method='post'}"
"Information","ajp-bio-8014-exec-17273","03/30/21","12:34:23",,"HTTP request completed  {Status Code=200 ,Time taken=601 ms}"
"Information","ajp-bio-8014-exec-17293","03/30/21","12:34:25",,"Starting HTTP request {URL='https://www.myproduct.com/scheduled_tasks/operation.exc?idnum=525', method='get'}"
"Information","ajp-bio-8014-exec-17293","03/30/21","12:34:25",,"HTTP request completed  {Status Code=200 ,Time taken=213 ms}"
"Information","ajp-bio-8014-exec-17279","03/30/21","12:34:25",,"HTTP request completed  {Status Code=200 ,Time taken=39436 ms}"
1 Like

Hi,

Connect those lines where exactly?

Does your log already been sent to graylog?

My hope is to connect the “Starting HTTP request” line with its associated “HTTP request completed” line. I would like to produce stats on each of the HTTP addresses our server reaches out to. This will help us identify vendors with slow APIs and correlate their API problems with issues that affect our users.

So, using the example above, the data I want to ask Graylog for a graph of the time required for all HTTP requests to http://api.vendorproduct.com/json_gen/json_page/index.

Hi @csean

Once you have those logs been indexed in graylog, take a look if Aggregation fits you needs

Aggregation doesn’t seem to be appropriate for this use case. Aggregations are meant to create a visualization of all the same data, not to connect two rows that are separated in time.

After a lot of research, I don’t think Graylog is suited for this task. In the end I wrote a python script that watches the log on disk, pairs up the mates, formats them as JSON, and writes out to a new log file for Filebeat to read and ship to Graylog. The new log lines look like this:

{
	"lineA": {
		"linenumber": 15451,
		"severity": "Information",
		"thread": "2563",
		"date": "04/16/21",
		"time": "11:33:35",
		"app": "",
		"cf_message": "Starting HTTP request {URL='https://api.vendor.com/api_endpoint', method='post'}"
	},
	"lineB": {
		"linenumber": 15452,
		"severity": "Information",
		"thread": "2563",
		"date": "04/16/21",
		"time": "11:33:36",
		"app": "",
		"cf_message": "HTTP request completed  {Status Code=200 ,Time taken=612 ms}"
	}
}
2 Likes

:clap: :clap: :clap:

Sorry for not be abble to help you with that,

keep going!

No worries! Thanks for the feedback. It’s much appreciated :smiley:

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.