Can not rotate active write index after upgrade to graylog 2.3/elasticsearch 5.5


(Li Li) #1

Hello community:

I was running on graylog 2.2.3/elasticsearch 2.4.6, and wanted to upgrade to graylog 2.3/elasticsearch 5.5. Here are the steps I did:

  1. upgraded to graylog 2.3, everything looked fine. I was able to “see” my old indices, and was able to rotate active write index.

  2. then I upgraded elasticsearch to 5.5, I was still able to “see” my old indices. But when I tried to rotate active write index, it immediately changed my elasticsearch status from “green” to “red”, making the newly created shards “unassigned”. In graylog server’s log, it shows:

2017-08-16T09:23:51.505-05:00 ERROR [IndexRotationThread] Couldn’t point deflector to a new index
org.graylog2.indexer.ElasticsearchException: Couldn’t read health status for index graylog_5

{“cluster_name”:“es-cluster”,“status”:“red”,“timed_out”:true,“number_of_nodes”:3,“number_of_data_nodes”:3,“active_primary_shards”:0,“active_shards”:0,“relocating_shards”:0,“initializing_shards”:0,“unassigned_shards”:3,“delayed_unassigned_shards”:0,“number_of_pending_tasks”:0,“number_of_in_flight_fetch”:0,“task_max_waiting_in_queue_millis”:0,“active_shards_percent_as_number”:83.33333333333334}
at org.graylog2.indexer.cluster.jest.JestUtils.specificException(JestUtils.java:91) ~[graylog.jar:?]
at org.graylog2.indexer.cluster.jest.JestUtils.execute(JestUtils.java:58) ~[graylog.jar:?]
at org.graylog2.indexer.cluster.jest.JestUtils.execute(JestUtils.java:63) ~[graylog.jar:?]
at org.graylog2.indexer.indices.Indices.waitForStatus(Indices.java:642) ~[graylog.jar:?]
at org.graylog2.indexer.indices.Indices.waitForRecovery(Indices.java:633) ~[graylog.jar:?]
at org.graylog2.periodical.IndexRotationThread.checkAndRepair(IndexRotationThread.java:163) ~[graylog.jar:?]
at org.graylog2.periodical.IndexRotationThread.lambda$doRun$0(IndexRotationThread.java:76) ~[graylog.jar:?]
at java.lang.Iterable.forEach(Iterable.java:75) [?:1.8.0_141]
at org.graylog2.periodical.IndexRotationThread.doRun(IndexRotationThread.java:73) [graylog.jar:?]
at org.graylog2.plugin.periodical.Periodical.run(Periodical.java:77) [graylog.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_141]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_141]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_141]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_141]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
2017-08-16T09:23:51.522-05:00 WARN [IndexRotationThread] Deflector is pointing to [graylog_4], not the newest one: [graylog_5]. Re-pointing.

Any ideas as to how to fix this problem?

Thanks,
-Li


#2

ES 5 has different field mappings. Check that your field mappings are compatible with ES 5.5. You can find the culprit field probably from the Graylog UI in the indexing errors error messages. Create an ES5.5 custom field mapping for that field and suddenly things start to work again.


(Li Li) #3

@jtkarvo Thanks! But there is no indexing error in the graylog UI. currently the logs continue writing to the old index file before I issued the “rotate active write index” operation.


#4

At the moment I don’t have the UI in front of me, but I am sure I found the problematic fields using the UI. I don’t remember where. If this is your problem, then you can of course also look at ES logs. If you have custom field mappings and there fields of a string type, you could fix them first. In my case, too, Graylog continued to write log messages to the index but failed to rotate, before fixing the field mapping.

One question; did you wait that ES is yellow before starting the Graylog servers after ES upgrade? Upgrading from ES2.4 to ES5.5 requires that you stop Graylog, flush ES, upgrade, wait for yellow, and only after that restart Graylog.

Any other error messages in Graylog or ES logs?


(Li Li) #5

@jtkarvo I did stop graylog 2.3, then flushed ES, then upgraded, then waited my ES to become ‘green’, then I started graylog 2.3 back up. Below is the ES log that shows the upgrade process with no error: at 09:04 I stopped ES and started to upgrade it, at 9:14, I brought ES back up.

[2017-08-16 09:04:13,236][INFO ][node ] [es1] stopping …
[2017-08-16 09:04:13,290][INFO ][node ] [es1] stopped
[2017-08-16 09:04:13,290][INFO ][node ] [es1] closing …
[2017-08-16 09:04:13,300][INFO ][node ] [es1] closed
[2017-08-16T09:14:46,204][INFO ][o.e.n.Node ] [es1] initializing …
[2017-08-16T09:14:46,283][INFO ][o.e.e.NodeEnvironment ] [es1] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [47.8gb], net total_space [49.9gb], spins? [unknown], types [rootfs]
[2017-08-16T09:14:46,283][INFO ][o.e.e.NodeEnvironment ] [es1] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-08-16T09:14:46,299][INFO ][o.e.n.Node ] [es1] node name [es1], node ID [yo1NvEn-SQC0FCP0IZ2G9w]
[2017-08-16T09:14:46,299][INFO ][o.e.n.Node ] [es1] version[5.5.1], pid[28078], build[19c13d0/2017-07-18T20:44:24.823Z], OS[Linux/3.10.0-514.26.2.el7.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_141/25.141-b16]
[2017-08-16T09:14:46,299][INFO ][o.e.n.Node ] [es1] JVM arguments [-Xms2g, -Xmx2g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -Djdk.io.permissionsUseCanonicalPath=true, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j.skipJansi=true, -XX:+HeapDumpOnOutOfMemoryError, -Des.path.home=/usr/share/elasticsearch]
[2017-08-16T09:14:47,335][INFO ][o.e.p.PluginsService ] [es1] loaded module [aggs-matrix-stats]
[2017-08-16T09:14:47,335][INFO ][o.e.p.PluginsService ] [es1] loaded module [ingest-common]
[2017-08-16T09:14:47,335][INFO ][o.e.p.PluginsService ] [es1] loaded module [lang-expression]
[2017-08-16T09:14:47,335][INFO ][o.e.p.PluginsService ] [es1] loaded module [lang-groovy]
[2017-08-16T09:14:47,335][INFO ][o.e.p.PluginsService ] [es1] loaded module [lang-mustache]
[2017-08-16T09:14:47,335][INFO ][o.e.p.PluginsService ] [es1] loaded module [lang-painless]
[2017-08-16T09:14:47,335][INFO ][o.e.p.PluginsService ] [es1] loaded module [parent-join]
[2017-08-16T09:14:47,335][INFO ][o.e.p.PluginsService ] [es1] loaded module [percolator]
[2017-08-16T09:14:47,335][INFO ][o.e.p.PluginsService ] [es1] loaded module [reindex]
[2017-08-16T09:14:47,335][INFO ][o.e.p.PluginsService ] [es1] loaded module [transport-netty3]
[2017-08-16T09:14:47,336][INFO ][o.e.p.PluginsService ] [es1] loaded module [transport-netty4]
[2017-08-16T09:14:47,336][INFO ][o.e.p.PluginsService ] [es1] no plugins loaded
[2017-08-16T09:14:49,404][INFO ][o.e.d.DiscoveryModule ] [es1] using discovery type [zen]
[2017-08-16T09:14:49,908][INFO ][o.e.c.u.IndexFolderUpgrader] [graylog_1/7lgD97E4SKqh0pxbBoP1Mw] upgrading [/var/lib/elasticsearch/es-cluster/nodes/0/indices/graylog_1] to new naming convention
[2017-08-16T09:14:49,910][INFO ][o.e.c.u.IndexFolderUpgrader] [graylog_1/7lgD97E4SKqh0pxbBoP1Mw] moved from [/var/lib/elasticsearch/es-cluster/nodes/0/indices/graylog_1] to [/var/lib/elasticsearch/es-cluster/nodes/0/indices/7lgD97E4SKqh0pxbBoP1Mw]
[2017-08-16T09:14:49,916][INFO ][o.e.c.u.IndexFolderUpgrader] [graylog_0/eOc8PVeIRj6P6pyzJS8O2g] upgrading [/var/lib/elasticsearch/es-cluster/nodes/0/indices/graylog_0] to new naming convention
[2017-08-16T09:14:49,917][INFO ][o.e.c.u.IndexFolderUpgrader] [graylog_0/eOc8PVeIRj6P6pyzJS8O2g] moved from [/var/lib/elasticsearch/es-cluster/nodes/0/indices/graylog_0] to [/var/lib/elasticsearch/es-cluster/nodes/0/indices/eOc8PVeIRj6P6pyzJS8O2g]
[2017-08-16T09:14:49,922][INFO ][o.e.c.u.IndexFolderUpgrader] [graylog_4/6JbZd-gjTk-dWhPk96G3kw] upgrading [/var/lib/elasticsearch/es-cluster/nodes/0/indices/graylog_4] to new naming convention
[2017-08-16T09:14:49,923][INFO ][o.e.c.u.IndexFolderUpgrader] [graylog_4/6JbZd-gjTk-dWhPk96G3kw] moved from [/var/lib/elasticsearch/es-cluster/nodes/0/indices/graylog_4] to [/var/lib/elasticsearch/es-cluster/nodes/0/indices/6JbZd-gjTk-dWhPk96G3kw]
[2017-08-16T09:14:49,928][INFO ][o.e.c.u.IndexFolderUpgrader] [graylog_3/3dTUvsTuTx2YzZMfIDBOXw] upgrading [/var/lib/elasticsearch/es-cluster/nodes/0/indices/graylog_3] to new naming convention
[2017-08-16T09:14:49,930][INFO ][o.e.c.u.IndexFolderUpgrader] [graylog_3/3dTUvsTuTx2YzZMfIDBOXw] moved from [/var/lib/elasticsearch/es-cluster/nodes/0/indices/graylog_3] to [/var/lib/elasticsearch/es-cluster/nodes/0/indices/3dTUvsTuTx2YzZMfIDBOXw]
[2017-08-16T09:14:49,937][INFO ][o.e.c.u.IndexFolderUpgrader] [graylog_2/kwXU92I1SDaWGDc5xlwkyQ] upgrading [/var/lib/elasticsearch/es-cluster/nodes/0/indices/graylog_2] to new naming convention
[2017-08-16T09:14:49,938][INFO ][o.e.c.u.IndexFolderUpgrader] [graylog_2/kwXU92I1SDaWGDc5xlwkyQ] moved from [/var/lib/elasticsearch/es-cluster/nodes/0/indices/graylog_2] to [/var/lib/elasticsearch/es-cluster/nodes/0/indices/kwXU92I1SDaWGDc5xlwkyQ]
[2017-08-16T09:14:50,232][INFO ][o.e.n.Node ] [es1] initialized
[2017-08-16T09:14:50,232][INFO ][o.e.n.Node ] [es1] starting …
[2017-08-16T09:14:50,470][INFO ][o.e.t.TransportService ] [es1] publish_address {10.100.101.199:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}, {10.100.101.199:9300}, {[fe80::250:56ff:feb3:1fb7]:9300}
[2017-08-16T09:14:50,483][INFO ][o.e.b.BootstrapChecks ] [es1] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-08-16T09:14:53,566][INFO ][o.e.c.s.ClusterService ] [es1] new_master {es1}{yo1NvEn-SQC0FCP0IZ2G9w}{0tVEyCgnRIa7yFb03d0ibQ}{10.100.101.199}{10.100.101.199:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-08-16T09:14:53,671][INFO ][o.e.h.n.Netty4HttpServerTransport] [es1] publish_address {10.100.101.199:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}, {10.100.101.199:9200}, {[fe80::250:56ff:feb3:1fb7]:9200}
[2017-08-16T09:14:53,672][INFO ][o.e.n.Node ] [es1] started
[2017-08-16T09:14:53,903][INFO ][o.e.c.s.ClusterSettings ] [es1] updating [cluster.routing.allocation.enable] from [ALL] to [none]
[2017-08-16T09:14:54,043][INFO ][o.e.g.GatewayService ] [es1] recovered [5] indices into cluster_state
[2017-08-16T09:14:54,318][WARN ][o.e.c.m.MetaDataMappingService] [es1] [graylog_1] re-syncing mappings with cluster state because of types [[message]]
[2017-08-16T09:14:54,318][WARN ][o.e.c.m.MetaDataMappingService] [es1] [graylog_4] re-syncing mappings with cluster state because of types [[message]]
[2017-08-16T09:14:54,318][WARN ][o.e.c.m.MetaDataMappingService] [es1] [graylog_3] re-syncing mappings with cluster state because of types [[message]]
[2017-08-16T09:14:54,319][WARN ][o.e.c.m.MetaDataMappingService] [es1] [graylog_2] re-syncing mappings with cluster state because of types [[message]]
[2017-08-16T09:14:54,606][WARN ][o.e.c.m.MetaDataMappingService] [es1] [graylog_0] re-syncing mappings with cluster state because of types [[message]]
[2017-08-16T09:15:52,291][INFO ][o.e.c.s.ClusterService ] [es1] added {{es2}{jxuIgyMYTOmK8ruKcFyUlA}{gX0vbJ81RjuZQrYMWWGLqw}{10.100.101.197}{10.100.101.197:9300},}, reason: zen-disco-node-join[{es2}{jxuIgyMYTOmK8ruKcFyUlA}{gX0vbJ81RjuZQrYMWWGLqw}{10.100.101.197}{10.100.101.197:9300}]
[2017-08-16T09:15:52,528][WARN ][o.e.d.z.ElectMasterService] [es1] value for setting “discovery.zen.minimum_master_nodes” is too low. This can result in data loss! Please set it to at least a quorum of master-eligible nodes (current value: [-1], total number of master-eligible nodes used for publishing in this round: [2])
[2017-08-16T09:16:40,561][INFO ][o.e.c.s.ClusterService ] [es1] added {{es3}{hhWJUGYMTpyhk6SVHvRGEw}{ANQEgBitQ3uCLa7WFrwWrQ}{10.100.101.196}{10.100.101.196:9300},}, reason: zen-disco-node-join[{es3}{hhWJUGYMTpyhk6SVHvRGEw}{ANQEgBitQ3uCLa7WFrwWrQ}{10.100.101.196}{10.100.101.196:9300}]
[2017-08-16T09:16:41,150][INFO ][o.e.c.r.a.AllocationService] [es1] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[graylog_2][0], [graylog_3][0], [graylog_4][0], [graylog_0][0]] …]).


(Li Li) #6

@jtkarvo below is the field difference: graylog_5 is the new index generated after I did a manual ‘rotate active write index’, graylog_4 is the old index prior to the ES upgrade. I see the differences but don’t know how to fix that…

[root@es1 indices]# curl -XGET 'http://localhost:9200/graylog_5/mapping?pretty’
{
“graylog_5” : {
“mappings” : {
“message” : {
“dynamic_templates” : [
{
“internal_fields” : {
“match” : "gl2
",
“mapping” : {
“type” : “keyword”
}
}
},
{
“store_generic” : {
“match” : "
",
“mapping” : {
“index” : “not_analyzed”
}
}
}
],
“properties” : {
“full_message” : {
“type” : “text”,
“analyzer” : “standard”
},
“message” : {
“type” : “text”,
“analyzer” : “standard”
},
“source” : {
“type” : “text”,
“analyzer” : “analyzer_keyword”,
“fielddata” : true
},
“streams” : {
“type” : “keyword”
},
“timestamp” : {
“type” : “date”,
“format” : “yyyy-MM-dd HH:mm:ss.SSS”
}
}
}
}
}
}
[root@es1 indices]# curl -XGET 'http://localhost:9200/graylog_4/mapping?pretty’
{
“graylog_4” : {
“mappings” : {
“message” : {
“dynamic_templates” : [
{
“internal_fields” : {
“match” : "gl2
",
“mapping” : {
“index” : “not_analyzed”,
“type” : “string”
}
}
},
{
“store_generic” : {
“match” : "
",
“mapping” : {
“index” : “not_analyzed”
}
}
}
],
“properties” : {
“facility” : {
“type” : “string”,
“index” : “not_analyzed”,
“fielddata” : false
},
“full_message” : {
“type” : “string”,
“analyzer” : “standard”
},
“gl2_remote_ip” : {
“type” : “string”,
“index” : “not_analyzed”,
“fielddata” : false
},
“gl2_remote_port” : {
“type” : “string”,
“index” : “not_analyzed”,
“fielddata” : false
},
“gl2_source_input” : {
“type” : “string”,
“index” : “not_analyzed”,
“fielddata” : false
},
“gl2_source_node” : {
“type” : “string”,
“index” : “not_analyzed”,
“fielddata” : false
},
“level” : {
“type” : “long”
},
“message” : {
“type” : “string”,
“analyzer” : “standard”
},
“source” : {
“type” : “string”,
“analyzer” : “analyzer_keyword”
},
“streams” : {
“type” : “string”,
“index” : “not_analyzed”,
“fielddata” : false
},
“timestamp” : {
“type” : “date”,
“format” : “yyyy-MM-dd HH:mm:ss.SSS”
}
}
}
}
}
}


#7

is “facility” a field in your incoming messages? It seems it is of a string type. If it is, you can make a custom field mapping and map facility to either text or keyword. See http://docs.graylog.org/en/2.3/pages/configuration/elasticsearch.html#custom-index-mappings. gl2 fields are set with dynamic mapping, they are not a problem.

Fields where the _5 index shows the new field types (keyword or text) are OK, also the gl2 fields. The problem is probably a field which is a string in _4 index and missing in the _5 index.


(Li Li) #8

@jtkarvo yes “facility” is a field in my incoming messages. Thanks for pointing me to the direction to fix this problem, I’ll need to study a bit on field mapping


(system) #9

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.