df shows:
Filesystem      1K-blocks     Used Available Use% Mounted on
udev              8105732        0   8105732   0% /dev
tmpfs             1623552     2196   1621356   1% /run
/dev/nvme0n1p1  101583780 49813840  51753556  50% /
tmpfs             8117756        0   8117756   0% /dev/shm
tmpfs                5120        0      5120   0% /run/lock
tmpfs             8117756        0   8117756   0% /sys/fs/cgroup
/dev/loop0          13056    13056         0 100% /snap/amazon-ssm-agent/495
/dev/loop1          93312    93312         0 100% /snap/core/6531
/dev/loop2          90112    90112         0 100% /snap/core/5328
/dev/loop3          18304    18304         0 100% /snap/amazon-ssm-agent/1068
/dev/loop4          93184    93184         0 100% /snap/core/6405
tmpfs             1623548        0   1623548   0% /run/user/1000
I haven’t done anything fancy to move the install away from the package defaults. I believe everything is going to /, and there appears to be about 50GB free there. I did expand the drive from its initial size; I believe it started at 50GB and I expanded it to 100GB.
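To sanity-check those numbers, here is a minimal sketch that recomputes the usage of / from the df line above and compares it with what I understand to be the Elasticsearch default disk watermarks (85% low, 90% high, 95% flood stage). Those thresholds are an assumption on my part, since I have not overridden them, and the variable and function names are only for illustration.

```python
# Values copied from the df line for / above (units: 1K-blocks).
used_kb = 49813840
avail_kb = 51753556

# Assumed Elasticsearch defaults for
# cluster.routing.allocation.disk.watermark.{low,high,flood_stage}.
WATERMARKS = (("low", 85.0), ("high", 90.0), ("flood_stage", 95.0))


def used_percent(used, avail):
    """Usage as a percentage of used + available space.

    This is the same basis df uses for its Use% column, so blocks
    reserved for root are left out of the denominator.
    """
    return 100.0 * used / (used + avail)


pct = used_percent(used_kb, avail_kb)
print(f"/ is {pct:.1f}% used")
for name, limit in WATERMARKS:
    state = "exceeded" if pct >= limit else "ok"
    print(f"{name} watermark ({limit}%): {state}")
```

With the values above the usage comes out just under 50%, so unless the watermarks were changed from the assumed defaults, the disk does not look close to any of them right now.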
/var/log/elasticsearch has the following files:
drwxr-x--- 2 elasticsearch elasticsearch 4096 Mar 26 20:04 .
drwxrwxr-x 15 root syslog 4096 Mar 27 06:25 ..
-rw-r--r-- 1 elasticsearch elasticsearch 67109550 Mar 18 05:03 gc.log.0
-rw-r--r-- 1 elasticsearch elasticsearch 2810784 Mar 27 16:06 gc.log.0.current
-rw-r--r-- 1 elasticsearch elasticsearch 16357876 Mar 20 00:51 gc.log.1.current
-rw-r--r-- 1 elasticsearch elasticsearch 9327 Mar 15 00:00 graylog-2019-03-14-1.log.gz
-rw-r--r-- 1 elasticsearch elasticsearch 4642 Mar 16 00:52 graylog-2019-03-15-1.log.gz
-rw-r--r-- 1 elasticsearch elasticsearch 8189 Mar 17 03:02 graylog-2019-03-16-1.log.gz
-rw-r--r-- 1 elasticsearch elasticsearch 977 Mar 18 00:18 graylog-2019-03-17-1.log.gz
-rw-r--r-- 1 elasticsearch elasticsearch 12639 Mar 19 00:00 graylog-2019-03-18-1.log.gz
-rw-r--r-- 1 elasticsearch elasticsearch 2194 Mar 20 00:51 graylog-2019-03-19-1.log.gz
-rw-r--r-- 1 elasticsearch elasticsearch 12281 Mar 21 00:00 graylog-2019-03-20-1.log.gz
-rw-r--r-- 1 elasticsearch elasticsearch 2307 Mar 26 20:04 graylog-2019-03-21-1.log.gz
-rw-r--r-- 1 elasticsearch elasticsearch 212857 Mar 26 20:05 graylog.log
-rw-r--r-- 1 elasticsearch elasticsearch 2887291 Mar 26 20:47 graylog_deprecation.log
-rw-r--r-- 1 elasticsearch elasticsearch 0 Mar 14 01:29 graylog_index_indexing_slowlog.log
-rw-r--r-- 1 elasticsearch elasticsearch 0 Mar 14 01:29 graylog_index_search_slowlog.log
tail -50 of the graylog.log shows the following:
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:153) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.action.search.FetchSearchPhase.moveToNextPhase(FetchSearchPhase.java:206) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.action.search.FetchSearchPhase.lambda$innerRun$2(FetchSearchPhase.java:104) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.action.search.CountedCollector.countDown(CountedCollector.java:53) [elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.action.search.CountedCollector.onFailure(CountedCollector.java:76) [elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.action.search.FetchSearchPhase$2.onFailure(FetchSearchPhase.java:173) [elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:53) [elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:462) [elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1103) [elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.transport.TransportService$6.doRun(TransportService.java:660) [elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:759) [elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.6.2.jar:6.6.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
[2019-03-26T20:04:39,209][INFO ][o.e.n.Node ] [FGUZb68] stopped
[2019-03-26T20:04:39,210][INFO ][o.e.n.Node ] [FGUZb68] closing ...
[2019-03-26T20:04:39,230][INFO ][o.e.n.Node ] [FGUZb68] closed
[2019-03-26T20:05:04,844][INFO ][o.e.e.NodeEnvironment ] [FGUZb68] using [1] data paths, mounts [[/ (/dev/nvme0n1p1)]], net usable_space [49.5gb], net total_space [96.8gb], types [ext4]
[2019-03-26T20:05:04,847][INFO ][o.e.e.NodeEnvironment ] [FGUZb68] heap size [990.7mb], compressed ordinary object pointers [true]
[2019-03-26T20:05:05,493][INFO ][o.e.n.Node ] [FGUZb68] node name derived from node ID [FGUZb682QGGRBFobpyu-jQ]; set [node.name] to override
[2019-03-26T20:05:05,494][INFO ][o.e.n.Node ] [FGUZb68] version[6.7.0], pid[802], build[oss/deb/8453f77/2019-03-21T15:32:29.844721Z], OS[Linux/4.15.0-1034-aws/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_191/25.191-b12]
[2019-03-26T20:05:05,494][INFO ][o.e.n.Node ] [FGUZb68] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch-8684823813574339173, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=/var/lib/elasticsearch, -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:/var/log/elasticsearch/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/etc/elasticsearch, -Des.distribution.flavor=oss, -Des.distribution.type=deb]
[2019-03-26T20:05:06,537][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [aggs-matrix-stats]
[2019-03-26T20:05:06,537][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [analysis-common]
[2019-03-26T20:05:06,537][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [ingest-common]
[2019-03-26T20:05:06,537][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [ingest-geoip]
[2019-03-26T20:05:06,537][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [ingest-user-agent]
[2019-03-26T20:05:06,537][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [lang-expression]
[2019-03-26T20:05:06,537][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [lang-mustache]
[2019-03-26T20:05:06,537][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [lang-painless]
[2019-03-26T20:05:06,537][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [mapper-extras]
[2019-03-26T20:05:06,538][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [parent-join]
[2019-03-26T20:05:06,538][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [percolator]
[2019-03-26T20:05:06,538][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [rank-eval]
[2019-03-26T20:05:06,538][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [reindex]
[2019-03-26T20:05:06,538][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [repository-url]
[2019-03-26T20:05:06,538][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [transport-netty4]
[2019-03-26T20:05:06,538][INFO ][o.e.p.PluginsService ] [FGUZb68] loaded module [tribe]
[2019-03-26T20:05:06,538][INFO ][o.e.p.PluginsService ] [FGUZb68] no plugins loaded
[2019-03-26T20:05:12,149][INFO ][o.e.d.DiscoveryModule ] [FGUZb68] using discovery type [zen] and host providers [settings]
[2019-03-26T20:05:12,659][INFO ][o.e.n.Node ] [FGUZb68] initialized
[2019-03-26T20:05:12,659][INFO ][o.e.n.Node ] [FGUZb68] starting ...
[2019-03-26T20:05:13,078][INFO ][o.e.t.TransportService ] [FGUZb68] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2019-03-26T20:05:16,235][INFO ][o.e.c.s.MasterService ] [FGUZb68] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {FGUZb68}{FGUZb682QGGRBFobpyu-jQ}{hsZ_Cp10R321xSDJMnOoCQ}{127.0.0.1}{127.0.0.1:9300}
[2019-03-26T20:05:16,240][INFO ][o.e.c.s.ClusterApplierService] [FGUZb68] new_master {FGUZb68}{FGUZb682QGGRBFobpyu-jQ}{hsZ_Cp10R321xSDJMnOoCQ}{127.0.0.1}{127.0.0.1:9300}, reason: apply cluster state (from master [master {FGUZb68}{FGUZb682QGGRBFobpyu-jQ}{hsZ_Cp10R321xSDJMnOoCQ}{127.0.0.1}{127.0.0.1:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2019-03-26T20:05:16,284][INFO ][o.e.h.n.Netty4HttpServerTransport] [FGUZb68] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2019-03-26T20:05:16,284][INFO ][o.e.n.Node ] [FGUZb68] started
[2019-03-26T20:05:17,247][INFO ][o.e.g.GatewayService ] [FGUZb68] recovered [50] indices into cluster_state
[2019-03-26T20:05:24,134][INFO ][o.e.c.r.a.AllocationService] [FGUZb68] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[graylog_61][1], [graylog_61][2]] ...]).
tail -50 of the gc.log.0.current file shows:
2019-03-27T16:11:04.393+0000: 72362.036: Total time for which application threads were stopped: 0.0002143 seconds, Stopping threads took: 0.0000502 seconds
2019-03-27T16:11:06.075+0000: 72363.718: [GC (Allocation Failure) 2019-03-27T16:11:06.075+0000: 72363.718: [ParNew
Desired survivor size 17432576 bytes, new threshold 6 (max 6)
- age 1: 431600 bytes, 431600 total
- age 2: 13344 bytes, 444944 total
- age 3: 688 bytes, 445632 total
: 273371K->599K(306688K), 0.0042593 secs] 519745K->246974K(1014528K), 0.0043273 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2019-03-27T16:11:06.079+0000: 72363.722: Total time for which application threads were stopped: 0.0045746 seconds, Stopping threads took: 0.0000361 seconds
2019-03-27T16:11:14.398+0000: 72372.041: Total time for which application threads were stopped: 0.0004605 seconds, Stopping threads took: 0.0002972 seconds
2019-03-27T16:11:14.401+0000: 72372.044: Total time for which application threads were stopped: 0.0001657 seconds, Stopping threads took: 0.0000393 seconds
2019-03-27T16:11:24.393+0000: 72382.036: Total time for which application threads were stopped: 0.0001858 seconds, Stopping threads took: 0.0000304 seconds
2019-03-27T16:11:29.677+0000: 72387.320: [GC (Allocation Failure) 2019-03-27T16:11:29.677+0000: 72387.320: [ParNew
Desired survivor size 17432576 bytes, new threshold 6 (max 6)
- age 1: 667456 bytes, 667456 total
- age 2: 144 bytes, 667600 total
- age 4: 320 bytes, 667920 total
: 273239K->771K(306688K), 0.0039145 secs] 519614K->247146K(1014528K), 0.0039866 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
2019-03-27T16:11:29.681+0000: 72387.324: Total time for which application threads were stopped: 0.0042236 seconds, Stopping threads took: 0.0000417 seconds
2019-03-27T16:11:34.394+0000: 72392.037: Total time for which application threads were stopped: 0.0001682 seconds, Stopping threads took: 0.0000333 seconds
2019-03-27T16:11:44.395+0000: 72402.038: Total time for which application threads were stopped: 0.0002155 seconds, Stopping threads took: 0.0000504 seconds
2019-03-27T16:11:44.402+0000: 72402.045: Total time for which application threads were stopped: 0.0001781 seconds, Stopping threads took: 0.0000317 seconds
2019-03-27T16:11:51.811+0000: 72409.454: [GC (Allocation Failure) 2019-03-27T16:11:51.811+0000: 72409.454: [ParNew
Desired survivor size 17432576 bytes, new threshold 6 (max 6)
- age 1: 723544 bytes, 723544 total
- age 2: 6216 bytes, 729760 total
- age 3: 64 bytes, 729824 total
- age 5: 320 bytes, 730144 total
: 273411K->840K(306688K), 0.0165914 secs] 519786K->247215K(1014528K), 0.0166842 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
2019-03-27T16:11:51.828+0000: 72409.471: Total time for which application threads were stopped: 0.0169449 seconds, Stopping threads took: 0.0000312 seconds
2019-03-27T16:11:54.394+0000: 72412.037: Total time for which application threads were stopped: 0.0001838 seconds, Stopping threads took: 0.0000344 seconds
2019-03-27T16:12:04.393+0000: 72422.036: Total time for which application threads were stopped: 0.0001912 seconds, Stopping threads took: 0.0000444 seconds
2019-03-27T16:12:14.390+0000: 72432.033: [GC (Allocation Failure) 2019-03-27T16:12:14.390+0000: 72432.033: [ParNew
Desired survivor size 17432576 bytes, new threshold 6 (max 6)
- age 1: 374584 bytes, 374584 total
- age 2: 60856 bytes, 435440 total
- age 3: 80 bytes, 435520 total
- age 4: 64 bytes, 435584 total
- age 6: 320 bytes, 435904 total
: 273480K->620K(306688K), 0.0068328 secs] 519855K->246995K(1014528K), 0.0069079 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2019-03-27T16:12:14.397+0000: 72432.040: Total time for which application threads were stopped: 0.0071399 seconds, Stopping threads took: 0.0000310 seconds
2019-03-27T16:12:14.401+0000: 72432.044: Total time for which application threads were stopped: 0.0002052 seconds, Stopping threads took: 0.0000587 seconds
2019-03-27T16:12:14.407+0000: 72432.050: Total time for which application threads were stopped: 0.0001587 seconds, Stopping threads took: 0.0000259 seconds
2019-03-27T16:12:24.394+0000: 72442.037: Total time for which application threads were stopped: 0.0004411 seconds, Stopping threads took: 0.0000420 seconds
2019-03-27T16:12:34.394+0000: 72452.036: [GC (Allocation Failure) 2019-03-27T16:12:34.394+0000: 72452.037: [ParNew
Desired survivor size 17432576 bytes, new threshold 6 (max 6)
- age 1: 161648 bytes, 161648 total
- age 2: 384 bytes, 162032 total
- age 3: 80 bytes, 162112 total
- age 5: 64 bytes, 162176 total
: 273260K->424K(306688K), 0.0124344 secs] 519635K->246799K(1014528K), 0.0125396 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
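In case it helps with reading the GC output, this is a rough sketch that pulls the whole-heap figures out of the ParNew summary lines (the second before->after(capacity) triple, which follows the young-generation one). The regular expression and the function name are my own and only target the CMS/ParNew format shown above; other collectors or JVM versions would need a different pattern.

```python
import re

# Matches the whole-heap part of a ParNew summary line, e.g.
# "... secs] 519745K->246974K(1014528K), 0.0043273 secs]".
# The leading "] " skips the young-generation triple, which follows ": ".
HEAP_RE = re.compile(r"\] (\d+)K->(\d+)K\((\d+)K\), ([\d.]+) secs\]")


def heap_after_gc(line):
    """Return heap figures for one GC summary line, or None if it is not one."""
    match = HEAP_RE.search(line)
    if match is None:
        return None
    before_kb, after_kb, capacity_kb = (int(match.group(i)) for i in (1, 2, 3))
    return {
        "before_kb": before_kb,
        "after_kb": after_kb,
        "capacity_kb": capacity_kb,
        "pause_secs": float(match.group(4)),
        "percent_after": 100.0 * after_kb / capacity_kb,
    }


# Sample line copied from the gc.log.0.current excerpt above.
sample = (
    ": 273371K->599K(306688K), 0.0042593 secs] "
    "519745K->246974K(1014528K), 0.0043273 secs] "
    "[Times: user=0.02 sys=0.00, real=0.01 secs]"
)
info = heap_after_gc(sample)
print(info)
```

On the sample line, roughly a quarter of the heap is still occupied after the collection, and the pause is a few milliseconds.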
Do these snippets give any hint as to what’s happening? And where should I look for log entries that pertain to the low disk watermark?
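For what it’s worth, this is the kind of search I have in mind: a minimal sketch that scans the current and rotated (gzipped) graylog logs in /var/log/elasticsearch for any line mentioning "watermark". I am assuming that is the word such entries would contain; the helper names are hypothetical, and the script has to run as a user that can read that directory.

```python
import gzip
import re
from pathlib import Path

# Assumption: watermark-related log entries contain the word "watermark".
PATTERN = re.compile(r"watermark", re.IGNORECASE)


def open_log(path):
    """Open a plain or gzip-compressed log file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def find_watermark_lines(paths):
    """Return (file, line number, text) for every line matching PATTERN."""
    hits = []
    for path in paths:
        with open_log(path) as handle:
            for number, line in enumerate(handle, start=1):
                if PATTERN.search(line):
                    hits.append((str(path), number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    log_dir = Path("/var/log/elasticsearch")
    # Covers graylog.log and the rotated graylog-YYYY-MM-DD-N.log.gz files.
    logs = sorted(log_dir.glob("graylog*.log*")) if log_dir.is_dir() else []
    for name, number, line in find_watermark_lines(logs):
        print(f"{name}:{number}: {line}")
```

If this prints nothing, then either nothing was logged about watermarks in the retained files or my assumption about the wording is wrong.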
Thanks,
Tariq