
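### cat health

`cat health` is a terse, one-line representation of the same information returned by `/_cluster/health`.

~~~
GET /_cat/health?v
~~~

~~~
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1475871424 16:17:04  elasticsearch green           1         1      5   5    0    0        0             0                  -                100.0%
~~~

Setting the `ts` option to `false` suppresses the timestamp columns:

~~~
GET /_cat/health?v&ts=false
~~~

The result is:

~~~
cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
elasticsearch green           1         1      5   5    0    0        0             0                  -                100.0%
~~~

A common use of this command is to check that every node reports the same cluster health:

~~~
% pssh -i -h list.of.cluster.hosts curl -s localhost:9200/_cat/health
[1] 20:20:52 [SUCCESS] es3.vm
1384309218 18:20:18 foo green 3 3 3 3 0 0 0 0
[2] 20:20:52 [SUCCESS] es1.vm
1384309218 18:20:18 foo green 3 3 3 3 0 0 0 0
[3] 20:20:52 [SUCCESS] es2.vm
1384309218 18:20:18 foo green 3 3 3 3 0 0 0 0
~~~

A less obvious use is to track the recovery of a large cluster over time. With a very large number of shards, starting a cluster, or even recovering after losing a node, can take a while (depending on your network and disks). One way to track progress is to run this command in a loop with a delay:

~~~
% while true; do curl localhost:9200/_cat/health; sleep 120; done
1384309446 18:24:06 foo red 3 3 20 20 0 0 1812 0
1384309566 18:26:06 foo yellow 3 3 950 916 0 12 870 0
1384309686 18:28:06 foo yellow 3 3 1328 916 0 12 492 0
1384309806 18:30:06 foo green 3 3 1832 916 4 0 0
^C
~~~

In this scenario, we can tell that recovery took roughly four minutes. If it had gone on for hours, we would have been able to watch the number of unassigned shards drop steeply. If that number had stayed the same, we would have known there was a problem.

#### Why the timestamp?

You typically use the health command when a cluster is malfunctioning. During that period it is very important to correlate activity with log files, alerting systems, and so on.

There are two output formats. The `HH:MM:SS` output is only for quick human consumption. The epoch time retains more information, including the date, and is machine sortable if your recovery spans several days.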
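
The idea of watching the `unassign` column and treating an unchanging value as a warning sign can be sketched in a few lines of Python. This is only an illustration, not part of Elasticsearch: the function names `parse_cat_health` and `is_stalled`, and the window of three samples, are assumptions made for this example. It also assumes the output was requested with `?v` so that a header row is present, and that the cluster name contains no spaces.

```python
"""Sketch: parse `GET /_cat/health?v` output and flag a stalled recovery."""


def parse_cat_health(text):
    """Turn header-plus-rows text from `_cat/health?v` into a list of dicts.

    Assumes whitespace-separated columns and that the first non-empty
    line is the header row.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        values = line.split()
        if len(values) != len(header):
            continue  # skip rows whose column count does not match the header
        rows.append(dict(zip(header, values)))
    return rows


def is_stalled(unassigned_counts, window=3):
    """Return True if the last `window` samples are equal and non-zero.

    `window` is a hypothetical threshold chosen for this sketch.
    """
    if len(unassigned_counts) < window:
        return False
    recent = unassigned_counts[-window:]
    return recent[0] > 0 and len(set(recent)) == 1


# Sample taken from the `GET /_cat/health?v` response shown above.
sample = """\
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1475871424 16:17:04 elasticsearch green 1 1 5 5 0 0 0 0 - 100.0%
"""

row = parse_cat_health(sample)[0]
print(row["status"], row["unassign"])
```

In practice, the text would come from polling the endpoint at a fixed interval and appending `int(row["unassign"])` to a list after each poll. `is_stalled` then reports whether the count has stopped moving while shards are still unassigned.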