Executing Aggregations（执行聚合） · Elasticsearch 5.4 中文文档

# Executing Aggregations（执行聚合）原文链接 : [https://www.elastic.co/guide/en/elasticsearch/reference/5.4/_executing_aggregations.html](https://www.elastic.co/guide/en/elasticsearch/reference/5.4/_executing_aggregations.html) 译文链接 : [http://www.apache.wiki/pages/viewpage.action?pageId=4260792](http://www.apache.wiki/pages/viewpage.action?pageId=4260792) 贡献者 : [那伊抹微笑](/display/~wangyangting)，[ApacheCN](/display/~apachecn)，[Apache中文网](/display/~apachechina) 聚合提供了从数据中分组和提取数据的能力。最简单的聚合方法大致等于 **SQL GROUP BY** 和 **SQL** 聚合函数。在 **Elasticsearch** 中，您有执行搜索返回 **hits**（命中结果），并且同时返回聚合结果，把一个响应中的所有 **hits**（命中结果）分隔开的能力。这是非常强大且有效的，您可以执行查询和多个聚合，并且在一次使用中得到各自的（任何一个的）返回结果，使用一次简洁和简化的 **API** 来避免网络往返。作为开始，找这个例子按 **state** 将所有的 **account** 给分组了，然后按 **count** 降序（默认）排序返回 **top 10**（默认）的 **state** : ``` curl -XGET 'localhost:9200/bank/_search?pretty' -d' { "size": 0, "aggs": { "group_by_state": { "terms": { "field": "state.keyword" } } } }' ``` 在 **SQL** 中，上面的聚合概念类似下面 : ``` SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC ``` 响应如下（部分显示）: ``` { "took": 29, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits" : { "total" : 1000, "max_score" : 0.0, "hits" : [ ] }, "aggregations" : { "group_by_state" : { "doc_count_error_upper_bound": 20, "sum_other_doc_count": 770, "buckets" : [ { "key" : "ID", "doc_count" : 27 }, { "key" : "TX", "doc_count" : 27 }, { "key" : "AL", "doc_count" : 25 }, { "key" : "MD", "doc_count" : 25 }, { "key" : "TN", "doc_count" : 23 }, { "key" : "MA", "doc_count" : 21 }, { "key" : "NC", "doc_count" : 21 }, { "key" : "ND", "doc_count" : 21 }, { "key" : "ME", "doc_count" : 20 }, { "key" : "MO", "doc_count" : 20 } ] } } } ``` 我们可以看到上面在 **ID**（**Idaho**）中有 **27** 个 **account**（账户），接着有 **27** 个 **account**（账户）在 **TX**（**Texas**）中，**25** 个账户在 **AL**（**Alabama**）中，等等。注意我们设置了 **size=0** 以不显示搜索的 **hits**（命中数量），因为我们只希望在响应中看聚合结果。基于前面的聚合，这个例子按 **state** 计算了平均的账户余额（再一次按 **count** 降序（默认）排序返回 **top 10**（默认）的 **state **）: ``` curl -XGET 'localhost:9200/bank/_search?pretty' -d' { "size": 0, "aggs": { "group_by_state": { "terms": { "field": "state.keyword" }, "aggs": { "average_balance": { "avg": { "field": "balance" } } } } } }' ``` 注意，我们是如何内嵌 **`average_balance`**聚合到 **group_by_state** 聚合的内部的。这是所有聚合中常见的模式。您可以嵌入聚合到随意的聚合中以从您需要的数据中开窗汇总。基于前面的聚合，现在让我们在 **average balance**（平均余额）上降序排序 : ``` curl -XGET 'localhost:9200/bank/_search?pretty' -d' { "size": 0, "aggs": { "group_by_state": { "terms": { "field": "state.keyword", "order": { "average_balance": "desc" } }, "aggs": { "average_balance": { "avg": { "field": "balance" } } } } } }' ``` 这个例子演示了我们如何按年龄段（**20-29**岁，**30-39**岁，和 **40-49**岁），然后按性别，最后获得每个年龄段，每个性别的平均账户余额 : ``` GET /bank/_search { "size": 0, "aggs": { "group_by_age": { "range": { "field": "age", "ranges": [ { "from": 20, "to": 30 }, { "from": 30, "to": 40 }, { "from": 40, "to": 50 } ] }, "aggs": { "group_by_gender": { "terms": { "field": "gender.keyword" }, "aggs": { "average_balance": { "avg": { "field": "balance" } } } } } } } } ``` 还有一些我们这里没有细讲的其它的聚合功能。如果您想要做进一步的实验，[聚合参考指南](https://www.elastic.co/guide/en/elasticsearch/reference/5.0/search-aggregations.html)是一个比较好的起点。