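## **Elasticsearch Queries**
Elasticsearch accepts search requests in two forms: a lightweight URI search, and a full JSON request body known as the Query DSL (structured query). Because the DSL is more intuitive and easier to work with, it is what most people use. A DSL query is sent (typically via POST) as a JSON body, which makes it very flexible and allows many variations. One thing to watch for: the JSON snippets in the official documentation are often only fragments and cannot always be pasted in and run as-is. You usually need to wrap them in an outer structure under the `query` key.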
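The wrapping step can be sketched in a few lines of Python. This is only an illustration: `to_request_body` is a hypothetical helper name, and the `match` clause is just a sample fragment of the kind the documentation shows.

```python
import json

def to_request_body(clause):
    """Wrap a bare query clause in the top-level "query" key."""
    return {"query": clause}

# A bare clause, as it might appear in a documentation snippet.
clause = {"match": {"title": "宝马"}}

# The body that would actually be sent to the _search endpoint.
body = to_request_body(clause)
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON is what goes into the request body of a `_search` call.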
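## **URI search**
Official documentation: [https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html)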
![](https://box.kancloud.cn/8d8ea3e39e5b61c5dadf8967b5ec7037_620x203.png)
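A URI search passes the search through URL query parameters. Commonly used parameters:
* q: the query string;
* df: the default field to search when `q` does not name one;
* sort: sort order;
* timeout: how long the search may run before it times out;
* from, size: used for pagination.
For example: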
```
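#Find documents whose user field contains alfred, sorted by age ascending; return the 5th through 14th hits (from=4 skips the first 4); stop with a timeout if the search has not finished within 1s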
GET /my_index/_search?q=alfred&df=user&sort=age:asc&from=4&size=10&timeout=1s
```
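The relationship between `from`, `size` and the hits that come back is easy to get off by one. The sketch below rebuilds the request path from the example above and computes which hits it covers. The helper names (`page_to_from`, `hit_range`, `build_search_path`) are made up for this illustration and are not part of any Elasticsearch client.

```python
from urllib.parse import urlencode

def page_to_from(page, size):
    """Convert a 1-based page number into the zero-based `from` offset."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return (page - 1) * size

def hit_range(from_, size):
    """1-based positions of the first and last hit a request can return."""
    return (from_ + 1, from_ + size)

def build_search_path(index, q, df, sort, from_, size, timeout):
    """Assemble the URI search path with its query parameters."""
    params = {"q": q, "df": df, "sort": sort,
              "from": from_, "size": size, "timeout": timeout}
    # safe=":" keeps "age:asc" readable instead of percent-encoding the colon
    return f"/{index}/_search?{urlencode(params, safe=':')}"

path = build_search_path("my_index", "alfred", "user", "age:asc", 4, 10, "1s")
print(path)
print(hit_range(4, 10))
```

With `from=4` and `size=10` the request skips four hits and returns the 5th through 14th.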
## **Request body search**
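Here the search is expressed in the request body.
### **(1) match query**
The match query is the standard full-text query (it is sometimes loosely called a fuzzy query, but it is not the same as the `fuzzy` query). It first runs the search text through an analyzer to split it into terms, then matches each resulting term against the field. There are two closely related queries: match_phrase and multi_match.
Example: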
```
#Create the index and load sample data
PUT my_index
{
"mappings": {
"_doc": {
"dynamic":"strict",
"properties": {
"title" : {
"type":"text"
},
"name":{
"type" : "keyword"
}
}
}
}
}
PUT my_index/_doc/1
{
"name" : "张三",
"title" : "我的宝马有222马力"
}
PUT my_index/_doc/2
{
"name" : "李四",
"title" : "我的奥迪有220马力"
}
PUT my_index/_doc/3
{
"name" : "王五",
"title" : "我的玛莎拉蒂有250马力"
}
#match query
POST my_index/_doc/_search
{
"query": {
"match": {
"title": "宝马玛力"
}
},
"highlight":{
"pre_tags":"<tag1>",
"post_tags" : "</tag1>",
"fields":{"title":{}}
}
}
#Response
{
"took": 17,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 0.970927,
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 0.970927,
"_source": {
"name": "张三",
"title": "我的宝马有222马力"
},
"highlight": {
"title": [
"我的<tag1>宝</tag1><tag1>马</tag1>有222<tag1>马</tag1><tag1>力</tag1>"
]
}
},
{
"_index": "my_index",
"_type": "_doc",
"_id": "3",
"_score": 0.8630463,
"_source": {
"name": "王五",
"title": "我的玛莎拉蒂有250马力"
},
"highlight": {
"title": [
"我的<tag1>玛</tag1>莎拉蒂有250<tag1>马</tag1><tag1>力</tag1>"
]
}
},
{
"_index": "my_index",
"_type": "_doc",
"_id": "2",
"_score": 0.5753642,
"_source": {
"name": "李四",
"title": "我的奥迪有220马力"
},
"highlight": {
"title": [
"我的奥迪有220<tag1>马</tag1><tag1>力</tag1>"
]
}
}
]
}
}
```
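Note: the match query breaks the search text "宝马玛力" into individual terms ("宝", "马", "玛", "力", because the default analyzer emits one term per Chinese character) and matches each of them. By default a document only needs to contain one of the terms, which is why all three documents are returned.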
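To see why every document is a hit, the behaviour can be imitated in plain Python. This is a rough model, not Elasticsearch code: `analyze` is a hypothetical stand-in that emits one token per Chinese character and keeps ASCII letter/digit runs together, and `match_any` simply reports which query terms a title shares.

```python
import re

def analyze(text):
    """Rough stand-in for the default analyzer on this sample data:
    one token per Han character, ASCII letter/digit runs kept whole."""
    return re.findall(r"[\u4e00-\u9fff]|[a-z0-9]+", text.lower())

def match_any(query, field_value):
    """Return the query terms (in query order) that occur in the field.
    A non-empty result means the document would be a hit under OR semantics."""
    doc_tokens = set(analyze(field_value))
    matched = []
    for term in analyze(query):
        if term in doc_tokens and term not in matched:
            matched.append(term)
    return matched

titles = {
    "1": "我的宝马有222马力",
    "2": "我的奥迪有220马力",
    "3": "我的玛莎拉蒂有250马力",
}
for doc_id, title in titles.items():
    print(doc_id, match_any("宝马玛力", title))
```

Each title shares at least one term with the query, which lines up with the highlighted fragments in the response above.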
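### **(2) match_phrase query (phrase matching)**
Like the match query, match_phrase first analyzes the query string into a list of terms. It then searches for all of those terms but keeps only documents that contain every term, with the terms in the same order and at adjacent positions. Put simply: the document must contain all terms of the search text and, unless you relax the rule with slop, they must also be next to each other.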
```
#Add one more document
PUT my_index/_doc/5
{
"name" : "陈六",
"title" : "我的宝玛有250马力"
}
#Query
POST my_index/_doc/_search
{
"query": {
"match_phrase": {
"title": "宝马玛力"
}
},
"highlight":{
"pre_tags":"<h1>",
"post_tags" : "</h1>",
"fields":{"title":{}}
}
}
#Response
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
```
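Note: there are no hits because no document contains all of the query terms at adjacent positions.
Exact phrase matching can be too strict. Often we want a tolerance so that a phrase still matches when its terms are slightly apart, and that is what slop provides: it sets how many positions the terms may be moved to form a match.
For example: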
```
#Add two more documents
PUT my_index/_doc/6
{
"name" : "陈六",
"title" : "我的宝马的玛力有250马力"
}
PUT my_index/_doc/7
{
"name" : "陈六",
"title" : "我的宝马的李玛力有250马力"
}
#Query
POST my_index/_doc/_search
{
"query": {
"match_phrase": {
"title": {
"query":"宝马玛力",
"slop" : 1
}
}
},
"highlight":{
"pre_tags":"<h1>",
"post_tags" : "</h1>",
"fields":{"title":{}}
}
}
#Response
{
"took": 8,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.0359334,
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "6",
"_score": 1.0359334,
"_source": {
"name": "陈六",
"title": "我的宝马的玛力有250马力"
},
"highlight": {
"title": [
"我的<h1>宝</h1><h1>马</h1>的<h1>玛</h1><h1>力</h1>有250马力"
]
}
}
]
}
}
```
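Note: the title "我的宝马的玛力有250马力" contains all of the query terms (as "宝马的玛力"), and their positions are off by only one, so it matches with slop 1. Document 7 would need the terms moved by two positions and is therefore not returned.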
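The effect of slop on these documents can be reproduced with a small position-based model. This is a simplified sketch and not Lucene's actual phrase scorer: it assumes one token per Chinese character (the hypothetical `analyze` helper), and it treats a document as a match when the query terms can be found at positions whose offsets from their query positions differ by at most `slop`.

```python
import re
from itertools import product

def analyze(text):
    """One token per Han character, ASCII letter/digit runs kept whole."""
    return re.findall(r"[\u4e00-\u9fff]|[a-z0-9]+", text.lower())

def phrase_matches(query, field_value, slop=0):
    """Simplified sloppy-phrase check based on token positions."""
    q_terms = analyze(query)
    doc_terms = analyze(field_value)
    # All positions at which each query term occurs in the document.
    positions = [[i for i, t in enumerate(doc_terms) if t == term]
                 for term in q_terms]
    if any(not p for p in positions):
        return False  # every query term must be present
    for combo in product(*positions):
        # How far each term sits from where an exact phrase would put it.
        offsets = [pos - i for i, pos in enumerate(combo)]
        if max(offsets) - min(offsets) <= slop:
            return True
    return False

docs = {
    "5": "我的宝玛有250马力",
    "6": "我的宝马的玛力有250马力",
    "7": "我的宝马的李玛力有250马力",
}
for doc_id, title in docs.items():
    print(doc_id, phrase_matches("宝马玛力", title, slop=1))
```

Under this model only document 6 matches with `slop` set to 1, document 7 starts matching at 2, and none of them match with the default of 0, which agrees with the responses shown above.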
### **(3) multi_match query**
To search several fields at once and accept a document as soon as any one of the fields matches, use multi_match.
```
#Add one more document
PUT my_index/_doc/9
{
"name" : "玛力",
"title" : "我有一辆红旗"
}
#Query
POST my_index/_doc/_search
{
"query": {
"multi_match": {
"query":"玛力",
"fields":["title","name"]
}
},
"highlight":{
"pre_tags":"<h1>",
"post_tags" : "</h1>",
"fields":{"title":{}}
}
}
#Response
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.2876821,
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "9",
"_score": 0.2876821,
"_source": {
"name": "玛力",
"title": "我有一辆红旗"
}
},
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"name": "张三",
"title": "我的宝马有222马力"
},
"highlight": {
"title": [
"我的宝马有222马<h1>力</h1>"
]
}
}
]
}
}
```
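multi_match raises the question of how per-field matches are combined into a score, which is controlled by `type`:
* best_fields (the default): the score is taken from the single best-matching field, so use it when a strong match in one field should rank highest;
* most_fields: the more fields that match, the higher the score;
* cross_fields: use it when the terms of the search text are expected to be spread across different fields.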
```
POST my_index/_doc/_search
{
"query": {
"multi_match": {
"query":"玛力",
"fields":["title","name"],
"type" : "best_fields"
}
},
"highlight":{
"pre_tags":"<h1>",
"post_tags" : "</h1>",
"fields":{"title":{}}
}
}
```
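When comparing the three types it can help to generate the request bodies rather than edit them by hand. The sketch below does only that. `multi_match_body` is a hypothetical helper, and it deliberately accepts just the three types discussed here, although Elasticsearch defines others.

```python
import json

# Only the three types discussed in this section.
MULTI_MATCH_TYPES = ("best_fields", "most_fields", "cross_fields")

def multi_match_body(query, fields, match_type="best_fields"):
    """Build a _search request body for a multi_match query."""
    if match_type not in MULTI_MATCH_TYPES:
        raise ValueError(f"unsupported multi_match type: {match_type}")
    return {
        "query": {
            "multi_match": {
                "query": query,
                "fields": list(fields),
                "type": match_type,
            }
        }
    }

for match_type in MULTI_MATCH_TYPES:
    body = multi_match_body("玛力", ["title", "name"], match_type)
    print(json.dumps(body, ensure_ascii=False))
```

Sending each body to the same index and comparing the `_score` values is a practical way to see how the types differ on your own data.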
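## **term query**
A term query is an exact match: the search value is not run through an analyzer, and the field must contain that exact term.
Before using term, check whether the field is analyzed. String fields mapped as `text` are analyzed at index time; `keyword` fields are not.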
```
DELETE my_index
PUT my_index
{
"mappings": {
"_doc": {
"dynamic":"strict",
"properties": {
"title" : {
"type":"text"
},
"name":{
"type" : "keyword"
}
}
}
}
}
PUT my_index/_doc/1
{
"name" : "张三",
"title" : "我的宝马有222马力"
}
#Query
POST my_index/_doc/_search
{
"query": {
"term": {
"title":"宝马"
}
}
}
#Response
{
"took": 19,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
```
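**Because "title" is of type "text", it is analyzed: its value is split into terms before being indexed, and "宝马" as a whole is never stored as a term. That is why the query finds nothing.**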
```
POST my_index/_doc/_search
{
"query": {
"term": {
"name":"张三"
}
}
}
#Response
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.2876821,
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"name": "张三",
"title": "我的宝马有222马力"
}
}
]
}
}
```
**"name", by contrast, is of type keyword: it is not split and is indexed as a single term, so the query finds it.**
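The difference between the two field types comes down to which terms end up in the index. The following sketch models that under stated assumptions: a `text` field is tokenized one Chinese character at a time (the hypothetical `analyze_text`), a `keyword` field is stored as a single term, and a term query is a plain membership test.

```python
import re

def analyze_text(value):
    """Stand-in for the default analyzer: one token per Han character."""
    return re.findall(r"[\u4e00-\u9fff]|[a-z0-9]+", value.lower())

def index_terms(value, field_type):
    """Terms that would be indexed for a value, by field type."""
    if field_type == "keyword":
        return {value}                  # stored whole, not analyzed
    if field_type == "text":
        return set(analyze_text(value)) # split into terms first
    raise ValueError(f"unsupported field type: {field_type}")

def term_query(term, value, field_type):
    """A term query matches only if the exact term is in the index."""
    return term in index_terms(value, field_type)

print(term_query("宝马", "我的宝马有222马力", "text"))
print(term_query("张三", "张三", "keyword"))
print(term_query("宝", "我的宝马有222马力", "text"))
```

In this model the `text` field only matches single-character terms such as "宝", while the `keyword` field matches its full value and nothing else.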
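Note: if you want Chinese text in a "text" field to be findable with "term", set the field's analyzer to "ik_max_word" (provided by the IK analysis plugin, which must be installed), so that whole words such as "宝马" are indexed as terms.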
```
DELETE my_index
PUT my_index
{
"mappings": {
"_doc": {
"dynamic":"strict",
"properties": {
"title" : {
"type":"text",
"analyzer":"ik_max_word"
},
"name":{
"type" : "keyword"
}
}
}
}
}
PUT my_index/_doc/1
{
"name" : "张三",
"title" : "我的宝马有222马力"
}
#Search
POST my_index/_doc/_search
{
"query": {
"term": {
"title":"宝马"
}
}
}
#Response
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.2876821,
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"name": "张三",
"title": "我的宝马有222马力"
}
}
]
}
}
```
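## **bool compound query: must, should, must_not**
For requirements such as "name" is "宝马" but "title" does not contain "宝马", use a bool compound query.
A bool query combines clauses under three keywords: must, should and must_not.
They can be understood as follows:
* must: the document must match the clause;
* should: holds one or more clauses; when the bool query has no must (or filter) clause, a document has to match at least one of them, otherwise they are optional and only raise the score;
* must_not: the document must not match the clause.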
```
PUT my_index/_doc/2
{
"name" : "宝马",
"title" : "我的宝马x5有260马力"
}
PUT my_index/_doc/3
{
"name" : "宝马",
"title" : "我的奥迪有260马力"
}
#Search
POST my_index/_doc/_search
{
"query":{
"bool":{
"must":{
"term":{
"name":"宝马"
}
},
"must_not":{
"term": {
"title": "宝马"
}
}
}
}
}
#Response
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.2876821,
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "3",
"_score": 0.2876821,
"_source": {
"name": "宝马",
"title": "我的奥迪有260马力"
}
}
]
}
}
```
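The way must, should and must_not combine can be modelled as plain boolean logic. The sketch below is an illustration only: the per-field term sets are hand-written assumptions about what the analyzers would produce for the three documents, `bool_matches` is a hypothetical helper, and the `filter` clause and scoring are left out.

```python
def clause_matches(doc, clause):
    """A clause is a (field, term) pair; it matches if the term is indexed."""
    field, term = clause
    return term in doc.get(field, set())

def bool_matches(doc, must=(), should=(), must_not=()):
    """Decide whether a document satisfies a bool query (filter omitted)."""
    if not all(clause_matches(doc, c) for c in must):
        return False
    if any(clause_matches(doc, c) for c in must_not):
        return False
    # Without must clauses, at least one should clause has to match.
    if should and not must:
        return any(clause_matches(doc, c) for c in should)
    return True

# Assumed indexed terms per field (hand-written for this example).
docs = {
    "1": {"name": {"张三"}, "title": {"我", "的", "宝马", "有", "222", "马力"}},
    "2": {"name": {"宝马"}, "title": {"我", "的", "宝马", "x5", "有", "260", "马力"}},
    "3": {"name": {"宝马"}, "title": {"我", "的", "奥迪", "有", "260", "马力"}},
}

hits = [doc_id for doc_id, doc in docs.items()
        if bool_matches(doc, must=[("name", "宝马")], must_not=[("title", "宝马")])]
print(hits)
```

Document 1 fails the must clause and document 2 is excluded by must_not, leaving document 3, the same single hit as in the response above.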