## **elasticsearch queries**

Elasticsearch accepts search requests in two forms: a lightweight query passed in the URL, and a full JSON request body known as the structured query DSL. Because the DSL is more explicit and easier to read, it is the form most people use. A DSL query is a JSON document POSTed in the request body, which makes it very flexible and allows many variations. One thing to watch for: the examples in the official documentation often show only a fragment of the JSON and cannot always be pasted in as they are. Such a fragment usually has to be wrapped in an outer object under the `query` key.

## **URI search**

Official documentation: [https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html)

![](https://box.kancloud.cn/8d8ea3e39e5b61c5dadf8967b5ec7037_620x203.png)

This form searches through URL query parameters. The common ones are:

* q: the query string;
* df: the default field to search when the query string does not name one;
* sort: the sort order;
* timeout: how long the search may run before it times out;
* from, size: used for pagination

For example:

```
# Find documents whose user field contains alfred, sort by age ascending, return documents 5 through 14, and time out if the search has not finished within 1s
GET /my_index/_search?q=alfred&df=user&sort=age:asc&from=4&size=10&timeout=1s
```

## **Request body search**

This form searches through the request body.

### **(1) match query**

The match query is the standard full-text query and is sometimes loosely called a fuzzy search (it is not the same thing as the `fuzzy` query). It first runs the search text through the analyzer, then matches each resulting term against the index; by default a document matches if it contains any one of the terms. match has two close relatives: match_phrase and multi_match.

Example:

```
# Create the index and prepare the data
PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic": "strict",
      "properties": {
        "title": { "type": "text" },
        "name": { "type": "keyword" }
      }
    }
  }
}

PUT my_index/_doc/1
{ "name": "张三", "title": "我的宝马有222马力" }

PUT my_index/_doc/2
{ "name": "李四", "title": "我的奥迪有220马力" }

PUT my_index/_doc/3
{ "name": "王五", "title": "我的玛莎拉蒂有250马力" }

# match query
POST my_index/_doc/_search
{
  "query": {
    "match": { "title": "宝马玛力" }
  },
  "highlight": {
    "pre_tags": "<tag1>",
    "post_tags": "</tag1>",
    "fields": { "title": {} }
  }
}

# Response
{
  "took": 17,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 3,
    "max_score": 0.970927,
    "hits": [
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.970927,
        "_source": { "name": "张三", "title": "我的宝马有222马力" },
        "highlight": {
          "title": [ "我的<tag1>宝</tag1><tag1>马</tag1>有222<tag1>马</tag1><tag1>力</tag1>" ]
        }
      },
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "3",
        "_score": 0.8630463,
        "_source": { "name": "王五", "title": "我的玛莎拉蒂有250马力" },
        "highlight": {
          "title": [ "我的<tag1>玛</tag1>莎拉蒂有250<tag1>马</tag1><tag1>力</tag1>" ]
        }
      },
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.5753642,
        "_source": { "name": "李四", "title": "我的奥迪有220马力" },
        "highlight": {
          "title": [ "我的奥迪有220<tag1>马</tag1><tag1>力</tag1>" ]
        }
      }
    ]
  }
}
```

Note: the match query splits the search text "宝马玛力" into the individual terms "宝", "马", "玛", "力" (the default analyzer emits one term per Chinese character), matches each of them, and returns every document that contains at least one.

### **(2) match_phrase query (phrase matching)**

Like match, match_phrase first analyzes the query string into a list of terms. It then searches for all of the terms but keeps only the documents that contain every one of them in adjacent positions. Put simply, the document must contain all of the terms of the search text, and unless slop is set they must also sit next to each other in the same order.

```
# Add one more document
PUT my_index/_doc/5
{ "name": "陈六", "title": "我的宝玛有250马力" }

# Query
POST my_index/_doc/_search
{
  "query": {
    "match_phrase": { "title": "宝马玛力" }
  },
  "highlight": {
    "pre_tags": "<h1>",
    "post_tags": "</h1>",
    "fields": { "title": {} }
  }
}

# Response
{
  "took": 2,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
```

Note: there are no hits because no document contains all of the search terms in adjacent positions.

An exact phrase match can be too strict. To relax it, use slop, which sets how many positions the terms may be out of place while the document still counts as a match.

For example:

```
# Add two more documents
PUT my_index/_doc/6
{ "name": "陈六", "title": "我的宝马的玛力有250马力" }

PUT my_index/_doc/7
{ "name": "陈六", "title": "我的宝马的李玛力有250马力" }

# Query
POST my_index/_doc/_search
{
  "query": {
    "match_phrase": {
      "title": { "query": "宝马玛力", "slop": 1 }
    }
  },
  "highlight": {
    "pre_tags": "<h1>",
    "post_tags": "</h1>",
    "fields": { "title": {} }
  }
}

# Response
{
  "took": 8,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 1.0359334,
    "hits": [
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "6",
        "_score": 1.0359334,
        "_source": { "name": "陈六", "title": "我的宝马的玛力有250马力" },
        "highlight": {
          "title": [ "我的<h1>宝</h1><h1>马</h1>的<h1>玛</h1><h1>力</h1>有250马力" ]
        }
      }
    ]
  }
}
```

Note: "我的宝马的玛力有250马力" contains all of the query terms, and the only deviation is a one-position gap ("的") between "宝马" and "玛力", which slop 1 allows. Document 7 has a two-position gap ("的李") and is not returned.

### **(3) multi_match query**

To run one query against several fields and accept a document when any one of them matches, use multi_match.

```
# Add one more document
PUT my_index/_doc/9
{ "name": "玛力", "title": "我有一辆红旗" }

# Query
POST my_index/_doc/_search
{
  "query": {
    "multi_match": {
      "query": "玛力",
      "fields": ["title", "name"]
    }
  },
  "highlight": {
    "pre_tags": "<h1>",
    "post_tags": "</h1>",
    "fields": { "title": {} }
  }
}

# Response
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "9",
        "_score": 0.2876821,
        "_source": { "name": "玛力", "title": "我有一辆红旗" }
      },
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.2876821,
        "_source": { "name": "张三", "title": "我的宝马有222马力" },
        "highlight": {
          "title": [ "我的宝马有222马<h1>力</h1>" ]
        }
      }
    ]
  }
}
```

With multi_match, however, how the per-field scores are combined becomes a concern:

* best_fields (the default): the document score comes from its single best-matching field; use it when a strong match in one field should rank highest
* most_fields: the scores of all matching fields are added up, so the more fields match, the higher the score
* cross_fields: the fields are treated as one combined field, for queries whose terms are spread across different fields

```
POST my_index/_doc/_search
{
  "query": {
    "multi_match": {
      "query": "玛力",
      "fields": ["title", "name"],
      "type": "best_fields"
    }
  },
  "highlight": {
    "pre_tags": "<h1>",
    "post_tags": "</h1>",
    "fields": { "title": {} }
  }
}
```

## **term query**

A term query is an exact match: the search text is not run through an analyzer, and the document must contain the whole search text as a single indexed term.

Before using term, check whether the field is "analyzed". String values are analyzed by default (type "text"); "keyword" fields are not.

```
DELETE my_index

PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic": "strict",
      "properties": {
        "title": { "type": "text" },
        "name": { "type": "keyword" }
      }
    }
  }
}

PUT my_index/_doc/1
{ "name": "张三", "title": "我的宝马有222马力" }

# Query
POST my_index/_doc/_search
{
  "query": {
    "term": { "title": "宝马" }
  }
}

# Response
{
  "took": 19,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
```

**Because "title" has type "text", it is analyzed: the value is split into terms when it is indexed, and "宝马" is never stored as a single term, so the query finds nothing.**

```
POST my_index/_doc/_search
{
  "query": {
    "term": { "name": "张三" }
  }
}

# Response
{
  "took": 4,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.2876821,
        "_source": { "name": "张三", "title": "我的宝马有222马力" }
      }
    ]
  }
}
```

**The "name" field has type keyword, so it is stored whole without being split, and the query finds it.**

Note: for a "term" query to find Chinese words in a "text" field, index the field with a word-level analyzer such as "ik_max_word" (provided by the IK analysis plugin, which has to be installed).

```
DELETE my_index

PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic": "strict",
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_max_word"
        },
        "name": { "type": "keyword" }
      }
    }
  }
}

PUT my_index/_doc/1
{ "name": "张三", "title": "我的宝马有222马力" }

# Search
POST my_index/_doc/_search
{
  "query": {
    "term": { "title": "宝马" }
  }
}

# Response
{
  "took": 3,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.2876821,
        "_source": { "name": "张三", "title": "我的宝马有222马力" }
      }
    ]
  }
}
```

## **bool compound query: must, should, must_not**

A requirement such as 'the "title" contains "宝马" but the "name" does not' calls for a bool compound query, which combines conditions under three keywords: must, should, and must_not.

They can be read as follows:

* must: the document must match the condition
* should: holds one or more conditions; the document satisfies should if at least one of them matches (when the same bool also has a must clause, should conditions are optional by default and only raise the score)
* must_not: the document must not match the condition

The example below applies the mirror-image requirement: "name" is "宝马" and "title" does not contain "宝马".

```
PUT my_index/_doc/2
{ "name": "宝马", "title": "我的宝马x5有260马力" }

PUT my_index/_doc/3
{ "name": "宝马", "title": "我的奥迪有260马力" }

# Search
POST my_index/_doc/_search
{
  "query": {
    "bool": {
      "must": {
        "term": { "name": "宝马" }
      },
      "must_not": {
        "term": { "title": "宝马" }
      }
    }
  }
}

# Response
{
  "took": 2,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "3",
        "_score": 0.2876821,
        "_source": { "name": "宝马", "title": "我的奥迪有260马力" }
      }
    ]
  }
}
```