Stop Token Filter（Stop 词元过滤器） · Elasticsearch 5.4 中文文档

# Stop Token Filter（Stop 词元过滤器）原文链接 : [https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-stop-tokenfilter.html](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-stop-tokenfilter.html) 译文链接 : [http://www.apache.wiki/pages/viewpage.action?pageId=10028519](http://www.apache.wiki/pages/viewpage.action?pageId=10028519) 贡献者 : [fucker](/display/~caizhongjie)，[ApacheCN](/display/~apachecn)，[Apache中文网](/display/~apachechina) **stop** 类型的词元过滤器，用于从词元流中移除**stop words**。以下是 **stop** 类型的词元过滤器的可选设置： | `stopwords` | 停止词列表. 默认 **`_english_`** 无效词. | | `stopwords_path` | 无效词配置文件路径（绝对或者相对路径）. 每个停止词自占一行 (换行分隔). 文件必须是 **UTF-8** 编码. | | `ignore_case` | 设置为 **`true` **所有词被转为小写. 默认 **`false`**. | | `remove_trailing` | 设置为 **false**，以便不忽略搜索的最后一个字词，如果它是无效词。这对于完成建议者非常有用，因为像 **green a** 一样的查询可以扩展到 **green apple** ，即使你删除一般的无效词。默认为 **true**。 | **stopwords** 参数接受一个数组类型的无效词： ``` PUT /my_index { "settings": { "analysis": { "filter": { "my_stop": { "type": "stop", "stopwords": ["and", "is", "the"] } } } } } ``` 或预定义的语言特定列表： ``` PUT /my_index { "settings": { "analysis": { "filter": { "my_stop": { "type": "stop", "stopwords": "_english_" } } } } } ``` **Elasticsearch** 提供以下预定义语言列表： **`_arabic_`, `_armenian_`, `_basque_`, `_brazilian_`, `_bulgarian_`, `_catalan_`, `_czech_`, `_danish_`, `_dutch_`, `_english_`, `_finnish_`, `_french_`, `_galician_`, `_german_`, `_greek_`, `_hindi_`, `_hungarian_`,`_indonesian_`, `_irish_`, `_italian_`, `_latvian_`, `_norwegian_`, `_persian_`, `_portuguese_`, `_romanian_`, `_russian_`, `_sorani_`, `_spanish_`, `_swedish_`, `_thai_`, `_turkish_`.** 空的无效词列表（禁用无效词）使用：**`\_none_。` **