Elasticsearch

Installing Elasticsearch

Installation

# Download, unpack, and move into place
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.12.0-linux-x86_64.tar.gz
tar -zxvf elasticsearch-7.12.0-linux-x86_64.tar.gz
sudo mv elasticsearch-7.12.0 /usr/local/elasticsearch

Configuration

Edit the elasticsearch.yml file in the config directory.

vim elasticsearch.yml

# Uncomment the line (remove the leading #) and rename the cluster; here it is named my-el
cluster.name: my-el
node.name: node-1

# Listen on all interfaces so the node can be reached by IP address
network.host: 0.0.0.0
node.data: true
discovery.seed_hosts: []
cluster.initial_master_nodes: ["node-1"]

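After starting Elasticsearch, check that it is running normally.

# Run from the bin directory; -d starts Elasticsearch as a daemon
./elasticsearch -d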

curl 127.0.0.1:9200

{
  "name": "node-1",
  "cluster_name": "my-el",
  "cluster_uuid": "MmEwssVBQUqOzbT5GNkq9g",
  "version": {
    "number": "7.12.0",
    "build_flavor": "default",
    "build_type": "tar",
    "build_hash": "78722783c38caa25a70982b5b042074cde5d3b3a",
    "build_date": "2021-03-18T06:17:15.410153305Z",
    "build_snapshot": false,
    "lucene_version": "8.8.0",
    "minimum_wire_compatibility_version": "6.8.0",
    "minimum_index_compatibility_version": "6.0.0-beta1"
  },
  "tagline": "You Know, for Search"
}
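
The same check can be scripted. The following is a minimal sketch, not an official client: the helper name describe_node is hypothetical, and it only parses the JSON body that curl 127.0.0.1:9200 returns, so fetching the body over HTTP is left to the caller.

```python
import json


def describe_node(body: str) -> str:
    """Summarize the JSON returned by GET / on an Elasticsearch node."""
    info = json.loads(body)
    return "{} in cluster {} runs version {}".format(
        info["name"], info["cluster_name"], info["version"]["number"]
    )


# Sample body trimmed from the response shown above.
sample = '{"name": "node-1", "cluster_name": "my-el", "version": {"number": "7.12.0"}}'
print(describe_node(sample))
```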

Cannot start as root?

Elasticsearch refuses to start as root. Switch to a regular account, or create a dedicated ES account:

$ groupadd elasticsearch
$ useradd -g elasticsearch elasticsearch
$ chown -R elasticsearch:elasticsearch elasticsearch
$ su elasticsearch

Virtual memory limit too low (vm.max_map_count)?

sysctl -w vm.max_map_count=262144
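
Before changing the setting it can help to see whether the current value is actually below the target. This is a rough sketch that assumes a Linux host exposing /proc/sys/vm/max_map_count; the function names are hypothetical, and 262144 is simply the value from the sysctl command above.

```python
from pathlib import Path

REQUIRED_MAP_COUNT = 262144  # value used in the sysctl command above


def map_count_too_low(current: int, required: int = REQUIRED_MAP_COUNT) -> bool:
    """Return True when the kernel limit is below the required value."""
    return current < required


def read_map_count(path: str = "/proc/sys/vm/max_map_count") -> int:
    """Read the current limit from procfs (Linux only)."""
    return int(Path(path).read_text().strip())
```

On a Linux host, map_count_too_low(read_map_count()) reports whether the sysctl command is needed. Note that sysctl -w does not survive a reboot; adding the setting to /etc/sysctl.conf is the usual way to make it permanent.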

elasticsearch-head

Installation

git clone https://github.com/mobz/elasticsearch-head.git
cd elasticsearch-head
npm install
npm run start
open http://localhost:9100/

CORS problem: Access to XMLHttpRequest at 'http://xx.xx:9200/_stats' from origin 'http://localhost:9100' has been blocked by CORS policy

Add the following to elasticsearch.yml and restart Elasticsearch:

# cors
http.cors.enabled: true
http.cors.allow-origin: "*"

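Using Elasticsearch

An index is comparable to a database in MySQL: one index is one database.

A type is comparable to a table in MySQL: one type is one table. Types are deprecated in 7.x and removed in 8.x.

A document is comparable to a row in MySQL. ES data resembles MongoDB documents, for example {"name": "qii404", "age": 23}. We send the requests below through the compound query panel, which simulates curl requests.

Because type names go away in 8.x, use the default _doc.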

PUT

Create an index together with its field types, comparable to creating a table in MySQL.

PUT video_index2
{"mappings":{"properties":{"name":{"type":"text"},"age":{"type":"long"},"birthday":{"type":"date"}}}}
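
For readers who script this instead of using a console, the request above could be assembled as follows. This is only a sketch using the Python standard library: the function name build_create_index_request and the base URL http://127.0.0.1:9200 are assumptions, and the request is built but deliberately not sent.

```python
import json
import urllib.request


def build_create_index_request(base_url: str, index: str, properties: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT request that creates an index with a mapping."""
    body = json.dumps({"mappings": {"properties": properties}}).encode("utf-8")
    return urllib.request.Request(
        url="{}/{}".format(base_url.rstrip("/"), index),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_create_index_request(
    "http://127.0.0.1:9200",  # assumed local node
    "video_index2",
    {"name": {"type": "text"}, "age": {"type": "long"}, "birthday": {"type": "date"}},
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```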

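Legacy syntax:

# PUT <index name>/<type name>/<document id> (the type name is deprecated)
# Create the index (with type user) and add a document
PUT video_index2/user/1
{"name":"Rui", "age": 18}

PUT can also be used to modify a document, but it replaces the whole document.

# Sets age to 3 and drops the name field as well
PUT video_index/user/1
{"age": 3}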

POST

Update the value of the name field. Each modification increments _version by 1.

POST video_index2/_doc/1/_update
{"doc":{"name":"Rui123"}}

Prefer POST xxx/_update for updates, so that a PUT does not drop the fields you left out.
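
The difference between the two verbs can be modelled in a few lines. This is a simplified in-memory model, not how Elasticsearch stores documents: the store dictionary and both function names are hypothetical, and it always bumps the version, whereas a real update that changes nothing can be detected as a no-op.

```python
def put_document(store: dict, doc_id: str, body: dict) -> dict:
    """Model of PUT index/_doc/id: the stored source is replaced entirely."""
    version = store.get(doc_id, {"_version": 0})["_version"] + 1
    store[doc_id] = {"_version": version, "_source": dict(body)}
    return store[doc_id]


def update_document(store: dict, doc_id: str, partial: dict) -> dict:
    """Model of POST .../_update with a "doc" body: fields are merged."""
    current = store[doc_id]
    merged = {**current["_source"], **partial}
    store[doc_id] = {"_version": current["_version"] + 1, "_source": merged}
    return store[doc_id]


store = {}
put_document(store, "1", {"name": "Rui", "age": 18})
update_document(store, "1", {"name": "Rui123"})  # age is kept
after_update = dict(store["1"]["_source"])
put_document(store, "1", {"age": 3})  # name is dropped
after_put = dict(store["1"]["_source"])
```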

DELETE

# Delete the index
DELETE video_index2/
# Delete a document
DELETE video_index2/_doc/1

GET

# Get index information
GET video_index

GET video_index/_doc/_search?q=name:Rui
# Equivalent to
GET video_index/_doc/_search
{
  "query": {"match": { "name": "Rui"}}
}

_source restricts which fields are returned.

GET video_index/_doc/_search
{
  "query": {"match": { "name": "Rui"}},
  "_source": ["name", "birthday"]
}

sort orders the results by a field.

GET video_index/_doc/_search
{
  "query": {"match": { "name": "Rui"}},
  "sort": [{"age": {"order": "desc"}}]
}

from and size paginate the results.

GET video_index/_doc/_search
{
  "query": {"match": { "name": "Rui"}},
  "sort": [{"age": {"order": "desc"}}],
  "from": 0,
  "size": 10
}
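
Applications usually think in page numbers rather than offsets, so a small conversion helper is common. The sketch below assumes 1-based page numbers; page_to_from_size and paged_search_body are hypothetical names.

```python
def page_to_from_size(page: int, per_page: int = 10) -> dict:
    """Translate a 1-based page number into the from/size pair."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return {"from": (page - 1) * per_page, "size": per_page}


def paged_search_body(query: dict, page: int, per_page: int = 10) -> dict:
    """Attach pagination to a query clause, producing a search body."""
    body = {"query": query}
    body.update(page_to_from_size(page, per_page))
    return body


first_page = paged_search_body({"match": {"name": "Rui"}}, page=1)
```

By default from + size may not exceed 10000 (the index.max_result_window setting), so this style of paging is meant for the first pages of a result set rather than deep scrolling.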

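must (AND): every condition has to match, like WHERE id = 1 AND name = 'zhangsan'.

should (OR): at least one condition matches, like WHERE name = 'xx' OR age = 18.

must_not (NOT): the condition must not match.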

GET video_index/_doc/_search
{"query":{"bool":{"should":[{"match":{"name":"Rui"}},{"match":{"age":18}}]}}}

GET video_index/_doc/_search
{"query":{"bool":{"must":[{"match":{"name":"Rui"}},{"match":{"age":18}}]}}}
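
To make the AND/OR/NOT semantics concrete, here is a toy evaluator over plain dictionaries. It is a sketch under strong assumptions: each clause is a (field, value) pair compared by equality, which ignores analysis and scoring, and matches_bool is a hypothetical name. It does mirror one real rule: should clauses are only required when the bool query has no must clause (this toy has no filter clause to consider).

```python
def matches_bool(doc: dict, must=(), should=(), must_not=()) -> bool:
    """Decide whether doc satisfies a simplified bool query."""

    def hit(clause) -> bool:
        field, value = clause
        return doc.get(field) == value

    if not all(hit(c) for c in must):
        return False
    if any(hit(c) for c in must_not):
        return False
    # should clauses become mandatory only when nothing else is required
    if should and not must:
        return any(hit(c) for c in should)
    return True


rui_20 = {"name": "Rui", "age": 20}
li_18 = {"name": "Li", "age": 18}
rui_18 = {"name": "Rui", "age": 18}
li_30 = {"name": "Li", "age": 30}
conditions = [("name", "Rui"), ("age", 18)]
```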

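filter restricts the results without affecting the score.

GET video_index/_doc/_search
{"query":{"bool":{"filter":{"range":{"age":{"gt":18}}}}}}

gt/gte: greater than / greater than or equal to
lt/lte: less than / less than or equal to
A range query has no eq operator; for equality use a term query, or set gte and lte to the same value.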
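
The four range operators map directly onto ordinary comparisons. The following sketch checks a single value against a range clause; in_range and RANGE_OPERATORS are hypothetical names, and date math and formats that a real range query supports are left out.

```python
import operator

# The comparison operators accepted by a range clause.
RANGE_OPERATORS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds: dict) -> bool:
    """Return True when value satisfies every bound, e.g. {"gt": 18}."""
    return all(RANGE_OPERATORS[name](value, limit) for name, limit in bounds.items())


adults_only = {"gt": 18}
exactly_18 = {"gte": 18, "lte": 18}  # equality expressed with two bounds
```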

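The basic syntax is summarized below:

Keyword	Description
match_all	Matches every document. It is the default query when none is specified.
match	Used for full-text search or exact lookups. On an exact-value field (a number, date, boolean, or not_analyzed string, which is a keyword field in current versions) it matches the given value exactly.
range	Finds numbers or dates that fall within the given interval. gt: greater than; gte: greater than or equal to; lt: less than; lte: less than or equal to.
term	Matches an exact value.
terms	Same as term, but accepts several values to match.
exists	Finds documents that have a value in the given field.
missing	Finds documents that have no value in the given field. The missing query was removed in Elasticsearch 5.0; in 7.x, wrap exists in must_not instead.
must	In a bool query, clauses a document must match to be included.
must_not	In a bool query, clauses a document must not match to be included.
should	In a bool query, each matching clause raises _score and a non-matching clause has no effect, so these clauses mainly adjust relevance. When the bool query has no must or filter clause, at least one should clause has to match.
filter	In a bool query, clauses that include or exclude documents without contributing to the score.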

Highlighting

GET video_index/_doc/_search
{"query":{"match":{"name":"qi"}},"highlight":{"fields":{"name":{}}}}

# Response
"highlight":{
    "name":[
        "<em>qi</em>"
    ]
}

GET video_index/_doc/_search
{
  "query":{"match":{"name":"qi"}},
  "highlight": {
    "pre_tags": "<span class='key' style='color:red'>",
    "post_tags": "</span>",
    "fields": {
      "name": {}
    }
  }
}

# Response
"highlight":{
    "name":[
        "<span class='key' style='color:red'>qi</span>"
    ]
}
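
The effect of pre_tags and post_tags can be imitated with a toy function. This is only an illustration, not Elasticsearch's highlighter: it splits on single spaces and compares words case-insensitively, and the name highlight is hypothetical.

```python
def highlight(text: str, term: str, pre_tag: str = "<em>", post_tag: str = "</em>") -> str:
    """Wrap every word equal to term (ignoring case) in the given tags."""
    words = text.split(" ")
    wrapped = [
        pre_tag + word + post_tag if word.lower() == term.lower() else word
        for word in words
    ]
    return " ".join(wrapped)


default_tags = highlight("qi", "qi")
custom_tags = highlight("qi", "qi", "<span class='key' style='color:red'>", "</span>")
```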

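IK analyzer

https://github.com/medcl/elasticsearch-analysis-ik/releases/tag/v7.12.0

Download the release that matches your Elasticsearch version, unzip it into the elasticsearch/plugins/ directory, and restart Elasticsearch.

# The startup log shows the plugin being loaded
[INFO ][o.e.p.PluginsService ] [node-1] loaded plugin [analysis-ik]

ik_smart produces the coarsest-grained segmentation (fewest tokens); ik_max_word produces the finest-grained segmentation (every word it can find).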

POST _analyze
{
    "analyzer":"ik_smart",
    "text":"中华人民"
}

{
    "tokens":[
        {
            "token":"中华人民",
            "start_offset":0,
            "end_offset":4,
            "type":"CN_WORD",
            "position":0
        }
    ]
}

POST _analyze
{
    "analyzer":"ik_max_word",
    "text":"中华人民"
}

{
    "tokens":[
        {
            "token":"中华人民",
            "start_offset":0,
            "end_offset":4,
            "type":"CN_WORD",
            "position":0
        },
        {
            "token":"中华",
            "start_offset":0,
            "end_offset":2,
            "type":"CN_WORD",
            "position":1
        },
        {
            "token":"华人",
            "start_offset":1,
            "end_offset":3,
            "type":"CN_WORD",
            "position":2
        },
        {
            "token":"人民",
            "start_offset":2,
            "end_offset":4,
            "type":"CN_WORD",
            "position":3
        }
    ]
}
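
The contrast between the two modes can be imitated with a toy dictionary segmenter. This is not IK's actual algorithm (its coarse mode resolves ambiguities rather than matching greedily), and TOY_DICTIONARY is a hypothetical four-word dictionary chosen to fit the example text.

```python
TOY_DICTIONARY = {"中华人民", "中华", "华人", "人民"}  # hypothetical, not IK's dictionary


def finest_grained(text: str, dictionary=TOY_DICTIONARY) -> list:
    """Emit every dictionary word in text, longest first for each start offset."""
    tokens = []
    for start in range(len(text)):
        for end in range(len(text), start, -1):
            if text[start:end] in dictionary:
                tokens.append((text[start:end], start, end))
    return tokens


def coarsest_grained(text: str, dictionary=TOY_DICTIONARY) -> list:
    """Greedy longest match from the left; unknown characters become single tokens."""
    tokens, start = [], 0
    while start < len(text):
        for end in range(len(text), start, -1):
            if text[start:end] in dictionary or end == start + 1:
                tokens.append((text[start:end], start, end))
                start = end
                break
    return tokens


fine = finest_grained("中华人民")
coarse = coarsest_grained("中华人民")
```

Each tuple holds the token followed by its start and end character offsets.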

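Custom dictionary: edit the config/IKAnalyzer.cfg.xml file.

<!-- Users can configure their own extension dictionary here -->
<entry key="ext_dict">my.dic</entry>

About analysis

A match query runs the query text through the analyzer and searches for the resulting tokens.
A term query skips analysis and looks up the exact term in the inverted index.

Two field types

keyword values are not analyzed; each value is indexed as a single term.
text values are analyzed into tokens.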
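
A toy inverted index shows why term and match behave differently on the two field types. This is a sketch under simplifying assumptions: the stand-in analyzer just lowercases and splits on whitespace, match uses OR between tokens, and ToyIndex is a hypothetical class rather than anything from Elasticsearch.

```python
from collections import defaultdict


def analyze(text: str) -> list:
    """Stand-in analyzer: lowercase and split on whitespace."""
    return text.lower().split()


class ToyIndex:
    """Inverted index for one field: term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def index_text(self, doc_id, value: str) -> None:
        # text field: every analyzed token becomes a term
        for token in analyze(value):
            self.postings[token].add(doc_id)

    def index_keyword(self, doc_id, value: str) -> None:
        # keyword field: the whole value is one term
        self.postings[value].add(doc_id)

    def term(self, value: str) -> set:
        # exact lookup, no analysis of the query
        return set(self.postings.get(value, set()))

    def match(self, query: str) -> set:
        # analyze the query, then look up each token (OR semantics)
        hits = set()
        for token in analyze(query):
            hits |= self.postings.get(token, set())
        return hits


text_field = ToyIndex()
text_field.index_text(1, "Rui Zhang")
keyword_field = ToyIndex()
keyword_field.index_keyword(1, "Rui Zhang")
```

With this data, a term lookup for "Rui Zhang" finds the document only in the keyword field, while a match for "RUI" finds it in the text field because both sides were lowercased by the analyzer.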