Filebeat + Elasticsearch: extracting fields from message

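For performance reasons, we use a Filebeat + Elasticsearch + Kibana stack without Logstash. Without Logstash the usual filter stage is not available, so there is no obvious place to parse the message field and extract new fields from it. Since version 5.0, however, Elasticsearch includes the Ingest Node feature. Previously, any processing had to happen before the data reached Elasticsearch, for example by letting Logstash structure and transform the logs; now it can be done directly in ES. ES provides a number of common processors such as convert and grok. To use them, first define a pipeline that describes how documents should be processed, then specify the pipeline name when indexing. Documents indexed that way are processed according to the predefined pipeline.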

Here is an example.
Set the following in filebeat.yml:

filebeat.prospectors:
- input_type: log
  paths:
    - /work/1.log
output.elasticsearch:
  hosts: ["localhost:9200"]
  pipeline: "test-pipeline"

Create the file pipeline.json under /work:

{
  "description": "test-pipeline",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
        ]
      }
    }
  ]
}
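
To make the effect of the grok pattern concrete, the following Python sketch mimics it with ordinary regular expressions. The regexes in GROK_APPROXIMATIONS are simplified stand-ins chosen for illustration, not the real grok definitions, and the helper names grok_to_regex and extract_fields are hypothetical. The input is the sample log line used in this article.

```python
import re

# Simplified stand-ins for the grok patterns named in the pipeline.
# Assumption: the real grok definitions are broader (for example,
# IP also covers IPv6 and NUMBER allows a sign).
GROK_APPROXIMATIONS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "URIPATHPARAM": r"/\S*",
    "NUMBER": r"\d+(?:\.\d+)?",
}

GROK_PATTERN = (
    "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} "
    "%{NUMBER:bytes} %{NUMBER:duration}"
)


def grok_to_regex(pattern):
    """Turn each %{SYNTAX:name} token into a named regex group."""
    def replace(match):
        syntax, name = match.group(1), match.group(2)
        return "(?P<%s>%s)" % (name, GROK_APPROXIMATIONS[syntax])

    # Anchored to the whole line here to keep the sketch strict.
    return re.compile("^" + re.sub(r"%\{(\w+):(\w+)\}", replace, pattern) + "$")


def extract_fields(message):
    """Return the captured fields as strings, or None if the line does not match."""
    match = grok_to_regex(GROK_PATTERN).match(message)
    return match.groupdict() if match else None


fields = extract_fields("55.3.244.1 GET /index.html 15824 0.043")
print(fields)
```

Every captured value is a string, which is also how the fields come out of the pipeline above.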

Load the pipeline into Elasticsearch:

curl -XPUT 'http://localhost:9200/_ingest/pipeline/test-pipeline' -d@/work/pipeline.json
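
Once the pipeline is stored, it can be tried against a sample line through the ingest simulate endpoint before Filebeat ships any real data. The sketch below only builds the HTTP request with Python's standard library; the function names build_simulate_request and simulate are hypothetical, and the host and pipeline name are taken from the configuration above. Actually sending the request assumes an Elasticsearch node is reachable at that address.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # host from the filebeat.yml above
PIPELINE = "test-pipeline"        # pipeline name from the article


def build_simulate_request(message):
    """Build a POST request that runs one sample document through the pipeline."""
    body = {"docs": [{"_source": {"message": message}}]}
    return urllib.request.Request(
        url="%s/_ingest/pipeline/%s/_simulate" % (ES_URL, PIPELINE),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def simulate(message):
    """Send the request; this needs a running Elasticsearch node."""
    with urllib.request.urlopen(build_simulate_request(message)) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_simulate_request("55.3.244.1 GET /index.html 15824 0.043")
print(request.get_method(), request.full_url)
```

Calling simulate(...) with the same line should return the document as the pipeline would index it, which makes pattern mistakes visible before any log file is read.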

Create the file 1.log under /work with the following content:

55.3.244.1 GET /index.html 15824 0.043

The resulting document in Elasticsearch looks like this:

{
  "_index": "filebeat-2017.03.15",
  "_type": "log",
  "_id": "AVrQ6Jg-W7NyrdbKiTgl",
  "_score": null,
  "_source": {
    "request": "/index.html",
    "offset": 73,
    "method": "GET",
    "input_type": "log",
    "source": "/work/1.log",
    "message": "55.3.244.1 GET /index.html 15824 0.043",
    "type": "log",
    "duration": "0.043",
    "@timestamp": "2017-03-15T07:39:48.649Z",
    "bytes": "15824",
    "beat": {
      "hostname": "xia-VirtualBox",
      "name": "xia-VirtualBox",
      "version": "5.2.2"
    },
    "client": "55.3.244.1"
  },
  "fields": {
    "@timestamp": [
      1489563588649
    ]
  },
  "sort": [
    1489563588649
  ]
}
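
Note that bytes and duration are indexed as strings in the result above. One possible way to get numeric values is to append convert processors to the pipeline. The Python sketch below generates such a pipeline definition; the function name build_pipeline is hypothetical, and this variant is an illustration rather than part of the original setup.

```python
import json


def build_pipeline():
    """Return the article's grok pipeline plus two convert processors."""
    return {
        "description": "test-pipeline",
        "processors": [
            {
                "grok": {
                    "field": "message",
                    "patterns": [
                        "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} "
                        "%{NUMBER:bytes} %{NUMBER:duration}"
                    ],
                }
            },
            # Assumption for illustration: cast the two numeric captures.
            {"convert": {"field": "bytes", "type": "integer"}},
            {"convert": {"field": "duration", "type": "float"}},
        ],
    }


pipeline_json = json.dumps(build_pipeline(), indent=2)
print(pipeline_json)
```

The printed JSON can be saved as /work/pipeline.json and loaded with the same curl command as before. Fields that an existing index has already mapped as strings keep that mapping, so the numeric types would likely only show up in a newly created index.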