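Our event-tracking logs contain nested JSON, so before every field can be analyzed on its own, the nested JSON has to be flattened.
- The log format is as follows: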
2019-01-22 19:25:58 172.17.12.177 /statistics/EventAgent appkey=yiche&enc=0&ltype=view&yc_log={"uuid":"73B333EB-EC87-4F9F-867B-A9BF38CBEBB2","mac":"02:00:00:00:00:00","uid":-1,"idfa":"2BFD67CF-ED60-4CF6-BA6E-FC0B18FDDDF8","osv":"iOS11.4.1","fac":"apple","mdl":"iPhone SE","req_id":"360C8C43-73AC-4429-9E43-2C08F4C1C425","itime":1548156351820,"os":"2","sn_id":"6B937D83-BFB2-4C22-85A8-5B3E82D9D0F1","dvid":"3676b52dc155e1eec3ca514f38736fd6","aptkn":"4fb9b2bffb808515aa0e9a5f5b17d826769e432f63d5cf87f7fb5ce4d67ef9f1","cha":"App Store","idfv":"B1EAD56F-E456-4FF2-A3C2-9A8FA0693C22","nt":4,"lg_vl":{"pfrom":"shouye","ptitle":"shouye"},"av":"10.3.3"} 218.15.255.124 200
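To make the layout of such a line concrete, here is a rough Python sketch of the stages the pipeline is expected to go through: fixed fields first, then the `&`-separated query string, then the `yc_log` JSON. The regex, the `parse_line` helper and the shortened sample line are illustrative assumptions, not the grok patterns Logstash itself uses.

```python
import json
import re
from urllib.parse import unquote

# Hypothetical, simplified stand-in for the grok pattern:
# timestamp, server IP, URI path, query args, client IP, status.
LINE_RE = re.compile(
    r"^(?P<create_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<server_ip>\S+) (?P<uri>\S+) (?P<args>.*) "
    r"(?P<client_ip>\S+) (?P<status>\d+)$"
)

def parse_line(line):
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError("line does not match the expected layout")
    fields = m.groupdict()
    # URL-decode the query string, then split it into key/value pairs on "&".
    args = unquote(fields.pop("args"))
    for pair in args.split("&"):
        key, _, value = pair.partition("=")
        fields[key] = value
    # yc_log is JSON text; parse it and merge its keys into the top level.
    fields.update(json.loads(fields.pop("yc_log")))
    return fields

# Abbreviated version of the log line (several yc_log keys omitted).
sample = (
    '2019-01-22 19:25:58 172.17.12.177 /statistics/EventAgent '
    'appkey=yiche&enc=0&ltype=view&yc_log={"uid":-1,"mdl":"iPhone SE",'
    '"lg_vl":{"pfrom":"shouye","ptitle":"shouye"},"av":"10.3.3"} '
    '218.15.255.124 200'
)
event = parse_line(sample)
```

After this, `event["lg_vl"]` is still a nested dictionary, which is exactly the part that needs further flattening. Note that splitting on `&` would break if a JSON value itself contained an `&`.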
- The initial Logstash configuration file was:
input {
  file {
    path => ["/data/test_logstash.log"]
    type => "nginx_log"
    start_position => "beginning"
  }
}
filter {
  if [type] =~ "nginx_log" {
    # Split the line into timestamp, server IP, URI, query args, client IP and status.
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:create_time} %{IP:server_ip} %{URIPATH:uri} %{GREEDYDATA:args} %{IP:client_ip} %{NUMBER:status}" }
    }
    # Decode URL-encoded characters in the query string.
    urldecode {
      field => "args"
    }
    # Split the query string into key/value fields (appkey, enc, ltype, yc_log).
    kv {
      source => "args"
      field_split => "&"
      remove_field => [ "args", "@timestamp", "message", "path", "@version", "host" ]
    }
    # Parse the yc_log JSON string into top-level fields.
    json {
      source => "yc_log"
      remove_field => [ "yc_log" ]
    }
  }
}
output {
  stdout { codec => rubydebug }
}
Running Logstash with this configuration file produces the following result:
{
"server_ip" => "172.17.12.177",
"cha" => "App Store",
"mdl" => "iPhone SE",
"type" => "nginx_log",
"mac" => "02:00:00:00:00:00",
"ptitle" => "shouye",
"appkey" => "yiche",
"idfv" => "B1EAD56F-E456-4FF2-A3C2-9A8FA0693C22",
"sn_id" => "6B937D83-BFB2-4C22-85A8-5B3E82D9D0F1",
"aptkn" => "4fb9b2bffb808515aa0e9a5f5b17d826769e432f63d5cf87f7fb5ce4d67ef9f1",
"av" => "10.3.3",
"os" => "2",
"idfa" => "2BFD67CF-ED60-4CF6-BA6E-FC0B18FDDDF8",
"uid" => -1,
"uuid" => "73B333EB-EC87-4F9F-867B-A9BF38CBEBB2",
"req_id" => "360C8C43-73AC-4429-9E43-2C08F4C1C425",
"status" => "200",
"uri" => "/statistics/EventAgent",
"enc" => "0",
"ltype" => "view",
"lg_vl" => {
"ptitle" => "shouye",
"pfrom" => "shouye"
},
"nt" => 4,
"pfrom" => "shouye",
"itime" => 1548156351820,
"client_ip" => "218.15.255.124",
"create_time" => "2019-01-22 19:25:58",
"dvid" => "3676b52dc155e1eec3ca514f38736fd6",
"fac" => "apple",
"lg_value" => "{\"pfrom\":\"shouye\",\"ptitle\":\"shouye\"}",
"osv" => "iOS11.4.1"
}
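As you can see, the lg_vl field is still a nested JSON object and has not been expanded. If you simply add another json filter for it to the configuration file:
json { source => "lg_vl" }
Logstash reports a JsonParseException, most likely because by that point lg_vl is an already-parsed object rather than a JSON string.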
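A plausible reading of this error is that a JSON parser expects text, while lg_vl has already been turned into an object by the previous json filter. The Python analogy below only illustrates that general behavior; it is not Logstash's internal code, and the exception type differs.

```python
import json

# After the first parse, the nested value is an object, not JSON text.
event = {"lg_vl": {"pfrom": "shouye", "ptitle": "shouye"}}

try:
    json.loads(event["lg_vl"])  # feeding an object to a JSON parser
    parsed_directly = True
except TypeError:
    parsed_directly = False

# Serializing the object back to text gives the parser something it accepts.
as_text = json.dumps(event["lg_vl"])
parsed_from_text = json.loads(as_text)
```

Here `parsed_directly` ends up `False`, while parsing `as_text` succeeds and returns the same keys.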
- The correct approach:
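input {
  file {
    path => ["/data/test_logstash.log"]
    type => "nginx_log"
    start_position => "beginning"
  }
}
filter {
  if [type] =~ "nginx_log" {
    # Split the line into timestamp, server IP, URI, query args, client IP and status.
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:create_time} %{IP:server_ip} %{URIPATH:uri} %{GREEDYDATA:args} %{IP:client_ip} %{NUMBER:status}" }
    }
    # Decode URL-encoded characters in the query string.
    urldecode {
      field => "args"
    }
    # Split the query string into key/value fields (appkey, enc, ltype, yc_log).
    kv {
      source => "args"
      field_split => "&"
      remove_field => [ "args", "@timestamp", "message", "path", "@version", "host" ]
    }
    # Parse the yc_log JSON string into top-level fields.
    json {
      source => "yc_log"
      remove_field => [ "yc_log" ]
    }
    # Copy the nested lg_vl object into a new string field, lg_value.
    mutate {
      add_field => { "lg_value" => "%{lg_vl}" }
    }
    # Parse lg_value as JSON, then drop both the nested object and the temporary field.
    json {
      source => "lg_value"
      remove_field => [ "lg_vl", "lg_value" ]
    }
  }
}
output {
  stdout { codec => rubydebug }
}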
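After the outer JSON level has been parsed, add a new field lg_value and assign it the content of lg_vl (the %{lg_vl} reference renders the nested object as a JSON string); then run a separate json filter on lg_value, removing both lg_vl and lg_value afterwards. The parsed result is: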
{
"type" => "nginx_log",
"nt" => 4,
"dvid" => "3676b52dc155e1eec3ca514f38736fd6",
"os" => "2",
"fac" => "apple",
"ltype" => "view",
"client_ip" => "218.15.255.124",
"itime" => 1548156351820,
"mac" => "02:00:00:00:00:00",
"idfa" => "2BFD67CF-ED60-4CF6-BA6E-FC0B18FDDDF8",
"uri" => "/statistics/EventAgent",
"aptkn" => "4fb9b2bffb808515aa0e9a5f5b17d826769e432f63d5cf87f7fb5ce4d67ef9f1",
"sn_id" => "6B937D83-BFB2-4C22-85A8-5B3E82D9D0F1",
"create_time" => "2019-01-22 19:25:58",
"osv" => "iOS11.4.1",
"req_id" => "360C8C43-73AC-4429-9E43-2C08F4C1C425",
"ptitle" => "shouye",
"av" => "10.3.3",
"server_ip" => "172.17.12.177",
"pfrom" => "shouye",
"enc" => "0",
"mdl" => "iPhone SE",
"cha" => "App Store",
"idfv" => "B1EAD56F-E456-4FF2-A3C2-9A8FA0693C22",
"uid" => -1,
"uuid" => "73B333EB-EC87-4F9F-867B-A9BF38CBEBB2",
"appkey" => "yiche",
"status" => "200"
}
Perfect, works like a charm!
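For readers who want to see the flattening step outside Logstash, the same three moves (keep a JSON-text copy of the nested object, parse it into the top level, drop the leftovers) can be sketched in Python. The function name `flatten_nested` and its parameters are hypothetical; only the field names lg_vl and lg_value come from the configuration.

```python
import json

def flatten_nested(event, nested_key="lg_vl", temp_key="lg_value"):
    """Return a copy of event with the nested object's keys lifted to the top level."""
    out = dict(event)
    # Like add_field with "%{lg_vl}": store the nested object as JSON text.
    out[temp_key] = json.dumps(out[nested_key])
    # Like json { source => "lg_value" }: parse the text and merge into the root.
    out.update(json.loads(out[temp_key]))
    # Like remove_field: drop the nested object and the temporary string.
    del out[nested_key]
    del out[temp_key]
    return out

event = {"uid": -1, "av": "10.3.3", "lg_vl": {"pfrom": "shouye", "ptitle": "shouye"}}
flat = flatten_nested(event)
```

In this sketch a nested key with the same name as an existing top-level key would overwrite it, so it is worth checking for name collisions before flattening real logs.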