shell - Linux: how to extract information from a complex log

Question

For example, the file 1.log contains:
id=1
a=1,b=2,c=3,d=4,e=5....,z=100

id=2
a=3,b=4,d=20,e=6,f=7,...,z=30

id=3
a=4,b=4,c=2,d=5,e=8,...,z=29

....
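Now I want to compute the distribution of d across the log.
Is there a good way to do this? grep always prints the whole line, so it can't pull out just the value for one key.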

怪我咯 · Answer

Awk solution:

#!/bin/bash
                                
awk -F"," '
NF == 0 {next}    # skip blank line
NF == 1 {printf "%s ", $1}    # for id line
# for data line
{
    for (i = 1; i <= NF; i++) {
        split($i, a, "=");
        if (a[1] == "d") print $i;
    }
}
' 1.log

The results are as follows:
id=1 d=4
id=2 d=20
id=3 d=5

The advantage of awk is that it gives you fine-grained control over how the input is parsed and how the output is formatted.
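
The question asks for the distribution of d, not just the list of values. A minimal sketch of how the same awk loop could count occurrences instead of printing them is shown below. The function name d_distribution is made up for illustration, and the sample file only reproduces the fields that the question spells out (the elided ones are left out).

```shell
#!/bin/sh
# Hypothetical helper: print "value count" for every value of d in a log file.
d_distribution() {
    awk -F',' '
    {
        # Walk the comma-separated fields and tally the values of key d.
        for (i = 1; i <= NF; i++) {
            n = split($i, kv, "=")
            if (n == 2 && kv[1] == "d") count[kv[2]]++
        }
    }
    END { for (v in count) print v, count[v] }
    ' "$1" | sort -n    # awk array order is unspecified, so sort by value
}

# Sample data modeled on the question (assumption: elided fields omitted).
sample=$(mktemp)
printf 'id=1\na=1,b=2,c=3,d=4,e=5\n\nid=2\na=3,b=4,d=20,e=6\n\nid=3\na=4,b=4,c=2,d=5,e=8\n' > "$sample"
d_distribution "$sample"
# prints:
# 4 1
# 5 1
# 20 1
rm -f "$sample"
```

The id lines need no special handling here: split on "=" gives the key "id", which is not equal to "d", so they are never counted.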

ringa_lee · Answer

First filter out the id= lines (they also contain "d="), then use grep's -o option to print only the part that matches the pattern. To get just the number, pipe the result through awk or cut.

grep -v "id=[0-9]*" 1.log | grep -o "d=[0-9]*" | awk -F'=' '{ print $2 }'

Or, use egrep,

grep -v "id=[0-9]*" 1.log | egrep -o "d=[0-9]+" | cut -d '=' -f 2

There are many other ways to do this; sed would work as well.
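
To turn the extracted values into a distribution, the usual next step is sort followed by uniq -c. The sketch below wraps that in a function; the name d_histogram is hypothetical, and the final awk only swaps the columns and strips the padding that uniq -c adds.

```shell
#!/bin/sh
# Hypothetical helper: print "value count" pairs for d, ordered by value.
d_histogram() {
    grep -v '^id=' "$1" |          # drop the id= lines, which also contain "d="
        grep -o 'd=[0-9][0-9]*' |  # keep only the d=<number> part
        cut -d '=' -f 2 |          # keep only the number
        sort -n |                  # uniq needs equal values to be adjacent
        uniq -c |                  # prefix each distinct value with its count
        awk '{ print $2, $1 }'     # normalize to "value count"
}
```

Called as d_histogram 1.log, it prints one line per distinct value of d.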

PHP中文网 · Answer

Here's another idea...

mv 1.log /opt/www/1.log

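Then process it with a PHP script. Create a new file 1.php with the following contents (the opening lines of the original script were lost, so the part that reads and splits the file is a reconstruction):

<?php
// Read the log and split it into key=value fields on commas and whitespace.
$arr = preg_split('/[,\s]+/', file_get_contents('/opt/www/1.log'), -1, PREG_SPLIT_NO_EMPTY);
$new_arr = array();
foreach ($arr as $k => $v) {
	$b = explode("=", $v);
	// Collect the value whenever the key is exactly "d".
	if ($b[0] == "d") {
		$new_arr[] = $b[1];
	}
}
print_r($new_arr);
?>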

ringa_lee · Answer

This is a better fit for awk or flex.

flex:

$ cat 1.l
%%
d=[0-9]*,   printf("%d\n", atoi(yytext + 2));
.|\n        ;

$ flex 1.l && gcc lex.yy.c -lfl && ./a.out < 1.log
4
20
5

怪我咯 · Answer

This kind of log processing can be done with awk, perl, or ruby. Here is a perl version (the \b keeps the pattern from matching the "d=" inside "id="):

perl -ne 'print "$1\n" if m/\bd=(\d+)/' your_log_file

PHP中文网 · Answer

Use Python, it works well on any OS.

import re

# \b keeps the pattern from matching the "d=" inside "id=".
_re = re.compile(r'\bd=\d+')
with open('1.log') as f:
    for line in f:
        matched = _re.search(line)
        if matched:
            extracted = matched.group(0)
            print(extracted)

ringa_lee · Answer

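Use the cut command:
cut -d 'delimiter' -f 'field number(s) to select'
There is also a -c option, which selects by character position instead of by field.

$ cut -s -d ',' -f 4 1.log

Here -s skips lines that contain no delimiter (the id= lines). This only works when d is always the fourth field; in the sample above the id=2 record has no c, so its fourth field is e=6. See man cut for details.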