仕事で、以下のようなSQLとその実行時間のログを集計することがあったので、awkで平均値を求めた際のコマンドを備忘として残しておく。
かなり簡略化したログなのだが、「[]」の中に実行に要した時間があって、その右に実行したSQLが記述されているようなログだ。「;」より右のはPrepared StatementでSQLに入れられる値だ。
● /tmp/sample.log
shell2016/04/25 00:00:01 : [11.25] SELECT AAA.* FROM AAA WHERE AAA.xxx = ? ; [123] 2016/04/25 00:00:01 : [9.76] SELECT BBB.* FROM BBB WHERE BBB.yyy = ? AND BBB.zzz; [19,234] 2016/04/25 00:00:01 : [17.32] SELECT AAA.* FROM AAA WHERE AAA.xxx = ? ; [13] 2016/04/25 00:00:02 : [13.45] SELECT AAA.ZSC FROM AAA WHERE AAA.xxx = ? AND AAA.qqq; [22,52] 2016/04/25 00:00:02 : [15.05] SELECT AAA.* FROM AAA WHERE AAA.xxx = ? ; [2345] 2016/04/25 00:00:03 : [13.45] SELECT AAA.ZSC FROM AAA WHERE AAA.xxx = ? AND AAA.qqq; [39,1952] 2016/04/25 00:00:04 : [8.45] SELECT AAA.ZSC FROM AAA WHERE AAA.xxx = ? AND AAA.qqq; [423,1222] 2016/04/25 00:00:05 : [14.45] SELECT BBB.* FROM BBB WHERE BBB.yyy = ? AND BBB.zzz; [9,33234]
で、awk(+α)で各SQLの平均実行時間を計算するという内容なのだが、とりあえず以下のコマンドで対応した。
左から順に平均実行時間、SQLの件数、SQLという列。
bash
awk -F' : ' '{print $2}' /tmp/sample.log | awk -F';' '{print $1}' | awk -F'[][]' '{if($0!=""){a[$3]+=$2;arr[$3]++}}END{for(i in a) print a[i]/arr[i] "," arr[i] "," i;}'
shell[root@test-node ~]# awk -F' : ' '{print $2}' /tmp/sample.log [11.25] SELECT AAA.* FROM AAA WHERE AAA.xxx = ? ; [123] [9.76] SELECT BBB.* FROM BBB WHERE BBB.yyy = ? AND BBB.zzz; [19,234] [17.32] SELECT AAA.* FROM AAA WHERE AAA.xxx = ? ; [13] [13.45] SELECT AAA.ZSC FROM AAA WHERE AAA.xxx = ? AND AAA.qqq; [22,52] [15.05] SELECT AAA.* FROM AAA WHERE AAA.xxx = ? ; [2345] [13.45] SELECT AAA.ZSC FROM AAA WHERE AAA.xxx = ? AND AAA.qqq; [39,1952] [8.45] SELECT AAA.ZSC FROM AAA WHERE AAA.xxx = ? AND AAA.qqq; [423,1222] [14.45] SELECT BBB.* FROM BBB WHERE BBB.yyy = ? AND BBB.zzz; [9,33234] [root@test-node ~]# awk -F' : ' '{print $2}' /tmp/sample.log | awk -F';' '{print $1}' [11.25] SELECT AAA.* FROM AAA WHERE AAA.xxx = ? [9.76] SELECT BBB.* FROM BBB WHERE BBB.yyy = ? AND BBB.zzz [17.32] SELECT AAA.* FROM AAA WHERE AAA.xxx = ? [13.45] SELECT AAA.ZSC FROM AAA WHERE AAA.xxx = ? AND AAA.qqq [15.05] SELECT AAA.* FROM AAA WHERE AAA.xxx = ? [13.45] SELECT AAA.ZSC FROM AAA WHERE AAA.xxx = ? AND AAA.qqq [8.45] SELECT AAA.ZSC FROM AAA WHERE AAA.xxx = ? AND AAA.qqq [14.45] SELECT BBB.* FROM BBB WHERE BBB.yyy = ? AND BBB.zzz [root@test-node ~]# awk -F' : ' '{print $2}' /tmp/sample.log | awk -F';' '{print $1}' | awk -F'[][]' '{if($0!=""){a[$3]+=$2;arr[$3]++}}END{for(i in a) print a[i]/arr[i] "," arr[i] "," i;}' 12.105,2, SELECT BBB.* FROM BBB WHERE BBB.yyy = ? AND BBB.zzz 14.54,3, SELECT AAA.* FROM AAA WHERE AAA.xxx = ? 11.7833,3, SELECT AAA.ZSC FROM AAA WHERE AAA.xxx = ? AND AAA.qqq
多分、ちゃんと書けばもうちょっと短縮出来ると思うのだが、とりあえずはこれで集計出来たので良しとしよう。