最近对数据处理很感兴趣,一直在研究Linux三剑客:grep AWK sed ,下面举个实际工作中遇到的一个小实例,联用三大文本处理剑客。
已知table.txt文件,cat查看文件内容如下,其实也就是我查询数据库中部分表的状态。
[pc@glusterfs-srv-01 ~]$ cat table.txt
*************************** 15. row ***************************
Name: procs_priv
Engine: MyISAM
Version: 10
Row_format: Fixed
Rows: 0
Avg_row_length: 0
Data_length: 0
Max_data_length: 239253730204057599
Index_length: 4096
Data_free: 0
Auto_increment: NULL
Create_time: 2016-09-19 01:38:11
Update_time: 2016-09-19 01:38:11
Check_time: NULL
Collation: utf8_bin
Checksum: NULL
Create_options:
Comment: Procedure privileges
*************************** 16. row ***************************
Name: proxies_priv
Engine: MyISAM
Version: 10
Row_format: Fixed
Rows: 2
Avg_row_length: 693
Data_length: 1386
Max_data_length: 195062158860484607
Index_length: 5120
Data_free: 0
Auto_increment: NULL
Create_time: 2016-09-19 01:38:11
Update_time: 2016-09-19 01:38:11
Check_time: 2016-09-19 01:38:11
Collation: utf8_bin
Checksum: NULL
Create_options:
Comment: User proxy privileges
*************************** 17. row ***************************
Name: servers
Engine: MyISAM
Version: 10
Row_format: Fixed
Rows: 0
Avg_row_length: 0
Data_length: 0
Max_data_length: 433752939111120895
Index_length: 1024
Data_free: 0
Auto_increment: NULL
Create_time: 2016-10-19 01:38:11
Update_time: 2016-09-19 01:38:11
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment: MySQL Foreign Servers table
*************************** 18. row ***************************
Name: slave_master_info
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 0
Avg_row_length: 0
Data_length: 16384
Max_data_length: 0
Index_length: 0
Data_free: 0
Auto_increment: NULL
Create_time: 2016-03-19 01:38:15
Update_time: NULL
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options: stats_persistent=0
Comment: Master Information
*************************** 19. row ***************************
Name: slave_relay_log_info
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 0
Avg_row_length: 0
Data_length: 16384
Max_data_length: 0
Index_length: 0
Data_free: 0
Auto_increment: NULL
Create_time: 2016-04-19 01:38:15
Update_time: NULL
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options: stats_persistent=0
Comment: Relay Log Information
要求: 打印出在2016年8月份之后创建的数据库表名
答案: cat table.txt | awk 'NF' | grep -iE 'name|create_time' | awk -F':' '{ print $2}' | sed -e 's/^[[:space:]]//’ | awk -F- ‘/^[[:alpha:]]/ {a=$0;next} {if ($2>7) print a}'
shell下运行结果如下,
上面的命令以“|”符号为分割,注解如下:
1 查看table.txt内容
2 去除空行,这个也是AWK的一小技巧
3 搜索出包含name或create_time字段的行
4 以符号“:”为分隔符,打印第二个字段
5 去除每行开头的空格
6 这部分也是最关键的命令,意思是搜索到以字母开头的行后,把整行内容赋值保存到变量a中,并且以字符“-”为分割,跳到下一行处理,如果这行的第二个字段值大于7,则打印变量a的内容,即是上一行的内容。