In this tutorial, we look at how to use AWK to print the header line of a file or of a command's output along with the lines that match a search pattern.
While filtering output from certain commands or lengthy reports, it may be important to display the first line of the file or the header line to make sense of the rest of the output which is being displayed.
Consider the below output.
[sahil@linuxnix ~]$ df -hTP
Filesystem                   Type     Size  Used Avail Use% Mounted on
/dev/mapper/vg_pbox6-lv_root ext4      18G  4.9G   12G  30% /
tmpfs                        tmpfs    491M   80K  491M   1% /dev/shm
/dev/sda1                    ext4     477M   35M  418M   8% /boot
/dev/sr0                     iso9660  3.7G  3.7G     0 100% /media/CentOS_6.8_Final
/dev/sdb                     ext4     488M  396K  462M   1% /u01
/dev/sdc                     ext4     488M  396K  462M   1% /u02
We would like to print only the ext4 file systems, together with the header line so that we can make sense of the values in each field.
We could use grep to meet this requirement, as in the command below:
[sahil@linuxnix ~]$ df -hTP | grep -E "Filesystem|ext4"
Filesystem                   Type     Size  Used Avail Use% Mounted on
/dev/mapper/vg_pbox6-lv_root ext4      18G  4.9G   12G  30% /
/dev/sda1                    ext4     477M   35M  418M   8% /boot
/dev/sdb                     ext4     488M  396K  462M   1% /u01
/dev/sdc                     ext4     488M  396K  462M   1% /u02
Inside the quoted expression, the pipe (|) symbol is the alternation operator of extended regular expressions (enabled by -E). It tells grep to print lines containing either the word Filesystem or the word ext4.
But for this to work, we always need to know a pattern from the header line (first line) in advance, which is not convenient.
So now, we see how we can obtain the required output using awk.
[sahil@linuxnix ~]$ df -hTP | awk 'NR==1 {print}; /ext4/ {print}'
Filesystem                   Type     Size  Used Avail Use% Mounted on
/dev/mapper/vg_pbox6-lv_root ext4      18G  4.8G   12G  30% /
/dev/sda1                    ext4     477M   35M  418M   8% /boot
/dev/sdb                     ext4     488M  396K  462M   1% /u01
/dev/sdc                     ext4     488M  396K  462M   1% /u02
Let me explain the above awk command.
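- NR is awk's built-in record (row) number, so NR==1 matches row number 1. Note that it is written NR, not $NR; $NR would refer to the field whose position equals the current row number. You should consider reading about the other AWK variables.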
- After this, we have the print statement which tells awk to print the first row.
- Next, we have a semicolon (;). AWK allows several pattern-action rules to be written on one line, similar to chaining commands on the shell command line. The semicolon separates the first rule from the next one, and awk checks every rule against each input line.
- The next AWK statement searches for the word ext4 and prints all lines containing this word.
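The two rules can also be folded into a single condition. The sketch below is a minimal illustration, not the command used above: it feeds awk a few hypothetical rows (the `sample` variable stands in for real `df -hTP` output) and relies on the fact that a rule without an action block prints the matching line by default.

```shell
# Hypothetical sample data standing in for `df -hTP` output.
sample='Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 ext4 477M 35M 418M 8% /boot
tmpfs tmpfs 491M 80K 491M 1% /dev/shm
/dev/sdb ext4 488M 396K 462M 1% /u01'

# Print a line if it is the first record (the header) OR it contains ext4.
# With no { action } given, awk prints the whole matching line.
filtered=$(printf '%s\n' "$sample" | awk 'NR == 1 || /ext4/')
printf '%s\n' "$filtered"
```

One difference worth knowing: with two separate rules, a header line that also happens to contain the search pattern is printed twice, whereas the combined condition prints each line at most once.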
For our next example, let’s consider the below CSV file:
[sahil@linuxnix ~]$ cat agent.csv
Name, Address, Phone Number
Roger,121B Baker's Street, +44-123-5678
Daniel,125A Baker's Street, +44-173-5628
Sean,122B Baker's Street, +44-423-9678
Charles,127D Baker's Street, +44-573-2678
Pierce,129B Baker's Street, +44-825-3678
From the above file, we’d like to print details for the Names Roger and Sean along with the header line. Here’s a long but interesting AWK one-liner which does the job perfectly.
[sahil@linuxnix ~]$ awk -F, 'BEGIN{IGNORECASE=1} ; {OFS=","} ; NR==1 {print $1, $NF}; /[Rr]oger|sean/ {print $1, $NF}' agent.csv
Name, Phone Number
Roger, +44-123-5678
Sean, +44-423-9678
Let me break down the above command step by step.
- BEGIN{IGNORECASE=1} tells awk to perform case-insensitive matching while filtering for patterns. IGNORECASE is a GNU awk (gawk) extension; other awk implementations treat it as an ordinary variable and it has no effect there.
- With {OFS=","} we set the value of the Output Field Separator (OFS) variable to a comma (,) so that the resulting output may be redirected to a CSV file if required. Because this rule has no pattern, the assignment runs for every input line; it could equally be placed in the BEGIN block.
- The next rule prints the first and last fields (columns) of the first row. Here NR==1 matches the first row as in our previous example. NF holds the number of fields in the row, which in this case is 3, so $NF evaluates to the value of the last field.
- Next, we perform the pattern search for the names. We use the pipe (|) symbol inside the regular expression to denote alternation, so a line matches if it contains either of the patterns.
- Here I’ve used a character class [Rr] to match Roger or roger. Although IGNORECASE=1 already takes care of case sensitivity, I wanted to use it in the example just for the sake of demonstration.
- In the final print statement as with the previous one, we print the first and last fields from the matching rows.
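Since IGNORECASE is specific to GNU awk, here is a hedged sketch of a variant that should behave the same on other awk implementations. It is an alternative, not the one-liner explained above: it sets both separators once in BEGIN and lowercases the name field with the standard tolower() function before comparing. The temporary file and the `csv` and `result` variable names are assumptions made for illustration; the rows are a subset of agent.csv.

```shell
# Recreate a few rows of agent.csv in a temporary file (illustrative subset).
csv=$(mktemp)
cat > "$csv" <<'EOF'
Name, Address, Phone Number
Roger,121B Baker's Street, +44-123-5678
Daniel,125A Baker's Street, +44-173-5628
Sean,122B Baker's Street, +44-423-9678
EOF

# BEGIN runs once before any input is read, so FS and OFS are set a single time.
# tolower($1) makes the comparison case-insensitive on the name field only,
# without depending on gawk's IGNORECASE.
result=$(awk 'BEGIN { FS = OFS = "," }
NR == 1 || tolower($1) ~ /^(roger|sean)$/ { print $1, $NF }' "$csv")
printf '%s\n' "$result"

rm -f "$csv"
```

Matching against $1 rather than the whole line also avoids accidental hits where a name appears in another column, such as an address.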