Thursday 8 August 2019

Search for patterns in files using the Linux grep command

Does your job require you to frequently search for patterns in files from the Linux command line? Or do you get frustrated when you have to open files in a GUI editor just to search for a string or pattern? If so, the Linux grep command is for you. It searches for a pattern in one or more files directly from the command line.

In this article, we will look at how to use the Linux grep command through practical examples.
 

SYNTAX

Before jumping into the examples, let's first take a look at how the grep command is used. Here is the basic syntax of the grep command from the man page:
grep [OPTIONS] PATTERN [FILE...]
So we see that the grep command requires PATTERN as a mandatory argument. The OPTIONS and FILE arguments are optional. An OPTIONS argument tells grep to behave as defined by that option, while the FILE argument tells grep which files to search for the pattern. The ellipsis '...' after FILE indicates that more than one file can be given in the argument list.

NOTE For those who are new to this type of syntax information, any argument written in square brackets [] is optional.
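
As a small sketch of what "optional" means for FILE: when no FILE argument is given, grep reads its standard input instead. The two sample lines piped in below are invented for illustration.

```shell
# FILE is optional: with no FILE argument, grep reads standard input.
# The two sample lines are made up for this illustration.
matches=$(printf 'Welcome to Linux.\nHello world.\n' | grep "Linux")

# Only the line containing the pattern is kept.
echo "$matches"
```

This is why grep is so often placed at the end of a pipeline to filter the output of another command.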
 

EXAMPLES

 

1. A basic example

Here is how the grep command can be used in its most basic form.
# grep "Linux" input.txt 
Welcome to Linux.
In the output above, the line in the file input.txt containing the pattern or string "Linux" was displayed as output.
 

2. Pattern matching is case sensitive

The pattern matching done by the grep command is case sensitive. For example, if the argument to grep is "LINUX" (instead of "Linux"), grep will not match lines containing the string "Linux".

Here is an example:
# grep "LINUX" input.txt 
#
So we see that no output was displayed. If grep should ignore case while matching, the -i option can be used. Here is an example:
# grep -i "LINUX" input.txt 
Welcome to Linux.
So we see that this time the string "LINUX" matched with the line containing the string "Linux".
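
The comparison above can be reproduced with a short script. This is only a sketch: the temporary directory and the one-line input.txt are assumptions modeled on the output shown in this article.

```shell
# Throwaway copy of the sample file (contents assumed from the output above).
workdir=$(mktemp -d)
printf 'Welcome to Linux.\n' > "$workdir/input.txt"

# Case-sensitive search: "LINUX" does not match "Linux", so nothing is captured.
# grep exits with status 1 when nothing matches; "|| true" ignores that here.
exact=$(grep "LINUX" "$workdir/input.txt" || true)

# Case-insensitive search with -i: the same pattern now matches.
ignore_case=$(grep -i "LINUX" "$workdir/input.txt")

echo "case-sensitive:   '$exact'"
echo "case-insensitive: '$ignore_case'"

rm -rf "$workdir"
```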
 

3. Search in more than one file

If more than one file is supplied in the argument list, grep searches for the pattern or string in all of them.

For example:
# grep "Linux" input.txt output.txt 
input.txt:Welcome to Linux.
output.txt:I hope you enjoyed working on Linux.
As we can see in the output above, the lines containing the string "Linux" along with their respective file names were displayed in the output.

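Also, to search every file in the current directory, the argument '*' can be passed as input. The shell expands '*' into the list of names in the current directory before grep runs.

Here is an example: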
# grep "Linux" *
input.txt:Welcome to Linux.
output.txt:I hope you enjoyed working on Linux.
Binary file test_strace matches
test_strace.c:    if(NULL == fopen("Linux","rw"))
So we see that the matching lines from all the files in the current directory were displayed as output. For the binary file test_strace, grep only reports that the file matches instead of printing the matching line.
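
A minimal sketch of the point about '*': since the shell performs the expansion, grep receives the same arguments as if the file names had been typed out. The directory and the file notes.txt are hypothetical; the other two files mirror the examples above.

```shell
# Build a small scratch directory (notes.txt is a made-up extra file).
workdir=$(mktemp -d)
printf 'Welcome to Linux.\n' > "$workdir/input.txt"
printf 'I hope you enjoyed working on Linux.\n' > "$workdir/output.txt"
printf 'Nothing relevant here.\n' > "$workdir/notes.txt"

# The shell replaces * with the (sorted) names in the current directory,
# so both commands below hand grep the same list of files.
from_glob=$(cd "$workdir" && grep "Linux" *)
from_list=$(cd "$workdir" && grep "Linux" input.txt notes.txt output.txt)

echo "$from_glob"

rm -rf "$workdir"
```

Note that '*' does not include hidden files (names starting with a dot) unless the shell is configured otherwise.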
 

4. Search recursively using -r option

The -r option makes the grep command search for the pattern (or string) recursively, descending into sub-directories.

Here is an example:
# grep -r "Linux" *
input.txt:Welcome to Linux.
new_dir/new.txt:Linux vs Windows
output.txt:I hope you enjoyed working on Linux.
Binary file test_strace matches
test_strace.c:    if(NULL == fopen("Linux","rw"))
So we see that the output now also contains matching results from files in sub-directories (here, new_dir/new.txt).
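
As a variation, the recursive search can start from "." (the current directory) instead of "*". The sketch below assumes a scratch directory laid out like the example above; the reported file names then begin with "./".

```shell
# Scratch directory with one sub-directory (layout assumed from the example above).
workdir=$(mktemp -d)
mkdir "$workdir/new_dir"
printf 'Welcome to Linux.\n' > "$workdir/input.txt"
printf 'Linux vs Windows\n' > "$workdir/new_dir/new.txt"

# Start the recursive search at "." rather than "*".
# The order in which grep visits files is not guaranteed, so sort the result.
recursive=$(cd "$workdir" && grep -r "Linux" . | sort)

echo "$recursive"

rm -rf "$workdir"
```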
 

5. Match patterns using regular expressions

The grep command also allows regular expressions to be used as patterns. This gives the user tremendous power to search for any pattern that can be expressed as a regular expression.

Here is an example:
# grep -r ".*Linux" output.txt output1.txt 
output.txt:I hope you enjoyed working on Linux.
output1.txt:Welcome to Linux.
output1.txt:I hope you will have fun with Linux.
As we can see, the grep command above used the regular expression ".*Linux" to match lines in the files output.txt and output1.txt. (The -r option has no effect here because both arguments are regular files.)
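
Because ".*" can also match nothing at all, ".*Linux" selects the same lines as the plain string "Linux". A regular expression becomes more selective once it is anchored. The sketch below assumes an output1.txt with the two lines shown above plus one invented non-matching line.

```shell
# Recreate output1.txt (the third line is invented for this illustration).
workdir=$(mktemp -d)
printf 'Welcome to Linux.\nI hope you will have fun with Linux.\nNothing to see here.\n' \
    > "$workdir/output1.txt"

# ".*Linux" and "Linux" select exactly the same lines.
plain=$(grep "Linux" "$workdir/output1.txt")
dotstar=$(grep ".*Linux" "$workdir/output1.txt")

# "^I.*Linux": the line must start with "I" and contain "Linux" later on.
anchored=$(grep "^I.*Linux" "$workdir/output1.txt")

echo "$anchored"

rm -rf "$workdir"
```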

Here is a table of regular expression operators and their effects:
Operator   Effect
.          Matches any single character.
?          The preceding item is optional and will be matched at most once.
*          The preceding item will be matched zero or more times.
+          The preceding item will be matched one or more times.
{N}        The preceding item is matched exactly N times.
{N,}       The preceding item is matched N or more times.
{N,M}      The preceding item is matched at least N times, but not more than M times.
-          Represents a range if it is not first or last in a list, or the ending point of a range in a list.
^          Matches the empty string at the beginning of a line; as the first character of a list, matches the characters not in the list.
$          Matches the empty string at the end of a line.
\b         Matches the empty string at the edge of a word.
\B         Matches the empty string provided it is not at the edge of a word.
\<         Matches the empty string at the beginning of a word.
\>         Matches the empty string at the end of a word.

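Any one of these operators, or a combination of them, can be used to form a regular expression that represents the pattern of the user's choice. Note that with plain grep (basic regular expressions), the operators ?, +, { and } must be preceded by a backslash in GNU grep; with the -E option (extended regular expressions) they are used as written in the table.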
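
A minimal sketch that tries three of the operators from the table, using grep -E. The file words.txt and its contents are hypothetical, chosen only to show which lines each pattern selects.

```shell
# Hypothetical word list used only for this illustration.
workdir=$(mktemp -d)
printf 'color\ncolour\ncolouur\nab\naab\nb\nroom 101\nroom 7\n' > "$workdir/words.txt"

# ? : the preceding "u" is optional, so "color" and "colour" match, "colouur" does not.
optional=$(grep -E '^colou?r$' "$workdir/words.txt")

# + : one or more "a" followed by "b", so "ab" and "aab" match, "b" does not.
one_or_more=$(grep -E '^a+b$' "$workdir/words.txt")

# {N} : a run of three digits at the end of the line, so only "room 101" matches.
three_digits=$(grep -E '[0-9]{3}$' "$workdir/words.txt")

echo "$optional"
echo "$one_or_more"
echo "$three_digits"

rm -rf "$workdir"
```

The patterns are single-quoted so that the shell passes characters such as $ and ? to grep unchanged.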
