10.4 Extended matching
There are several ways to search for a pattern in awk, depending on whether you want lines where a substring matches anywhere in the line or lines where a column matches the pattern exactly. These are the basic options:
Return all lines that match the substring anywhere in the line:
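A minimal sketch, assuming a hypothetical whitespace-separated file called samples.txt (the file name and its contents are made up for illustration and are not part of the original tutorial). A bare /pattern/ is tested against the whole line:

```shell
# Hypothetical example data, created only for this illustration
printf 'ID NUCLEOTIDE 5\nsample1 ID 12\nsample11 PROTEIN 7\nID PROTEIN 3\nsample2 NUCLEOTIDE 9\n' > samples.txt

# Print every line that contains the substring "ID" anywhere
awk '/ID/' samples.txt
```

With this data, the lines containing “NUCLEOTIDE” are printed too, because “ID” occurs inside that word.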
Return the lines where the first column (or word) exactly matches the complete search pattern:
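As a sketch, again assuming a hypothetical file samples.txt with made-up contents, an exact comparison with == on $1 could look like this:

```shell
# Hypothetical example data, created only for this illustration
printf 'ID NUCLEOTIDE 5\nsample1 ID 12\nsample11 PROTEIN 7\nID PROTEIN 3\nsample2 NUCLEOTIDE 9\n' > samples.txt

# Print lines whose first column is exactly the string "ID"
awk '$1 == "ID"' samples.txt
```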
Return the lines where the second column exactly matches the complete search pattern:
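The same comparison can be applied to any column by changing the field number. A sketch with the hypothetical samples.txt file (contents made up for illustration):

```shell
# Hypothetical example data, created only for this illustration
printf 'ID NUCLEOTIDE 5\nsample1 ID 12\nsample11 PROTEIN 7\nID PROTEIN 3\nsample2 NUCLEOTIDE 9\n' > samples.txt

# Print lines whose second column is exactly the string "ID"
awk '$2 == "ID"' samples.txt
```

Here “NUCLEOTIDE” in the second column does not count as a match, because == compares the whole field.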
If you are looking for a pattern anywhere within a specific column, you can use the ~ operator. Beware of partial matches: the search pattern “ID” is also found within the string “NUCLEOTIDE”, and the pattern “sample1” also matches the strings “sample11”, “sample12”, “sample13”, …:
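A short sketch of both pitfalls, assuming the hypothetical samples.txt file (its contents are invented for this example):

```shell
# Hypothetical example data, created only for this illustration
printf 'ID NUCLEOTIDE 5\nsample1 ID 12\nsample11 PROTEIN 7\nID PROTEIN 3\nsample2 NUCLEOTIDE 9\n' > samples.txt

# Second column contains "ID" somewhere: also matches "NUCLEOTIDE"
awk '$2 ~ /ID/' samples.txt

# First column contains "sample1" somewhere: also matches "sample11"
awk '$1 ~ /sample1/' samples.txt
```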
To extract lines based on multiple search patterns at once, you can use && (and) and || (or).
For example, the following command returns all lines that contain either the exact pattern “ID” in the first column, or the string “ID” anywhere in the second column:
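One way to write this, sketched against the hypothetical samples.txt file (made-up contents):

```shell
# Hypothetical example data, created only for this illustration
printf 'ID NUCLEOTIDE 5\nsample1 ID 12\nsample11 PROTEIN 7\nID PROTEIN 3\nsample2 NUCLEOTIDE 9\n' > samples.txt

# Either condition is enough: $1 is exactly "ID" OR $2 contains "ID"
awk '$1 == "ID" || $2 ~ /ID/' samples.txt
```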
The next command, in contrast, returns only the lines that contain the exact pattern “ID” in the first column and also contain the string “ID” anywhere in the second column:
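A sketch of the combined condition, again using the hypothetical samples.txt file (made-up contents):

```shell
# Hypothetical example data, created only for this illustration
printf 'ID NUCLEOTIDE 5\nsample1 ID 12\nsample11 PROTEIN 7\nID PROTEIN 3\nsample2 NUCLEOTIDE 9\n' > samples.txt

# Both conditions must hold: $1 is exactly "ID" AND $2 contains "ID"
awk '$1 == "ID" && $2 ~ /ID/' samples.txt
```

With this data only the first line satisfies both conditions.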
You can also use regular expressions in your search patterns. Recall the tutorial about regular expressions (chapter 9.4): the same results can be obtained by matching a regex pattern in awk:
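As an illustration (these are not necessarily the patterns used in chapter 9.4), the sketch below applies two regular expressions to the hypothetical samples.txt file. The anchors ^ and $ restrict the match to the whole field, which avoids the partial matches described earlier:

```shell
# Hypothetical example data, created only for this illustration
printf 'ID NUCLEOTIDE 5\nsample1 ID 12\nsample11 PROTEIN 7\nID PROTEIN 3\nsample2 NUCLEOTIDE 9\n' > samples.txt

# Anchored regex: first column is "sample1" and nothing else
awk '$1 ~ /^sample1$/' samples.txt

# Character class: first column is "sample" followed by one or more digits
awk '$1 ~ /^sample[0-9]+$/' samples.txt
```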