Improving the performance of grepping over a huge file
        Posted by rogerio_marcio on Programmers
        Published on 2012-05-29T22:02:09Z
I have FILE_A, which has over 300K lines, and FILE_B, which has over 30M lines. I created a bash script that greps each line of FILE_A against FILE_B and writes the result of the grep to a new file.
The whole process is taking over 5 hours.
I'm looking for suggestions on any way to improve the performance of my script.
I'm using grep -F -m 1 as the grep command. FILE_A looks like this:
123456789 
123455321
and FILE_B is like this:
123456789,123456789,730025400149993,
123455321,123455321,730025400126097,
So with bash I have a while loop that picks the next line of FILE_A and greps for it in FILE_B. When the pattern is found in FILE_B, I write it to result.txt.
while read -r line; do
   grep -F -m1 -- "$line" 30MFile
done < 300KFile > result.txt
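For comparison, one approach I have been wondering about is feeding all the patterns to a single grep invocation with -f, so the 30M-line file is scanned only once instead of 300K times. A rough sketch (the small patterns.txt / data.txt files here are stand-ins for my real 300KFile / 30MFile):

```shell
# Stand-in for FILE_A: one fixed-string pattern per line.
printf '123456789\n123455321\n' > patterns.txt

# Stand-in for FILE_B: comma-separated records keyed by the first field.
printf '123456789,123456789,730025400149993,\n123455321,123455321,730025400126097,\n' > data.txt

# -F: fixed strings, -f: read every pattern from the file in one pass.
grep -F -f patterns.txt data.txt > result.txt

cat result.txt
```

One caveat I am unsure about: with this single-pass form, -m1 would stop after the first match overall rather than the first match per pattern, so I have left it out here.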
Thanks a lot in advance for your help.