Input field separator in awk
Posted by Matthijs on Super User
        
        
        
        
Published on 2012-10-05T20:43:04Z
        
I have many large data files. The delimiter between the fields is a semicolon. However, I have found that there are semicolons in some of the fields, so I cannot simply use the semicolon as a field separator.
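The following example has 4 fields, but awk sees only 3. The regex (which includes a '-' because some of the numerical data are negative) consumes the '1' of field 3 as part of the separator in front of it, so the semicolon after the '1' no longer has a matching character before it and is never treated as a separator: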
echo '"This";"is";1;"line of; data"' | awk -F'[0-9"-];[0-9"-]' '{print "No. of fields:\t"NF; print "Field 3:\t" $3}'
No. of fields:  3
Field 3:        ;"line of; data"
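To make it visible which characters the separator regex swallows, the matches can be printed one at a time. This is only a minimal sketch using awk's standard match() function with RSTART and RLENGTH; the function name show_separators is made up for illustration.

```shell
# show_separators is a hypothetical helper name, not part of awk.
# It prints every substring the separator regex matches, in order,
# and then whatever text is left once no further match is found.
show_separators() {
    awk '{
        s = $0
        while (match(s, /[0-9"-];[0-9"-]/)) {
            print "separator: " substr(s, RSTART, RLENGTH)
            s = substr(s, RSTART + RLENGTH)   # continue after the match
        }
        print "left over: " s
    }'
}

echo '"This";"is";1;"line of; data"' | show_separators
```

This prints the separators `";"` and `";1`, followed by the leftover `;"line of; data"`. The second separator has eaten the '1', which is why the third semicolon is not recognised.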
Of course,
echo '"This";"is";1;"line of; data"' | awk -F';' '{print "No. of fields:\t"NF}'
No. of fields:  5
solves that problem, but splits the last field in two at its embedded semicolon, so it reports 5 fields instead of 4.
Does anyone know a solution to this?
Thanks!
Matthijs
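One direction that might work is to stop relying on the field separator and walk through each line character by character, treating a semicolon as a separator only when it is outside double quotes. The sketch below assumes that embedded semicolons only occur inside double-quoted fields, that quotes are balanced, and that there are no escaped quotes inside a field; none of this is confirmed for the real data. The function name split_fields is made up for illustration.

```shell
# split_fields is a hypothetical helper name, not part of awk.
# Assumption: a ';' inside a field only ever occurs between double quotes.
split_fields() {
    awk '{
        n = 0; field = ""; inq = 0
        len = length($0)
        for (i = 1; i <= len; i++) {
            c = substr($0, i, 1)
            if (c == "\"") inq = !inq        # entering or leaving a quoted field
            if (c == ";" && !inq) {          # separator only outside quotes
                f[++n] = field; field = ""
            } else {
                field = field c              # quotes are kept in the field
            }
        }
        f[++n] = field                       # the last field has no trailing ";"
        print "No. of fields:\t" n
        for (i = 1; i <= n; i++) print "Field " i ":\t" f[i]
    }'
}

echo '"This";"is";1;"line of; data"' | split_fields
```

On the example line this reports 4 fields, with field 3 being 1 and field 4 being "line of; data". Scanning every character is likely to be slow on large files, so this is meant as a starting point rather than a finished solution.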
© Super User or respective owner