SQL-like GROUP BY and SUM for text files on the command line?

Posted by dnkb on Super User
Published on 2010-05-02T04:25:57Z Indexed on 2010/05/02 4:28 UTC

I have huge text files with two fields per line: the first is a string, the second is an integer. The files are sorted by the first field. In the output I'd like one line per unique string, together with the sum of the numbers for that string. Some strings appear only once, while others appear multiple times. For example, given the sample data below, the string glehnia should give 10+22=32 in the result.

Any suggestions on how to do this, either with GnuWin32 command-line tools or in a Linux shell?

Thanks!

glehnia 10
glehnia 22
glehniae 343
glehnii 923
glei 1171
glei 2283
glei 3466
gleib 914
gleiber 652
gleiberg 495
gleiberg 709
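
To make the desired result concrete, here is a minimal sketch of the aggregation in Python. It is not one of the command-line tools asked about, just an illustration. It assumes the fields are separated by whitespace and relies on the file already being sorted by the first field, so equal strings are adjacent and only one running total is kept in memory at a time. The name `sum_by_key` is made up for this example.

```python
from itertools import groupby


def sum_by_key(lines):
    """Yield (string, total) pairs from lines of the form '<string> <integer>'.

    Assumes the lines are already sorted by the first field, so identical
    strings sit next to each other and can be summed in a single pass.
    """
    # Split each non-blank line into [string, number] on whitespace.
    records = (line.split() for line in lines if line.strip())
    # groupby merges only adjacent equal keys, which is enough for sorted input.
    for key, group in groupby(records, key=lambda rec: rec[0]):
        yield key, sum(int(rec[1]) for rec in group)


# A few lines taken from the sample data above.
sample = [
    "glehnia 10",
    "glehnia 22",
    "glehniae 343",
    "gleiberg 495",
    "gleiberg 709",
]

for key, total in sum_by_key(sample):
    print(key, total)
```

Because the function consumes its input lazily, an open file object can be passed in place of the list, so a huge file does not have to be loaded into memory first.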

© Super User or respective owner
