SQL-like GROUP BY and SUM for text files on the command line?

Posted by dnkb on Super User
Published on 2010-05-02T04:25:57Z Indexed on 2010/05/02 4:28 UTC

Filed under: linux | sed | awk | command-line | bash

I have huge text files with two fields: the first is a string and the second is an integer. The files are sorted by the first field. What I'd like to get in the output is one line per unique string, together with the sum of the numbers for all lines sharing that string. Some strings appear only once while others appear multiple times. For example, given the sample data below, for the string glehnia I'd like to get 10+22=32 in the result.

Any suggestions on how to do this, either with GnuWin32 command-line tools or in a Linux shell?

Thanks!

glehnia 10
glehnia 22
glehniae 343
glehnii 923
glei 1171
glei 2283
glei 3466
gleib 914
gleiber 652
gleiberg 495
gleiberg 709
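
One possible approach, as a minimal sketch rather than a definitive answer: because the files are already sorted by the first field, awk can keep a single running total and print it whenever the string changes, so memory use stays constant even for huge files. The function name sum_by_key and the file name data.txt are hypothetical, and the sketch assumes whitespace-separated fields and a POSIX-style shell (quoting would need adjusting under Windows cmd.exe).

```shell
# Hypothetical helper: sum the second field for each distinct first field.
# Assumes the input is already sorted by the first field, so only the
# current key and its running total are kept in memory.
sum_by_key() {
    awk '
        # Key changed: emit the finished group and reset the total.
        NR > 1 && $1 != key { print key, total; total = 0 }
        # Every line: remember the key and add its number to the total.
        { key = $1; total += $2 }
        # End of input: emit the last group (skip if the input was empty).
        END { if (NR > 0) print key, total }
    ' "$@"
}

# Usage (data.txt is a hypothetical file name):
# sum_by_key data.txt
```

Run on the sample data above, this should print glehnia 32, glehniae 343, glehnii 923, glei 6920, gleib 914, gleiber 652 and gleiberg 1204, one pair per line. If the input were not sorted, an associative array (total[$1] += $2, printed in an END block) would be the usual alternative, at the cost of holding every distinct string in memory.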

© Super User or respective owner
