Search Results

Search found 490 results on 20 pages for 'awk'.


  • use awk to identify multi-line records and filter them

    - by nanshi
    I need to process a big data file that contains multi-line records. Example input:

        1 Name Dan
        1 Title Professor
        1 Address aaa street
        1 City xxx city
        1 State yyy
        1 Phone 123-456-7890
        2 Name Luke
        2 Title Professor
        2 Address bbb street
        2 City xxx city
        3 Name Tom
        3 Title Associate Professor
        3 Like Golf
        4 Name
        4 Title Trainer
        4 Likes Running

    Note that the first integer field is unique and really identifies a whole record. So in the above input I really have 4 records, although I don't know how many lines of attributes each record may have. I need to:

    - identify valid records (must have "Name" and "Title" fields)
    - output the available attributes for each valid record, say "Name", "Title", "Address" are the needed fields.

    Example output:

        1 Name Dan
        1 Title Professor
        1 Address aaa street
        2 Name Luke
        2 Title Professor
        2 Address bbb street
        3 Name Tom
        3 Title Associate Professor

    So in the output file, record 4 is removed since it doesn't have the "Name" field. Record 3 doesn't have an Address field but is still printed to the output, since it is a valid record that has "Name" and "Title". Can I do this with awk? But how do I identify a whole record using the first "id" field on each line? Thanks a lot to the unix shell script experts for helping me out! :)

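    One way this might be done, sketched under two assumptions: the lines of each id are contiguous (as in the sample), and a field only counts as present when it has a value. The filename input.txt is a placeholder.

        awk '
        function flush() {
            # print the buffered record only if it had Name and Title values
            if (("Name" in has) && ("Title" in has)) printf "%s", buf
            buf = ""; split("", has)
        }
        $1 != prev { flush(); prev = $1 }
        ($2 == "Name" || $2 == "Title" || $2 == "Address") && $3 != "" {
            buf = buf $0 "\n"
            has[$2] = 1
        }
        END { flush() }
        ' input.txt

    The id change on $1 is what marks a record boundary; record 4 is dropped because its Name line has no value, so has["Name"] is never set.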

  • Uniq in awk; removing duplicate values in a column using awk

    - by D W
    I have a large datafile in the following format below:

        ENST00000371026  WDR78,WDR78,WDR78,  WD repeat domain 78 isoform 1,WD repeat domain 78 isoform 1,WD repeat domain 78 isoform 2,
        ENST00000371023  WDR32               WD repeat domain 32 isoform 2
        ENST00000400908  RERE,KIAA0458,      atrophin-1 like protein isoform a,Homo sapiens mRNA for KIAA0458 protein, partial cds.,

    The columns are tab separated. Multiple values within columns are comma separated. I would like to remove the duplicate values in the second column to result in something like this:

        ENST00000371026  WDR78          WD repeat domain 78 isoform 1,WD repeat domain 78 isoform 1,WD repeat domain 78 isoform 2,
        ENST00000371023  WDR32          WD repeat domain 32 isoform 2
        ENST00000400908  RERE,KIAA0458  atrophin-1 like protein isoform a,Homo sapiens mRNA for KIAA0458 protein, partial cds.,

    I tried the following code below, but it doesn't seem to remove the duplicate values:

        awk '
        BEGIN { FS="\t" } ;
        {
            split($2, valueArray, ",");
            j=0;
            for (i in valueArray) {
                if (!(valueArray[i] in duplicateArray)) {
                    duplicateArray[j] = valueArray[i];
                    j++;
                }
            };
            printf $1 "\t";
            for (j in duplicateArray) {
                if (duplicateArray[j]) {
                    printf duplicateArray[j] ",";
                }
            }
            printf "\t";
            print $3
        }' knownGeneFromUCSC.txt

    How can I remove the duplicates in column 2 correctly?

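    The attempt stores values under numeric keys (duplicateArray[j] = value) but then tests `valueArray[i] in duplicateArray`, which checks keys, not values, so nothing ever matches; duplicateArray is also never cleared between lines. A sketch of a fix that keys the lookup on the value itself:

        awk '
        BEGIN { FS = OFS = "\t" }
        {
            n = split($2, vals, ",")
            split("", seen)                     # reset the lookup for each line
            out = ""
            for (i = 1; i <= n; i++)
                if (vals[i] != "" && !(vals[i] in seen)) {
                    seen[vals[i]] = 1
                    out = (out == "" ? vals[i] : out "," vals[i])
                }
            $2 = out
            print
        }' knownGeneFromUCSC.txt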

  • AWK scripting: How to remove the field separator using awk

    - by anil-1985
    Need the following output:

        ONGC044
        ONGC043
        ONGC042
        ONGC041
        ONGC046
        ONGC047

    from this input:

        Medium Label        Medium ID                      Free Blocks
        ===============================================================================
        [ONGC044] ECCPRDDB_FS_43    ac100076:4aed9b39:44f0:0001    195311616
        [ONGC043] ECCPRDDB_FS_42    ac100076:4aed9b1d:44e8:0001    195311616
        [ONGC042] ECCPRDDB_FS_41    ac100076:4aed9af4:4469:0001    195311616
        [ONGC041] ECCPRDDB_FS_40    ac100076:4aed9ad3:445e:0001    195311616
        [ONGC046] ECCPRDDB_FS_44    ac100076:4aedd04a:68c6:0001    195311616
        [ONGC047] ECCPRDDB_FS_45    ac100076:4aedd4a0:6bf5:0001    195311616

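    A sketch that might do it: make the square brackets themselves the field separator (the -F argument is an ERE bracket expression matching [ or ]) and only look at lines that start with a bracket, which skips the two header lines. The filename media.txt is a placeholder.

        awk -F '[][]' '/^\[/ { print $2 }' media.txt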

  • How to replace the nth column/field in a comma-separated string using sed/awk?

    - by Peter Meier
    Assume I have a string "1,2,3,4". Now I want to replace, e.g., the 3rd field of the string by some different value:

        "1,2,NEW,4"

    I managed to do this with the following command:

        echo "1,2,3,4" | awk -F, -v OFS=, '{$3="NEW"; print }'

    Now the index of the column to be replaced should be passed as a variable, so in this case index=3. How can I pass this to awk? Because this won't work:

        echo "1,2,3,4" | awk -F, -v OFS=, '{$index="NEW"; print }'
        echo "1,2,3,4" | awk -F, -v OFS=, '{$($index)="NEW"; print }'
        echo "1,2,3,4" | awk -F, -v OFS=, '{\$$index="NEW"; print }'

    Thanks for your help!

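    The usual idiom is to hand the shell variable to awk with -v and then apply awk's own $ operator to it: $col means "field number col". A sketch:

        index=3
        echo "1,2,3,4" | awk -F, -v OFS=, -v col="$index" '{ $col = "NEW"; print }'

    Keeping the program in single quotes means the shell never touches it; -v does all the passing.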

  • awk and cat - How to ignore multiple lines?

    - by Filipe YaBa Polido
    I need to extract the VoIP log from a D-Link router, so I've set up a little Python script that executes a command on this router via telnet. My script does a "cat /var/log/calls.log" and returns the result; however, it also sends unimportant stuff, like the BusyBox banner, etc. How can I ignore lines 1 to 6 and the last 2? This is my current output:

        yaba@foobar:/stuff$ python calls.py
        BusyBox v1.00 (2009.04.09-11:17+0000) Built-in shell (msh)
        Enter 'help' for a list of built-in commands.

        DVA-G3170i/PT # cat /var/call.log
        1 ,1294620563,2 ,+351xxx080806 ,xxx530802 ,1 ,3 ,1
        DVA-G3170i/PT # exit

    And I just need:

        1 ,1294620563,2 ,+351xxx080806 ,xxx530802 ,1 ,3 ,1

    (it can have multiple lines) so that I can save it to a CSV and later to a SQL db. Thanks, and sorry for my bad English.

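    One sketch, assuming the junk really is always exactly the first 6 lines and the last 2: buffer everything and print only the middle.

        python calls.py | awk '
        { line[NR] = $0 }
        END { for (i = 7; i <= NR - 2; i++) print line[i] }'

    If the banner length ever varies, filtering on the shape of a call record instead (for example awk '/^[0-9]+ ,/') may be more robust.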

  • Sum of every N lines; awk

    - by Sharat Chandra
    I have a file containing data in a single column. I have to find the sum of every 4 lines and print that sum. That is, I have to compute the sum of the values on lines 0 to 3, then the sum of lines 4 to 7, then the sum of lines 8 to 11, and so on.

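    A possible approach keyed on the line number modulo 4; the END guard prints a final partial group, assuming you want one (drop it otherwise). The filename data.txt is a placeholder.

        awk '
        { sum += $1 }
        NR % 4 == 0 { print sum; sum = 0 }
        END { if (NR % 4) print sum }      # partial last group, if any
        ' data.txt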

  • awk or perl file editing & manipulation

    - by paul44
    I have a standard passwd file and a usermap file, which maps a unix name (e.g. jbloggs) to an AD account name (e.g. bloggsjoe) in the format:

        jbloggs bloggsjoe
        jsmith smithjohn
        ... etc.

    How can I edit the passwd file to swap the original unix name for the AD account name, so that each line of the passwd file has the AD account name instead? Appreciate any help for a perl learner.

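    Even though the goal is to learn Perl, this maps naturally onto a two-file awk sketch: read the map first, then rewrite the login field of passwd. It assumes usermap is space-separated, the login is the first colon-separated field, and you write to a new file rather than in place; usermap and passwd.new are placeholder names.

        awk -F: -v OFS=: '
        NR == FNR { split($0, m, " "); map[m[1]] = m[2]; next }   # first file: the usermap
        $1 in map { $1 = map[$1] }                                # rewrite the login field
        { print }
        ' usermap passwd > passwd.new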

  • awk/sed/bash to merge/concatenate data

    - by Kyle
    Trying to merge some data that I have. The input would look like so:

        foo bar
        foo baz boo
        abc def
        abc ghi

    And I would like the output to look like:

        foo bar baz boo
        abc def ghi

    I have some ideas using some arrays in a shell script, but I was looking for a more elegant or quicker solution.

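    A sketch that concatenates the values for each first-field key while preserving the order in which keys are first seen; input.txt is a placeholder name.

        awk '
        {
            v = $0
            sub(/^[^[:space:]]+[[:space:]]*/, "", v)   # strip the key, keep the rest
            if ($1 in vals) vals[$1] = vals[$1] " " v
            else { order[++n] = $1; vals[$1] = v }
        }
        END { for (i = 1; i <= n; i++) print order[i], vals[order[i]] }
        ' input.txt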

  • AWK: compare apache dates without using regular expression

    - by smallmeans
    I'm writing a log-analysis application and wanted to grab apache log records between two certain dates. Assume that a date is formatted as such:

        22/Dec/2009:00:19    (day/month/year:hour:minute)

    Currently, I'm using a regular expression to replace the month name with its numeric value and remove the separators, so the above date is converted to 221220090019, making a date comparison trivial. But running a regex on each record of a large file, say, one containing a quarter million records, is extremely costly. Is there any other method not involving regex substitution? Thanks in advance.

    Edit: here's the function doing the conversion/comparison:

        function dateInRange(t, from, to) {
            sub(/[[]/, "", t);
            split(t, a, "[/:]");
            match("JanFebMarAprMayJunJulAugSepOctNovDec", a[2]);
            a[2] = sprintf("%02d", (RSTART + 2) / 3);
            s = a[3] a[2] a[1] a[4] a[5];
            return s >= from && s <= to;
        }

    "from" and "to" are the intervals in the aforementioned format, and "t" is the raw apache log date/time field (e.g. [22/Dec/2009:00:19:36).

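    Since the Common Log Format timestamp has fixed-width fields (the day is always two digits), one regex-free sketch slices it with substr and a month table built once in BEGIN. Note it assembles the key year-first, as the function above does; that ordering is what makes the plain string comparison valid, so from and to must use the same yyyymmddhhmm layout.

        BEGIN {
            split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
            for (i = 1; i <= 12; i++) mon[m[i]] = sprintf("%02d", i)
        }
        # t looks like "[22/Dec/2009:00:19:36" -- fixed offsets, no regex needed
        function dateInRange(t, from, to,   s) {
            s = substr(t, 9, 4) mon[substr(t, 5, 3)] substr(t, 2, 2) \
                substr(t, 14, 2) substr(t, 17, 2)
            return s >= from && s <= to
        }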

  • AWK Shift empty column to left (to start position)

    - by Filip Zembol
    INPUT:

        fofo jojo tst fojo jofo sts rhr hrhh dodo jojo hoho jojo zozo roro vovo

    OUTPUT:

        fofo jojo tst fojo jofo sts rhr hrhh dodo jojo hoho jojo zozo roro popo

    NOTE: Please help me. I need to shift left all rows that have the first column empty. All fields are tab-delimited. In this file some rows start in the first column, but some rows start in the second or third column. Thank you.

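    With tab as the separator, a row whose leading columns are empty starts with one or more literal tabs, so deleting that leading run shifts the row to the start position. A sketch (input.txt is a placeholder):

        awk 'BEGIN { FS = OFS = "\t" }
             { sub(/^\t+/, ""); print }' input.txt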

  • join 3 files by first column with awk?

    - by noinflection
    I have three similar files; they are all like this:

    File A:

        ID1 Value1a
        ID2 Value2a
        ...
        IDN ValueNa

    and I want an output like this:

        ID1 Value1a Value1b Value1c
        ID2 Value2a Value2b Value2c
        ...
        IDN ValueNa ValueNb ValueNc

    Looking at the first line, I want Value1a to be the value of ID1 in file A, Value1b the value of ID1 in file B, and so on for each field and each line. I thought of it like a SQL join. I've tried several things but none of them were even close.

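    Assuming every ID appears exactly once in each of the three files, a sketch that accumulates one column per file (fileA, fileB, fileC are placeholder names):

        awk '
        { val[$1] = val[$1] " " $2 }
        NR == FNR { order[++n] = $1 }      # remember ID order from the first file
        END { for (i = 1; i <= n; i++) print order[i] val[order[i]] }
        ' fileA fileB fileC

    If the files are already sorted on the ID, the standard join tool can also be chained: join fileA fileB | join - fileC.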

  • How to remove all words written in capital letters ONLY (by using sed and/or awk)

    - by Virtual_Lotos
    I am trying to delete all words written in capital letters only, using sed:

        sed -r "s/\b[A-Z]\w*\s*//g" < file1 > file2

    But this solution captures all the words starting with a capital letter and deletes them, which is not the goal. Here's an example. file1 content:

        AAAAAAAAAAAA BBbbbbb AbAbAbAb aaaaaBBBBB AAAAAA BBBBBB A1-B1 a1-b1 A1-b1 AA AAAAA BBBBB AAAAA Abbbb AAA AAAAA AAAABB Abbbb Baaaa Aaaaa AB AAAAAA1 BBBBBBb AAAAAA 1 BBBBBB b

    The result should be like this (file2 content):

        BBbbbbb AbAbAbAb aaaaaBBBBB A1-B1 a1-b1 A1-b1 AA Abbbb AAA Abbbb Baaaa Aaaaa AB AAAAAA1 BBBBBBb AAAAAA 1 BBBBBB b

    Each line with at least one digit or one lowercase letter should remain intact (should not be deleted).

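    A word-by-word sketch that deletes capitals-only words and leaves anything containing a digit or a lowercase letter alone. Caveat: as written it would also drop short all-caps tokens like AA and AB, which the expected output above keeps, so the exact rule may need a length threshold (for example /^[A-Z]{4,}$/).

        awk '
        {
            out = ""
            for (i = 1; i <= NF; i++)
                if ($i !~ /^[A-Z]+$/)        # keep anything not written in capitals only
                    out = (out == "" ? $i : out " " $i)
            print out
        }' file1 > file2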

  • OSX, G/AWK, Bash - "illegal statement, unterminated string" and no file output.

    - by S1syphus
    I have a script that somebody from SO kindly provided to solve an issue I was having. However, I'm having some issues getting it to work on OSX.

        gawk --version
        GNU Awk 3.1.6
        awk --version
        awk version 20100208

    The original source is:

        awk -F, -vOFS=, -vc=1 '
        NR == 1 {
            for (i=1; i<NF; i++) {
                if ($i != "") { g[c]=i; f[c++]=$i }
            }
        }
        NR>2 {
            for (i=1; i < c; i++) {
                print $1,$2, $g[i] > "output_"f[i]".csv
            }
        }' data.csv

    When I run the script it gives the following error:

        awk: syntax error at source line 12
         context is
                print $1,$2, $g[i] > >>> "output_"f <<< [i]".csv
        awk: illegal statement at source line 13

    From the look of it, the value of f[i] isn't being appended to the output file name, but I don't know why. If I change awk to gawk and run the original script, here is the output:

        gawk: cmd. line:11: print $1,$2, $g[i] > "output_"f[i]".csv
        gawk: cmd. line:11:                    ^ unterminated string

    So I edit the relevant line to fix the unterminated string:

        print $1,$2, $g[i] > "output_"f[i]".csv"

    Then it runs through fine and produces no errors, but there are no output files. Any ideas? I spent the majority of last night and this morning poring over this. A sample input file:

        ,,L1,,,L2,,,L3,,,L4,,,L5,,,L6,,,L7,,,L8,,,L9,,,L10,,,L11,
        Title,r/t,needed,actual,Inst,needed,actual,Inst,needed,actual,Inst,needed,actual,Inst,needed,actual,Inst,needed,actual,Inst,needed,actual,Inst,needed,actual,Inst,needed,actual,Inst,needed,actual,Inst,needed,actual,Inst
        EXAMPLEfoo,60,6,6,6,0,0,0,0,0,0,6,6,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
        EXAMPLEbar,30,6,6,12,6,7,14,6,6,12,6,6,12,6,8,16,6,7,14,6,7.5,15,6,6,12,6,8,16,6,0,0,6,7,14
        EXAMPLE1,60,3,3,3,3,5,5,3,4,4,3,3,3,3,6,6,3,4,4,3,3,3,3,4,4,3,8,8,3,0,0,3,4,4
        EXAMPLE2,120,6,6,3,0,0,0,6,8,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
        EXAMPLE3,60,6,6,6,6,8,8,6,6,6,6,6,6,0,0,0,0,0,0,6,8,8,6,6,6,0,0,0,0,0,0,0,10,10
        EXAMPLE4,30,6,6,12,6,7,14,6,6,12,6,6,12,3,5.5,11,6,7.5,15,6,6,12,6,0,0,6,9,18,6,0,0,6,6.5,13

    So for L1 an example output would look like:

        EXAMPLEfoo,60,6
        EXAMPLEbar,30,6
        EXAMPLE1,60,3
        EXAMPLE2,120,6
        EXAMPLE3,60,6
        EXAMPLE4,30,6

    And for L2:

        EXAMPLEfoo,60,0
        EXAMPLEbar,30,6
        EXAMPLE1,60,3
        EXAMPLE2,120,0
        EXAMPLE3,60,6
        EXAMPLE4,30,6

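    A sketch of a corrected script with three changes: the string is closed ("output_" f[i] ".csv"), the redirection target is parenthesized (unparenthesized concatenation after > is ambiguous and some awks, including the BSD awk shipped with OSX, bind it differently), and the header loop uses <= NF so the last labelled column isn't skipped.

        awk -F, -v OFS=, '
        NR == 1 {
            c = 1
            for (i = 1; i <= NF; i++)
                if ($i != "") { g[c] = i; f[c++] = $i }    # column position and label
        }
        NR > 2 {
            for (i = 1; i < c; i++)
                print $1, $2, $(g[i]) > ("output_" f[i] ".csv")
        }' data.csv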

  • How can I use Awk inside a Perl script?

    - by papoyan
    I'm having trouble using the following code inside my Perl script; any advice is really appreciated. How do I correct the syntax?

        # If I execute it in bash, it works just fine
        bash$ whois google.com | egrep "\w+([._-]\w)*@\w+([._-]\w)*\.\w{2,4}" | awk ' {for (i=1;i<=NF;i++) {if ( $i ~ /[[:alpha:]]@[[:alpha:]]/ ) { print $i}}}' | head -n1
        [email protected]

        # but this doesn't work
        bash$ ./email.pl google.com
        awk:  {for (i=1;i<=NF;i++) {if (  ~ /[[:alpha:]]@[[:alpha:]]/ ) { print }}}
        awk:                             ^ syntax error

    Here is my script:

        bash$ cat email.pl
        #!/usr/bin/perl
        $input = lc shift @ARGV;
        $host = $input;
        my $email = `whois $host | egrep "\w+([._-]\w)*@\w+([._-]\w)*\.\w{2,4}" |awk ' {for (i=1;i<=NF;i++) {if ( $i ~ /[[:alpha:]]@[[:alpha:]]/ ) { print $i}}}'|head -1`;
        print my $email;

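    Backticks interpolate like double quotes, so Perl eats $i and the @ before the shell ever sees them; that is why awk's error shows `if ( ~ ...)` with the $i missing. A sketch (in Perl, since that is what wraps the pipeline) that builds the command in non-interpolating q{} strings and splices in only the host; HOST is a placeholder token, the regex and awk program are copied from the question, and $host is assumed to be trusted input:

        #!/usr/bin/perl
        use strict;
        use warnings;

        my $host = lc shift @ARGV;

        # q{} quotes like single quotes, so $i, @ and \w reach the shell intact
        my $cmd = q{whois HOST | egrep "\w+([._-]\w)*@\w+([._-]\w)*\.\w{2,4}"}
                . q{ | awk '{for (i=1;i<=NF;i++) if ($i ~ /[[:alpha:]]@[[:alpha:]]/) print $i}'}
                . q{ | head -n1};
        $cmd =~ s/HOST/$host/;

        my $email = `$cmd`;
        print $email;

    Note the original's `print my $email;` also declares a fresh (empty) variable instead of printing the one just filled in.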

  • Replacing every 10th pipe with new line in unix

    - by user327958
    Let's say I have fields: name, number, id. I have a data file:

        name1|number1|id1|name2|number2|id2...etc

    I want to replace every 3rd pipe with a new line or '\n' so I get:

        name1|number1|id1
        name2|number2|id2

    I'm having no luck with awk or sed. I've tried the following, and variations of:

        awk '/"\|"/{c++;if(c==10){sub("\|","\n");c=0}}1' inputfile.txt
        sed 's/"|"/"\n"/2' inputfile.txt

    It tells me:

        awk: syntax error near line 1
        awk: illegal statement near line 1
        awk: syntax error near line 1
        awk: bailing out near line 1

    Any help is greatly appreciated! EDIT: Thank you!

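    One sketch: treat the pipe as the field separator and re-emit it after every field except each 3rd one and the last, which get a newline instead.

        awk -F'|' '{
            for (i = 1; i <= NF; i++)
                printf "%s%s", $i, ((i % 3 == 0 || i == NF) ? "\n" : "|")
        }' inputfile.txt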

  • how do I paste text to a line by line text filter like awk, without having stdin echo to the screen?

    - by Barton Chittenden
    I have a text in an email on a Windows box that looks something like this:

        100 some random text
        101 some more random text
        102 lots of random text, all different
        103 lots of random text, all the same

    I want to extract the numbers, i.e. the first word on each line. I've got a terminal running bash open on my Linux box. If these were in a text file, I would do this:

        awk '{print $1}' mytextfile.txt

    I would like to paste these in and get my numbers out, without creating a temp file. My naive first attempt looked like this:

        $ awk '{print $1}'
        100 some random text
        100
        101 some more random text
        101
        102 lots of random text, all different
        103 lots of random text, all the same
        102
        103

    The buffering of stdin and stdout makes a hash of this. I wouldn't mind if all of stdin printed first, followed by all of stdout; this is what would happen if I were to paste into 'sort', for example, but awk and sed are a different story. A little more thought gave me this: open two terminals, create a fifo, read from the fifo in one terminal, and write to it in the other. This does in fact work, but I'm lazy and don't want to open a second terminal. Is there a way in the shell to hide the text echoed to the screen when I'm pasting into a pipe, so that I paste this:

        100 some random text
        101 some more random text
        102 lots of random text, all different
        103 lots of random text, all the same

    but see this?

        $ awk '{print $1}'
        100
        101
        102
        103

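    One way to get the sort-like behaviour without a fifo or a second terminal: slurp the whole paste into a variable first, then run awk on it in one go. The paste still echoes as you go, but all of awk's output arrives afterwards in a single block.

        text=$(cat)        # paste here, then press Ctrl-D to end the input
        printf '%s\n' "$text" | awk '{print $1}'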

  • How can I get awk input from a file and add my own text to the data?

    - by xs2dhillon
    Assume that I have a text file separated by colons. I understand how to display the entire text file or any specific column using awk. However, what I want to do is to take the awk output and display it with my own text added, using a shell script. For example, assume that my text file is:

        England:London:GMT
        USA:Washington:EST
        France:Paris:GMT

    What I want to do is to display this input in the below format:

        COUNTRY: England CAPITOL: London TIMEZONE: GMT
        COUNTRY: USA CAPITOL: Washington TIMEZONE: EST
        COUNTRY: France CAPITOL: Paris TIMEZONE: GMT

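    A printf sketch, which lets you weave your own labels around the fields (keeping the question's CAPITOL spelling so the output matches; countries.txt is a placeholder name):

        awk -F: '{ printf "COUNTRY: %s CAPITOL: %s TIMEZONE: %s\n", $1, $2, $3 }' countries.txt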

  • How to remove characters like (), ' * [] from grep results with grep, awk or sed?

    - by easyyu
    For example, if I made a file with grep that gives me the following result:

        16 Jan 07:18:42 (name1), xx.210.49.xx),
        16 Jan 07:19:14 (name2), xx.210.xx.24),
        16 Jan 07:19:17 (name3), xx.140.xxx.79),
        16 Jan 07:19:44 (name4), xx.210.49.xx),
        16 Jan 07:19:56 (name5), xx.140.xxx.79),

    then how do I use sed, awk or grep to remove everything except the date, name and IP, to look like this:

        16 Jan 07:18:42 name1 xx.210.49.xx
        16 Jan 07:19:14 name2 xx.210.xx.24
        16 Jan 07:19:17 name3 xx.140.xxx.79
        16 Jan 07:19:44 name4 xx.210.49.xx
        16 Jan 07:19:56 name5 xx.140.xxx.79

    My grep command looks like this:

        grep 'double' $DAEMON | awk -F" " '{print $2" "$1" "$3" "$8" "$10}' > $DBLOG

    Thx.

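    One sketch: add a gsub to the existing awk stage that strips the unwanted punctuation before printing. The bracket expression covers ( ) ' * [ ] and the stray commas; the field numbers are kept from the command above, which assumes the same layout of the daemon log. Since gsub only deletes characters, not the spaces between fields, the field positions stay put.

        grep 'double' "$DAEMON" \
          | awk '{ gsub(/[][()'\''*,]/, ""); print $2, $1, $3, $8, $10 }' > "$DBLOG"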

  • How to pass a variable to an awk print parameter...

    - by Jamie
    I'm trying to extract the nth+1 and nth+3 columns from a file. This is what I tried, which is useful pseudo code:

        for i in {1..100} ; do
            awk -F "," " { printf \"%3d, %12.3f, %12.3f\\n\", \$1, \$($i+1), \$($i+3) } " All_Runs.csv > Run-$i.csv

    which obviously doesn't work (but it seemed reasonable to hope). How can I do this?

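    A sketch that sidesteps all the escaping: pass the loop variable in with -v and keep the awk program in single quotes, so $ stays awk's field operator throughout.

        for i in {1..100}; do
            awk -F, -v n="$i" '{ printf "%3d, %12.3f, %12.3f\n", $1, $(n+1), $(n+3) }' \
                All_Runs.csv > "Run-$i.csv"
        done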

  • issue running a batch script to kill a process

    - by user657064
    I am using the following script on the command line to kill a hypothetical notepad process (using a Korn shell on Windows XP, if that matters):

        kill $(tasklist | grep -i notepad.exe | awk '{print $2}')

    Now I take this line and put it into a batch file c:\temp\testkill.bat, thinking that I should just as well be able to kill the process by running the batch file. However, when I run the batch file, I get the following awk error about unbalanced parentheses:

        C:/Temp
        $ ./testkill.bat
        C:\Temp>kill $(tasklist | grep -i notepad.exe | awk '{print $2}')
        awk: unbalanced ()
         Context is:
                {print $2}) <<<
        C:/Temp

    So I'm baffled as to why I get this error about unbalanced parentheses when I run the script via a batch file, but have no issues when I run the command directly from the command line. (Btw, I'm not necessarily tied to this way of killing a process; as a total noob to shell scripting, I am additionally wondering why, if I write the following on the command line:

        tasklist | grep -i notepad.exe | awk '{print $2}' | kill

    the process ID that comes out of the tasklist/grep/awk calls doesn't seem to get properly piped to kill...)

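    A sketch of a likely fix: a .bat file runs under cmd.exe, which doesn't understand $( ), so keep the one-liner in a file run by ksh itself. And since kill takes PIDs as arguments rather than reading them from stdin, xargs bridges the second pipeline (assuming the Unix toolkit in use ships xargs).

        #!/bin/ksh
        # testkill.sh -- run as: ksh testkill.sh (not as a .bat through cmd.exe)
        tasklist | grep -i notepad.exe | awk '{ print $2 }' | xargs kill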

  • AWK: how to reuse a result NR-times without removing END?

    - by HH
    How can I get all differences, not just one? I want to use the calculated result for each item in the third column. The dilemma is that if I remove END I can print $3 but cannot have ave; if I leave END I have ave but not all the differences.

        awk '{sum+=$3} END {ave=sum/NR} END {print $3-ave}' coriolis_data
        -0.00964    // I want to see the rest of the differences, how?

    coriolis_data:

        .105 0.005 0.9766 0.0001 0.595 0.005
        .095 0.005 0.9963 0.0001 0.595 0.005
        .115 0.005 0.9687 0.0001 0.595 0.005
        .105 0.005 0.9693 0.0001 0.595 0.005
        .095 0.005 0.9798 0.0001 0.595 0.005
        .105 0.005 0.9798 0.0001 0.595 0.005
        .095 0.005 0.9711 0.0001 0.595 0.005
        .110 0.005 0.9640 0.0001 0.595 0.005
        .105 0.005 0.9704 0.0001 0.595 0.005
        .090 0.005 0.9644 0.0001 0.595 0.005

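    A sketch: stash each $3 in an array as it goes by, then loop over the array in END once the average is known (in END, NR still holds the total number of records).

        awk '
        { sum += $3; val[NR] = $3 }
        END {
            ave = sum / NR
            for (i = 1; i <= NR; i++) print val[i] - ave
        }' coriolis_data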

  • Best way to parse this particular string using awk / sed?

    - by Jack
    Hi, I need to get a particular version string from a file (call it version.lst) and use it for a comparison in a shell script. For example's sake, the file contains lines that look like this:

        V1.000 -- build date and other info here -- APP1
        V1.000 -- build date and other info here -- APP2
        V1.500 -- build date and other info here -- APP3

    ... and so on. Let's say I am trying to grab the first version (in this case, V1.000) from APP1. Obviously, the versions can change and I want this to be dynamic. What I have right now works:

        var=`cat version.lst | grep " -- APP1" | grep -Eo V[0-9].[0-9]{3}`

    The first grep gets the line containing APP1 and the second grep gets the version string. However, I hear grep is not the way to do this, so I'd like to learn the best way using awk or sed. Any ideas? I am new to both and haven't found a tutorial easy enough to learn the syntax. Do they support egrep? Thanks!

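    An awk sketch: match the line on its APP1 tag and print the first field, which is the version. It assumes the tag sits at the end of the line, as in the sample.

        ver=$(awk '/ -- APP1$/ { print $1 }' version.lst)

    As for the egrep question: awk patterns are extended regular expressions, the same flavour egrep uses, so the matching syntax carries over directly.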
