Search Results

Search found 2253 results on 91 pages for 'grep'.

Spider a Website and Return URLs Only

- by Rob Wilkerson

I'm not quite sure how best to define/articulate this, but I'm looking for a way to pseudo-spider a website. The key is that I don't actually want the content, but rather a simple list of URIs. I can get reasonably close to this idea with Wget using the --spider option, but when piping that output through a grep, I can't seem to find the right magic to make it work:

    wget --spider --force-html -r -l1 http://somesite.com | grep 'Saving to:'

The grep filter seems to have absolutely no effect on the wget output. Have I got something wrong or is there another tool I should try that's more geared towards providing this kind of limited result set? Thanks.

UPDATE: I just found out offline that, by default, wget writes to stderr. I missed that in the man pages (in fact, I still haven't found it if it's in there). Once I redirected stderr to stdout, I got closer to what I need:

    wget --spider --force-html -r -l1 http://somesite.com 2>&1 | grep 'Saving to:'

I'd still be interested in other/better means for doing this kind of thing, if any exist.
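One possible way to reduce the log to bare URLs is sketched here. It assumes each request in wget's log is announced by a line of the form `--<date> <time>--  <URL>`, which may differ between wget versions. The function name `urls_from_wget_log` is made up for this illustration.

```sh
# Print the unique URLs announced in a wget log read from stdin.
# Assumption: request lines look like
#   --2012-11-09 10:43:26--  http://host/path
urls_from_wget_log() {
    awk '/^--[0-9]/ { print $3 }' | sort -u
}

# Usage (wget logs to stderr, hence the 2>&1):
#   wget --spider --force-html -r -l1 http://somesite.com 2>&1 | urls_from_wget_log
```

Filtering on the request lines rather than on 'Saving to:' seems safer here, since in spider mode nothing is saved.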

How to replace pairs of strings in two files to identical IDs?

- by Péter Török

Sorry if the title is not very intelligible; I couldn't come up with anything better. Hopefully my explanation is clear enough.

I have a pair of rather large log files with very similar content, except that some strings differ between the two. A couple of examples:

    UnifiedClassLoader3@19518cc | UnifiedClassLoader3@d0357a
    JBossRMIClassLoader@13c2d7f | JBossRMIClassLoader@191777e

That is, wherever the first file contains UnifiedClassLoader3@19518cc, the second contains UnifiedClassLoader3@d0357a, and so on. [Update] There are about 40 distinct pairs of such identifiers. [/Update]

I want to replace these with identical IDs so that I can spot the really important differences between the two files. That is, I want to replace all occurrences of both UnifiedClassLoader3@19518cc in file1 and UnifiedClassLoader3@d0357a in file2 with UnifiedClassLoader3@1, all occurrences of both JBossRMIClassLoader@13c2d7f in file1 and JBossRMIClassLoader@191777e in file2 with JBossRMIClassLoader@2, etc.

Using the Cygwin shell, so far I have managed to list all the different identifiers occurring in one of the files with

    grep -o -e 'ClassLoader[0-9]*@[0-9a-f][0-9a-f]*' file1.log | sort | uniq

However, now the original order is lost, so I don't know which ID pairs with which in the other file. With grep -n I can get the line number, so the sort would preserve the order of appearance, but then I can't weed out the duplicate occurrences. Unfortunately grep cannot print only the first match of a pattern.

I figured I could save the list of identifiers produced by the above command into a file, then iterate over the patterns in the file with grep -n | head -n 1, concatenate the results and sort them again. The result would be something like

    2 ClassLoader3@19518cc
    137 ClassLoader@13c2d7f
    563 ClassLoader3@1267649
    ...

Then I could (either manually or with sed itself) massage this into a sed command like

    sed -e 's/ClassLoader3@19518cc/ClassLoader3@2/g' -e 's/ClassLoader@13c2d7f/ClassLoader@137/g' -e 's/ClassLoader3@1267649/ClassLoader3@563/g' file1.log > file1_processed.log

and similarly for file2. However, before I start, I would like to verify that my plan is the simplest possible working solution. Is there any flaw in this approach? Is there a simpler way?
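A shorter route to the "order of first appearance" list may be to drop sort and uniq and let awk remove the duplicates instead. This is a sketch, not a full solution; `first_seen_ids` is a hypothetical helper name, and the pattern is the one from the question.

```sh
# List each identifier once, in order of first appearance.
# awk '!seen[$0]++' prints a line only the first time it occurs,
# so the original order is kept and no sort is needed.
first_seen_ids() {
    grep -o -e 'ClassLoader[0-9]*@[0-9a-f][0-9a-f]*' "$1" | awk '!seen[$0]++'
}

# Usage:
#   first_seen_ids file1.log > ids1.txt
#   first_seen_ids file2.log > ids2.txt
#   paste ids1.txt ids2.txt    # Nth ID of file1 next to Nth ID of file2
```

The pairing by position only holds if both logs introduce corresponding identifiers in the same order, which is an assumption worth checking on the real files.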

How to find Tomcat's PID and kill it in python?

- by 4herpsand7derpsago

Normally, one shuts down Apache Tomcat by running its shutdown.sh script (or batch file). In some cases, such as when Tomcat's web container is hosting a web app that does some crazy things with multi-threading, running shutdown.sh gracefully shuts down some parts of Tomcat (as I can see more available memory returning to the system), but the Tomcat process keeps running.

I'm trying to write a simple Python script that:

1. Calls shutdown.sh
2. Runs ps -aef | grep tomcat to find any process with Tomcat referenced
3. If applicable, kills the process with kill -9 <PID>

Here's what I've got so far (as a prototype; I'm brand new to Python, BTW):

    #!/usr/bin/python

    # Imports
    import sys
    import subprocess

    # Load from imported module.
    if __init__ == "__main__":
        main()

    # Main entry point.
    def main():
        # Shutdown Tomcat
        shutdownCmd = "sh ${TOMCAT_HOME}/bin/shutdown.sh"
        subprocess.call([shutdownCmd], shell=true)

        # Check for PID
        grepCmd = "ps -aef | grep tomcat"
        grepResults = subprocess.call([grepCmd], shell=true)

        if(grepResult.length > 1):
            # Get PID and kill it.
            pid = ???
            killPidCmd = "kill -9 $pid"
            subprocess.call([killPidCmd], shell=true)

        # Exit.
        sys.exit()

I'm struggling with the middle part: obtaining the grep results, checking whether there is more than one (since grep always matches its own process, at least one result will always be returned, methinks), and then parsing the returned PID and passing it into killPidCmd. Thanks in advance!

shell script segment to avoid overwriting files

- by johndashen

I have a perl script (or any executable) E which will take a file foo.xml and write a file foo.txt. I use a Beowulf cluster to run E for a large number of XML files, but I'd like to write a simple job server script in shell (bash) which doesn't overwrite existing txt files. I'm currently doing something like

    #!/bin/sh
    PATTERN="[A-Z]*0[1-2][a-j]"; # this matches foo in all cases
    todo=`ls *.xml | grep $PATTERN`;
    isdone=`ls *.foo | grep $PATTERN`;
    whatsleft=todo - isdone; # what's the unix magic?
    # and then call the job server;
    jobserve E "$whatsleft";

and I don't know how to get the difference between $todo and $isdone. I'd prefer using sort/uniq to something like a for loop with grep inside, but I'm not sure how to do it (pipes? temporary files?).

As a bonus question, is there a way to do lookahead search in bash grep?
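The "todo minus isdone" step is a set difference, which comm can compute on two sorted lists. Below is a minimal sketch assuming the finished outputs are the .txt files described above; the function name `pending_jobs` and the temporary file names are illustrative only.

```sh
# Print the base names that have an .xml file but no .txt file yet,
# for the files in directory $1 (default: current directory).
pending_jobs() {
    dir=${1:-.}
    tmp=${TMPDIR:-/tmp}/pending.$$
    # Base names of inputs and of finished outputs, each sorted for comm.
    (cd "$dir" && ls) | sed -n 's/\.xml$//p' | sort > "$tmp.todo"
    (cd "$dir" && ls) | sed -n 's/\.txt$//p' | sort > "$tmp.done"
    # comm -23 keeps the lines that appear only in the first file.
    comm -23 "$tmp.todo" "$tmp.done"
    rm -f "$tmp.todo" "$tmp.done"
}
```

Comparing base names rather than full file names is the design choice that makes the two lists comparable; the output could then be filtered with grep "$PATTERN" before being handed to the job server.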

parsing FireFox bookmarks using regular expression

- by SIFE

I tried to parse a Firefox bookmarks file (the JSON export) with these attempts:

    cat boo.json | grep '\"uri\"\:\"^http\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}\"'
    cat boo.json | grep '"uri"\:"^http\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}'
    cat boo.json | grep '"uri"\:"^http\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}"'

and a few others, but they all fail. The JSON bookmarks file looks like this:

    .........."uri":"http://www.google.com/?"......"uri":"http://stackoverflow.com/"

So the output should be:

    "uri":"http://www.google.com/?"
    "uri":"http://stackoverflow.com/"

What is missing from my regular expression?
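Three things seem to stand in the way of the attempts above: the `^` does not belong in the middle of the pattern, `+` and `{2,3}` need extended regex mode (-E) to act as quantifiers, and without -o grep prints the whole line rather than each match. A hedged sketch follows; `bookmark_uris` is a made-up name.

```sh
# Print every "uri":"http..." pair found on stdin, one per line.
# [^"]* stops at the closing quote, so several pairs on one line
# are reported separately.
bookmark_uris() {
    grep -oE '"uri":"https?://[^"]*"'
}

# Usage:
#   bookmark_uris < boo.json
```

Matching "anything up to the next quote" avoids having to describe valid host names and paths in the regex at all.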

Use a grepped file as an included source in bash

- by Andrew

I'm on a shared webhost where I don't have permission to edit the global bash configuration file at /etc/bashrc. Unfortunately there is one line in the global file, mesg y, which puts the terminal in tty mode and makes scp and similar commands unavailable. My local ~/.bashrc includes the global file as a source, like so:

    # Source global definitions
    if [ -f /etc/bashrc ]; then
        . /etc/bashrc
    fi

My current workaround uses grep to output the global file, sans offending line, into a local file and uses that as a source:

    # Source global definitions
    if [ -f /etc/bashrc ]; then
        grep -v mesg /etc/bashrc > ~/.bash_global
        . ~/.bash_global
    fi

Is there a way to include a grepped file like this without the intermediate step of creating an actual file? Something like this?

    . grep -v mesg /etc/bashrc > ~/.bash_global
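One way to skip the intermediate file is to eval the filtered text directly. This is a sketch under the assumption that the remaining lines of the global file are safe to evaluate as a whole; `source_without` is a hypothetical helper name.

```sh
# Evaluate a file's contents in the current shell, skipping lines
# that match a pattern. No temporary file is created.
source_without() {
    # $1 = pattern of lines to drop, $2 = file to source
    eval "$(grep -v "$1" "$2")"
}

# Usage in ~/.bashrc:
#   if [ -f /etc/bashrc ]; then
#       source_without mesg /etc/bashrc
#   fi
```

In bash, process substitution (`. <(grep -v mesg /etc/bashrc)`) is another option, though how reliably `.` reads from it may depend on the bash version.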

Listing C/C++ functions (Code analysis in Unix)

- by Jond

Whether we're maintaining unfamiliar code or checking out the implementation details of an Apache module, it can help if we can quickly traverse the code and build up an overview of what we're looking at. Grep serves most of my daily needs, but there are some cases where it just won't do. Here's a common example of how it can help. To find the definition of a PHP function I'm interested in, I can type this at the command line:

    grep -r "function myfunc" .

This could be adapted very quickly to C or C++ if we know the return type, but things become more complicated if, say, I want to list every method that my class provides:

    grep "function " ./src/mine.class.php

Since there's no single keyword that denotes a function or method in C++, and because its syntax is generally more complex, I think I'd need some kind of static code analysis tool, smart use of the C preprocessor, or blind faith that the coder followed strict code guidelines (amount of whitespace, position of curlies, etc.) to get these sorts of results. What would you recommend?

P.S. Be nice, this is my first post ;-) :p

partial string matching - R

- by DonDyck

I need to write a query in R to match a partial string in column names. I am looking for something similar to the LIKE operator in SQL. For example, if I know the beginning, middle or end part of the string, I would write the query in SQL as LIKE 'beginning%middle%' and it would return the matching strings. In pmatch or grep it seems I can only specify 'beginning' or 'end', and not the order. Is there a similar function in R?

For example, say I am looking in the vector:

    y <- c("I am looking for a dog", "looking for a new dog", "a dog", "I am just looking")

Let's say I want to write a query which picks "looking for a new dog", and I know the start of the string is "looking" and the end of the string is "dog". If I do grep("dog", y) it will return 1, 2, 3. Is there any way I can specify the beginning and end in grep?

Search files for text matching format of a Unix directory

- by BrandonKowalski

I am attempting to search through all the files in a directory for text matching the pattern of any arbitrary directory. I hope to use the output to make a list of all directories referenced in the files (this part I think I can figure out on my own). I have looked at various regex resources and made my own expression that seems to work in a browser-based tool but not with grep on the command line:

    /\w+[(/\w+)]+

My understanding so far is that the above expression will look for the beginning / of a directory, then look for an indeterminate number of characters, before looking for a repeating block of the same thing. Any guidance would be greatly appreciated.
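Part of the difficulty may be that `[(/\w+)]` is a bracket expression (a set of single characters), not a group, and that grep needs -E for `+` and `( )` to act as operators. A sketch of one alternative is below; the set of characters allowed in a path component and the name `referenced_dirs` are assumptions.

```sh
# Print every absolute-path-like string found in the files under
# directory $1, without duplicates.
# -r recurses, -h drops file names, -o prints only the matched text,
# -E enables + and ( ) as regex operators.
referenced_dirs() {
    grep -rhoE '(/[[:alnum:]_.-]+)+' "$1" | sort -u
}
```

The pattern reads as "one or more repetitions of a slash followed by a name", which is the repeating block described above.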

Search all files containing text

- by enthdegree

With Busybox, how do you search for an expression within a bunch of files recursively through a bunch of directories, but only look through text files? We don't know what the file's suffix is going to be; it could be .sh, it could be nothing, it could be something else. I was considering somehow basing the search on encoding although I am not quite sure what the encoding would be either. I've tried busybox grep -r but it searches through binary files too, which wastes a lot of time.

tar - exclude certain files

- by Alan

I wish to tar all files in a directory and its subdirectories that do NOT end in .jpg, .bmp, .gif, or .png. So, given the following folders and files:

    foo/file.txt
    foo/file.gif
    foo/bar/file
    foo/bar/image.jpg

I want to tar only the files file.txt and file; file.gif and image.jpg should be ignored. I would also like to maintain the folder structure.

My first thought was to pipe the results of the find command through grep -v ".jpg|.gif|.bmp|.png" into a text file, and then use the tar include argument to feed it that list of files. However, the results of the grepped find command also contain directories (in the example above, "foo" and "foo/bar"), and when a directory is fed to tar, it includes all files in that directory, so I would end up with a tar file containing all of the files, which is not what I want.

Is there any way to prevent find from outputting directories? Is there a far easier way to approach this?
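find can do both jobs itself: -type f restricts the output to regular files, and negated -name tests replace the grep. A minimal sketch, with `non_image_files` as an illustrative name:

```sh
# List regular files under $1 whose names do not end in an image
# extension. -type f leaves directories out of the list, so tar
# never receives a directory to recurse into.
non_image_files() {
    find "$1" -type f \
        ! -name '*.jpg' ! -name '*.bmp' ! -name '*.gif' ! -name '*.png'
}

# Usage (GNU tar reads the list of names from stdin with -T -):
#   non_image_files foo | tar -cf archive.tar -T -
```

Because each listed path still carries its directory prefix, the folder structure is preserved in the archive.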

How to search a text file for strings between two tokens in Ubuntu terminal and save the output?

- by Blue

How can I search a text file for this pattern in Ubuntu terminal and save the output as a text file? I'm looking for everything between the string "abc" and the string "cde" in a long list of data. For example: blah blah abc fkdljgn cde blah blah blah blah blah blah abc skdjfn cde blah In the example above I would be looking for an output such as this: fkdljgn skdjfn It is important that I can also save the data output as a text file. Can I use grep or agrep and if so, what is the format?
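A sketch of one approach using awk rather than grep, so that several abc ... cde pairs on the same line are handled. The function name `between` is hypothetical, and trimming the surrounding spaces is an assumption about the desired output.

```sh
# Print the text found between each start token and the next end
# token, one result per line. Reads stdin.
between() {
    # $1 = start token, $2 = end token
    awk -v s="$1" -v e="$2" '{
        line = $0
        while ((i = index(line, s)) > 0) {
            rest = substr(line, i + length(s))
            j = index(rest, e)
            if (j == 0) break
            out = substr(rest, 1, j - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", out)   # trim surrounding blanks
            print out
            line = substr(rest, j + length(e))
        }
    }'
}

# Usage, saving the result as a text file:
#   between abc cde < data.txt > output.txt
```

The tokens are treated as plain strings here, not as regular expressions.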

How can I quickly find the first line of a file that matches a regex?

- by lamcro

I want to search for a line in a file, using regex, inside a Perl script. Assuming it is on a system with grep installed, is it better to:

- call the external grep through an open() command, or
- open() the file directly and use a while loop and an if ($line =~ m/regex/)?

Count number of occurrences of a pattern in a file (even on same line)

- by jrdioko

When searching for the number of occurrences of a string in a file, I generally use:

    grep pattern file | wc -l

However, this only finds one occurrence per line, because of the way grep works. How can I search for the number of times a string appears in a file, regardless of whether the occurrences are on the same or different lines? Also, what if I'm searching for a regex pattern, not a simple string? How can I count those, or, even better, print each match on a new line?
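grep's -o option prints each match on its own line, which covers both the counting and the "one match per line" request. A small sketch, with `count_matches` as a made-up name:

```sh
# Count every match of a pattern in a file, including several
# matches on the same line.
# grep -o prints each match on its own line; wc -l counts them.
count_matches() {
    # $1 = pattern (basic regex), $2 = file
    grep -o -e "$1" "$2" | wc -l | tr -d ' '
}

# Usage:
#   count_matches pattern file
#   grep -o pattern file        # print each match on a new line
```

The tr call only strips the padding that some wc implementations put around the number.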

search a string in a file as ignoring the lines beginning with #

- by ephieste

I want to find a string such as "qwertty=" in a file with "awk" or "grep", but I don't want to see the lines with #. Please see the example:

    grep -ni "qwertty" /aaa/bbb
    798:# * qwertty - enable/disable
    1222:#qwertty=1
    1223:qwertty=2
    1224:#qwertty=3

I want to find line 1223. What should the search query be for this purpose?
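One possibility is to require that no # appears before the key on the line. A sketch, with `active_setting` as an illustrative name:

```sh
# Show numbered lines of a file that set a key and are not
# commented out.
# ^[^#]* allows only non-# characters before the key, so any line
# where a # precedes the key is skipped.
active_setting() {
    # $1 = key name, $2 = file
    grep -n -i -e "^[^#]*$1=" "$2"
}

# Usage:
#   active_setting qwertty /aaa/bbb
```

This also skips lines where the comment sign is indented, which a plain `grep -v '^#'` filter would let through.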

grep a rar in cygwin

- by Tomer

Hi, I want to grep text files inside a rar without extracting the rar file to disk. I tried a couple of combinations with pipes, but they didn't work. For example, I tried

    unrar e myrar.rar | grep mysearchedline

but it actually extracted the archive to disk. I don't want to extract it to disk; I don't have enough space for that (it's really big, with really big logs). Thanks.

How to grep curl -I header information

- by Mint

I'm trying to get the redirect link from a site by using curl -I, then grep for "location", and then sed out the location text so that I am left with the URL. But this doesn't work: it prints the URL to the screen and doesn't put it where I expect.

    test=$(curl -I "http://www.redirectURL.com/" 2> /dev/null | grep "location" | sed -E 's/location:[ ]+//g')
    echo "1..$test..2"

Which then outputs:

    ..2http://www.newURLfromRedirect.com/bla

What's going on?
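The output above is consistent with a stray carriage return: HTTP header lines end in CR LF, so the captured value ends in a CR, which sends the cursor back to the start of the line before "..2" is printed over "1..". A hedged sketch that strips it follows; `redirect_target` is a hypothetical name.

```sh
# Read HTTP response headers on stdin and print the Location value.
# tr -d '\r' removes the carriage return that ends each header line.
redirect_target() {
    tr -d '\r' | grep -i '^location:' | sed 's/^[^:]*:[ ]*//'
}

# Usage:
#   test=$(curl -sI "http://www.redirectURL.com/" | redirect_target)
#   echo "1..$test..2"
```

Matching the header name case-insensitively is a precaution, since servers differ in how they capitalize it.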

While loop read multiple lines from a grep

- by Basil

I'm writing a script in AIX 5.3 that will loop through the output of a df and check each volume against another config file. If the volume appears in the config file, it will set a flag which is needed later in the script. If my config file only has a single column and I use a for loop, this works perfectly. My problem, however, is that if I use a while read loop to populate more than one variable per line, any variables I set between the while and the done are discarded.

For example, assuming the contents of /netapp/conf/ExcludeFile.conf are a bunch of lines containing two fields each:

    volName="myVolume"
    utilization=70
    thresholdFlag=0

    grep "$volName" /netapp/conf/ExcludeFile.conf | while read vol threshold; do
        if [ $utilization -ge $threshold ] ; then
            thresholdFlag=1
        fi
    done

    echo "$thresholdFlag"

In this example, thresholdFlag will always be 0, even if the volume appears in the file and its utilization is greater than the threshold. I could add an echo "setting thresholdFlag to 1" in there and see the echo, and it will still echo a 0 at the end.

Is there a clean way to do this? I think my while loop is being run in a subshell, and changes I make to variables in there are actually being made to local variables that are discarded after the done.
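If the loop is indeed running in a subshell because of the pipe (which depends on the shell in use), one workaround is to feed grep's output to the loop through a here-document instead. A sketch, with `check_threshold` as a made-up function name:

```sh
# Set thresholdFlag=1 if the utilization meets the threshold listed
# for a volume in the config file.
# The here-document feeds the grep output to the loop without a
# pipe, so the loop body runs in the current shell and its
# assignments survive after "done".
check_threshold() {
    # $1 = volume name, $2 = utilization, $3 = config file
    thresholdFlag=0
    while read -r vol threshold; do
        [ -n "$threshold" ] || continue   # skip empty or short lines
        if [ "$2" -ge "$threshold" ]; then
            thresholdFlag=1
        fi
    done <<EOF
$(grep "$1" "$3")
EOF
}

# Usage:
#   check_threshold myVolume 70 /netapp/conf/ExcludeFile.conf
#   echo "$thresholdFlag"
```

Writing the grep output to a temporary file and redirecting the loop from that file would work along the same lines.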

finding files that match a precise size: a multiple of 4096 bytes

- by doub1ejack

I have several Drupal sites running on my local machine with WAMP installed (Apache 2.2.17, PHP 5.3.4, and MySQL 5.1.53). Whenever I try to visit the administrative page, the PHP process seems to die. From apache_error.log:

    [Fri Nov 09 10:43:26 2012] [notice] Parent: child process exited with status 255 -- Restarting.
    [Fri Nov 09 10:43:26 2012] [notice] Apache/2.2.17 (Win32) PHP/5.3.4 configured -- resuming normal operations
    [Fri Nov 09 10:43:26 2012] [notice] Server built: Oct 24 2010 13:33:15
    [Fri Nov 09 10:43:26 2012] [notice] Parent: Created child process 9924
    [Fri Nov 09 10:43:26 2012] [notice] Child 9924: Child process is running
    [Fri Nov 09 10:43:26 2012] [notice] Child 9924: Acquired the start mutex.
    [Fri Nov 09 10:43:26 2012] [notice] Child 9924: Starting 64 worker threads.
    [Fri Nov 09 10:43:26 2012] [notice] Child 9924: Starting thread to listen on port 80.

Some research has led me to a PHP bug report on the '4096 byte bug'. I would like to see if I have any files whose size is a multiple of 4096 bytes, but I don't know how to do that. I have Git Bash installed and can use most of the typical Linux tools through that (find, grep, etc.), but I'm not familiar enough with Linux to figure it out on my own. Little help?
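A sketch of one way to do this with find and wc, which should both be available in a Git Bash style environment. The function name `files_multiple_of_4096` is illustrative, and skipping empty files is an assumption about what is wanted.

```sh
# Print regular files under $1 whose size is a non-zero multiple
# of 4096 bytes.
files_multiple_of_4096() {
    find "$1" -type f | while IFS= read -r f; do
        # wc -c reports the size in bytes; tr strips any padding.
        size=$(wc -c < "$f" | tr -d ' ')
        if [ "$size" -gt 0 ] && [ $((size % 4096)) -eq 0 ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Usage:
#   files_multiple_of_4096 /path/to/drupal/site
```

Reading every file with wc is slow on large trees, but it avoids relying on find options that may not exist in every build.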

Cron job checking for changes in Git repository

- by HNygard

We have just moved our server configs to a Git repository. Therefore there should not be any changes in any of the repository folders. I was thinking about how I could set up a cron job to check for any uncommitted changes.

How could a cron job be set up to check for changes in a Git repository? Grepping the output of the git status command might just do it. Grep and cron jobs are not my strong suit.

Here are some sample outputs from git status. Standing in the folder containing the git repository (e.g. /path/gitrepo/) with changed files:

    $ git status
    # On branch master
    # Changes not staged for commit:
    #   (use "git add <file>..." to update what will be committed)
    #   (use "git checkout -- <file>..." to discard changes in working directory)
    #
    #       modified:   apache2/sites-enabled/000-default
    #
    # Untracked files:
    #   (use "git add <file>..." to include in what will be committed)
    #
    #       apache2/conf.d/test
    no changes added to commit (use "git add" and/or "git commit -a")

Standing in the folder when there are no changes:

    $ git status
    # On branch master
    nothing to commit (working directory clean)

Update: Being synced up with origin is not important. There should be no local changes. Local files that must be in place go into the .gitignore file. In addition to the server configs there are also git repos for content (static web sites, web apps, WordPress, etc.). None of the repositories should have local changes. We might use Puppet in the long run, since it's being used for development of one of the web apps.
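Rather than grepping the human-readable output, `git status --porcelain` may be easier to script against: it prints one line per changed or untracked path and nothing at all for a clean tree. A sketch, assuming a git version that supports the -C option; `repo_has_changes` is a hypothetical name.

```sh
# Succeed (exit status 0) when the repository at $1 has uncommitted
# or untracked changes, fail otherwise.
# --porcelain output is empty for a clean working tree.
repo_has_changes() {
    [ -n "$(git -C "$1" status --porcelain 2>/dev/null)" ]
}

# Usage in a script run from cron (path is a placeholder):
#   if repo_has_changes /path/gitrepo; then
#       echo "Local changes in /path/gitrepo"
#   fi
```

Since cron normally mails whatever a job prints, echoing only when changes exist would turn this into a simple alert.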

Extract a specific string from a curl'd result

- by allentown

Given this curl command:

    curl --user-agent "fogent" --silent -o page.html "http://www.google.com/search?q=insansiate"

(The spelling is intentionally incorrect; I want to grab the suggestion as my result.)

I want to be able to either grep into the page.html file, perhaps with grep -oE, or pipe it right from curl and never store a file. The result should be: 'instantiate'. I need only the word 'instantiate', or the phrase; whatever Google is auto-correcting to is what I am after.

Here is the basic HTML that is returned:

    <span class=spell style="color:#cc0000">Did you mean: </span><a href="/search?hl=en&ie=UTF-8&&sa=X&ei=VEMUTMDqGoOINraK3NwL&ved=0CB0QBSgA&q=instantiate&spell=1"class=spell><b><i>instantiate</i></b></a>  <span class=std>Top 2 results shown</span>

So perhaps from/to of the string below, which I hope is unique enough to cover all my bases:

    class=spell><b><i>instantiate</i></b></a>

I keep running into issues with greedy grep; perhaps I should run it through an HTML prettify tool first to get a line break or 50 in there. I don't know of any simple way to do so in bash, which is what I would ideally like this to be in. I really don't want to deal with firing up perl and making sure I have the correct module. Any suggestions? Thank you.
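A sed sketch that captures the text inside the marker quoted above. It assumes the markup stays exactly as in the sample, which is fragile; `spelling_suggestion` is a made-up name.

```sh
# Read result-page HTML on stdin and print the suggested spelling.
# [^<]* cannot run past the next tag, which sidesteps the
# greedy-match problem without needing line breaks in the HTML.
spelling_suggestion() {
    sed -n 's/.*class=spell><b><i>\([^<]*\)<\/i>.*/\1/p'
}

# Usage, straight from curl with no file stored:
#   curl --user-agent "fogent" --silent "http://www.google.com/search?q=insansiate" | spelling_suggestion
```

Using a negated character class instead of `.*` for the captured part is the usual way to get non-greedy behavior out of tools that lack it.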

Under *nix, how can I find a string within a file within a directory ?

- by roberto

Hi all. I'm using Ubuntu Linux, and I use bash from within a terminal emulator every day for many tasks. I would like to know how to find a string or a substring within a file that is within a particular directory. If I knew the file which contained my target substring, I would just cat the file and pipe it through grep, thus:

    cat file | grep mysubstring

But in this case, the pesky substring could be anywhere within a known directory. How do I hunt down my substring?
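grep can walk the directory itself with -r, so no cat is needed. A minimal sketch, with `find_in_dir` as an illustrative name:

```sh
# Print file:line:text for every line under directory $1 that
# contains the string $2.
# -r recurses into subdirectories, -n adds line numbers,
# -F treats the string literally instead of as a regex.
find_in_dir() {
    grep -rnF -e "$2" "$1"
}

# Usage:
#   find_in_dir /known/directory mysubstring
```

Replacing -n with -l would list only the names of the matching files.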

Find -type d with no subfolders

- by titatom

Good morning! This is a simple one I believe, but I am still a noob :) I am trying to find all folders with a certain name. I am able to do this with the command

    find /path/to/look/in/ -type d | grep .texturedata

The output gives me lots of folders like this:

    /path/to/look/in/.texturedata/v037/animBMP

But I would like it to stop at .texturedata:

    /path/to/look/in/.texturedata/

I have hundreds of these paths and would like to lock them down by piping the output of grep into chmod 000. I was given a command with the argument -dpe once, but I have no idea what it does, and the Internet has not been able to help me determine its usage. Thank you very much for your help!
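find can match on the directory name directly and be told not to descend any further, which removes the need for grep. A sketch, with `texturedata_dirs` as a made-up name:

```sh
# Print each directory named .texturedata under $1 without
# descending into it.
# -name matches the final path component only; -prune stops find
# from listing anything below a match.
texturedata_dirs() {
    find "$1" -type d -name .texturedata -prune -print
}

# Usage (try it without the chmod first to review the list):
#   texturedata_dirs /path/to/look/in/
#   find /path/to/look/in/ -type d -name .texturedata -prune -exec chmod 000 {} +
```

Letting find run chmod itself avoids trouble with spaces in directory names that a plain pipe into xargs could have.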

Finding a integer number after a beginning t=

- by user2966696

I have a string like this:

    33 00 4b 46 ff ff 03 10 30 t=25562

I am only interested in the five digits at the very end, after the t=. How can I get these digits out of it with a regular expression? I tried

    grep t=.....

but I also got all the other characters, including the t= at the beginning, which I would like to drop. After finding that five-digit number, I would like to divide it by 1000, so in the above case the result would be 25.562. Is this possible with grep and regular expressions? Thanks for your help.
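grep by itself cannot do the division, so one option is to let sed extract the digits and awk do the arithmetic. A sketch under the assumption that t=<digits> is always the last thing on the line; `reading_in_thousands` is a hypothetical name.

```sh
# Read lines ending in t=<digits> on stdin and print the number
# divided by 1000, with three decimal places.
reading_in_thousands() {
    sed -n 's/.*t=\([0-9][0-9]*\)$/\1/p' |
        awk '{ printf "%.3f\n", $1 / 1000 }'
}

# Usage:
#   echo '33 00 4b 46 ff ff 03 10 30 t=25562' | reading_in_thousands
```

Lines without a trailing t= value produce no output, because sed -n prints only when the substitution succeeds.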

how to find a text string which may be present in some unknown file in entire filesystem

- by Registered User

I am stuck with a problem. I have a line 'something' in some file, but I have forgotten which file. I would like to search the entire root file system to find out which file contains this line and where it is. How can I do this? I have used find before, but then I knew the name of the file; in this case I do not know the file name either. It is an Ubuntu Server 10.04 machine. What can I do to find out which file contains this string?
