Search Results

Search found 48264 results on 1931 pages for 'text search'.

Page 167/1931 | < Previous Page | 163 164 165 166 167 168 169 170 171 172 173 174  | Next Page >

  • (Python) Extracting Text from Source Code?

    - by zhuyxn
    Currently have a large webpage whose source code is ~200,000 lines of almost all (if not all) HTML. More specifically, it is a webpage whose content is a few thousand blocks of paragraphs separated by line breaks (though a line break does not specifically mean there is a separation in content) My main objective is to extract text from the source code as if I were copying/pasting the webpage into a text editor. There is another parsing function I would like to use, which originally took in copied/pasted text rather than the source code. To do this, I'm currently using urllib2, and calling .get_text() in Beautiful Soup. The problem is, Beautiful Soup is leaving tremendous amounts of white space in my code, and it is difficult to pass the result into the second "text" parser. I have done quite a bit of research on parsing HTMLs, but I'm frankly not sure how to solve this problem easily. Furthermore, I'm a bit confused on how to use imports like lxml to extract text as if I were to simply copy and paste?

    Read the article

  • % _ in search form displays all results

    - by fusion
    if the search form is blank, it should display an error that something should be entered by the user. it should only show those results which contain the keywords the user has entered in the search textbox. however, if the user enters % or _ or +, it displays all results. how do i display an error when the user enters these wildcard characters? my search php code: $search_result = ""; $search_result = $_GET["q"]; $search_result = trim($search_result); if ($search_result == "") { echo "<p>Search Error</p><p>Please enter a search...</p>" ; exit(); } $result = mysql_query('SELECT cQuotes, vAuthor, cArabic, vReference FROM thquotes WHERE cQuotes LIKE "%' . mysql_real_escape_string($search_result) .'%" ORDER BY idQuotes DESC', $conn) or die ('Error: '.mysql_error()); // there's either one or zero records. Again, no need for a while loop function h($s) { echo htmlspecialchars($s, ENT_QUOTES); } ?> <div class="caption">Search Results</div> <div class="center_div"> <table> <?php while ($row= mysql_fetch_array($result)) { ?> <tr> <td style="text-align:right; font-size:15px;"><?php h($row['cArabic']); ?></td> <td style="font-size:16px;"><?php h($cQuotes); ?></td> <td style="font-size:12px;"><?php h($row['vAuthor']); ?></td> <td style="font-size:12px; font-style:italic; text-align:right;"><?php h($row['vReference']); ?></td> </tr> <?php } ?> </table> <?php ?> </div>

    Read the article

  • Optimizing a lot of Scanner.findWithinHorizon(pattern, 0) calls

    - by darvids0n
    I'm building a process which extracts data from 6 csv-style files and two poorly laid out .txt reports and builds output CSVs, and I'm fully aware that there's going to be some overhead searching through all that whitespace thousands of times, but I never anticipated converting about about 50,000 records would take 12 hours. Excerpt of my manual matching code (I know it's horrible that I use lists of tokens like that, but it was the best thing I could think of): public static String lookup(List<String> tokensBefore, List<String> tokensAfter) { String result = null; while(_match(tokensBefore)) { // block until all input is read if(id.hasNext()) { result = id.next(); // capture the next token that matches if(_matchImmediate(tokensAfter)) // try to match tokensAfter to this result return result; } else return null; // end of file; no match } return null; // no matches } private static boolean _match(List<String> tokens) { return _match(tokens, true); } private static boolean _match(List<String> tokens, boolean block) { if(tokens != null && !tokens.isEmpty()) { if(id.findWithinHorizon(tokens.get(0), 0) == null) return false; for(int i = 1; i <= tokens.size(); i++) { if (i == tokens.size()) { // matches all tokens return true; } else if(id.hasNext() && !id.next().matches(tokens.get(i))) { break; // break to blocking behaviour } } } else { return true; // empty list always matches } if(block) return _match(tokens); // loop until we find something or nothing else return false; // return after just one attempted match } private static boolean _matchImmediate(List<String> tokens) { if(tokens != null) { for(int i = 0; i <= tokens.size(); i++) { if (i == tokens.size()) { // matches all tokens return true; } else if(!id.hasNext() || !id.next().matches(tokens.get(i))) { return false; // doesn't match, or end of file } } return false; // we have some serious problems if this ever gets called } else { return true; // empty list always matches } } Basically wondering how I would work in an efficient string search (Boyer-Moore or similar). My Scanner id is scanning a java.util.String, figured buffering it to memory would reduce I/O since the search here is being performed thousands of times on a relatively small file. The performance increase compared to scanning a BufferedReader(FileReader(File)) was probably less than 1%, the process still looks to be taking a LONG time. I've also traced execution and the slowness of my overall conversion process is definitely between the first and last like of the lookup method. In fact, so much so that I ran a shortcut process to count the number of occurrences of various identifiers in the .csv-style files (I use 2 lookup methods, this is just one of them) and the process completed indexing approx 4 different identifiers for 50,000 records in less than a minute. Compared to 12 hours, that's instant. Some notes (updated): I don't necessarily need the pattern-matching behaviour, I only get the first field of a line of text so I need to match line breaks or use Scanner.nextLine(). All ID numbers I need start at position 0 of a line and run through til the first block of whitespace, after which is the name of the corresponding object. I would ideally want to return a String, not an int locating the line number or start position of the result, but if it's faster then it will still work just fine. If an int is being returned, however, then I would now have to seek to that line again just to get the ID; storing the ID of every line that is searched sounds like a way around that. Anything to help me out, even if it saves 1ms per search, will help, so all input is appreciated. Thankyou! Usage scenario 1: I have a list of objects in file A, who in the old-style system have an id number which is not in file A. It is, however, POSSIBLY in another csv-style file (file B) or possibly still in a .txt report (file C) which each also contain a bunch of other information which is not useful here, and so file B needs to be searched through for the object's full name (1 token since it would reside within the second column of any given line), and then the first column should be the ID number. If that doesn't work, we then have to split the search token by whitespace into separate tokens before doing a search of file C for those tokens as well. Generalised code: String field; for (/* each record in file A */) { /* construct the rest of this object from file A info */ // now to find the ID, if we can List<String> objectName = new ArrayList<String>(1); objectName.add(Pattern.quote(thisObject.fullName)); field = lookup(objectSearchToken, objectName); // search file B if(field == null) // not found in file B { lookupReset(false); // initialise scanner to check file C objectName.clear(); // not using the full name String[] tokens = thisObject.fullName.split(id.delimiter().pattern()); for(String s : tokens) objectName.add(Pattern.quote(s)); field = lookup(objectSearchToken, objectName); // search file C lookupReset(true); // back to file B } else { /* found it, file B specific processing here */ } if(field != null) // found it in B or C thisObject.ID = field; } The objectName tokens are all uppercase words with possible hyphens or apostrophes in them, separated by spaces. Much like a person's name. As per a comment, I will pre-compile the regex for my objectSearchToken, which is just [\r\n]+. What's ending up happening in file C is, every single line is being checked, even the 95% of lines which don't contain an ID number and object name at the start. Would it be quicker to use ^[\r\n]+.*(objectname) instead of two separate regexes? It may reduce the number of _match executions. The more general case of that would be, concatenate all tokensBefore with all tokensAfter, and put a .* in the middle. It would need to be matching backwards through the file though, otherwise it would match the correct line but with a huge .* block in the middle with lots of lines. The above situation could be resolved if I could get java.util.Scanner to return the token previous to the current one after a call to findWithinHorizon. I have another usage scenario. Will put it up asap.

    Read the article

  • Can i use the open source rich text editor in my site ? what does this GNU do with that ?

    - by Shyju
    I want to use one rich text editor to one of my ASP page.I searched in internet and found that there are lot of open source items available like TinyMCE,FCK Editror,nice edit etc.. Can i put the same samples in my website.There is a GNU license associated with it.Can somebody interpret it for me to answer these questions 1 . Can i use it in my website without getting a permission from anyone ? 2 . Do i need maintain the same files always ? Can i make some customization and use it ? Customization is only the CSS changes its going to be. 3 . Do i need to put all the files as it is of the package which downloaded.because my download has samples which supports all languages. Do i need to put all those folders too From GNU license " if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights."

    Read the article

  • Error "403 Forbidden" on Sharepoint Search Settings Page

    - by user21924
    Hello I thought I had solved this nightmare by re-entering the values in my SSP properties set up, however accessing the Search Settings page error has reared it ugly head again. Now all solutions point to this method listed here * http://www.routtlogics.com/blog/Lists/Posts/Post.aspx?ID=6 * http://social.technet.microsoft.com/Forums/en-US/sharepointadmin/thread/f00651cd-e452-45b9-b19e-90e89c3c3ad4 * http://blogs.technet.com/sushrao/archive/2009/03/26/microsoft-office-sharepoint-server-2007-moss-403-forbidden-error-when-clicked-on-search-settings-page.aspx The above workaround(s) basically states that granting the local group WSS_WPG read and write permission to the Task folder in the Windows directory would solve the problem, however whenever I try to change to the permission attribute of this folder I get an access denied message, even when logged in as a Domain administrator, Enterprise and even the SharePoint Farm administrator. Please guys how do I get around this access denied issue. Thanks

    Read the article

  • Error "403 Forbidden" on Sharepoint Search Settings Page

    - by user21924
    Hello I thought I had solved this nightmare by re-entering the values in my SSP properties set up, however accessing the Search Settings page error has reared it ugly head again. Now all solutions point to this method listed here * http://www.routtlogics.com/blog/Lists/Posts/Post.aspx?ID=6 * http://social.technet.microsoft.com/Forums/en-US/sharepointadmin/thread/f00651cd-e452-45b9-b19e-90e89c3c3ad4 * http://blogs.technet.com/sushrao/archive/2009/03/26/microsoft-office-sharepoint-server-2007-moss-403-forbidden-error-when-clicked-on-search-settings-page.aspx The above workaround(s) basically states that granting the local group WSS_WPG read and write permission to the Task folder in the Windows directory would solve the problem, however whenever I try to change to the permission attribute of this folder I get an access denied message, even when logged in as a Domain administrator, Enterprise and even the SharePoint Farm administrator. Please guys how do I get around this access denied issue. Thanks

    Read the article

  • A desktop Wiki editor/viewer: is there anything out there?

    - by MrBertie
    I'm a big user of wikis, mainly Dokuwiki, I really like the clarity and ease of use of simple text files. However all good wikis seem to require a web-server of some kind; has anyone come across a good desktop wiki editor/viewer that work with plain-text files, and allow me to work with wiki text files just like any other document file type (note: not a desktop wiki running inside a local webserver) Before you rush to suggest (I hope!) I have done months of research on this and have tried Wixi, Wikidpad, zulupad.... Any ideas anyone?

    Read the article

  • Is there any desktop version available for wsywyg editors

    - by user825904
    I have been searching for long and i am not able to find one editor which i can always keep open for daily notes keeping with some codes and which highlights the code Just like we have our editor where we ask questions. ALl is want is I can write some text as normal Then i have few php code which i can write and then click some code button or like we have ctrl +K and i gets higlighted then on the same page i write some paragraph text as normal withoout higligting and some bullet points Then again i get some c++ code and i wrap that around on the same page and page grows daily Currently i use notepad++ and it does not support syntx higlighting of different langages on same page and on individual blocks. it highights all the file even the normal text as well basically i am looking wsywyg editor but as windows desktop application

    Read the article

  • openldap search acl

    - by Patrick
    I'm trying to write an access control for OpenLDAP to allow a user to search with a certain base dn, but only get results back from certain sub dn's. I've played with lots of different rules but cant get it to work. I'm not sure its even possible. For example: I have the user with the dn uid=testuser,ou=people,dc=example,dc=com. I want this user to be able to search with a base of dc=example,dc=com and get back entries in ou=people,dc=example,dc=com. There are lots of other sub OUs under dc=example,dc=com, but only entries in ou=people should be returned (for bonus, I'd only like certain attributes to be returned as well). Can this be done?

    Read the article

  • Disable Bing search entirely from IE when using sysprep

    - by takesides
    I'm creating a Windows Server 2012 R2 image for our product (an all-in-one appliance installed on server hardware) and I am having incredible trouble with getting Bing removed from Internet Explorer. Our clients do not want Bing at all, not as a default search, not as a "search from address bar" feature, not in any sense. I've managed to configure everything else that I want in IE using WAIK to generate an unattend XML file but this has me completely stuck. Ideally I want Bing removed from IE altogether but I would make do with it just being completely disabled. I can find no options in either WAIK or Group Policy to configure this as I need it. Any ideas gratefully received.

    Read the article

  • LibreOffice Calc SEARCH and FIND functions

    - by TTT
    I am trying to process some data in Calc. One of the steps involve finding if a certain string is part of one of the column. I tried using FIND and SEARCH functions. Both behave in the same way and I am not getting correct results. E.g. Say I have following strings in Column A NY SF LON CAN US and am trying to put following formula in column C =SEARCH("NY",A2) The result is - cell C2 will have 1 (which is correct) but if the same formula is copied to other cells in column C - it gives me "#VALUE!" error and I am unable to find out why ? Any one has any ideas ? Thanks in advance TT

    Read the article

  • Search /usr/local/texlive directory with Spotlight

    - by Teake Nutma
    Having TeXLive installed on my Mac, I frequently need to consult documentation for some of the packages. It seems silly to Google this when I have the PDFs all on my HDD in /usr/local/texlive/2011/texmf-dist/doc , so I want to be able to use Spotlight to search for them. However, I can't get Spotlight to cooperate. I tried mdimport /usr/local/texlive/2011/texmf-dist/doc which then does some work, but afterwards doesn't display any results in Spotlight. I've also added the folder in Alfred's search scope to no avail. Any ideas?

    Read the article

  • custom adm file for IE search providers on Windows 2003

    - by filipv
    Hi, I have been stumped by this issue for some time, I created a custom ADM template for a customer to populate the search providers in Internet Explorer 7 and 8. The custom ADM works fine and I have set 3 search providers, the problem is that I cannot change them (the customer wants to change the entry for wikipedia from EN to NL), I edited the ADM file but the clients seem to keep on using the old settings. Removing/replacing the adm file has no effect either, the settings remain. I used the following article as a base: http://support.microsoft.com/kb/918238 For IE8 you need to work with a name instead of a UID for the default entry, that works as expected. but there seems to be no way to change the setting once in place.

    Read the article

  • Meaning of "*" in Windows 7 Explorer Search?

    - by Pumbaa80
    I have a folder containing files like radiobutton-clicked.png radiobutton-foobar.png radiobutton-foobarbaz.png ... etc. This is what happens when I search in Windows Explorer: radio: all files found radio*: all files found *button: all files found *radiobutton*: all files found radiobutton*: no results radiobutton: no results radio*button: all files found So what the hell does the * precisely do? Is there some documentation on this? And why does radio and radio*button work as a search term, but radiobutton not? Edit: I know that * is usually supposed to be a wildcard matching 0 or more characters. But obviously it doesn't in this case.

    Read the article

  • How can I edit files directly from find results

    - by Sergio Tulentsev
    Let's say, I want to replace one string with another. Normally I use TextMate, but today, I don't remember why, I decided to run the search in Sublime Text 2 (which I installed some time ago, but haven't used since). I was impressed with "Find Results" window, seems more useful that textmate's. Then I tried to edit text right inside that window and (I can swear on my macbook) it worked (I was able to commit those changes to git repo)! I thought then, "Wow, that's a really nice feature. Should I migrate from TM already?". Unfortunately, I wasn't able to reproduce this behaviour later. I can still edit text in "find results" window, but changes don't persist (that is, original files are not touched). Am I doing something wrong now, or was it caffeine-induced hallucination earlier in the day?

    Read the article

  • A desktop Wiki editor/viewer: is there anything out there?

    - by MrBertie
    I'm a big user of wikis, mainly Dokuwiki, I really like the clarity and ease of use of simple text files. However all good wikis seem to require a web-server of some kind; has anyone come across a good desktop wiki editor/viewer that work with plain-text files, and allow me to work with wiki text files just like any other document file type (note: not a desktop wiki running inside a local webserver) Before you rush to suggest (I hope!) I have done months of research on this and have tried Wixi, Wikidpad, zulupad.... Any ideas anyone?

    Read the article

  • How do you create virtual folders from saved search

    - by Jérôme Radix
    I would like to have on unix-like platforms, the same functionality as to Windows 7 Library folders (aka virtual folders) you see in Windows Explorer. Gnome Nautilus do that kind of virtual folders through saved search. But I want a system-wide solution, not a gnome-wide solution. Is there a tool that creates virtual folders from the concatenation of multiple search queries (the result of multiple find commands ?). The solution should index files for better performances and you should be able to define the default folder for copy operations. I assume the solution of this kind of problem certainly use FUSE, but I can't see a complete solution to this kind of task in FUSE applications.

    Read the article

  • How do you create virtual folders from saved search

    - by Jérôme Radix
    I would like to have on unix-like platforms, the same functionality as to Windows 7 Library folders (aka virtual folders) you see in Windows Explorer. Gnome Nautilus do that kind of virtual folders through saved search. But I want a system-wide solution, not a gnome-wide solution. Is there a tool that creates virtual folders from the concatenation of multiple search queries (the result of multiple find commands ?). The solution should index files for better performances and you should be able to define the default folder for copy operations. I assume the solution of this kind of problem certainly use FUSE, but I can't see a complete solution to this kind of task in FUSE applications.

    Read the article

  • page up/down print ~ instead of history search in terminal

    - by Desmond
    I am on a Macbook Pro with mac os x 10.8.2 I have set: page up: \033[5~ page down: \033[6~ in terminal keyboard settings (pressing esc to get \033). My ~/.xinputrc is: # Be 8 bit clean. set input-meta on set output-meta on set convert-meta off # Auto completion options set show-all-if-ambiguous on set completion-ignore-case on # Keybindings "\e[1~": beginning-of-line # Home key "\e[4~": end-of-line # End key "\e[5~": history-search-backward # Page Up "\e[6~": history-search-forward # Page Down "\e[3~": delete-char # Delete key "\e[5C": forward-word # Ctrl+right "\e[5D": backward-word # Ctrl+left I am just following a guide found on internet (actually there are a lot of guide really similar): http://macimproved.wordpress.com/2010/01/04/fix-page-updown-home-end-in-terminal/ Unfortunately, the only (terrific) result is that when I press page up (fn + up arrow) just a "~" is printed in the terminal.

    Read the article

  • Search Domain Not Working With Squid

    - by Kyle Brandt
    I just set up a squid proxy as a parent proxy to HAVP. When I or other users try to access a domain with an address like "http://foo" I get the following squid error in the browser: The dnsserver returned: Server Failure: The name server was unable to process this query. However, "http://foo.companyname.com" works fine. The search domain in resolv.conf on both the client and proxy host is companyname.com. (There a better term for "search domain"?) Is there a way to correct this, maybe something in the squid.conf file?.

    Read the article

  • Advanced imap search with alpine?

    - by devnull
    I am using alpine since a few days and I am very happy with the IMAP functionality and the terrific speed. The only frustration is that the W search (whereis) isn't as powerful (and probably not meant to be). What would be the best solution to search all inbox messages e.g. with a specific from (and having alpine to show a list of these matching messages) or even in the entire collection of IMAP folders or in a specific IMAP folder (e.g. archive) ? I think this might need a special script

    Read the article

  • remove start.funmoods search from chrome

    - by Joe King
    I post this with much trepidation after my baptism by fire recently, and knowing that this question has been asked and answered already. My problem is that I cannot seem to remove start.funmoods as the default search engine when I type into the omnibox in Chrome - I have followed the instruction in the answer to the previous question on this topic. In particular: I deleted funmods using the control panel - add/remove programs Under wrench-tools-extensions funmods is not mentioned Under wrench-settings-manage search engines, there is nothing listed at all. Restarted chrome and rebooting have not helped.

    Read the article

  • IE Search Provider: Specifying gTLD / Country-Specific Site

    - by jwa
    I am based in the UK, and as such typically use google.co.uk as my search engine. However, my employer is based in continental Europe, and thus my internet proxy is located overseas. As a result, IP geo-location presents a location outside of the UK. Google detects this, and as a result will redirect my searches from the address bar to a foreign Google domain. This leads to "local" answers having a higher ranking, many of which are not written in English language! Is there a specific search provider / URL I can give to IE which will use a specific gTLD of google (.co.uk), rather than performing the location-based redirect?

    Read the article

  • Howto access thread data outside a thread

    - by Quandary
    Question: I start the MS Text-to-speech engine in a thread, in order to avoid a crash on DLL_attach. It starts fine, and the text to speech engine gets initialized, but I can't access ISpVoice outside the thread. How can I access ISpVoice outside the thread ? It's a global variable after all... #include <windows.h> #include <sapi.h> #include "XPThreads.h" ISpVoice * pVoice = NULL; unsigned long init_engine_thread(void* param) { Sleep(5000); printf("lolthread\n"); //HRESULT hr = CoInitializeEx(NULL, COINIT_MULTITHREADED); HRESULT hr = CoInitialize(NULL); if(FAILED(hr) ) { MessageBox(NULL, TEXT("Failed To Initialize"), TEXT("Error"), 0); char buffer[2000] ; sprintf(buffer, "An error occured: 0x%08X.\n", hr); FILE * pFile = fopen ( "c:\\temp\\CoInitialize_dll.txt" , "w" ); fwrite (buffer , 1 , strlen(buffer) , pFile ); fclose (pFile); } else { printf("trying to create instance.\n"); //HRESULT hr = CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL, IID_ISpVoice, (void **) &pVoice); //hr = CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL, IID_ISpVoice, (void **) &pVoice); //HRESULT hr = CoCreateInstance(__uuidof(ISpVoice), NULL, CLSCTX_INPROC_SERVER, IID_ISpVoice, (void **) &pVoice); HRESULT hr = CoCreateInstance(__uuidof(SpVoice), NULL, CLSCTX_ALL, IID_ISpVoice, (void **) &pVoice); if( SUCCEEDED( hr ) ) { printf("Succeeded\n"); hr = pVoice->Speak(L"The text to speech engine has been successfully initialized.", 0, NULL); } else { printf("failed\n"); MessageBox(NULL, TEXT("Failed To Create COM instance"), TEXT("Error"), 0); char buffer[2000] ; sprintf(buffer, "An error occured: 0x%08X.\n", hr); FILE * pFile = fopen ( "c:\\temp\\CoCreateInstance_dll.txt" , "w" ); fwrite (buffer , 1 , strlen(buffer) , pFile ); fclose (pFile); } } if(pVoice != NULL) { pVoice->Release(); pVoice = NULL; } CoUninitialize(); return NULL; } XPThreads* ptrThread = new XPThreads(init_engine_thread); BOOL APIENTRY DllMain( HMODULE hModule, DWORD ul_reason_for_call, LPVOID lpReserved) { switch (ul_reason_for_call) { case DLL_PROCESS_ATTACH: //init_engine(); LoadLibrary(TEXT("ole32.dll")); ptrThread->Run(); break; case DLL_THREAD_ATTACH: break; case DLL_THREAD_DETACH: break; case DLL_PROCESS_DETACH: break; } return TRUE; }

    Read the article

< Previous Page | 163 164 165 166 167 168 169 170 171 172 173 174  | Next Page >