Search Results

Search found 48264 results on 1931 pages for 'text search'.

Page 277/1931 | < Previous Page | 273 274 275 276 277 278 279 280 281 282 283 284  | Next Page >

  • python read utf8 text file problem

    - by cpps
    I have a problem with python about reading and print utf8 text file. I have a test.txt in utf8 encoding without BOM. This file has two characters in it: ?? The first character "?" is Chinese and the second "?" is Japanese. Now, When I use Ulipad (a python editor) to run the following code to read the txt file, and print these two characters. import codecs infile = "C:\\test.txt" f = codecs.open(infile, "r", "utf-8") s = f.read() print(s) I got this error, "UnicodeEncodeError: 'cp950' codec can't encode character '\u58f0' in position 1: illegal multibyte sequence" I found it caused from the second character "?" . But when I use the same code to test in python default GUI IDLE, it works to print the two characters with no error. So, how can I fix the problem. My running environment is python 3.1 , windows xp traditional Chinese.

    Read the article

  • Find ASCII "arrows" in text

    - by ulver
    I'm trying to find all the occurrences of "Arrows" in text, so in "<----=====><==->>" the arrows are: "<----", "=====>", "<==", "->", ">" This works: String[] patterns = {"<=*", "<-*", "=*>", "-*>"}; for (String p : patterns) { Matcher A = Pattern.compile(p).matcher(s); while (A.find()) { System.out.println(A.group()); } } but this doesn't: String p = "<=*|<-*|=*>|-*>"; Matcher A = Pattern.compile(p).matcher(s); while (A.find()) { System.out.println(A.group()); } No idea why. It often reports "<" instead of "<====" or similar. What is wrong?

    Read the article

  • Multiple lines of text to a single map

    - by steven
    I've been trying to use Hadoop to send N amount of lines to a single mapping. I don't require for the lines to be split already. I've tried to use NLineInputFormat, however that sends N lines of text from the data to each mapper one line at a time [giving up after the Nth line]. I have tried to set the option and it only takes N lines of input sending it at 1 line at a time to each map: job.setInt("mapred.line.input.format.linespermap", 10); I've found a mailing list recommending me to override LineRecordReader::next, however that is not that simple, as that the internal data members are all private. I've just checked the source for NLineInputFormat and it hard codes LineReader, so overriding will not help. Also, btw I'm using Hadoop 0.18 for compatibility with the Amazon EC2 MapReduce.

    Read the article

  • Reading numeric Excel data as text using xlrd in Python

    - by Brian
    Hi guys, I am trying to read in an Excel file using xlrd, and I am wondering if there is a way to ignore the cell formatting used in Excel file, and just import all data as text? Here is the code I am using for far: import xlrd xls_file = 'xltest.xls' xls_workbook = xlrd.open_workbook(xls_file) xls_sheet = xls_workbook.sheet_by_index(0) raw_data = [['']*xls_sheet.ncols for _ in range(xls_sheet.nrows)] raw_str = '' feild_delim = ',' text_delim = '"' for rnum in range(xls_sheet.nrows): for cnum in range(xls_sheet.ncols): raw_data[rnum][cnum] = str(xls_sheet.cell(rnum,cnum).value) for rnum in range(len(raw_data)): for cnum in range(len(raw_data[rnum])): if (cnum == len(raw_data[rnum]) - 1): feild_delim = '\n' else: feild_delim = ',' raw_str += text_delim + raw_data[rnum][cnum] + text_delim + feild_delim final_csv = open('FINAL.csv', 'w') final_csv.write(raw_str) final_csv.close() This code is functional, but there are certain fields, such as a zip code, that are imported as numbers, so they have the decimal zero suffix. For example, is there is a zip code of '79854' in the Excel file, it will be imported as '79854.0'. I have tried finding a solution in this xlrd spec, but was unsuccessful.

    Read the article

  • What does LAME text does in MP3 file?

    - by Dims
    I see here http://en.wikipedia.org/wiki/MP3 that MP3 file consists of MP3 headers interchanged with MP3 data. MP3 header consist of few bytes. But here is my MP3 file dump with ID3 tag cut. Header is highlighted with blue. You can see that "LAME3.96" text is highlighted with green. What does it does there? Is this a part of MP3 elementary stream? Or this is the part of some headers I didn't tag?

    Read the article

  • Set text based on Form <select> value

    - by danit
    I have the following <select> in a form: <select style="width: 100px; text-align: center;"> <option value="email">Email</option> <option value="telephone">Phone</option> </select> The default option is Email so the following <input> is show: <p class="email_form">Please supply your Email Address</p> <input onfocus="if(!this._haschanged){this.value=''};this._haschanged=true;" style="width: 270px" type="email" name="email" id="email" value="[email protected]"> However if they choose telephone I want to change the above to ask for a telephone number?

    Read the article

  • Get the text of an li element

    - by Deepak Prasanna
    <ul class="leftbutton" > <li id="menu-selected">Sample 1</li> <li>Sample 2</li> <li>Sample 3</li> <li>Sample 4</li> <li>Sample 5</li> </ul> I want to get the text of the li item which the id="menu-selected". Right now I am doing something like document.getElementById('menu_selected').childNodes.item(0).nodeValue Is there any simpler way of doing the same?

    Read the article

  • Read data from a text file using Java

    - by Saran
    I need to read a text file line by line using Java. I use available() method of FileInputStream to check and loop over the file. But while reading, the loop terminates after the line before the last one. i.e., if the file has 10lines, the loop reads only the first 9 lines. Snippet used : while(fis.available() > 0) { char c = (char)fis.read(); ..... ..... }

    Read the article

  • Searching Techniques/Algorithms for Resources over a given area

    - by Raydon
    I have a flat area with nodes randomly placed on this flat surface. I need techniques which are able to take a starting point, move in a certain way (the algorithm), find nodes and continue searching. I do not have an overall view of the surface (i.e. I cannot see everything), only a limited view (i.e. 4 cells in any direction). Ideally, these methods would be efficient in the way that they work. Any points in the right direction would be greatly appreciated.

    Read the article

  • code to edit the text in an EditText widgit

    - by user293663
    Hello, I am a newbi programmer and am having trouble with a measurement conversion program I am working on. What I would like to do is have several EditText boxes. When one is filled in and a calculate button is hit then the rest will be populated with a converted number. The part I am getting stuck on is outputting the answers to EditText widgets. ex: There are three EditText Widgets. a user inputs the number 6 into the first one (labeled inches) and presses calculate. I would like the next box to display .5 (labeled feet) and the last to display .1666 (this would be yards) .setText() apparently does not work to modify the text in an EditText any help would be greatly appreciated.

    Read the article

  • Finding key Solr performance metrics

    - by Mike Malloy
    To improve performance of Solr find your slowest searches, monitor query results, cache hit rate and cache size, document cache and filter cache; find problems with Solr update handlers by tracking index operations and document operations. There is a tool from New Relic which may help. http://www.newrelic.com/solr.html

    Read the article

  • Extract all text from a HTML page without losing context

    - by grmbl
    For a translation program I am trying to get a 95% accurate text from a HTML file in order to translate the sentences and links. For example: <div><a href="stack">Overflow</a> <span>Texts <b>go</b> here</span></div> Should give me 2 results to translate: Overflow Texts <b>go</b> here Any suggestions or commercial packages available for this problem?

    Read the article

  • RegularExpression-esque search matching Objects in List

    - by Pindatjuh
    I'm currently working on an implementation of the following idea, and I was wondering if there is any literature on this subject. Working with Java, but the principle applies on any language with a decent type-system, I like to implement: matching Objects from a List using a RegularExpression-esque search: So let's say I have a List containing List<Object> x = new ArrayList<Object>(); x.add(new Object()); x.add("Hello World"); x.add("Second String"); x.add(5); // Integer (auto-boxing) x.add(6); // Integer Then I create a "Regular Expression" (not working with a stream of characters, but working with a stream of Objects), and instead of character-classes, I use type-system properties: [String][Integer] And this would match one sublist: {Match["Second String", 5]}. The expression: [String:length()<15] Will match two sublist (each of length 1) containing a String which instance is passing the expression instance.length() < 5: {Match["Hello World"],Match["Second String"]}. [Object][Object] Matches any pair in the List: {Match[Object,"Hello World"],Match["Second String", 5]}, in a streamed manner (no overlapping matches). Ofcourse, my implementation will have grouping, lookahead/lookbehinds and is hierarchical (i.e. matching n elements from Lists in Lists), etc. The above merely illustrates the concept. Is there a name for this principle, and is there literature available on it?

    Read the article

  • 2D Array values frequency

    - by Morano88
    If I have a 2D array that is arranged as follows : String X[][] = new String [][] {{"127.0.0.9", "60", "75000","UDP", "Good"}, {"127.0.0.8", "75", "75000","TCP", "Bad"}, {"127.0.0.9", "75", "70000","UDP", "Good"}, {"127.0.0.1", "", "70000","UDP", "Good"}, {"127.0.0.1", "75", "75000","TCP", "Bad"} }; I want to know the frequency of each value .. so I27.0.0.9 gets 2. How can I do a general solution for this ? In Java or any algorithm for any language ?

    Read the article

  • Add domain to relative urls

    - by Rick
    How can I add http://facebook.com to relative URL's contained within #facebook_urls? Eg: <a href="/test.html"> becomes <a href="http://facebook.com/test.html"> #facebook_urls also contains absolute urls, so I want to make sure I don't touch those. Thanks!

    Read the article

  • filter to reverse lines of a text file

    - by Greg Hewgill
    I'm writing a small shell script that needs to reverse the lines of a text file. Is there a standard filter command to do this sort of thing? My specific application is that I'm getting a list of Git commit identifiers, and I want to process them in reverse order: git log --pretty=oneline work...master | grep -v DEBUG: | cut -d' ' -f1 | reverse The best I've come up with is to implement reverse like this: ... | cat -b | sort -rn | cut -f2- This uses cat to number every line, then sort to sort them in descending numeric order (which ends up reversing the whole file), then cut to remove the unneeded line number. The above works for my application, but may fail in the general case because cat -b only numbers nonblank lines. Is there a better, more general way to do this?

    Read the article

  • Get highest frequency terms from Lucene index

    - by Julia
    Hello! i need to extract terms with highest frequencies from several lucene indexes, to use them for some semantic analysis. So, I want to get maybe top 30 most occuring terms(still did not decide on threshold, i will analyze results) and their per-index counts. I am aware that I might lose some precision because of potentionally dropped duplicates, but for now, lets say i am ok with that. So for the proposed solutions, (needless to say maybe) speed is not important, since I would do static analysis, I would put accent on simplicity of implementation because im not so skilled with Lucene (not the programming guru too :/ ) and cant wrap my mind around many concepts of it.. I can not find any code samples from something similar, so all concrete advices (code, pseudocode, links to code samples...) I will apretiate very much!!! Thank you!

    Read the article

  • Lucene (.NET) Document stucture and performance suggestions.

    - by Josh Handel
    Hello, I am indexing about 100M documents that consist of a few string identifiers and a hundred or so numaric terms.. I won't be doing range queries, so I haven't dugg too deep into Numaric Field but I'm not thinking its the right choose here. My problem is that the query performance degrades quickly when I start adding OR criteria to my query.. All my queries are on specific numaric terms.. So a document looks like StringField:[someString] and N DataField:[someNumber].. I then query it with something like DataField:((+1 +(2 3)) (+75 +(3 5 52)) (+99 +88 +(102 155 199))). Currently these queries take about 7 to 16 seconds to run on my laptop.. I would like to make sure thats really the best they can do.. I am open to suggestions on field structure and query structure :-). Thanks Josh PS: I have already read over all the other lucene performance discussions on here, and on the Lucene wiki and at lucid imiagination... I'm a bit further down the rabbit hole then that...

    Read the article

  • Generating Graph with 2 Y Values from Text File

    - by Joey jie
    Hi all, I have remade my original post as it was terribly formatted. Basically I would like some advice / tips on how to generate a line graph with 2 Y Axis (temperature and humidity) to display some information from my text file. It is contained in a textfile called temperaturedata.txt I have included a link to one of my posts from the JpGrapher forum only because it is able to display the code clearly. I understand that since it is JpGraph problem I shouldn't post here however the community here is a lot more supportive and active. Many thanks for all your help guys in advance! my code

    Read the article

  • how to find the last instance of a setting in a config file

    - by Glenn Kelley
    I am trying to figure out how to find the last entry of a string in multiple config files across a server. Each of the strings will be in the /home/***usernamewouldbehere/public_html/typo3conf/localconf.php file In short - the last entry in the config files will point to the database server the application is utilizing - and we need to know which accounts point to which db server. While I can run something like this - grep "$_db_host" /home/*/public_html/conf/localconf.php It does not really help much because it gives us way to much information ... and not what we really need. What i really need to know is the last entry of this string $_db_host = 'xx'; and to sort them out in an export file Since the config files may have multiple entries (example below) $_db_host = 'localhost'; $_db_host = '10.0.1.234'; It would be great to list in a file all of those that have the entry for 'localhost' and then list all of those that have the entry for '10.0.1.234' (or whichever server there may be there) but even if I need to do that manually that would be great. I am not sure how to get to it using Awk - ... and really stuck What I am hoping for is something that would be piped as follows db_host = localhost /home/username1/www/conf/localconf.php db_host = localhost / home/username2/public_html/conf/localconf.php db_host= '10.1.2.23' /home/username55/public_html/conf/localconf.php hoping that helps you help me :-)

    Read the article

  • Add wordwrap to decoded json text

    - by Gary
    Hi, I am using a simple script on my PHP webpage to decode and output JSON as text. However, what ever I try I can't get it to wordwrap the output. $file = file_get_contents('sample.txt'); $out = (json_decode($file)); echo $out->mainText; How can I get this script to wordwrap at 600 characters without chopping words in half? If possible, can you show me the whole script please as I am slowly learning. Thanks

    Read the article

  • Eclipse keyword highlighting in in my own text editor

    - by Torok Balint
    I made a simple text editor in eclipse to which I added some simple WordRule based syntax highlighting to highlight the language keywords. The problem is that when a keyword is part of an identifier (eg. "import" is part of "extra_import"), then "import" is highlighted in "extra_import". How can I stop eclipse to highlight a a keyword if it is only a sub string of another string? Anlther question; is there a regular expression based IRule? What is the purpose of WhitespaceRule? White spaces are usually not highlighted. Thaks

    Read the article

  • How to change text color for link in tr element with CSS

    - by mathee
    I'd like to change the background and text color when the mouse hovers over a row in a table: tr { background-color:#FFF; color:#000; } tr:hover { background-color:#000; color:#FFF; } This works if there aren't any links in the tr elements, but when there are, the link color remains black (because of a { color: #000; }?). How do I specify in the CSS that links in the tr element should change color when the mouse hovers over the tr?

    Read the article

  • How to replace custom IDs in the order of their appearance with a shell script?

    - by Péter Török
    I have a pair of rather large log files with very similar content, except that some identifiers are different between the two. A couple of examples: UnifiedClassLoader3@19518cc | UnifiedClassLoader3@d0357a JBossRMIClassLoader@13c2d7f | JBossRMIClassLoader@191777e That is, wherever the first file contains UnifiedClassLoader3@19518cc, the second contains UnifiedClassLoader3@d0357a, and so on. I want to replace these with identical IDs so that I can spot the really important differences between the two files. I.e. I want to replace all occurrences of both UnifiedClassLoader3@19518cc in file1 and UnifiedClassLoader3@d0357a in file2 with UnifiedClassLoader3@1; all occurrences of both JBossRMIClassLoader@13c2d7f in file1 and JBossRMIClassLoader@191777e in file2 with JBossRMIClassLoader@2 etc. Using the Cygwin shell, so far I managed to list all different identifiers occurring in one of the files with grep -o -e 'ClassLoader[0-9]*@[0-9a-f][0-9a-f]*' file1.log | sort | uniq However, now the original order is lost, so I don't know which is the pair of which ID in the other file. With grep -n I can get the line number, so the sort would preserve the order of appearance, but then I can't weed out the duplicate occurrences. Unfortunately grep can not print only the first match of a pattern. I figured I could save the list of identifiers produced by the above command into a file, then iterate over the patterns in the file with grep -n | head -n 1, concatenate the results and sort them again. The result would be something like 2 ClassLoader3@19518cc 137 ClassLoader@13c2d7f 563 ClassLoader3@1267649 ... Then I could (either manually or with sed itself) massage this into a sed command like sed -e 's/ClassLoader3@19518cc/ClassLoader3@2/g' -e 's/ClassLoader@13c2d7f/ClassLoader@137/g' -e 's/ClassLoader3@1267649/ClassLoader3@563/g' file1.log > file1_processed.log and similarly for file2. However, before I start, I would like to verify that my plan is the simplest possible working solution to this. Is there any flaw in this approach? Is there a simpler way?

    Read the article

< Previous Page | 273 274 275 276 277 278 279 280 281 282 283 284  | Next Page >