Search Results

Search found 59230 results on 2370 pages for 'character set'.

Page 60/2370 | < Previous Page | 56 57 58 59 60 61 62 63 64 65 66 67  | Next Page >

  • Scraping &#151 character (long dash) error in Nokogiri

    - by DavidP6
    I am having trouble scraping a certain long dash that is encoded as &#151; on the Time magazine site. It renders like this: —. Scraping works fine when the dash is encoded as &mdash;, but when the problem dash is scraped, it comes back as unknown characters. I am using Nokogiri and wonder whether I need to specify some special encoding. The page says it is encoded as UTF-8.

    Read the article
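
    A possible explanation for the long-dash question above, offered as a sketch rather than a confirmed diagnosis: code point 151 (0x97) is a C1 control character in Unicode, but byte 0x97 is the em dash in Windows-1252, and HTML5 parsing rules remap such numeric references accordingly. The Python snippet below only illustrates that mapping. It does not use Nokogiri, and how Nokogiri treats the reference is not something the excerpt confirms.

```python
import html

# Byte 0x97 (decimal 151) is the em dash in Windows-1252.
em_dash = bytes([0x97]).decode("cp1252")

# Python's html.unescape follows the HTML5 rule that remaps
# numeric references in the 0x80-0x9F range to Windows-1252.
from_reference = html.unescape("&#151;")

# Taken literally as a Unicode code point, 151 is a C1 control character.
literal = chr(151)

print(repr(em_dash), repr(from_reference), repr(literal))
```

    If a scraper takes the reference literally, the result is the control character, which many terminals show as an unknown glyph.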

  • Servlet response wrapper has encoding problem

    - by John O
    A servlet response wrapper is being used in a Servlet Filter. The idea is that the response is manipulated, with a 'nonce' value being injected into forms, as part of defence against CSRF attacks. The web app is using UTF-8 everywhere. When the Servlet Filter is absent, no problems. When the filter is added, encoding issues occur. (It seems as if the response is reverting to 8859-1.) The guts of the code:

        final class CsrfResponseWrapper extends AbstractResponseWrapper {
          ...
          byte[] modifyResponse(byte[] aInputResponse){
            ...
            String originalInput = new String(aInputResponse, encoding);
            String modifiedResult = addHiddenParamToPostedForms(originalInput);
            result = modifiedResult.getBytes(encoding);
            ...
          }
          ...
        }

    As I understand it, the transition between byte-land and String-land should specify an encoding. That is done here, as you can see, in two places. The value of the 'encoding' variable is 'UTF-8'; the alteration of the String itself is standard string manipulation (with a regex), and never specifies an encoding (addHiddenParamToPostedForms). Where am I in error about the encoding?

    EDIT: Here is the base class (sorry it's rather long):

        package hirondelle.web4j.security;

        import javax.servlet.ServletOutputStream;
        import javax.servlet.ServletResponse;
        import javax.servlet.http.HttpServletResponse;
        import javax.servlet.http.HttpServletResponseWrapper;
        import java.io.ByteArrayOutputStream;
        import java.io.IOException;
        import java.io.PrintWriter;

        /**
         Abstract Base Class for altering response content.
         (May be useful in future contexts as well. For now, keep package-private.)
        */
        abstract class AbstractResponseWrapper extends HttpServletResponseWrapper {

          AbstractResponseWrapper(ServletResponse aServletResponse) throws IOException {
            super((HttpServletResponse)aServletResponse);
            fOutputStream = new ModifiedOutputStream(aServletResponse.getOutputStream());
            fWriter = new PrintWriter(fOutputStream);
          }

          /** Return the modified response. */
          abstract byte[] modifyResponse(byte[] aInputResponse);

          /** Standard servlet method. */
          public final ServletOutputStream getOutputStream() {
            //fLogger.fine("Modified Response : Getting output stream.");
            if ( fWriterReturned ) {
              throw new IllegalStateException();
            }
            fOutputStreamReturned = true;
            return fOutputStream;
          }

          /** Standard servlet method. */
          public final PrintWriter getWriter() {
            //fLogger.fine("Modified Response : Getting writer.");
            if ( fOutputStreamReturned ) {
              throw new IllegalStateException();
            }
            fWriterReturned = true;
            return fWriter;
          }

          // PRIVATE

          /* Well-behaved servlets return either an OutputStream or a PrintWriter, but not both. */
          private PrintWriter fWriter;
          private ModifiedOutputStream fOutputStream;

          /* These items are used to implement conformance to the javadoc for ServletResponse, regarding exceptions being thrown. */
          private boolean fWriterReturned;
          private boolean fOutputStreamReturned;

          /** Modified low level output stream. */
          private class ModifiedOutputStream extends ServletOutputStream {

            public ModifiedOutputStream(ServletOutputStream aOutputStream) {
              fServletOutputStream = aOutputStream;
              fBuffer = new ByteArrayOutputStream();
            }

            /** Must be implemented to make this class concrete. */
            public void write(int aByte) {
              fBuffer.write(aByte);
            }

            public void close() throws IOException {
              if ( !fIsClosed ){
                processStream();
                fServletOutputStream.close();
                fIsClosed = true;
              }
            }

            public void flush() throws IOException {
              if ( fBuffer.size() != 0 ){
                if ( !fIsClosed ) {
                  processStream();
                  fBuffer = new ByteArrayOutputStream();
                }
              }
            }

            /** Perform the core processing, by calling the abstract method. */
            public void processStream() throws IOException {
              fServletOutputStream.write(modifyResponse(fBuffer.toByteArray()));
              fServletOutputStream.flush();
            }

            // PRIVATE //
            private ServletOutputStream fServletOutputStream;
            private ByteArrayOutputStream fBuffer;
            /** Tracks if this stream has been closed. */
            private boolean fIsClosed = false;
          }
        }

    Read the article
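
    One candidate cause for the servlet wrapper question above, stated as an assumption rather than a diagnosis: a character writer that wraps the byte buffer without naming a charset encodes with the platform default, so the buffered bytes may not be UTF-8 by the time they are decoded as UTF-8. The Python sketch below reproduces that mismatch in miniature. The function name `render` and the sample text are hypothetical.

```python
import io

def render(writer_encoding: str) -> str:
    """Write text through a text wrapper, then decode the buffered bytes as UTF-8."""
    buffer = io.BytesIO()
    writer = io.TextIOWrapper(buffer, encoding=writer_encoding)
    writer.write("café")
    writer.flush()
    raw = buffer.getvalue()
    # The filter decodes as UTF-8 regardless of how the writer encoded.
    return raw.decode("utf-8", errors="replace")

print(render("utf-8"))    # writer and decoder agree
print(render("latin-1"))  # writer used a different charset than the decoder
```

    When the two charsets agree, the text survives; when they differ, the accented character is lost.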

  • Why use spaces instead of tabs for indentation? [closed]

    - by erenon
    Possible Duplicate: Are spaces preferred over tabs for indentation? Why do most coding standards recommend the use of spaces instead of tabs? Tabs can be configured to be as many characters wide as needed, but spaces can't. Example: Zend cs Pear cs Pear manual: This helps to avoid problems with diffs, patches, SVN history and annotations. How could tabs cause problems?

    Read the article

  • How can I match a null byte (0x00) in the Visual Studio binary editor with a find using a re

    - by Paul K
    Open a file that contains a null byte (0x00) in the Visual Studio binary editor, then use the Quick Find feature (Ctrl+F) to find null bytes. I would have thought I could use a regular expression such as \x00 to match null bytes, but it doesn't work. Searching for any other hex value using this method works fine. Is this a VS bug, a 'feature', or am I just missing something? Is there a workaround?

    Read the article

  • Position of character in a string

    - by Irfan
    I have a string:

        var str = "12345a45"; // position is 6 here

    Now I want the position of 'a' (the letter) in that string. Similarly, I have a few more strings like this:

        var str1 = "1234567a45"; // position is 8 here
        var str2 = "12345a4";    // position is 6 here
        var str3 = "123a";       // position is 4 here
        var str4 = "a45";        // position is 1 here

    What I thought of doing is searching the string from the end to find the occurrence of any letter in these strings. Any help will be appreciated. Thanks.

    Read the article
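
    For the position question above, a minimal Python sketch (the question itself appears to be JavaScript), assuming "alphabet" means an ASCII letter and positions are counted from 1 as in the question's comments. The function name is hypothetical.

```python
import re

def first_letter_position(text: str) -> int:
    """Return the 1-based position of the first ASCII letter, or 0 if none."""
    match = re.search(r"[A-Za-z]", text)
    return match.start() + 1 if match else 0

for sample in ["12345a45", "1234567a45", "12345a4", "123a", "a45"]:
    print(sample, first_letter_position(sample))
```

    Searching from the front is enough here because each sample contains a single letter.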

  • getTextContent from Node with whitespace character normalization

    - by Nayn
    Hi, I am working with XPath and Java and want to extract some text from an HTML page. The text is located under a div with some whitespace markup in between, like &nbsp; <br> etc. I want these to be converted into 'space' and 'newline' respectively while extracting. The method I am using to extract text is Element.getTextContent(), which does not respect whitespace characters. Could somebody tell me if there is a way to extract text with whitespace normalization, OR to extract the whole HTML markup under the 'Node' so that I could replace it myself? Thanks, Nayn

    Read the article

  • Python: (sampling with replacement): efficient algorithm to extract the set of UNIQUE N-tuples from a set

    - by Homunculus Reticulli
    I have a set of items, from which I want to select DISSIMILAR tuples (more on the definition of dissimilar tuples later). The set could potentially contain several thousand items, although typically it would contain only a few hundred. I am trying to write a generic algorithm that will allow me to select N items from the original set to form an N-tuple. The new set of selected N-tuples should be DISSIMILAR. An N-tuple A is said to be DISSIMILAR to another N-tuple B if and only if every pair (2-tuple) that occurs in A DOES NOT appear in B. Note: for this algorithm, a 2-tuple (pair) is considered SIMILAR/IDENTICAL if it contains the same elements, i.e. (x,y) is considered the same as (y,x). This is a (possible variation on the) classic Urn Problem. A trivial (pseudocode) implementation of this algorithm would be something along the lines of:

        def fetch_unique_tuples(original_set, tuple_size):
            while True:
                # randomly select [tuple_size] items from the set to create first set
                # create a key or hash from the N elements and store in a set
                # store selected N-tuple in a container
                if end_condition_met:
                    break

    I don't think this is the most efficient way of doing this, and though I am no algorithm theorist, I suspect that the time for this algorithm to run is NOT O(n); in fact, it's probably more likely to be O(n!). I am wondering if there is a more efficient way of implementing such an algorithm, preferably reducing the time to O(n). Actually, as Mark Byers pointed out, there is a second variable m, which is the number of elements being selected. This (i.e. m) will typically be between 2 and 5.

    Regarding examples, here is a typical (albeit shortened) example:

        original_list = ['CAGG', 'CTTC', 'ACCT', 'TGCA', 'CCTG', 'CAAA', 'TGCC', 'ACTT', 'TAAT', 'CTTG',
                         'CGGC', 'GGCC', 'TCCT', 'ATCC', 'ACAG', 'TGAA', 'TTTG', 'ACAA', 'TGTC', 'TGGA',
                         'CTGC', 'GCTC', 'AGGA', 'TGCT', 'GCGC', 'GCGG', 'AAAG', 'GCTG', 'GCCG', 'ACCA',
                         'CTCC', 'CACG', 'CATA', 'GGGA', 'CGAG', 'CCCC', 'GGTG', 'AAGT', 'CCAC', 'AACA',
                         'AATA', 'CGAC', 'GGAA', 'TACC', 'AGTT', 'GTGG', 'CGCA', 'GGGG', 'GAGA', 'AGCC',
                         'ACCG', 'CCAT', 'AGAC', 'GGGT', 'CAGC', 'GATG', 'TTCG']

    Selecting 3-tuples from the original list should produce a list (or set) similar to:

        [('CAGG', 'CTTC', 'ACCT') ('CAGG', 'TGCA', 'CCTG') ('CAGG', 'CAAA', 'TGCC') ('CAGG', 'ACTT', 'ACCT') ('CAGG', 'CTTG', 'CGGC') .... ('CTTC', 'TGCA', 'CAAA') ]

    [[Edit]] Actually, in constructing the example output, I have realized that the earlier definition I gave for UNIQUENESS was incorrect. I have updated my definition and have introduced a new metric of DISSIMILARITY instead, as a result of this finding.

    Read the article
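
    The pseudocode in the question above can be turned into a runnable greedy sketch. This is one possible reading, not the asker's method: it assumes the end condition is a fixed number of random attempts, that items do not repeat inside one tuple, and that a pair is order-insensitive, so it is stored as a frozenset. The names `fetch_dissimilar_tuples`, `attempts` and `seed` are hypothetical.

```python
import random
from itertools import combinations

def fetch_dissimilar_tuples(items, tuple_size, attempts=1000, seed=0):
    """Greedy random sampling: keep a tuple only if none of its pairs was used before."""
    rng = random.Random(seed)
    pool = list(items)
    used_pairs = set()
    selected = []
    for _ in range(attempts):
        candidate = tuple(rng.sample(pool, tuple_size))
        # (x, y) and (y, x) count as the same pair, hence frozenset.
        pairs = {frozenset(pair) for pair in combinations(candidate, 2)}
        if pairs.isdisjoint(used_pairs):
            used_pairs |= pairs
            selected.append(candidate)
    return selected

words = ["CAGG", "CTTC", "ACCT", "TGCA", "CCTG", "CAAA", "TGCC", "ACTT"]
result = fetch_dissimilar_tuples(words, 3)
print(result)
```

    Each attempt checks only the m(m-1)/2 pairs of one candidate against a hash set, so the cost per attempt depends on m rather than on the size of the original set. The result is a valid dissimilar collection but not necessarily the largest one.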

  • How to read and write UTF-8 to disk on the Android?

    - by Rob Kent
    I cannot read and write extended characters (French accented characters, for example) to a text file using the standard InputStreamReader methods shown in the Android API examples. When I read back the file using:

        InputStreamReader tmp = new InputStreamReader(in);
        BufferedReader reader = new BufferedReader(tmp);
        String str;
        while ((str = reader.readLine()) != null) {
            ...

    the string read is truncated at the extended characters instead of at the end-of-line. The second half of the string then comes on the next line. I'm assuming that I need to persist my data as UTF-8 but I cannot find any examples of that, and I'm new to Java. Can anyone provide me with an example or a link to relevant documentation?

    Read the article
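
    The general principle behind the UTF-8 question above is to name the encoding on both the write and the read side instead of relying on a platform default. The sketch below shows that round trip in Python rather than Java; the file name and sample text are hypothetical.

```python
import os
import tempfile

text = "Élève à l'école, ça va?\nSecond line\n"

path = os.path.join(tempfile.mkdtemp(), "notes.txt")

# Name the encoding on both sides instead of relying on the platform default.
with open(path, "w", encoding="utf-8", newline="") as handle:
    handle.write(text)

with open(path, "r", encoding="utf-8", newline="") as handle:
    lines = handle.read().splitlines()

print(lines)
```

    In Java the same idea would mean passing a charset name to the reader and writer constructors, for example the two-argument form of InputStreamReader.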

  • How do I split up a long value (32 bits) into four char variables (8bits) using C?

    - by Jordan S
    I have a 32-bit long variable, CurrentPosition, that I want to split up into four 8-bit characters. How would I do that most efficiently in C? I am working with an 8-bit MCU, 8051 architecture.

        unsigned long CurrentPosition = 7654321;
        unsigned char CP1 = 0;
        unsigned char CP2 = 0;
        unsigned char CP3 = 0;
        unsigned char CP4 = 0;
        // What do I do next?

    Should I just reference the starting address of CurrentPosition with a pointer and then add 8 to that address four times? It is little endian. ALSO, I want CurrentPosition to remain unchanged.

    Read the article
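
    For the 32-bit split question above, the shift-and-mask arithmetic can be sketched as follows. This is Python rather than C and says nothing about what is most efficient on an 8051; it only shows which byte ends up where. Note that in C a byte pointer advances by 1 per byte, not by 8. The function name is hypothetical.

```python
import struct

def split_u32_little_endian(value: int) -> tuple:
    """Return the four bytes of a 32-bit value, least significant byte first."""
    return tuple((value >> shift) & 0xFF for shift in (0, 8, 16, 24))

current_position = 7654321  # 0x0074CBB1
cp1, cp2, cp3, cp4 = split_u32_little_endian(current_position)
print(hex(cp1), hex(cp2), hex(cp3), hex(cp4))

# Cross-check against the packed little-endian representation.
packed = struct.pack("<I", current_position)
print(packed.hex())
```

    Shifting and masking reads the value without modifying it, which matches the requirement that CurrentPosition stays unchanged.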

  • Have a set of CGI scripts shared by multiple domains

    - by rpat
    Goal: have multiple domains share a set of CGI (Perl) scripts. Environment: Apache 2.0 on a dedicated CentOS server (Apache configuration files generated by cPanel). I have dozens of domains on the dedicated server. The domains are set up by cPanel under VirtualHost sections. I have almost no knowledge of Apache; most of what I do is taken care of by cPanel. I would like to put a set of scripts under one directory (perhaps under / or /opt) and, for each of the domains, create a symbolic link to this common directory under the individual cgi-bin. This way I hope to avoid having to keep a copy of the scripts for every domain. Since the Apache config files are generated by cPanel, I would rather not change them manually. Besides, I could mess things up. I see that cPanel recommends the use of include files rather than changing httpd.conf. Perhaps I need to enable following of symbolic links in the cgi-bin directory and allow the web server user to execute scripts not owned by it. Maybe I am making things more complicated than they are. I would be glad to use any other means to achieve my goal. Thanks in advance for your help. *I asked this on stackoverflow and someone suggested that I could ask this on serverfault.

    Read the article

  • SEO issue red characters in source code? &gt; Why? Syntax highlighting? browser source code?

    - by judi
    SEO issue: red characters. Hi all, I'm building websites using Dreamweaver, but when I look at the source code, the &quot; characters are shown in red. I'm told anything appearing in red puts off Google's SEO. Does anyone know why this appears in red? For example, when I view the source on the site I get the &gt; in red: <a href="miss-sold-mortgages.html" class="darkblue">Find out more&gt;&gt;</a></span> </div> Thanks for your help. Regards, Judi

    Read the article

  • SQLite/iPhone read copyright symbol

    - by Marco A
    Hi all, I am having problems reading the copyright symbol from a SQLite DB used by the app that I am developing. I import the information manually, i.e. from an Excel sheet. I have tried two ways of doing it and failed with both:

    1) Replacing the copyright symbol with "\u00ae" (Unicode escape) within Excel and then importing the modified file. Result: I get the literal characters \u00ae as part of the string; the escape is not interpreted.
    2) Leaving it as it is and importing the Excel sheet with the copyright symbol. Result: I get a symbol that is different from the copyright sign, something like an A and E joined together. It looks like this: Æ

    Here is the code I use to read from the DB:

        -(void) readCategoriesFromDatabase:(NSString *) rest_input {
            // Init the products Array
            categories = [[NSMutableArray alloc] init];
            // Open the database from the user's filesystem
            rest_input = [rest_input stringByAppendingString:@"'"];
            NSString *newString;
            newString = [@"select distinct category from food where restaurant='" stringByAppendingString:rest_input];
            const char *cat_sqlStatement = [newString UTF8String];
            sqlite3_stmt *cat_compiledStatement;
            if(sqlite3_prepare_v2(database, cat_sqlStatement, -1, &cat_compiledStatement, NULL) == SQLITE_OK) {
                // Loop through the results and add them to the feeds array
                while(sqlite3_step(cat_compiledStatement) == SQLITE_ROW) {
                    NSString *catName = [NSString stringWithUTF8String:(char *)sqlite3_column_text(cat_compiledStatement,0)];
                    // Create a new product object with the data from the database
                    Product *category = [[Product alloc] initWithName:catName];
                    // Add the product object to the respective Array
                    [categories addObject:category];
                    [category release];
                }
                sqlite3_finalize(cat_compiledStatement);
            }
            NSLog(@"Finished Accessing Database to gather Categories....");
        }

    I open the DB with this function:

        -(void) checkAndCreateDatabase{
            NSLog(@"Checking/Creating Database....");
            NSFileManager *fileManager = [NSFileManager defaultManager];
            success = [fileManager fileExistsAtPath:databasePath];
            [fileManager removeFileAtPath:databasePath handler:nil];
            NSString *databasePathFromApp = [[[NSBundle mainBundle] resourcePath] stringByAppendingPathComponent:databaseName];
            [fileManager copyItemAtPath:databasePathFromApp toPath:databasePath error:nil];
            [fileManager release];
            if (sqlite3_open([databasePath UTF8String], &database) != SQLITE_OK) {
                sqlite3_close(database);
                database = nil;
            }
            NSLog(@"Finished Checking/Creating Database....");
        }

    Thanks to anyone who can help me out.

    Read the article
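
    Two observations on the SQLite symbol question above, offered as a sketch under assumptions rather than a diagnosis. U+00AE is the registered sign; the copyright sign is U+00A9. Also, the single byte 0xAE is the registered sign in Windows-1252 but the AE ligature in Mac OS Roman, which matches the "Æ" the asker reports if the imported data was a single-byte export read with a different charset. The Python snippet only demonstrates those byte mappings.

```python
# Byte 0xAE means different things in different single-byte encodings.
raw = bytes([0xAE])

as_windows_1252 = raw.decode("cp1252")   # registered sign
as_mac_roman = raw.decode("mac_roman")   # AE ligature

# The copyright sign itself is U+00A9; U+00AE is the registered sign.
copyright_sign = "\u00a9"
registered_sign = "\u00ae"

# Stored as UTF-8, the symbol takes two bytes and decodes back unchanged.
utf8_bytes = registered_sign.encode("utf-8")
print(as_windows_1252, as_mac_roman, utf8_bytes)
```

    If this reading is right, exporting the sheet as UTF-8 before importing would let the UTF-8 read path in the question see the intended character.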

  • RegEx to Reject Unescaped Character

    - by JDV72
    I want to restrict usage of unescaped ampersands in a particular input field. I'm having trouble getting a RegEx to reject "&" unless it is followed by "amp;", or perhaps just to reject "& " (note the space). I tried to adapt the answer in this thread, but to no avail. Thanks. (FWIW, here's a RegEx I made to ensure that a filename field didn't contain restricted characters and ended in .mp3. It works fine, but does it look efficient?)

    Read the article
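
    For the ampersand question above, a negative lookahead expresses "an ampersand not followed by amp;". The sketch below uses Python's re module; the excerpt does not say which regex engine the input field uses, so support for lookahead there is an assumption. The names are hypothetical.

```python
import re

# An ampersand that is NOT immediately followed by "amp;".
UNESCAPED_AMPERSAND = re.compile(r"&(?!amp;)")

def has_unescaped_ampersand(value: str) -> bool:
    """Return True if the value contains at least one bare ampersand."""
    return UNESCAPED_AMPERSAND.search(value) is not None

for sample in ["Tom & Jerry", "Tom &amp; Jerry", "a &amp; b & c"]:
    print(sample, has_unescaped_ampersand(sample))
```

    This pattern also flags other entities such as "&lt;", so the lookahead would need widening if those should be allowed.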

  • Convert ISO/Windows charsets to UTF-8 in Javascript

    - by Amir
    I'm developing a Firefox plugin and I fetch web pages to do some analysis for the user. The problem is that when I try to get (via XMLHttpRequest) pages that are not UTF-8 encoded, the string I see is messed up. For example, Hebrew pages with windows-1255 or Chinese pages with gb2312. I already tried the following:

        var uDecoder=Components.classes["@mozilla.org/intl/scriptableunicodeconverter"].getService(Components.interfaces.nsIScriptableUnicodeConverter);
        uDecoder.charset="windows-1255";
        alert( xhr.responseText );

        var decoder=Components.classes["@mozilla.org/intl/utf8converterservice;1"].getService(Components.interfaces.nsIUTF8ConverterService);
        alert(decoder.convertStringToUTF8(xhr.responseText,"WINDOWS-1255",true));

    I also tried escape/unescape/encodeURIComponent. Any ideas?

    Read the article
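
    The conversion the charset question above is after has two steps: decode the raw response bytes with the charset the page declares, then encode the resulting text as UTF-8. The Python sketch below shows those steps on in-memory bytes. It does not use the Mozilla XPCOM interfaces from the excerpt, and `to_utf8` is a hypothetical helper.

```python
def to_utf8(raw: bytes, declared_charset: str) -> bytes:
    """Decode bytes using the page's declared charset, then re-encode as UTF-8."""
    return raw.decode(declared_charset).encode("utf-8")

hebrew = "\u05e9\u05dc\u05d5\u05dd"   # a Hebrew word
chinese = "\u4e2d\u6587"              # a Chinese word

# Simulate pages served in legacy encodings.
hebrew_page = hebrew.encode("cp1255")
chinese_page = chinese.encode("gb2312")

print(to_utf8(hebrew_page, "cp1255").decode("utf-8"))
print(to_utf8(chinese_page, "gb2312").decode("utf-8"))
```

    The key point is that the conversion must start from the original bytes; once they have been decoded with the wrong charset, the text may no longer be recoverable.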

  • Twitter Search API is returning weird characters - is it me or is it them?

    - by DanSingerman
    We are building an app that accesses the Twitter search over JSONP. It mostly works fine, but occasionally the request returns a JSONP callback that consists of weird unparseable characters. Here is an example: http://search.twitter.com/search.json?result_type=recent&rpp=100&geocode=51.4375857,-0.1658648,1km&page=5&callback=jsonp1272532482854 (If you change page=5 to a value less than 5 in the URL, it works fine.) So am I doing something wrong? Can anyone suggest a workaround?

    Read the article

  • Java application failing on special characters.

    - by Scottm
    An application I am working on reads information from files to populate a database. Some of the characters in the files are non-English, for example accented French characters. The application is working fine in Windows but on our Solaris machine it is failing to recognise the special characters and is throwing an exception. For example when it encounters the accented e in "Gérer" it says :- Encountered: "\u0161" (353), after : "\'G\u00c3\u00a9rer les mod\u00c3" (an exception which is thrown from our application) I suspect that in order to stop this from happening I need to change the file.encoding property of the JVM. I tried to do this via System.setProperty() but it has not stopped the error from occurring. Are there any suggestions for what I could do? I was thinking about setting the basic locale of the solaris platform in /etc/default/init to be UTF-8. Does anyone think this might help? Any thoughts are much appreciated.

    Read the article
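
    The escaped text in the Solaris question above is consistent with UTF-8 bytes being decoded as ISO-8859-15: the two bytes of "é" become "\u00c3\u00a9", and the second byte of "è" (0xA8) becomes "\u0161" in that charset. The Python sketch below reproduces the pattern. The word "modèles" is an assumed completion of the truncated message, and the charset is inferred, not stated in the excerpt.

```python
original = "Gérer les modèles"   # "modèles" is an assumed completion of the truncated text

# Bytes written as UTF-8 but decoded with a Latin-9 default, as could happen
# when the default charset follows an ISO-8859-15 system locale.
misread = original.encode("utf-8").decode("iso8859_15")

# Show the non-ASCII characters as escapes, similar to the error message.
print(misread.encode("ascii", "backslashreplace").decode("ascii"))

# Decoding with the charset the file was actually written in restores the text.
recovered = original.encode("utf-8").decode("utf-8")
print(recovered)
```

    Reading the files with an explicitly named charset avoids depending on the platform default at all.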

  • String useless character strip - PHP

    - by Zoltan Repas
    Hi! I've got a huge problem. I made a special ID for the things on our webpage. Let's see an example: H0059. This special ID is called the registration number. The last two chars are the thing's ID. I'd like to cut off the useless characters to get the real ID, which means stripping the first char and all the 0s before any other digits. (Examples: L0745 = 745, V1754 = 1754, L0003 = 3, B0141 = 141, P0040 = 40, V8000 = 8000.) Please help me with this. I've tried with strreplace and explode but failed :( Thanks for the help.

    Read the article
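
    The stripping described in the question above amounts to dropping the first character and parsing the rest as an integer, which discards leading zeros. A minimal Python sketch (the question is about PHP), assuming every registration number is one letter followed by digits; the function name is hypothetical.

```python
def registration_to_id(registration: str) -> int:
    """Drop the leading letter, then let int() discard the leading zeros."""
    return int(registration[1:])

for code in ["L0745", "V1754", "L0003", "B0141", "P0040", "V8000", "H0059"]:
    print(code, registration_to_id(code))
```

    Parsing as a number keeps trailing zeros (P0040 gives 40), which a naive "remove every 0" replacement would not.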

  • UNC shared path not accessible though necessary permissions are set

    - by Vysakh
    I have two environments, A and B. A is the original environment, whereas B is an exact clone of A except for the AD servers. The AD server of B has a trust relationship with A, so that all the service and user accounts of A can be used in B too, and the trust works fine. However, I encounter issues accessing UNC paths (\server2\shared) with these service accounts. I checked environment A, and all the permissions set there are also set in B (already present, since it is a clone of A), but the issue occurs in B only. FYI, the user is an owner of that folder in both environments. I tried creating a folder inside the share (\server2\shared) from the command prompt, but it failed with the error "access denied". As a workaround, I added that user in the "Security" tab of the folder permissions, and after that it worked fine. But this was not done in the original environment. Is this related to the trust relationship? Why does the same share for the same user behave differently in the two environments, even though they have been set with the same permissions? FYI, these are Windows 2003 servers. Can someone please help?

    Read the article

  • Does Postgresql varchar count using unicode character length or ASCII character length?

    - by bennylope
    I tried importing a database dump from a SQL file and the insert failed when inserting the string Mér into a field defined as varying(3). I didn't capture the exact error, but it pointed to that specific value with the constraint of varying(3). Given that I considered this unimportant to what I was doing at the time, I just changed the value to Mer, it worked, and I moved on. Is a varying field with its limit taking into account length of the byte string? What really boggles my mind is that this was dumped from another PostgreSQL database. So it doesn't make sense how a constraint could allow the value to be written initially.

    Read the article
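
    On the PostgreSQL question above: varchar(n) limits the number of characters, not bytes, so "Mér" fits in varying(3) when it is read as three characters. One way the insert could still fail, stated as a guess the excerpt does not confirm, is a client encoding mismatch: the same UTF-8 bytes read as Latin-1 become four characters. The Python sketch shows the counts.

```python
value = "Mér"

char_count = len(value)                   # characters
byte_count = len(value.encode("utf-8"))   # bytes in UTF-8

# The same UTF-8 bytes, misread as Latin-1, turn into four characters.
misread = value.encode("utf-8").decode("latin-1")

print(char_count, byte_count, len(misread))
```

    Under that guess, importing the dump with the client encoding set to match the dump file would keep the value at three characters.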

  • Python code, extracting extensions

    - by user1434001
        import os
        path = '/Users/Marjan/Documents/Nothing/Costco'
        print(path)
        names = os.listdir(path)
        print(len(names))
        for name in names:
            print(name)

    Here is the code I've been using; it lists all the names in this directory in the terminal. There are a few filenames in this folder (Costco) that don't end with .html or _files. I need to pick them out; the only issue is that the folder has over 2,500 filenames. I need help with code that will search through this path and pick out all the filenames that don't end with .html or _files. Thanks, guys.

    Read the article
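
    The filter asked for in the question above can be sketched with str.endswith, which accepts a tuple of suffixes. The sample names below are hypothetical; with a real directory, the list would come from os.listdir(path) as in the question.

```python
def names_without_suffix(names, suffixes=(".html", "_files")):
    """Keep only the names that end with none of the given suffixes."""
    return [name for name in names if not name.endswith(suffixes)]

sample = ["index.html", "index_files", "receipt.pdf", "notes"]
leftovers = names_without_suffix(sample)
print(leftovers)
```

    The check is a single pass over the names, so 2,500 entries are not a concern.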
