Search Results

Search found 9564 results on 383 pages for 'character encoding'.


  • Regex, encoding, and characters that look alike

    - by hack.augusto
    First, a brief example: let's say I have the regex "/[0-9]{2}°/" and the text "24º". The text won't match, obviously (or does it? really, it depends on the character encoding). Here is my problem: I have no control over which characters the user types, so I either need to cover all possibilities in the regex, /[0-9]{2}[°º]/, or, even better, ensure that the text contains only the characters I'm expecting, such as °. But I can't just remove the unknown characters or the regex won't work; I need to replace each one with the look-alike character I am expecting. I have done this with a little function that maps "looks like" to "what I expect" and substitutes it. The problem is that I haven't covered all possibilities; for example, today I found a new "-", so now we have three of them, just like LaTeX (- -- ---), and the regex didn't work. Does anyone know how I might solve this?
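
    A minimal sketch of the mapping approach described above, written in Java for illustration; the class name and the map entries are hypothetical, not taken from the question:

    ```java
    import java.util.Map;
    import java.util.regex.Pattern;

    public class LookAlikeNormalizer {
        // Map of visually similar characters to the canonical one the regex expects.
        // The entries here are illustrative, not exhaustive.
        private static final Map<Character, Character> LOOK_ALIKES = Map.of(
                'º', '°',      // masculine ordinal indicator -> degree sign
                '\u2013', '-', // en dash  -> hyphen-minus
                '\u2014', '-', // em dash  -> hyphen-minus
                '\u2212', '-'  // minus sign -> hyphen-minus
        );

        private static final Pattern DEGREES = Pattern.compile("[0-9]{2}°");

        static String normalize(String input) {
            StringBuilder sb = new StringBuilder(input.length());
            for (char c : input.toCharArray()) {
                sb.append(LOOK_ALIKES.getOrDefault(c, c));
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            System.out.println(DEGREES.matcher(normalize("24º")).find()); // true
        }
    }
    ```

    The map will always lag behind real-world input, so logging any character that both the map and the regex reject is a cheap way to discover new look-alikes.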

    Read the article

  • Chinese encoding issue while listing files

    - by Null Pointer
    I am running a Java application on Solaris 10 with Chinese. There are some files in a directory with Chinese filenames. When I do files = new File(dir).list(), where "dir" is the parent directory containing a Chinese file, the resulting filename files[0] comes back as ????? (junk characters). The thing is, my program's file.encoding property is already set to GBK, and Charset.isSupported("GBK") returns true too. So where could the problem be? I am running out of ideas. NOTE: I am not trying to print the filename anywhere or copy the file or anything like that. I am simply opening a stream to it, something like: files = new File(dir).list(); new FileInputStream(files[0]); This gives me a FileNotFoundException, and when I debug I find that the value inside files[0] is "??????".
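
    The JVM decodes file names with the platform charset taken from the locale it was started in, so setting file.encoding alone may not be enough. A small diagnostic sketch (the class is hypothetical; sun.jnu.encoding is a Sun/Oracle-specific property), assuming the process is launched under a GBK locale such as LC_ALL=zh_CN.GBK:

    ```java
    import java.io.File;
    import java.nio.charset.Charset;

    public class FilenameCharsetCheck {
        public static void main(String[] args) {
            // File names are decoded with the JVM's platform charset, which comes from
            // the locale the process was started in -- not from file.encoding alone.
            System.out.println("defaultCharset   = " + Charset.defaultCharset());
            System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
            System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));

            for (String name : new File(args[0]).list()) {
                // Print code points rather than glyphs, so '?' replacement is visible.
                name.chars().forEach(cp -> System.out.printf("U+%04X ", cp));
                System.out.println();
            }
        }
    }
    ```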

    Read the article

  • Guessing UTF-8 encoding

    - by Dervin Thunk
    I have a question that may be quite naive, but I feel the need to ask because I don't really know what is going on. I'm on Ubuntu. Suppose I do echo "t" > test.txt; if I then run file test.txt, I get "test.txt: ASCII text". If I instead do echo "å" > test.txt, then I get "test.txt: UTF-8 Unicode text". How does that happen? How does file "know" the encoding, or, alternatively, how does it guess it? Thanks.
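
    file(1) has nothing but the bytes to go on: all bytes below 0x80 look like ASCII, while a byte sequence that happens to be valid UTF-8 (like å's 0xC3 0xA5) gets reported as UTF-8. A rough sketch of that heuristic in Java (not file's actual implementation):

    ```java
    import java.nio.ByteBuffer;
    import java.nio.charset.CharacterCodingException;
    import java.nio.charset.CodingErrorAction;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class GuessEncoding {
        // Pure 7-bit bytes -> ASCII; otherwise try a strict UTF-8 decode and,
        // if it succeeds, call the file UTF-8.
        static String guess(byte[] data) {
            boolean sevenBit = true;
            for (byte b : data) {
                if ((b & 0x80) != 0) { sevenBit = false; break; }
            }
            if (sevenBit) return "ASCII";
            try {
                StandardCharsets.UTF_8.newDecoder()
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT)
                        .decode(ByteBuffer.wrap(data));
                return "UTF-8";
            } catch (CharacterCodingException e) {
                return "unknown (not valid UTF-8)";
            }
        }

        public static void main(String[] args) throws Exception {
            System.out.println(guess(Files.readAllBytes(Paths.get(args[0]))));
        }
    }
    ```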

    Read the article

  • PHP encoding with DOMDocument

    - by Olivier Lalonde
    <tag> Алекс М </tag> When I try to get the content of the preceding tag using DOMDocument functions, it returns something like: ÐÐ»ÐµÐºÑ Ðœ. I've tried setting the DOMDocument encoding to different values (UTF-8, ISO-8859-1) and using mb_convert_encoding, iconv and utf8_encode, but without success. How can I get "Алекс М" instead of "ÐÐ»ÐµÐºÑ Ðœ"? EDIT: The input is coming from a page loaded with curl. When I output the page content to my browser, the characters are displayed correctly (so I doubt the input is the problem).

    Read the article

  • UriBuilder incorrectly encoding Query Parameters value ?

    - by Fred
    Let's consider the following code sample, where a path and a single parameter are encoded... Parameter name: "param"; parameter value: "foo/bar?aaa=bbb&ccc=ddd" (which happens to be a URL with query parameters). String test = UriBuilder.fromPath("https://dummy.com").queryParam("param", "foo/bar?aaa=bbb&ccc=ddd").build().toURL().toString(); The encoded URL string returned is "https://dummy.com?param=foo/bar?aaa%3Dbbb&ccc%3Dddd". Is this correct? Shouldn't the character "&" (and maybe even "?") be encoded in the parameter value string? Wouldn't the URL produced be interpreted as follows: a first parameter, name="param", value="foo/bar?aaa%3Dbbb", followed by a second parameter, name="ccc%3Dddd", with no value?
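
    For comparison, this is what the value looks like when it is fully percent-encoded up front with java.net.URLEncoder (a hedged workaround sketch, not a statement of what the JAX-RS spec requires; note that URLEncoder applies form-encoding rules, e.g. a space becomes '+'):

    ```java
    import java.net.URLEncoder;

    public class QueryParamEncoding {
        public static void main(String[] args) throws Exception {
            String value = "foo/bar?aaa=bbb&ccc=ddd";
            // Percent-encode the whole value so '?', '&', '=' and '/' inside it cannot
            // be mistaken for URL delimiters by whatever parses the final URL.
            String encoded = URLEncoder.encode(value, "UTF-8");
            System.out.println("https://dummy.com?param=" + encoded);
            // -> https://dummy.com?param=foo%2Fbar%3Faaa%3Dbbb%26ccc%3Dddd
        }
    }
    ```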

    Read the article

  • SQLite character encoding for Google Gears

    - by MHD
    We're using jQuery to get a JSON string from our server (UTF-8 response, also a UTF-8 request through jQuery) and put this JSON into a Google Gears WorkerPool. This WorkerPool processes the JSON and stores it in a Gears database (SQLite). It turns out that, apparently, SQLite stores the data using ISO-8859-1 rather than UTF-8. Since we're trying to store user names that might contain Cyrillic characters (and others that you might encounter in Europe), this goes horribly wrong. Can anyone tell me how to change the character encoding in either the Gears WorkerPool or the SQLite database that Gears employs? Of course, if I'm looking in the wrong direction with my problem, feel free to offer alternatives! Unfortunately, HTML5 isn't an option as we're supposed to support IE7 primarily.

    Read the article

  • Some special characters defined in "ISO-8859-1" can't be shown when encoding with "UTF-8"

    - by Mike.Huang
    I need to get a string from the browser's URL request and then create a text image from the requested text. I know the default encoding for Java network transmission is ISO-8859-1, and it works normally for all characters defined in ISO-8859-1. But when the request contains a multi-byte Unicode character (e.g. Chinese, or something like ¤), I need to re-decode it as UTF-8 from ISO-8859-1. My code looks like: String result = new String(requestString.getBytes("ISO-8859-1"), "UTF-8"); That mostly works, but now some ISO-8859-1 characters are not shown anymore, namely those in the range 0x80 - 0xFF: the characters above 0x80 no longer appear once the string has been re-decoded as UTF-8. Is there another method that can solve this?
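
    The blanket re-decode breaks the 0x80-0xFF range because a lone Latin-1 byte such as 0xE4 is not a valid UTF-8 sequence. A hedged sketch of a fallback (the class and method names are made up): try a strict UTF-8 decode first and keep the ISO-8859-1 interpretation when it fails. In a servlet, calling request.setCharacterEncoding("UTF-8") before reading parameters is usually the cleaner fix.

    ```java
    import java.nio.ByteBuffer;
    import java.nio.charset.CharacterCodingException;
    import java.nio.charset.CodingErrorAction;
    import java.nio.charset.StandardCharsets;

    public class RequestDecoding {
        // The raw bytes arrived labelled ISO-8859-1. Try a strict UTF-8 decode first;
        // if the bytes are not valid UTF-8 (e.g. a lone 0xE4 for 'ä'), keep ISO-8859-1.
        static String decode(String latin1Decoded) {
            byte[] raw = latin1Decoded.getBytes(StandardCharsets.ISO_8859_1);
            try {
                return StandardCharsets.UTF_8.newDecoder()
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT)
                        .decode(ByteBuffer.wrap(raw))
                        .toString();
            } catch (CharacterCodingException notUtf8) {
                return latin1Decoded;
            }
        }
    }
    ```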

    Read the article

  • Efficient JSON encoding for data that may be binary, but is often text

    - by Evgeny
    I need to send a JSON packet across the wire with the contents of an arbitrary file. This may be a binary file (like a ZIP file), but most often it will be plain ASCII text. I'm currently using base64 encoding, which handles all files, but it increases the size of the data significantly - even if the file is ASCII to begin with. Is there a more efficient way I can encode the data, other than manually checking for any non-ASCII characters and then deciding whether or not to base64-encode it? I'm currently writing this in Python, but will probably need to do the same in Java, C# and C++, so an easily portable solution would be preferable.
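
    One portable scheme (a sketch, not a standard): tag each payload with how it was encoded, so plain text carries no base64 overhead and binary still round-trips. The layout of the returned pair is invented here; both ends of the wire just have to agree on it.

    ```java
    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    public class PayloadEncoder {
        // Returns { encoding marker, payload string }. Plain printable ASCII goes in
        // as-is; anything else falls back to base64.
        static String[] encode(byte[] data) {
            for (byte b : data) {
                int v = b & 0xFF;
                boolean printable = (v >= 0x20 && v < 0x7F) || v == '\n' || v == '\r' || v == '\t';
                if (!printable) {
                    return new String[] { "base64", Base64.getEncoder().encodeToString(data) };
                }
            }
            return new String[] { "text", new String(data, StandardCharsets.US_ASCII) };
        }
    }
    ```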

    Read the article

  • Python encoding for pipe.communicate

    - by Brian M. Hunt
    I'm calling pipe.communicate from Python's subprocess module in Python 2.6. I get the following error from this code: from subprocess import Popen pipe = Popen(cwd) pipe.communicate( data ) For an arbitrary cwd, and where data contains unicode (specifically 0xE9): Exec. exception: 'ascii' codec can't encode character u'\xe9' in position 507: ordinal not in range(128) Traceback (most recent call last): ... stdout, stderr = pipe.communicate( data ) File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/subprocess.py", line 671, in communicate return self._communicate(input) File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/subprocess.py", line 1177, in _communicate bytes_written = os.write(self.stdin.fileno(), chunk) This is happening, I presume, because pipe.communicate() is expecting an ASCII-encoded string, but data is unicode. Is this the problem I'm encountering, and is there a way to pass unicode to pipe.communicate()? Thank you for reading! Brian

    Read the article

  • Possible Encoding Issue Reading HTM File using .Net Streamreader

    - by Brian Boatright
    I have an HTML file with a ® (copyright) and ™ (trademark) symbol in the text. These are just two among many other symbols. When I read the HTML file into a literal control, it converts the symbols to something else. The copyright symbol converts to ? (an open box in Firefox); the trademark symbol converts to ™ (as expected). If (System.IO.File.Exists(FullName)) Then Dim StreamReader1 As New System.IO.StreamReader(FullName) Contents.Text = StreamReader1.ReadToEnd() StreamReader1.Close() End If Contents is a <asp:Literal runat="server" ID="Contents"></asp:Literal> and it's the only control in the aspx page. From some research I think this is related to the encoding, but I don't know why it would change or how to fix it. The HTML file does not contain any Content-Type settings in the head section.

    Read the article

  • How to display characters in http get response correctly with the right encoding

    - by DixieFlatline
    Hello! Does anyone know how to read č, š, ž characters in an HTTP GET response properly? When I make my request in the browser, the browser displays all the characters correctly, but in a Java program with the Apache jars I don't know how to set the encoding right. I tried client.getParams().setParameter(CoreProtocolPNames.HTTP_CONTENT_CHARSET, "UTF-8"); but it's not working. My code: HttpClient client = new DefaultHttpClient(); String getURL = "http://www.google.com"; HttpGet get = new HttpGet(getURL); HttpResponse responseGet = client.execute(get); HttpEntity resEntityGet = responseGet.getEntity(); if (resEntityGet != null) { Log.i("GET RESPONSE", EntityUtils.toString(resEntityGet)); } } catch (Exception e) { e.printStackTrace(); }
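
    With Apache HttpClient 4.x, one knob worth checking is the EntityUtils.toString overload that takes an explicit charset; the single-argument form falls back to the response's Content-Type header and then to ISO-8859-1. A sketch, assuming the server really does send UTF-8:

    ```java
    import org.apache.http.HttpEntity;
    import org.apache.http.HttpResponse;
    import org.apache.http.client.HttpClient;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.util.EntityUtils;

    public class Utf8Get {
        public static void main(String[] args) throws Exception {
            HttpClient client = new DefaultHttpClient();
            HttpResponse response = client.execute(new HttpGet("http://www.google.com"));
            HttpEntity entity = response.getEntity();
            if (entity != null) {
                // Pass the charset explicitly instead of relying on the default fallback.
                String body = EntityUtils.toString(entity, "UTF-8");
                System.out.println(body);
            }
        }
    }
    ```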

    Read the article

  • Repair bad character due to encoding problem

    - by remi bourgarel
    Hi all, recently we had an encoding problem in our system: if we had the string "æ" in our db, it appeared as "Ã¦" on our web pages. That problem is solved now, but the result is that we have a lot of "Ã¦" in our database: users didn't notice, and validated pre-filled forms containing these characters. I found that if you read the bytes C3 A6 as UTF-8 you get "æ", and if you read them as ASCII/Latin-1 you get "Ã¦". It's strange, because if I execute "select convert(varbinary(40),N'æ'),convert(varbinary(40),'æ')" I don't get the same result... Do you have any idea how I can fix my database (i.e. change all "Ã¦" back to "æ")? thx
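
    The question is about fixing rows inside SQL Server, but the underlying repair is a byte-level re-decode. A hedged sketch of that idea in Java (e.g. for a one-off cleanup script), assuming each bad value went through exactly one UTF-8-read-as-Latin-1 round trip:

    ```java
    import java.nio.charset.StandardCharsets;

    public class MojibakeRepair {
        // "Ã¦" came from æ's UTF-8 bytes (0xC3 0xA6) being read as Latin-1.
        // Reverse the mistake: serialize back to those bytes, then decode as UTF-8.
        static String repair(String garbled) {
            byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
            return new String(originalBytes, StandardCharsets.UTF_8);
        }

        public static void main(String[] args) {
            System.out.println(repair("\u00C3\u00A6")); // prints æ
        }
    }
    ```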

    Read the article

  • How to specify character encoding for Ant Task parameters in Java

    - by räph
    I'm writing an Ant task in Java. In my build.xml I specify parameters that should be read by my Java class. Problems occur when I use special characters, like German umlauts (Ö, Ä, Ü), in these parameters. In my Java task they appear as ?-characters (using System.out.print). All my files are encoded as UTF-8, and my build.xml has the corresponding declaration: <?xml version="1.0" encoding="UTF-8" ?>. As for the details of writing the task: I follow http://ant.apache.org/manual/develop.html (especially point 5, nested elements). I have nested elements in my task like <parameter name="test" value="ÖÄÜtest"/> and a Java method to read the parameter values: public void addConfiguredParameter(Parameter prop) { System.out.println(prop.getValue()); //prints ???test }

    Read the article

  • Encoding a string as an integer .NET

    - by Paul Knopf
    I have a string that I would like represented uniquely as an integer. For example: A3FJEI = 34950140. How would I go about writing an EncodeAsInteger(string) method? I understand that the number of characters in the string will make the integer grow quickly, forcing the value to become a long, not an int. Since I need the value to be an integer, I don't need the numerical representation to be entirely unique to the string. Maybe I can foreach through all the characters of the string and sum the numerical keycode of each character.
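
    A truly unique mapping into a 32-bit int is impossible for arbitrary strings (pigeonhole), so this ends up being a hash. The question targets .NET, but the idea is language-neutral; a sketch in Java of a deterministic, non-unique mapping (essentially the polynomial hash that String.hashCode uses):

    ```java
    public class StringToInt {
        // Deterministic, but NOT unique: many strings share the same int.
        static int encodeAsInteger(String s) {
            int h = 0;
            for (int i = 0; i < s.length(); i++) {
                h = 31 * h + s.charAt(i);
            }
            return h;
        }

        public static void main(String[] args) {
            System.out.println(encodeAsInteger("A3FJEI"));
        }
    }
    ```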

    Read the article

  • encoding of =1 in emails

    - by Maenny
    Hi folks, I probably have a stupid problem. In a script I generate a URL with GET parameters, something like 'www.mydomain.com/index.php?item=1234'. This URL is sent by PHP through mail() in UTF-8 encoding (the script file itself is also UTF-8). Now, each time the GET parameter has two numbers after the '=', the URL in the email looks like 'www.mydomain.com/index.php?item?34', with a rectangle instead of '=12'. I am sure there is an easy way to fix this? Thanks in advance, Maenny

    Read the article

  • remove non-UTF-8 characters from xml with declared encoding=utf-8 - Java

    - by St Nietzke
    I have to handle this scenario in Java: I'm getting a request in XML form from a client with a declared encoding of UTF-8. Unfortunately it may contain non-UTF-8 characters, and there is a (legacy) requirement to remove these characters from the XML on my side. Let's consider an example where this invalid XML contains £ (pound). 1) I get the XML as a Java String with £ in it (I don't have access to the interface right now, but I probably get the XML as a Java String). Can I use replaceAll("£", "") to get rid of this character? Any potential issues? 2) I get the XML as an array of bytes - how do I handle this operation safely in that case?
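
    For the byte-array case, a hedged sketch (the class and method names are made up): let a strict UTF-8 CharsetDecoder drop whatever is not valid UTF-8. For the String case, the pound sign has already been decoded into a char, and replaceAll("£", "") will work for that one character but not for anything else that slipped through.

    ```java
    import java.nio.ByteBuffer;
    import java.nio.charset.CharacterCodingException;
    import java.nio.charset.CodingErrorAction;
    import java.nio.charset.StandardCharsets;

    public class Utf8Cleaner {
        // Decode the raw request bytes as UTF-8, silently dropping any byte sequence
        // that is not valid UTF-8 (e.g. a Latin-1 0xA3 for '£').
        static String dropInvalidUtf8(byte[] raw) throws CharacterCodingException {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.IGNORE)
                    .onUnmappableCharacter(CodingErrorAction.IGNORE)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        }
    }
    ```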

    Read the article

  • Typical text encoding and EOL behavior on mobile devices

    - by Dan W
    Typical things to worry about when dealing with text are the BOM/signature, encoding, and the end of line (EOL) char/chars. I know that Windows often favours \r\n (CR+LF) and Mac/Linux favours \n (LF), but how about popular mobile devices such as the iPhone and Android? Do typical apps on those platforms favour one or the other (or maybe even \r for iOS)? I'll supply both types to the user just in case, but I'd like to choose one as default. Also, which text encodings are mobiles most likely to use - UTF-8, iso-8859-1, Windows 1252 (or other default codepage) or maybe even UTF-16? And if they use UTF-8/16, are they likely to need (or require not having) a BOM/signature? What is the typical behavior here? Once again, I'll supply a range of encodings to the user just in case, but I'd like to prioritize or use certain encodings as default if it's appropriate.

    Read the article

  • C++ unicode UTF-16 encoding

    - by Dan
    Hi all, I have a wide char string L"hao123--我的上网主页", and it must be encoded to "hao123--\u6211\u7684\u4E0A\u7F51\u4E3B\u9875". I was told that the encoded string uses a special "%uNNNN" format for encoding Unicode UTF-16 code points. This website (http://rishida.net/tools/conversion/) tells me these are JavaScript escapes. But I don't know how to produce this encoding in C++. Is there any library to do this work, or can you give me some tips? Thanks, my friends!
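
    The escaping itself is just per-code-unit formatting. A sketch shown in Java for consistency with the other examples on this page (the same loop ports directly to C++ over a std::wstring); the class name is hypothetical:

    ```java
    public class JsEscape {
        // Escape every UTF-16 code unit above 0x7F as %uNNNN (the old JavaScript
        // escape() form); swap "%u" for "\\u" if \uNNNN output is wanted instead.
        static String escapeNonAscii(String s) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < s.length(); i++) {
                char c = s.charAt(i);
                if (c > 0x7F) {
                    sb.append(String.format("%%u%04X", (int) c));
                } else {
                    sb.append(c);
                }
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            System.out.println(escapeNonAscii("hao123--\u6211\u7684\u4E0A\u7F51\u4E3B\u9875"));
            // -> hao123--%u6211%u7684%u4E0A%u7F51%u4E3B%u9875
        }
    }
    ```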

    Read the article

  • Deciphering Encoding: Packet Analyzation Tools

    - by Zombies
    I am looking for better tools than Wireshark for this. The problem with Wireshark is that it does not format the data layer (which is the only part I am looking at) cleanly enough for me to compare the different packets and attempt to understand the third-party encoding (which is closed source). Specifically, what are some good tools for viewing data, rather than TCP/UDP header information? In particular, a tool that formats the data for comparison. To be very specific: I would like a program that compares multiple (not just 2) files in hex.

    Read the article

  • IE 8 Chinese encoding characters

    - by digitalbart
    Hello, I am unable to render Chinese characters in IE 8. I have researched this and I am aware of the meta tag to force compatibility mode. I am also aware of the language pack you can install. Finally, I have seen that Microsoft actually forces IE7 compatibility mode on their Chinese website: http://www.microsoft.com/zh/cn/default.aspx I am wondering if anyone has any alternative solutions to this problem. None of them seems that appealing to me. I am using UTF-8 as my encoding and this problem only occurs in IE8. Thanks

    Read the article

  • Best way to correct garbled data caused by false encoding

    - by ercan
    Hi all, I have a set of data that contains garbled text fields because of encoding errors during many imports/exports from one database to another. Most of the errors were caused by converting UTF-8 to ISO-8859-1. Strangely enough, the errors are not consistent: the word 'München' appears as 'MÃ¼nchen' in some places and as 'MÃœnchen' in others. Is there a trick in SQL Server to correct this kind of crap? The first thing I can think of is to exploit the COLLATE clause, so that Ã¼ is interpreted as ü, but I don't know exactly how. If it isn't possible to do it at the DB level, do you know of any tool that helps with a bulk correction? (Not a manual find/replace tool, but a tool that somehow guesses the garbled text and corrects it.)

    Read the article

  • Prevent ASP.NET from encoding strings on output

    - by Darkwater23
    How can I stop ASP.Net from encoding anchor tags in List Items when the page renders? I have a collection of objects. Each object has a link property. I did a foreach and tried to output the links in a BulletedList, but ASP encoded all the links. Any idea? Thanks! Here's the offending snippet of code. When the user picks a specialty, I use the SelectedIndexChange event to clear and add links to the BulletedList: if (SpecialtyList.SelectedIndex > 0) { PhysicianLinks.Items.Clear(); foreach (Physician doc in docs) { if (doc.Specialties.Contains(SpecialtyList.SelectedValue)) { PhysicianLinks.Items.Add(new ListItem("<a href=\"" + doc.Link + "\">" + doc.FullName + "</a>")); } } }

    Read the article

  • String Encoding doesn't output all characters

    - by AndroidXTr3meN
    My client uses InputStreamReader/BufferedReader to fetch text from the Internet. However, when I save the text to a *.txt file, it shows extra weird symbols like 'Â'. I've tried converting the String to ASCII, but that messes up å, ä, ö, Ø, which I use. I've tried food = food.replace("Â", ""); and indexOf(), but the string can't find it - yet it's there in a hex editor. So, in summary: when I use text.setText(...) on Android, the output looks fine with NO weird symbols, but when I save the text to *.txt I get about 4 of 'Â'. I do not want ASCII because I use other non-ASCII characters. The 'Â' is displayed as whitespace on my Android device and in Notepad. Thanks! Have a great weekend! EDIT: found   in WordPad
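
    'Â' (U+00C2) is usually the first byte of a UTF-8 non-breaking space (0xC2 0xA0) being shown through the wrong charset, which is also why it looks like whitespace. A hedged sketch (the helper and its names are hypothetical, though "food" is borrowed from the question): strip the non-breaking space and write the file with an explicit UTF-8 writer rather than the platform default.

    ```java
    import java.io.BufferedWriter;
    import java.io.FileOutputStream;
    import java.io.OutputStreamWriter;
    import java.nio.charset.StandardCharsets;

    public class SaveText {
        static void save(String food, String path) throws Exception {
            // U+00A0 (non-breaking space) is what usually shows up as "Â " after a
            // charset mix-up; replace it with a plain space if it isn't wanted.
            String cleaned = food.replace('\u00A0', ' ');
            try (BufferedWriter w = new BufferedWriter(
                    new OutputStreamWriter(new FileOutputStream(path), StandardCharsets.UTF_8))) {
                w.write(cleaned);
            }
        }
    }
    ```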

    Read the article

  • Help with proper character encoding.

    - by mmattax
    I have an HTML form that is sometimes submitted with accented characters: à, è, ì, ò, ù. I have a PHP script that exports these form submissions into CSV format. When I look at the CSV in a text editor (vim or Notepad, for example) the characters look fine, but when it is opened with OpenOffice or Word I get some funky results: ????? I am also passing these submissions to Salesforce and am getting an error: "The entity "Atilde" was referenced, but not declared." What can I do to ensure the portability of my CSV file? What's the proper way to handle the encoding? My HTML page's content type is set as Content-Type: text/html; charset=utf-8, and the data is being stored in MySQL with the latin1_swedish_ci collation.

    Read the article

  • Understanding character encoding in typical Java web app

    - by Marcus
    Some pseudocode from a typical web app: String a = "A bunch of text"; //UTF-16 saveTextInDb(a); //Write to Oracle VARCHAR(15) column String b = readTextFromDb(); //UTF-16 out.write(b); //Write to http response In the first line we create a Java String which uses UTF-16. When you save to Oracle VARCHAR(15) does Oracle also store this as UTF-16? Does the length of an Oracle VARCHAR refer to number of Unicode characters (and not number of bytes)? And then when we write b to the ServletResponse is this being written as UTF-16 or are we by default converting to another encoding like UTF-8?
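
    On the last point: the bytes written to the HTTP response are produced by whatever charset the response declares, and the servlet default is ISO-8859-1, not UTF-16 or UTF-8. A minimal sketch (hypothetical servlet, assuming the javax.servlet API):

    ```java
    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class TextServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            // The writer converts Java's internal UTF-16 chars to whatever charset is
            // declared here; if nothing is set, the servlet default is ISO-8859-1.
            resp.setContentType("text/html; charset=UTF-8");
            PrintWriter out = resp.getWriter();
            out.write("A bunch of text");
        }
    }
    ```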

    Read the article
