unicode string - Page 16

Unicode escaping in C/C++

- by Geo

Hi guys! I'm having a dispute with a colleague of mine. She says that the following: char* a = "\x000aaxz"; will/can be seen by the compiler as "\x000aa". I do not agree with her, as I think you can have a maximum number of 4 hex characters after the \x. Can you have more than 4 hex chars? Who is right here?

Read the article

Joomla 1.5 & Indic Unicode Fonts - How-to?

- by Ganesh

I am using Inscript Keyboard to directly type into TinyMCE. However when I click on save, all the characters appear as question marks on website and even in article list on admin side. How I should solve the problem? I am specifically talking about Marathi but the problem-solution might be same for all Devnagrari fonts. Thanks in advance.

Read the article

Determining Whether a String Is Contained Within a String Array (Case Insensitive)

About once every couple of months I need to write a bit of code that does one thing if a particular string is found within an array of strings and something else if it is not ignoring differences in case. For whatever reason, I never seem to remember the code snippet to accomplish this, so after spending 10 minutes of research today I thought I'd write it down here in an effort to help commit it to memory or, at the very least, serve as a quick place to find the answer when the need arises again.So without further adieu, here it is:Visual Basic Version:If stringArrayName.Contains("valueToLookFor", StringComparer.OrdinalIgnoreCase) Then ... Else ... End IfC# Version:if (stringArrayName.Contains("valueToLookFor", StringComparer.OrdinalIgnoreCase)) ... else ...Without the StringComparer.OrdinalIgnoreCase the search will be case-sensitive. For more information on comparing strings, see: New Recommendations for Using Strings in Microsoft .NET 2.0.Happy Programming!Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article

How to do proper Unicode and ANSI output redirection on cmd.exe?

- by Sorin Sbarnea

If you are doing automation on windows and you are redirecting the output of different commands (internal cmd.exe or external, you'll discover that your log files contains combined Unicode and ANSI output (meaning that they are invalid and will not load well in viewers/editors). Is it is possible to make cmd.exe work with UTF-8? This question is not about display, s about stdin/stdout/stderr redirection and Unicode. I am looking for a solution that would allow you to: redirect the output of the internal commands to a file using UTF-8 redirect output of external commands supporting Unicode to the files but encoded as UTF-8. If it is impossible to obtain this kind of consistence using batch files, is there another way of solving this problem, like using python scripting for this? In this case, I would like to know if it is possible to do the Unicode detection alone (user using the scripting should not remember if the called tools will output Unicode or not, it will just expect to convert the output to UTF-8. For simplicity we'll assume that if the tool output is not-Unicode it will be considered as UTF-8 (no codepage conversion).

Read the article

Delphi 2009 dbExpress and Interbase: Unicode migration steps and risks?

- by mjustin

Currently, our database uses Win1252 as the only character encoding. We will have to support Unicode in the database tables soon, which means we have to perform this migration for four databases and around 80 Delphi applications which run in-house in a 24/7 environment. Are there recommendations for database migrations to UTF-8 (or UNICODE_FSS) for Delphi applications? Some questions listed below. Many thanks in advance for your answers! are there tools which help with the migration of the existing databases (sizes between 250 MB and 2 GB, no Blob fields), by dumping the data, recreating the database with UNICODE_FSS or UTF-8, and loading the data back? are there known problems with Delphi 2009, dbExpress and Interbase 7.5 related to Unicode character sets? would you recommend to upgrade the databases to Interbase 2009 first? (This upgrade is planned but does not have a high priority) can we simply migrate the database and Delphi will handle the Unicode character sets automatically, or will we have to change all character field types in every Datamodule (dfm and source code) too? which strategy would you recommend to work on the migration in parallel with the normal development and maintenance of the existing application? The application runs in-house so development and database administration is done internally. Update: one problem I found now is that there are two different persistent field types for Unicode and non Unicode character fields. For the existing database, dbExpress creates TStringField objects. For the Unicode database fields, dbExpress creates (or expects!) TWideStringField objects. This looks like a lot of work lies ahead. While we could try to avoid persistent fields (and add calculated fields at run time), Of course we would prefer a solution which does not require so many changes in existing units and DFM files.

Read the article

Using unicodedata.normalize in Python 2.7

- by dpitch40

Once again, I am very confused with a unicode question. I can't figure out how to successfully use unicodedata.normalize to convert non-ASCII characters as expected. For instance, I want to convert the string u"Cœur" To u"Coeur" I am pretty sure that unicodedata.normalize is the way to do this, but I can't get it to work. It just leaves the string unchanged. >>> s = u"Cœur" >>> unicodedata.normalize('NFKD', s) == s True What am I doing wrong?

Read the article

Where can I find a useful multi-language Unicode font for Mac OS X?

- by Stephen Jennings

On every browser I've tried (Firefox, Safari, Chrome, and Omniweb), when I go to a web page containing somewhat less-common characters, I can't see the glyphs. For example, on the Wikipedia page for the Bengali Language, the very first line contains a string of squares; on Windows, I can see the Bengali writing. Firefox does display code points on the Coptic Language article, but not Bengali. I'm not sure why. On Windows, as long as I have the Arial Unicode MS font installed, these characters fall back to that font and display properly. Mac OS X doesn't seem to ship with a font containing these Unicode characters (it has Arial Unicode MS, but it must be a subset of the Windows version because Bengali doesn't display in that font). I checked on my Snow Leopard DVD and I installed "Additional Fonts" from the Optional Installs package, but I'm still missing many languages. Is there any good, free font that contains a large collection of languages? I know creating fonts is difficult and time-consuming, but it seems like including at least one font like this with operating systems should be standard by now.

Read the article

How can you change a ";" seperated string to some kind of dictionary?

- by Sem Dendoncker

Hi, I have a string like this: "user=u123;name=Test;lastname=User" I want to get a dictionary for this string like this: user "u123" name "Test" lastname "User" this way I can easely access the data within the string. I want to do this in C#. Cheers, M.

Read the article

How Do I Remove The First 4 Characters From A String If It Matches A Pattern In Ruby

- by James

I have the following string: "h3. My Title Goes Here" I basically want to remove the first 4 characters from the string so that I just get back: "My Title Goes Here". The thing is I am iterating over an array of strings and not all have the h3. part in front so I can't just ditch the first 4 characters blindly. I have checked the docs and the closest think I could find was chomp, but that only works for the end of a string. Right now I am doing this: "h3. My Title Goes Here".reverse.chomp(" .3h").reverse This gives me my desired output, but there has to be a better way right? I mean I don't want to reverse a string twice for no reason. I am new to programming so I might have missed something obvious, but I didn't see the opposite of chomp anywhere in the docs. Is there another method that will work? Thanks!

Read the article

Should UTF-16 be considered harmful?

- by Artyom

I'm going to ask what is probably quite a controversial question: "Should one of the most popular encodings, UTF-16, be considered harmful?" Why do I ask this question? How many programmers are aware of the fact that UTF-16 is actually a variable length encoding? By this I mean that there are code points that, represented as surrogate pairs, take more than one element. I know; lots of applications, frameworks and APIs use UTF-16, such as Java's String, C#'s String, Win32 APIs, Qt GUI libraries, the ICU Unicode library, etc. However, with all of that, there are lots of basic bugs in the processing of characters out of BMP (characters that should be encoded using two UTF-16 elements). For example, try to edit one of these characters: 𝄞 (U+1D11E) MUSICAL SYMBOL G CLEF 𝕥 (U+1D565) MATHEMATICAL DOUBLE-STRUCK SMALL T 𝟶 (U+1D7F6) MATHEMATICAL MONOSPACE DIGIT ZERO 𠂊 (U+2008A) Han Character You may miss some, depending on what fonts you have installed. These characters are all outside of the BMP (Basic Multilingual Plane). If you cannot see these characters, you can also try looking at them in the Unicode Character reference. For example, try to create file names in Windows that include these characters; try to delete these characters with a "backspace" to see how they behave in different applications that use UTF-16. I did some tests and the results are quite bad: Opera has problem with editing them (delete required 2 presses on backspace) Notepad can't deal with them correctly (delete required 2 presses on backspace) File names editing in Window dialogs in broken (delete required 2 presses on backspace) All QT3 applications can't deal with them - show two empty squares instead of one symbol. Python encodes such characters incorrectly when used directly u'X'!=unicode('X','utf-16') on some platforms when X in character outside of BMP. Python 2.5 unicodedata fails to get properties on such characters when python compiled with UTF-16 Unicode strings. StackOverflow seems to remove these characters from the text if edited directly in as Unicode characters (these characters are shown using HTML Unicode escapes). WinForms TextBox may generate invalid string when limited with MaxLength. It seems that such bugs are extremely easy to find in many applications that use UTF-16. So... Do you think that UTF-16 should be considered harmful?

Read the article

How can I check if PHP was compiled with the UNICODE version of the Win32 API?

- by Wesley Murch

This is related to this Stack Overflow post: glob() can't find file names with multibyte characters on Windows? I'm having issues with PHP and files that have multibyte characters on Windows. Here's my test case: print_r(scandir('./uploads/')); print_r(glob('./uploads/*')); Correct Output on remote UNIX server: Array ( [0] => . [1] => .. [2] => filename-äöü.jpg [3] => filename.jpg [4] => test?test.jpg [5] => ??? ?????.jpg [6] => ?????????.jpg [7] => ???.jpg ) Array ( [0] => ./uploads/filename-äöü.jpg [1] => ./uploads/filename.jpg [2] => ./uploads/test?test.jpg [3] => ./uploads/??? ?????.jpg [4] => ./uploads/?????????.jpg [5] => ./uploads/???.jpg ) Incorrect Output locally on Windows: Array ( [0] => . [1] => .. [2] => ??? ?????.jpg [3] => ???.jpg [4] => ?????????.jpg [5] => filename-äöü.jpg [6] => filename.jpg [7] => test?test.jpg ) Array ( [0] => ./uploads/filename-äöü.jpg [1] => ./uploads/filename.jpg ) Here's a relevant excerpt from the answer I chose to accept (which actually is a quote from an article that was posted online over 2 years ago): From the comments on this article: http://www.rooftopsolutions.nl/blog/filesystem-encoding-and-php The output from your PHP installation on Windows is easy to explain : you installed the wrong version of PHP, and used a version not compiled to use the Unicode version of the Win32 API. For this reason, the filesystem calls used by PHP will use the legacy "ANSI" API and so the C/C++ libraries linked with this version of PHP will first try to convert yout UTF-8-encoded PHP string into the local "ANSI" codepage selected in the running environment (see the CHCP command before starting PHP from a command line window) Your version of Windows is MOST PROBABLY NOT responsible of this weird thing. Actually, this is YOUR version of PHP which is not compiled correctly, and that uses the legacy ANSI version of the Win32 API (for compatibility with the legacy 16-bit versions of Windows 95/98 whose filesystem support in the kernel actually had no direct support for Unicode, but used an internal conversion layer to convert Unicode to the local ANSI codepage before using the actual ANSI version of the API). Recompile PHP using the compiler option to use the UNICODE version of the Win32 API (which should be the default today, and anyway always the default for PHP installed on a server that will NEVER be Windows 95 or Windows 98...) I can't confirm whether this is my problem or not. I used phpinfo() and did not find anything interesting, but I wasn't sure what to look for. I've been using XAMPP for easy installations, so I'm really not sure exactly how it was installed. I'm using Windows 7, 64 bit - so forgive my ignorance, but I'm not even sure if "Win32" is relevant here. How can I check if my current version of PHP was compiled with the configuration mentioned above? PHP Version: 5.3.8 System: Windows NT WES-PC 6.1 build 7601 (Windows 7 Home Premium Edition Service Pack 1) i586 Build Date: Aug 23 2011 11:47:20 Compiler: MSVC9 (Visual C++ 2008) Architecture: x86 Configure Command: cscript /nologo configure.js "--enable-snapshot-build" "--disable-isapi" "--enable-debug-pack" "--disable-isapi" "--without-mssql" "--without-pdo-mssql" "--without-pi3web" "--with-pdo-oci=D:\php-sdk\oracle\instantclient10\sdk,shared" "--with-oci8=D:\php-sdk\oracle\instantclient10\sdk,shared" "--with-oci8-11g=D:\php-sdk\oracle\instantclient11\sdk,shared" "--enable-object-out-dir=../obj/" "--enable-com-dotnet" "--with-mcrypt=static" "--disable-static-analyze"

Read the article

How to find first non-repetitive character from a string?

- by masato-san

I've spent half day trying to figure out this and finally I got working solution. However, I feel like this can be done in simpler way. I think this code is not really readable. Problem: Find first non-repetitive character from a string. $string = "abbcabz" In this case, the function should output "c". The reason I use concatenation instead of $input[index_to_remove] = '' in order to remove character from a given string is because if I do that, it actually just leave empty cell so that my return value $input[0] does not not return the character I want to return. For instance, $str = "abc"; $str[0] = ''; echo $str; This will output "bc" But actually if I test, var_dump($str); it will give me: string(3) "bc" Here is my intention: Given: input while first char exists in substring of input { get index_to_remove input = chars left of index_to_remove . chars right of index_to_remove if dupe of first char is not found from substring remove first char from input } return first char of input Code: function find_first_non_repetitive2($input) { while(strpos(substr($input, 1), $input[0]) !== false) { $index_to_remove = strpos(substr($input,1), $input[0]) + 1; $input = substr($input, 0, $index_to_remove) . substr($input, $index_to_remove + 1); if(strpos(substr($input, 1), $input[0]) == false) { $input = substr($input, 1); } } return $input[0]; }

Read the article

Problem initialing a unicode string

- by Simon

Hey All. Atm im working with native API calls and i have to get RtlInitUnicodeString to work. The way i use: const WCHAR wcMutex[] = L"String1"; UNICODE_STRING unicodeMutexBuffer; RtlInitUnicodeString(&unicodeMutexBuffer,wcMutex); now my problem the project doesnt compile , i get this error: Error argument of type "UNICODE_STRING*" is incompatible with type of "PUNICODE_STRING" but in my old Driver kit , i used same way to initialize the unicode string struct

Read the article

Unicode in PostgreSQL 8.4

- by user8382

I installed the "postgresql-8.4" package with default options. Everything worked fine, however I can't seem to manage to create unicode databases: -- This doesn't work createdb test1 --encoding UNICODE -- This works createdb test2 The error message "createdb: database creation failed: ERROR: new encoding (UTF8) is incompatible with the encoding of the template database (SQL_ASCII)" is a bit puzzling because (afaik) I don't use a template for creating the new db, or is it implicitely referring to the default "postgres" database for some reason ? Or maybe I'm missing a setting in a .conf file ?

Read the article

Unicode To ASCII Conversion [closed]

- by Yuvaraj

Hi all, i creating an small application in Delphi 2009. here i got problem that when i run my application in WindowsXP its working but it is not working in Windows95. i know the problem that 95 will not support Unicode. if anyone knows the solution please tell me. and also i have one more idea that converting Unicode to ASCII. is it possible please tell how to do that. Thanks in Advance Worm Regards, Yuvaraj

Read the article

JavaScript: Given an offset and substring length in an HTML string, what is the parent node?

- by Bungle

My current project requires locating an array of strings within an element's text content, then wrapping those matching strings in <a> elements using JavaScript (requirements simplified here for clarity). I need to avoid jQuery if at all possible - at least including the full library. For example, given this block of HTML: <div> <p>This is a paragraph of text used as an example in this Stack Overflow question.</p> </div> and this array of strings to match: ['paragraph', 'example'] I would need to arrive at this: <div> <p>This is a <a href="http://www.example.com/">paragraph</a> of text used as an <a href="http://www.example.com/">example</a> in this Stack Overflow question.</p> </div> I've arrived at a solution to this by using the innerHTML() method and some string manipulation - basically using the offsets (via indexOf()) and lengths of the strings in the array to break the HTML string apart at the appropriate character offsets and insert <a href="http://www.example.com/"> and </a> tags where needed. However, an additional requirement has me stumped. I'm not allowed to wrap any matched strings in <a> elements if they're already in one, or if they're a descendant of a heading element (<h1> to <h6>). So, given the same array of strings above and this block of HTML (the term matching has to be case-insensitive, by the way): <div> <h1>Example</a> <p>This is a <a href="http://www.example.com/">paragraph of text</a> used as an example in this Stack Overflow question.</p> </div> I would need to disregard both the occurrence of "Example" in the <h1> element, and the "paragraph" in <a href="http://www.example.com/">paragraph of text</a>. This suggests to me that I have to determine which node each matched string is in, and then traverse its ancestors until I hit <body>, checking to see if I encounter a <a> or <h_> node along the way. Firstly, does this sound reasonable? Is there a simpler or more obvious approach that I've failed to consider? It doesn't seem like regular expressions or another string-based comparison to find bounding tags would be robust - I'm thinking of issues like self-closing elements, irregularly nested tags, etc. There's also this... Secondly, is this possible, and if so, how would I approach it?

Read the article

What's the fastest way to convert a string of text into a url-safe variable?

- by ensnare

I'd like to convert a string of text, e.g., "User Name" into something which I can turn into part of a url, e.g., "User-Name." What's the fastest way to do a string replacement ("-" for " ") as well as make sure that characters are only [a-zA-Z0-9]?

Read the article

Parsing string logic issue c#

- by N0xus

This is a follow on from this question My program is taking in a string that is comprised of two parts: a distance value and an id number respectively. I've split these up and stored them in local variables inside my program. All of the id numbers are stored in a dictionary and are used check the incoming distance value. Though I should note that each string that gets sent into my program from the device is passed along on a single string. The next time my program receives that a signal from a device, it overrides the previous data that was there before. Should the id key coming into my program match one inside my dictionary, then a variable held next to my dictionaries key, should be updated. However, when I run my program, I don't get 6 different values, I only get the same value and they all update at the same time. This is all the code I have written trying to do this: Dictionary<string, string> myDictonary = new Dictionary<string, string>(); string Value1 = ""; string Value2 = ""; string Value3 = ""; string Value4 = ""; string Value5 = ""; string Value6 = ""; void Start() { myDictonary.Add("11111111", Value1); myDictonary.Add("22222222", Value2); myDictonary.Add("33333333", Value3); myDictonary.Add("44444444", Value4); myDictonary.Add("55555555", Value5); myDictonary.Add("66666666", Value6); } private void AppendString(string message) { testMessage = message; string[] messages = message.Split(','); foreach(string w in messages) { if(!message.StartsWith(" ")) outputContent.text += w + "\n"; } messageCount = "RSSI number " + messages[0]; uuidString = "UUID number " + messages[1]; if(myDictonary.ContainsKey(messages[1])) { Value1 = messageCount; Value2 = messageCount; Value3 = messageCount; Value4 = messageCount; Value5 = messageCount; Value6 = messageCount; } } How can I get it so that when programs recives the first key, for example 1111111, it only updates Value1? The information that comes through can be dynamic, so I'd like to avoid harding as much information as I possibly can.

Read the article

PHP PDF library with unicode support. anybody??

- by soden

I tried dompdf. Its a lot easier than the other libraries but it doesnt have unicode support. Well it has unicode support but requires another library calld PDFLib ($1k version). so just wondering if anybodies ever stumbled upon or used any PHP pdf library which is easy to use and has unicode support. thanks,

Read the article

How do I best remove the unicode characters that XHTML regards as non-valid using php?

- by Andrew Stacey

I run a forum designed to support an international mathematics group. I've recently switched it to unicode for better support of international characters. In debugging this conversion, I've discovered that not all unicode characters are considered as valid XHTML (the relevant website appears to be http://www.w3.org/TR/unicode-xml/). One of the steps that the forum software goes through before presenting the posts to the browser is an XHTML validation/sanitisation step. It seems a reasonable idea that at that stage it should remove any unicode characters that XHTML doesn't like. So my question is: Is there a standard (or best) way of doing this in PHP? (The forum is written in PHP, by the way.) I guess that the failsafe would be a simple str_replace (if that's also the best, do I need to do anything extra to make sure it works properly with unicode?) but that would involve me having to go through the XHTML DTD (or the above-referenced W3 page) carefully to figure out what characters to list in the search part of str_replace, so if this is the best way, has someone already done that so that I can steal, err, copy, it? (Incidentally, the character that caused the problem was U+000C, the 'formfeed', which (according to the W3 page) is valid HTML but invalid XHTML!)

Read the article

Is the Unicode prefix N still needed in SQL Compact Edition?

- by Dave

At least in previous versions of SQL Server, you had to prefix Unicode string constants with an "N" to make them be treated as Unicode. Thus, select foo from bar where fizz = N'buzz' (See "Server-Side Programming with Unicode" for SQL Server 2005 "from the horse's mouth" documentation.) We have an application that is using SQL Compact Edition and I am wondering if that is still necessary. From the testing I am doing, it appears to be unneeded. That is, the following SQL statements both behave identically in SQL CE, but the second one fails in SQL Server 2005: select foo from bar where foo=N'???' select foo from bar where foo='???' (I hope I'm not swearing in some language I don't know about...) I'm wondering if that is because all strings are treated as Unicode in SQL CE, or if perhaps the default code page is now Unicode-aware. If anyone has seen any official documentation, either yea or nay, I'd appreciate it. I know I could go the safe route and just add the "N"'s, but there's a lot of code that will need changed, and if I don't need to, I don't want to! Thanks for your help!

Read the article

Convert byte array to understandable String

- by Ender

I have a program that handles byte arrays in Java, and now I would like to write this into a XML file. However, I am unsure as to how I can convert the following byte array into a sensible String to write to a file. Assuming that it was Unicode characters I attempted the following code: String temp = new String(encodedBytes, "UTF-8"); Only to have the debugger show that the encodedBytes contain "\ufffd\ufffd ^\ufffd\ufffd-m\ufffd\ufffd\/ufffd \ufffd\ufffdIA\ufffd\ufffd". The String should contain a hash in alphanumerical format. How would I turn the above String into a sensible String for output?

Read the article

How do I fix this unicode/cPickle error in Python?

- by alex

ids = cPickle.loads(gem.value) loads() argument 1 must be string, not unicode

Read the article

Why does Python sometimes upgrade a string to unicode and sometimes not?

- by samtregar

I'm confused. Consider this code working the way I expect: >>> foo = u'Émilie and Juañ are turncoats.' >>> bar = "foo is %s" % foo >>> bar u'foo is \xc3\x89milie and Jua\xc3\xb1 are turncoats.' And this code not at all working the way I expect: >>> try: ... raise Exception(foo) ... except Exception as e: ... foo2 = e ... >>> bar = "foo2 is %s" % foo2 ------------------------------------------------------------ Traceback (most recent call last): File "<ipython console>", line 1, in <module> UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128) Can someone explain what's going on here? Why does it matter whether the unicode data is in a plain unicode string or stored in an Exception object? And why does this fix it: >>> bar = u"foo2 is %s" % foo2 >>> bar u'foo2 is \xc3\x89milie and Jua\xc3\xb1 are turncoats.' I am quite confused! Thanks for the help! UPDATE: My coding buddy Randall has added to my confusion in an attempt to help me! Send in the reinforcements to explain how this is supposed to make sense: >>> class A: ... def __str__(self): return "string" ... def __unicode__(self): return "unicode" ... >>> "%s %s" % (u'niño', A()) u'ni\xc3\xb1o unicode' >>> "%s %s" % (A(), u'niño') u'string ni\xc3\xb1o' Note that the order of the arguments here determines which method is called!

Read the article

What is the best way to use Unicode in C++ on iPhone?

- by Olli Wang

Hi, I want to create my C++ libraries with Unicode support so they can be reused on other platforms. I have found the ICU (International Components for Unicode) project but I also found a discuss about Apple rejecting for using ICU (see http://tinyurl.com/y86phfb). So how do you guys use Unicode in C++ on iPhone? Thanks.

Search Results

Search found 36172 results on 1447 pages for 'unicode string'.

Page 16/1447 | < Previous Page | 12 13 14 15 16 17 18 19 20 21 22 23 | Next Page >

- by Geo

- by Ganesh

- by Sorin Sbarnea

- by mjustin

- by dpitch40

- by Stephen Jennings

- by Sem Dendoncker

- by James

- by Artyom

- by Wesley Murch

- by masato-san

- by Simon

- by user8382

- by Yuvaraj

- by Bungle

- by ensnare

- by N0xus

- by soden

- by Andrew Stacey

- by Dave

- by Ender

- by alex

- by samtregar

- by Olli Wang

< Previous Page | 12 13 14 15 16 17 18 19 20 21 22 23 | Next Page >