Search Results

Search found 1003 results on 41 pages for 'utf8'.

  • Unable to convert file to UTF-8

    - by antoniocs
    I am on Windows XP SP3 and I am trying to convert a file from ASCII to UTF-8. I use Notepad++ to do this: I go to Encoding > Convert to UTF-8 without BOM, save the file, and reopen it, but it is still shown as ASCII. I am using this file in a web page and I need it to be UTF-8, because I have strings in UTF-8 and I am seeing little squares with ? in them.
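
    One possible explanation, offered as an assumption rather than a diagnosis: if the file contains only ASCII characters, its bytes are identical in ASCII and in UTF-8 without BOM, so an editor has nothing to tell the two apart. A minimal Python sketch with made-up sample strings:

```python
# Pure-ASCII text produces exactly the same bytes under both encodings.
ascii_text = "plain text, no accents"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# Once a non-ASCII character appears, UTF-8 uses a multi-byte sequence for it.
accented = "caf\u00e9"                  # "café"
utf8_bytes = accented.encode("utf-8")   # the é becomes the two bytes C3 A9

print(same_bytes)
print(utf8_bytes)
```

    If that is what is happening, the file is already valid UTF-8 and the squares would have to come from somewhere else, for example the charset the web server declares.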

  • How to make emacs accept UTF-8 from the keyboard

    - by Brent.Longborough
    My friends have persuaded me to "try again" (about the 5th time in about 12 years) with Emacs. I'm currently suffering a little, and need help with Emacs + UTF-8. I'm running the 23.3.1 Emacs GUI on Windows 7 with my own custom keyboard layout (built with MS Keyboard Layout Creator). The layout has a full ISO-8859-1 (Latin-1) character set, plus some additional characters from ISO-8859-9 (Latin-5, ğış etc. for Turkish) and ŵ for Welsh (don't know where that one lives). In my .emacs, I have (blindly) added these lines:

        ;; key board / input method settings
        (setq locale-coding-system 'utf-8)
        (set-terminal-coding-system 'utf-8)
        (set-keyboard-coding-system 'utf-8)
        (set-language-environment 'UTF-8) ; prefer utf-8 for language settings

    Now, when I enter characters from ISO Latin-1 from the keyboard, they are accepted without problems, but characters from outside Latin-1 are "translated" to an approximate character in Latin-1. Thus, for example, Latin-5 "ğ" gets converted to a plain "g". Cutting and pasting, however, work fine. Can anyone tell me what I'm doing wrong? I should like to make everything I do with Emacs UTF-8 with BOM.

  • Extract large zip file (50G) on Mac OS X

    - by chingjun
    I was trying to move my files to another hard drive, so I archived all my photos in one large zip file using the Mac OS X built-in compress function. But the file failed to extract. I've tried Mac OS X's extract utility, StuffIt Expander, and 7zip (command line); all failed. Mac's Archive Utility and StuffIt don't seem to support large files, and 7zip's command-line version gave an error stating unsupported archive. I had no luck on Windows either, as many of my files have Chinese filenames and could not be extracted to the correct names under Windows. Could anyone please suggest some programs that support large files, can handle files compressed using Mac OS X's compress function, and support UTF-8 filenames? With or without a GUI is fine. Thank you in advance.

  • Problem with unpacking zip archive with UTF-8 file names in OS X if zip was made in Windows

    - by Andrei
    I packed my files in Windows 7 using Total Commander, asking it to use UTF-8 for file names. Then I tried to unpack the files in OS X, but the Cyrillic names were messed up. I have tried most programs; none helped, so I had to use Parallels with Windows and Total Commander to get what I wanted. Is there any other way to do it? Is it a fault of Total Commander, or do I need to tune OS X settings?

  • UTF-8 bit representation

    - by Yanick Rochon
    I'm learning about the UTF-8 standard and this is what I have so far:

    Definition and bytes used

        UTF-8 binary representation            Meaning
        0xxxxxxx                               1 byte for 1 to 7 bit chars
        110xxxxx 10xxxxxx                      2 bytes for 8 to 11 bit chars
        1110xxxx 10xxxxxx 10xxxxxx             3 bytes for 12 to 16 bit chars
        11110xxx 10xxxxxx 10xxxxxx 10xxxxxx    4 bytes for 17 to 21 bit chars

    And I'm wondering: why doesn't the 2-byte UTF-8 code start with 10xxxxxx instead, thus gaining 1 bit all the way up to 22 bits with a 4-byte UTF-8 code? The way it is right now, 64 possible values are lost (from 10000000 to 10111111). I'm not trying to argue with the standard, but I'm wondering why this is so.

    ** EDIT ** Or even, why isn't it:

        UTF-8 binary representation            Meaning
        0xxxxxxx                               1 byte for 1 to 7 bit chars
        110xxxxx xxxxxxxx                      2 bytes for 8 to 13 bit chars
        1110xxxx xxxxxxxx xxxxxxxx             3 bytes for 14 to 20 bit chars
        11110xxx xxxxxxxx xxxxxxxx xxxxxxxx    4 bytes for 21 to 27 bit chars

    ...? Thanks!
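
    A commonly cited reason for the reserved 10xxxxxx pattern is self-synchronization: lead bytes and continuation bytes never overlap, so a reader dropped at an arbitrary byte offset can still find the start of the current character. The Python sketch below encodes a code point by hand following the table in the question and then uses the 10xxxxxx marker to locate a character boundary. The helper names `encode_utf8`, `is_continuation`, and `char_start` are made up for this illustration, and surrogate code points are not rejected here.

```python
def encode_utf8(cp: int) -> bytes:
    """Encode one code point by hand (no surrogate or range checks)."""
    if cp < 0x80:                       # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:                      # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:                    # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def is_continuation(byte: int) -> bool:
    """True for bytes of the form 10xxxxxx."""
    return byte & 0xC0 == 0x80


def char_start(data: bytes, pos: int) -> int:
    """Step back from any offset to the first byte of the character there."""
    while pos > 0 and is_continuation(data[pos]):
        pos -= 1
    return pos


data = b"a" + encode_utf8(0x20AC) + b"b"   # "a€b": the euro sign takes 3 bytes
print(char_start(data, 3))                 # offset 3 is inside the euro sign
```

    With the layout proposed in the edit (unmarked xxxxxxxx trailing bytes), `is_continuation` could not exist, because a trailing byte could look like a lead byte or like plain ASCII.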

  • Is there a MySQL performance benchmark to measure the impact of utf8_unicode_ci versus utf8_general_ci?

    - by MiniQuark
    I read here and there that using the utf8_unicode_ci collation ensures better treatment of Unicode text (for example, it knows how to expand characters such as 'œ' into 'oe' for searching and ordering) compared to the default utf8_general_ci, which basically just strips diacritics. Unfortunately, both sources indicate that utf8_unicode_ci is slightly slower than utf8_general_ci. So my question is: what does "slightly slower" mean? Has anyone run benchmarks? Are we talking about a -0.01% performance impact or rather something like -25%? Thanks for your help.

  • In UTF-8 collation, why is 11- less than 1-?

    - by ???
    I found that the sort result in ASCII is: 1- 11- and in UTF-8: 11- 1-. I find this counter-intuitive, and it's not dictionary order. Isn't the character '-' (002d) always less than [0-9] (0030-0039)? What's the general rule in UTF-8 collation? And how can I bypass it on Linux, making '-' less than [0-9] while keeping other characters unchanged for UTF-8, so that it affects the result of ls --sort, sort, etc.?
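
    The difference can be illustrated with a deliberately simplified model. Locale-aware collation on Linux typically gives punctuation very low weight in the first comparison pass, whereas byte or code point order compares '-' directly. The Python sketch below assumes hypothetical file names `1-a` and `11-a` and a made-up key function `ignore_punct_key`; it is not the real locale algorithm, only an approximation of its first pass.

```python
import string

names = ["11-a", "1-a"]   # hypothetical file names

# Code point order: '-' (U+002D) sorts before '1' (U+0031), so "1-a" comes first.
by_codepoint = sorted(names)


def ignore_punct_key(s: str):
    """Simplified model: compare without punctuation, raw string breaks ties."""
    stripped = "".join(ch for ch in s if ch not in string.punctuation)
    return (stripped, s)


# With '-' ignored, "11a" is compared with "1a", and '1' sorts before 'a'.
by_simplified_locale = sorted(names, key=ignore_punct_key)

print(by_codepoint)
print(by_simplified_locale)
```

    To get plain byte order from command-line tools, the usual approach is to run them under the C locale, for example by setting LC_ALL=C for that command.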

  • Detect CJK characters in PHP

    - by Jasie
    Hello, I've got an input box that allows UTF8 characters -- can I detect whether the characters are in Chinese, Japanese, or Korean programmatically (part of some Unicode range, perhaps)? I would change search methods depending on if MySQL's fulltext searching would work (it won't work for CJK characters). Thanks!
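
    Range checks on code points are one way to approach this. The sketch below is in Python rather than PHP and covers only a handful of common Unicode blocks, so treat the `SCRIPT_RANGES` table and the `scripts_in` helper as illustrative assumptions, not a complete classifier. Note that Han ideographs are shared by Chinese, Japanese, and Korean text, so ranges alone cannot always tell the three languages apart; kana suggests Japanese and Hangul suggests Korean.

```python
# A few common blocks only; rarer extension blocks are omitted.
SCRIPT_RANGES = {
    "han": [(0x3400, 0x4DBF), (0x4E00, 0x9FFF), (0xF900, 0xFAFF)],
    "kana": [(0x3040, 0x309F), (0x30A0, 0x30FF)],
    "hangul": [(0x1100, 0x11FF), (0xAC00, 0xD7AF)],
}


def scripts_in(text: str) -> set:
    """Return the set of CJK script labels that occur in text."""
    found = set()
    for ch in text:
        cp = ord(ch)
        for label, ranges in SCRIPT_RANGES.items():
            if any(lo <= cp <= hi for lo, hi in ranges):
                found.add(label)
    return found


def contains_cjk(text: str) -> bool:
    """True if at least one character falls in any of the ranges above."""
    return bool(scripts_in(text))


print(contains_cjk("hello"))
print(scripts_in("日本語のテキスト"))
```

    A search layer could then switch strategy whenever `contains_cjk` returns True; the same ranges should carry over to a Unicode-aware regular expression in other languages.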

  • Rails 2.3.5, Ruby 1.9, SQLite 3 incompatible character encodings: UTF-8 and ASCII-8BIT

    - by Daniil Harik
    Hello, I know that a question with the same title was asked almost 6 months ago. I have Googled this problem and have not found any working solution. Have there been any fixes for this very critical problem? I need to get my website running ASAP. Just to get the site up and running, I'm even ready to add UTF-8 conversion methods to all my variables or risk upgrading to the Rails 3 beta. Thank you in advance!

  • UTF-8 character encoding in Java

    - by user332523
    Hello, I am having some problems getting some French text to convert to UTF-8 so that it can be displayed properly, whether in a console, a text file, or a GUI element. The original string is HANDICAP+ES, which is supposed to be HANDICAPÉES. No matter how I try converting it, it appears the same way. Any ideas on how I can do this conversion? Thanks, Cam

  • How is this website fixing the encoding?

    - by Tal Galili
    Hi all, I am trying to turn this text: ×וויר. העתיד של רשתות חברתיות והתקשורת ×©×œ× ×• into this text: אוויר. העתיד של רשתות חברתיות והתקשורת שלנו. Somehow this website, http://www.pixiesoft.com/flip/, can do it, and I would like to know how I might do it myself (with whatever programming language or software). Just saving the file as UTF-8 won't do it. My motivation for this question is that I have a friend's exported XML file with the garbled text, which I want to turn into a corrected Hebrew text file. The XML export was originally garbled by MySQL imports and exports, but I don't have the information needed to fix it or trace back the problem. Thanks.
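
    Text of the form ×©×œ... is what UTF-8 encoded Hebrew tends to look like after being decoded with a single-byte Western charset. Assuming that is the only thing that went wrong (one wrong decoding step, with no bytes lost along the way), the damage can be reversed by re-encoding with the wrong charset and decoding as UTF-8. A minimal Python sketch of that round trip; the choice of Latin-1 is an assumption, and whether the website in the question works this way is not known:

```python
# Hebrew sample word (shin, lamed, nun, vav), written with escapes for clarity.
original = "\u05e9\u05dc\u05e0\u05d5"

# Simulate the damage: UTF-8 bytes wrongly interpreted as Latin-1.
garbled = original.encode("utf-8").decode("latin-1")

# Reverse it: recover the raw bytes, then decode them as UTF-8.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)
print(repaired == original)
```

    If the wrong step used Windows-1252 instead, some bytes have no character assigned and may already have been dropped, in which case those letters cannot be recovered this way.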

  • .Net using Chr() to parse text

    - by Marcx
    I'm building a simple client-server chat system. The clients send data to the server and the server resends the data to all the other clients. I'm using the TcpListener and NetworkStream classes to send the data between the client and the server. The fields I need to send are, for example: name, text, timestamp, etc. I separate them using ASCII character 29. I'm also using ASCII character 30 to mark the end of the streamed data. The data is encoded as UTF-8. Is this a good approach? Will I run into problems? Are there better methods?

    UPDATE: My question was probably misunderstood, so let me explain it better. Suppose you have a list of data to send from client to server, and suppose you send all the data in a single stream. How do you send the data?

    - Using markup
    - Using a character as a delimiter
    - Using a fixed length for every field
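
    The delimiter option can be sketched as follows. In UTF-8, every byte of a multi-byte sequence is 0x80 or higher, so the bytes 0x1D and 0x1E can only appear when a field literally contains those control characters. The Python sketch below (not .NET) uses made-up helper names `pack` and `unpack_stream`, and rejects fields containing the delimiters rather than escaping them, which is a design assumption.

```python
FIELD_SEP = b"\x1d"   # ASCII 29, separates fields
MSG_END = b"\x1e"     # ASCII 30, terminates one message


def pack(fields):
    """Join UTF-8 encoded fields into one delimited message."""
    encoded = []
    for field in fields:
        if "\x1d" in field or "\x1e" in field:
            raise ValueError("field contains a delimiter character")
        encoded.append(field.encode("utf-8"))
    return FIELD_SEP.join(encoded) + MSG_END


def unpack_stream(buffer: bytes):
    """Split received bytes into complete messages plus leftover bytes."""
    *complete, rest = buffer.split(MSG_END)
    messages = [[part.decode("utf-8") for part in msg.split(FIELD_SEP)]
                for msg in complete]
    return messages, rest


# One full message followed by the first two bytes of the next one.
received = pack(["Jos\u00e9", "h\u00e9llo", "12:00"]) + pack(["a", "b", "c"])[:2]
messages, leftover = unpack_stream(received)
print(messages, leftover)
```

    The leftover bytes matter because a TCP read can end in the middle of a message; they would be prepended to the next read. A length prefix per message is a common alternative that avoids restricting field contents.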

  • Can MySQL automatically specify `_utf8` for inserts to UTF-8 columns?

    - by Neil
    I have a table like this, where one column is latin1 and the other is UTF-8:

        Create Table: CREATE TABLE `names` (
          `name_english` varchar(255) character NOT NULL,
          `name_chinese` varchar(255) character set utf8 default NULL,
        ) ENGINE=MyISAM DEFAULT CHARSET=latin1

    When I do an insert, I have to type _utf8 before values being inserted into UTF-8 columns:

        insert into names (name_english = "hooey", name_chinese = _utf8 "??");

    However, since MySQL should know that name_chinese is a UTF-8 column, it should be able to use _utf8 automatically. Is there any way to tell MySQL to use _utf8 automatically, so that when I'm programmatically making prepared statements, I don't have to worry about including it with the right parameters?

  • rails, mysql charsets & encoding: binary

    - by Benjamin Vetter
    Hi, I have a Rails app that runs using UTF-8. It uses a MySQL database, with all tables using MySQL's default charset and collation (i.e. latin1). Therefore the latin1 tables contain UTF-8 data. Sure, that's not nice, but I'm not really concerned about it. Everything works fine, because the connection encoding is latin1 as well and therefore MySQL does not convert between charsets. Only one problem: I need a UTF-8 fulltext index for one table:

        mysql> show create table autocompletephrases;
        ... AUTO_INCREMENT=310095 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

    But I don't want to convert between charsets in my Rails app. Therefore I would like to know if I could just set this in config/database.yml:

        production:
          adapter: mysql
        >>>> encoding: binary
          ...

    which just calls SET NAMES 'binary' when connecting to MySQL. It looks like it works for my case, because I guess it forces MySQL to -not- convert between charsets (MySQL docs). Does anyone know of problems with doing this? Any side effects? Or do you have any other suggestions? I'd like to avoid converting my whole database to UTF-8. Many thanks! Benjamin

  • problem with mysql character set & GWT

    - by Ehsan Khodarahmi
    Hi, I have a SmartGWT application which interacts with a MySQL database using RPC services. Think of it as a simple form with a textbox and two buttons, save and load. The collation of my database, tables, and all fields is utf8_persian_ci. All Java source files and the module HTML and XML files are saved with the UTF-8 character set, and I also have a meta tag in the module HTML file which contains my form: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />. My application works correctly in Eclipse development mode and also on my local Tomcat server. Then I put it on a remote server (I compress it into a WAR file using jar.exe with the -cvf flag and then upload it using my server's Plesk control panel). In this mode, when I load a record from any MySQL table, the data loads into my form with no problem, but when I save some data (in Persian), MySQL just writes ? (question marks) into the character fields of the table. Any idea?

  • How can I handle UTF-8 while posting to a vBulletin board with WWW::Mechanize?

    - by MrMirror
    I have a problem with automated posting to a bulletin board. If I send the posting form to the vBulletin board, I get corrupted entities. Feel free to copy-paste the script and try it. It looks like the board is expecting some decoded UTF-8, but if I send the message decoded the entities are still wrong.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use WWW::Mechanize;
        use Digest::MD5 qw(md5_hex);

        my $mech = WWW::Mechanize->new();
        my $base_url = 'http://www.boerse.bz/';
        my $username = 'MrMirror';
        my $password = 'test';

        $mech->get($base_url);
        print "Login\n";
        $mech->form_number(1);
        $mech->field('vb_login_username' => $username);
        $mech->field('vb_login_password' => $password);
        $mech->field('vb_login_md5password' => md5_hex($password));
        $mech->field('vb_login_md5password_utf' => md5_hex($password));
        $mech->submit();

        unless ($mech->content() =~ m!Weiterleitung!gi) {
            print "No Rediction!\n";
            exit;
        }
        print "Redict\n";

        $mech->get($base_url);
        unless ($mech->content() =~ m!Logout!gi) {
            print "Login Failed!\n";
            exit;
        }

        $mech->get($base_url .'/newthread.php?do=newthread&f=173');
        $mech->form_number(3);
        $mech->field('subject' => 'MrMirror makes some testing ä ö ü ß');
        $mech->field('message' => "ä ö ü ß");

        ### everything allright here
        $mech->dump_forms();

        ### preview submit, don't wanna spam around ;)
        $mech->click('preview');
        print "\n\n\n---------------------------------------------------------------------\n\n\n";

        ### same form, wrong entities :(
        $mech->dump_forms();

  • Why do I get an extra newline in the middle of a UTF-8 character with XML::Parser?

    - by René Nyffenegger
    I encountered a problem dealing with UTF-8, XML and Perl. The following is the smallest piece of code and data needed to reproduce the problem. Here's an XML file that needs to be parsed:

        <?xml version="1.0" encoding="utf-8"?>
        <test>
          <words>???????????? ??????? ????????? ???? ???????????? ??????</words>
          <words>???????????? ??????? ????????? ???? ???????????? ??????</words>
          <words>???????????? ??????? ????????? ???? ???????????? ??????</words>
          [<words> .... </words> 148 times repeated]
          <words>???????????? ??????? ????????? ???? ???????????? ??????</words>
          <words>???????????? ??????? ????????? ???? ???????????? ??????</words>
        </test>

    The parsing is done with this Perl script:

        use warnings;
        use strict;
        use XML::Parser;
        use Data::Dump;

        my $in_words = 0;
        my $xml_parser = new XML::Parser(Style => 'Stream');
        $xml_parser->setHandlers(
            Start   => \&start_element,
            End     => \&end_element,
            Char    => \&character_data,
            Default => \&default);

        open OUT, '>out.txt';
        binmode(OUT, ":utf8");
        open XML, 'xml_test.xml' or die;
        $xml_parser->parse(*XML);
        close XML;
        close OUT;

        sub start_element {
            my ($parseinst, $element, %attributes) = @_;
            if ($element eq 'words') { $in_words = 1; }
            else                     { $in_words = 0; }
        }

        sub end_element {
            my ($parseinst, $element, %attributes) = @_;
            if ($element eq 'words') { $in_words = 0; }
        }

        sub default {
            # nothing to see here;
        }

        sub character_data {
            my ($parseinst, $data) = @_;
            if ($in_words) {
                print OUT "$data\n";
            }
        }

    When the script is run, it produces the out.txt file. The problem is in this file on line 147. The 22nd character (which in UTF-8 consists of \xd6 \xb8) is split between the d6 and the b8 by a newline. This should not happen. Now, I am interested in whether someone else has this problem or can reproduce it, and in why I am getting it.

    I am running this script on Windows:

        C:\temp>perl -v
        This is perl, v5.10.0 built for MSWin32-x86-multi-thread
        (with 5 registered patches, see perl -V for more detail)
        Copyright 1987-2007, Larry Wall
        Binary build 1003 [285500] provided by ActiveState http://www.ActiveState.com
        Built May 13 2008 16:52:49

  • phpMyAdmin shows numbers or blob for MySQL's utf8_bin collation columns?

    - by marc40000
    Hi! I have a table with a varchar column. Its collation is set to utf8_bin. My software using this table and column works perfectly. But when I look at the content in phpMyAdmin, I only see some hex values or [Blob xB]. Can I make phpMyAdmin show the content correctly? When I set the collation to utf8_general_ci or utf8_unicode_ci instead, phpMyAdmin shows the content correctly. Thx, Marc

    [edit] Hah, I found out there is a small "+Options" link above every table in phpMyAdmin. It opens several options, including "Show BLOB contents", which turns the [Blob] into readable text when enabled, and "Show binary contents as HEX", which shows the hex codes as text when disabled. No idea why there are two options, though, or why sometimes there is a [Blob] and sometimes hex values. Now I'm still wondering: these option settings get lost when I go to another table, and I have to set them every time I go there. Is there a way to save those options? [/edit]

  • .NET Weird character encoding issue

    - by born to hula
    Our globalization mechanism stores error messages in a SQL 2005 DB. Some of the error messages are used as subjects of email messages sent to the development team. Recently, for no clear reason, we started receiving emails with strangely encoded subjects, such as: =?utf-8?B?Qm1mQm92ZXNwYS5Qb3NUcmFkaW5nRXNwZWNpZmljYWNhbyAtIFN1Y2Vzc28gbm8gcmVwcm 9jZXNzYW1lbnRvLiBEYXRhIFByZWfDo28gPSAzMS8wMy8yMDEwIDAwOjAwOjAwIC0gTsO6bWVyby BkbyBFdmVudG8gZGUgTmVnw7NjaW8gPSAxMDAyIC0gQ8OzZGlnbyBOYXR1cmV6YSBkYSBPcGVyY cOnw6NvID0gQyAtIFNlcn... We have no clue why this is happening, nor which encoding scheme is being used here (maybe UTF-8?). I'd really appreciate some help.
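
    The `=?utf-8?B?...?=` shape matches an RFC 2047 encoded-word: the charset name, then `B` for Base64, then the encoded bytes. Mail headers use it to carry non-ASCII text, and mail clients normally decode it for display. The Python sketch below (not .NET) builds such a header from a made-up sample subject and decodes it again; the helper name `decode_subject` is hypothetical.

```python
import base64
from email.header import decode_header


def decode_subject(raw: str) -> str:
    """Decode a header value that may contain RFC 2047 encoded-words."""
    parts = []
    for value, charset in decode_header(raw):
        if isinstance(value, bytes):
            parts.append(value.decode(charset or "ascii"))
        else:
            parts.append(value)
    return "".join(parts)


# Made-up subject containing non-ASCII characters ("Operação concluída").
sample = "Opera\u00e7\u00e3o conclu\u00edda"

# Build the encoded-word by hand: =?charset?B?base64-of-the-bytes?=
encoded = ("=?utf-8?B?"
           + base64.b64encode(sample.encode("utf-8")).decode("ascii")
           + "?=")

print(encoded)
print(decode_subject(encoded))
```

    If the subjects show up undecoded, one thing to check is whether a message recently gained a non-ASCII character, since pure-ASCII subjects usually do not need this encoding at all.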

  • cannot output a json encoded dict containing accents (noob inside)

    - by user296546
    Hi all, here is a fairly simple example which has been driving me nuts for a couple of days. Consider the following script:

        # -*- coding: utf-8 -*-
        from json import dumps as json_dumps

        machaine = u"une personne émérite"
        print(machaine)

        output = {}
        output[1] = machaine
        jsonoutput = json_dumps(output)
        print(jsonoutput)

    The result of this from the CLI:

        une personne émérite
        {"1": "une personne \u00e9m\u00e9rite"}

    I don't understand why there is such a difference between the two strings. I have been trying all sorts of encode, decode, etc., but I can't seem to find the right way to do it. Does anybody have an idea? Thanks in advance. Matthieu
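
    The `\u00e9` escapes are valid JSON and decode back to the same string; `json.dumps` escapes non-ASCII characters by default. Passing `ensure_ascii=False` keeps the characters as they are. A short sketch, written for Python 3 (the script in the question looks like Python 2, where the output would additionally need encoding before printing):

```python
import json

machaine = "une personne \u00e9m\u00e9rite"   # "une personne émérite"
output = {1: machaine}

escaped = json.dumps(output)                       # default: ASCII-only output
readable = json.dumps(output, ensure_ascii=False)  # keeps é as a real character

print(escaped)
print(readable)

# Both forms carry the same data once parsed.
same_data = json.loads(escaped) == json.loads(readable)
print(same_data)
```

    Either form is fine for a JSON consumer; the unescaped one is mainly easier for humans to read.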
