Search Results

Search found 1649 results on 66 pages for 'unicode normalization'.

Page 12/66 | < Previous Page | 8 9 10 11 12 13 14 15 16 17 18 19  | Next Page >

  • SQLBits - Unicode Porn

    - by Most Valuable Yak (Rob Volk)
    We've just finished up a fantastic event at SQLBits X in London!  If you've never been to SQLBits and you can make it to the UK, I highly recommend it.  If you didn't attend, here's what you missed. Meanwhile, for those who attended the Lightning Talk sessions and were disappointed that I ran out of time, here's the last part that you would have seen: /*    How to Lose Friends and Irritate People...With Unicode!     Rob Volk     SQLBits X - London - March 31, 2012 */ -- some sexy SQL DECLARE @oohbaby TABLE(i INT NOT NULL UNIQUE, uni_char AS NCHAR(i), hex AS CAST(i AS BINARY(2))) INSERT @oohbaby VALUES(664),(1022),(1023),(1120),(1150),(8857),(11609),(42420),(42427) -- change results font to larger size, some only work in grid font SELECT * FROM @oohbaby SELECT NCHAR(1022) + NCHAR(1023) AS Page3Girl It's probably better that you run this yourself, in the privacy of your own home/office, you know *wink* *wink* *nudge* *nudge* *say no more*

    Read the article

  • Best way to convert a Unicode URL to ASCII (UTF-8 percent-escaped) in Python?

    - by benhoyt
    I'm wondering what's the best way -- or if there's a simple way with the standard library -- to convert a URL with Unicode chars in the domain name and path to the equivalent ASCII URL, encoded with domain as IDNA and the path %-encoded, as per RFC 3986. I get from the user a URL in UTF-8. So if they've typed in http://?.ws/? I get 'http://\xe2\x9e\xa1.ws/\xe2\x99\xa5' in Python. And what I want out is the ASCII version: 'http://xn--hgi.ws/%E2%99%A5'. What I do at the moment is split the URL up into parts via a regex, and then manually IDNA-encode the domain, and separately encode the path and query string with different urllib.quote() calls. # url is UTF-8 here, eg: url = u'http://?.ws/?'.encode('utf-8') match = re.match(r'([a-z]{3,5})://(.+\.[a-z0-9]{1,6})' r'(:\d{1,5})?(/.*?)(\?.*)?$', url, flags=re.I) if not match: raise BadURLException(url) protocol, domain, port, path, query = match.groups() try: domain = unicode(domain, 'utf-8') except UnicodeDecodeError: return '' # bad UTF-8 chars in domain domain = domain.encode('idna') if port is None: port = '' path = urllib.quote(path) if query is None: query = '' else: query = urllib.quote(query, safe='=&?/') url = protocol + '://' + domain + port + path + query # url is ASCII here, eg: url = 'http://xn--hgi.ws/%E3%89%8C' Is this correct? Any better suggestions? Is there a simple standard-library function to do this?

    Read the article

  • Fast, Unicode-capable, cross-platform programmer's text editor that shows invisibles like ZWSP?

    - by Roger_S
    Our publishing workflow includes Windows and Linux machines (there are some Macs too, but not in the critical-path workflow). Many texts include both English and Khmer and are marked-up in XML. XML Copy Editor is the best cross-platform open-source XML editor I've discovered. It utilizes the Scintilla editing component, which is generally good with Unicode but which does not enable non-printing or invisible characters like U+200B (zero-width space) and U+200C (zero-width non-joiner) to be displayed. Khmer does not separate words with a space character as Western languages do, so ZWSP is used in electronic texts to enable applications to break lines easily. Ideally I'd edit the markup and the content in a single editor, but XML awareness is less important at times than being able to display invisibles. (OpenOffice.org Writer and Microsoft Word are the only two apps I know that will display ZWSP. They are not suitable for the markup and text manipulations that need to be done to prepare manuscripts for publication, unfortunately, although I guess they're fine for authoring.) I tried out a promising editor last week, but a search-and-replace regex operation that took under a second in TextPad 4.7.3 lasted over twenty seconds. So I want to mention that speed and the ability to handle large (up to 150mb) files is also a concern. Is there a good, fast, free or not too expensive text editor, with versions on Windows and Linux and maybe mac too, Unicode-aware and capable of displaying invisibles like ZWSP? That has syntax highlighting, can handle large files and is customizable enough that I won't tear my hair out in frustration? Thanks, Roger_S

    Read the article

  • If I use Unicode on a ISO-8859-1 site, how will that be interpreted by a browser?

    - by grg-n-sox
    So I got a site that uses ISO-8859-1 encoding and I can't change that. I want to be sure that the content I enter into the web app on the site gets parsed correctly. The parser works on a character by character basis. I also cannot change the parser, I am just writing files for it to handle. The content in my file I am telling the app to display after parsing contains Unicode characters (or at least I assume so, even if they were produced by Windows Alt Codes mapped to CP437). Using entities is not an option due to the character by character operation of the parser. The only characters that the parser escapes upon output are markup sensitive ones like ampersand, less than, and greater than symbols. I would just go ahead and put this through to see what it looks like, but output can only be seen on a publishing, which has to spend a couple days getting approved and such, and that would be asking too much for just a test case. So, long story short, if I told a site to output ?ÇÑ¥?? on a site with a meta tag stating it is supposed to use ISO-8859-1, will a browser auto-detect the Unicode and display it or will it literally translate it as ISO-8859-1 and get a different set of characters?

    Read the article

  • Which Perl moudle can handle variety of date formats with unicode characters ?

    - by ram
    My requirement is parsing xml files which contains wide varieties of timestamps based on the locales at which they are written. They may contain Unicode characters in case of Chinese or Korean locales. I have to parse these timestamps and put then in a standard format something like 2009-11-26 12:40:54 to put them in a oracle database. Sometimes I may not even know the locale and yet I have to parse the timestamps. I am looking for a module that automatically detects the timestamp format (including unicode characters for am and pm in their local language) and converts in to epoch time so that I can convert it back to what ever way I like to. I have gone through similar questions in this forum. Few suggested DateFormat module, and Date::Parse module. The perl distribution I am using is 5.10 so Date::Manip doesn't come as a core module. As I am supposed to use just the basic core modules and few CPAN modules(on request I cannot ask for all), I request you to kindly suggest me a good module that suffices all my requirements. Thanks in advance

    Read the article

  • Sun Fire X4270 M3 SAP Enhancement Package 4 for SAP ERP 6.0 (Unicode) Two-Tier Standard Sales and Distribution (SD) Benchmark

    - by Brian
    Oracle's Sun Fire X4270 M3 server achieved 8,320 SAP SD Benchmark users running SAP enhancement package 4 for SAP ERP 6.0 with unicode software using Oracle Database 11g and Oracle Solaris 10. The Sun Fire X4270 M3 server using Oracle Database 11g and Oracle Solaris 10 beat both IBM Flex System x240 and IBM System x3650 M4 server running DB2 9.7 and Windows Server 2008 R2 Enterprise Edition. The Sun Fire X4270 M3 server running Oracle Database 11g and Oracle Solaris 10 beat the HP ProLiant BL460c Gen8 server using SQL Server 2008 and Windows Server 2008 R2 Enterprise Edition by 6%. The Sun Fire X4270 M3 server using Oracle Database 11g and Oracle Solaris 10 beat Cisco UCS C240 M3 server running SQL Server 2008 and Windows Server 2008 R2 Datacenter Edition by 9%. The Sun Fire X4270 M3 server running Oracle Database 11g and Oracle Solaris 10 beat the Fujitsu PRIMERGY RX300 S7 server using SQL Server 2008 and Windows Server 2008 R2 Enterprise Edition by 10%. Performance Landscape SAP-SD 2-Tier Performance Table (in decreasing performance order). SAP ERP 6.0 Enhancement Pack 4 (Unicode) Results (benchmark version from January 2009 to April 2012) System OS Database Users SAPERP/ECCRelease SAPS SAPS/Proc Date Sun Fire X4270 M3 2xIntel Xeon E5-2690 @2.90GHz 128 GB Oracle Solaris 10 Oracle Database 11g 8,320 20096.0 EP4(Unicode) 45,570 22,785 10-Apr-12 IBM Flex System x240 2xIntel Xeon E5-2690 @2.90GHz 128 GB Windows Server 2008 R2 EE DB2 9.7 7,960 20096.0 EP4(Unicode) 43,520 21,760 11-Apr-12 HP ProLiant BL460c Gen8 2xIntel Xeon E5-2690 @2.90GHz 128 GB Windows Server 2008 R2 EE SQL Server 2008 7,865 20096.0 EP4(Unicode) 42,920 21,460 29-Mar-12 IBM System x3650 M4 2xIntel Xeon E5-2690 @2.90GHz 128 GB Windows Server 2008 R2 EE DB2 9.7 7,855 20096.0 EP4(Unicode) 42,880 21,440 06-Mar-12 Cisco UCS C240 M3 2xIntel Xeon E5-2690 @2.90GHz 128 GB Windows Server 2008 R2 DE SQL Server 2008 7,635 20096.0 EP4(Unicode) 41,800 20,900 06-Mar-12 Fujitsu PRIMERGY RX300 S7 2xIntel Xeon E5-2690 @2.90GHz 128 GB Windows Server 2008 R2 EE SQL Server 2008 7,570 20096.0 EP4(Unicode) 41,320 20,660 06-Mar-12 Complete benchmark results may be found at the SAP benchmark website http://www.sap.com/benchmark. Configuration and Results Summary Hardware Configuration: Sun Fire X4270 M3 2 x 2.90 GHz Intel Xeon E5-2690 processors 128 GB memory Sun StorageTek 6540 with 4 * 16 * 300GB 15Krpm 4Gb FC-AL Software Configuration: Oracle Solaris 10 Oracle Database 11g SAP enhancement package 4 for SAP ERP 6.0 (Unicode) Certified Results (published by SAP): Number of benchmark users: 8,320 Average dialog response time: 0.95 seconds Throughput: Fully processed order line: 911,330 Dialog steps/hour: 2,734,000 SAPS: 45,570 SAP Certification: 2012014 Benchmark Description The SAP Standard Application SD (Sales and Distribution) Benchmark is a two-tier ERP business test that is indicative of full business workloads of complete order processing and invoice processing, and demonstrates the ability to run both the application and database software on a single system. The SAP Standard Application SD Benchmark represents the critical tasks performed in real-world ERP business environments. SAP is one of the premier world-wide ERP application providers, and maintains a suite of benchmark tests to demonstrate the performance of competitive systems on the various SAP products. See Also SAP Benchmark Website Sun Fire X4270 M3 Server oracle.com OTN Oracle Solaris oracle.com OTN Oracle Database 11g Release 2 Enterprise Edition oracle.com OTN Disclosure Statement Two-tier SAP Sales and Distribution (SD) standard SAP SD benchmark based on SAP enhancement package 4 for SAP ERP 6.0 (Unicode) application benchmark as of 04/11/12: Sun Fire X4270 M3 (2 processors, 16 cores, 32 threads) 8,320 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, Oracle 11g, Solaris 10, Cert# 2012014. IBM Flex System x240 (2 processors, 16 cores, 32 threads) 7,960 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, DB2 9.7, Windows Server 2008 R2 EE, Cert# 2012016. IBM System x3650 M4 (2 processors, 16 cores, 32 threads) 7,855 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, DB2 9.7, Windows Server 2008 R2 EE, Cert# 2012010. Cisco UCS C240 M3 (2 processors, 16 cores, 32 threads) 7,635 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 DE, Cert# 2012011. Fujitsu PRIMERGY RX300 S7 (2 processors, 16 cores, 32 threads) 7,570 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 EE, Cert# 2012008. HP ProLiant DL380p Gen8 (2 processors, 16 cores, 32 threads) 7,865 SAP SD Users, 2 x 2.90 GHz Intel Xeon E5-2690, 128 GB memory, SQL Server 2008, Windows Server 2008 R2 EE, Cert# 2012012. SAP, R/3, reg TM of SAP AG in Germany and other countries. More info www.sap.com/benchmark

    Read the article

  • Efficient Trie implementation for unicode strings

    - by U Mad
    I have been looking for an efficient String trie implementation. Mostly I have found code like this: Referential implementation in Java (per wikipedia) I dislike these implementations for mostly two reasons: They support only 256 ASCII characters. I need to cover things like cyrillic. They are extremely memory inefficient. Each node contains an array of 256 references, which is 4096 bytes on a 64 bit machine in Java. Each of these nodes can have up to 256 subnodes with 4096 bytes of references each. So a full Trie for every ASCII 2 character string would require a bit over 1MB. Three character strings? 256MB just for arrays in nodes. And so on. Of course I don't intend to have all of 16 million three character strings in my Trie, so a lot of space is just wasted. Most of these arrays are just null references as their capacity far exceeds the actual number of inserted keys. And if I add unicode, the arrays get even larger (char has 64k values instead of 256 in Java). Is there any hope of making an efficient trie for strings? I have considered a couple of improvements over these types of implementations: Instead of using array of references, I could use an array of primitive integer type, which indexes into an array of references to nodes whose size is close to the number of actual nodes. I could break strings into 4 bit parts which would allow for node arrays of size 16 at the cost of a deeper tree.

    Read the article

  • Is there a Pac-Man-like character in ASCII or Unicode?

    - by Ricket
    Simple question: is there a character that looks either like Pac-Man, or like the ghost in Pac-Man? With Google's recent Pac-Man logo, everyone should know what these look like, but in case you don't here are some sample images: If you answer "no" please provide a little more proof that you actually searched all unicode characters...

    Read the article

  • How do I convert from unicode to single byte in C#?

    - by xarzu
    How do I convert from unicode to single byte in C#? This does not work: int level =1; string argument; // and then argument is assigned if (argument[2] == Convert.ToChar(level)) { // does not work } And this: char test1 = argument[2]; char test2 = Convert.ToChar(level); produces funky results. test1 can be: 49 '1' while test2 will be 1 ''

    Read the article

  • Which of the following Unicode characters should be used in HTML?

    - by George Edison
    I am aware that any Unicode character can be inserted into an HTML document via the following format: &#x0000; ...where 0000 is the character code of the desired character My question is: which of these characters has the most widespread availability when it comes to the client's browser being able to display the character? In other words, what are the ranges of codes that should be used in an HTML document that is going to be widely deployed?

    Read the article

  • Ignore non-unicode programs language when installing software

    - by mitya
    This is something that is driving me nuts for a while and I haven't been able to find a solution for this problem anywhere. I am running Windows 7 and my "Language for non-Unicode programs" setting is set to Russian. I need for some non-unicode software that has a Russian UI. However, for most of my software I prefer to use the English UI. A lot of software out there is multilingual and is too smart for my liking. When installing, it switches the UI to Russian and the software UI stays in Russian after the installation without an option to change that, besides setting the "non-unicode language" to English. It switches back to Russian once I revert the setting and reboot. Most of the time it is driver software, i.e: Intel, HP, etc. How can force the installation to run English and stay that way after install, ignoring the "Language for non-Unicode programs" setting? Now, I understand this might be specific to the installer: MSI, Install Shield, etc. But any solution will be good, even if I have to apply it for every software installation. Thanks in advance for any helpful information!

    Read the article

  • Qt/C++ regular expression library with unicode property support

    - by Dave
    I'm converting an application from the .Net framework to Qt using C++. The application makes extensive use of regular expression unicode properties, i.e. \p{L}, \p{M}, etc. I've just discovered that the QRegExp class lacks support for this among other things (lookbehinds, etc.) Can anyone recommend a C++ regular expression library that: Supports unicode properties Is unicode-aware in other respects (i.e. \w matches more than ASCII word characters) As a bonus, supports lookbehinds. Please don't point me to the wikipedia article; I don't trust it. That article says that QRegExp supports unicode properties. Unless I'm really doing something wrong, it doesn't. I'm looking for someone actually using unicode properties with a regex library in a project.

    Read the article

  • DRM Tallyrand - The New User Interface

    - by russ.bishop
    I received word recently that the Tallyrand (11.1.2.0) build is out of our hands. I'm not sure when it will hit eDelivery, but if it hasn't already it should happen soon. For this post, I want to really quickly show the new user interface. The login screen: When you login, you are browsing versions and hierarchies. Note that Unicode is fully supported: The UI attempts to provide context-sensitive links where possible; notice here that an unloaded version is selected, so the UI shows a link. Clicking the link automatically brings up this Load Version dialog. This same thing applies elsewhere in the UI when you attempt to perform an action with an unloaded version: Here is browsing a hierarchy, with the property grid and context menu displayed (though you can hide the property grid anytime you like to provide more room): Worried about drag and drop? Don't! We support it even though this is a browser app. Also notice the Relationships feature on the right displaying a node's ancestors: Where possible, we try to present the available options, rather than just throwing up an "OK/Cancel" dialog (which most users never read anyway): Context-sensitive shortcuts automatically fill-in the context based on the currently selected node. For example, if you want to run a query using the selected node as the root, you can just click that query in the Shortcuts tab. In this screenshot, clicking Model After would model the selected node: This is just for starters. There is much more to cover, on both the client and server. For example, all communication channels are now configurable (no more DCOM). You can pick the ports, the encoding (binary or XML), and the transport mechanism (TCP, TCP over SSL, or SOAP over HTTP). All the relevant WS-* standards are also supported, eg: WS-Security, etc. Plus new features (besides the web client and unicode support). I hope to cover as much of these things as I can in the coming months. If you have specific requests, comment on this post and I'll try to cover them.

    Read the article

  • what is optimum length for html title tag in Unicode format?

    - by user1501256
    I have a website that generates its title tag dynamically. the title tag is in unicode format. the title tag is limited to 65 character but sometimes Google doesn't show title tag completely in SERP. I'd like to know what is the optimum length of title tag in terms of seo for unicode titles, and is there any difference between Unicode title and non-Unicode title tag? And what about other search engines Bing, Yahoo and so on.

    Read the article

  • Good bitmap fonts with big sizes and unicode support

    - by bitonic
    I really like bitmap fonts for programming/terminal. As far as I know there are two bitmap fonts with good unicode support: Unifont Fixed The problem is that I have a really high resolution screen, and they're both too small. Fixed does include a large size (10x20) but it looks really bad (it's basically always bold, and bold is a different face). Are there any other bitmap fonts with unicode support and large sizes? Terminus is the only font with a decent size but it doesn't have good unicode support. Having good coverage for mathematical symbols would be enough, since that's what I need.

    Read the article

  • How can I copy files with names containing spaces and UNICODE, when using a shell script?

    - by LOlliffe
    I have a list of files that I'm trying to copy and move (using cp and mv) in a bash shell script. The problem that I'm running into, is that I can't get either command to recognize a huge number of files, seemingly because the filenames contain spaces and/or unicode characters. I couldn't find any switches to decode/re-encode these characters. Instead, for example, if I copy "file name.xml", I get "*.xml" and a script error that the file wasn't found for my result. Does anyone know settings or commands that will deal with these files?

    Read the article

  • how to remove a few lines from a Unicode registry file using batch commands in Windows?

    - by Cosmin
    Hi. I have a program who's generating some data in registry. I save it with "reg export HKCU\Software\ProgramName\Data data.reg" (Unicode format). I need to take it to other computer and import it there so the program from that computer could use the data. But I have to remove some text lines from data.reg. The text lines are easy to find because they contain some strings. Now I'm doing this manually (using Wordpad) every few days but maybe there is another way... Oh and I can't install other programs on these computers (the access is restricted) so I have to use batch/cmd files. What I tried so far: - redirecting the export to "con" but is visual only not in a variable; - using "for /F ..." but this works only with ANSI and removes blank lines. Can somebody please help me...? Thank you.

    Read the article

< Previous Page | 8 9 10 11 12 13 14 15 16 17 18 19  | Next Page >