Detect if PCRE was built without the --enable-unicode-properties or --enable-utf8 configuration switches

Posted by Mark Baker on Stack Overflow
Published on 2010-12-22T13:21:09Z Indexed on 2010/12/22 13:54 UTC

I have a PHP library that uses a number of regular expressions featuring the \P and \p Unicode property escapes for multibyte strings, e.g.

((((?:\P{M}\p{M}*)+?)|(\'[^\']*\')|(\"[^\"]*\"))!)?\$?([a-z]{1,3})\$?(\d+)

While this works on most builds, I've had a few reports of the regexp returning an error.

Depending on the platform, the error message from PCRE is either:

Compilation failed: PCRE does not support \L, \l, \N, \P, \p, \U, \u, or \X at offset n

or

Compilation failed: support for \P, \p, and \X has not been compiled at offset n

I know that I can probably test a regexp that uses \P at the beginning of my code, trap any returned error, and use the result to set a compatibility flag. The main body of my code could then fall back to a degraded (non-UTF-8) regexp without \P whenever that flag is set. However, I was wondering whether there is a simpler way to identify whether PCRE has been built without the --enable-unicode-properties or --enable-utf8 configuration switches. PHP provides access to the PCRE_VERSION constant, but that doesn't indicate whether \P support is enabled.
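
For reference, a minimal sketch of the probe-and-flag approach described above. The function names are hypothetical, and it assumes that preg_match() returns false (and raises a warning) when a pattern fails to compile, which is its documented behaviour:

```php
<?php
// Hypothetical helper: does this PCRE build accept the /u (UTF-8) modifier?
function pcreSupportsUtf8()
{
    static $supported = null;
    if ($supported === null) {
        // '@' hides the compilation warning; a failed compile returns false.
        $supported = (@preg_match('/./u', 'a') === 1);
    }
    return $supported;
}

// Hypothetical helper: does this PCRE build accept \p / \P property escapes?
function pcreSupportsUnicodeProperties()
{
    static $supported = null;
    if ($supported === null) {
        // \pL matches any letter; without property support the compile fails.
        $supported = (@preg_match('/\pL/u', 'a') === 1);
    }
    return $supported;
}

// Compatibility flag to consult before choosing a pattern variant.
$useUnicodeProperties = pcreSupportsUtf8() && pcreSupportsUnicodeProperties();
```

Each probe compiles its test pattern only once per request because the result is cached in a static variable, so the check can be called wherever a pattern is selected.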

© Stack Overflow or respective owner
