request feature: Han character differentiation

Posted: 2009-07-08 16:35:27
by blindcorpse
If there were some way that Find AnyScript could distinguish between Chinese (traditional), Chinese (simplified), and Korean and Japanese non-syllabary characters, that would be truly wonderful! I frequently make use of the Find AnyHan capability, but since I routinely use both Chinese and Japanese in my documents, I have to go through afterward and fix the "minority" language.

I realize this may be technologically difficult, or unfeasible. But it would be helpful! Sigh!

Re: request feature: Han character differentiation

Posted: 2009-07-09 06:52:55
by Kino
If you have applied language attributes properly to the text portions in those languages and/or if you are using different fonts for the different languages, e.g. Hiragino for Japanese and LiSong Pro for Traditional Chinese, you can use Attribute Sensitive find to find and select text portions in a given language or in a given font.

If not, that is theoretically impossible. “字” is “字” and occupies the same code point in all CJK languages.

However, if you don't require a perfect solution, the following PowerFind Pro expression may help.

(?<jp>\p{Hiragana}|\p{Katakana}|\x{30FC}){0}(?<=\g<jp>)\p{Han}+|\p{Han}+(?=\g<jp>)
It matches one or more Han characters (kanji) preceded or followed by hiragana or katakana. Or this one.

[「『(]*(\p{Han}*[\p{Hiragana}\p{Katakana}\x{30FC}]\p{Han}*)+[」』)。、〜,.!?]*
It finds a Japanese clause which contains at least one hiragana or katakana character and may have some adjacent kanji characters.
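
For anyone who wants to try the same heuristic outside Nisus Writer, here is a rough Python sketch of both expressions. It is not PowerFind Pro syntax: Python's standard re module has no \p{Han}-style script properties, so the character ranges below are hand-picked approximations (an assumption, and narrower than the real Unicode scripts), and the names HAN, KANA, han_next_to_kana and japanese_clauses are made up for this illustration.

```python
import re

# Approximate ranges standing in for \p{Han}, \p{Hiragana} and \p{Katakana}.
# These are assumptions for the sketch, not full Unicode script coverage.
HAN = r"\u3400-\u4DBF\u4E00-\u9FFF"   # CJK Unified Ideographs + Extension A
KANA = r"\u3040-\u309F\u30A0-\u30FF"  # Hiragana + Katakana blocks (covers U+30FC)

# First expression: a run of Han characters directly preceded
# or directly followed by a kana character.
HAN_NEXT_TO_KANA = re.compile(
    rf"(?<=[{KANA}])[{HAN}]+|[{HAN}]+(?=[{KANA}])"
)

# Second expression: a clause containing at least one kana character,
# together with adjacent Han characters and optional brackets/punctuation.
JAPANESE_CLAUSE = re.compile(
    rf"[「『(]*(?:[{HAN}]*[{KANA}][{HAN}]*)+[」』)。、〜,.!?]*"
)


def han_next_to_kana(text):
    """Return Han runs that touch a kana character on either side."""
    return [m.group(0) for m in HAN_NEXT_TO_KANA.finditer(text)]


def japanese_clauses(text):
    """Return clauses that contain at least one kana character."""
    return [m.group(0) for m in JAPANESE_CLAUSE.finditer(text)]


if __name__ == "__main__":
    sample = "これは日本語の文章です。这是中文句子。"
    print(han_next_to_kana(sample))        # ['日本語', '文章']
    print(japanese_clauses(sample))        # ['これは日本語の文章です。']
    # Kanji-only text with no kana nearby is missed by both patterns.
    print(han_next_to_kana("東京、大阪"))   # []
```

The last line shows the limitation: the heuristic only sees kanji that sit next to kana, so a kanji-only Japanese word is left unselected, and a Chinese word placed directly against kana would be picked up by mistake.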

Re: request feature: Han character differentiation

Posted: 2009-07-09 11:44:31
by martin
Kino wrote: If you have applied language attributes properly to the text portions in those languages and/or if you are using different fonts for the different languages, e.g. Hiragino for Japanese and LiSong Pro for Traditional Chinese, you can use Attribute Sensitive find to find and select text portions in a given language or in a given font.
Or use Kino's excellent Select by Language or Select by Font macros to automate this for you.

Re: request feature: Han character differentiation

Posted: 2009-07-09 14:22:03
by blindcorpse
Thanks, martin and Kino -- those suggestions were very helpful! I'm surprised at myself that I never figured out how to use the "Attribute Sensitive" feature of search, especially since I've knocked heads against this problem for a long time. And, I've installed Kino's macros, which are great. In addition to selecting text by font or language attribute, they let me see what fonts are in the file, even the missing ones. Excellent!