Posted to java-user@lucene.apache.org by Samuel Tang <sa...@yahoo.com.hk> on 2004/04/28 17:39:30 UTC

[Lucene] XML Indexing

XMLIndexingDemo does not seem to be able to index Traditional Chinese characters. My XML documents contain both Chinese and English text, but after indexing I can only search for the English text, not the Chinese. How can I fix this? Do I need to convert the Chinese characters from Big5 to UTF-8 before indexing the files? If so, how can that be done? The problem does not occur when indexing bilingual (Chinese and English) HTML files with the Lucene demo HTML parser.
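For reference, the Big5 to UTF-8 conversion asked about above could be sketched roughly as follows, using only standard JDK classes. The class name Big5ToUtf8 and its methods are made up for illustration and are not part of Lucene or XMLIndexingDemo. The sketch assumes the input file really is Big5-encoded and that the running JDK provides the "Big5" charset. Whether this step is needed at all is a separate question that depends on how the XML is being parsed.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene: re-encodes a Big5 file as UTF-8.
public class Big5ToUtf8 {

    // Decode the bytes with the source charset, then encode with the target charset.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    // Assumption: the source file is Big5-encoded and small enough to read into memory.
    static void convertFile(Path source, Path target) throws IOException {
        Charset big5 = Charset.forName("Big5");
        byte[] converted = transcode(Files.readAllBytes(source), big5, StandardCharsets.UTF_8);
        Files.write(target, converted);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java Big5ToUtf8 <big5-input> <utf8-output>");
            return;
        }
        convertFile(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

If the XML file carries a declaration such as encoding="Big5", that declaration would also have to be changed to UTF-8 after conversion, since this sketch copies the text as-is and does not rewrite it.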
