Posted to xindice-dev@xml.apache.org by Konrad Scherer <bc...@uottawa.ca> on 2002/05/28 17:19:21 UTC
UTF-8 [again]
Hello all,
As I am new to the list, I will briefly explain my situation. I work for a
small university project that is creating a fully bilingual Canadian
English and French dictionary. The project started in the 1980s and is
currently still done in SGML. I have completed the conversion to XML and
am now planning to use the Xindice and Cocoon combination to search the
XML documents and make them available on the Internet.
As I am sure most of you know, the 1.0 release garbled all special
characters, including letters with French accents. The CVS version (as of
last Friday, May 24) didn't solve the problem, but with the patch from
http://lambiek.amplexor.be/downloads/xindice-utf8-patches
Xindice outputs the special characters in UTF-8. Thank you very much for your work.
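For anyone who has not seen the symptom, here is a minimal Python sketch of one common way accented letters get garbled: UTF-8 bytes read back with a single-byte charset. This is only an illustration of the general mechanism, not a diagnosis of what the 1.0 release actually does; the sample word and the choice of ISO-8859-1 are my assumptions.

```python
# Illustration only: the sample word and the ISO-8859-1 charset are
# assumptions, not taken from Xindice's code.
original = "fran\u00e7ais"  # "francais" with a c-cedilla

# Encode the text as UTF-8; the c-cedilla becomes two bytes (0xC3 0xA7).
utf8_bytes = original.encode("utf-8")

# Decoding those bytes with a single-byte charset turns each byte into
# its own character, so the accented letter becomes two wrong characters.
garbled = utf8_bytes.decode("iso-8859-1")

# Decoding with the charset that was used to encode restores the text.
restored = utf8_bytes.decode("utf-8")

print(garbled)
print(restored)
```

The garbled string is one character longer than the original, because the single accented letter was split into two.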
Some additional observations.
1) The last command in the patch failed. Probably not important.
2) The Xindice-HTTP-0.8 package does not like UTF-8 characters.
I also have a question. Xindice resolves all entities when storing a
document. Would it be possible to store the entities unresolved instead?
XPath queries would resolve the entities during a search, but I could
retrieve the document with the entities unresolved and then let the browser
or whatever displays the document worry about rendering them.
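To make the question concrete, here is a small Python sketch of what I mean by "resolved". It uses Python's standard ElementTree parser rather than Xindice, and the `entry` element and `eacute` entity are made-up examples. I am only assuming that Xindice's parser behaves similarly.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: an internal DTD subset declares one entity,
# and the element content refers to it.
doc = '<!DOCTYPE entry [<!ENTITY eacute "&#233;">]><entry>caf&eacute;</entry>'

# The parser expands the entity reference while reading the document.
root = ET.fromstring(doc)
text = root.text

# Serializing the parsed tree gives the expanded character back;
# the original "&eacute;" reference is no longer present.
serialized = ET.tostring(root, encoding="unicode")

print(text)
print(serialized)
```

Once the parser has expanded the reference, the application only sees the resulting character, so the original entity name cannot be recovered from the stored tree.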
Again, thank you for your time.
Konrad Scherer