Posted to solr-user@lucene.apache.org by "Sundling, Paul" <pa...@sonyconnect.com> on 2007/07/27 04:26:34 UTC
Solr and Chinese/Japanese
Are there any known Solr sites that are in Chinese or Japanese?
I need to include links to such sites for a comparison I'm doing on
enterprise search engines.
I realize that if I stay with UTF-8 it should work, and that I can use
the CJK analyzer.
Paul Sundling
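For readers unfamiliar with the CJK analyzer mentioned above: Lucene's CJKAnalyzer indexes Chinese, Japanese, and Korean text as overlapping two-character tokens (bigrams) rather than trying to find word boundaries. The sketch below only illustrates that idea. It is not Lucene's code, the function name cjk_bigrams is hypothetical, and it assumes the input is already a single unbroken run of CJK characters.

```python
def cjk_bigrams(run):
    """Split one run of CJK characters into overlapping bigrams.

    Simplified illustration of the bigram idea only. A real analyzer
    also handles Latin text, punctuation, and run boundaries.
    """
    if len(run) < 2:
        # A lone character (or empty input) cannot form a bigram.
        return [run] if run else []
    return [run[i:i + 2] for i in range(len(run) - 1)]


tokens = cjk_bigrams("日本語")
single = cjk_bigrams("火")
print(tokens, single)
```

Because queries are tokenized the same way, a search for a two-character term matches any document containing that pair of adjacent characters.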
Re: Solr and Chinese/Japanese
Posted by Alan Darnell <al...@utoronto.ca>.
What about scripts that are written right to left? How does Solr
handle these in terms of sorting and searching? Can left-to-right
and right-to-left scripts be handled in the same Solr document?
Alan
On 27-Jul-07, at 8:29 AM, Erik Hatcher wrote:
>
> On Jul 27, 2007, at 6:17 AM, Erik Hatcher wrote:
>
>>
>> On Jul 26, 2007, at 10:26 PM, Sundling, Paul wrote:
>>> Are there any known Solr sites that are in Chinese or Japanese?
>>
>> This might be the first mention of this project in the Solr
>> community, and I'm certainly not confident our server can handle
>> the load but here goes anyway :)
>>
>> <http://blacklight.betech.virginia.edu/>
>>
>> The bulk of the content, 3.8M documents, is not Chinese, but there
>> are 320 Tang dynasty poems indexed there with both English and
>> Chinese content. Click on the "Tang Dynasty Poems" on the top
>> right facet. You can search in Chinese, no problem too:
>
> I had trouble with the link I sent before, but maybe this one will
> work more generally:
>
> <http://blacklight.betech.virginia.edu/search?q=%E7%81%AB+AND+%E6%B0%B4>
>
>
Re: Solr and Chinese/Japanese
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Jul 27, 2007, at 6:17 AM, Erik Hatcher wrote:
>
> On Jul 26, 2007, at 10:26 PM, Sundling, Paul wrote:
>> Are there any known Solr sites that are in Chinese or Japanese?
>
> This might be the first mention of this project in the Solr
> community, and I'm certainly not confident our server can handle
> the load but here goes anyway :)
>
> <http://blacklight.betech.virginia.edu/>
>
> The bulk of the content, 3.8M documents, is not Chinese, but there
> are 320 Tang dynasty poems indexed there with both English and
> Chinese content. Click on the "Tang Dynasty Poems" on the top
> right facet. You can search in Chinese, no problem too:
I had trouble with the link I sent before, but maybe this one will
work more generally:
<http://blacklight.betech.virginia.edu/search?q=%E7%81%AB+AND+%E6%B0%B4>
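The second link works more reliably because the Chinese characters are percent-encoded as UTF-8 bytes, so mail clients that mangle non-ASCII text leave the URL intact. A minimal sketch of how such a link can be built, using Python's standard library (the helper name search_url is hypothetical; the base URL and query are taken from the message above):

```python
from urllib.parse import quote_plus, unquote_plus

# Search endpoint as given in the message above.
BASE = "http://blacklight.betech.virginia.edu/search"


def search_url(query):
    """Build a search link with the query percent-encoded as UTF-8.

    quote_plus turns spaces into '+' and each non-ASCII character
    into the %XX escapes of its UTF-8 bytes.
    """
    return BASE + "?q=" + quote_plus(query)


url = search_url("火 AND 水")
print(url)

# Decoding the encoded part recovers the original query text.
decoded = unquote_plus(url.split("?q=", 1)[1])
print(decoded)
```

Here 火 becomes %E7%81%AB and 水 becomes %E6%B0%B4, which matches the link in the message.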
Re: Solr and Chinese/Japanese
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Jul 26, 2007, at 10:26 PM, Sundling, Paul wrote:
> Are there any known Solr sites that are in Chinese or Japanese?
This might be the first mention of this project in the Solr
community, and I'm certainly not confident our server can handle the
load but here goes anyway :)
<http://blacklight.betech.virginia.edu/>
The bulk of the content, 3.8M documents, is not Chinese, but there
are 320 Tang dynasty poems indexed there with both English and
Chinese content. Click on the "Tang Dynasty Poems" on the top right
facet. You can search in Chinese, no problem too:
<http://blacklight.betech.virginia.edu/search?q=火+AND+水>
(hopefully that link will pass through e-mail ok)
Blacklight is an unsupported demo of library data + Solr + Ruby on
Rails. The library data comes from 3 different sources:
* MARC data from our integrated library system, converted to UTF-8 -
there are non-English words in some of this data (tinker with the
language facet to stumble on Russian and other stuff)
* TEI data sample from our "Digital Library"
* HTML-scraped Tang dynasty poems
Erik