Posted to solr-user@lucene.apache.org by "varsha.yadav" <va...@orkash.com> on 2013/05/15 10:44:46 UTC

Hierarchical Faceting

Hi Everyone,

I am working on hierarchical faceting. I index the location of each 
document along with its state and district.
I would like to get a count for every country, together with counts for 
its states and districts. I found that facet pivot gives me the right 
counts if I use single-valued fields like
-------
<doc>
<str name="country">india</str>
<str name="state">maharashtra</str>
</doc>
<doc>
<str name="country">india</str>
<str name="state">gujrat</str>
</doc>
<doc>
<str name="country">india</str>
<str name="district">Faridabad</str>
<str name="state">Haryana</str>
</doc>
<doc>
<str name="country">china</str>
<str name="district">foshan</str>
<str name="state">guangdong</str>
</doc>
------------
The results I get are fine:
<arr name="country,state,district,event">
<lst>
<str name="field">country</str>
<str name="value">india</str>
<int name="count">1</int>
<arr name="pivot">

<lst>
<str name="field">state</str>
<str name="value">maharashtra</str>
<int name="count">1</int>
<arr name="pivot"></arr>
</lst>
<lst>
<str name="field">state</str>
<str name="value">Haryana</str>
<int name="count">1</int>
<arr name="pivot">
<lst>
<str name="field">district</str>
<str name="value">Faridabad</str>
<int name="count">1</int>
</lst>
</arr>
</lst>
</arr>
</lst>
<lst>
<str name="field">country</str>
<str name="value">china</str>
<int name="count">1</int>
<arr name="pivot">
<lst>
<str name="field">state</str>
....
</lst>
</arr>


But what if my documents have multiple locations, like:

<doc>
<arr name="location">
<str>japan|JAPAN|null|</str>
<str>
brisbane|Australia|Queensland
</str>
<str>
afghanistan|AFGHANISTAN|null
</str>
</arr>
</doc>

<doc>
<arr name="location">
<str>
afghanistan|AFGHANISTAN|null
</str>
</arr>
</doc>

<doc>
<arr name="location">
<str>
brisbane|Australia|Queensland
</str>
</arr>
</doc>


Can anyone tell me how I should put the data in the Solr index to get 
hierarchical data?
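
To illustrate why simply splitting these values into separate multivalued fields is not enough: a pivot over independent multivalued fields counts a document under every combination of its values, so the pairing between a state and its country is lost. A minimal plain-Python sketch of that counting (not Solr code; the `country`/`state` field split and the `pivot_pairs` helper are assumptions made for this example):

```python
from collections import Counter

# First sample document above, split into two independent multivalued fields.
# The field names "country" and "state" are assumptions for this sketch.
doc = {
    "country": ["JAPAN", "Australia", "AFGHANISTAN"],
    "state": ["Queensland"],  # only the Australian location has a state
}

def pivot_pairs(docs, parent, child):
    """Count documents per (parent value, child value) combination,
    ignoring which values were originally paired within one location."""
    counts = Counter()
    for d in docs:
        for p in set(d.get(parent, [])):
            for c in set(d.get(child, [])):
                counts[(p, c)] += 1
    return counts

pairs = pivot_pairs([doc], "country", "state")
# Queensland is counted under every country of the document, not only Australia.
print(sorted(pairs))
```

This is why a representation that keeps each location's levels together in one value tends to be a better fit for multi-location documents.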

Thanks
Varsha

Re: Hierarchical Faceting

Posted by Chris Hostetter <ho...@fucit.org>.
: Subject: Hierarchical Faceting
: References:
:     <15...@uni-bielefeld.de>
: In-Reply-To:
:     <15...@uni-bielefeld.de>

https://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.



-Hoss

Re: Hierarchical Faceting

Posted by "varsha.yadav" <va...@orkash.com>.
Hi,

Thanks Upayavira. But I am still not getting good results.
I have been following http://wiki.apache.org/solr/HierarchicalFaceting
I have hierarchical data for faceting. Some documents also have multiple 
hierarchies, like:
Doc#1 London > UK > 51.5
Doc#2 UK > 54.0
Doc#3 Indiana > United States > 40.0, London > UK > 51.5
Doc#4 United States > 39.7, Washington > United States > 38.8

What would be the optimal schema for indexing this data, so that I get 
the following from Solr queries:
1) I want to retrieve hierarchical counts with a facet pivot query, e.g. 
facet.pivot=country,state
2) I want the lat values for every document in the query output, e.g. 
Doc#3: 40.0, 51.5 and Doc#2: 54.0
3) I want to run direct search queries like country:"United States" or 
state:"Washington"

I think this expresses my requirement along with the data.
Please tell me how I can index the data and retrieve it through queries.
I checked out the solution you suggested, 
PathHierarchyTokenizerFactory. But along with the hierarchy I have to store 
the data under names such as state, district, lat, lon, etc., so that I can 
also run direct queries on those fields.
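
One possible way to shape such documents, sketched in plain Python under stated assumptions: each location is written as `[State >] Country > lat` as in the Doc#1 to Doc#4 examples above, and the field names `location_path`, `country`, `state`, and `lat` are hypothetical, not taken from any existing schema. The idea is to keep one path value per location for the hierarchy, plus separate multivalued fields for direct queries and for returning the lat values:

```python
def parse_locations(raw):
    """Split 'State > Country > lat, Country > lat' into per-location dicts."""
    locations = []
    for entry in raw.split(","):
        parts = [p.strip() for p in entry.split(">")]
        *names, lat = parts          # the last element is the latitude
        country = names[-1]          # the country sits just before the latitude
        state = names[0] if len(names) > 1 else None
        locations.append({"country": country, "state": state, "lat": float(lat)})
    return locations

def to_solr_doc(doc_id, raw):
    """Build a dict with hypothetical field names for one input document."""
    locs = parse_locations(raw)
    return {
        "id": doc_id,
        # one path per location keeps country and state paired
        "location_path": [
            "/".join(filter(None, [loc["country"], loc["state"]])) for loc in locs
        ],
        # separate fields for direct queries such as country:"United States"
        "country": sorted({loc["country"] for loc in locs}),
        "state": sorted({loc["state"] for loc in locs if loc["state"]}),
        "lat": [loc["lat"] for loc in locs],
    }

doc3 = to_solr_doc("3", "Indiana > United States > 40.0, London > UK > 51.5")
print(doc3)
```

With this shape the hierarchical counts would come from faceting on the path field rather than from facet.pivot over `country` and `state`; whether that trade-off is acceptable depends on the application.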

Thanks
Varsha

On 05/15/2013 10:32 PM, Upayavira wrote:
> Can't you use the PathHierarchyTokenizerFactory mentioned on that page?
> I think it is called descendent_path in the default schema. Won't that
> get you what you want?
>
>    UK/London/Covent Garden
> becomes
>    UK
>    UK/London
>    UK/London/Covent Garden
>
> and
>    India/Maharastra/Pune/Dapodi
> becomes
>    India
>    India/Maharastra
>    India/Maharastra/Pune
>    India/Maharastra/Pune/Dapodi
>
> These fields can be multivalued.
>
> Upayavira

-- 
Thanks & Regards
Varsha


Re: Hierarchical Faceting

Posted by Upayavira <uv...@odoko.co.uk>.
Can't you use the PathHierarchyTokenizerFactory mentioned on that page?
I think it is called descendent_path in the default schema. Won't that
get you what you want?

  UK/London/Covent Garden
becomes
  UK
  UK/London
  UK/London/Covent Garden

and 
  India/Maharastra/Pune/Dapodi
becomes
  India
  India/Maharastra
  India/Maharastra/Pune
  India/Maharastra/Pune/Dapodi

These fields can be multivalued.
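
As a rough illustration of the expansion described above, here is a plain-Python sketch of the idea (it is not Solr's tokenizer code, and `expand_path` and `facet_counts` are hypothetical helper names). Each path value is expanded into all of its ancestor prefixes, and a document is then counted once per distinct prefix, which also works when a document carries several paths:

```python
from collections import Counter

def expand_path(path, delimiter="/"):
    """Return every ancestor prefix of a delimited path, shortest first."""
    parts = path.split(delimiter)
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]

def facet_counts(docs):
    """Count documents per prefix; each doc is a list of path values."""
    counts = Counter()
    for paths in docs:
        prefixes = set()
        for path in paths:
            prefixes.update(expand_path(path))
        counts.update(prefixes)  # a document counts once per distinct prefix
    return counts

print(expand_path("UK/London/Covent Garden"))

# Three documents, the last one with two locations.
counts = facet_counts([
    ["UK/London"],
    ["UK"],
    ["United States/Indiana", "UK/London"],
])
print(counts)
```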

Upayavira

On Wed, May 15, 2013, at 12:29 PM, varsha.yadav wrote:
> Hi
> 
> I went through that, but I want to index multiple locations in a single 
> document, and a single location has multiple features/attributes like 
> country, state, district, etc. I want to index them and get hierarchical 
> facet results from a facet pivot query. One more thing: my documents vary 
> and may have one, two, three, or any number of locations.

Re: Hierarchical Faceting

Posted by "varsha.yadav" <va...@orkash.com>.
Hi

I went through that, but I want to index multiple locations in a single 
document, and a single location has multiple features/attributes like 
country, state, district, etc. I want to index them and get hierarchical 
facet results from a facet pivot query. One more thing: my documents vary 
and may have one, two, three, or any number of locations.


On 05/15/2013 03:55 PM, Upayavira wrote:
> http://wiki.apache.org/solr/HierarchicalFaceting

-- 
Thanks & Regards
Varsha


Re: Hierarchical Faceting

Posted by Upayavira <uv...@odoko.co.uk>.
http://wiki.apache.org/solr/HierarchicalFaceting
