Posted to solr-user@lucene.apache.org by Asfand Qazi <aq...@sanger.ac.uk> on 2012/08/30 12:03:12 UTC

Possible to have Solr documents with deeply nested data structures (i.e. 'hashes within hashes')?

Hi,

Is it possible to have Solr documents with deeply nested data structures?

e.g. (in JSON)

{
     "name": "Fred",
     "measurements": {
         "chest": "15",
         "legs": "32",
         ...
     }
}

?

Thanks

-- 
Regards,
       Asfand Yar Qazi
       Team 87 - High Throughput Gene Targeting
       Wellcome Trust Sanger Institute



-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 

Re: Possible to have Solr documents with deeply nested data structures (i.e. 'hashes within hashes')?

Posted by Walter Underwood <wu...@wunderwood.org>.
On Aug 30, 2012, at 8:07 AM, Asfand Qazi wrote:

> The consensus around here seems to be to start using multiple cores to hold the different bits of a deeply nested record anyway, but I'll remember the flattening suggestions made.

It is hard for me to understand how that could possibly work.

If you gave a real example of your data, we could help more.

wunder
--
Walter Underwood
wunder@wunderwood.org




Re: Possible to have Solr documents with deeply nested data structures (i.e. 'hashes within hashes')?

Posted by Asfand Qazi <aq...@sanger.ac.uk>.
On 30/08/12 15:51, Alexandre Rafalovitch wrote:
> Don't treat SOLR as your primary database with complex structure, it
> is not built for that.

On 30/08/12 16:01, Jack Krupansky wrote:
> Maybe start by focusing on what you expect that a user query will look
> like in Solr.
>

Yeah, thanks guys - I'll remember that.  Maybe I'm getting ahead of myself.

The consensus around here seems to be to start using multiple cores to 
hold the different bits of a deeply nested record anyway, but I'll 
remember the flattening suggestions made.

Thanks

-- 
Regards,
       Asfand Yar Qazi
       Team 87 - High Throughput Gene Targeting
       Wellcome Trust Sanger Institute




Re: Possible to have Solr documents with deeply nested data structures (i.e. 'hashes within hashes')?

Posted by Jack Krupansky <ja...@basetechnology.com>.
There are multi-valued fields as well. You just have to be creative in the 
flattening process.

And there is a "join" capability as well:
http://wiki.apache.org/solr/Join
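
For illustration only, here is a rough sketch of how a join query string might be
assembled on the client. The {!join from=... to=...} local-params form is the one
described on the wiki page above; the field names parent_id, id and
leg_measurement are hypothetical, and the helper join_query is not part of Solr.

```python
from urllib.parse import urlencode


def join_query(from_field, to_field, inner_query):
    """Build a Solr join query in local-params syntax (hypothetical helper)."""
    return f"{{!join from={from_field} to={to_field}}}{inner_query}"


# Assumed layout: each measurement is its own document carrying a parent_id,
# and the query should return the parent documents.
q = join_query("parent_id", "id", "leg_measurement:15")
print(q)

# URL-encoded parameters, ready to append to a /select request
print(urlencode({"q": q, "wt": "json"}))
```
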

In any case, try to take the simplest approaches first before getting overly 
complex.

Maybe start by focusing on what you expect a user query to look like in
Solr.

-- Jack Krupansky

-----Original Message----- 
From: Asfand Qazi
Sent: Thursday, August 30, 2012 10:46 AM
To: solr-user@lucene.apache.org
Subject: Re: Possible to have Solr documents with deeply nested data 
structures (i.e. 'hashes within hashes')?

On 30/08/12 15:19, Jack Krupansky wrote:
> The general rule is that you need to flatten your data. So, you would
> have "chest_measurement" and "leg_measurement" fields.
>
> -- Jack Krupansky

Ah.  What if I cannot flatten it because I have an array of hashes?

Thanks

Asfand Yar Qazi



Re: Possible to have Solr documents with deeply nested data structures (i.e. 'hashes within hashes')?

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
You can always flatten. Remember, you are using SOLR to find the
records. So, just make sure you flatten so each result represents one
unit of information for your purposes.

So, you push all leg measurements into one big multiValued
leg_measurement field (and return all records that have at least one
leg_measurement equal to 15).
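
A minimal sketch of that multi-valued flattening, assuming the nested data is an
array of hashes under a key called "measurements". The function name, the sample
record and the "_measurement" suffix are made up for illustration.

```python
from collections import defaultdict


def flatten_multivalued(record, list_key, suffix):
    """Turn an array of hashes into one multi-valued field per inner key."""
    doc = {k: v for k, v in record.items() if k != list_key}
    values = defaultdict(list)
    for item in record.get(list_key, []):
        for key, value in item.items():
            values[f"{key}_{suffix}"].append(value)
    doc.update(values)
    return doc


record = {
    "id": "fred",
    "name": "Fred",
    "measurements": [
        {"leg": 15, "chest": 32},
        {"leg": 16, "chest": 33},
    ],
}
doc = flatten_multivalued(record, "measurements", "measurement")
print(doc["leg_measurement"])
```

Note that this keeps only the values per field: which leg value belonged with
which chest value is no longer recorded, so it suits "at least one value
matches" queries rather than queries that correlate several inner keys.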

Or you pull the parent details into individual records, with each record
being one measurement, and include enough information to identify the
parent record, which you then sort/uniq on the client.
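
A rough sketch of this second option, assuming each inner hash becomes its own
document. The field names parent_id, part and value, and both helper functions,
are hypothetical.

```python
def to_child_docs(record, list_key):
    """Emit one flat document per inner hash, copying the parent details."""
    parent = {k: v for k, v in record.items() if k not in (list_key, "id")}
    docs = []
    for i, item in enumerate(record[list_key]):
        doc = {"id": f"{record['id']}_{i}", "parent_id": record["id"]}
        doc.update(parent)  # denormalised parent fields
        doc.update(item)    # the measurement itself
        docs.append(doc)
    return docs


def unique_parents(hits):
    """Client-side 'uniq': parent IDs in first-seen order, without repeats."""
    return list(dict.fromkeys(hit["parent_id"] for hit in hits))


record = {
    "id": "fred",
    "name": "Fred",
    "measurements": [
        {"part": "leg", "value": 15},
        {"part": "chest", "value": 32},
        {"part": "leg", "value": 15},
    ],
}
docs = to_child_docs(record, "measurements")

# Stand-in for the search results: every measurement document that matched
hits = [d for d in docs if d["part"] == "leg" and d["value"] == 15]
print(unique_parents(hits))
```
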

If you have a deep structure, you will probably use SOLR just to index
(stored=false for fields) and use record ID to get the full record
detail from the original database. Don't treat SOLR as your primary
database with complex structure, it is not built for that.
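
As a sketch of that last pattern: the search returns only record IDs, and the
full nested record comes from the system of record. Here a plain dict stands in
for the original database, and hit_ids stands in for the IDs a search would
return; both are assumptions for illustration.

```python
def fetch_full_records(hit_ids, database):
    """Look up each matching ID in the original store, skipping stale IDs."""
    return [database[doc_id] for doc_id in hit_ids if doc_id in database]


# Stand-in for the original database, keyed by record ID
database = {
    "fred": {"name": "Fred", "measurements": {"chest": "15", "legs": "32"}},
}

# Stand-in for IDs returned by the index; "gone" was deleted from the database
hit_ids = ["fred", "gone"]
print(fetch_full_records(hit_ids, database))
```
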

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Thu, Aug 30, 2012 at 10:46 AM, Asfand Qazi <aq...@sanger.ac.uk> wrote:
> On 30/08/12 15:19, Jack Krupansky wrote:
>>
>> The general rule is that you need to flatten your data. So, you would
>> have "chest_measurement" and "leg_measurement" fields.
>>
>> -- Jack Krupansky
>
>
> Ah.  What if I cannot flatten it because I have an array of hashes?
>
> Thanks
>
> Asfand Yar Qazi

Re: Possible to have Solr documents with deeply nested data structures (i.e. 'hashes within hashes')?

Posted by Asfand Qazi <aq...@sanger.ac.uk>.
On 30/08/12 15:19, Jack Krupansky wrote:
> The general rule is that you need to flatten your data. So, you would
> have "chest_measurement" and "leg_measurement" fields.
>
> -- Jack Krupansky

Ah.  What if I cannot flatten it because I have an array of hashes?

Thanks

Asfand Yar Qazi


Re: Possible to have Solr documents with deeply nested data structures (i.e. 'hashes within hashes')?

Posted by Jack Krupansky <ja...@basetechnology.com>.
The general rule is that you need to flatten your data. So, you would have 
"chest_measurement" and "leg_measurement" fields.

-- Jack Krupansky
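
A minimal sketch of the flattening rule applied to the example document from
the original question. The naming scheme here (outer key, underscore, inner key)
is an assumption and gives "measurements_chest" rather than the
"chest_measurement" style suggested above; either works as long as the schema
uses the same names.

```python
def flatten(record, prefix="", sep="_"):
    """Collapse nested hashes into a single-level dict of field names."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


doc = flatten({"name": "Fred", "measurements": {"chest": "15", "legs": "32"}})
print(doc)
# {'name': 'Fred', 'measurements_chest': '15', 'measurements_legs': '32'}
```
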
