Posted to solr-user@lucene.apache.org by adityab <ad...@yahoo.com> on 2013/05/18 20:49:33 UTC

Wide vs Tall document in Solr 4.2.1

Hi,
We recently decided to move from Solr version 3.5 to 4.2.1. The transition
seems to be smooth from a development point of view, but I see some
intermittent issues with our cluster.
Some information: we use the classic Master/Slave model (and have plans to
move to Cloud v4.3).

Documents: 300K, with around 150 fields (including dynamic)
Index size: 10GB

Most of the fields are multiValued (type String), and the number of values in
each varies from 5 to 50K. So 30% of our popular documents are tall. Not all
of the information in these multivalued fields is required, so at the
application layer we loop over the values and eliminate the unwanted ones.
The data is stored in this fashion because of the one-to-many mapping in our
SQL DB.

The issue we observed is high CPU and memory utilization while retrieving
these documents with large multivalued fields.
So my question is whether it is possible to turn this tall document into a
wide document so that only the required information is fetched. Is this a
better approach to look into? Any other thoughts are welcome.

thanks
Aditya 



--
View this message in context: http://lucene.472066.n3.nabble.com/Wide-vs-Tall-document-in-Solr-4-2-1-tp4064409.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wide vs Tall document in Solr 4.2.1

Posted by adityab <ad...@yahoo.com>.
Thanks for your reply, Chris.
Yes, I am aware of this bug. We had reported it through LucidWorks during
our 4.2.0 evaluation :)

I will try to get a thread dump and verify where the CPU is pegged.

Regarding tall documents: we have a huge list of values in a multivalued
field in the document, which is what I refer to as a tall document.
If we include the huge multivalued field in fl, we see high CPU and the
request takes a long time. I have a way to split this large multivalued field
into smaller lists, which is what I am referring to as a wide document.
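
The tall-to-wide split described above could look roughly like the sketch
below. The message does not say how the values would be grouped, so the
bucketing rule, the field names ("offers", "offers_us", ...) and the helper
name widen are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def widen(doc, tall_field, bucket_of, prefix):
    """Return a copy of doc with tall_field split into several smaller fields.

    bucket_of maps each value to a bucket name; values sharing a bucket are
    stored together in the field "<prefix><bucket>". Both the bucketing rule
    and the naming scheme are assumptions, not something the thread specifies.
    """
    buckets = defaultdict(list)
    for value in doc.get(tall_field, []):
        buckets[bucket_of(value)].append(value)

    # Copy every field except the tall one, then add one field per bucket.
    wide = {name: vals for name, vals in doc.items() if name != tall_field}
    for bucket, values in buckets.items():
        wide[prefix + bucket] = values
    return wide

# Hypothetical document: values carry a "region:" prefix used for bucketing.
tall = {"id": "1", "offers": ["us:a", "us:b", "uk:c"]}
wide = widen(tall, "offers", lambda v: v.split(":", 1)[0], "offers_")
print(wide)
```

With a layout like this, a request could list only the smaller field it needs
(for example offers_us) in fl instead of pulling back the whole list.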

thanks
Aditya




Re: Wide vs Tall document in Solr 4.2.1

Posted by Chris Hostetter <ho...@fucit.org>.
: We recently decided to move from Solr version 3.5 to 4.2.1. The transition
	...
: Most of the fields are multiValued (type String) and the size of array in
: those vary from 5 to 50K. So our 30% of popular documents are tall. Not all
	...
: Issues that we observed is high CPU and Memory utilization while retrieving
: these document with large multivalued fields.

Are you certain you are using 4.2.1 and not 4.2?

There was a particularly bad bug related to "enableLazyFieldLoading" 
affecting Solr 4.0, 4.1, and 4.2, but it should *not* affect 4.2.1...

	https://issues.apache.org/jira/browse/SOLR-4589

If you are seeing slow response times and heavy CPU spikes, it would help 
if you could take some thread dumps during those CPU spikes to see 
what is chewing up CPU ... you may just be seeing the effects of stored 
field compression -- which uses more CPU on stored field retrieval to 
decompress the blocks of field values, but allows the index size to be 
much smaller so more things can be cached in RAM.

: So my questions is if its possible to make this tall document to a wide
: document so only required information is fetched. Is this a better 
: approach to look for? Any other thoughts are welcomed.

I don't really understand what you mean by "tall" vs "wide" (i thought i 
understood what you meant by "tall" initially, but i don't understand what 
you mean by "make the tall document wide")

just in case it's not obvious: if there are stored fields you don't want 
back in the response, leave them out of your "fl" param and only request 
the fields you actually want.
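
As a rough illustration of the "fl" advice above, here is a minimal sketch
that builds a select URL naming only the wanted stored fields. The host, core
name, query and field names are hypothetical placeholders, and the helper
build_select_url is not part of Solr; the sketch only assumes the standard
q, fl and wt request parameters.

```python
from urllib.parse import urlencode

def build_select_url(base_url, query, fields):
    """Build a Solr /select URL that asks only for the named stored fields."""
    params = {
        "q": query,
        "fl": ",".join(fields),  # comma-separated list of fields to return
        "wt": "json",
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)

# Hypothetical host, core and field names, for illustration only.
url = build_select_url(
    "http://localhost:8983/solr/collection1",
    "id:12345",
    ["id", "title"],
)
print(url)
```

Any large multivalued field left out of the list is simply not returned in
the response.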


-Hoss