You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by Aliya Virani <al...@80legs.com> on 2011/09/15 17:05:09 UTC

Multiple shards on same machine find matches but return 0 results.

Hi,

Recently we have been trying to scale up our Solandra setup to make use of a
more powerful server. To improve query speeds we tried reducing the index
size, and thus increasing the number of shards on a single machine. While we
had no trouble searching and return results when we had a single shard, with
multiple shards we are getting an interesting bug where the search is able
to find the matching documents and returns an accurate count, but returns 0
documents. What could be causing this problem? Are we missing an obvious
parameter? While using multiple shards, if I set isShard=true, we do get
back results with total number found from only one shard.  We tried hitting
each of the cores directly by using the setParam from SolrJ, but are not
getting any results back.  We found the names of the core from the logs
(ie.10.1.10.200:8983/solandra/df~1).


To debug we have set up a test environment that has the latest release of
Solandra and uses the default settings. In this setting with just two shards
we are seeing the same issue. We have also tested with different size shards
ranging from 256 to 4194304. In all cases as soon as we have more than 1
shard Solandra stops returning results even though it is clear the results
were found. Below is some of the log information.

Server specs:
8 cores
32 GB of memory (though we are only allocating 16GB for Solandra)

Using the default settings in Solandra Properties, we added in 2,000,000
documents to ensure there were two shards on the same machine.

For a query that has not been cached:

DEBUG 10:28:55,004 core: df
DEBUG 10:28:55,005 Adding shard(df): 10.1.10.200:8983/solandra/df~0
DEBUG 10:28:55,005 Adding shard(df): 10.1.10.200:8983/solandra/df~1
DEBUG 10:28:55,014 Fetching 0 Docs
INFO 10:28:55,015 [df] webapp=/solandra path=/select
params={fl=key,score&start=0&q=province:az&isShard=true&wt=javabin&fsv=true&rows=10&version=2}
hits=0 status=0 QTime=3
INFO 10:28:55,821 GC for ParNew: 258 ms, 586012000 reclaimed leaving
2122387984 used; max is 16955473920
DEBUG 10:28:58,034 Fetching 10 Docs
DEBUG 10:28:58,035 Going to bulk load 10 documents
DEBUG 10:28:58,099 Document read took: 63ms
INFO 10:28:58,099 [df] webapp=/solandra path=/select
params={fl=key,score&start=0&q=province:az&isShard=true&wt=javabin&fsv=true&rows=10&version=2}
hits=99470 status=0 QTime=3087
DEBUG 10:28:58,101 Document read took: 1ms
DEBUG 10:28:58,102 Document read took: 1ms
DEBUG 10:28:58,104 Document read took: 1ms
DEBUG 10:28:58,105 Document read took: 1ms
DEBUG 10:28:58,107 Document read took: 2ms
DEBUG 10:28:58,108 Document read took: 1ms
DEBUG 10:28:58,109 Document read took: 1ms
DEBUG 10:28:58,110 Document read took: 1ms
DEBUG 10:28:58,112 Document read took: 1ms
DEBUG 10:28:58,113 Document read took: 1ms
DEBUG 10:28:58,118 Fetching 0 Docs
INFO 10:28:58,118 [df] webapp=/solandra path=/select
params={isShard=true&wt=javabin&q=province:az&ids=[us/az/yuma/1152s4thave],[us/az/tempe/208sriverdr],[us/az/mundspark/475pinewoodblvd],[us/az/phoenix/2338wstellaln],[us/az/tucson/3341wwildwooddr],[us/az/surprise/15128wbellrd],[us/az/phoenix/3222egeorgiaave],[us/az/lakehavasucity/2250catamarandr],[us/az/huachucacity/264shuachucablvd],[us/az/tucson/6161sparkave]&version=2}
status=0 QTime=1
INFO 10:28:58,119 [df] webapp=/solandra path=/select
params={wt=javabin&q=province:az&version=2} status=0 QTime=3115

For a query that has been cached:

DEBUG 10:27:36,350 core: df
INFO 10:27:36,351 ShardInfo for df has expired
INFO 10:27:36,353 Found reserved
shard1(106758077800188110322537822484278066430):178410 TO 180224
DEBUG 10:27:36,353 Adding shard(df): 10.1.10.200:8983/solandra/df~0
DEBUG 10:27:36,353 Adding shard(df): 10.1.10.200:8983/solandra/df~1
DEBUG 10:27:36,359 Fetching 0 Docs
INFO 10:27:36,360 [df] webapp=/solandra path=/select
params={fl=key,score&start=0&q=province:ak&isShard=true&wt=javabin&fsv=true&rows=10&version=2}
hits=0 status=0 QTime=2
DEBUG 10:27:36,362 Fetching 10 Docs
DEBUG 10:27:36,363 Found doc in cache
INFO 10:27:36,363 [df] webapp=/solandra path=/select
params={fl=key,score&start=0&q=province:ak&isShard=true&wt=javabin&fsv=true&rows=10&version=2}
hits=14707 status=0 QTime=5
DEBUG 10:27:36,363 Found doc in cache
DEBUG 10:27:36,363 Found doc in cache
DEBUG 10:27:36,363 Found doc in cache
DEBUG 10:27:36,364 Found doc in cache
DEBUG 10:27:36,364 Found doc in cache
DEBUG 10:27:36,364 Found doc in cache
DEBUG 10:27:36,364 Found doc in cache
DEBUG 10:27:36,364 Found doc in cache
DEBUG 10:27:36,365 Found doc in cache
DEBUG 10:27:36,365 Found doc in cache
DEBUG 10:27:36,369 Fetching 0 Docs
INFO 10:27:36,369 [df] webapp=/solandra path=/select
params={isShard=true&wt=javabin&q=province:ak&ids=[us/ak/fairbanks/1483ballainerd],[us/ak/anchorage/4451etudorrd],[us/ak/anchorage/600cordovast],[us/ak/anchorage/6048e6thave],[us/ak/anchorage/940tyonekdr],[us/ak/fairbanks/3800universityaves],[us/ak/kenai/47189sherwoodcir],[us/ak/anchorage/12801oldsewardhwy],[us/ak/anchorage/8400raintreecir],[us/ak/juneau/9150skywoodln]&version=2}
status=0 QTime=1
INFO 10:27:36,370 [df] webapp=/solandra path=/select
params={wt=javabin&q=province:ak&version=2} status=0 QTime=20

In case it was due to the way we were adding the documents, here is the log
for a single write/update:

INFO 10:33:14,666 {add=[us/ak/anchorage/429industrialway]} 0 8
INFO 10:33:14,666 [df] webapp=/solandra path=/update
params={wt=javabin&version=2} status=0 QTime=8
DEBUG 10:33:14,683 update for document 606298
DEBUG 10:33:14,683 Adding 606298 to df~0
DEBUG 10:33:14,685 Deleted all terms for: 606298
DEBUG 10:33:14,685 df~0 - firstTerm:
142390502986727170797762286641249888713���key
DEBUG 10:33:14,685 df~0 - firstTerm:
142390502986727170797762286641249888713���category
DEBUG 10:33:14,686 df~0 - firstTerm:
142390502986727170797762286641249888713���category
DEBUG 10:33:14,686 df~0 - firstTerm:
142390502986727170797762286641249888713���category
DEBUG 10:33:14,686 df~0 - firstTerm:
142390502986727170797762286641249888713���category
DEBUG 10:33:14,687 df~0 - firstTerm:
142390502986727170797762286641249888713���category
DEBUG 10:33:14,687 df~0 - firstTerm:
142390502986727170797762286641249888713���source
DEBUG 10:33:14,687 df~0 - firstTerm:
142390502986727170797762286641249888713���dateAdded
DEBUG 10:33:14,687 df~0 - firstTerm:
142390502986727170797762286641249888713���dateUpdated
DEBUG 10:33:14,687 df~0 - firstTerm:
142390502986727170797762286641249888713���type
DEBUG 10:33:14,687 df~0 - firstTerm:
142390502986727170797762286641249888713���encoding
DEBUG 10:33:14,688 df~0 - firstTerm:
142390502986727170797762286641249888713���postalcode
DEBUG 10:33:14,688 df~0 - firstTerm:
142390502986727170797762286641249888713���sic
DEBUG 10:33:14,688 df~0 - firstTerm:
142390502986727170797762286641249888713���phone
DEBUG 10:33:14,688 df~0 - firstTerm:
142390502986727170797762286641249888713���long
DEBUG 10:33:14,688 df~0 - firstTerm:
142390502986727170797762286641249888713���city
DEBUG 10:33:14,688 df~0 - firstTerm:
142390502986727170797762286641249888713���country
DEBUG 10:33:14,689 df~0 - firstTerm:
142390502986727170797762286641249888713���address
DEBUG 10:33:14,689 df~0 - firstTerm:
142390502986727170797762286641249888713���name
DEBUG 10:33:14,689 df~0 - firstTerm:
142390502986727170797762286641249888713���province
DEBUG 10:33:14,689 df~0 - firstTerm:
142390502986727170797762286641249888713���lat