Posted to solr-user@lucene.apache.org by je...@bnf.fr on 2013/08/29 14:52:12 UTC

[SOLR 4.4 or 4.2] indexing with dih and solrcloud

Hello,

I'm trying to index documents with the Data Import Handler (DIH) and
SolrCloud together (the collection is huge, so I need to index in parallel).

At first I had a DIH configuration that works with standalone Solr
(weekly indexing for two months).

I've converted my configuration to "cloudify" it, with one shard to begin
with (adding the config file and launching with the zkRun option).
I can see the Solr admin interface with the cloud panels (tree view, 1 shard
connected and active, ...), so it seems to work.

When I index using DIH, it looks like it is working: the input XML files
are read, but no documents are stored in the index, exactly as if I had
set the commit argument to false.

This is the response to the DIH request:
{
  "responseHeader":{
    "status":0,
    "QTime":32871},
  "initArgs":[
    "defaults",[
      "config","mnb-data-config.xml"]],
  "command":"full-import",
  "mode":"debug",
  "documents":[],
  "verbose-output":[
    "entity:noticebib",[
      "entity:processorDocument",[],
...
      "entity:processorDocument",[],
      null,"----------- row #1-------------",
      "CHEMINRELATIF","3/7/000/37000143.xml",
      null,"---------------------------------------------",
...
  "status":"idle",
  "importResponse":"",
  "statusMessages":{
    "Total Requests made to DataSource":"16",
    "Total Rows Fetched":"15",
    "Total Documents Skipped":"0",
    "Full Dump Started":"2013-08-29 12:08:48",
    "Total Documents Processed":"0",
    "Time taken":"0:0:32.684"},

In the logs (see below), I see the PRE_UPDATE FINISH message, followed by
some debug messages about "Could not retrieve login configuration" coming
from ZooKeeper.

So my question is: what could be wrong in my config?
_ something about synchronization in ZooKeeper (the "Could not retrieve" message)
_ a step missing in the Data Import Handler
I don't see how to diagnose this.
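
As a starting point for diagnosis, the counters in the DIH response above can be checked programmatically. The following is only a minimal sketch, not part of Solr: the function name `diagnose_dih_status` is hypothetical, and the sample is a trimmed copy of the response with the elided parts left out. It flags the pattern seen here, where rows are fetched but no documents are processed.

```python
import json


def diagnose_dih_status(response_text):
    """Return a list of warnings derived from a DIH JSON response."""
    response = json.loads(response_text)
    messages = response.get("statusMessages", {})
    # DIH reports its counters as strings, so convert before comparing.
    fetched = int(messages.get("Total Rows Fetched", 0))
    processed = int(messages.get("Total Documents Processed", 0))
    skipped = int(messages.get("Total Documents Skipped", 0))

    warnings = []
    if fetched > 0 and processed == 0:
        warnings.append(
            f"{fetched} rows fetched but 0 documents processed "
            f"({skipped} skipped): rows are read but never become documents"
        )
    if response.get("mode") == "debug":
        warnings.append("import ran in debug mode")
    return warnings


# Trimmed copy of the response above (elided parts are omitted).
sample = """
{
  "command": "full-import",
  "mode": "debug",
  "documents": [],
  "status": "idle",
  "statusMessages": {
    "Total Requests made to DataSource": "16",
    "Total Rows Fetched": "15",
    "Total Documents Skipped": "0",
    "Total Documents Processed": "0"
  }
}
"""

for warning in diagnose_dih_status(sample):
    print(warning)
```

A check like this does not explain why no documents are built; it only confirms that the import stops between reading rows and creating documents, which narrows where to look in the entity configuration.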

DEBUG 2013-08-29 12:09:21,411 http-8080-1
org.apache.solr.handler.dataimport.URLDataSource  (92) - Accessing URL:
file:/X:/3/7/000/37000190.xml
DEBUG 2013-08-29 12:09:21,520 http-8080-1
org.apache.solr.handler.dataimport.LogTransformer  (58) - Notice fichier:
3/7/000/37000190.xml
DEBUG 2013-08-29 12:09:21,520 http-8080-1 fr.bnf.solr.BnfDateTransformer
(696) - NN=37000190
INFO 2013-08-29 12:09:21,520 http-8080-1
org.apache.solr.handler.dataimport.DocBuilder  (267) - Time taken =
0:0:32.684
DEBUG 2013-08-29 12:09:21,536 http-8080-1
org.apache.solr.update.processor.LogUpdateProcessor  (178) - PRE_UPDATE
FINISH {{params
(optimize=true&indent=true&start=10&commit=true&verbose=true&entity=noticebib&command=full-import&debug=true&wt=json&rows=5),defaults
(config=mnb-data-config.xml)}}
INFO 2013-08-29 12:09:21,536 http-8080-1
org.apache.solr.update.processor.LogUpdateProcessor  (198) - [noticesBIB]
webapp=/solr-0.4.0-pfd path=/dataimportMNb params=
{optimize=true&indent=true&start=10&commit=true&verbose=true&entity=noticebib&command=full-import&debug=true&wt=json&rows=5}
 {} 0 32871
DEBUG 2013-08-29 12:09:21,583 http-8080-1
org.apache.solr.servlet.SolrDispatchFilter  (388) - Closing out
SolrRequest: {{params
(optimize=true&indent=true&start=10&commit=true&verbose=true&entity=noticebib&command=full-import&debug=true&wt=json&rows=5),defaults
(config=mnb-data-config.xml)}}
DEBUG 2013-08-29 12:09:21,833 main-SendThread(127.0.0.1:9080)
org.apache.zookeeper.client.ZooKeeperSaslClient  (519) - Could not retrieve
login configuration: java.lang.SecurityException: Impossible de trouver une
configuration de connexion
DEBUG 2013-08-29 12:09:21,833 main-SendThread(127.0.0.1:9080)
org.apache.zookeeper.client.ZooKeeperSaslClient  (519) - Could not retrieve
login configuration: java.lang.SecurityException: Impossible de trouver une
configuration de connexion
DEBUG 2013-08-29 12:09:21,833 main-SendThread(127.0.0.1:9080)
org.apache.zookeeper.client.ZooKeeperSaslClient  (519) - Could not retrieve
login configuration: java.lang.SecurityException: Impossible de trouver une
configuration de connexion
DEBUG 2013-08-29 12:09:21,833 SyncThread:0
org.apache.zookeeper.server.FinalRequestProcessor  (88) - Processing
request:: sessionid:0x140c98bbe430000 type:getData cxid:0x39d
zxid:0xfffffffffffffffe txntype:unknown reqpath:/overseer_elect/leader
DEBUG 2013-08-29 12:09:21,833 SyncThread:0
org.apache.zookeeper.server.FinalRequestProcessor  (160) -
sessionid:0x140c98bbe430000 type:getData cxid:0x39d zxid:0xfffffffffffffffe
txntype:unknown reqpath:/overseer_elect/leader
DEBUG 2013-08-29 12:09:21,833 main-SendThread(127.0.0.1:9080)
org.apache.zookeeper.client.ZooKeeperSaslClient  (519) - Could not retrieve
login configuration: java.lang.SecurityException: Impossible de trouver une
configuration de connexion
DEBUG 2013-08-29 12:09:21,833 main-SendThread(127.0.0.1:9080)
org.apache.zookeeper.client.ZooKeeperSaslClient  (519) - Could not retrieve
login configuration: java.lang.SecurityException: Impossible de trouver une
configuration de connexion
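
The wrapped log lines above are easier to read once they are re-joined and grouped by logger. The sketch below assumes that every entry starts with a level, date, time, and a thread name without spaces, as in this excerpt; the helper names are hypothetical. For reference, the French text "Impossible de trouver une configuration de connexion" means "Unable to locate a login configuration". Since it is logged at DEBUG level by ZooKeeperSaslClient, it appears to concern the SASL/JAAS login configuration rather than the Solr config files, though that reading is an assumption worth verifying.

```python
import re
from collections import Counter

# A new entry starts with a level, a date, and a time, for example
# "DEBUG 2013-08-29 12:09:21,833 main-SendThread(127.0.0.1:9080)".
ENTRY_START = re.compile(
    r"^(DEBUG|INFO|WARN|ERROR) \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} "
)


def join_entries(lines):
    """Re-join wrapped lines so that each list item is one log entry."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def count_by_logger(entries):
    """Count entries per (level, logger) pair."""
    counts = Counter()
    for entry in entries:
        tokens = entry.split()
        # Assumed layout: level, date, time, thread, logger, ...
        logger = tokens[4] if len(tokens) > 4 else "unknown"
        counts[(tokens[0], logger)] += 1
    return counts


# Three entries copied from the log excerpt above.
sample = """\
INFO 2013-08-29 12:09:21,520 http-8080-1
org.apache.solr.handler.dataimport.DocBuilder  (267) - Time taken =
0:0:32.684
DEBUG 2013-08-29 12:09:21,833 main-SendThread(127.0.0.1:9080)
org.apache.zookeeper.client.ZooKeeperSaslClient  (519) - Could not retrieve
login configuration: java.lang.SecurityException: Impossible de trouver une
configuration de connexion
DEBUG 2013-08-29 12:09:21,833 main-SendThread(127.0.0.1:9080)
org.apache.zookeeper.client.ZooKeeperSaslClient  (519) - Could not retrieve
login configuration: java.lang.SecurityException: Impossible de trouver une
configuration de connexion
"""

counts = count_by_logger(join_entries(sample.splitlines()))
for (level, logger), n in counts.most_common():
    print(n, level, logger)
```

Grouping this way makes it easier to separate the repeated ZooKeeper client messages from the DIH and update-processor entries that describe the import itself.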


PS: At the beginning I was on Solr 4.2.1, and I also tried with 4.0.0, but I
have the same problem.

Regards,
Jérôme



Re: [SOLR 4.4 or 4.2] indexing with dih and solrcloud

Posted by Erick Erickson <er...@gmail.com>.
First, are you sure you have a functioning SolrCloud setup? It
looks from the error like you haven't pushed the config files up
to ZK. Take a look at:
http://wiki.apache.org/solr/SolrCloud#Command_Line_Util

You should be able to do a "downconfig" on the Solr configuration
files you uploaded to ZooKeeper. If you can't, I'd suspect you
didn't start Solr the first time with the bootstrap option.

This may be completely off base....
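
A "downconfig" check along these lines could be scripted as below. This is a hedged sketch: it only builds the argument list for the ZkCLI tool described on the wiki page above, using its `-cmd`, `-zkhost`, `-confdir`, and `-confname` options. The script name `zkcli.sh` (or `zkcli.bat` on Windows), the config name `noticesBIB`, and the target directory are assumptions; the ZooKeeper address is taken from the thread names in the log excerpt.

```python
import shlex


def downconfig_command(zk_host, conf_name, target_dir, zkcli="zkcli.sh"):
    """Build the argument list for a ZkCLI 'downconfig' call.

    zkcli is the path to the script shipped with Solr; the default
    used here is an assumption and depends on the installation.
    """
    return [
        zkcli,
        "-cmd", "downconfig",
        "-zkhost", zk_host,
        "-confdir", target_dir,
        "-confname", conf_name,
    ]


# Hypothetical values: the config name and target directory are guesses.
cmd = downconfig_command("127.0.0.1:9080", "noticesBIB", "/tmp/noticesBIB-conf")
print(" ".join(shlex.quote(part) for part in cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

If the command fails or the target directory stays empty, the config set was probably never uploaded to ZooKeeper under that name.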

Best
Erick

