Posted to solr-user@lucene.apache.org by Ravi Solr <ra...@gmail.com> on 2015/09/19 17:36:11 UTC

SolrCloud DIH issue

I am facing a weird problem. As part of an upgrade from 4.7.2 (master-slave)
to 5.3.0 (SolrCloud), I re-indexed 1.5 million records via DIH using
SolrEntityProcessor yesterday, and all of them indexed properly. This morning
I ran DIH again with a delta import and lost all the docs. What am I
missing? Did anybody face a similar issue?

Here are the errors in the logs

9/19/2015, 2:41:17 AM ERROR null SolrCore Previous SolrRequestInfo was not
closed!
req=waitSearcher=true&distrib.from=http://10.128.159.32:8983/solr/sitesearchcore/&update.distrib=FROMLEADER&openSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false
9/19/2015,
2:41:17 AM ERROR null SolrCore prev == info : false 9/19/2015, 2:41:17 AM
WARN null ZKPropertiesWriter Could not read DIH properties from
/configs/sitesearchcore/dataimport.properties :class
org.apache.zookeeper.KeeperException$NoNodeException

org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode
= NoNode for /configs/sitesearchcore/dataimport.properties
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
	at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:349)
	at org.apache.solr.handler.dataimport.ZKPropertiesWriter.readIndexerProperties(ZKPropertiesWriter.java:91)
	at org.apache.solr.handler.dataimport.ZKPropertiesWriter.persist(ZKPropertiesWriter.java:65)
	at org.apache.solr.handler.dataimport.DocBuilder.finish(DocBuilder.java:307)
	at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:253)
	at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
	at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
	at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)

9/19/2015, 11:16:43 AM ERROR null SolrCore Previous SolrRequestInfo was not
closed!
req=waitSearcher=true&distrib.from=http://10.128.159.32:8983/solr/sitesearchcore/&update.distrib=FROMLEADER&openSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false
9/19/2015,
11:16:43 AM ERROR null SolrCore prev == info : false



Thanks

Ravi Kiran Bhaskar

Re: SolrCloud DIH issue

Posted by Ravi Solr <ra...@gmail.com>.
Yes Upayavira, that's exactly what prompted me to ask Erick as soon as I
read https://cwiki.apache.org/confluence/display/solr/Config+Sets

Erick, regarding my delta-import not working: I do see dataimport.properties
in ZooKeeper after I "upconfig" and "linkconfig" my conf files into ZK...see
below

[zk: localhost:YYYY (CONNECTED) 0] ls /configs/xxxxxx
[admin-extra.menu-top.html, person-synonyms.txt, entity-stopwords.txt,
protwords.txt, location-synonyms.txt, solrconfig.xml,
organization-synonyms.txt, stopwords.txt, spellings.txt,
dataimport.properties, admin-extra.html, xslt, synonyms.txt, scripts.conf,
subject-synonyms.txt, elevate.xml, admin-extra.menu-bottom.html,
solr-import-config.xml, clustering, schema.xml]

However, the dataimport.properties in my local 'conf' folder hasn't been
updated, even after a successful full-import on Sep 19 2015 1:00 AM and a
subsequent delta-import on Sep 20 2015 11 AM that did not import the newer
docs. That prompted me to inspect the file directly...the details are shown
below, and you can clearly see the dates are quite a bit off.

[xxxx@yyyyy conf]$ cat dataimport.properties
#Tue Sep 15 18:11:17 UTC 2015
reindex-docs.last_index_time=2015-09-15 18\:11\:16
last_index_time=2015-09-15 18\:11\:16
sep.last_index_time=2014-03-24 13\:41\:46
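
For what it's worth, those escaped timestamps are java.util.Properties-style; un-escaping and comparing them mechanically (a quick shell sketch, assuming GNU date) shows the on-disk copy predates the Sep 19 full-import:

```shell
# Take the last_index_time line from the file above and un-escape it:
raw='last_index_time=2015-09-15 18\:11\:16'
ts=${raw#*=}          # strip the "last_index_time=" key
ts=${ts//\\:/:}       # Properties files escape ':' as '\:'
echo "$ts"            # 2015-09-15 18:11:16

# Compare against the Sep 19 2015 1:00 AM full-import (GNU date assumed):
if [ "$(date -d "$ts" +%s)" -lt "$(date -d '2015-09-19 01:00' +%s)" ]; then
  echo "local copy is stale"
fi
```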


I saw some JIRA tickets about a different location for dataimport.properties
in SolrCloud but couldn't find the path where it is stored. Does anybody have
an idea where it stores it?
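
As far as one can tell from the stack trace in the first message, ZKPropertiesWriter reads and writes a znode inside the configset itself, /configs/<configName>/dataimport.properties, so in cloud mode the local conf copy is never touched. A sketch (the ZK host and the Solr 5.x script path are assumptions):

```shell
CONFIG=sitesearchcore
ZNODE="/configs/$CONFIG/dataimport.properties"
echo "$ZNODE"   # the znode named in the NoNodeException above

# To dump it from a live cluster (needs a running ZK ensemble):
# server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd get "$ZNODE"
```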

Thanks

Ravi Kiran Bhaskar



On Sun, Sep 20, 2015 at 5:28 AM, Upayavira <uv...@odoko.co.uk> wrote:

> It is worth noting that the ref guide page on configsets refers to
> non-cloud mode (a useful new feature) whereas people may confuse this
> with configsets in cloud mode, which use Zookeeper.
>
> Upayavira
>

Re: SolrCloud DIH issue

Posted by Upayavira <uv...@odoko.co.uk>.
It is worth noting that the ref guide page on configsets refers to
non-cloud mode (a useful new feature) whereas people may confuse this
with configsets in cloud mode, which use Zookeeper.

Upayavira


Re: SolrCloud DIH issue

Posted by Ravi Solr <ra...@gmail.com>.
Can't thank you enough for clarifying it at length. Yeah, it's pretty
confusing even for experienced Solr users. I used the upconfig and
linkconfig commands to upload the configs for my 4 collections into
ZooKeeper. As you described, I lucked out: since I used the same name for
the configset and the collection, I did not have to use the collections API :-)

Thanks,

Ravi Kiran Bhaskar

On Sat, Sep 19, 2015 at 11:22 PM, Erick Erickson <er...@gmail.com>
wrote:

> Let's back up a second. Configsets are what _used_ to be in the conf
> directory for each core on a local drive, it's just that they're now
> kept up on Zookeeper. Otherwise, you'd have to put them on each
> instance in SolrCloud, and bringing up a new replica on a new machine
> would look a lot like adding a core with the old core admin API.
>
> So instead, configurations are kept on zookeeper. A config set
> consists of, essentially, a named old-style "conf" directory. There's
> no a-priori limit to the number of config sets you can have. Look in
> the admin UI, Cloud>>tree>>configs and you'll see each name you've
> pushed to ZK. If you explore that tree, you'll see a lot of old
> familiar faces, schema.xml, solrconfig.xml, etc.
>
> So now we come to associating configs with collections. You've
> probably done one of the examples where some things happen under the
> covers, including explicitly pushing the configset to Zookeeper.
> Currently, there's no option in the bin/solr script to push a config,
> although I know there's a JIRA to do that.
>
> So, to put a new config set up you currently need to use the zkCli.sh
> script; see
> https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities,
> the "upconfig" command. That pushes the configset up to ZK and gives
> it a name.
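
A concrete sketch of that upconfig step (the Solr 5.x script location, ZK address, and names below are assumptions, not from the thread):

```shell
ZKHOST=localhost:2181
CONFNAME=sitesearchcore              # the name the configset gets in ZK
CONFDIR=/path/to/sitesearchcore/conf # the old-style conf directory

# Build the command; it needs a live ZK ensemble, so it is only echoed here:
CMD="server/scripts/cloud-scripts/zkcli.sh -zkhost $ZKHOST -cmd upconfig -confname $CONFNAME -confdir $CONFDIR"
echo "$CMD"
# eval "$CMD"   # run for real once the cluster is up
```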
>
> Now, you create a collection and it needs a configset stored in ZK.
> It's a little tricky in that if you do _not_ explicitly specify a
> configset (using the collection.configName parameter to the
> collections API CREATE command), then by default it'll look for a
> configset with the same name as the collection. If it doesn't find
> one, _and_ there is one and only one configset, then it'll use that
> one (personally I find that confusing, but that's the way it works).
> See: https://cwiki.apache.org/confluence/display/solr/Collections+API
>
> If you have two or more configsets in ZK, then either the configset
> name has to be identical to the collection name (if you don't specify
> collection.configName), _or_ you specify collection.configName at
> create time.
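
For example, a CREATE call that names the configset explicitly might look like this (host, port, shard/replica counts, and names are assumptions):

```shell
SOLR=http://localhost:8983/solr
NAME=sitesearch           # the new collection
CONF=sitesearchcore       # an existing configset in ZK

URL="$SOLR/admin/collections?action=CREATE&name=$NAME&numShards=1&replicationFactor=2&collection.configName=$CONF"
echo "$URL"
# curl "$URL"   # needs a running SolrCloud cluster
```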
>
> NOTE: there are _no_ config files on the local disk! When a replica of
> a collection loads, it "knows" what collection it's part of and pulls
> the corresponding configset from ZK.
>
> So typically the process is this:
> > you create the config set by editing all the usual suspects: schema.xml,
> > solrconfig.xml, DIH config, etc.
> > you put those configuration files into some version control system (you
> > are using one, right?)
> > you push the configs to Zookeeper
> > you create the collection
> > you figure out you need to change the configs, so you
>   > check the code out of your version control
>   > edit them
>   > put the current version back into version control
>   > push the configs up to zookeeper, overwriting the ones already
>     there with that name
>   > reload the collection or bounce all the servers. As each replica
>     in the collection comes up, it downloads the latest configs from
>     Zookeeper to memory (not to disk) and uses them.
>
> Seems like a long drawn-out process, but pretty soon it's automatic.
> And really, the only extra step is the push to Zookeeper, the rest is
> just like old-style cores with the exception that you don't have to
> manually push all the configs to all the machines hosting cores.
>
> Notice that I have mostly avoided talking about "cores" here. Although
> it's true that a replica in a collection is just another core, it's
> "special" in that it has certain very specific properties set. I
> _strongly_ advise you to stop thinking about old-style Solr cores and
> instead think about collections and replicas. And above all, do _not_
> use the admin core API to try to create members of a collection
> (cores); use the collections API to ADDREPLICA/DELETEREPLICA instead.
> Loading/unloading cores is less "fraught", but I try to avoid that too.
>
> Best,
> Erick
>
> On Sat, Sep 19, 2015 at 9:08 PM, Ravi Solr <ra...@gmail.com> wrote:
> > Thanks Erick, I will report back once the reindex is finished. Oh, your
> > answer reminded me of another question - regarding configsets, the
> > documentation says
> >
> > "On a multicore Solr instance, you may find that you want to share
> > configuration between a number of different cores."
> >
> > Can the same be used to push disparate, mutually exclusive configs? I ask
> > this as I have 4 mutually exclusive apps, each with a single-core index,
> > on a single machine that I am trying to convert to SolrCloud with a
> > single-shard approach. Just being lazy and trying to find a way to
> > update and link configs to zookeeper ;-)
> >
> > Thanks
> >
> > Ravi Kiran Bhaskar
> >
> > On Sat, Sep 19, 2015 at 6:54 PM, Erick Erickson <erickerickson@gmail.com> wrote:
> >
> >> Just pushing up the entire configset would be easiest, but the
> >> Zookeeper command line tools allow you to push up a single
> >> file if you want.
> >>
> >> Yeah, it puzzles me too that the import worked yesterday, not really
> >> sure what happened, the file shouldn't just disappear....
> >>
> >> Erick
> >>
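
The single-file route Erick mentions can be sketched with zkcli's putfile command (Solr 5.x layout; the host and file paths are assumptions):

```shell
ZKHOST=localhost:2181
CONFIG=sitesearchcore
ZNODE="/configs/$CONFIG/dataimport.properties"

# Push just this one file into the existing configset
# (echoed only; running it needs a live ZK ensemble):
CMD="server/scripts/cloud-scripts/zkcli.sh -zkhost $ZKHOST -cmd putfile $ZNODE conf/dataimport.properties"
echo "$CMD"
```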
> >> On Sat, Sep 19, 2015 at 2:46 PM, Ravi Solr <ra...@gmail.com> wrote:
> >> > Thank you for the prompt response Erick. I did a full-import yesterday;
> >> > you are correct that I did not push dataimport.properties to ZK. Should
> >> > it not have worked even for a full import? You may be right about the
> >> > 'clean' option; I will reindex again today. BTW how do we push a single
> >> > file to a specific config name in zookeeper?
> >> >
> >> > Thanks,
> >> >
> >> > Ravi Kiran Bhaskar
> >> >
> >> > On Sat, Sep 19, 2015 at 1:48 PM, Erick Erickson <erickerickson@gmail.com> wrote:
> >> >
> >> >> Could not read DIH properties from
> >> >> /configs/sitesearchcore/dataimport.properties
> >> >>
> >> >> This looks like somehow you didn't push this file up to Zookeeper.
> You
> >> >> can check what files are there in the admin UI. How you indexed
> >> >> yesterday is a mystery though, unless somehow this file was removed
> >> >> from ZK.
> >> >>
> >> >> As for why you lost all the docs, my suspicion is that you have the
> >> >> clean param set up for delta import....
> >> >>
> >> >> FWIW,
> >> >> Erick
> >> >>
> >> >> On Sat, Sep 19, 2015 at 10:36 AM, Ravi Solr <ra...@gmail.com>
> wrote:
> >> >> > I am facing a weird problem. As part of upgrade from 4.7.2
> >> (Master-Slave)
> >> >> > to 5.3.0 (Solrcloud) I re-indexed 1.5 million records via DIH using
> >> >> > SolrEntityProcessor yesterday, all of them indexed properly. Today
> >> >> morning
> >> >> > I just ran the DIH again with delta import and I lost all
> docs...what
> >> am
> >> >> I
> >> >> > missing ? Did anybody face similar issue ?
> >> >> >
> >> >> > Here are the errors in the logs
> >> >> >
> >> >> > 9/19/2015, 2:41:17 AM ERROR null SolrCore Previous SolrRequestInfo
> was
> >> >> not
> >> >> > closed!
> >> >> > req=waitSearcher=true&distrib.from=
> >> >>
> >>
> http://10.128.159.32:8983/solr/sitesearchcore/&update.distrib=FROMLEADER&openSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false
> >> >> > 9/19/2015,
> >> >> > 2:41:17 AM ERROR null SolrCore prev == info : false 9/19/2015,
> >> 2:41:17 AM
> >> >> > WARN null ZKPropertiesWriter Could not read DIH properties from
> >> >> > /configs/sitesearchcore/dataimport.properties :class
> >> >> > org.apache.zookeeper.KeeperException$NoNodeException
> >> >> >
> >> >> > org.apache.zookeeper.KeeperException$NoNodeException:
> KeeperErrorCode
> >> >> > = NoNode for /configs/sitesearchcore/dataimport.properties
> >> >> >         at
> >> >> org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> >> >> >         at
> >> >> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> >> >> >         at
> org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
> >> >> >         at
> >> >>
> org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:349)
> >> >> >         at
> >> >>
> >>
> org.apache.solr.handler.dataimport.ZKPropertiesWriter.readIndexerProperties(ZKPropertiesWriter.java:91)
> >> >> >         at
> >> >>
> >>
> org.apache.solr.handler.dataimport.ZKPropertiesWriter.persist(ZKPropertiesWriter.java:65)
> >> >> >         at
> >> >>
> >>
> org.apache.solr.handler.dataimport.DocBuilder.finish(DocBuilder.java:307)
> >> >> >         at
> >> >>
> >>
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:253)
> >> >> >         at
> >> >>
> >>
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
> >> >> >         at
> >> >>
> >>
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
> >> >> >         at
> >> >>
> >>
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
> >> >> >
> >> >> > 9/19/2015, 11:16:43 AM ERROR null SolrCore Previous SolrRequestInfo
> >> was
> >> >> not
> >> >> > closed!
> >> >> > req=waitSearcher=true&distrib.from=
> >> >>
> >>
> http://10.128.159.32:8983/solr/sitesearchcore/&update.distrib=FROMLEADER&openSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false
> >> >> > 9/19/2015,
> >> >> > 11:16:43 AM ERROR null SolrCore prev == info : false
> >> >> >
> >> >> >
> >> >> >
> >> >> > Thanks
> >> >> >
> >> >> > Ravi Kiran Bhaskar
> >> >>
> >>
>

Re: SolrCloud DIH issue

Posted by Erick Erickson <er...@gmail.com>.
Let's back up a second. Configsets are what _used_ to be in the conf
directory for each core on a local drive, it's just that they're now
kept up on Zookeeper. Otherwise, you'd have to put them on each
instance in SolrCloud, and bringing up a new replica on a new machine
would look a lot like adding a core with the old core admin API.

So instead, configurations are kept on zookeeper. A config set
consists of, essentially, a named old-style "conf" directory. There's
no a-priori limit to the number of config sets you can have. Look in
the admin UI, Cloud>>tree>>configs and you'll see each name you've
pushed to ZK. If you explore that tree, you'll see a lot of old
familiar faces, schema.xml, solrconfig.xml, etc.

So now we come to associating configs with collections. You've
probably done one of the examples where some things happen under the
covers, including pushing the configset to Zookeeper.
Currently, there's no option in the bin/solr script to push a config,
although I know there's a JIRA to do that.

So, to put a new config set up you currently need to use the zkCli.sh
script see: https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities,
the "upconfig" command. That pushes the configset up to ZK and gives
it a name.
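For example, with the zkcli.sh script shipped with Solr 5.x, a configset push looks roughly like this (the ZK host, paths, and config name are placeholders, not values from this thread):

```shell
# Push an entire configset directory up to ZooKeeper under a name.
server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 \
  -cmd upconfig -confdir /path/to/conf -confname sitesearchconf

# Push (or overwrite) a single file, e.g. a missing dataimport.properties:
server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 \
  -cmd putfile /configs/sitesearchconf/dataimport.properties \
  /path/to/conf/dataimport.properties
```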

Now, you create a collection and it needs a configset stored in ZK.
It's a little tricky in that if you do _not_ explicitly specify a
configset (using the collection.configName parameter to the
collections API CREATE command), then by default it'll look for a
configset with the same name as the collection. If it doesn't find
one, _and_ there is one and only one configset, then it'll use that
one (personally I find that confusing, but that's the way it works).
See: https://cwiki.apache.org/confluence/display/solr/Collections+API

If you have two or more configsets in ZK, then either the configset
name has to be identical to the collection name (if you don't specify
collection.configName), _or_ you specify collection.configName at
create time.
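The resolution rule just described can be summarized in a small sketch (this models the stated behavior only; it is not Solr's actual source):

```python
def resolve_configset(collection, configsets, explicit_name=None):
    """Model of how CREATE picks a configset, per the rule above (sketch)."""
    if explicit_name is not None:
        # collection.configName was passed explicitly
        if explicit_name in configsets:
            return explicit_name
        raise ValueError("no such configset: " + explicit_name)
    if collection in configsets:
        # default: a configset with the same name as the collection
        return collection
    if len(configsets) == 1:
        # fall back to the one and only configset
        return next(iter(configsets))
    raise ValueError("cannot pick a configset for " + collection)
```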

NOTE: there are _no_ config files on the local disk! When a replica of
a collection loads, it "knows" what collection it's part of and pulls
the corresponding configset from ZK.

So typically the process is this.
> you create the config set by editing all the usual suspects, schema.xml, solrconfig.xml, DIH config etc.
> you put those configuration files into some version control system (you are using one, right?)
> you push the configs to Zookeeper
> you create the collection
> you figure out you need to change the configs so you
  > check the code out of your version control
  > edit them
  > put the current version back into version control
  > push the configs up to Zookeeper, overwriting the ones already there with that name
  > reload the collection or bounce all the servers. As each replica in the collection comes up,
     it downloads the latest configs from Zookeeper to memory (not to disk) and uses them.
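The update cycle above might look roughly like this (the host, paths, and names are hypothetical):

```shell
# 1. Edit configs and commit them to version control.
git commit -am "tweak schema.xml"

# 2. Overwrite the configset in ZooKeeper with the same name.
server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 \
  -cmd upconfig -confdir ./conf -confname sitesearchconf

# 3. Reload the collection so every replica picks up the new configs.
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=sitesearch"
```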

Seems like a long drawn-out process, but pretty soon it's automatic.
And really, the only extra step is the push to Zookeeper, the rest is
just like old-style cores with the exception that you don't have to
manually push all the configs to all the machines hosting cores.

Notice that I have mostly avoided talking about "cores" here. Although
it's true that a replica in a collection is just another core, it's
"special" in that it has certain very specific properties set. I
_strongly_ advise you to stop thinking about old-style Solr cores and
instead think about collections and replicas. And above all, do _not_
use the core admin API to try to create members of a collection
(cores); use the collections API to ADDREPLICA/DELETEREPLICA instead.
Loading/unloading cores is less "fraught", but I try to avoid that too
and use the collections API where possible.
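For instance, the Collections API calls mentioned above look like this (host, collection, shard, and replica names are hypothetical):

```shell
# Add a replica to a shard of a collection.
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=sitesearch&shard=shard1"

# Remove a specific replica from a shard.
curl "http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=sitesearch&shard=shard1&replica=core_node2"
```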

Best,
Erick


Re: SolrCloud DIH issue

Posted by Ravi Solr <ra...@gmail.com>.
Thanks Erick, I will report back once the reindex is finished. Oh, your
answer reminded me of another question - Regarding configsets the
documentation says

"On a multicore Solr instance, you may find that you want to share
configuration between a number of different cores."

Can the same be used to push disparate, mutually exclusive configs? I ask
this as I have 4 mutually exclusive apps, each with its own single-core index on
a single machine, which I am trying to convert to SolrCloud with a single-shard
approach. Just being lazy and trying to find a way to update and link
configs to zookeeper ;-)

Thanks

Ravi Kiran Bhaskar


Re: SolrCloud DIH issue

Posted by Erick Erickson <er...@gmail.com>.
Just pushing up the entire configset would be easiest, but the
Zookeeper command line tools allow you to push up a single
file if you want.

Yeah, it puzzles me too that the import worked yesterday, not really
sure what happened, the file shouldn't just disappear....

Erick


Re: SolrCloud DIH issue

Posted by Ravi Solr <ra...@gmail.com>.
Thank you for the prompt response, Erick. I did a full-import yesterday; you
are correct that I did not push dataimport.properties to ZK. Should it not
have worked even for a full import? You may be right about the 'clean' option,
I will reindex again today. BTW, how do we push a single file to a specific
config name in ZooKeeper?


Thanks,

Ravi Kiran Bhaskar



Re: SolrCloud DIH issue

Posted by Erick Erickson <er...@gmail.com>.
Could not read DIH properties from /configs/sitesearchcore/dataimport.properties

This looks like somehow you didn't push this file up to Zookeeper. You
can check what files are there in the admin UI. How you indexed
yesterday is a mystery though, unless somehow this file was removed
from ZK.

As for why you lost all the docs, my suspicion is that you have the
clean param set up for delta import....
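For reference, the clean parameter can be forced off explicitly on the DIH request, since clean defaults to true for full-import and can wipe the index before indexing begins (host and core name here are hypothetical):

```shell
# Run a delta import with clean explicitly disabled.
curl "http://localhost:8983/solr/sitesearchcore/dataimport?command=delta-import&clean=false&commit=true"
```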

FWIW,
Erick

On Sat, Sep 19, 2015 at 10:36 AM, Ravi Solr <ra...@gmail.com> wrote:
> I am facing a weird problem. As part of upgrade from 4.7.2 (Master-Slave)
> to 5.3.0 (Solrcloud) I re-indexed 1.5 million records via DIH using
> SolrEntityProcessor yesterday, all of them indexed properly. Today morning
> I just ran the DIH again with delta import and I lost all docs...what am I
> missing ? Did anybody face similar issue ?
>
> Here are the errors in the logs
>
> 9/19/2015, 2:41:17 AM ERROR null SolrCore Previous SolrRequestInfo was not
> closed!
> req=waitSearcher=true&distrib.from=http://10.128.159.32:8983/solr/sitesearchcore/&update.distrib=FROMLEADER&openSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false
> 9/19/2015,
> 2:41:17 AM ERROR null SolrCore prev == info : false 9/19/2015, 2:41:17 AM
> WARN null ZKPropertiesWriter Could not read DIH properties from
> /configs/sitesearchcore/dataimport.properties :class
> org.apache.zookeeper.KeeperException$NoNodeException
>
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode
> = NoNode for /configs/sitesearchcore/dataimport.properties
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
>         at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:349)
>         at org.apache.solr.handler.dataimport.ZKPropertiesWriter.readIndexerProperties(ZKPropertiesWriter.java:91)
>         at org.apache.solr.handler.dataimport.ZKPropertiesWriter.persist(ZKPropertiesWriter.java:65)
>         at org.apache.solr.handler.dataimport.DocBuilder.finish(DocBuilder.java:307)
>         at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:253)
>         at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
>         at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
>         at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
>
> 9/19/2015, 11:16:43 AM ERROR null SolrCore Previous SolrRequestInfo was not
> closed!
> req=waitSearcher=true&distrib.from=http://10.128.159.32:8983/solr/sitesearchcore/&update.distrib=FROMLEADER&openSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false
> 9/19/2015,
> 11:16:43 AM ERROR null SolrCore prev == info : false
>
>
>
> Thanks
>
> Ravi Kiran Bhaskar