Posted to dev@lucene.apache.org by Tom Chen <to...@gmail.com> on 2014/08/01 16:45:36 UTC

Re: SolrCloud on HDFS empty tlog hence doesn't replay after Solr process crash and restart

I wonder if there's any update on this. Should we create a JIRA to track
this?

Thanks,
Tom


On Mon, Jul 21, 2014 at 12:18 PM, Mark Miller <ma...@gmail.com> wrote:

> It’s on my list to investigate.
>
> --
> Mark Miller
> about.me/markrmiller
>
> On July 21, 2014 at 10:26:09 AM, Tom Chen (tomchen1000@gmail.com) wrote:
> > Any thoughts on this issue? Solr on HDFS generates an empty tlog when
> > documents are added without a commit.
> >
> > Thanks,
> > Tom
> >
> >
> > On Fri, Jul 18, 2014 at 12:21 PM, Tom Chen wrote:
> >
> > > Hi,
> > >
> > > This seems to be a bug in Solr running on HDFS.
> > >
> > > Steps to reproduce:
> > > 1) Set up Solr to run on HDFS like this:
> > >
> > > java -Dsolr.directoryFactory=HdfsDirectoryFactory
> > > -Dsolr.lock.type=hdfs
> > > -Dsolr.hdfs.home=hdfs://host:port/path
> > >
> > > For the purpose of this testing, turn off the default auto commit in
> > > solrconfig.xml, i.e. comment out autoCommit like this:
> > >
> > >
> > > 2) Add a document without commit:
> > > curl "http://localhost:8983/solr/collection1/update?commit=false" -H
> > > "Content-type:text/xml; charset=utf-8" --data-binary "@solr.xml"
> > >
> > > 3) Solr generates empty tlog files (file size 0; the last one ends
> > > with 6):
> > > [hadoop@hdtest042 exampledocs]$ hadoop fs -ls
> > > /path/collection1/core_node1/data/tlog
> > > Found 5 items
> > > -rw-r--r-- 1 hadoop hadoop 667 2014-07-18 08:47
> > > /path/collection1/core_node1/data/tlog/tlog.0000000000000000001
> > > -rw-r--r-- 1 hadoop hadoop 67 2014-07-18 08:47
> > > /path/collection1/core_node1/data/tlog/tlog.0000000000000000003
> > > -rw-r--r-- 1 hadoop hadoop 667 2014-07-18 08:47
> > > /path/collection1/core_node1/data/tlog/tlog.0000000000000000004
> > > -rw-r--r-- 1 hadoop hadoop 0 2014-07-18 09:02
> > > /path/collection1/core_node1/data/tlog/tlog.0000000000000000005
> > > -rw-r--r-- 1 hadoop hadoop 0 2014-07-18 09:02
> > > /path/collection1/core_node1/data/tlog/tlog.0000000000000000006
> > >
> > > 4) Simulate a Solr crash by killing the process with the -9 option.
> > >
> > > 5) Restart the Solr process. The uncommitted documents are not
> > > replayed, and the files in the tlog directory are cleaned up. Hence
> > > the uncommitted document(s) are lost.
> > >
> > > Am I missing anything, or is this a bug?
> > >
> > > BTW, additional observations:
> > > a) If Solr is stopped gracefully in step 4 (i.e. without the -9
> > > option), a non-empty tlog file is generated and, after restarting
> > > Solr, the uncommitted document is replayed as expected.
> > >
> > > b) If Solr doesn't run on HDFS (i.e. it runs on the local file
> > > system), this issue is not observed either.
> > >
> > > Thanks,
> > > Tom
> > >
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

Re: SolrCloud on HDFS empty tlog hence doesn't replay after Solr process crash and restart

Posted by Chris Hostetter <ho...@fucit.org>.
Tom: i don't know enough about the HDFS code to fully understand what's
going on here, but based on your description of the problem it definitely
smells like a bug, so i've opened an issue to make sure we don't lose
track of it...

https://issues.apache.org/jira/browse/SOLR-6367


-Hoss
http://www.lucidworks.com/
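
For anyone re-running the reproduction steps from the original report, the zero-length tlog files in step 3 can be picked out of the listing mechanically. The following is only a minimal sketch, not something taken from the thread: the function name empty_tlogs is hypothetical, and it assumes each `hadoop fs -ls` entry arrives on a single line (permissions, replication, owner, group, size, date, time, path), i.e. without the mail wrapping seen in the quoted output.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solr or Hadoop): reads `hadoop fs -ls`
# output on stdin and prints the path of every zero-length tlog file.
# Assumes one entry per line with the size in field 5 and the path in field 8.
empty_tlogs() {
    awk '$1 ~ /^-/ && $5 == 0 && $8 ~ /\/tlog\.[0-9]+$/ { print $8 }'
}

# Example use against a live cluster (path as given in the report):
#   hadoop fs -ls /path/collection1/core_node1/data/tlog | empty_tlogs
```

Any path printed after step 2 (an add without commit) would match the symptom described in the report. An empty result only shows that no zero-length tlog was listed at that moment.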