Posted to solr-user@lucene.apache.org by Archana Satheesh Kumar <ak...@kogentix.com> on 2016/07/26 17:52:21 UTC

Solr MapReduce Indexer : go-live option throwing exception

Hi,


I am trying to use the MapReduce Indexer Tool from Cloudera to index the data in my Hive table into Solr.


hadoop jar /path/to/lib/solr/contrib/mr/search-mr-*-job.jar  org.apache.solr.hadoop.MapReduceIndexerTool -Djute.maxbuffer=<buff size> --morphline-file /path/to/morphlines.conf --output-dir hdfs://path/to/output/dir --reducers -1 --mappers -1 --verbose --go-live --zk-host <zookeeperHostIP>:2181/solr --shards 2 --collection <collection name> hdfs://location/of/hive/table

My MR job runs successfully and I can see the _SUCCESS flag in the specified output location:

 hadoop fs -ls /path/to/output/results
Found 2 items
-rwxrwx--x+  3 hive hive          0 2016-07-26 11:35 /path/to/output/results/_SUCCESS
drwxrwx--x+  - hive hive          0 2016-07-26 11:20 /path/to/output/results/part-00000

But my go-live option is not working.

Exception:
java.util.concurrent.ExecutionException: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Expected mime type application/octet-stream but got text/html

I also tried using a jaas-client.conf:

Client {
 com.sun.security.auth.module.Krb5LoginModule required
 useKeyTab=false
 useTicketCache=true
 principal="<My...@DOMAIN>";
 };

So, before executing the MapReduce job, I set HADOOP_OPTS to point to jaas-client.conf:
export HADOOP_OPTS="-Djava.security.auth.login.config=/path/to/jaas-client.conf"


1. What could be the issue?
2. Am I missing something?
3. Since I have my data indexed locally, is there a way to perform the go-live option separately?

Thanks in advance

Archana






Re: Solr MapReduce Indexer : go-live option throwing exception

Posted by Erick Erickson <er...@gmail.com>.
Can't really deal with the security issues, but...

The resulting indexes created by MRIT are just plain vanilla
Solr/Lucene indexes. All the --go-live step does is issue a
MERGEINDEXES command to the core where they should live, pointing
at the directory MRIT leaves them in. You might get some joy there, see:
https://cwiki.apache.org/confluence/display/solr/CoreAdmin+API#CoreAdminAPI-MERGEINDEXES
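
As a rough sketch of what such a manual merge could look like, the
helper below only builds the CoreAdmin MERGEINDEXES request URL for one
shard. The host, core name, and index path in the usage comments are
hypothetical placeholders, and the assumption that the Lucene index
sits under data/index inside each part-* directory should be checked
against your own output before running anything.

```shell
#!/usr/bin/env bash
# Sketch only: builds a CoreAdmin MERGEINDEXES URL; it does not contact Solr.
# Every host, core name, and path used with it is a hypothetical placeholder.

# merge_url SOLR_BASE CORE INDEX_DIR
#   SOLR_BASE  base URL of the Solr node hosting the target core
#   CORE       name of the target core (one replica of one shard)
#   INDEX_DIR  directory holding the index to merge in (assumed layout)
merge_url() {
  local solr_base="$1" core="$2" index_dir="$3"
  printf '%s/admin/cores?action=MERGEINDEXES&core=%s&indexDir=%s\n' \
    "$solr_base" "$core" "$index_dir"
}

# Possible usage (hypothetical values), followed by a commit so the
# merged segments become visible to searchers:
#   curl "$(merge_url http://solrhost:8983/solr mycollection_shard1_replica1 \
#           /path/to/output/results/part-00000/data/index)"
#   curl "http://solrhost:8983/solr/mycollection_shard1_replica1/update?commit=true"
```

On a Kerberized cluster the curl calls would probably also need
something like `--negotiate -u :` after a kinit, but that depends on
how your Solr is secured.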

Or you can copy them around by hand and start Solr.

You have to be _really_ sure that you get the right index for each
replica, though. If the index intended for a replica on shard1 ends up
on a replica for shard2, it's A Bad Thing.

Best,
Erick
