Posted to dev@oozie.apache.org by Abraham Elmahrek <ab...@cloudera.com> on 2014/06/21 02:11:27 UTC

Review Request 22848: Create a Search Indexing action

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22848/
-----------------------------------------------------------

Review request for oozie.


Bugs: OOZIE-1895
    https://issues.apache.org/jira/browse/OOZIE-1895


Repository: oozie-git


Description
-------

- Provide 2 different paths of execution: Zookeeper config and solrconfig. With solrconfig, --shards argument should be passed via "argument" in the action xml. With the zookeeper config, --collection should be provided via "argument" in the action xml.
- Go live mode not supported in secure clusters.
- Only available with Hadoop 2.


Diffs
-----

  client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3 
  client/src/main/resources/search-batch-indexer-action-0.1.xsd PRE-CREATION 
  core/pom.xml e152266 
  core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java PRE-CREATION 
  docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki PRE-CREATION 
  docs/src/site/twiki/index.twiki f078bf5 
  pom.xml bad1e0f 
  sharelib/pom.xml df20294 
  sharelib/search/pom.xml PRE-CREATION 
  sharelib/search/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerMain.java PRE-CREATION 
  sharelib/search/src/test/java/org/apache/oozie/action/hadoop/TestSearchBatchIndexerActionExecutor.java PRE-CREATION 
  sharelib/search/src/test/resources/morphlines/basic.conf PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/currency.xml PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/elevate.xml PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/contractions_ca.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/contractions_fr.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/contractions_ga.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/contractions_it.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/hyphenations_ga.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stemdict_nl.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stoptags_ja.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_ar.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_bg.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_ca.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_cz.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_da.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_de.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_el.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_en.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_es.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_eu.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_fa.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_fi.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_fr.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_ga.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_gl.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_hi.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_hu.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_hy.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_id.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_it.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_ja.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_lv.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_nl.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_no.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_pt.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_ro.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_ru.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_sv.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_th.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_tr.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/userdict_ja.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/protwords.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/schema.xml PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/solrconfig.xml PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/stopwords.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/synonyms.txt PRE-CREATION 
  src/main/assemblies/sharelib.xml 891d9dc 
  webapp/pom.xml 93cfcef 

Diff: https://reviews.apache.org/r/22848/diff/


Testing
-------

Wrote a couple of tests and manually tested on Kerberized environment.


Thanks,

Abraham Elmahrek


Re: Review Request 22848: Create a Search Indexing action

Posted by Abraham Elmahrek <ab...@cloudera.com>.

> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > Great work Abe!  I know this wasn't easy to figure out and get working.
> > 
> > I did an initial review and made some comments.  I haven't really looked at any of the tests yet.  
> > Also, do we need so many extra txt files for the tests?
> 
> Abraham Elmahrek wrote:
>     Thanks Robert. I'll try to remove dependencies and see what that does.
>     
>     I noticed a mistake in the rules of which arguments are required:
>     1. At least one of --zk-host or --solr-home-dir is required.
>     2. If --solr-home-dir is specified, then exactly one of --zk-host, --shard-url, or --shards must be specified (they are mutually exclusive).
>     3. If --zk-host is specified at all (with or without --solr-home-dir), --collection should be provided.
>     
>     Will rectify.

Also, I'm open to suggestions on the name of the action "search batch indexer" ;).


- Abraham


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22848/#review46810
-----------------------------------------------------------


On July 10, 2014, 11:58 p.m., Abraham Elmahrek wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22848/
> -----------------------------------------------------------
> 
> (Updated July 10, 2014, 11:58 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1895
>     https://issues.apache.org/jira/browse/OOZIE-1895
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> - Provide 2 different paths of execution: Zookeeper config and solrconfig. With solrconfig, --shards argument should be passed via "argument" in the action xml. With the zookeeper config, --collection should be provided via "argument" in the action xml.
> - Go live mode not supported in secure clusters.
> - Only available with Hadoop 2.
> 
> 
> Diffs
> -----
> 
>   client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3 
>   client/src/main/resources/search-batch-indexer-action-0.1.xsd PRE-CREATION 
>   core/pom.xml e152266 
>   core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java PRE-CREATION 
>   core/src/main/resources/oozie-default.xml b944d3d 
>   docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki PRE-CREATION 
>   docs/src/site/twiki/index.twiki f078bf5 
>   pom.xml bad1e0f 
>   sharelib/pom.xml df20294 
>   sharelib/search/pom.xml PRE-CREATION 
>   sharelib/search/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerMain.java PRE-CREATION 
>   sharelib/search/src/test/java/org/apache/oozie/action/hadoop/TestSearchBatchIndexerActionExecutor.java PRE-CREATION 
>   sharelib/search/src/test/resources/morphlines/basic.conf PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/elevate.xml PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/contractions_ca.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/contractions_fr.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/contractions_ga.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/contractions_it.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/hyphenations_ga.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stemdict_nl.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stoptags_ja.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_en.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/userdict_ja.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/protwords.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/schema.xml PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/solrconfig.xml PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/stopwords.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/synonyms.txt PRE-CREATION 
>   src/main/assemblies/sharelib.xml 891d9dc 
>   webapp/pom.xml 93cfcef 
> 
> Diff: https://reviews.apache.org/r/22848/diff/
> 
> 
> Testing
> -------
> 
> Wrote a couple of tests and manually tested on Kerberized environment.
> 
> 
> Thanks,
> 
> Abraham Elmahrek
> 
>


Re: Review Request 22848: Create a Search Indexing action

Posted by Abraham Elmahrek <ab...@cloudera.com>.

> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > Great work Abe!  I know this wasn't easy to figure out and get working.
> > 
> > I did an initial review and made some comments.  I haven't really looked at any of the tests yet.  
> > Also, do we need so many extra txt files for the tests?

Thanks Robert. I'll try to remove dependencies and see what that does.

I noticed a mistake in the rules of which arguments are required:
1. At least one of --zk-host or --solr-home-dir is required.
2. If --solr-home-dir is specified, then exactly one of --zk-host, --shard-url, or --shards must be specified (they are mutually exclusive).
3. If --zk-host is specified at all (with or without --solr-home-dir), --collection should be provided.

Will rectify.
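
A minimal sketch of how the three rules could be checked, for illustration only. The class and method names (IndexerArgRules, has, validate) are hypothetical and not taken from the patch, and the sketch assumes a flag counts as present when it appears as "--flag" or "--flag=value" in the argument list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the patch): applies the three rules
// to the raw argument values of an action.
final class IndexerArgRules {

    private IndexerArgRules() {
    }

    // Assumption: a flag is present if it appears as "--flag" or "--flag=value".
    static boolean has(List<String> args, String flag) {
        for (String arg : args) {
            if (arg.equals(flag) || arg.startsWith(flag + "=")) {
                return true;
            }
        }
        return false;
    }

    // Returns one message per violated rule; an empty list means all rules pass.
    static List<String> validate(List<String> args) {
        List<String> errors = new ArrayList<String>();
        boolean zkHost = has(args, "--zk-host");
        boolean solrHome = has(args, "--solr-home-dir");
        boolean shardUrl = has(args, "--shard-url");
        boolean shards = has(args, "--shards");
        boolean collection = has(args, "--collection");

        // Rule 1: at least one of --zk-host or --solr-home-dir.
        if (!zkHost && !solrHome) {
            errors.add("one of --zk-host or --solr-home-dir is required");
        }
        // Rule 2: with --solr-home-dir, exactly one of the three targets.
        if (solrHome) {
            int targets = (zkHost ? 1 : 0) + (shardUrl ? 1 : 0) + (shards ? 1 : 0);
            if (targets != 1) {
                errors.add("--solr-home-dir needs exactly one of --zk-host, --shard-url, --shards");
            }
        }
        // Rule 3: --zk-host always needs --collection.
        if (zkHost && !collection) {
            errors.add("--zk-host requires --collection");
        }
        return errors;
    }
}
```

For example, validating the arguments "--zk-host zk1:2181/solr" without "--collection" would report a single violation (rule 3).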


> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java, line 35
> > <https://reviews.apache.org/r/22848/diff/2/?file=614633#file614633line35>
> >
> >     Can you explicitly list the imports instead of using *

Indeed!


> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > core/pom.xml, line 41
> > <https://reviews.apache.org/r/22848/diff/2/?file=614632#file614632line41>
> >
> >     This should stay at "provided"
> >     
> >     Also, can you put back the order?  You swapped this and oozie-hadoop-test, so it's not a "real" change.

Can we create a separate Jira for this? It turns out the order matters for Idea w/ Maven.

Also, it appears that commons-io is a compile-time dependency on oozie-hadoop. commons-io appears to be used in GzipCompressionCodec.java, which is part of the core package. This can be handled in a separate Jira, but it seems commons-io needs to be included?


> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki, lines 79-81
> > <https://reviews.apache.org/r/22848/diff/2/?file=614634#file614634line79>
> >
> >     If these additional arguments are required for the zookeeper and solrconfigs, why not make them required by the schema?
> >     
> >     I'm not sure, but I think this would do it; if not, it would be something similar:
> >     <xs:choice minOccurs="1" maxOccurs="1">
> >        <xs:sequence>
> >           <xs:element name="zookeeper" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >           <xs:element name="collection" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >        </xs:sequence>
> >        <xs:sequence>
> >           <xs:element name="solrconfig" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >           <xs:element name="shards" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >        </xs:sequence>
> >     </xs:choice>

We could provide schema elements for all or some of these options:
--solr-home-dir
--zk-host
--shard-url
--shards
--collection

The big worry with providing options of this nature is that if the options change or the requirements change, then we need to rev. the action. Since there are so many options, it seems possible that one may drop out and we'll have to add extra logic to be backwards compatible.


> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki, lines 55-56
> > <https://reviews.apache.org/r/22848/diff/2/?file=614634#file614634line55>
> >
> >     You're only allowed to have one of these, right?

It looks like you can have both; I'll change this in our internal validation. The rules are defined in the comment above.


> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki, line 9
> > <https://reviews.apache.org/r/22848/diff/2/?file=614634#file614634line9>
> >
> >     It may make sense to add an "About" section (or some other name) saying that this refers to Cloudera Search, an Apache licensed Search thing built on Lucene and Solr that can be found at GITHUB_PAGE_OR_WEBPAGE.  This is less well-known than the other action types and isn't an Apache-run project.
> >     
> >     We can then also put the Hadoop 2 requirement in the same section.


- Abraham


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22848/#review46810
-----------------------------------------------------------



Re: Review Request 22848: Create a Search Indexing action

Posted by Robert Kanter <rk...@cloudera.com>.

> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > core/pom.xml, line 41
> > <https://reviews.apache.org/r/22848/diff/2/?file=614632#file614632line41>
> >
> >     This should stay at "provided"
> >     
> >     Also, can you put back the order?  You swapped this and oozie-hadoop-test, so it's not a "real" change.
> 
> Abraham Elmahrek wrote:
>     Can we create a separate Jira for this? It turns out the order matters for Idea w/ Maven.
>     
>     Also, it appears that commons-io is a compile-time dependency on oozie-hadoop. commons-io appears to be used in GzipCompressionCodec.java, which is part of the core package. This can be handled in a separate Jira, but it seems commons-io needs to be included?

In that case, it's fine to change the order (though it's silly that IDEA has that problem; I'm pretty sure the order isn't supposed to matter).  But oozie-hadoop should definitely be "provided".

It does look like commons-io should be "compile".


> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki, lines 79-81
> > <https://reviews.apache.org/r/22848/diff/2/?file=614634#file614634line79>
> >
> >     If these additional arguments are required for the zookeeper and solrconfigs, why not make them required by the schema?
> >     
> >     I'm not sure, but I think this would do it; if not, it would be something similar:
> >     <xs:choice minOccurs="1" maxOccurs="1">
> >        <xs:sequence>
> >           <xs:element name="zookeeper" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >           <xs:element name="collection" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >        </xs:sequence>
> >        <xs:sequence>
> >           <xs:element name="solrconfig" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >           <xs:element name="shards" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >        </xs:sequence>
> >     </xs:choice>
> 
> Abraham Elmahrek wrote:
>     We could provide schema elements for all or some of these options:
>     --solr-home-dir
>     --zk-host
>     --shard-url
>     --shards
>     --collection
>     
>     The big worry with providing options of this nature is that if the options change or the requirements change, then we need to rev. the action. Since there are so many options, it seems possible that one may drop out and we'll have to add extra logic to be backwards compatible.

I think it should be okay.  Say one of them gets dropped: schema 0.2 would not have it, and when schema 0.1 is used, the code would simply ignore that field.  If a field is added, it can still be passed via the <argument> field, and we can make a schema 0.2 that adds it as its own field.  I'd say that we should only add the required options; if any are optional, leave them out (unless you think they'd be helpful).  My concern is a user who doesn't know what they're doing and specifies conflicting options, or doesn't specify enough options, etc.  It's best if it fails at submission time (during schema validation) rather than later, when the Oozie server tries to actually run the action.
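
As a rough illustration of failing at schema-validation time, the sketch below validates small action fragments against the xs:choice structure suggested above, using the JDK's javax.xml.validation API. The root element name "search-batch-indexer", the class name ChoiceSchemaSketch, and the sample values are assumptions made for this example; the real 0.1 XSD in the patch may look different. The explicit minOccurs/maxOccurs="1" attributes on the elements are omitted because 1 is the XSD default.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo class: shows that the xs:choice of two sequences
// rejects incomplete or mixed option sets before anything is run.
final class ChoiceSchemaSketch {

    // Assumed root element name; the choice mirrors the suggestion in this thread.
    static final String XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"search-batch-indexer\">"
        + "<xs:complexType>"
        + "<xs:choice minOccurs=\"1\" maxOccurs=\"1\">"
        + "<xs:sequence>"
        + "<xs:element name=\"zookeeper\" type=\"xs:string\"/>"
        + "<xs:element name=\"collection\" type=\"xs:string\"/>"
        + "</xs:sequence>"
        + "<xs:sequence>"
        + "<xs:element name=\"solrconfig\" type=\"xs:string\"/>"
        + "<xs:element name=\"shards\" type=\"xs:string\"/>"
        + "</xs:sequence>"
        + "</xs:choice>"
        + "</xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    private ChoiceSchemaSketch() {
    }

    // Returns true if the fragment satisfies the schema, false if validation fails.
    static boolean isValid(String actionXml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("the embedded XSD could not be compiled", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(actionXml)));
            return true;
        } catch (SAXException e) {
            return false; // the document does not match the schema
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this structure, a fragment containing only <zookeeper>, or one mixing <zookeeper> with <shards>, is rejected by the validator, while either complete pair is accepted.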


- Robert


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22848/#review46810
-----------------------------------------------------------



Re: Review Request 22848: Create a Search Indexing action

Posted by Robert Kanter <rk...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22848/#review46810
-----------------------------------------------------------


Great work Abe!  I know this wasn't easy to figure out and get working.

I did an initial review and made some comments.  I haven't really looked at any of the tests yet.  
Also, do we need so many extra txt files for the tests?  


client/src/main/resources/search-batch-indexer-action-0.1.xsd
<https://reviews.apache.org/r/22848/#comment82378>

    Shouldn't the minOccurs be "1" on these?  I'm not sure, but I believe the xs:choice will enforce that only one of them is actually provided, right?



core/pom.xml
<https://reviews.apache.org/r/22848/#comment82369>

    This should stay at "provided"
    
    Also, can you put back the order?  You swapped this and oozie-hadoop-test, so it's not a "real" change.



core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java
<https://reviews.apache.org/r/22848/#comment82370>

    Can you explicitly list the imports instead of using *



core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java
<https://reviews.apache.org/r/22848/#comment82371>

    It might be a good idea to add a config property to oozie-default to override this check just in case.  Otherwise, I could see someone using a custom version of Hadoop where this could work (or at least they think it should) and we have no way of going around this check if it doesn't agree.
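
    A rough sketch of what such an override could look like, for illustration only. The property name is made up, java.util.Properties stands in for whatever configuration object the executor actually reads, and the check is reduced to "major version is 2 or later", which is an assumption about how the patch phrases the Hadoop 2 requirement.

```java
import java.util.Properties;

// Hypothetical sketch: a Hadoop version gate that a config property can bypass.
final class HadoopVersionGate {

    // Made-up property name, used only for this example.
    static final String SKIP_CHECK =
            "oozie.action.search.batch.indexer.skip.hadoop.version.check";

    private HadoopVersionGate() {
    }

    // Allows the action if the override is set, or if the major version is >= 2.
    static boolean isAllowed(Properties conf, String hadoopVersion) {
        if (Boolean.parseBoolean(conf.getProperty(SKIP_CHECK, "false"))) {
            return true;
        }
        return majorVersion(hadoopVersion) >= 2;
    }

    // Parses the part before the first '.'; returns -1 if it is not a number.
    static int majorVersion(String version) {
        int dot = version.indexOf('.');
        String major = dot < 0 ? version : version.substring(0, dot);
        try {
            return Integer.parseInt(major);
        } catch (NumberFormatException e) {
            return -1;
        }
    }
}
```

    The override defaults to "false", so behaviour is unchanged unless an administrator explicitly sets the property.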



core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java
<https://reviews.apache.org/r/22848/#comment82372>

    Should this be the following?
    conf = super.setupLauncherConf(conf, actionXml, appPath, context);



docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki
<https://reviews.apache.org/r/22848/#comment82374>

    It may make sense to add an "About" section (or some other name) saying that this refers to Cloudera Search, an Apache licensed Search thing built on Lucene and Solr that can be found at GITHUB_PAGE_OR_WEBPAGE.  This is less well-known than the other action types and isn't an Apache-run project.
    
    We can then also put the Hadoop 2 requirement in the same section.



docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki
<https://reviews.apache.org/r/22848/#comment82375>

    Can we call this something else?  It's confusing with NAME-NODE.  And also the "ok to" and "error to" transitions should have some other name, or it looks like a loop (e.g. SOMENAME2)



docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki
<https://reviews.apache.org/r/22848/#comment82376>

    You're only allowed to have one of these, right?



docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki
<https://reviews.apache.org/r/22848/#comment82377>

    If these additional arguments are required for the zookeeper and solrconfig paths, why not make them required by the schema?
    
    I'm not sure, but I think this would do it; if not, it would be something similar:
    <xs:choice minOccurs="1" maxOccurs="1">
       <xs:sequence>
          <xs:element name="zookeeper" type="xs:string" minOccurs="1" maxOccurs="1"/>
          <xs:element name="collection" type="xs:string" minOccurs="1" maxOccurs="1"/>
       </xs:sequence>
       <xs:sequence>
          <xs:element name="solrconfig" type="xs:string" minOccurs="1" maxOccurs="1"/>
          <xs:element name="shards" type="xs:string" minOccurs="1" maxOccurs="1"/>
       </xs:sequence>
    </xs:choice>
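
    One way to check how a choice of two sequences behaves is to run a few documents through the JDK's built-in validator.  This is only a sketch: the wrapping "action" element is a hypothetical stand-in for the real action element in the .xsd, and the explicit minOccurs/maxOccurs="1" on the inner elements are left at their defaults:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class ChoiceSchemaCheck {
    // The suggested choice, wrapped in a hypothetical "action" element.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='action'><xs:complexType>"
      + "<xs:choice minOccurs='1' maxOccurs='1'>"
      + "<xs:sequence>"
      + "<xs:element name='zookeeper' type='xs:string'/>"
      + "<xs:element name='collection' type='xs:string'/>"
      + "</xs:sequence>"
      + "<xs:sequence>"
      + "<xs:element name='solrconfig' type='xs:string'/>"
      + "<xs:element name='shards' type='xs:string'/>"
      + "</xs:sequence>"
      + "</xs:choice>"
      + "</xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns true if the XML document validates against XSD.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String matched = "<action><zookeeper>zk</zookeeper><collection>c</collection></action>";
        String mixed = "<action><zookeeper>zk</zookeeper><shards>2</shards></action>";
        System.out.println(isValid(matched));
        System.out.println(isValid(mixed));
    }
}
```

    With this shape, a document must supply exactly one complete pair; mixing an element from one pair with an element from the other should be rejected by the schema rather than at run time.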



sharelib/search/src/test/java/org/apache/oozie/action/hadoop/TestSearchBatchIndexerActionExecutor.java
<https://reviews.apache.org/r/22848/#comment82379>

    Replace with explicit classes



sharelib/search/src/test/resources/solr/conf/currency.xml
<https://reviews.apache.org/r/22848/#comment82380>

    Please remove the trailing whitespace if possible.



sharelib/search/src/test/resources/solr/conf/elevate.xml
<https://reviews.apache.org/r/22848/#comment82381>

    Please remove the trailing whitespace if possible.



sharelib/search/src/test/resources/solr/conf/lang/contractions_it.txt
<https://reviews.apache.org/r/22848/#comment82383>

    Trailing whitespace



sharelib/search/src/test/resources/solr/conf/lang/stoptags_ja.txt
<https://reviews.apache.org/r/22848/#comment82384>

    Trailing whitespace



sharelib/search/src/test/resources/solr/conf/schema.xml
<https://reviews.apache.org/r/22848/#comment82382>

    Please remove the trailing whitespace throughout this file as well, if possible.


- Robert Kanter


On June 21, 2014, 12:11 a.m., Abraham Elmahrek wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22848/
> -----------------------------------------------------------
> 
> (Updated June 21, 2014, 12:11 a.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1895
>     https://issues.apache.org/jira/browse/OOZIE-1895
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> - Provide 2 different paths of execution: Zookeeper config and solrconfig. With solrconfig, --shards argument should be passed via "argument" in the action xml. With the zookeeper config, --collection should be provided via "argument" in the action xml.
> - Go live mode not supported in secure clusters.
> - Only available with Hadoop 2.
> 
> 
> Diffs
> -----
> 
>   client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3 
>   client/src/main/resources/search-batch-indexer-action-0.1.xsd PRE-CREATION 
>   core/pom.xml e152266 
>   core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java PRE-CREATION 
>   docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki PRE-CREATION 
>   docs/src/site/twiki/index.twiki f078bf5 
>   pom.xml bad1e0f 
>   sharelib/pom.xml df20294 
>   sharelib/search/pom.xml PRE-CREATION 
>   sharelib/search/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerMain.java PRE-CREATION 
>   sharelib/search/src/test/java/org/apache/oozie/action/hadoop/TestSearchBatchIndexerActionExecutor.java PRE-CREATION 
>   sharelib/search/src/test/resources/morphlines/basic.conf PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/currency.xml PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/elevate.xml PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/contractions_ca.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/contractions_fr.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/contractions_ga.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/contractions_it.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/hyphenations_ga.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stemdict_nl.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stoptags_ja.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_ar.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_bg.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_ca.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_cz.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_da.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_de.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_el.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_en.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_es.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_eu.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_fa.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_fi.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_fr.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_ga.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_gl.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_hi.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_hu.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_hy.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_id.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_it.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_ja.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_lv.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_nl.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_no.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_pt.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_ro.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_ru.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_sv.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_th.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_tr.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/userdict_ja.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/protwords.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/schema.xml PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/solrconfig.xml PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/stopwords.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/synonyms.txt PRE-CREATION 
>   src/main/assemblies/sharelib.xml 891d9dc 
>   webapp/pom.xml 93cfcef 
> 
> Diff: https://reviews.apache.org/r/22848/diff/
> 
> 
> Testing
> -------
> 
> Wrote a couple of tests and manually tested on Kerberized environment.
> 
> 
> Thanks,
> 
> Abraham Elmahrek
> 
>


Re: Review Request 22848: Create a Search Indexing action

Posted by Abraham Elmahrek <ab...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22848/
-----------------------------------------------------------

(Updated July 10, 2014, 11:58 p.m.)


Review request for oozie.


Bugs: OOZIE-1895
    https://issues.apache.org/jira/browse/OOZIE-1895


Repository: oozie-git


Description
-------

- Provide 2 different paths of execution: Zookeeper config and solrconfig. With solrconfig, --shards argument should be passed via "argument" in the action xml. With the zookeeper config, --collection should be provided via "argument" in the action xml.
- Go live mode not supported in secure clusters.
- Only available with Hadoop 2.


Diffs (updated)
-----

  client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3 
  client/src/main/resources/search-batch-indexer-action-0.1.xsd PRE-CREATION 
  core/pom.xml e152266 
  core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java PRE-CREATION 
  core/src/main/resources/oozie-default.xml b944d3d 
  docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki PRE-CREATION 
  docs/src/site/twiki/index.twiki f078bf5 
  pom.xml bad1e0f 
  sharelib/pom.xml df20294 
  sharelib/search/pom.xml PRE-CREATION 
  sharelib/search/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerMain.java PRE-CREATION 
  sharelib/search/src/test/java/org/apache/oozie/action/hadoop/TestSearchBatchIndexerActionExecutor.java PRE-CREATION 
  sharelib/search/src/test/resources/morphlines/basic.conf PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/elevate.xml PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/contractions_ca.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/contractions_fr.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/contractions_ga.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/contractions_it.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/hyphenations_ga.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stemdict_nl.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stoptags_ja.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/stopwords_en.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/lang/userdict_ja.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/protwords.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/schema.xml PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/solrconfig.xml PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/stopwords.txt PRE-CREATION 
  sharelib/search/src/test/resources/solr/conf/synonyms.txt PRE-CREATION 
  src/main/assemblies/sharelib.xml 891d9dc 
  webapp/pom.xml 93cfcef 

Diff: https://reviews.apache.org/r/22848/diff/


Testing
-------

Wrote a couple of tests and manually tested on a Kerberized environment.


Thanks,

Abraham Elmahrek