You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@oozie.apache.org by Robert Kanter <rk...@cloudera.com> on 2014/07/01 02:49:03 UTC

Re: Review Request 22848: Create a Search Indexing action


> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > core/pom.xml, line 41
> > <https://reviews.apache.org/r/22848/diff/2/?file=614632#file614632line41>
> >
> >     This should stay at "provided"
> >     
> >     Also, can you put back the order?  You swapped this and oozie-hadoop-test, so it's not a "real" change.
> 
> Abraham Elmahrek wrote:
>     Can we create a separate Jira for this? It turns out the order matters for Idea w/ Maven.
>     
>     Also, it appears that commons-io is a compile time dependency on oozie-hadoop. commons-io appears to be used in GzipCompressionCodec.java, which is part of the core package. It can be done in a separate jira, but it seems it needs to be included?

In that case, it's fine to change the order (though that's silly that Idea has that problem; I'm pretty sure the order isn't supposed to matter).  But oozie-hadoop should definitely be "provided".

It does look like commons-io should be "compile".


> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki, lines 79-81
> > <https://reviews.apache.org/r/22848/diff/2/?file=614634#file614634line79>
> >
> >     If these additional arguments are required for the zookeeper and solrconfigs, why not make them required by the schema?
> >     
> >     I'm not sure, but I think this would do it; if not, it would be something similar:
> >     <xs:choice minOccurs="1" maxOccurs="1">
> >        <xs:sequence>
> >           <xs:element name="zookeeper" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >           <xs:element name="collection" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >        </xs:sequence>
> >        <xs:sequence>
> >           <xs:element name="solrconfig" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >           <xs:element name="shards" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >        </xs:sequence>
> >     </xs:choice>
> 
> Abraham Elmahrek wrote:
>     We could provide options for all/some of these options:
>     --solr-home-dir
>     --zk-host
>     --shard-url
>     --shards
>     --collection
>     
>     The big worry with providing options of this nature is that if the options change or the requirements change, then we need to rev. the action. Since there are so many options, it seems possible that one may drop out and we'll have to add extra logic to be backwards compatible.

I think it should be okay because, say one of them gets dropped.  Then schema 0.2 would not have it and if using schema 0.1, the code would just completely ignore that field.  If a field is added, it can still be put in the <argument> field, and we can make schema 0.2 to add it as its own field.  I'd say that we should only add the required options, if any are optional then leave them out (unless you think they'd be helpful).  My concern is a user who doesn't know what they're doing and specified conflicting options or doesn't specify enough options, etc.  It's best if it fails at submission time (during the schema validation) rather than later when the Oozie server tries to actually run the action.


- Robert


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22848/#review46810
-----------------------------------------------------------


On June 21, 2014, 12:11 a.m., Abraham Elmahrek wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22848/
> -----------------------------------------------------------
> 
> (Updated June 21, 2014, 12:11 a.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1895
>     https://issues.apache.org/jira/browse/OOZIE-1895
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> - Provide 2 different paths of execution: Zookeeper config and solrconfig. With solrconfig, --shards argument should be passed via "argument" in the action xml. With the zookeeper config, --collection should be provided via "argument" in the action xml.
> - Go live mode not supported in secure clusters.
> - Only available with Hadoop 2.
> 
> 
> Diffs
> -----
> 
>   client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3 
>   client/src/main/resources/search-batch-indexer-action-0.1.xsd PRE-CREATION 
>   core/pom.xml e152266 
>   core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java PRE-CREATION 
>   docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki PRE-CREATION 
>   docs/src/site/twiki/index.twiki f078bf5 
>   pom.xml bad1e0f 
>   sharelib/pom.xml df20294 
>   sharelib/search/pom.xml PRE-CREATION 
>   sharelib/search/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerMain.java PRE-CREATION 
>   sharelib/search/src/test/java/org/apache/oozie/action/hadoop/TestSearchBatchIndexerActionExecutor.java PRE-CREATION 
>   sharelib/search/src/test/resources/morphlines/basic.conf PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/currency.xml PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/elevate.xml PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/contractions_ca.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/contractions_fr.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/contractions_ga.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/contractions_it.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/hyphenations_ga.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stemdict_nl.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stoptags_ja.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_ar.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_bg.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_ca.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_cz.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_da.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_de.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_el.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_en.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_es.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_eu.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_fa.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_fi.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_fr.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_ga.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_gl.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_hi.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_hu.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_hy.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_id.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_it.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_ja.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_lv.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_nl.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_no.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_pt.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_ro.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_ru.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_sv.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_th.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/stopwords_tr.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/lang/userdict_ja.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/protwords.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/schema.xml PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/solrconfig.xml PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/stopwords.txt PRE-CREATION 
>   sharelib/search/src/test/resources/solr/conf/synonyms.txt PRE-CREATION 
>   src/main/assemblies/sharelib.xml 891d9dc 
>   webapp/pom.xml 93cfcef 
> 
> Diff: https://reviews.apache.org/r/22848/diff/
> 
> 
> Testing
> -------
> 
> Wrote a couple of tests and manually tested on Kerberized environment.
> 
> 
> Thanks,
> 
> Abraham Elmahrek
> 
>