Posted to dev@oozie.apache.org by Abraham Elmahrek <ab...@cloudera.com> on 2014/06/21 02:11:27 UTC
Review Request 22848: Create a Search Indexing action
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22848/
-----------------------------------------------------------
Review request for oozie.
Bugs: OOZIE-1895
https://issues.apache.org/jira/browse/OOZIE-1895
Repository: oozie-git
Description
-------
- Provide two paths of execution: ZooKeeper config and solrconfig. With solrconfig, the --shards argument should be passed via "argument" in the action XML. With the ZooKeeper config, --collection should be provided via "argument" in the action XML.
- Go-live mode is not supported in secure clusters.
- Only available with Hadoop 2.
Diffs
-----
client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3
client/src/main/resources/search-batch-indexer-action-0.1.xsd PRE-CREATION
core/pom.xml e152266
core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java PRE-CREATION
docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki PRE-CREATION
docs/src/site/twiki/index.twiki f078bf5
pom.xml bad1e0f
sharelib/pom.xml df20294
sharelib/search/pom.xml PRE-CREATION
sharelib/search/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerMain.java PRE-CREATION
sharelib/search/src/test/java/org/apache/oozie/action/hadoop/TestSearchBatchIndexerActionExecutor.java PRE-CREATION
sharelib/search/src/test/resources/morphlines/basic.conf PRE-CREATION
sharelib/search/src/test/resources/solr/conf/currency.xml PRE-CREATION
sharelib/search/src/test/resources/solr/conf/elevate.xml PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/contractions_ca.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/contractions_fr.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/contractions_ga.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/contractions_it.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/hyphenations_ga.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stemdict_nl.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stoptags_ja.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_ar.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_bg.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_ca.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_cz.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_da.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_de.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_el.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_en.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_es.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_eu.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_fa.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_fi.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_fr.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_ga.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_gl.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_hi.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_hu.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_hy.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_id.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_it.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_ja.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_lv.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_nl.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_no.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_pt.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_ro.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_ru.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_sv.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_th.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_tr.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/userdict_ja.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/protwords.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/schema.xml PRE-CREATION
sharelib/search/src/test/resources/solr/conf/solrconfig.xml PRE-CREATION
sharelib/search/src/test/resources/solr/conf/stopwords.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/synonyms.txt PRE-CREATION
src/main/assemblies/sharelib.xml 891d9dc
webapp/pom.xml 93cfcef
Diff: https://reviews.apache.org/r/22848/diff/
Testing
-------
Wrote a couple of tests and manually tested on a Kerberized environment.
Thanks,
Abraham Elmahrek
Re: Review Request 22848: Create a Search Indexing action
Posted by Abraham Elmahrek <ab...@cloudera.com>.
> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > Great work Abe! I know this wasn't easy to figure out and get working.
> >
> > I did an initial review and made some comments. I haven't really looked at any of the tests yet.
> > Also, do we need so many extra txt files for the tests?
>
> Abraham Elmahrek wrote:
> Thanks Robert. I'll try to remove dependencies and see what that does.
>
> I noticed a mistake in the rules for which arguments are required:
> 1. At least one of --zk-host or --solr-home-dir is required.
> 2. If --solr-home-dir is specified, then exactly one of --zk-host, --shard-url, or --shards must be specified (they are mutually exclusive).
> 3. If --zk-host is specified at all (with or without --solr-home-dir), --collection should be provided.
>
> Will rectify.
Also, I'm open to suggestions on the name of the action "search batch indexer" ;).
- Abraham
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22848/#review46810
-----------------------------------------------------------
On July 10, 2014, 11:58 p.m., Abraham Elmahrek wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22848/
> -----------------------------------------------------------
>
> (Updated July 10, 2014, 11:58 p.m.)
>
>
> Review request for oozie.
>
>
> Bugs: OOZIE-1895
> https://issues.apache.org/jira/browse/OOZIE-1895
>
>
> Repository: oozie-git
>
>
> Description
> -------
>
> - Provide two paths of execution: ZooKeeper config and solrconfig. With solrconfig, the --shards argument should be passed via "argument" in the action XML. With the ZooKeeper config, --collection should be provided via "argument" in the action XML.
> - Go-live mode is not supported in secure clusters.
> - Only available with Hadoop 2.
>
>
> Diffs
> -----
>
> client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3
> client/src/main/resources/search-batch-indexer-action-0.1.xsd PRE-CREATION
> core/pom.xml e152266
> core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java PRE-CREATION
> core/src/main/resources/oozie-default.xml b944d3d
> docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki PRE-CREATION
> docs/src/site/twiki/index.twiki f078bf5
> pom.xml bad1e0f
> sharelib/pom.xml df20294
> sharelib/search/pom.xml PRE-CREATION
> sharelib/search/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerMain.java PRE-CREATION
> sharelib/search/src/test/java/org/apache/oozie/action/hadoop/TestSearchBatchIndexerActionExecutor.java PRE-CREATION
> sharelib/search/src/test/resources/morphlines/basic.conf PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/elevate.xml PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/lang/contractions_ca.txt PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/lang/contractions_fr.txt PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/lang/contractions_ga.txt PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/lang/contractions_it.txt PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/lang/hyphenations_ga.txt PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/lang/stemdict_nl.txt PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/lang/stoptags_ja.txt PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/lang/stopwords_en.txt PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/lang/userdict_ja.txt PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/protwords.txt PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/schema.xml PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/solrconfig.xml PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/stopwords.txt PRE-CREATION
> sharelib/search/src/test/resources/solr/conf/synonyms.txt PRE-CREATION
> src/main/assemblies/sharelib.xml 891d9dc
> webapp/pom.xml 93cfcef
>
> Diff: https://reviews.apache.org/r/22848/diff/
>
>
> Testing
> -------
>
> Wrote a couple of tests and manually tested on a Kerberized environment.
>
>
> Thanks,
>
> Abraham Elmahrek
>
>
Re: Review Request 22848: Create a Search Indexing action
Posted by Abraham Elmahrek <ab...@cloudera.com>.
> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > Great work Abe! I know this wasn't easy to figure out and get working.
> >
> > I did an initial review and made some comments. I haven't really looked at any of the tests yet.
> > Also, do we need so many extra txt files for the tests?
Thanks Robert. I'll try to remove dependencies and see what that does.
I noticed a mistake in the rules for which arguments are required:
1. At least one of --zk-host or --solr-home-dir is required.
2. If --solr-home-dir is specified, then exactly one of --zk-host, --shard-url, or --shards must be specified (they are mutually exclusive).
3. If --zk-host is specified at all (with or without --solr-home-dir), --collection should be provided.
Will rectify.
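For illustration, the three rules above could be checked along the following lines. This is only a sketch, not the code in the patch: the class IndexerArgumentRules and its validate method are hypothetical names, and it assumes the options arrive as a flat list of strings, as they would from the "argument" elements of the action XML.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of the patch under review. */
final class IndexerArgumentRules {

    private IndexerArgumentRules() {
    }

    /** Returns one message per violated rule; an empty list means the arguments pass. */
    static List<String> validate(List<String> arguments) {
        // Collect option names, accepting both "--flag value" and "--flag=value".
        Set<String> flags = new HashSet<>();
        for (String argument : arguments) {
            if (argument.startsWith("--")) {
                int equals = argument.indexOf('=');
                flags.add(equals < 0 ? argument : argument.substring(0, equals));
            }
        }
        boolean zkHost = flags.contains("--zk-host");
        boolean solrHomeDir = flags.contains("--solr-home-dir");
        List<String> violations = new ArrayList<>();

        // Rule 1: at least one of --zk-host or --solr-home-dir.
        if (!zkHost && !solrHomeDir) {
            violations.add("one of --zk-host or --solr-home-dir is required");
        }

        // Rule 2: with --solr-home-dir, exactly one of the three targets.
        if (solrHomeDir) {
            int targets = 0;
            for (String flag : Arrays.asList("--zk-host", "--shard-url", "--shards")) {
                if (flags.contains(flag)) {
                    targets++;
                }
            }
            if (targets != 1) {
                violations.add("--solr-home-dir needs exactly one of --zk-host, --shard-url, --shards");
            }
        }

        // Rule 3: --zk-host always needs --collection.
        if (zkHost && !flags.contains("--collection")) {
            violations.add("--zk-host requires --collection");
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println(validate(Arrays.asList("--zk-host", "zk1:2181/solr", "--collection", "logs")));
        System.out.println(validate(Arrays.asList("--solr-home-dir", "/tmp/solr")));
    }
}
```

Returning a list of violations rather than throwing on the first one would let the error message name every problem at once. Whether rule 2 really means "exactly one" is an assumption taken from the "mutually exclusive" wording.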
> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java, line 35
> > <https://reviews.apache.org/r/22848/diff/2/?file=614633#file614633line35>
> >
> > Can you explicitly list the imports instead of using *
Indeed!
> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > core/pom.xml, line 41
> > <https://reviews.apache.org/r/22848/diff/2/?file=614632#file614632line41>
> >
> > This should stay at "provided"
> >
> > Also, can you put back the order? You swapped this and oozie-hadoop-test, so it's not a "real" change.
Can we create a separate JIRA for this? It turns out the order matters for IntelliJ IDEA with Maven.
Also, it appears that commons-io is a compile-time dependency on oozie-hadoop. commons-io appears to be used in GzipCompressionCodec.java, which is part of the core package. It can be done in a separate JIRA, but it seems it needs to be included?
> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki, lines 79-81
> > <https://reviews.apache.org/r/22848/diff/2/?file=614634#file614634line79>
> >
> > If these additional arguments are required for the zookeeper and solrconfigs, why not make them required by the schema?
> >
> > I'm not sure, but I think this would do it; if not, it would be something similar:
> >     <xs:choice minOccurs="1" maxOccurs="1">
> >         <xs:sequence>
> >             <xs:element name="zookeeper" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >             <xs:element name="collection" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >         </xs:sequence>
> >         <xs:sequence>
> >             <xs:element name="solrconfig" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >             <xs:element name="shards" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >         </xs:sequence>
> >     </xs:choice>
We could provide schema elements for all or some of these options:
--solr-home-dir
--zk-host
--shard-url
--shards
--collection
The big worry with exposing the options this way is that if the options or their requirements change, we need to rev the action. Since there are so many options, it seems possible that one may drop out, and we would then have to add extra logic to stay backwards compatible.
> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki, lines 55-56
> > <https://reviews.apache.org/r/22848/diff/2/?file=614634#file614634line55>
> >
> > You're only allowed to have one of these, right?
It looks like you can have both; I will change our internal validation accordingly. The rules are defined in the comment above.
> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki, line 9
> > <https://reviews.apache.org/r/22848/diff/2/?file=614634#file614634line9>
> >
> > It may make sense to add an "About" section (or some other name) saying that this refers to Cloudera Search, an Apache-licensed search thing built on Lucene and Solr that can be found at GITHUB_PAGE_OR_WEBPAGE. This is less well-known than the other action types and isn't an Apache-run project.
> >
> > We can then also put the Hadoop 2 requirement in the same section.
- Abraham
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22848/#review46810
-----------------------------------------------------------
Re: Review Request 22848: Create a Search Indexing action
Posted by Robert Kanter <rk...@cloudera.com>.
> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > core/pom.xml, line 41
> > <https://reviews.apache.org/r/22848/diff/2/?file=614632#file614632line41>
> >
> > This should stay at "provided"
> >
> > Also, can you put back the order? You swapped this and oozie-hadoop-test, so it's not a "real" change.
>
> Abraham Elmahrek wrote:
> Can we create a separate Jira for this? It turns out the order matters for Idea w/ Maven.
>
> Also, it appears that commons-io is a compile time dependency on oozie-hadoop. commons-io appears to be used in GzipCompressionCodec.java, which is part of the core package. It can be done in a separate jira, but it seems it needs to be included?
In that case, it's fine to change the order (though that's silly that Idea has that problem; I'm pretty sure the order isn't supposed to matter). But oozie-hadoop should definitely be "provided".
It does look like commons-io should be "compile".
> On June 26, 2014, 11:55 p.m., Robert Kanter wrote:
> > docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki, lines 79-81
> > <https://reviews.apache.org/r/22848/diff/2/?file=614634#file614634line79>
> >
> > If these additional arguments are required for the zookeeper and solrconfigs, why not make them required by the schema?
> >
> > I'm not sure, but I think this would do it; if not, it would be something similar:
> >     <xs:choice minOccurs="1" maxOccurs="1">
> >         <xs:sequence>
> >             <xs:element name="zookeeper" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >             <xs:element name="collection" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >         </xs:sequence>
> >         <xs:sequence>
> >             <xs:element name="solrconfig" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >             <xs:element name="shards" type="xs:string" minOccurs="1" maxOccurs="1"/>
> >         </xs:sequence>
> >     </xs:choice>
>
> Abraham Elmahrek wrote:
> We could provide options for all/some of these options:
> --solr-home-dir
> --zk-host
> --shard-url
> --shards
> --collection
>
> The big worry with providing options of this nature is that if the options change or the requirements change, then we need to rev. the action. Since there are so many options, it seems possible that one may drop out and we'll have to add extra logic to be backwards compatible.
I think it should be okay. Say one of them gets dropped: schema 0.2 would not have it, and for workflows still using schema 0.1 the code would simply ignore that field. If an option is added, it can still be passed in the <argument> field, and we can make schema 0.2 add it as its own field. I'd say we should only add the required options; if any are optional, leave them out (unless you think they'd be helpful). My concern is a user who doesn't know what they're doing and specifies conflicting options, or doesn't specify enough options, etc. It's best if it fails at submission time (during schema validation) rather than later, when the Oozie server actually tries to run the action.
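To make the "fail during schema validation" point concrete, here is a small sketch that runs the suggested xs:choice through the JDK's javax.xml.validation API. The wrapper element name "indexer" and the class ChoiceSchemaCheck are made up for the example; the real search-batch-indexer-action-0.1.xsd has more elements and a namespace, so this only shows the mechanism.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Illustration only: a cut-down schema, not the XSD from the patch. */
final class ChoiceSchemaCheck {

    // "indexer" is a hypothetical wrapper element; the choice is the one proposed in the review.
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"indexer\"><xs:complexType>"
            + "<xs:choice minOccurs=\"1\" maxOccurs=\"1\">"
            + "<xs:sequence>"
            + "<xs:element name=\"zookeeper\" type=\"xs:string\" minOccurs=\"1\" maxOccurs=\"1\"/>"
            + "<xs:element name=\"collection\" type=\"xs:string\" minOccurs=\"1\" maxOccurs=\"1\"/>"
            + "</xs:sequence>"
            + "<xs:sequence>"
            + "<xs:element name=\"solrconfig\" type=\"xs:string\" minOccurs=\"1\" maxOccurs=\"1\"/>"
            + "<xs:element name=\"shards\" type=\"xs:string\" minOccurs=\"1\" maxOccurs=\"1\"/>"
            + "</xs:sequence>"
            + "</xs:choice>"
            + "</xs:complexType></xs:element>"
            + "</xs:schema>";

    private ChoiceSchemaCheck() {
    }

    /** Returns true if the XML document validates against the cut-down schema. */
    static boolean isValid(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            // A broken schema is a bug in this example, not a validation result.
            throw new IllegalStateException("schema itself is invalid", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown for any validation (or well-formedness) error.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(
                "<indexer><zookeeper>zk1:2181/solr</zookeeper><collection>logs</collection></indexer>"));
        System.out.println(isValid("<indexer><zookeeper>zk1:2181/solr</zookeeper></indexer>"));
    }
}
```

With a schema shaped like this, a document that gives zookeeper without collection, or mixes the zookeeper and solrconfig branches, is rejected by the validator before any action code runs.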
- Robert
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22848/#review46810
-----------------------------------------------------------
Re: Review Request 22848: Create a Search Indexing action
Posted by Robert Kanter <rk...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22848/#review46810
-----------------------------------------------------------
Great work Abe! I know this wasn't easy to figure out and get working.
I did an initial review and made some comments. I haven't really looked at any of the tests yet.
Also, do we need so many extra txt files for the tests?
client/src/main/resources/search-batch-indexer-action-0.1.xsd
<https://reviews.apache.org/r/22848/#comment82378>
Shouldn't the minOccurs be "1" on these? I'm not sure, but I believe the xs:choice will enforce that only one of them is actually provided, right?
core/pom.xml
<https://reviews.apache.org/r/22848/#comment82369>
This should stay at "provided"
Also, can you put back the order? You swapped this and oozie-hadoop-test, so it's not a "real" change.
core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java
<https://reviews.apache.org/r/22848/#comment82370>
Can you explicitly list the imports instead of using *
core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java
<https://reviews.apache.org/r/22848/#comment82371>
It might be a good idea to add a config property to oozie-default to override this check just in case. Otherwise, I could see someone using a custom version of Hadoop where this could work (or at least they think it should) and we have no way of going around this check if it doesn't agree.
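A rough sketch of what such an override could look like. Everything here is assumed for illustration: the property name, the class HadoopVersionGate, and the idea that the existing check compares a version string. java.util.Properties stands in for the server configuration so the example runs on its own.

```java
import java.util.Properties;

/** Illustration only; names and the shape of the version check are assumptions. */
final class HadoopVersionGate {

    // Hypothetical property name; the patch may choose a different one.
    static final String SKIP_CHECK_PROPERTY = "oozie.action.search.skip.hadoop.version.check";

    private HadoopVersionGate() {
    }

    /** Allows the action on Hadoop 2.x, or on any version when the override is set to true. */
    static boolean isActionAllowed(String hadoopVersion, Properties conf) {
        // The override defaults to false, so the check stays on unless an admin opts out.
        boolean skipCheck = Boolean.parseBoolean(conf.getProperty(SKIP_CHECK_PROPERTY, "false"));
        if (skipCheck) {
            return true;
        }
        return hadoopVersion != null && hadoopVersion.startsWith("2.");
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(isActionAllowed("2.3.0", conf));
        conf.setProperty(SKIP_CHECK_PROPERTY, "true");
        System.out.println(isActionAllowed("1.2.1-custom", conf));
    }
}
```

Defaulting the override to false keeps the current behaviour for everyone who does not set it.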
core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java
<https://reviews.apache.org/r/22848/#comment82372>
Should this be the following?
conf = super.setupLauncherConf(conf, actionXml, appPath, context);
docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki
<https://reviews.apache.org/r/22848/#comment82374>
It may make sense to add an "About" section (or some other name) saying that this refers to Cloudera Search, an Apache-licensed search thing built on Lucene and Solr that can be found at GITHUB_PAGE_OR_WEBPAGE. This is less well-known than the other action types and isn't an Apache-run project.
We can then also put the Hadoop 2 requirement in the same section.
docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki
<https://reviews.apache.org/r/22848/#comment82375>
Can we call this something else? It's confusing with NAME-NODE. And also the "ok to" and "error to" transitions should have some other name, or it looks like a loop (e.g. SOMENAME2)
docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki
<https://reviews.apache.org/r/22848/#comment82376>
You're only allowed to have one of these, right?
docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki
<https://reviews.apache.org/r/22848/#comment82377>
If these additional arguments are required for the zookeeper and solrconfigs, why not make them required by the schema?
I'm not sure, but I think this would do it; if not, it would be something similar:
    <xs:choice minOccurs="1" maxOccurs="1">
        <xs:sequence>
            <xs:element name="zookeeper" type="xs:string" minOccurs="1" maxOccurs="1"/>
            <xs:element name="collection" type="xs:string" minOccurs="1" maxOccurs="1"/>
        </xs:sequence>
        <xs:sequence>
            <xs:element name="solrconfig" type="xs:string" minOccurs="1" maxOccurs="1"/>
            <xs:element name="shards" type="xs:string" minOccurs="1" maxOccurs="1"/>
        </xs:sequence>
    </xs:choice>
sharelib/search/src/test/java/org/apache/oozie/action/hadoop/TestSearchBatchIndexerActionExecutor.java
<https://reviews.apache.org/r/22848/#comment82379>
Replace with explicit classes
sharelib/search/src/test/resources/solr/conf/currency.xml
<https://reviews.apache.org/r/22848/#comment82380>
Please remove the trailing whitespace if possible.
sharelib/search/src/test/resources/solr/conf/elevate.xml
<https://reviews.apache.org/r/22848/#comment82381>
Please remove the trailing whitespace if possible.
sharelib/search/src/test/resources/solr/conf/lang/contractions_it.txt
<https://reviews.apache.org/r/22848/#comment82383>
Trailing whitespace
sharelib/search/src/test/resources/solr/conf/lang/stoptags_ja.txt
<https://reviews.apache.org/r/22848/#comment82384>
Trailing whitespace
sharelib/search/src/test/resources/solr/conf/schema.xml
<https://reviews.apache.org/r/22848/#comment82382>
Please remove the trailing whitespace throughout this file as well, if possible.
- Robert Kanter
Re: Review Request 22848: Create a Search Indexing action
Posted by Abraham Elmahrek <ab...@cloudera.com>.
(Updated July 10, 2014, 11:58 p.m.)
Review request for oozie.
Bugs: OOZIE-1895
https://issues.apache.org/jira/browse/OOZIE-1895
Repository: oozie-git
Description
-------
- Provide 2 different paths of execution: Zookeeper config and solrconfig. With solrconfig, --shards argument should be passed via "argument" in the action xml. With the zookeeper config, --collection should be provided via "argument" in the action xml.
- Go live mode not supported in secure clusters.
- Only available with Hadoop 2.
Diffs (updated)
-----
client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3
client/src/main/resources/search-batch-indexer-action-0.1.xsd PRE-CREATION
core/pom.xml e152266
core/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerActionExecutor.java PRE-CREATION
core/src/main/resources/oozie-default.xml b944d3d
docs/src/site/twiki/DG_SearchBatchIndexerActionExtension.twiki PRE-CREATION
docs/src/site/twiki/index.twiki f078bf5
pom.xml bad1e0f
sharelib/pom.xml df20294
sharelib/search/pom.xml PRE-CREATION
sharelib/search/src/main/java/org/apache/oozie/action/hadoop/SearchBatchIndexerMain.java PRE-CREATION
sharelib/search/src/test/java/org/apache/oozie/action/hadoop/TestSearchBatchIndexerActionExecutor.java PRE-CREATION
sharelib/search/src/test/resources/morphlines/basic.conf PRE-CREATION
sharelib/search/src/test/resources/solr/conf/elevate.xml PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/contractions_ca.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/contractions_fr.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/contractions_ga.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/contractions_it.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/hyphenations_ga.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stemdict_nl.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stoptags_ja.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/stopwords_en.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/lang/userdict_ja.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/protwords.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/schema.xml PRE-CREATION
sharelib/search/src/test/resources/solr/conf/solrconfig.xml PRE-CREATION
sharelib/search/src/test/resources/solr/conf/stopwords.txt PRE-CREATION
sharelib/search/src/test/resources/solr/conf/synonyms.txt PRE-CREATION
src/main/assemblies/sharelib.xml 891d9dc
webapp/pom.xml 93cfcef
Diff: https://reviews.apache.org/r/22848/diff/
Testing
-------
Wrote a couple of tests and manually tested in a Kerberized environment.
Thanks,
Abraham Elmahrek