Posted to commits@accumulo.apache.org by ec...@apache.org on 2012/01/11 15:29:28 UTC
svn commit: r1230064 - in /incubator/accumulo/branches/1.4/src/wikisearch:
README
ingest/src/main/java/org/apache/accumulo/wikisearch/ingest/WikipediaIngester.java
Author: ecn
Date: Wed Jan 11 14:29:28 2012
New Revision: 1230064
URL: http://svn.apache.org/viewvc?rev=1230064&view=rev
Log:
ACCUMULO-285 finely tune instructions, turn off speculative execution
Modified:
incubator/accumulo/branches/1.4/src/wikisearch/README
incubator/accumulo/branches/1.4/src/wikisearch/ingest/src/main/java/org/apache/accumulo/wikisearch/ingest/WikipediaIngester.java
Modified: incubator/accumulo/branches/1.4/src/wikisearch/README
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/wikisearch/README?rev=1230064&r1=1230063&r2=1230064&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/wikisearch/README (original)
+++ incubator/accumulo/branches/1.4/src/wikisearch/README Wed Jan 11 14:29:28 2012
@@ -15,9 +15,9 @@
INSTRUCTIONS
------------
- 1. Copy the conf/wikipedia.xml.example to conf/wikipedia.xml and change it to specify Accumulo information.
- 2. Copy the lib/wikisearch-*.jar and lib/protobuf*.jar to $ACCUMULO_HOME/lib/ext
- 3. Then run bin/ingest.sh with one argument (the name of the directory in HDFS where the wikipedia XML
+ 1. Copy the ingest/conf/wikipedia.xml.example to ingest/conf/wikipedia.xml and change it to specify Accumulo information.
+ 2. Copy the ingest/lib/wikisearch-*.jar and ingest/lib/protobuf*.jar to $ACCUMULO_HOME/lib/ext
+ 3. Then run ingest/bin/ingest.sh with one argument (the name of the directory in HDFS where the wikipedia XML
files reside) and this will kick off a MapReduce job to ingest the data into Accumulo.
Query
@@ -34,7 +34,7 @@
-------------
1. Modify the query/src/main/resources/META-INF/ejb-jar.xml file with the same information that you put into the wikipedia.xml
file from the Ingest step above.
- 2. Re-build the query distribution by running 'mvn assembly:single' in the top-level directory.
+ 2. Re-build the query distribution by running 'mvn package assembly:single' in the top-level directory.
3. Untar the resulting file in the $JBOSS_HOME/server/default directory.
$ cd $JBOSS_HOME/server/default
Modified: incubator/accumulo/branches/1.4/src/wikisearch/ingest/src/main/java/org/apache/accumulo/wikisearch/ingest/WikipediaIngester.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/wikisearch/ingest/src/main/java/org/apache/accumulo/wikisearch/ingest/WikipediaIngester.java?rev=1230064&r1=1230063&r2=1230064&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/wikisearch/ingest/src/main/java/org/apache/accumulo/wikisearch/ingest/WikipediaIngester.java (original)
+++ incubator/accumulo/branches/1.4/src/wikisearch/ingest/src/main/java/org/apache/accumulo/wikisearch/ingest/WikipediaIngester.java Wed Jan 11 14:29:28 2012
@@ -135,7 +135,8 @@ public class WikipediaIngester extends C
public int run(String[] args) throws Exception {
Job job = new Job(getConf(), "Ingest Wikipedia");
Configuration conf = job.getConfiguration();
-
+ conf.set("mapred.map.tasks.speculative.execution", "false");
+
String tablename = WikipediaConfiguration.getTableName(conf);
String zookeepers = WikipediaConfiguration.getZookeepers(conf);
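The one-line change above disables speculative execution for the map tasks of the ingest job, so Hadoop does not launch duplicate map attempts that would write the same mutations to Accumulo twice. A minimal sketch of that configuration in isolation, assuming the Hadoop 0.20-era API used by this branch (where the property `mapred.map.tasks.speculative.execution` governs map-task speculation); the job name is illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class IngestJobSetup {
  public static Job createJob(Configuration base) throws Exception {
    // Create the MapReduce job, as WikipediaIngester.run() does.
    Job job = new Job(base, "Ingest Wikipedia");
    Configuration conf = job.getConfiguration();

    // Turn off speculative execution for map tasks. Speculatively
    // re-run mappers would re-ingest the same records into Accumulo,
    // producing duplicate writes with no benefit.
    conf.set("mapred.map.tasks.speculative.execution", "false");

    return job;
  }
}
```

On newer Hadoop releases the same setting is spelled `mapreduce.map.speculative` (or set via `job.setSpeculativeExecution(false)` on old-API `JobConf`), but the string form shown here matches what this commit adds.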