Posted to dev@samoa.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/03/06 12:06:40 UTC
[jira] [Commented] (SAMOA-58) Samoa AvroFileStream from HDFSFileStreamSource stops at end of first file
[ https://issues.apache.org/jira/browse/SAMOA-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182095#comment-15182095 ]
ASF GitHub Bot commented on SAMOA-58:
-------------------------------------
Github user gdfm commented on a diff in the pull request:
https://github.com/apache/incubator-samoa/pull/48#discussion_r55136966
--- Diff: pom.xml ---
@@ -136,7 +136,7 @@
<metrics-core.version>2.2.0</metrics-core.version>
<miniball.version>1.0.3</miniball.version>
<s4.version>0.6.0-incubating</s4.version>
- <samza.version>0.7.0</samza.version>
+ <samza.version>0.10.1-MINE</samza.version>
--- End diff --
This should probably be 0.10.0, not 0.10.1-MINE.
> Samoa AvroFileStream from HDFSFileStreamSource stops at end of first file
> -------------------------------------------------------------------------
>
> Key: SAMOA-58
> URL: https://issues.apache.org/jira/browse/SAMOA-58
> Project: SAMOA
> Issue Type: Bug
> Components: SAMOA-Instances
> Environment: RHEL 6.6, java 1.8.0_72
> Reporter: Edi Bice
>
> It appears Samoa is capable of streaming a collection of files as a single stream, effectively concatenating the files. However, when using Samoa AvroFileStream with HDFSFileStreamSource, the stream seems to stop at the end of the first file:
> bin/samoa local target/SAMOA-Local-0.4.0-incubating-SNAPSHOT.jar "PrequentialEvaluation -i -1 -l (classifiers.ensemble.Bagging -s 100) -s (AvroFileStream -s HDFSFileStreamSource -f /tmp/order_and_feats_flat_avro/2016_02_18/ -c 1 -e binary) -f 10000"
> 2016-02-18 20:43:20,991 [main] INFO org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:183) - last event is received!
> 2016-02-18 20:43:20,991 [main] INFO org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:184) - total count: 262144
> ...
> 2016-02-18 20:43:20,993 [main] INFO org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:191) - total evaluation time: 34 seconds for 262144 instances
> bash-4.1$ hadoop fs -ls /tmp/order_and_feats_flat_avro/2016_02_18 | more
> Found 70 items
> -rw-r--r-- 3 yarn hdfs 230855335 2016-02-18 16:01 /tmp/order_and_feats_flat_avro/2016_02_18/hdfs-1a238673-c4ec-4462-be67-78d573efa790-00001
> -rw-r--r-- 3 yarn hdfs 229800273 2016-02-18 16:04 /tmp/order_and_feats_flat_avro/2016_02_18/hdfs-1a238673-c4ec-4462-be67-78d573efa790-00002
> ...
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)