Posted to commits@samza.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/09/11 15:52:00 UTC

[jira] [Commented] (SAMZA-1870) HDFS system admin not handling END_OF_STREAM offset

    [ https://issues.apache.org/jira/browse/SAMZA-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610822#comment-16610822 ] 

ASF GitHub Bot commented on SAMZA-1870:
---------------------------------------

GitHub user lhaiesp opened a pull request:

    https://github.com/apache/samza/pull/633

    SAMZA-1870: hdfs offset comparator to handle end of stream offset

    The HDFS offset comparator throws the following exception when it is given an END_OF_STREAM offset, particularly when HDFS is used as a bootstrap stream:
    
    org.apache.samza.SamzaException: Invalid offset for MultiFileHdfsReader: END_OF_STREAM
    at org.apache.samza.system.hdfs.reader.MultiFileHdfsReader.getCurFileIndex(MultiFileHdfsReader.java:64)
    at org.apache.samza.system.hdfs.HdfsSystemAdmin.offsetComparator(HdfsSystemAdmin.java:224)
    at org.apache.samza.system.chooser.BootstrappingChooser.org$apache$samza$system$chooser$BootstrappingChooser$$checkOffset(BootstrappingChooser.scala:274)
    at org.apache.samza.system.chooser.BootstrappingChooser.choose(BootstrappingChooser.scala:204)
    at org.apache.samza.system.chooser.DefaultChooser.choose(DefaultChooser.scala:294)
    at org.apache.samza.system.SystemConsumers.choose(SystemConsumers.scala:210)
    at org.apache.samza.task.AsyncRunLoop.chooseEnvelope(AsyncRunLoop.java:208)
    at org.apache.samza.task.AsyncRunLoop.run(AsyncRunLoop.java:156)
    at org.apache.samza.container.SamzaContainer.run(SamzaContainer.scala:787)
    at org.apache.samza.runtime.LocalContainerRunner.run(LocalContainerRunner.java:101)
    at org.apache.samza.runtime.LocalContainerRunner.main(LocalContainerRunner.java:148)
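
For illustration only, the kind of guard the trace above calls for could be sketched as follows. This is not the code from the pull request. It assumes that regular offsets have the form "<fileIndex>:<offsetInFile>" with a numeric in-file part, and that the end-of-stream marker is the literal string "END_OF_STREAM" seen in the exception message. The class and method names are hypothetical.

```java
// Hypothetical, simplified sketch; not the actual HdfsSystemAdmin code.
class EndOfStreamAwareOffsetComparator {
    // Assumption: sentinel string taken from the exception message above.
    static final String END_OF_STREAM = "END_OF_STREAM";
    // Assumption: multi-file offsets look like "<fileIndex>:<offsetInFile>".
    static final String DELIMITER = ":";

    // Returns a negative, zero, or positive value, or null if either offset is null.
    static Integer compare(String offset1, String offset2) {
        if (offset1 == null || offset2 == null) {
            return null;
        }
        boolean end1 = END_OF_STREAM.equals(offset1);
        boolean end2 = END_OF_STREAM.equals(offset2);
        // Handle the sentinel before any parsing, so it never reaches parse().
        if (end1 && end2) {
            return 0;
        }
        if (end1) {
            return 1;   // end of stream sorts after every regular offset
        }
        if (end2) {
            return -1;
        }
        long[] p1 = parse(offset1);
        long[] p2 = parse(offset2);
        int byFile = Long.compare(p1[0], p2[0]);
        return byFile != 0 ? byFile : Long.compare(p1[1], p2[1]);
    }

    // Splits "<fileIndex>:<offsetInFile>" into two numbers; rejects anything else.
    static long[] parse(String offset) {
        String[] parts = offset.split(DELIMITER, 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Invalid offset: " + offset);
        }
        return new long[] {Long.parseLong(parts[0]), Long.parseLong(parts[1])};
    }

    public static void main(String[] args) {
        System.out.println(compare("0:10", END_OF_STREAM));
        System.out.println(compare("1:0", "0:500"));
        System.out.println(compare(END_OF_STREAM, END_OF_STREAM));
    }
}
```

Checking the sentinel first keeps the parsing logic unchanged for regular offsets, and treating end of stream as greater than any regular offset lets a caller that checks "has this stream caught up" see the stream as caught up.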

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lhaiesp/samza master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/633.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #633
    
----
commit 42b80cc34a999955b79997494fe078f8024c9c2c
Author: Hai Lu <ha...@...>
Date:   2018-09-11T15:49:51Z

    hdfs offset comparator to handle end of stream offset

----


> HDFS system admin not handling END_OF_STREAM offset
> ---------------------------------------------------
>
>                 Key: SAMZA-1870
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1870
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Hai
>            Assignee: Hai
>            Priority: Major
>
> The HDFS system admin throws the following exception when it is given an END_OF_STREAM offset, particularly when HDFS is used as a bootstrap stream:
> org.apache.samza.SamzaException: Invalid offset for MultiFileHdfsReader: END_OF_STREAM
>  at org.apache.samza.system.hdfs.reader.MultiFileHdfsReader.getCurFileIndex(MultiFileHdfsReader.java:64)
>  at org.apache.samza.system.hdfs.HdfsSystemAdmin.offsetComparator(HdfsSystemAdmin.java:224)
>  at org.apache.samza.system.chooser.BootstrappingChooser.org$apache$samza$system$chooser$BootstrappingChooser$$checkOffset(BootstrappingChooser.scala:274)
>  at org.apache.samza.system.chooser.BootstrappingChooser.choose(BootstrappingChooser.scala:204)
>  at org.apache.samza.system.chooser.DefaultChooser.choose(DefaultChooser.scala:294)
>  at org.apache.samza.system.SystemConsumers.choose(SystemConsumers.scala:210)
>  at org.apache.samza.task.AsyncRunLoop.chooseEnvelope(AsyncRunLoop.java:208)
>  at org.apache.samza.task.AsyncRunLoop.run(AsyncRunLoop.java:156)
>  at org.apache.samza.container.SamzaContainer.run(SamzaContainer.scala:787)
>  at org.apache.samza.runtime.LocalContainerRunner.run(LocalContainerRunner.java:101)
>  at org.apache.samza.runtime.LocalContainerRunner.main(LocalContainerRunner.java:148)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)