You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by ri...@sina.cn on 2016/09/06 12:56:29 UTC
回复:Re: 回复:Re: fromParallelCollection
my data from a Hbase table ,it is like a List[rowkey,Map[String,String]],
class MySplittableIterator extends SplittableIterator[String]{
// Members declared in java.util.Iterator
def hasNext(): Boolean = {
}
def next(): Nothing = {
}
// Members declared in org.apache.flink.util.SplittableIterator
def getMaximumNumberOfSplits(): Int = {
}
def split(num: Int): Array[Iterator[String]] = {
}
}
i do not know the methods to write,can you give me a example.
----- 原始邮件 -----
发件人:Timo Walther <tw...@apache.org>
收件人:user@flink.apache.org
主题:Re: 回复:Re: fromParallelCollection
日期:2016年09月06日 17点03分
Hi,
you have to implement a class that extends
"org.apache.flink.util.SplittableIterator". The runtime will ask
this class for multiple "java.util.Iterator"s over your split
data. How you split your data and how an iterator looks like
depends on your data and implementation.
If you need more help, you should show us some examples of your
data.
Timo
Am 06/09/16 um 09:46 schrieb rimin515@sina.cn:
fromCollection is not parallelization,the data is
huge,so i want to use env.fromParallelCollection(data),but the
data i do not know how to initialize,
----- 原始邮件 -----
发件人:Maximilian Michels <mx...@apache.org>
收件人:"user@flink.apache.org" <us...@flink.apache.org>,
rimin515@sina.cn
主题:Re: fromParallelCollection
日期:2016年09月05日 16点58分
Please give us a bit more insight on what you're trying to do.
On Sat, Sep 3, 2016 at 5:01 AM, <ri...@sina.cn> wrote:
> Hi,
> val env =
StreamExecutionEnvironment.getExecutionEnvironment
> val tr = env.fromParallelCollection(data)
>
> the data i do not know initialize,some one can tell me..
> --------------------------------
>
>
>
--
Freundliche Grüße / Kind Regards
Timo Walther
Follow me: @twalthr
https://www.linkedin.com/in/twalthr
Re: 回复:Re: 回复:Re: fromParallelCollection
Posted by Timo Walther <tw...@apache.org>.
If your data comes from HBase maybe it would also good to implement a
HBase source. A current HBase sink is in the making:
https://github.com/apache/flink/pull/2332
Maybe it would be better to save your data in an HDFS (e.g. CSV file)
and use the built-in "readFile()". This does the parallelism automatically.
Am 06/09/16 um 14:56 schrieb rimin515@sina.cn:
> my data from a Hbase table ,it is like a List[rowkey,Map[String,String]],
> class MySplittableIterator extends SplittableIterator[String]{
>
> // Members declared in java.util.Iterator
> def hasNext(): Boolean = {
>
> }
> def next(): Nothing = {
>
> }
>
> // Members declared in org.apache.flink.util.SplittableIterator
> def getMaximumNumberOfSplits(): Int = {
>
> }
> def split(num: Int): Array[Iterator[String]] = {
>
> }
> }
> i do not know the methods to write,can you give me a example.
> ----- \u539f\u59cb\u90ae\u4ef6 -----
> \u53d1\u4ef6\u4eba\uff1aTimo Walther <tw...@apache.org>
> \u6536\u4ef6\u4eba\uff1auser@flink.apache.org
> \u4e3b\u9898\uff1aRe: \u56de\u590d\uff1aRe: fromParallelCollection
> \u65e5\u671f\uff1a2016\u5e7409\u670806\u65e5 17\u70b903\u5206
>
> Hi,
>
> you have to implement a class that extends
> "org.apache.flink.util.SplittableIterator". The runtime will ask this
> class for multiple "java.util.Iterator"s over your split data. How you
> split your data and how an iterator looks like depends on your data
> and implementation.
>
> If you need more help, you should show us some examples of your data.
>
> Timo
>
> Am 06/09/16 um 09:46 schrieb rimin515@sina.cn <ma...@sina.cn>:
>> fromCollection is not parallelization,the data is huge,so i want to
>> use env.fromParallelCollection(data),but the data i do not know how
>> to initialize,
>> ----- \u539f\u59cb\u90ae\u4ef6 -----
>> \u53d1\u4ef6\u4eba\uff1aMaximilian Michels <mx...@apache.org> <ma...@apache.org>
>> \u6536\u4ef6\u4eba\uff1a"user@flink.apache.org" <ma...@flink.apache.org>
>> <us...@flink.apache.org> <ma...@flink.apache.org>,
>> rimin515@sina.cn <ma...@sina.cn>
>> \u4e3b\u9898\uff1aRe: fromParallelCollection
>> \u65e5\u671f\uff1a2016\u5e7409\u670805\u65e5 16\u70b958\u5206
>>
>>
>> Please give us a bit more insight on what you're trying to do.
>> On Sat, Sep 3, 2016 at 5:01 AM, <ri...@sina.cn>
>> <ma...@sina.cn> wrote:
>> > Hi\uff0c
>> > val env = StreamExecutionEnvironment.getExecutionEnvironment
>> > val tr = env.fromParallelCollection(data)
>> >
>> > the data i do not know initialize,some one can tell me..
>> > --------------------------------
>> >
>> >
>> >
>
>
> --
> Freundliche Gr��e / Kind Regards
>
> Timo Walther
>
> Follow me: @twalthr
> https://www.linkedin.com/in/twalthr
--
Freundliche Gr��e / Kind Regards
Timo Walther
Follow me: @twalthr
https://www.linkedin.com/in/twalthr