Posted to dev@mrql.apache.org by "Leonidas Fegaras (JIRA)" <ji...@apache.org> on 2014/10/19 18:06:33 UTC

[jira] [Commented] (MRQL-55) Add support for Hadoop Sequence input format in flink mode

    [ https://issues.apache.org/jira/browse/MRQL-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176330#comment-14176330 ] 

Leonidas Fegaras commented on MRQL-55:
--------------------------------------

Here are some performance results (in seconds) on a small YARN cluster with 12 nodes (48 cores), for the following four queries:

# PageRank (6 steps) on a graph with 1M nodes and 10M edges
# K-means clustering (5 steps) on 10M points
# PageRank (12 steps) on 1.5GB of DBLP XML data
# Matrix multiplication, 500x500

{noformat}
Query   Map-Reduce     Spark     Flink
---------------------------------------
1          591.8       145.1     145.3
2         1068.1       184.0     516.4
3          994.2       149.4     181.6
4           78.7        83.2      94.9
{noformat}

K-means (query 2) is slower in Flink mode than in Spark mode (516.4 vs. 184.0 seconds) because MRQL doesn't use Flink iterations for k-means, whereas it does use them for PageRank.
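
To put the timings in relative terms, here is a minimal sketch that computes how much faster each mode is than Map-Reduce on each query. The numbers are copied from the table above; the variable and function names are illustrative only and are not part of MRQL.

```python
# Timings in seconds, copied from the results table.
# Keys are the query numbers from the list above; all names here are illustrative.
timings = {
    1: {"Map-Reduce": 591.8, "Spark": 145.1, "Flink": 145.3},
    2: {"Map-Reduce": 1068.1, "Spark": 184.0, "Flink": 516.4},
    3: {"Map-Reduce": 994.2, "Spark": 149.4, "Flink": 181.6},
    4: {"Map-Reduce": 78.7, "Spark": 83.2, "Flink": 94.9},
}


def speedup_over_mapreduce(row):
    """Return Map-Reduce time divided by each other mode's time."""
    base = row["Map-Reduce"]
    return {mode: base / secs for mode, secs in row.items() if mode != "Map-Reduce"}


speedups = {query: speedup_over_mapreduce(row) for query, row in timings.items()}

for query, s in sorted(speedups.items()):
    print(f"query {query}: Spark {s['Spark']:.2f}x, Flink {s['Flink']:.2f}x")
```

A ratio below 1 means the mode was slower than Map-Reduce on that query, which is the case for both Spark and Flink on query 4.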

> Add support for Hadoop Sequence input format in flink mode
> ----------------------------------------------------------
>
>                 Key: MRQL-55
>                 URL: https://issues.apache.org/jira/browse/MRQL-55
>             Project: MRQL
>          Issue Type: Improvement
>          Components: Run-Time/Flink
>    Affects Versions: 0.9.4
>            Reporter: Leonidas Fegaras
>            Assignee: Leonidas Fegaras
>            Priority: Minor
>         Attachments: MRQL-55.patch
>
>
> The following patch adds support for the Hadoop Sequence input format in Flink mode. Before this, we used the Flink binary input format to read/write binary files, which was not compatible with other MRQL evaluation modes. The patch also fixes the mrql.flink script to get the Flink job manager from conf/.yarn-properties instead of conf/.yarn-jobmanager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)