You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@tajo.apache.org by jihoonson <gi...@git.apache.org> on 2016/04/19 06:59:54 UTC

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

GitHub user jihoonson opened a pull request:

    https://github.com/apache/tajo/pull/1001

    TAJO-2122: PullServer as an Auxiliary service of Yarn

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jihoonson/tajo-2 TAJO-2122

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/1001.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1001
    
----
commit 6034f96ddf69511665975984491f644bad1e61de
Author: Jihoon Son <ji...@apache.org>
Date:   2016-04-19T02:53:35Z

    TAJO-2122

commit eda7d71fe18daf7596e34118bd14547fadd550ef
Author: Jihoon Son <ji...@apache.org>
Date:   2016-04-19T04:58:58Z

    Fix test failure

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/tajo/pull/1001


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by jinossy <gi...@git.apache.org>.
Github user jinossy commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/1001#discussion_r60535035
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/worker/TaskImpl.java ---
    @@ -696,44 +696,48 @@ private synchronized void fetcherFinished(TaskAttemptContext ctx) {
     
               WorkerConnectionInfo conn = executionBlockContext.getWorkerContext().getConnectionInfo();
               if (NetUtils.isLocalAddress(address) && conn.getPullServerPort() == uri.getPort()) {
    -
    -            List<FileChunk> localChunkCandidates = getLocalStoredFileChunk(uri, systemConf);
    -
    -            for (FileChunk localChunk : localChunkCandidates) {
    -              // When a range request is out of range, storeChunk will be NULL. This case is normal state.
    -              // So, we should skip and don't need to create storeChunk.
    -              if (localChunk == null || localChunk.length() == 0) {
    -                continue;
    -              }
    -
    -              if (localChunk.getFile() != null && localChunk.startOffset() > -1) {
    -                localChunk.setFromRemote(false);
    -                localStoreChunkCount++;
    -              } else {
    -                localChunk = new FileChunk(defaultStoreFile, 0, -1);
    -                localChunk.setFromRemote(true);
    -              }
    -              localChunk.setEbId(f.getName());
    -              storeChunkList.add(localChunk);
    -            }
    +            localStoreChunkCount++;
    +            runnerList.add(new LocalFetcher(systemConf, uri, executionBlockContext, f.getName()));
    +
    +//            List<FileChunk> localChunkCandidates = getLocalStoredFileChunk(uri, systemConf);
    --- End diff --
    
    Could you explain why you comment out this block?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by jinossy <gi...@git.apache.org>.
Github user jinossy commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/1001#discussion_r60535225
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/worker/LocalFetcher.java ---
    @@ -0,0 +1,451 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +  package org.apache.tajo.worker;
    +
    +import com.google.common.annotations.VisibleForTesting;
    +import com.google.gson.Gson;
    +import io.netty.bootstrap.Bootstrap;
    +import io.netty.buffer.ByteBuf;
    +import io.netty.channel.*;
    +import io.netty.channel.socket.nio.NioSocketChannel;
    +import io.netty.handler.codec.http.*;
    +import io.netty.handler.timeout.ReadTimeoutException;
    +import io.netty.handler.timeout.ReadTimeoutHandler;
    +import io.netty.util.ReferenceCountUtil;
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.fs.FileSystem;
    +import org.apache.hadoop.fs.LocalDirAllocator;
    +import org.apache.hadoop.fs.LocalFileSystem;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.tajo.TajoProtos;
    +import org.apache.tajo.TajoProtos.FetcherState;
    +import org.apache.tajo.conf.TajoConf;
    +import org.apache.tajo.conf.TajoConf.ConfVars;
    +import org.apache.tajo.exception.TajoInternalError;
    +import org.apache.tajo.pullserver.PullServerConstants;
    +import org.apache.tajo.pullserver.PullServerConstants.Param;
    +import org.apache.tajo.pullserver.PullServerUtil;
    +import org.apache.tajo.pullserver.PullServerUtil.PullServerParams;
    +import org.apache.tajo.pullserver.PullServerUtil.PullServerRequestURIBuilder;
    +import org.apache.tajo.pullserver.TajoPullServerService;
    +import org.apache.tajo.pullserver.retriever.FileChunk;
    +import org.apache.tajo.pullserver.retriever.FileChunkMeta;
    +import org.apache.tajo.rpc.NettyUtils;
    +import org.apache.tajo.storage.HashShuffleAppenderManager;
    +import org.apache.tajo.storage.StorageUtil;
    +
    +import java.io.File;
    +import java.io.IOException;
    +import java.net.InetSocketAddress;
    +import java.net.URI;
    +import java.nio.charset.Charset;
    +import java.util.ArrayList;
    +import java.util.List;
    +import java.util.Optional;
    +import java.util.concurrent.ExecutionException;
    +import java.util.concurrent.TimeUnit;
    +
    +public class LocalFetcher extends AbstractFetcher {
    +
    +  private final static Log LOG = LogFactory.getLog(LocalFetcher.class);
    +
    +//  private final ExecutionBlockContext executionBlockContext;
    +  private final TajoPullServerService pullServerService;
    +
    +  private final String host;
    +  private int port;
    +  private final Bootstrap bootstrap;
    +  private final int maxUrlLength;
    +  private final List<FileChunkMeta> chunkMetas = new ArrayList<>();
    +  private final String tableName;
    +  private final FileSystem localFileSystem;
    +  private final LocalDirAllocator localDirAllocator;
    +
    +  @VisibleForTesting
    +  public LocalFetcher(TajoConf conf, URI uri, String tableName) throws IOException {
    +    super(conf, uri);
    +    this.maxUrlLength = conf.getIntVar(ConfVars.PULLSERVER_FETCH_URL_MAX_LENGTH);
    +    this.tableName = tableName;
    +    this.localFileSystem = new LocalFileSystem();
    +    this.localDirAllocator = new LocalDirAllocator(ConfVars.WORKER_TEMPORAL_DIR.varname);
    +    this.pullServerService = null;
    +
    +    String scheme = uri.getScheme() == null ? "http" : uri.getScheme();
    +    this.host = uri.getHost() == null ? "localhost" : uri.getHost();
    +    this.port = uri.getPort();
    +    if (port == -1) {
    +      if (scheme.equalsIgnoreCase("http")) {
    +        this.port = 80;
    +      } else if (scheme.equalsIgnoreCase("https")) {
    +        this.port = 443;
    +      }
    +    }
    +
    +    bootstrap = new Bootstrap()
    +        .group(
    +            NettyUtils.getSharedEventLoopGroup(NettyUtils.GROUP.FETCHER,
    +                conf.getIntVar(ConfVars.SHUFFLE_RPC_CLIENT_WORKER_THREAD_NUM)))
    +        .channel(NioSocketChannel.class)
    +        .option(ChannelOption.ALLOCATOR, NettyUtils.ALLOCATOR)
    +        .option(ChannelOption.CONNECT_TIMEOUT_MILLIS,
    +            conf.getIntVar(ConfVars.SHUFFLE_FETCHER_CONNECT_TIMEOUT) * 1000)
    +        .option(ChannelOption.SO_RCVBUF, 1048576) // set 1M
    +        .option(ChannelOption.TCP_NODELAY, true);
    +  }
    +
    +  public LocalFetcher(TajoConf conf, URI uri, ExecutionBlockContext executionBlockContext, String tableName) {
    +    super(conf, uri);
    +    this.localFileSystem = executionBlockContext.getLocalFS();
    +    this.localDirAllocator = executionBlockContext.getLocalDirAllocator();
    +    this.maxUrlLength = conf.getIntVar(ConfVars.PULLSERVER_FETCH_URL_MAX_LENGTH);
    +    this.tableName = tableName;
    +
    +    Optional<TajoPullServerService> optional = executionBlockContext.getSharedResource().getPullServerService();
    +    if (optional.isPresent()) {
    +      // local pull server service
    +      this.pullServerService = optional.get();
    +      this.host = null;
    +      this.bootstrap = null;
    +
    +    } else if (PullServerUtil.useExternalPullServerService(conf)) {
    +      // external pull server service
    +      pullServerService = null;
    +
    +      String scheme = uri.getScheme() == null ? "http" : uri.getScheme();
    +      this.host = uri.getHost() == null ? "localhost" : uri.getHost();
    +      this.port = uri.getPort();
    +      if (port == -1) {
    +        if (scheme.equalsIgnoreCase("http")) {
    +          this.port = 80;
    +        } else if (scheme.equalsIgnoreCase("https")) {
    +          this.port = 443;
    +        }
    +      }
    +
    +      bootstrap = new Bootstrap()
    +          .group(
    +              NettyUtils.getSharedEventLoopGroup(NettyUtils.GROUP.FETCHER,
    +                  conf.getIntVar(ConfVars.SHUFFLE_RPC_CLIENT_WORKER_THREAD_NUM)))
    +          .channel(NioSocketChannel.class)
    +          .option(ChannelOption.ALLOCATOR, NettyUtils.ALLOCATOR)
    +          .option(ChannelOption.CONNECT_TIMEOUT_MILLIS,
    +              conf.getIntVar(ConfVars.SHUFFLE_FETCHER_CONNECT_TIMEOUT) * 1000)
    +          .option(ChannelOption.SO_RCVBUF, 1048576) // set 1M
    +          .option(ChannelOption.TCP_NODELAY, true);
    +    } else {
    +      endFetch(FetcherState.FETCH_FAILED);
    +      throw new TajoInternalError("Pull server service is not initialized");
    +    }
    +  }
    +
    +  @Override
    +  public List<FileChunk> get() throws IOException {
    +    return pullServerService != null ? getDirect() : getFromFetchURI();
    --- End diff --
    
    Why local fetcher connect to remote server?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/1001#discussion_r60541663
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/worker/TaskImpl.java ---
    @@ -696,44 +696,48 @@ private synchronized void fetcherFinished(TaskAttemptContext ctx) {
     
               WorkerConnectionInfo conn = executionBlockContext.getWorkerContext().getConnectionInfo();
               if (NetUtils.isLocalAddress(address) && conn.getPullServerPort() == uri.getPort()) {
    -
    -            List<FileChunk> localChunkCandidates = getLocalStoredFileChunk(uri, systemConf);
    -
    -            for (FileChunk localChunk : localChunkCandidates) {
    -              // When a range request is out of range, storeChunk will be NULL. This case is normal state.
    -              // So, we should skip and don't need to create storeChunk.
    -              if (localChunk == null || localChunk.length() == 0) {
    -                continue;
    -              }
    -
    -              if (localChunk.getFile() != null && localChunk.startOffset() > -1) {
    -                localChunk.setFromRemote(false);
    -                localStoreChunkCount++;
    -              } else {
    -                localChunk = new FileChunk(defaultStoreFile, 0, -1);
    -                localChunk.setFromRemote(true);
    -              }
    -              localChunk.setEbId(f.getName());
    -              storeChunkList.add(localChunk);
    -            }
    +            localStoreChunkCount++;
    +            runnerList.add(new LocalFetcher(systemConf, uri, executionBlockContext, f.getName()));
    +
    +//            List<FileChunk> localChunkCandidates = getLocalStoredFileChunk(uri, systemConf);
    --- End diff --
    
    This code is not used anymore. I forgot to remove it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on the pull request:

    https://github.com/apache/tajo/pull/1001#issuecomment-214944710
  
    @jinossy, thanks. I've fixed NPE problem.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by jinossy <gi...@git.apache.org>.
Github user jinossy commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/1001#discussion_r60542686
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/worker/LocalFetcher.java ---
    @@ -0,0 +1,451 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +  package org.apache.tajo.worker;
    +
    +import com.google.common.annotations.VisibleForTesting;
    +import com.google.gson.Gson;
    +import io.netty.bootstrap.Bootstrap;
    +import io.netty.buffer.ByteBuf;
    +import io.netty.channel.*;
    +import io.netty.channel.socket.nio.NioSocketChannel;
    +import io.netty.handler.codec.http.*;
    +import io.netty.handler.timeout.ReadTimeoutException;
    +import io.netty.handler.timeout.ReadTimeoutHandler;
    +import io.netty.util.ReferenceCountUtil;
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.fs.FileSystem;
    +import org.apache.hadoop.fs.LocalDirAllocator;
    +import org.apache.hadoop.fs.LocalFileSystem;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.tajo.TajoProtos;
    +import org.apache.tajo.TajoProtos.FetcherState;
    +import org.apache.tajo.conf.TajoConf;
    +import org.apache.tajo.conf.TajoConf.ConfVars;
    +import org.apache.tajo.exception.TajoInternalError;
    +import org.apache.tajo.pullserver.PullServerConstants;
    +import org.apache.tajo.pullserver.PullServerConstants.Param;
    +import org.apache.tajo.pullserver.PullServerUtil;
    +import org.apache.tajo.pullserver.PullServerUtil.PullServerParams;
    +import org.apache.tajo.pullserver.PullServerUtil.PullServerRequestURIBuilder;
    +import org.apache.tajo.pullserver.TajoPullServerService;
    +import org.apache.tajo.pullserver.retriever.FileChunk;
    +import org.apache.tajo.pullserver.retriever.FileChunkMeta;
    +import org.apache.tajo.rpc.NettyUtils;
    +import org.apache.tajo.storage.HashShuffleAppenderManager;
    +import org.apache.tajo.storage.StorageUtil;
    +
    +import java.io.File;
    +import java.io.IOException;
    +import java.net.InetSocketAddress;
    +import java.net.URI;
    +import java.nio.charset.Charset;
    +import java.util.ArrayList;
    +import java.util.List;
    +import java.util.Optional;
    +import java.util.concurrent.ExecutionException;
    +import java.util.concurrent.TimeUnit;
    +
    +public class LocalFetcher extends AbstractFetcher {
    +
    +  private final static Log LOG = LogFactory.getLog(LocalFetcher.class);
    +
    +//  private final ExecutionBlockContext executionBlockContext;
    +  private final TajoPullServerService pullServerService;
    +
    +  private final String host;
    +  private int port;
    +  private final Bootstrap bootstrap;
    +  private final int maxUrlLength;
    +  private final List<FileChunkMeta> chunkMetas = new ArrayList<>();
    +  private final String tableName;
    +  private final FileSystem localFileSystem;
    +  private final LocalDirAllocator localDirAllocator;
    +
    +  @VisibleForTesting
    +  public LocalFetcher(TajoConf conf, URI uri, String tableName) throws IOException {
    +    super(conf, uri);
    +    this.maxUrlLength = conf.getIntVar(ConfVars.PULLSERVER_FETCH_URL_MAX_LENGTH);
    +    this.tableName = tableName;
    +    this.localFileSystem = new LocalFileSystem();
    +    this.localDirAllocator = new LocalDirAllocator(ConfVars.WORKER_TEMPORAL_DIR.varname);
    +    this.pullServerService = null;
    +
    +    String scheme = uri.getScheme() == null ? "http" : uri.getScheme();
    +    this.host = uri.getHost() == null ? "localhost" : uri.getHost();
    +    this.port = uri.getPort();
    +    if (port == -1) {
    +      if (scheme.equalsIgnoreCase("http")) {
    +        this.port = 80;
    +      } else if (scheme.equalsIgnoreCase("https")) {
    +        this.port = 443;
    +      }
    +    }
    +
    +    bootstrap = new Bootstrap()
    +        .group(
    +            NettyUtils.getSharedEventLoopGroup(NettyUtils.GROUP.FETCHER,
    +                conf.getIntVar(ConfVars.SHUFFLE_RPC_CLIENT_WORKER_THREAD_NUM)))
    +        .channel(NioSocketChannel.class)
    +        .option(ChannelOption.ALLOCATOR, NettyUtils.ALLOCATOR)
    +        .option(ChannelOption.CONNECT_TIMEOUT_MILLIS,
    +            conf.getIntVar(ConfVars.SHUFFLE_FETCHER_CONNECT_TIMEOUT) * 1000)
    +        .option(ChannelOption.SO_RCVBUF, 1048576) // set 1M
    +        .option(ChannelOption.TCP_NODELAY, true);
    +  }
    +
    +  public LocalFetcher(TajoConf conf, URI uri, ExecutionBlockContext executionBlockContext, String tableName) {
    +    super(conf, uri);
    +    this.localFileSystem = executionBlockContext.getLocalFS();
    +    this.localDirAllocator = executionBlockContext.getLocalDirAllocator();
    +    this.maxUrlLength = conf.getIntVar(ConfVars.PULLSERVER_FETCH_URL_MAX_LENGTH);
    +    this.tableName = tableName;
    +
    +    Optional<TajoPullServerService> optional = executionBlockContext.getSharedResource().getPullServerService();
    +    if (optional.isPresent()) {
    +      // local pull server service
    +      this.pullServerService = optional.get();
    +      this.host = null;
    +      this.bootstrap = null;
    +
    +    } else if (PullServerUtil.useExternalPullServerService(conf)) {
    +      // external pull server service
    +      pullServerService = null;
    +
    +      String scheme = uri.getScheme() == null ? "http" : uri.getScheme();
    +      this.host = uri.getHost() == null ? "localhost" : uri.getHost();
    +      this.port = uri.getPort();
    +      if (port == -1) {
    +        if (scheme.equalsIgnoreCase("http")) {
    +          this.port = 80;
    +        } else if (scheme.equalsIgnoreCase("https")) {
    +          this.port = 443;
    +        }
    +      }
    +
    +      bootstrap = new Bootstrap()
    +          .group(
    +              NettyUtils.getSharedEventLoopGroup(NettyUtils.GROUP.FETCHER,
    +                  conf.getIntVar(ConfVars.SHUFFLE_RPC_CLIENT_WORKER_THREAD_NUM)))
    +          .channel(NioSocketChannel.class)
    +          .option(ChannelOption.ALLOCATOR, NettyUtils.ALLOCATOR)
    +          .option(ChannelOption.CONNECT_TIMEOUT_MILLIS,
    +              conf.getIntVar(ConfVars.SHUFFLE_FETCHER_CONNECT_TIMEOUT) * 1000)
    +          .option(ChannelOption.SO_RCVBUF, 1048576) // set 1M
    +          .option(ChannelOption.TCP_NODELAY, true);
    +    } else {
    +      endFetch(FetcherState.FETCH_FAILED);
    +      throw new TajoInternalError("Pull server service is not initialized");
    +    }
    +  }
    +
    +  @Override
    +  public List<FileChunk> get() throws IOException {
    +    return pullServerService != null ? getDirect() : getFromFetchURI();
    --- End diff --
    
    OK, thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/1001#discussion_r60859841
  
    --- Diff: tajo-yarn/pom.xml ---
    @@ -0,0 +1,185 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    +<!--
    +  ~ Licensed to the Apache Software Foundation (ASF) under one
    +  ~ or more contributor license agreements.  See the NOTICE file
    +  ~ distributed with this work for additional information
    +  ~ regarding copyright ownership.  The ASF licenses this file
    +  ~ to you under the Apache License, Version 2.0 (the
    +  ~ "License"); you may not use this file except in compliance
    +  ~ with the License.  You may obtain a copy of the License at
    +  ~
    +  ~     http://www.apache.org/licenses/LICENSE-2.0
    +  ~
    +  ~ Unless required by applicable law or agreed to in writing, software
    +  ~ distributed under the License is distributed on an "AS IS" BASIS,
    +  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +  ~ See the License for the specific language governing permissions and
    +  ~ limitations under the License.
    +  -->
    +
    +<project xmlns="http://maven.apache.org/POM/4.0.0"
    +         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    +         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +  <parent>
    +    <artifactId>tajo-project</artifactId>
    +    <groupId>org.apache.tajo</groupId>
    +    <version>0.12.0-SNAPSHOT</version>
    +    <relativePath>../tajo-project</relativePath>
    +  </parent>
    +  <modelVersion>4.0.0</modelVersion>
    +  <artifactId>tajo-yarn</artifactId>
    +  <packaging>jar</packaging>
    +  <name>Tajo Yarn</name>
    +  <properties>
    +    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    +    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    +  </properties>
    +
    +  <build>
    +    <plugins>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-compiler-plugin</artifactId>
    +      </plugin>
    +      <plugin>
    +        <groupId>org.apache.rat</groupId>
    +        <artifactId>apache-rat-plugin</artifactId>
    +        <executions>
    +          <execution>
    +            <phase>verify</phase>
    +            <goals>
    +              <goal>check</goal>
    +            </goals>
    +          </execution>
    +        </executions>
    +      </plugin>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-surefire-report-plugin</artifactId>
    +      </plugin>
    +      <plugin>
    +        <artifactId>maven-assembly-plugin</artifactId>
    +        <version>2.4.1</version>
    +        <configuration>
    +          <descriptorRefs>
    +            <descriptorRef>jar-with-dependencies</descriptorRef>
    +          </descriptorRefs>
    +        </configuration>
    +        <executions>
    +          <execution>
    +            <id>make-assembly</id>
    +            <phase>package</phase>
    +            <goals>
    +              <goal>single</goal>
    +            </goals>
    +          </execution>
    +        </executions>
    +      </plugin>
    +    </plugins>
    +  </build>
    +
    +  <dependencies>
    +    <dependency>
    +      <groupId>org.apache.hadoop</groupId>
    +      <artifactId>hadoop-mapreduce-client-shuffle</artifactId>
    +      <scope>provided</scope>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.hadoop</groupId>
    +      <artifactId>hadoop-yarn-api</artifactId>
    +      <scope>provided</scope>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.hadoop</groupId>
    +      <artifactId>hadoop-common</artifactId>
    +      <scope>provided</scope>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.tajo</groupId>
    +      <artifactId>tajo-common</artifactId>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.tajo</groupId>
    +      <artifactId>tajo-storage-common</artifactId>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.tajo</groupId>
    +      <artifactId>tajo-storage-hdfs</artifactId>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.tajo</groupId>
    +      <artifactId>tajo-pullserver</artifactId>
    +    </dependency>
    +  </dependencies>
    +  <profiles>
    +    <profile>
    +      <id>docs</id>
    +      <activation>
    +        <activeByDefault>false</activeByDefault>
    +      </activation>
    +      <build>
    +        <plugins>
    +          <plugin>
    +            <groupId>org.apache.maven.plugins</groupId>
    +            <artifactId>maven-javadoc-plugin</artifactId>
    +            <executions>
    +              <execution>
    +                <!-- build javadoc jars per jar for publishing to maven -->
    +                <id>module-javadocs</id>
    +                <phase>package</phase>
    +                <goals>
    +                  <goal>jar</goal>
    +                </goals>
    +                <configuration>
    +                  <destDir>${project.build.directory}</destDir>
    +                </configuration>
    +              </execution>
    +            </executions>
    +          </plugin>
    +        </plugins>
    +      </build>
    +    </profile>
    +    <profile>
    +      <id>src</id>
    +      <activation>
    +        <activeByDefault>false</activeByDefault>
    +      </activation>
    +      <build>
    +        <plugins>
    +          <plugin>
    +            <groupId>org.apache.maven.plugins</groupId>
    +            <artifactId>maven-source-plugin</artifactId>
    +            <executions>
    +              <execution>
    +                <!-- builds source jars and attaches them to the project for publishing -->
    +                <id>tajo-java-sources</id>
    +                <phase>package</phase>
    +                <goals>
    +                  <goal>jar-no-fork</goal>
    +                </goals>
    +              </execution>
    +            </executions>
    +          </plugin>
    +        </plugins>
    +      </build>
    +    </profile>
    +  </profiles>
    +
    +  <reporting>
    +    <plugins>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-project-info-reports-plugin</artifactId>
    +        <version>2.4</version>
    +        <configuration>
    +          <dependencyLocationsEnabled>false</dependencyLocationsEnabled>
    +        </configuration>
    +      </plugin>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-surefire-report-plugin</artifactId>
    +      </plugin>
    +    </plugins>
    +  </reporting>
    +
    +</project>
    --- End diff --
    
    Thanks. I've cleaned up the dependency of module yarn.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on the pull request:

    https://github.com/apache/tajo/pull/1001#issuecomment-214966835
  
    @jinossy, thanks for your review!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by jinossy <gi...@git.apache.org>.
Github user jinossy commented on the pull request:

    https://github.com/apache/tajo/pull/1001#issuecomment-214966415
  
    +1 Great work!
    Thanks for your effort!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by jinossy <gi...@git.apache.org>.
Github user jinossy commented on the pull request:

    https://github.com/apache/tajo/pull/1001#issuecomment-214130995
  
    @jihoonson 
    I've test on hadoop-2.7.2 and I found following error
    ```
    2016-04-25 13:49:36,698 WARN org.apache.tajo.pullserver.PullServerUtil: Failed to manage OS cache for /data02/tajo/q_1461559624789_0001/output/1/hash-shuffle/30/158
    java.lang.NullPointerException
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.lang.reflect.Method.invoke(Method.java:498)
    	at org.apache.tajo.pullserver.PullServerUtil.posixFadviseIfPossible(PullServerUtil.java:84)
    	at org.apache.tajo.yarn.FadvisedFileRegion.transferSuccessful(FadvisedFileRegion.java:164)
    	at org.apache.tajo.yarn.FileCloseListener.operationComplete(FileCloseListener.java:37)
    	at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:427)
    	at org.jboss.netty.channel.DefaultChannelFuture.addListener(DefaultChannelFuture.java:145)
    	at org.apache.tajo.yarn.TajoPullServerService$PullServer.sendFile(TajoPullServerService.java:533)
    	at org.apache.tajo.yarn.TajoPullServerService$PullServer.handleChunkRequest(TajoPullServerService.java:506)
    	at org.apache.tajo.yarn.TajoPullServerService$PullServer.messageReceived(TajoPullServerService.java:369)
    	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
    	at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
    	at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
    	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
    	at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
    	at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)
    	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
    	at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
    	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
    	at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
    	at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
    	at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
    	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    	at org.jboss.netty.handler.codec.http.HttpServerCodec.handleUpstream(HttpServerCodec.java:56)
    	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
    	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
    	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
    	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
    	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
    	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    	at java.lang.Thread.run(Thread.java:745)
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by jinossy <gi...@git.apache.org>.
Github user jinossy commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/1001#discussion_r60856908
  
    --- Diff: tajo-yarn/pom.xml ---
    @@ -0,0 +1,185 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    +<!--
    +  ~ Licensed to the Apache Software Foundation (ASF) under one
    +  ~ or more contributor license agreements.  See the NOTICE file
    +  ~ distributed with this work for additional information
    +  ~ regarding copyright ownership.  The ASF licenses this file
    +  ~ to you under the Apache License, Version 2.0 (the
    +  ~ "License"); you may not use this file except in compliance
    +  ~ with the License.  You may obtain a copy of the License at
    +  ~
    +  ~     http://www.apache.org/licenses/LICENSE-2.0
    +  ~
    +  ~ Unless required by applicable law or agreed to in writing, software
    +  ~ distributed under the License is distributed on an "AS IS" BASIS,
    +  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +  ~ See the License for the specific language governing permissions and
    +  ~ limitations under the License.
    +  -->
    +
    +<project xmlns="http://maven.apache.org/POM/4.0.0"
    +         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    +         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +  <parent>
    +    <artifactId>tajo-project</artifactId>
    +    <groupId>org.apache.tajo</groupId>
    +    <version>0.12.0-SNAPSHOT</version>
    +    <relativePath>../tajo-project</relativePath>
    +  </parent>
    +  <modelVersion>4.0.0</modelVersion>
    +  <artifactId>tajo-yarn</artifactId>
    +  <packaging>jar</packaging>
    +  <name>Tajo Yarn</name>
    +  <properties>
    +    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    +    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    +  </properties>
    +
    +  <build>
    +    <plugins>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-compiler-plugin</artifactId>
    +      </plugin>
    +      <plugin>
    +        <groupId>org.apache.rat</groupId>
    +        <artifactId>apache-rat-plugin</artifactId>
    +        <executions>
    +          <execution>
    +            <phase>verify</phase>
    +            <goals>
    +              <goal>check</goal>
    +            </goals>
    +          </execution>
    +        </executions>
    +      </plugin>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-surefire-report-plugin</artifactId>
    +      </plugin>
    +      <plugin>
    +        <artifactId>maven-assembly-plugin</artifactId>
    +        <version>2.4.1</version>
    +        <configuration>
    +          <descriptorRefs>
    +            <descriptorRef>jar-with-dependencies</descriptorRef>
    +          </descriptorRefs>
    +        </configuration>
    +        <executions>
    +          <execution>
    +            <id>make-assembly</id>
    +            <phase>package</phase>
    +            <goals>
    +              <goal>single</goal>
    +            </goals>
    +          </execution>
    +        </executions>
    +      </plugin>
    +    </plugins>
    +  </build>
    +
    +  <dependencies>
    +    <dependency>
    +      <groupId>org.apache.hadoop</groupId>
    +      <artifactId>hadoop-mapreduce-client-shuffle</artifactId>
    +      <scope>provided</scope>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.hadoop</groupId>
    +      <artifactId>hadoop-yarn-api</artifactId>
    +      <scope>provided</scope>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.hadoop</groupId>
    +      <artifactId>hadoop-common</artifactId>
    +      <scope>provided</scope>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.tajo</groupId>
    +      <artifactId>tajo-common</artifactId>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.tajo</groupId>
    +      <artifactId>tajo-storage-common</artifactId>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.tajo</groupId>
    +      <artifactId>tajo-storage-hdfs</artifactId>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.tajo</groupId>
    +      <artifactId>tajo-pullserver</artifactId>
    +    </dependency>
    +  </dependencies>
    +  <profiles>
    +    <profile>
    +      <id>docs</id>
    +      <activation>
    +        <activeByDefault>false</activeByDefault>
    +      </activation>
    +      <build>
    +        <plugins>
    +          <plugin>
    +            <groupId>org.apache.maven.plugins</groupId>
    +            <artifactId>maven-javadoc-plugin</artifactId>
    +            <executions>
    +              <execution>
    +                <!-- build javadoc jars per jar for publishing to maven -->
    +                <id>module-javadocs</id>
    +                <phase>package</phase>
    +                <goals>
    +                  <goal>jar</goal>
    +                </goals>
    +                <configuration>
    +                  <destDir>${project.build.directory}</destDir>
    +                </configuration>
    +              </execution>
    +            </executions>
    +          </plugin>
    +        </plugins>
    +      </build>
    +    </profile>
    +    <profile>
    +      <id>src</id>
    +      <activation>
    +        <activeByDefault>false</activeByDefault>
    +      </activation>
    +      <build>
    +        <plugins>
    +          <plugin>
    +            <groupId>org.apache.maven.plugins</groupId>
    +            <artifactId>maven-source-plugin</artifactId>
    +            <executions>
    +              <execution>
    +                <!-- builds source jars and attaches them to the project for publishing -->
    +                <id>tajo-java-sources</id>
    +                <phase>package</phase>
    +                <goals>
    +                  <goal>jar-no-fork</goal>
    +                </goals>
    +              </execution>
    +            </executions>
    +          </plugin>
    +        </plugins>
    +      </build>
    +    </profile>
    +  </profiles>
    +
    +  <reporting>
    +    <plugins>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-project-info-reports-plugin</artifactId>
    +        <version>2.4</version>
    +        <configuration>
    +          <dependencyLocationsEnabled>false</dependencyLocationsEnabled>
    +        </configuration>
    +      </plugin>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-surefire-report-plugin</artifactId>
    +      </plugin>
    +    </plugins>
    +  </reporting>
    +
    +</project>
    --- End diff --
    
    @jihoonson 
    There are included too many dependencies. Could exclude unused module ?
    ```
    [INFO] ------------------------------------------------------------------------
    [INFO] Building Tajo Yarn 0.12.0-SNAPSHOT
    [INFO] ------------------------------------------------------------------------
    [INFO]
    [INFO] --- maven-dependency-plugin:2.4:tree (default-cli) @ tajo-yarn ---
    [INFO] org.apache.tajo:tajo-yarn:jar:0.12.0-SNAPSHOT
    [INFO] +- org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:2.7.2:provided
    [INFO] |  +- org.apache.hadoop:hadoop-yarn-server-common:jar:2.7.2:provided
    [INFO] |  |  \- org.apache.hadoop:hadoop-yarn-common:jar:2.7.2:provided
    [INFO] |  |     +- javax.xml.bind:jaxb-api:jar:2.2.2:provided
    [INFO] |  |     |  +- javax.xml.stream:stax-api:jar:1.0-2:provided
    [INFO] |  |     |  \- javax.activation:activation:jar:1.1:provided
    [INFO] |  |     +- com.sun.jersey:jersey-client:jar:1.9:provided
    [INFO] |  |     \- com.sun.jersey.contribs:jersey-guice:jar:1.9:provided
    [INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-common:jar:2.7.2:provided
    [INFO] |  |  \- org.apache.hadoop:hadoop-yarn-client:jar:2.7.2:provided
    [INFO] |  +- org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:provided
    [INFO] |  +- org.slf4j:slf4j-log4j12:jar:1.7.10:provided
    [INFO] |  +- org.apache.hadoop:hadoop-annotations:jar:2.7.2:provided
    [INFO] |  |  \- jdk.tools:jdk.tools:jar:1.8:system
    [INFO] |  \- com.google.inject.extensions:guice-servlet:jar:3.0:provided
    [INFO] |     \- com.google.inject:guice:jar:3.0:provided
    [INFO] |        +- javax.inject:javax.inject:jar:1:provided
    [INFO] |        \- aopalliance:aopalliance:jar:1.0:provided
    [INFO] +- org.apache.hadoop:hadoop-yarn-api:jar:2.7.2:provided
    [INFO] +- org.apache.hadoop:hadoop-common:jar:2.7.2:provided (scope not updated to compile)
    [INFO] |  +- commons-cli:commons-cli:jar:1.2:provided
    [INFO] |  +- org.apache.commons:commons-math3:jar:3.1.1:provided
    [INFO] |  +- xmlenc:xmlenc:jar:0.52:provided
    [INFO] |  +- commons-httpclient:commons-httpclient:jar:3.1:provided
    [INFO] |  +- commons-codec:commons-codec:jar:1.10:provided
    [INFO] |  +- commons-io:commons-io:jar:2.4:provided
    [INFO] |  +- commons-net:commons-net:jar:3.1:provided
    [INFO] |  +- javax.servlet:servlet-api:jar:2.5:provided
    [INFO] |  +- javax.servlet.jsp:jsp-api:jar:2.1:provided
    [INFO] |  +- com.sun.jersey:jersey-core:jar:1.9:provided
    [INFO] |  +- com.sun.jersey:jersey-json:jar:1.9:provided
    [INFO] |  |  +- org.codehaus.jettison:jettison:jar:1.1:provided
    [INFO] |  |  +- com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:provided
    [INFO] |  |  +- org.codehaus.jackson:jackson-jaxrs:jar:1.8.3:provided
    [INFO] |  |  \- org.codehaus.jackson:jackson-xc:jar:1.8.3:provided
    [INFO] |  +- com.sun.jersey:jersey-server:jar:1.9:provided
    [INFO] |  +- log4j:log4j:jar:1.2.17:provided
    [INFO] |  +- net.java.dev.jets3t:jets3t:jar:0.9.0:provided
    [INFO] |  |  +- org.apache.httpcomponents:httpclient:jar:4.1.2:provided
    [INFO] |  |  +- org.apache.httpcomponents:httpcore:jar:4.1.2:provided
    [INFO] |  |  \- com.jamesmurty.utils:java-xmlbuilder:jar:0.4:provided
    [INFO] |  +- commons-configuration:commons-configuration:jar:1.6:provided
    [INFO] |  |  +- commons-digester:commons-digester:jar:1.8:provided
    [INFO] |  |  |  \- commons-beanutils:commons-beanutils:jar:1.7.0:provided
    [INFO] |  |  \- commons-beanutils:commons-beanutils-core:jar:1.8.0:provided
    [INFO] |  +- org.apache.hadoop:hadoop-auth:jar:2.7.2:provided
    [INFO] |  |  +- org.apache.directory.server:apacheds-kerberos-codec:jar:2.0.0-M15:provided
    [INFO] |  |  |  +- org.apache.directory.server:apacheds-i18n:jar:2.0.0-M15:provided
    [INFO] |  |  |  +- org.apache.directory.api:api-asn1-api:jar:1.0.0-M20:provided
    [INFO] |  |  |  \- org.apache.directory.api:api-util:jar:1.0.0-M20:provided
    [INFO] |  |  \- org.apache.curator:curator-framework:jar:2.7.1:provided
    [INFO] |  +- com.jcraft:jsch:jar:0.1.42:provided
    [INFO] |  +- org.apache.curator:curator-client:jar:2.7.1:provided
    [INFO] |  +- org.apache.curator:curator-recipes:jar:2.7.1:provided
    [INFO] |  +- org.apache.htrace:htrace-core:jar:3.1.0-incubating:provided
    [INFO] |  +- org.apache.zookeeper:zookeeper:jar:3.4.6:provided
    [INFO] |  \- org.apache.commons:commons-compress:jar:1.4.1:provided
    [INFO] |     \- org.tukaani:xz:jar:1.0:provided
    [INFO] +- org.apache.tajo:tajo-common:jar:0.12.0-SNAPSHOT:compile
    [INFO] |  +- com.google.protobuf:protobuf-java:jar:2.5.0:compile
    [INFO] |  +- commons-logging:commons-logging:jar:1.1.1:compile
    [INFO] |  +- commons-logging:commons-logging-api:jar:1.1:compile
    [INFO] |  +- commons-lang:commons-lang:jar:2.6:compile
    [INFO] |  +- com.google.guava:guava:jar:11.0.2:compile
    [INFO] |  |  \- com.google.code.findbugs:jsr305:jar:3.0.0:compile
    [INFO] |  +- com.google.code.gson:gson:jar:2.2.2:compile
    [INFO] |  +- io.netty:netty-buffer:jar:4.0.34.Final:compile
    [INFO] |  |  \- io.netty:netty-common:jar:4.0.34.Final:compile
    [INFO] |  \- org.iq80.snappy:snappy:jar:0.4:compile
    [INFO] +- org.apache.tajo:tajo-storage-common:jar:0.12.0-SNAPSHOT:compile
    [INFO] |  +- org.apache.tajo:tajo-catalog-common:jar:0.12.0-SNAPSHOT:compile
    [INFO] |  +- org.apache.tajo:tajo-plan:jar:0.12.0-SNAPSHOT:compile
    [INFO] |  |  \- org.apache.tajo:tajo-algebra:jar:0.12.0-SNAPSHOT:compile
    [INFO] |  \- net.minidev:json-smart:jar:2.1.1:compile
    [INFO] |     \- net.minidev:asm:jar:1.0.2:compile
    [INFO] |        \- asm:asm:jar:3.1:compile
    [INFO] +- org.apache.tajo:tajo-storage-hdfs:jar:0.12.0-SNAPSHOT:compile
    [INFO] |  +- io.netty:netty-transport:jar:4.0.34.Final:compile
    [INFO] |  +- io.netty:netty-codec:jar:4.0.34.Final:compile
    [INFO] |  +- io.netty:netty-codec-http:jar:4.0.34.Final:compile
    [INFO] |  |  \- io.netty:netty-handler:jar:4.0.34.Final:compile
    [INFO] |  +- org.apache.avro:trevni-core:jar:1.7.3:compile
    [INFO] |  |  +- org.xerial.snappy:snappy-java:jar:1.0.4.1:compile
    [INFO] |  |  \- org.slf4j:slf4j-api:jar:1.7.10:compile
    [INFO] |  +- org.apache.avro:trevni-avro:jar:1.7.3:compile
    [INFO] |  |  \- org.apache.avro:avro-mapred:jar:1.7.3:compile
    [INFO] |  |     +- org.apache.avro:avro-ipc:jar:1.7.3:compile
    [INFO] |  |     |  +- org.apache.avro:avro:jar:1.7.4:compile
    [INFO] |  |     |  |  \- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
    [INFO] |  |     |  +- org.mortbay.jetty:jetty:jar:6.1.26:compile
    [INFO] |  |     |  +- org.mortbay.jetty:jetty-util:jar:6.1.26:compile
    [INFO] |  |     |  +- io.netty:netty:jar:3.6.2.Final:compile
    [INFO] |  |     |  +- org.apache.velocity:velocity:jar:1.7:compile
    [INFO] |  |     |  |  \- commons-collections:commons-collections:jar:3.2.2:compile
    [INFO] |  |     |  \- org.mortbay.jetty:servlet-api:jar:2.5-20081211:compile
    [INFO] |  |     +- org.apache.avro:avro-ipc:jar:tests:1.7.3:compile
    [INFO] |  |     +- org.codehaus.jackson:jackson-core-asl:jar:1.8.8:compile
    [INFO] |  |     \- org.codehaus.jackson:jackson-mapper-asl:jar:1.8.8:compile
    [INFO] |  +- org.apache.parquet:parquet-hadoop-bundle:jar:1.8.1:compile
    [INFO] |  +- org.apache.hive:hive-orc:jar:2.0.0:compile
    [INFO] |  \- org.apache.hive:hive-storage-api:jar:2.0.0:compile
    [INFO] \- org.apache.tajo:tajo-pullserver:jar:0.12.0-SNAPSHOT:compile
    [INFO]    \- org.apache.tajo:tajo-rpc-protobuf:jar:0.12.0-SNAPSHOT:compile
    [INFO]       \- org.apache.tajo:tajo-rpc-common:jar:0.12.0-SNAPSHOT:compile
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/1001#discussion_r60542212
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/worker/LocalFetcher.java ---
    @@ -0,0 +1,451 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +  package org.apache.tajo.worker;
    +
    +import com.google.common.annotations.VisibleForTesting;
    +import com.google.gson.Gson;
    +import io.netty.bootstrap.Bootstrap;
    +import io.netty.buffer.ByteBuf;
    +import io.netty.channel.*;
    +import io.netty.channel.socket.nio.NioSocketChannel;
    +import io.netty.handler.codec.http.*;
    +import io.netty.handler.timeout.ReadTimeoutException;
    +import io.netty.handler.timeout.ReadTimeoutHandler;
    +import io.netty.util.ReferenceCountUtil;
    +import org.apache.commons.logging.Log;
    +import org.apache.commons.logging.LogFactory;
    +import org.apache.hadoop.fs.FileSystem;
    +import org.apache.hadoop.fs.LocalDirAllocator;
    +import org.apache.hadoop.fs.LocalFileSystem;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.tajo.TajoProtos;
    +import org.apache.tajo.TajoProtos.FetcherState;
    +import org.apache.tajo.conf.TajoConf;
    +import org.apache.tajo.conf.TajoConf.ConfVars;
    +import org.apache.tajo.exception.TajoInternalError;
    +import org.apache.tajo.pullserver.PullServerConstants;
    +import org.apache.tajo.pullserver.PullServerConstants.Param;
    +import org.apache.tajo.pullserver.PullServerUtil;
    +import org.apache.tajo.pullserver.PullServerUtil.PullServerParams;
    +import org.apache.tajo.pullserver.PullServerUtil.PullServerRequestURIBuilder;
    +import org.apache.tajo.pullserver.TajoPullServerService;
    +import org.apache.tajo.pullserver.retriever.FileChunk;
    +import org.apache.tajo.pullserver.retriever.FileChunkMeta;
    +import org.apache.tajo.rpc.NettyUtils;
    +import org.apache.tajo.storage.HashShuffleAppenderManager;
    +import org.apache.tajo.storage.StorageUtil;
    +
    +import java.io.File;
    +import java.io.IOException;
    +import java.net.InetSocketAddress;
    +import java.net.URI;
    +import java.nio.charset.Charset;
    +import java.util.ArrayList;
    +import java.util.List;
    +import java.util.Optional;
    +import java.util.concurrent.ExecutionException;
    +import java.util.concurrent.TimeUnit;
    +
    +public class LocalFetcher extends AbstractFetcher {
    +
    +  private final static Log LOG = LogFactory.getLog(LocalFetcher.class);
    +
    +//  private final ExecutionBlockContext executionBlockContext;
    +  private final TajoPullServerService pullServerService;
    +
    +  private final String host;
    +  private int port;
    +  private final Bootstrap bootstrap;
    +  private final int maxUrlLength;
    +  private final List<FileChunkMeta> chunkMetas = new ArrayList<>();
    +  private final String tableName;
    +  private final FileSystem localFileSystem;
    +  private final LocalDirAllocator localDirAllocator;
    +
    +  @VisibleForTesting
    +  public LocalFetcher(TajoConf conf, URI uri, String tableName) throws IOException {
    +    super(conf, uri);
    +    this.maxUrlLength = conf.getIntVar(ConfVars.PULLSERVER_FETCH_URL_MAX_LENGTH);
    +    this.tableName = tableName;
    +    this.localFileSystem = new LocalFileSystem();
    +    this.localDirAllocator = new LocalDirAllocator(ConfVars.WORKER_TEMPORAL_DIR.varname);
    +    this.pullServerService = null;
    +
    +    String scheme = uri.getScheme() == null ? "http" : uri.getScheme();
    +    this.host = uri.getHost() == null ? "localhost" : uri.getHost();
    +    this.port = uri.getPort();
    +    if (port == -1) {
    +      if (scheme.equalsIgnoreCase("http")) {
    +        this.port = 80;
    +      } else if (scheme.equalsIgnoreCase("https")) {
    +        this.port = 443;
    +      }
    +    }
    +
    +    bootstrap = new Bootstrap()
    +        .group(
    +            NettyUtils.getSharedEventLoopGroup(NettyUtils.GROUP.FETCHER,
    +                conf.getIntVar(ConfVars.SHUFFLE_RPC_CLIENT_WORKER_THREAD_NUM)))
    +        .channel(NioSocketChannel.class)
    +        .option(ChannelOption.ALLOCATOR, NettyUtils.ALLOCATOR)
    +        .option(ChannelOption.CONNECT_TIMEOUT_MILLIS,
    +            conf.getIntVar(ConfVars.SHUFFLE_FETCHER_CONNECT_TIMEOUT) * 1000)
    +        .option(ChannelOption.SO_RCVBUF, 1048576) // set 1M
    +        .option(ChannelOption.TCP_NODELAY, true);
    +  }
    +
    +  public LocalFetcher(TajoConf conf, URI uri, ExecutionBlockContext executionBlockContext, String tableName) {
    +    super(conf, uri);
    +    this.localFileSystem = executionBlockContext.getLocalFS();
    +    this.localDirAllocator = executionBlockContext.getLocalDirAllocator();
    +    this.maxUrlLength = conf.getIntVar(ConfVars.PULLSERVER_FETCH_URL_MAX_LENGTH);
    +    this.tableName = tableName;
    +
    +    Optional<TajoPullServerService> optional = executionBlockContext.getSharedResource().getPullServerService();
    +    if (optional.isPresent()) {
    +      // local pull server service
    +      this.pullServerService = optional.get();
    +      this.host = null;
    +      this.bootstrap = null;
    +
    +    } else if (PullServerUtil.useExternalPullServerService(conf)) {
    +      // external pull server service
    +      pullServerService = null;
    +
    +      String scheme = uri.getScheme() == null ? "http" : uri.getScheme();
    +      this.host = uri.getHost() == null ? "localhost" : uri.getHost();
    +      this.port = uri.getPort();
    +      if (port == -1) {
    +        if (scheme.equalsIgnoreCase("http")) {
    +          this.port = 80;
    +        } else if (scheme.equalsIgnoreCase("https")) {
    +          this.port = 443;
    +        }
    +      }
    +
    +      bootstrap = new Bootstrap()
    +          .group(
    +              NettyUtils.getSharedEventLoopGroup(NettyUtils.GROUP.FETCHER,
    +                  conf.getIntVar(ConfVars.SHUFFLE_RPC_CLIENT_WORKER_THREAD_NUM)))
    +          .channel(NioSocketChannel.class)
    +          .option(ChannelOption.ALLOCATOR, NettyUtils.ALLOCATOR)
    +          .option(ChannelOption.CONNECT_TIMEOUT_MILLIS,
    +              conf.getIntVar(ConfVars.SHUFFLE_FETCHER_CONNECT_TIMEOUT) * 1000)
    +          .option(ChannelOption.SO_RCVBUF, 1048576) // set 1M
    +          .option(ChannelOption.TCP_NODELAY, true);
    +    } else {
    +      endFetch(FetcherState.FETCH_FAILED);
    +      throw new TajoInternalError("Pull server service is not initialized");
    +    }
    +  }
    +
    +  @Override
    +  public List<FileChunk> get() throws IOException {
    +    return pullServerService != null ? getDirect() : getFromFetchURI();
    --- End diff --
    
    There are two cases for local fetcher. 
    * When an internal pull server is running, local fetchers can retrieve data directly. 
    * When an external pull server is running, 
     * If the shuffle type is hash, local fetchers can still retrieve data directly.
     * If the shuffle type is range, local fetchers need to get meta information of data via HTTP. Once the meta information is retrieved, they can read data directly. 
    
    I'll add this description.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2122: PullServer as an Auxiliary service o...

Posted by jinossy <gi...@git.apache.org>.
Github user jinossy commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/1001#discussion_r60860272
  
    --- Diff: tajo-yarn/pom.xml ---
    @@ -0,0 +1,185 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    +<!--
    +  ~ Licensed to the Apache Software Foundation (ASF) under one
    +  ~ or more contributor license agreements.  See the NOTICE file
    +  ~ distributed with this work for additional information
    +  ~ regarding copyright ownership.  The ASF licenses this file
    +  ~ to you under the Apache License, Version 2.0 (the
    +  ~ "License"); you may not use this file except in compliance
    +  ~ with the License.  You may obtain a copy of the License at
    +  ~
    +  ~     http://www.apache.org/licenses/LICENSE-2.0
    +  ~
    +  ~ Unless required by applicable law or agreed to in writing, software
    +  ~ distributed under the License is distributed on an "AS IS" BASIS,
    +  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +  ~ See the License for the specific language governing permissions and
    +  ~ limitations under the License.
    +  -->
    +
    +<project xmlns="http://maven.apache.org/POM/4.0.0"
    +         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    +         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +  <parent>
    +    <artifactId>tajo-project</artifactId>
    +    <groupId>org.apache.tajo</groupId>
    +    <version>0.12.0-SNAPSHOT</version>
    +    <relativePath>../tajo-project</relativePath>
    +  </parent>
    +  <modelVersion>4.0.0</modelVersion>
    +  <artifactId>tajo-yarn</artifactId>
    +  <packaging>jar</packaging>
    +  <name>Tajo Yarn</name>
    +  <properties>
    +    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    +    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    +  </properties>
    +
    +  <build>
    +    <plugins>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-compiler-plugin</artifactId>
    +      </plugin>
    +      <plugin>
    +        <groupId>org.apache.rat</groupId>
    +        <artifactId>apache-rat-plugin</artifactId>
    +        <executions>
    +          <execution>
    +            <phase>verify</phase>
    +            <goals>
    +              <goal>check</goal>
    +            </goals>
    +          </execution>
    +        </executions>
    +      </plugin>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-surefire-report-plugin</artifactId>
    +      </plugin>
    +      <plugin>
    +        <artifactId>maven-assembly-plugin</artifactId>
    +        <version>2.4.1</version>
    +        <configuration>
    +          <descriptorRefs>
    +            <descriptorRef>jar-with-dependencies</descriptorRef>
    +          </descriptorRefs>
    +        </configuration>
    +        <executions>
    +          <execution>
    +            <id>make-assembly</id>
    +            <phase>package</phase>
    +            <goals>
    +              <goal>single</goal>
    +            </goals>
    +          </execution>
    +        </executions>
    +      </plugin>
    +    </plugins>
    +  </build>
    +
    +  <dependencies>
    +    <dependency>
    +      <groupId>org.apache.hadoop</groupId>
    +      <artifactId>hadoop-mapreduce-client-shuffle</artifactId>
    +      <scope>provided</scope>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.hadoop</groupId>
    +      <artifactId>hadoop-yarn-api</artifactId>
    +      <scope>provided</scope>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.hadoop</groupId>
    +      <artifactId>hadoop-common</artifactId>
    +      <scope>provided</scope>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.tajo</groupId>
    +      <artifactId>tajo-common</artifactId>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.tajo</groupId>
    +      <artifactId>tajo-storage-common</artifactId>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.tajo</groupId>
    +      <artifactId>tajo-storage-hdfs</artifactId>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.apache.tajo</groupId>
    +      <artifactId>tajo-pullserver</artifactId>
    +    </dependency>
    +  </dependencies>
    +  <profiles>
    +    <profile>
    +      <id>docs</id>
    +      <activation>
    +        <activeByDefault>false</activeByDefault>
    +      </activation>
    +      <build>
    +        <plugins>
    +          <plugin>
    +            <groupId>org.apache.maven.plugins</groupId>
    +            <artifactId>maven-javadoc-plugin</artifactId>
    +            <executions>
    +              <execution>
    +                <!-- build javadoc jars per jar for publishing to maven -->
    +                <id>module-javadocs</id>
    +                <phase>package</phase>
    +                <goals>
    +                  <goal>jar</goal>
    +                </goals>
    +                <configuration>
    +                  <destDir>${project.build.directory}</destDir>
    +                </configuration>
    +              </execution>
    +            </executions>
    +          </plugin>
    +        </plugins>
    +      </build>
    +    </profile>
    +    <profile>
    +      <id>src</id>
    +      <activation>
    +        <activeByDefault>false</activeByDefault>
    +      </activation>
    +      <build>
    +        <plugins>
    +          <plugin>
    +            <groupId>org.apache.maven.plugins</groupId>
    +            <artifactId>maven-source-plugin</artifactId>
    +            <executions>
    +              <execution>
    +                <!-- builds source jars and attaches them to the project for publishing -->
    +                <id>tajo-java-sources</id>
    +                <phase>package</phase>
    +                <goals>
    +                  <goal>jar-no-fork</goal>
    +                </goals>
    +              </execution>
    +            </executions>
    +          </plugin>
    +        </plugins>
    +      </build>
    +    </profile>
    +  </profiles>
    +
    +  <reporting>
    +    <plugins>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-project-info-reports-plugin</artifactId>
    +        <version>2.4</version>
    +        <configuration>
    +          <dependencyLocationsEnabled>false</dependencyLocationsEnabled>
    +        </configuration>
    +      </plugin>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-surefire-report-plugin</artifactId>
    +      </plugin>
    +    </plugins>
    +  </reporting>
    +
    +</project>
    --- End diff --
    
    Thanks for your quick update. I’ll test on cluster


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---