Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2021/12/15 16:35:56 UTC

[GitHub] [drill] Z0ltrix opened a new pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Z0ltrix opened a new pull request #2406:
URL: https://github.com/apache/drill/pull/2406


   # [DRILL-8077](https://issues.apache.org/jira/browse/DRILL-8077): Thousands of CLOSE_WAIT connections to HDFS Datanode
   
   ## Description
   
   When Drill runs in an IPv6 environment, it tries to reach the HDFS Datanode over IPv6, but the Datanode does not close the connection correctly. This leads to thousands of CLOSE_WAIT IPv6 connections from Drillbit to Datanode, and after a while the machine runs out of ports and stops working.
   
   To avoid this situation, we can prefer IPv4 over IPv6 in drill-env.sh.
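
A minimal sketch of what such a `drill-env.sh` entry might look like, assuming the launch scripts pass `DRILL_JAVA_OPTS` through to the JVM (it is the variable the patch itself touches). The helper name `prefer_ipv4` and the variable `IPV4_OPTS` are made up for illustration; the two `-D` flags are the ones from this PR.

```shell
# Hypothetical drill-env.sh fragment (helper and variable names are illustrative).
IPV4_OPTS="-Djava.net.preferIPv4Stack=true -Djava.net.preferIPv6Addresses=false"

# Print the given option string with the IPv4 flags appended, unless the
# caller already set java.net.preferIPv4Stack explicitly.
prefer_ipv4() {
  case " $1 " in
    *" -Djava.net.preferIPv4Stack="*) printf '%s\n' "$1" ;;
    *) printf '%s\n' "${1:+$1 }$IPV4_OPTS" ;;
  esac
}

DRILL_JAVA_OPTS="$(prefer_ipv4 "${DRILL_JAVA_OPTS:-}")"
export DRILL_JAVA_OPTS
```

Appending rather than overwriting keeps any options that are already present in `DRILL_JAVA_OPTS`, and the guard leaves an explicit user choice for `java.net.preferIPv4Stack` untouched.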
   
   ## Documentation
   The code is self-explanatory.
   
   ## Testing
   Local build was successful.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] Z0ltrix commented on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
Z0ltrix commented on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-995115051


   > Please make all required changes in `drill-config.sh` instead.
   
   done





[GitHub] [drill] Z0ltrix commented on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
Z0ltrix commented on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-999058835


   > Agree, good points @jnturton! So @Z0ltrix please create a separate PR for updating documentation.
   
   done... https://github.com/apache/drill-site/pull/21





[GitHub] [drill] Z0ltrix commented on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
Z0ltrix commented on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-996896699


   > Hi @vvysotskyi, @Z0ltrix. The two considerations,
   > 
   >     1. this change is a workaround for buggy IPv6 support in HDFS and
   > 
   >     2. the workaround is untargeted and could have side effects in some environments because Drill's JVMs get set to prefer IPv4 for every socket they open, whether to HDFS or elsewhere
   > 
   > 
   > , make me wonder if this should really be promoted to a Drill default. An alternative might be that we create a doc page describing this problem and showing users how to apply the workaround uncovered by @Z0ltrix to their own environments...
   
   Hi @jnturton,
   Good point...
   What now? How should I continue?
   With this PR or with docs?
   Regards
   Christian
   
   > Agree with @jnturton and @vvysotskyi.
   > 
   > But improvements could possibly still be made to the way Drill closes connections, to avoid:
   > 
   > > This leads to thousands of CLOSE_WAIT ipv6 connections from Drillbit to Datanode
   > 
   > @Z0ltrix you can submit a separate ticket for that with details and error log
   
   Hmm... I think the ticket https://issues.apache.org/jira/browse/DRILL-8077 is fine; maybe I could change the description to specify the problem more precisely.
   
   Logs would be a problem... All systems on the server just reported that they could not open a new connection because of "address already in use", and we saw more than 60k IPv6 connections to different HDFS Datanodes in state CLOSE_WAIT. It took about a week or longer to occur, depending on the number of queries.
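
To see which peers account for the CLOSE_WAIT sockets, a count per remote address can help. The sketch below assumes a Linux host with iproute2's `ss`, whose `-tan` output lists the state in the first column (spelled `CLOSE-WAIT`) and the peer `address:port` in the fifth; the function name `count_close_wait` is illustrative. A socket sits in CLOSE_WAIT when the remote side has closed and the local application has not yet closed its own end, so running this on both the Drillbit and the Datanode hosts shows which side is holding the sockets.

```shell
# Count CLOSE_WAIT sockets per remote peer from `ss -tan`-style output on stdin.
# Assumes column 1 is the TCP state and column 5 is "peer-address:port".
count_close_wait() {
  awk '$1 == "CLOSE-WAIT" {
         sub(/:[0-9]+$/, "", $5)   # drop the port, keep the peer address
         peers[$5]++
       }
       END { for (p in peers) print peers[p], p }' | sort -rn
}

# Usage on a live host (not run here):
#   ss -tan | count_close_wait | head
```

The output is one line per peer, highest count first, which makes it easy to check whether the sockets pile up against Datanode addresses only.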





[GitHub] [drill] Z0ltrix commented on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
Z0ltrix commented on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-996602728


   > Hi @vvysotskyi, @Z0ltrix. The two considerations,
   > 
   >     1. this change is a workaround for buggy IPv6 support in HDFS and
   > 
   >     2. the workaround is untargeted and could have side effects in some environments because Drill's JVMs get set to prefer IPv4 for every socket they open, whether to HDFS or elsewhere
   > 
   > 
   > , make me wonder if this should really be promoted to a Drill default. An alternative might be that we create a doc page describing this problem and showing users how to apply the workaround uncovered by @Z0ltrix to their own environments...
   
   Hi @jnturton,
   Good point...
   What now? How should I continue?
   With this PR or with docs?
   Regards
   Christian





[GitHub] [drill] jnturton closed pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
jnturton closed pull request #2406:
URL: https://github.com/apache/drill/pull/2406


   





[GitHub] [drill] Z0ltrix commented on a change in pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
Z0ltrix commented on a change in pull request #2406:
URL: https://github.com/apache/drill/pull/2406#discussion_r769959037



##########
File path: distribution/src/main/resources/drill-config.sh
##########
@@ -213,6 +213,9 @@ fi
 # Set Drill-provided defaults here. Do not put Drill defaults
 # in the distribution or user environment config files.
 
+# Prefer IPv4 over IPv6
+export DRILL_JAVA_OPTS=${DRILL_JAVA_OPTS:-" -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv6Addresses=false"}

Review comment:
       Isn't this important for both... drillbit and sqlline?
   
   And I thought this has to be done before including drill-env.sh?







[GitHub] [drill] vdiravka commented on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
vdiravka commented on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-998910362


   @Z0ltrix Do we have https://issues.apache.org/jira/browse/HADOOP ticket for the above issue?





[GitHub] [drill] Z0ltrix commented on a change in pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
Z0ltrix commented on a change in pull request #2406:
URL: https://github.com/apache/drill/pull/2406#discussion_r770021861



##########
File path: distribution/src/main/resources/drill-config.sh
##########
@@ -213,6 +213,9 @@ fi
 # Set Drill-provided defaults here. Do not put Drill defaults
 # in the distribution or user environment config files.
 
+# Prefer IPv4 over IPv6
+export DRILL_JAVA_OPTS=${DRILL_JAVA_OPTS:-" -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv6Addresses=false"}

Review comment:
       Ah, I see... that place is also fine, but it will make it impossible for the user to override it in drill-env.sh.
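
The override concern can be illustrated with plain shell parameter expansion. This is a sketch, not the contents of `drill-config.sh`: the function names and `NET_DEFAULTS` are hypothetical, and it only contrasts a default that applies when `DRILL_JAVA_OPTS` is unset or empty with an unconditional append made after the user's settings were read.

```shell
NET_DEFAULTS="-Djava.net.preferIPv4Stack=true -Djava.net.preferIPv6Addresses=false"

# Pattern A: ${VAR:-default}. The default is used only if the user left
# DRILL_JAVA_OPTS unset or empty; any user value replaces it wholesale.
with_default() (
  DRILL_JAVA_OPTS="$1"
  printf '%s\n' "${DRILL_JAVA_OPTS:-$NET_DEFAULTS}"
)

# Pattern B: unconditional append after the user's settings were read.
# The flags are always present, whatever the user put in the variable.
with_append() (
  DRILL_JAVA_OPTS="$1"
  printf '%s\n' "${DRILL_JAVA_OPTS:+$DRILL_JAVA_OPTS }$NET_DEFAULTS"
)

with_default ""         # prints the two -D flags
with_default "-Xmx4G"   # prints only -Xmx4G
with_append  "-Xmx4G"   # prints -Xmx4G followed by the two -D flags
```

With pattern A, a user who sets any `DRILL_JAVA_OPTS` at all silently loses the IPv4 preference; with pattern B, setting the variable alone does not remove it.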







[GitHub] [drill] jnturton edited a comment on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
jnturton edited a comment on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-996610838


   Let's get at least one more opinion and see if we have a consensus, ping @vvysotskyi, @vdiravka, @luocooong, @cgivre.





[GitHub] [drill] jnturton commented on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
jnturton commented on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-997333735


   We can't, and shouldn't try to, fix an HDFS IPv6 connection management bug with changes to Drill's connection management.  For the observed state to arise, Drill must already be closing connections on its end, and that's where its job stops.  So the existing HADOOP-xxx tickets suffice, we only need a docs PR.





[GitHub] [drill] jnturton commented on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
jnturton commented on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-996610838


   Let's get one or two other opinions in and see if we have a consensus, ping @vvysotskyi, @vdiravka, @luocooong, @cgivre.





[GitHub] [drill] Z0ltrix commented on a change in pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
Z0ltrix commented on a change in pull request #2406:
URL: https://github.com/apache/drill/pull/2406#discussion_r769959037



##########
File path: distribution/src/main/resources/drill-config.sh
##########
@@ -213,6 +213,9 @@ fi
 # Set Drill-provided defaults here. Do not put Drill defaults
 # in the distribution or user environment config files.
 
+# Prefer IPv4 over IPv6
+export DRILL_JAVA_OPTS=${DRILL_JAVA_OPTS:-" -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv6Addresses=false"}

Review comment:
       Isn't this important for both... drillbit and sqlline?
   
   And I thought this has to be done before including drill-env.sh?







[GitHub] [drill] vvysotskyi commented on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-996875148


   Agree, good points @jnturton!
   So @Z0ltrix please create a separate PR for updating documentation.





[GitHub] [drill] vdiravka commented on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
vdiravka commented on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-996887174


   Agree with @jnturton and @vvysotskyi. 
   
   But improvements could possibly still be made to the way Drill closes connections, to avoid:
   > This leads to thousands of CLOSE_WAIT ipv6 connections from Drillbit to Datanode
   
   
   @Z0ltrix you can submit a separate ticket for that, with details and an error log.





[GitHub] [drill] jnturton edited a comment on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
jnturton edited a comment on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-997333735


   We can't, and shouldn't try to, fix an HDFS IPv6 connection management bug with changes to Drill's connection management code.  For the observed state to arise, Drill must already be closing connections on its end, and that's where its job stops.  So the existing HADOOP-xxx tickets suffice, we only need a docs PR.





[GitHub] [drill] vvysotskyi commented on a change in pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #2406:
URL: https://github.com/apache/drill/pull/2406#discussion_r769956594



##########
File path: distribution/src/main/resources/drill-config.sh
##########
@@ -213,6 +213,9 @@ fi
 # Set Drill-provided defaults here. Do not put Drill defaults
 # in the distribution or user environment config files.
 
+# Prefer IPv4 over IPv6
+export DRILL_JAVA_OPTS=${DRILL_JAVA_OPTS:-" -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv6Addresses=false"}

Review comment:
       Please use the `DRILLBIT_OPTS` variable below instead of `DRILL_JAVA_OPTS` and move it to the part where some other specific properties are set, like `ReservedCodeCacheSize` and so on.







[GitHub] [drill] jnturton commented on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
jnturton commented on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-996558800


   Hi @vvysotskyi, @Z0ltrix.  The two considerations,
   
   1. this change is a workaround for buggy IPv6 support in HDFS and
   2. the workaround is untargeted and could have side effects in some environments because Drill's JVMs get set to prefer IPv4 for every socket they open, whether to HDFS or elsewhere
   
   , make me wonder if this should really be promoted to a Drill default.  An alternative might be that we create a doc page describing this problem and showing users how to apply the workaround uncovered by @Z0ltrix to their own environments...





[GitHub] [drill] Z0ltrix edited a comment on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
Z0ltrix edited a comment on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-998999681


   > @Z0ltrix Do we have https://issues.apache.org/jira/browse/HADOOP ticket for the above issue?
   
   Not precisely for this bug, but for general IPv6 support... https://issues.apache.org/jira/browse/HADOOP-11890





[GitHub] [drill] Z0ltrix commented on pull request #2406: DRILL-8077: Prefer IPv4Stack over IPv6Addresses

Posted by GitBox <gi...@apache.org>.
Z0ltrix commented on pull request #2406:
URL: https://github.com/apache/drill/pull/2406#issuecomment-998999681


   > @Z0ltrix Do we have https://issues.apache.org/jira/browse/HADOOP ticket for the above issue?
   
   Not precisely for this bug, but for general IPv6 support... https://issues.apache.org/jira/browse/HADOOP-11890

