Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/05 18:36:07 UTC

[GitHub] [arrow-cookbook] thatstatsguy opened a new pull request, #177: Include Python Server Client Example with Certificates

thatstatsguy opened a new pull request, #177:
URL: https://github.com/apache/arrow-cookbook/pull/177

   - Minor capitalisation fix in create.rst
   - User example of how to set up an arrow server using self-signed certificates


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-cookbook] thatstatsguy commented on a diff in pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
thatstatsguy commented on code in PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#discussion_r846670172


##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+To generate a self-signed certificate, run command prompt as administrator and run the following commands.
+.. testcode::
+    dotnet dev-certs https --trust
+    dotnet dev-certs https -ep "<CertificateName>.pfx" -p <CertificatePassword>
+
+You will receive a prompt asking you confirm that you would like to trust this certificate, select yes. 
+You now have a self-signed certificate that your local environment trusts.
+
+**Step 2 - Converting the .pfx file into public and private keys** 
+
+Since `dotnet dev-certs` does not let you export Public and Private keys directly we need to convert the .pfx file. 
+There are several way to achieve this and this tutorial uses OpenSSL (using Windows Subsystem for Linux) 
+to perform the conversion as per this `IBM article`_.
+
+**Step 3 - Running a server with tls enabled**
+
+We're going to use the pyarrow server example available on the `GitHub repo`_. To run the server with TLS enabled, the python script should be 
+called with the path to the public and private keys.
+.. testcode::
+    python server.py --tls CERTFILE <PathToPublicCertificate> --tls KEYFILE <PathToPrivateKey>

Review Comment:
   I have put in a minimal server example. I removed all overrides except the code required to receive information from the client. Let me know if that's fine.





[GitHub] [arrow-cookbook] thatstatsguy commented on a diff in pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
thatstatsguy commented on code in PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#discussion_r847534142


##########
python/source/flight.rst:
##########
@@ -605,3 +605,138 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with the tls root certificate.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+Generate a self-signed certificate by using dotnet on `Windows`_, or `openssl`_ on Linux or MacOS. 
+Alternatively, the self-signed certificate from the `Arrow testing data repository`_ can be used. 
+Depending on the file generated, you may need to convert it to a .crt and .key file as required for the Arrow server. 
+One method to achieve this is openssl, please visit this `IBM article`_ for more info. 
+
+
+**Step 2 - Running a server with TLS enabled**
+
+The code below is a minimal working example of an Arrow server used to receive data with TLS. For a full server example, please visit the Arrow `GitHub repo`_. 
+
+.. testcode::
+    
+    import argparse
+    import pyarrow
+    import pyarrow.flight
+    
+    
+    class FlightServer(pyarrow.flight.FlightServerBase):
+        def __init__(self, host="localhost", location=None,
+                     tls_certificates=None, verify_client=False,
+                     root_certificates=None, auth_handler=None):
+            super(FlightServer, self).__init__(
+                location, auth_handler, tls_certificates, verify_client,
+                root_certificates)
+            self.flights = {}
+            self.host = host
+            self.tls_certificates = tls_certificates

Review Comment:
   Should be used by lines 681-682
   `server = FlightServer(host, location, tls_certificates=tls_certificates)`





[GitHub] [arrow-cookbook] thatstatsguy commented on a diff in pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
thatstatsguy commented on code in PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#discussion_r846669989


##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+To generate a self-signed certificate, run command prompt as administrator and run the following commands.
+.. testcode::
+    dotnet dev-certs https --trust
+    dotnet dev-certs https -ep "<CertificateName>.pfx" -p <CertificatePassword>
+
+You will receive a prompt asking you confirm that you would like to trust this certificate, select yes. 
+You now have a self-signed certificate that your local environment trusts.
+
+**Step 2 - Converting the .pfx file into public and private keys** 
+
+Since `dotnet dev-certs` does not let you export Public and Private keys directly we need to convert the .pfx file. 
+There are several way to achieve this and this tutorial uses OpenSSL (using Windows Subsystem for Linux) 
+to perform the conversion as per this `IBM article`_.
+
+**Step 3 - Running a server with tls enabled**
+
+We're going to use the pyarrow server example available on the `GitHub repo`_. To run the server with TLS enabled, the python script should be 
+called with the path to the public and private keys.
+.. testcode::
+    python server.py --tls CERTFILE <PathToPublicCertificate> --tls KEYFILE <PathToPrivateKey>
+
+Assuming the path was valid, you should see ``Serving on grpc+tls://localhost:5005``. The server is now being served on a port set in the code (or by you).
+
+**Step 4 - Securely Connecting a client to the Server**
+Suppose we want to connect to the client and push some data to it. The following code securely sends information to the server using TLS encryption. 
+There is also the option to use mutual TLS encryption using both the public and private key, but we will assume the client will likely only have 
+the public certificate.
+.. testcode::
+    import pyarrow
+    import pyarrow.flight
+    import pandas as pd
+    
+    # Assumes incoming data object is a Dataframe
+    def pushToServer(name, data, client):
+        objectToSend = pyarrow.Table.from_pandas(data)
+        writer, _ = client.do_put(pyarrow.flight.FlightDescriptor.for_path(name), objectToSend.schema)
+        writer.write_table(objectToSend)
+        writer.close()
+    
+    def getClient():
+        
+        return pyarrow.flight.FlightClient("grpc+tcp://localhost:5005")

Review Comment:
   removed :)





[GitHub] [arrow-cookbook] thatstatsguy commented on a diff in pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
thatstatsguy commented on code in PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#discussion_r846670075


##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+To generate a self-signed certificate, run command prompt as administrator and run the following commands.
+.. testcode::
+    dotnet dev-certs https --trust
+    dotnet dev-certs https -ep "<CertificateName>.pfx" -p <CertificatePassword>
+
+You will receive a prompt asking you confirm that you would like to trust this certificate, select yes. 
+You now have a self-signed certificate that your local environment trusts.
+
+**Step 2 - Converting the .pfx file into public and private keys** 
+
+Since `dotnet dev-certs` does not let you export Public and Private keys directly we need to convert the .pfx file. 
+There are several way to achieve this and this tutorial uses OpenSSL (using Windows Subsystem for Linux) 
+to perform the conversion as per this `IBM article`_.
+
+**Step 3 - Running a server with tls enabled**
+
+We're going to use the pyarrow server example available on the `GitHub repo`_. To run the server with TLS enabled, the python script should be 
+called with the path to the public and private keys.
+.. testcode::
+    python server.py --tls CERTFILE <PathToPublicCertificate> --tls KEYFILE <PathToPrivateKey>
+
+Assuming the path was valid, you should see ``Serving on grpc+tls://localhost:5005``. The server is now being served on a port set in the code (or by you).
+
+**Step 4 - Securely Connecting a client to the Server**
+Suppose we want to connect to the client and push some data to it. The following code securely sends information to the server using TLS encryption. 
+There is also the option to use mutual TLS encryption using both the public and private key, but we will assume the client will likely only have 
+the public certificate.
+.. testcode::
+    import pyarrow
+    import pyarrow.flight
+    import pandas as pd
+    
+    # Assumes incoming data object is a Dataframe
+    def pushToServer(name, data, client):
+        objectToSend = pyarrow.Table.from_pandas(data)
+        writer, _ = client.do_put(pyarrow.flight.FlightDescriptor.for_path(name), objectToSend.schema)
+        writer.write_table(objectToSend)
+        writer.close()
+    
+    def getClient():
+        
+        return pyarrow.flight.FlightClient("grpc+tcp://localhost:5005")
+    
+    def _add_common_arguments(parser):
+        parser.add_argument('--tls', action='store_true',
+                            help='Enable transport-level security')
+        parser.add_argument('--tls-roots', default=None,
+                            help='Path to trusted TLS certificate(s)')
+        parser.add_argument("--mtls", nargs=2, default=None,
+                            metavar=('CERTFILE', 'KEYFILE'),
+                            help="Enable transport-level security")
+                            
+    def main():
+        parser = argparse.ArgumentParser()
+        args = parser.parse_args()
+        connection_args = {}
+        scheme = "grpc+tls"
+        
+        if args.tls:
+            
+            if args.tls_roots:
+                with open(args.tls_roots, "rb") as root_certs:
+                    connection_args["tls_root_certs"] = root_certs.read()
+        if args.mtls:
+            with open(args.mtls[0], "rb") as cert_file:
+                tls_cert_chain = cert_file.read()
+            with open(args.mtls[1], "rb") as key_file:
+                tls_private_key = key_file.read()
+            connection_args["cert_chain"] = tls_cert_chain
+            connection_args["private_key"] = tls_private_key

Review Comment:
   done :)
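
   For readers following the thread, here is a minimal sketch of how the client-side options in the hunk above could be turned into connection arguments. It is not the code that was merged. `build_parser` and `build_connection_args` are hypothetical helper names, and falling back to `grpc+tcp` when `--tls` is absent is an assumption (the quoted hunk sets `grpc+tls` unconditionally). The sketch uses only the standard library, so it runs without pyarrow installed.

```python
import argparse


def build_parser():
    """Create a parser with the TLS options shown in the quoted hunk."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--tls", action="store_true",
                        help="Enable transport-level security")
    parser.add_argument("--tls-roots", default=None,
                        help="Path to trusted TLS certificate(s)")
    parser.add_argument("--mtls", nargs=2, default=None,
                        metavar=("CERTFILE", "KEYFILE"),
                        help="Enable mutual TLS with a client cert and key")
    return parser


def build_connection_args(argv=None):
    """Return (scheme, kwargs) for a Flight client. Hypothetical helper."""
    args = build_parser().parse_args(argv)
    # Assumption: plain TCP unless --tls is given.
    scheme = "grpc+tls" if args.tls else "grpc+tcp"
    connection_args = {}
    if args.tls and args.tls_roots:
        # Root certificate(s) the client uses to verify the server.
        with open(args.tls_roots, "rb") as root_certs:
            connection_args["tls_root_certs"] = root_certs.read()
    if args.mtls:
        # Client certificate chain and private key for mutual TLS.
        with open(args.mtls[0], "rb") as cert_file:
            connection_args["cert_chain"] = cert_file.read()
        with open(args.mtls[1], "rb") as key_file:
            connection_args["private_key"] = key_file.read()
    return scheme, connection_args
```

   The result could then be handed to the client, for example `pyarrow.flight.FlightClient(f"{scheme}://localhost:5005", **connection_args)`, where port 5005 is the one used in the quoted example.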





[GitHub] [arrow-cookbook] thatstatsguy commented on pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
thatstatsguy commented on PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#issuecomment-1095402082

   Updated for final comment 💪




[GitHub] [arrow-cookbook] thatstatsguy commented on a diff in pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
thatstatsguy commented on code in PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#discussion_r846670103


##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+To generate a self-signed certificate, run command prompt as administrator and run the following commands.
+.. testcode::
+    dotnet dev-certs https --trust
+    dotnet dev-certs https -ep "<CertificateName>.pfx" -p <CertificatePassword>
+
+You will receive a prompt asking you confirm that you would like to trust this certificate, select yes. 
+You now have a self-signed certificate that your local environment trusts.
+
+**Step 2 - Converting the .pfx file into public and private keys** 
+
+Since `dotnet dev-certs` does not let you export Public and Private keys directly we need to convert the .pfx file. 
+There are several way to achieve this and this tutorial uses OpenSSL (using Windows Subsystem for Linux) 
+to perform the conversion as per this `IBM article`_.
+
+**Step 3 - Running a server with tls enabled**
+
+We're going to use the pyarrow server example available on the `GitHub repo`_. To run the server with TLS enabled, the python script should be 
+called with the path to the public and private keys.
+.. testcode::
+    python server.py --tls CERTFILE <PathToPublicCertificate> --tls KEYFILE <PathToPrivateKey>
+
+Assuming the path was valid, you should see ``Serving on grpc+tls://localhost:5005``. The server is now being served on a port set in the code (or by you).
+
+**Step 4 - Securely Connecting a client to the Server**
+Suppose we want to connect to the client and push some data to it. The following code securely sends information to the server using TLS encryption. 
+There is also the option to use mutual TLS encryption using both the public and private key, but we will assume the client will likely only have 
+the public certificate.
+.. testcode::
+    import pyarrow
+    import pyarrow.flight
+    import pandas as pd
+    
+    # Assumes incoming data object is a Dataframe
+    def pushToServer(name, data, client):

Review Comment:
   Fixed. Apologies - F# camelcase habits! 





[GitHub] [arrow-cookbook] thatstatsguy commented on pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
thatstatsguy commented on PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#issuecomment-1094108496

   Thank you for the detailed review! Ready for you when you're ready




[GitHub] [arrow-cookbook] thatstatsguy commented on a diff in pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
thatstatsguy commented on code in PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#discussion_r847616924


##########
python/source/flight.rst:
##########
@@ -605,3 +605,138 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with the tls root certificate.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+Generate a self-signed certificate by using dotnet on `Windows`_, or `openssl`_ on Linux or MacOS. 
+Alternatively, the self-signed certificate from the `Arrow testing data repository`_ can be used. 
+Depending on the file generated, you may need to convert it to a .crt and .key file as required for the Arrow server. 
+One method to achieve this is openssl, please visit this `IBM article`_ for more info. 
+
+
+**Step 2 - Running a server with TLS enabled**
+
+The code below is a minimal working example of an Arrow server used to receive data with TLS. For a full server example, please visit the Arrow `GitHub repo`_. 
+
+.. testcode::
+    
+    import argparse
+    import pyarrow
+    import pyarrow.flight
+    
+    
+    class FlightServer(pyarrow.flight.FlightServerBase):
+        def __init__(self, host="localhost", location=None,
+                     tls_certificates=None, verify_client=False,
+                     root_certificates=None, auth_handler=None):
+            super(FlightServer, self).__init__(
+                location, auth_handler, tls_certificates, verify_client,
+                root_certificates)
+            self.flights = {}
+            self.host = host
+            self.tls_certificates = tls_certificates

Review Comment:
   Yea actually that's a good point - removed as suggested.





[GitHub] [arrow-cookbook] lidavidm commented on a diff in pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
lidavidm commented on code in PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#discussion_r847549858


##########
python/source/flight.rst:
##########
@@ -605,3 +605,138 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with the tls root certificate.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+Generate a self-signed certificate by using dotnet on `Windows`_, or `openssl`_ on Linux or MacOS. 
+Alternatively, the self-signed certificate from the `Arrow testing data repository`_ can be used. 
+Depending on the file generated, you may need to convert it to a .crt and .key file as required for the Arrow server. 
+One method to achieve this is openssl, please visit this `IBM article`_ for more info. 
+
+
+**Step 2 - Running a server with TLS enabled**
+
+The code below is a minimal working example of an Arrow server used to receive data with TLS. For a full server example, please visit the Arrow `GitHub repo`_. 
+
+.. testcode::
+    
+    import argparse
+    import pyarrow
+    import pyarrow.flight
+    
+    
+    class FlightServer(pyarrow.flight.FlightServerBase):
+        def __init__(self, host="localhost", location=None,
+                     tls_certificates=None, verify_client=False,
+                     root_certificates=None, auth_handler=None):
+            super(FlightServer, self).__init__(
+                location, auth_handler, tls_certificates, verify_client,
+                root_certificates)
+            self.flights = {}
+            self.host = host
+            self.tls_certificates = tls_certificates

Review Comment:
   Right, but neither property is ever actually accessed, so it seems redundant to have them. And `host` is never used at all, so it seems redundant to pass it in the first place.





[GitHub] [arrow-cookbook] lidavidm merged pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
lidavidm merged PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177




[GitHub] [arrow-cookbook] thatstatsguy commented on a diff in pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
thatstatsguy commented on code in PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#discussion_r846670311


##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.

Review Comment:
   My understanding is that the client still needs the TLS root certificate to authenticate the communication. I've changed the text to make it clearer that what I'm loading is the TLS root cert.





[GitHub] [arrow-cookbook] lidavidm commented on a diff in pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
lidavidm commented on code in PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#discussion_r843914433


##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+To generate a self-signed certificate, run command prompt as administrator and run the following commands.
+.. testcode::
+    dotnet dev-certs https --trust
+    dotnet dev-certs https -ep "<CertificateName>.pfx" -p <CertificatePassword>
+
+You will receive a prompt asking you confirm that you would like to trust this certificate, select yes. 
+You now have a self-signed certificate that your local environment trusts.
+
+**Step 2 - Converting the .pfx file into public and private keys** 
+
+Since `dotnet dev-certs` does not let you export Public and Private keys directly we need to convert the .pfx file. 
+There are several way to achieve this and this tutorial uses OpenSSL (using Windows Subsystem for Linux) 
+to perform the conversion as per this `IBM article`_.
+
+**Step 3 - Running a server with tls enabled**
+
+We're going to use the pyarrow server example available on the `GitHub repo`_. To run the server with TLS enabled, the python script should be 
+called with the path to the public and private keys.
+.. testcode::
+    python server.py --tls CERTFILE <PathToPublicCertificate> --tls KEYFILE <PathToPrivateKey>
+
+Assuming the path was valid, you should see ``Serving on grpc+tls://localhost:5005``. The server is now being served on a port set in the code (or by you).
+
+**Step 4 - Securely Connecting a client to the Server**
+Suppose we want to connect to the client and push some data to it. The following code securely sends information to the server using TLS encryption. 
+There is also the option to use mutual TLS encryption using both the public and private key, but we will assume the client will likely only have 
+the public certificate.
+.. testcode::
+    import pyarrow
+    import pyarrow.flight
+    import pandas as pd
+    
+    # Assumes incoming data object is a Dataframe
+    def pushToServer(name, data, client):
+        objectToSend = pyarrow.Table.from_pandas(data)
+        writer, _ = client.do_put(pyarrow.flight.FlightDescriptor.for_path(name), objectToSend.schema)
+        writer.write_table(objectToSend)
+        writer.close()
+    
+    def getClient():
+        
+        return pyarrow.flight.FlightClient("grpc+tcp://localhost:5005")
+    
+    def _add_common_arguments(parser):
+        parser.add_argument('--tls', action='store_true',
+                            help='Enable transport-level security')
+        parser.add_argument('--tls-roots', default=None,
+                            help='Path to trusted TLS certificate(s)')
+        parser.add_argument("--mtls", nargs=2, default=None,
+                            metavar=('CERTFILE', 'KEYFILE'),
+                            help="Enable transport-level security")
+                            
+    def main():
+        parser = argparse.ArgumentParser()
+        args = parser.parse_args()
+        connection_args = {}
+        scheme = "grpc+tls"
+        
+        if args.tls:
+            
+            if args.tls_roots:
+                with open(args.tls_roots, "rb") as root_certs:
+                    connection_args["tls_root_certs"] = root_certs.read()
+        if args.mtls:
+            with open(args.mtls[0], "rb") as cert_file:
+                tls_cert_chain = cert_file.read()
+            with open(args.mtls[1], "rb") as key_file:
+                tls_private_key = key_file.read()
+            connection_args["cert_chain"] = tls_cert_chain
+            connection_args["private_key"] = tls_private_key

Review Comment:
   (As in, we can tackle that in a separate PR - not asking you to double the workload here :slightly_smiling_face:)





[GitHub] [arrow-cookbook] lidavidm commented on a diff in pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
lidavidm commented on code in PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#discussion_r843910123


##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.

Review Comment:
   If possible, we should drop this note, and just link to other resources.
   
   Alternatively, we can remove the instructions altogether and point to the testing data repository: "Generate a self-signed certificate by using `dotnet` on Windows (link), or `openssl` on Linux or macOS (link). Or, use the self-signed certificate from the Arrow testing data repository (link)." https://github.com/apache/arrow-testing/tree/master/data/flight



##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.

Review Comment:
   This is describing TLS mutual authentication, right? But that's not what the example shows.



##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+To generate a self-signed certificate, run command prompt as administrator and run the following commands.
+.. testcode::
+    dotnet dev-certs https --trust
+    dotnet dev-certs https -ep "<CertificateName>.pfx" -p <CertificatePassword>
+
+You will receive a prompt asking you confirm that you would like to trust this certificate, select yes. 
+You now have a self-signed certificate that your local environment trusts.
+
+**Step 2 - Converting the .pfx file into public and private keys** 
+
+Since `dotnet dev-certs` does not let you export Public and Private keys directly we need to convert the .pfx file. 
+There are several way to achieve this and this tutorial uses OpenSSL (using Windows Subsystem for Linux) 
+to perform the conversion as per this `IBM article`_.
+
+**Step 3 - Running a server with tls enabled**

Review Comment:
   ```suggestion
   **Step 3 - Running a server with TLS enabled**
   ```



##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+To generate a self-signed certificate, run command prompt as administrator and run the following commands.
+.. testcode::
+    dotnet dev-certs https --trust
+    dotnet dev-certs https -ep "<CertificateName>.pfx" -p <CertificatePassword>
+
+You will receive a prompt asking you confirm that you would like to trust this certificate, select yes. 
+You now have a self-signed certificate that your local environment trusts.
+
+**Step 2 - Converting the .pfx file into public and private keys** 
+
+Since `dotnet dev-certs` does not let you export Public and Private keys directly we need to convert the .pfx file. 
+There are several way to achieve this and this tutorial uses OpenSSL (using Windows Subsystem for Linux) 
+to perform the conversion as per this `IBM article`_.
+
+**Step 3 - Running a server with tls enabled**
+
+We're going to use the pyarrow server example available on the `GitHub repo`_. To run the server with TLS enabled, the python script should be 
+called with the path to the public and private keys.
+.. testcode::
+    python server.py --tls CERTFILE <PathToPublicCertificate> --tls KEYFILE <PathToPrivateKey>
+
+Assuming the path was valid, you should see ``Serving on grpc+tls://localhost:5005``. The server is now being served on a port set in the code (or by you).
+
+**Step 4 - Securely Connecting a client to the Server**
+Suppose we want to connect to the client and push some data to it. The following code securely sends information to the server using TLS encryption. 
+There is also the option to use mutual TLS encryption using both the public and private key, but we will assume the client will likely only have 
+the public certificate.
+.. testcode::
+    import pyarrow
+    import pyarrow.flight
+    import pandas as pd
+    
+    # Assumes incoming data object is a Dataframe
+    def pushToServer(name, data, client):

Review Comment:
   Note that Python generally uses snake_case, e.g. `push_to_server` rather than `pushToServer` (ignoring things like the `logging` module :slightly_smiling_face:).



##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+To generate a self-signed certificate, run command prompt as administrator and run the following commands.
+.. testcode::
+    dotnet dev-certs https --trust
+    dotnet dev-certs https -ep "<CertificateName>.pfx" -p <CertificatePassword>
+
+You will receive a prompt asking you confirm that you would like to trust this certificate, select yes. 
+You now have a self-signed certificate that your local environment trusts.
+
+**Step 2 - Converting the .pfx file into public and private keys** 
+
+Since `dotnet dev-certs` does not let you export Public and Private keys directly we need to convert the .pfx file. 
+There are several way to achieve this and this tutorial uses OpenSSL (using Windows Subsystem for Linux) 
+to perform the conversion as per this `IBM article`_.
+
+**Step 3 - Running a server with tls enabled**
+
+We're going to use the pyarrow server example available on the `GitHub repo`_. To run the server with TLS enabled, the python script should be 
+called with the path to the public and private keys.
+.. testcode::
+    python server.py --tls CERTFILE <PathToPublicCertificate> --tls KEYFILE <PathToPrivateKey>

Review Comment:
   We want to keep the examples self-contained, so would it be possible to instead modify the previous server?



##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+To generate a self-signed certificate, run command prompt as administrator and run the following commands.
+.. testcode::
+    dotnet dev-certs https --trust
+    dotnet dev-certs https -ep "<CertificateName>.pfx" -p <CertificatePassword>
+
+You will receive a prompt asking you confirm that you would like to trust this certificate, select yes. 
+You now have a self-signed certificate that your local environment trusts.
+
+**Step 2 - Converting the .pfx file into public and private keys** 
+
+Since `dotnet dev-certs` does not let you export Public and Private keys directly we need to convert the .pfx file. 
+There are several way to achieve this and this tutorial uses OpenSSL (using Windows Subsystem for Linux) 
+to perform the conversion as per this `IBM article`_.
+
+**Step 3 - Running a server with tls enabled**
+
+We're going to use the pyarrow server example available on the `GitHub repo`_. To run the server with TLS enabled, the python script should be 
+called with the path to the public and private keys.
+.. testcode::
+    python server.py --tls CERTFILE <PathToPublicCertificate> --tls KEYFILE <PathToPrivateKey>
+
+Assuming the path was valid, you should see ``Serving on grpc+tls://localhost:5005``. The server is now being served on a port set in the code (or by you).
+
+**Step 4 - Securely Connecting a client to the Server**

Review Comment:
   Nit, but these section titles are capitalized inconsistently



##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+To generate a self-signed certificate, run command prompt as administrator and run the following commands.
+.. testcode::
+    dotnet dev-certs https --trust
+    dotnet dev-certs https -ep "<CertificateName>.pfx" -p <CertificatePassword>
+
+You will receive a prompt asking you confirm that you would like to trust this certificate, select yes. 
+You now have a self-signed certificate that your local environment trusts.
+
+**Step 2 - Converting the .pfx file into public and private keys** 
+
+Since `dotnet dev-certs` does not let you export Public and Private keys directly we need to convert the .pfx file. 
+There are several way to achieve this and this tutorial uses OpenSSL (using Windows Subsystem for Linux) 
+to perform the conversion as per this `IBM article`_.
+
+**Step 3 - Running a server with tls enabled**
+
+We're going to use the pyarrow server example available on the `GitHub repo`_. To run the server with TLS enabled, the python script should be 
+called with the path to the public and private keys.
+.. testcode::
+    python server.py --tls CERTFILE <PathToPublicCertificate> --tls KEYFILE <PathToPrivateKey>
+
+Assuming the path was valid, you should see ``Serving on grpc+tls://localhost:5005``. The server is now being served on a port set in the code (or by you).
+
+**Step 4 - Securely Connecting a client to the Server**
+Suppose we want to connect to the client and push some data to it. The following code securely sends information to the server using TLS encryption. 
+There is also the option to use mutual TLS encryption using both the public and private key, but we will assume the client will likely only have 
+the public certificate.
+.. testcode::
+    import pyarrow
+    import pyarrow.flight
+    import pandas as pd
+    
+    # Assumes incoming data object is a Dataframe
+    def pushToServer(name, data, client):
+        objectToSend = pyarrow.Table.from_pandas(data)
+        writer, _ = client.do_put(pyarrow.flight.FlightDescriptor.for_path(name), objectToSend.schema)
+        writer.write_table(objectToSend)
+        writer.close()
+    
+    def getClient():
+        
+        return pyarrow.flight.FlightClient("grpc+tcp://localhost:5005")
+    
+    def _add_common_arguments(parser):
+        parser.add_argument('--tls', action='store_true',
+                            help='Enable transport-level security')
+        parser.add_argument('--tls-roots', default=None,
+                            help='Path to trusted TLS certificate(s)')
+        parser.add_argument("--mtls", nargs=2, default=None,
+                            metavar=('CERTFILE', 'KEYFILE'),
+                            help="Enable transport-level security")
+                            
+    def main():
+        parser = argparse.ArgumentParser()
+        args = parser.parse_args()
+        connection_args = {}
+        scheme = "grpc+tls"
+        
+        if args.tls:
+            
+            if args.tls_roots:
+                with open(args.tls_roots, "rb") as root_certs:
+                    connection_args["tls_root_certs"] = root_certs.read()
+        if args.mtls:
+            with open(args.mtls[0], "rb") as cert_file:
+                tls_cert_chain = cert_file.read()
+            with open(args.mtls[1], "rb") as key_file:
+                tls_private_key = key_file.read()
+            connection_args["cert_chain"] = tls_cert_chain
+            connection_args["private_key"] = tls_private_key

Review Comment:
   IMO, we can split mutual TLS into a separate example to reduce the number of things going on here.
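
For illustration, a TLS-only client could be reduced to something like the sketch below. It keeps `--tls-roots` from the PR and drops `--mtls`. The `--host` and `--port` flags and the helper names `build_parser` and `connection_settings` are hypothetical, added only to make the sketch self-contained. It uses just the standard library; the returned values are meant to be passed to `pyarrow.flight.FlightClient(location, **connection_args)`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line options for a TLS-only Flight client (no mTLS)."""
    parser = argparse.ArgumentParser(description="Flight client (TLS only)")
    # --host and --port are hypothetical flags, not part of the PR.
    parser.add_argument("--host", default="localhost")
    parser.add_argument("--port", type=int, default=5005)
    parser.add_argument("--tls-roots", default=None,
                        help="Path to trusted TLS certificate(s)")
    return parser


def connection_settings(args):
    """Return the Flight location and keyword arguments for the client."""
    location = f"grpc+tls://{args.host}:{args.port}"
    connection_args = {}
    if args.tls_roots:
        # The root certificate lets the client verify the server.
        with open(args.tls_roots, "rb") as root_certs:
            connection_args["tls_root_certs"] = root_certs.read()
    return location, connection_args
```

The mutual TLS options (`cert_chain`, `private_key`) would then live in their own example.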



##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+To generate a self-signed certificate, run command prompt as administrator and run the following commands.
+.. testcode::
+    dotnet dev-certs https --trust
+    dotnet dev-certs https -ep "<CertificateName>.pfx" -p <CertificatePassword>
+
+You will receive a prompt asking you confirm that you would like to trust this certificate, select yes. 
+You now have a self-signed certificate that your local environment trusts.
+
+**Step 2 - Converting the .pfx file into public and private keys** 
+
+Since `dotnet dev-certs` does not let you export Public and Private keys directly we need to convert the .pfx file. 
+There are several way to achieve this and this tutorial uses OpenSSL (using Windows Subsystem for Linux) 
+to perform the conversion as per this `IBM article`_.
+
+**Step 3 - Running a server with tls enabled**
+
+We're going to use the pyarrow server example available on the `GitHub repo`_. To run the server with TLS enabled, the python script should be 
+called with the path to the public and private keys.
+.. testcode::
+    python server.py --tls CERTFILE <PathToPublicCertificate> --tls KEYFILE <PathToPrivateKey>
+
+Assuming the path was valid, you should see ``Serving on grpc+tls://localhost:5005``. The server is now being served on a port set in the code (or by you).
+
+**Step 4 - Securely Connecting a client to the Server**
+Suppose we want to connect to the client and push some data to it. The following code securely sends information to the server using TLS encryption. 
+There is also the option to use mutual TLS encryption using both the public and private key, but we will assume the client will likely only have 
+the public certificate.
+.. testcode::
+    import pyarrow
+    import pyarrow.flight
+    import pandas as pd
+    
+    # Assumes incoming data object is a Dataframe
+    def pushToServer(name, data, client):
+        objectToSend = pyarrow.Table.from_pandas(data)
+        writer, _ = client.do_put(pyarrow.flight.FlightDescriptor.for_path(name), objectToSend.schema)
+        writer.write_table(objectToSend)
+        writer.close()
+    
+    def getClient():
+        
+        return pyarrow.flight.FlightClient("grpc+tcp://localhost:5005")

Review Comment:
   Is `getClient` used anywhere?



##########
python/source/flight.rst:
##########
@@ -605,3 +605,102 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with a public key.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+.. note:: This tutorial uses Windows to create a self-signed certificate. For Linux environments, other methods such as OpenSSL can be used.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+To generate a self-signed certificate, run command prompt as administrator and run the following commands.
+.. testcode::
+    dotnet dev-certs https --trust
+    dotnet dev-certs https -ep "<CertificateName>.pfx" -p <CertificatePassword>
+
+You will receive a prompt asking you confirm that you would like to trust this certificate, select yes. 
+You now have a self-signed certificate that your local environment trusts.
+
+**Step 2 - Converting the .pfx file into public and private keys** 
+
+Since `dotnet dev-certs` does not let you export Public and Private keys directly we need to convert the .pfx file. 
+There are several way to achieve this and this tutorial uses OpenSSL (using Windows Subsystem for Linux) 
+to perform the conversion as per this `IBM article`_.
+
+**Step 3 - Running a server with tls enabled**
+
+We're going to use the pyarrow server example available on the `GitHub repo`_. To run the server with TLS enabled, the python script should be 
+called with the path to the public and private keys.
+.. testcode::
+    python server.py --tls CERTFILE <PathToPublicCertificate> --tls KEYFILE <PathToPrivateKey>

Review Comment:
   It would also be nice to follow the general structure laid out above when demoing code.





[GitHub] [arrow-cookbook] lidavidm commented on a diff in pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
lidavidm commented on code in PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#discussion_r847273922


##########
python/source/flight.rst:
##########
@@ -605,3 +605,138 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with the tls root certificate.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+Generate a self-signed certificate by using dotnet on `Windows`_, or `openssl`_ on Linux or MacOS. 
+Alternatively, the self-signed certificate from the `Arrow testing data repository`_ can be used. 
+Depending on the file generated, you may need to convert it to a .crt and .key file as required for the Arrow server. 
+One method to achieve this is openssl, please visit this `IBM article`_ for more info. 
+
+
+**Step 2 - Running a server with TLS enabled**
+
+The code below is a minimal working example of an Arrow server used to receive data with TLS. For a full server example, please visit the Arrow `GitHub repo`_. 

Review Comment:
   We're in the middle of a bunch of examples, so I don't think we need to link back to the GitHub repo here :slightly_smiling_face: 



##########
python/source/flight.rst:
##########
@@ -605,3 +605,138 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with the tls root certificate.

Review Comment:
   This should be more like "...the client authenticates the server with the TLS root certificate", right? In the usual flow, the server does not authenticate the client.
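
To make the distinction concrete, here is a rough sketch of the keyword arguments a client would pass in each case. The helper name `client_tls_kwargs` is hypothetical. The keys `tls_root_certs`, `cert_chain` and `private_key` are the ones the PR's client code already uses for `pyarrow.flight.FlightClient`. With plain TLS only the root certificate is supplied; the certificate chain and private key come into play only for mutual TLS.

```python
from typing import Dict, Optional


def client_tls_kwargs(root_certs: bytes,
                      cert_chain: Optional[bytes] = None,
                      private_key: Optional[bytes] = None) -> Dict[str, bytes]:
    """Build keyword arguments for pyarrow.flight.FlightClient.

    Plain TLS: pass only root_certs, so the client can verify the server.
    Mutual TLS: also pass cert_chain and private_key, so the server can
    verify the client.
    """
    kwargs = {"tls_root_certs": root_certs}
    # A client certificate is useless without its key, and vice versa.
    if (cert_chain is None) != (private_key is None):
        raise ValueError("mTLS needs both cert_chain and private_key")
    if cert_chain is not None:
        kwargs["cert_chain"] = cert_chain
        kwargs["private_key"] = private_key
    return kwargs


# Intended use (requires pyarrow and a running TLS server, so not run here):
# client = pyarrow.flight.FlightClient("grpc+tls://localhost:5005",
#                                      **client_tls_kwargs(root_pem))
```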



##########
python/source/flight.rst:
##########
@@ -605,3 +605,138 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================

Review Comment:
   Since we're not talking about mTLS this should be more like "Securing connections with TLS"



##########
python/source/flight.rst:
##########
@@ -605,3 +605,138 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with the tls root certificate.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+Generate a self-signed certificate by using dotnet on `Windows`_, or `openssl`_ on Linux or MacOS. 
+Alternatively, the self-signed certificate from the `Arrow testing data repository`_ can be used. 
+Depending on the file generated, you may need to convert it to a .crt and .key file as required for the Arrow server. 
+One method to achieve this is openssl, please visit this `IBM article`_ for more info. 
+
+
+**Step 2 - Running a server with TLS enabled**
+
+The code below is a minimal working example of an Arrow server used to receive data with TLS. For a full server example, please visit the Arrow `GitHub repo`_. 
+
+.. testcode::
+    
+    import argparse
+    import pyarrow
+    import pyarrow.flight
+    
+    
+    class FlightServer(pyarrow.flight.FlightServerBase):
+        def __init__(self, host="localhost", location=None,
+                     tls_certificates=None, verify_client=False,
+                     root_certificates=None, auth_handler=None):
+            super(FlightServer, self).__init__(
+                location, auth_handler, tls_certificates, verify_client,
+                root_certificates)
+            self.flights = {}
+            self.host = host
+            self.tls_certificates = tls_certificates

Review Comment:
   Are these ever used?



##########
python/source/flight.rst:
##########
@@ -605,3 +605,138 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with the tls root certificate.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+Generate a self-signed certificate by using dotnet on `Windows`_, or `openssl`_ on Linux or MacOS. 
+Alternatively, the self-signed certificate from the `Arrow testing data repository`_ can be used. 
+Depending on the file generated, you may need to convert it to a .crt and .key file as required for the Arrow server. 
+One method to achieve this is openssl, please visit this `IBM article`_ for more info. 
+
+
+**Step 2 - Running a server with TLS enabled**
+
+The code below is a minimal working example of an Arrow server used to receive data with TLS. For a full server example, please visit the Arrow `GitHub repo`_. 
+
+.. testcode::
+    
+    import argparse
+    import pyarrow
+    import pyarrow.flight
+    
+    
+    class FlightServer(pyarrow.flight.FlightServerBase):
+        def __init__(self, host="localhost", location=None,
+                     tls_certificates=None, verify_client=False,
+                     root_certificates=None, auth_handler=None):

Review Comment:
   Why not `**kwargs`?
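
A minimal sketch of what the `**kwargs` version could look like. `BaseServer` is a hypothetical stand-in for `pyarrow.flight.FlightServerBase`, used only so the snippet runs without pyarrow; its parameter names are copied from the positional arguments in the diff above.

```python
class BaseServer:
    """Hypothetical stand-in for pyarrow.flight.FlightServerBase."""

    def __init__(self, location=None, auth_handler=None,
                 tls_certificates=None, verify_client=False,
                 root_certificates=None):
        self.location = location
        self.auth_handler = auth_handler
        self.tls_certificates = tls_certificates
        self.verify_client = verify_client
        self.root_certificates = root_certificates


class FlightServer(BaseServer):
    def __init__(self, location=None, **kwargs):
        # Forward every option the subclass does not handle to the base class.
        super().__init__(location, **kwargs)
        self.flights = {}


# Placeholder bytes stand in for real certificate and key contents.
server = FlightServer("grpc+tls://localhost:5005",
                      tls_certificates=[(b"cert", b"key")])
print(server.tls_certificates)
print(server.verify_client)
```

Forwarding keyword arguments this way means the subclass does not have to repeat the base class signature or keep the positional order in sync with it.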



##########
python/source/flight.rst:
##########
@@ -605,3 +605,138 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with the tls root certificate.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+Generate a self-signed certificate by using dotnet on `Windows`_, or `openssl`_ on Linux or MacOS. 
+Alternatively, the self-signed certificate from the `Arrow testing data repository`_ can be used. 
+Depending on the file generated, you may need to convert it to a .crt and .key file as required for the Arrow server. 
+One method to achieve this is openssl, please visit this `IBM article`_ for more info. 
+
+
+**Step 2 - Running a server with TLS enabled**
+
+The code below is a minimal working example of an Arrow server used to receive data with TLS. For a full server example, please visit the Arrow `GitHub repo`_. 
+
+.. testcode::
+    
+    import argparse
+    import pyarrow
+    import pyarrow.flight
+    
+    
+    class FlightServer(pyarrow.flight.FlightServerBase):
+        def __init__(self, host="localhost", location=None,
+                     tls_certificates=None, verify_client=False,
+                     root_certificates=None, auth_handler=None):
+            super(FlightServer, self).__init__(
+                location, auth_handler, tls_certificates, verify_client,
+                root_certificates)
+            self.flights = {}
+            self.host = host
+            self.tls_certificates = tls_certificates
+    
+        @classmethod
+        def descriptor_to_key(self, descriptor):
+            return (descriptor.descriptor_type.value, descriptor.command,
+                    tuple(descriptor.path or tuple()))
+    
+        def do_put(self, context, descriptor, reader, writer):
+            key = FlightServer.descriptor_to_key(descriptor)
+            print(key)
+            self.flights[key] = reader.read_all()
+            print(self.flights[key])
+    
+    
+    def main():
+        parser = argparse.ArgumentParser()
+        parser.add_argument("--tls", nargs=2, default=None, metavar=('CERTFILE', 'KEYFILE'))
+        args = parser.parse_args()                                
+        tls_certificates = []
+    
+        scheme = "grpc+tls"
+        host = "localhost"
+        port = "5005"
+        
+        with open(args.tls[0], "rb") as cert_file:
+            tls_cert_chain = cert_file.read()
+        with open(args.tls[1], "rb") as key_file:
+            tls_private_key = key_file.read()
+    
+        tls_certificates.append((tls_cert_chain, tls_private_key))
+        
+        location = "{}://{}:{}".format(scheme, host, port)
+    
+        server = FlightServer(host, location,
+                              tls_certificates=tls_certificates)
+        print("Serving on", location)
+        server.serve()
+    
+    
+    if __name__ == '__main__':
+        main()
+
+Running the server, you should see ``Serving on grpc+tls://localhost:5005``.
+
+**Step 3 - Securely Connecting to the Server**
+Suppose we want to connect to the client and push some data to it. The following code securely sends information to the server using TLS encryption.
+The example below shows how one could  

Review Comment:
   This looks to have been cut off?



##########
python/source/flight.rst:
##########
@@ -605,3 +605,138 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with the tls root certificate.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+Generate a self-signed certificate by using dotnet on `Windows`_, or `openssl`_ on Linux or MacOS. 
+Alternatively, the self-signed certificate from the `Arrow testing data repository`_ can be used. 
+Depending on the file generated, you may need to convert it to a .crt and .key file as required for the Arrow server. 
+One method to achieve this is openssl, please visit this `IBM article`_ for more info. 
+
+
+**Step 2 - Running a server with TLS enabled**
+
+The code below is a minimal working example of an Arrow server used to receive data with TLS. For a full server example, please visit the Arrow `GitHub repo`_. 
+
+.. testcode::
+    
+    import argparse
+    import pyarrow
+    import pyarrow.flight
+    
+    
+    class FlightServer(pyarrow.flight.FlightServerBase):
+        def __init__(self, host="localhost", location=None,
+                     tls_certificates=None, verify_client=False,
+                     root_certificates=None, auth_handler=None):
+            super(FlightServer, self).__init__(
+                location, auth_handler, tls_certificates, verify_client,
+                root_certificates)
+            self.flights = {}
+            self.host = host
+            self.tls_certificates = tls_certificates
+    
+        @classmethod
+        def descriptor_to_key(self, descriptor):
+            return (descriptor.descriptor_type.value, descriptor.command,
+                    tuple(descriptor.path or tuple()))
+    
+        def do_put(self, context, descriptor, reader, writer):
+            key = FlightServer.descriptor_to_key(descriptor)
+            print(key)
+            self.flights[key] = reader.read_all()
+            print(self.flights[key])
+    
+    
+    def main():
+        parser = argparse.ArgumentParser()
+        parser.add_argument("--tls", nargs=2, default=None, metavar=('CERTFILE', 'KEYFILE'))
+        args = parser.parse_args()                                
+        tls_certificates = []
+    
+        scheme = "grpc+tls"
+        host = "localhost"
+        port = "5005"
+        
+        with open(args.tls[0], "rb") as cert_file:
+            tls_cert_chain = cert_file.read()
+        with open(args.tls[1], "rb") as key_file:
+            tls_private_key = key_file.read()
+    
+        tls_certificates.append((tls_cert_chain, tls_private_key))
+        
+        location = "{}://{}:{}".format(scheme, host, port)
+    
+        server = FlightServer(host, location,
+                              tls_certificates=tls_certificates)
+        print("Serving on", location)
+        server.serve()
+    
+    
+    if __name__ == '__main__':
+        main()
+
+Running the server, you should see ``Serving on grpc+tls://localhost:5005``.
+
+**Step 3 - Securely Connecting to the Server**
+Suppose we want to connect to the client and push some data to it. The following code securely sends information to the server using TLS encryption.
+The example below shows how one could  
+
+.. testcode::
+    
+    import argparse
+    import pyarrow
+    import pyarrow.flight
+    import pandas as pd
+    
+    # Assumes incoming data object is a Dataframe

Review Comment:
   ```suggestion
       # Assumes incoming data object is a Pandas DataFrame
   ```



##########
python/source/flight.rst:
##########
@@ -605,3 +605,138 @@ Or if we use the wrong credentials on login, we also get an error:
     server.shutdown()
 
 .. _(HTTP) basic authentication: https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme
+
+Authentication with certificates
+=================================
+
+Following on from the previous scenario where traffic to the server is managed via a username and password, 
+HTTPS (more specifically TLS) communication allows an additional layer of security by encrypting messages
+between the client and server. This is achieved using certificates. During development, the easiest 
+approach is developing with self-signed certificates. At startup, the server loads the public and private 
+key and the client client authenticates itself to the server with the tls root certificate.
+
+.. note:: In production environments it is recommended to make use of a certificate signed by a certificate authority.
+
+**Step 1 - Generating the Self Signed Certificate**  
+
+Generate a self-signed certificate by using dotnet on `Windows`_, or `openssl`_ on Linux or MacOS. 
+Alternatively, the self-signed certificate from the `Arrow testing data repository`_ can be used. 
+Depending on the file generated, you may need to convert it to a .crt and .key file as required for the Arrow server. 
+One method to achieve this is openssl, please visit this `IBM article`_ for more info. 
+
+
+**Step 2 - Running a server with TLS enabled**
+
+The code below is a minimal working example of an Arrow server used to receive data with TLS. For a full server example, please visit the Arrow `GitHub repo`_. 
+
+.. testcode::
+    
+    import argparse
+    import pyarrow
+    import pyarrow.flight
+    
+    
+    class FlightServer(pyarrow.flight.FlightServerBase):
+        def __init__(self, host="localhost", location=None,
+                     tls_certificates=None, verify_client=False,
+                     root_certificates=None, auth_handler=None):
+            super(FlightServer, self).__init__(
+                location, auth_handler, tls_certificates, verify_client,
+                root_certificates)
+            self.flights = {}
+            self.host = host
+            self.tls_certificates = tls_certificates
+    
+        @classmethod
+        def descriptor_to_key(self, descriptor):
+            return (descriptor.descriptor_type.value, descriptor.command,
+                    tuple(descriptor.path or tuple()))
+    
+        def do_put(self, context, descriptor, reader, writer):
+            key = FlightServer.descriptor_to_key(descriptor)
+            print(key)
+            self.flights[key] = reader.read_all()
+            print(self.flights[key])
+    
+    
+    def main():
+        parser = argparse.ArgumentParser()
+        parser.add_argument("--tls", nargs=2, default=None, metavar=('CERTFILE', 'KEYFILE'))
+        args = parser.parse_args()                                
+        tls_certificates = []
+    
+        scheme = "grpc+tls"
+        host = "localhost"
+        port = "5005"
+        
+        with open(args.tls[0], "rb") as cert_file:
+            tls_cert_chain = cert_file.read()
+        with open(args.tls[1], "rb") as key_file:
+            tls_private_key = key_file.read()
+    
+        tls_certificates.append((tls_cert_chain, tls_private_key))
+        
+        location = "{}://{}:{}".format(scheme, host, port)
+    
+        server = FlightServer(host, location,
+                              tls_certificates=tls_certificates)
+        print("Serving on", location)
+        server.serve()
+    
+    
+    if __name__ == '__main__':
+        main()
+
+Running the server, you should see ``Serving on grpc+tls://localhost:5005``.
+
+**Step 3 - Securely Connecting to the Server**
+Suppose we want to connect to the client and push some data to it. The following code securely sends information to the server using TLS encryption.
+The example below shows how one could  
+
+.. testcode::
+    
+    import argparse
+    import pyarrow
+    import pyarrow.flight
+    import pandas as pd
+    
+    # Assumes incoming data object is a Dataframe
+    def push_to_server(name, data, client):
+        objectToSend = pyarrow.Table.from_pandas(data)

Review Comment:
   nit: stick to `snake_case` in Python





[GitHub] [arrow-cookbook] lidavidm commented on pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
lidavidm commented on PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#issuecomment-1095468241

   Thank you!




[GitHub] [arrow-cookbook] thatstatsguy commented on pull request #177: Include Python Server Client Example with Certificates

Posted by GitBox <gi...@apache.org>.
thatstatsguy commented on PR #177:
URL: https://github.com/apache/arrow-cookbook/pull/177#issuecomment-1095298568

   Appreciate the re-review! Addressed everything as requested except the one comment where the code is being used in the flight server. Hoping for no more copy-paste oopsies!

