You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by GitBox <gi...@apache.org> on 2021/08/01 12:27:36 UTC

[GitHub] [thrift] jorgecarleitao opened a new pull request #2426: THRIFT-5446: Added code-gen for rust async read.

jorgecarleitao opened a new pull request #2426:
URL: https://github.com/apache/thrift/pull/2426


   This draft PR adds support to generate the async version for Rust. Note that when it comes to async, Rust is effectively colored, and thus we need both versions.
   
   It introduces:
   
   * a small generalization of the compiler to generate async next to the sync version
   * a trait in the library, `TInputStreamProtocol` used to consume the stream
   
   I verified that this code generates a valid implementation for `parquet.thrift`, from which a diff can be seen [here](https://github.com/jorgecarleitao/parquet-format-rs/pull/1/files).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@thrift.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [thrift] jorgecarleitao commented on pull request #2426: THRIFT-5446: Added code-gen for rust async read.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #2426:
URL: https://github.com/apache/thrift/pull/2426#issuecomment-894806548


   pushed an implementation of the writing and corresponding protocol `TOutputStreamProtocol` and implementation for Compact.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@thrift.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [thrift] jorgecarleitao commented on pull request #2426: THRIFT-5446: Added code-gen for rust async read.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #2426:
URL: https://github.com/apache/thrift/pull/2426#issuecomment-890551249


   Thanks!
   
   > A couple of high-level thoughts from skimming through the code. I wonder if there's a better way of generalizing the functions than adding an is_sync flag everywhere and repeating the is_sync block in multiple places to add the async prefix and use .await?. There are a couple of places where we now call a function with multiple boolean args, and without keyword args it can get confusing.
   
   I have no strong opinions here: the approach here just avoids code duplication by branching of with the `is_sync`. I personally have no problem with these arguments, but happy to change it to whatever makes sense to you.
   
   > Finally, I've never used async-trait. Do you know if the Thrift generator requires any features that are not supported by it?
   
   `async-trait` is just a macro to enable traits with `async`. We have been using it in Apache Arrow (e.g. [here](https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/physical_plan/hash_aggregate.rs#L194)) to declare a trait that is `async`. `integer-encoding`, which we depend on, also uses it when compiled with `futures_async`. The declarations do not need this, only the lib declaring and implementing the trait.
   
   > Next, I would like to see a rough cut of how TInputStreamProtocol is implemented, just to see how everything fits together.
   
   I agree. I have been working on it today. I just pushed a draft to this. Still not compiling, but it gives a general idea; it is just mundane code of the equivalent `async` of the `TCompactInputProtocol`.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@thrift.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [thrift] jorgecarleitao commented on pull request #2426: THRIFT-5446: Added code-gen for rust async read.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #2426:
URL: https://github.com/apache/thrift/pull/2426#issuecomment-891187446


   @allengeorge , I have no idea what practices are used here wrt to testing, could you guide me in what you think makes sense to test so that I can cover this accordingly?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@thrift.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [thrift] allengeorge commented on pull request #2426: THRIFT-5446: Added code-gen for rust async read.

Posted by GitBox <gi...@apache.org>.
allengeorge commented on pull request #2426:
URL: https://github.com/apache/thrift/pull/2426#issuecomment-890546380


   First off, thank you for starting work on this feature!
   
   A couple of high-level thoughts from skimming through the code. I _wonder_ if there's a better way of generalizing the functions than adding an `is_sync` flag everywhere and repeating the `is_sync` block in multiple places to add the `async` prefix and use `.await?`. There are a couple of places where we now call a function with multiple boolean args, and without keyword args it can get confusing.
   
   Next, I would like to see a rough cut of how `TInputStreamProtocol` is implemented, just to see how everything fits together.
   
   Finally, I've never used `async-trait`. Do you know if the Thrift generator requires any features that are _not_ supported by it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@thrift.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [thrift] allengeorge commented on pull request #2426: THRIFT-5446: Added code-gen for rust async.

Posted by GitBox <gi...@apache.org>.
allengeorge commented on pull request #2426:
URL: https://github.com/apache/thrift/pull/2426#issuecomment-916584168


   > @allengeorge , I have no idea what practices are used here wrt to testing, could you guide me in what you think makes sense to test so that I can cover this accordingly?
   
   @jorgecarleitao I'm sorry for letting this linger. The last month has been very tough in terms of getting to work on open-source. I'm gonna try get up to speed as quickly as I can with the async space so that I can properly review this PR.
   
   wrt your question. Actually, the thrift project has quite a few test suites:
   
   # There are unit tests available for the Rust crate
   # There is a "kitchen-sink" test that allows you to build a client/server pair, and allow you to manually run through a battery of tests
   # There is a "cross-test" application that can be utilitized to test cross-language interoperability
   
   I'm happy to provide any details I can on any of these test suites


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@thrift.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [thrift] jorgecarleitao commented on pull request #2426: THRIFT-5446: Added code-gen for rust async read.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #2426:
URL: https://github.com/apache/thrift/pull/2426#issuecomment-890729544


   Small update: with this version of the code, I was able to generate new code for `parquet.thrift` from `parquet-format-rs` (https://github.com/sunchao/parquet-format-rs/pull/19) and use it to write an `async` version of parquet metadata (https://github.com/jorgecarleitao/parquet2/pull/33) that reads a large metadata file correctly. Thus, this end-to-end test seems to work as intended.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@thrift.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [thrift] jorgecarleitao commented on a change in pull request #2426: THRIFT-5446: Added code-gen for rust async read.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #2426:
URL: https://github.com/apache/thrift/pull/2426#discussion_r680535842



##########
File path: lib/rs/src/protocol/compact_stream.rs
##########
@@ -0,0 +1,301 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use std::convert::{From, TryFrom};
+use std::io;
+
+use byteorder::LittleEndian;
+use futures::{AsyncRead, AsyncReadExt, AsyncSeek, AsyncSeekExt};
+use integer_encoding::reader::VarIntAsyncReader;
+
+use super::compact::collection_u8_to_type;
+use super::{
+    TFieldIdentifier, TInputProtocol, TInputStreamProtocol, TListIdentifier, TMapIdentifier,
+    TMessageIdentifier, TMessageType,
+};
+use super::{TSetIdentifier, TStructIdentifier, TType};
+
+const COMPACT_PROTOCOL_ID: u8 = 0x82;
+const COMPACT_VERSION: u8 = 0x01;
+const COMPACT_VERSION_MASK: u8 = 0x1F;
+
+#[derive(Debug)]
+pub struct TCompactInputStreamProtocol<T: AsyncRead + Unpin + Send> {

Review comment:
       I think we can re-use the same struct and only need to specialize when implementing the trait, for now I just wanted the full flexibility, we can clean it up at the end.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@thrift.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [thrift] allengeorge edited a comment on pull request #2426: THRIFT-5446: Added code-gen for rust async.

Posted by GitBox <gi...@apache.org>.
allengeorge edited a comment on pull request #2426:
URL: https://github.com/apache/thrift/pull/2426#issuecomment-916584168


   > @allengeorge , I have no idea what practices are used here wrt to testing, could you guide me in what you think makes sense to test so that I can cover this accordingly?
   
   @jorgecarleitao I'm sorry for letting this linger. The last month has been very tough in terms of getting to work on open-source. I'm gonna try get up to speed as quickly as I can with the async space so that I can properly review this PR.
   
   wrt your question. Actually, the thrift project has quite a few test suites:
   
   1. There are unit tests available for the Rust crate
   2. There is a "kitchen-sink" test that allows you to build a client/server pair, and allow you to manually run through a battery of tests
   3. There is a "cross-test" application that can be utilitized to test cross-language interoperability
   
   I'm happy to provide any details I can on any of these test suites


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@thrift.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org