Posted to dev@tika.apache.org by "nddipiazza (via GitHub)" <gi...@apache.org> on 2024/03/29 11:35:14 UTC

[PR] Tika 4181 grpc [tika]

nddipiazza opened a new pull request, #1702:
URL: https://github.com/apache/tika/pull/1702

   Add an Apache Tika GRPC Server


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@tika.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] TIKA-4181 - Tika Pipes Grpc Server [tika]

Posted by "nddipiazza (via GitHub)" <gi...@apache.org>.
nddipiazza commented on code in PR #1702:
URL: https://github.com/apache/tika/pull/1702#discussion_r1545858324


##########
tika-pipes/tika-grpc/src/main/proto/tika.proto:
##########


Review Comment:
   I ran the normal proto linter. I'm going to leave the other stuff as is; the buf extension checks didn't seem to add much value for my context and added hours of work.





Re: [PR] TIKA-4181 - Tika Pipes Grpc Server [tika]

Posted by "nddipiazza (via GitHub)" <gi...@apache.org>.
nddipiazza commented on code in PR #1702:
URL: https://github.com/apache/tika/pull/1702#discussion_r1566194105


##########
tika-pipes/tika-grpc/src/main/proto/tika.proto:
##########
@@ -0,0 +1,92 @@
+// Copyright 2015 The gRPC Authors
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+syntax = "proto3";
+package tika;
+
+option java_multiple_files = true;
+option java_package = "org.apache.tika";
+option java_outer_classname = "TikaProto";
+option objc_class_prefix = "HLW";
+
+
+service Tika   {
+  rpc CreateFetcher(CreateFetcherRequest) returns (CreateFetcherReply) {}
+  rpc UpdateFetcher(UpdateFetcherRequest) returns (UpdateFetcherReply) {}
+  rpc GetFetcher(GetFetcherRequest) returns (GetFetcherReply) {}
+  rpc ListFetchers(ListFetchersRequest) returns (ListFetchersReply) {}
+  rpc DeleteFetcher(DeleteFetcherRequest) returns (DeleteFetcherReply) {}
+  rpc FetchAndParse(FetchAndParseRequest) returns (FetchAndParseReply) {}
+  rpc FetchAndParseServerSideStreaming(FetchAndParseRequest)
+    returns (stream FetchAndParseReply) {}
+  rpc FetchAndParseBiDirectionalStreaming(stream FetchAndParseRequest) 
+    returns (stream FetchAndParseReply) {}
+}
+
+message CreateFetcherRequest {
+  string name = 1;
+  string fetcher_class = 2;

Review Comment:
   A string is needed so people can add fetchers dynamically. Validation will make sure the class exists and will return a clear error message.
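
A minimal sketch of what such a class-name check could look like, using only the JDK. This is not the PR's actual validation code: `FetcherClassCheck`, its `validate` method, and the use of `Runnable` as a stand-in for Tika's fetcher interface are all assumptions made for illustration.

```java
import java.util.Optional;

// Hypothetical helper, not taken from the PR: checks that the string sent in
// CreateFetcherRequest.fetcher_class resolves to a loadable class of the expected type.
final class FetcherClassCheck {

    /** Returns an error message if the class name is unusable, or empty if it is fine. */
    static Optional<String> validate(String className, Class<?> expectedType) {
        if (className == null || className.trim().isEmpty()) {
            return Optional.of("fetcher_class must not be empty");
        }
        final Class<?> loaded;
        try {
            // initialize=false: resolve the class without running its static initializers
            loaded = Class.forName(className, false, FetcherClassCheck.class.getClassLoader());
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.of("fetcher_class not found on the classpath: " + className);
        }
        if (!expectedType.isAssignableFrom(loaded)) {
            return Optional.of(className + " does not implement " + expectedType.getName());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Runnable stands in for the real fetcher interface in this sketch.
        System.out.println(validate("java.lang.Thread", Runnable.class));
        System.out.println(validate("com.example.NoSuchFetcher", Runnable.class));
    }
}
```

Keeping the field a string and validating on the server lets custom fetchers outside the Tika project work without changing the proto, at the cost of moving the check from compile time to request time.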





Re: [PR] TIKA-4181 - Tika Pipes Grpc Server [tika]

Posted by "bartek (via GitHub)" <gi...@apache.org>.
bartek commented on code in PR #1702:
URL: https://github.com/apache/tika/pull/1702#discussion_r1546235205


##########
tika-pipes/tika-grpc/src/main/proto/tika.proto:
##########
@@ -0,0 +1,92 @@
+// Copyright 2015 The gRPC Authors
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+syntax = "proto3";
+package tika;
+
+option java_multiple_files = true;
+option java_package = "org.apache.tika";
+option java_outer_classname = "TikaProto";
+option objc_class_prefix = "HLW";
+
+
+service Tika   {
+  rpc CreateFetcher(CreateFetcherRequest) returns (CreateFetcherReply) {}
+  rpc UpdateFetcher(UpdateFetcherRequest) returns (UpdateFetcherReply) {}
+  rpc GetFetcher(GetFetcherRequest) returns (GetFetcherReply) {}
+  rpc ListFetchers(ListFetchersRequest) returns (ListFetchersReply) {}
+  rpc DeleteFetcher(DeleteFetcherRequest) returns (DeleteFetcherReply) {}
+  rpc FetchAndParse(FetchAndParseRequest) returns (FetchAndParseReply) {}
+  rpc FetchAndParseServerSideStreaming(FetchAndParseRequest)
+    returns (stream FetchAndParseReply) {}
+  rpc FetchAndParseBiDirectionalStreaming(stream FetchAndParseRequest) 
+    returns (stream FetchAndParseReply) {}
+}
+
+message CreateFetcherRequest {
+  string name = 1;
+  string fetcher_class = 2;

Review Comment:
   Should this be a protobuf enum containing the constrained set of classes? Or does Tika need to support arbitrary strings here in case of custom fetchers not included in the Tika project?
   





Re: [PR] TIKA-4181 - Tika Pipes Grpc Server [tika]

Posted by "nddipiazza (via GitHub)" <gi...@apache.org>.
nddipiazza commented on code in PR #1702:
URL: https://github.com/apache/tika/pull/1702#discussion_r1566193576


##########
tika-pipes/tika-grpc/src/main/proto/tika.proto:
##########
@@ -0,0 +1,92 @@
+// Copyright 2015 The gRPC Authors
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+syntax = "proto3";
+package tika;
+
+option java_multiple_files = true;
+option java_package = "org.apache.tika";
+option java_outer_classname = "TikaProto";
+option objc_class_prefix = "HLW";
+
+
+service Tika   {
+  rpc CreateFetcher(CreateFetcherRequest) returns (CreateFetcherReply) {}
+  rpc UpdateFetcher(UpdateFetcherRequest) returns (UpdateFetcherReply) {}
+  rpc GetFetcher(GetFetcherRequest) returns (GetFetcherReply) {}

Review Comment:
   added not-found detection





Re: [PR] TIKA-4181 - Tika Pipes Grpc Server [tika]

Posted by "bartek (via GitHub)" <gi...@apache.org>.
bartek commented on code in PR #1702:
URL: https://github.com/apache/tika/pull/1702#discussion_r1545596130


##########
tika-pipes/tika-grpc/src/main/proto/tika.proto:
##########
@@ -0,0 +1,90 @@
+// Copyright 2015 The gRPC Authors
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+syntax = "proto3";
+
+option java_multiple_files = true;
+option java_package = "org.apache.tika";
+option java_outer_classname = "TikaProto";
+option objc_class_prefix = "HLW";
+
+package tika;
+
+service Tika {
+  rpc CreateFetcher(CreateFetcherRequest) returns (CreateFetcherReply) {}
+  rpc UpdateFetcher(UpdateFetcherRequest) returns (UpdateFetcherReply) {}
+  rpc GetFetcher(GetFetcherRequest) returns (GetFetcherReply) {}
+  rpc ListFetchers(ListFetchersRequest) returns (ListFetchersReply) {}
+  rpc DeleteFetcher(DeleteFetcherRequest) returns (DeleteFetcherReply) {}
+  rpc FetchAndParse(FetchAndParseRequest) returns (FetchAndParseReply) {}
+  rpc FetchAndParseServerSideStreaming(FetchAndParseRequest) returns (stream FetchAndParseReply) {}
+  rpc FetchAndParseBiDirectionalStreaming(stream FetchAndParseRequest) returns (stream FetchAndParseReply) {}
+}
+
+message CreateFetcherRequest {
+  string name = 1;

Review Comment:
   Must `name` be unique across all initialized fetchers? To me, `name` implies a descriptive label; is this more of an ID?
   
   The use case I am thinking of is creating multiple fetchers with the same class. Right now I would create a unique name for each one. Is that the correct expectation?





Re: [PR] TIKA-4181 - Tika Pipes Grpc Server [tika]

Posted by "bartek (via GitHub)" <gi...@apache.org>.
bartek commented on code in PR #1702:
URL: https://github.com/apache/tika/pull/1702#discussion_r1544981545


##########
tika-pipes/tika-grpc/src/main/proto/tika.proto:
##########


Review Comment:
   For your consideration @nddipiazza, I ran `buf lint` on this protobuf (as I am syncing it to a local repository for development purposes) and here's the report:
   
   ```
   services/tika/pbtika/tika.proto:29:9:Service name "Tika" should be suffixed with "Service".
   services/tika/pbtika/tika.proto:35:3:"tika.FetchAndParseReply" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:35:3:"tika.FetchAndParseRequest" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:36:3:"tika.FetchAndParseReply" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:36:3:"tika.FetchAndParseRequest" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:36:40:RPC request type "FetchAndParseRequest" should be named "FetchAndParseServerSideStreamingRequest" or "TikaFetchAndParseServerSideStreamingRequest".
   services/tika/pbtika/tika.proto:37:3:"tika.FetchAndParseReply" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:37:3:"tika.FetchAndParseRequest" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:37:50:RPC request type "FetchAndParseRequest" should be named "FetchAndParseBiDirectionalStreamingRequest" or "TikaFetchAndParseBiDirectionalStreamingRequest".
   services/tika/pbtika/tika.proto:42:10:Field name "fetcherClass" should be lower_snake_case, such as "fetcher_class".
   services/tika/pbtika/tika.proto:52:10:Field name "fetcherClass" should be lower_snake_case, such as "fetcher_class".
   services/tika/pbtika/tika.proto:61:10:Field name "fetcherName" should be lower_snake_case, such as "fetcher_name".
   services/tika/pbtika/tika.proto:62:10:Field name "fetchKey" should be lower_snake_case, such as "fetch_key".
   services/tika/pbtika/tika.proto:67:10:Field name "fetchKey" should be lower_snake_case, such as "fetch_key".
   services/tika/pbtika/tika.proto:85:10:Field name "fetcherClass" should be lower_snake_case, such as "fetcher_class".
   services/tika/pbtika/tika.proto:90:9:Field name "pageNumber" should be lower_snake_case, such as "page_number".
   services/tika/pbtika/tika.proto:91:9:Field name "numFetchersPerPage" should be lower_snake_case, such as "num_fetchers_per_page".
   services/tika/pbtika/tika.proto:95:28:Field name "getFetcherReply" should be lower_snake_case, such as "get_fetcher_reply".
   Generating protobufs for ./proto/pbingest
   services/tika/pbtika/tika.proto:29:9:Service name "Tika" should be suffixed with "Service".
   services/tika/pbtika/tika.proto:35:3:"tika.FetchAndParseReply" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:35:3:"tika.FetchAndParseRequest" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:36:3:"tika.FetchAndParseReply" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:36:3:"tika.FetchAndParseRequest" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:36:40:RPC request type "FetchAndParseRequest" should be named "FetchAndParseServerSideStreamingRequest" or "TikaFetchAndParseServerSideStreamingRequest".
   services/tika/pbtika/tika.proto:37:3:"tika.FetchAndParseReply" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:37:3:"tika.FetchAndParseRequest" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:37:50:RPC request type "FetchAndParseRequest" should be named "FetchAndParseBiDirectionalStreamingRequest" or "TikaFetchAndParseBiDirectionalStreamingRequest".
   services/tika/pbtika/tika.proto:42:10:Field name "fetcherClass" should be lower_snake_case, such as "fetcher_class".
   services/tika/pbtika/tika.proto:52:10:Field name "fetcherClass" should be lower_snake_case, such as "fetcher_class".
   services/tika/pbtika/tika.proto:61:10:Field name "fetcherName" should be lower_snake_case, such as "fetcher_name".
   services/tika/pbtika/tika.proto:62:10:Field name "fetchKey" should be lower_snake_case, such as "fetch_key".
   services/tika/pbtika/tika.proto:67:10:Field name "fetchKey" should be lower_snake_case, such as "fetch_key".
   services/tika/pbtika/tika.proto:85:10:Field name "fetcherClass" should be lower_snake_case, such as "fetcher_class".
   services/tika/pbtika/tika.proto:90:9:Field name "pageNumber" should be lower_snake_case, such as "page_number".
   services/tika/pbtika/tika.proto:91:9:Field name "numFetchersPerPage" should be lower_snake_case, such as "num_fetchers_per_page".
   services/tika/pbtika/tika.proto:95:28:Field name "getFetcherReply" should be lower_snake_case, such as "get_fetcher_reply".
   Generating protobufs for ./services/tika/pbtika
   services/tika/pbtika/tika.proto:29:9:Service name "Tika" should be suffixed with "Service".
   services/tika/pbtika/tika.proto:35:3:"tika.FetchAndParseReply" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:35:3:"tika.FetchAndParseRequest" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:36:3:"tika.FetchAndParseReply" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:36:3:"tika.FetchAndParseRequest" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:36:40:RPC request type "FetchAndParseRequest" should be named "FetchAndParseServerSideStreamingRequest" or "TikaFetchAndParseServerSideStreamingRequest".
   services/tika/pbtika/tika.proto:37:3:"tika.FetchAndParseReply" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:37:3:"tika.FetchAndParseRequest" is used as the request or response type for multiple RPCs.
   services/tika/pbtika/tika.proto:37:50:RPC request type "FetchAndParseRequest" should be named "FetchAndParseBiDirectionalStreamingRequest" or "TikaFetchAndParseBiDirectionalStreamingRequest".
   services/tika/pbtika/tika.proto:42:10:Field name "fetcherClass" should be lower_snake_case, such as "fetcher_class".
   services/tika/pbtika/tika.proto:52:10:Field name "fetcherClass" should be lower_snake_case, such as "fetcher_class".
   services/tika/pbtika/tika.proto:61:10:Field name "fetcherName" should be lower_snake_case, such as "fetcher_name".
   services/tika/pbtika/tika.proto:62:10:Field name "fetchKey" should be lower_snake_case, such as "fetch_key".
   services/tika/pbtika/tika.proto:67:10:Field name "fetchKey" should be lower_snake_case, such as "fetch_key".
   services/tika/pbtika/tika.proto:85:10:Field name "fetcherClass" should be lower_snake_case, such as "fetcher_class".
   services/tika/pbtika/tika.proto:90:9:Field name "pageNumber" should be lower_snake_case, such as "page_number".
   services/tika/pbtika/tika.proto:91:9:Field name "numFetchersPerPage" should be lower_snake_case, such as "num_fetchers_per_page".
   services/tika/pbtika/tika.proto:95:28:Field name "getFetcherReply" should be lower_snake_case, such as "get_fetcher_reply".
   ```
   
   The [buf linter is pretty aggressive](https://buf.build/docs/lint/rules) but I appreciate it for that. Here are the rules I've set:
   
   ```
   lint:
     use:
       - DEFAULT
     except:
       - PACKAGE_VERSION_SUFFIX
       - RPC_RESPONSE_STANDARD_NAME
       - PACKAGE_DIRECTORY_MATCH
     rpc_allow_google_protobuf_empty_responses: true
   ```





Re: [PR] TIKA-4181 - Tika Pipes Grpc Server [tika]

Posted by "nddipiazza (via GitHub)" <gi...@apache.org>.
nddipiazza commented on code in PR #1702:
URL: https://github.com/apache/tika/pull/1702#discussion_r1545858447


##########
tika-pipes/tika-grpc/src/main/proto/tika.proto:
##########
@@ -0,0 +1,90 @@
+// Copyright 2015 The gRPC Authors
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+syntax = "proto3";
+
+option java_multiple_files = true;
+option java_package = "org.apache.tika";
+option java_outer_classname = "TikaProto";
+option objc_class_prefix = "HLW";
+
+package tika;
+
+service Tika {
+  rpc CreateFetcher(CreateFetcherRequest) returns (CreateFetcherReply) {}
+  rpc UpdateFetcher(UpdateFetcherRequest) returns (UpdateFetcherReply) {}
+  rpc GetFetcher(GetFetcherRequest) returns (GetFetcherReply) {}
+  rpc ListFetchers(ListFetchersRequest) returns (ListFetchersReply) {}
+  rpc DeleteFetcher(DeleteFetcherRequest) returns (DeleteFetcherReply) {}
+  rpc FetchAndParse(FetchAndParseRequest) returns (FetchAndParseReply) {}
+  rpc FetchAndParseServerSideStreaming(FetchAndParseRequest) returns (stream FetchAndParseReply) {}
+  rpc FetchAndParseBiDirectionalStreaming(stream FetchAndParseRequest) returns (stream FetchAndParseReply) {}
+}
+
+message CreateFetcherRequest {
+  string name = 1;

Review Comment:
   Yes, we are using "name" as the ID. @tballison any thoughts here? Maybe we should rename that for 3.x.





Re: [PR] TIKA-4181 - Tika Pipes Grpc Server [tika]

Posted by "bartek (via GitHub)" <gi...@apache.org>.
bartek commented on code in PR #1702:
URL: https://github.com/apache/tika/pull/1702#discussion_r1546232766


##########
tika-pipes/tika-grpc/src/main/proto/tika.proto:
##########
@@ -0,0 +1,92 @@
+// Copyright 2015 The gRPC Authors
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+syntax = "proto3";
+package tika;
+
+option java_multiple_files = true;
+option java_package = "org.apache.tika";
+option java_outer_classname = "TikaProto";
+option objc_class_prefix = "HLW";
+
+
+service Tika   {
+  rpc CreateFetcher(CreateFetcherRequest) returns (CreateFetcherReply) {}

Review Comment:
   Could we document these RPCs so their high-level behaviour is clear? For example, if I try to create a fetcher which already exists, what is the expected reply? Is it an error response on the RPC, or will CreateFetcherReply carry error-identifying information?
   
   If CreateFetcher were made idempotent, could we collapse these into a single RPC (UpdateFetcher) which either creates the Fetcher, updates it, or does nothing (a no-op when the call changes nothing)?
   
   Don't want to over complicate the Tika side of course, but curious if we can improve the client interface.
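
The single idempotent RPC suggested above could be sketched roughly as follows, using an in-memory map. Everything here is hypothetical and for illustration only: `FetcherRegistrySketch`, `Outcome`, and `upsert` are not names from the PR, and a plain string stands in for the fetcher configuration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical in-memory registry illustrating a create-or-update-or-no-op call.
final class FetcherRegistrySketch {
    enum Outcome { CREATED, UPDATED, UNCHANGED }

    // Fetcher name (used as the ID) mapped to its configuration.
    private final Map<String, String> configByName = new HashMap<>();

    /** Creates the fetcher if absent, updates it if the config differs, otherwise changes nothing. */
    synchronized Outcome upsert(String name, String config) {
        Objects.requireNonNull(name, "name");
        Objects.requireNonNull(config, "config");
        String previous = configByName.put(name, config);
        if (previous == null) {
            return Outcome.CREATED;
        }
        return previous.equals(config) ? Outcome.UNCHANGED : Outcome.UPDATED;
    }
}
```

Because repeating the same call leaves the registry in the same state, a client can safely retry after a timeout; returning the outcome still tells the caller whether anything changed.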



##########
tika-pipes/tika-grpc/src/main/proto/tika.proto:
##########
@@ -0,0 +1,92 @@
+// Copyright 2015 The gRPC Authors
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+syntax = "proto3";
+package tika;
+
+option java_multiple_files = true;
+option java_package = "org.apache.tika";
+option java_outer_classname = "TikaProto";
+option objc_class_prefix = "HLW";
+
+
+service Tika   {
+  rpc CreateFetcher(CreateFetcherRequest) returns (CreateFetcherReply) {}
+  rpc UpdateFetcher(UpdateFetcherRequest) returns (UpdateFetcherReply) {}
+  rpc GetFetcher(GetFetcherRequest) returns (GetFetcherReply) {}

Review Comment:
   Similar to above regarding documentation, it would be great to understand what happens if I try to get a fetcher which does not exist. Is there a distinct error, or do I simply get an empty GetFetcherReply?
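
To make the two possibilities in this question concrete, here is a small JDK-only sketch contrasting an empty reply with a distinct error. It does not show how the PR behaves; `FetcherLookupSketch` and its methods are hypothetical, and a string stands in for the reply message.

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical lookup showing two ways a "get" can treat an unknown fetcher name.
final class FetcherLookupSketch {
    private final Map<String, String> fetcherClassByName = new ConcurrentHashMap<>();

    void register(String name, String fetcherClass) {
        fetcherClassByName.put(name, fetcherClass);
    }

    /** Option A: an unknown name yields an empty value, which the caller must interpret. */
    String getOrEmpty(String name) {
        return fetcherClassByName.getOrDefault(name, "");
    }

    /** Option B: an unknown name raises a distinct error the caller cannot overlook. */
    String getOrThrow(String name) {
        String fetcherClass = fetcherClassByName.get(name);
        if (fetcherClass == null) {
            throw new NoSuchElementException("no fetcher named '" + name + "'");
        }
        return fetcherClass;
    }
}
```

In a gRPC server the distinct-error option would usually surface as a status code such as `NOT_FOUND` rather than a Java exception type, which keeps the reply message itself free of error fields.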




