You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "abandy (via GitHub)" <gi...@apache.org> on 2023/04/02 03:54:32 UTC
[GitHub] [arrow] abandy opened a new pull request, #34842: [Swift]Initial arrow reader impl
abandy opened a new pull request, #34842:
URL: https://github.com/apache/arrow/pull/34842
Initial check in for the swift arrow reader impl and fixes found during reader testing.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1181079541
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
Could you remove this file?
Then we can merge this pull request.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1179861325
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
Could you remove this from this repository and build it on CI time instead?
I don't want to add auto generated files as much as possible.
##########
swift/Arrow/README.md:
##########
@@ -47,3 +47,10 @@ An implementation of Arrow targeting Swift.
- Data Types
- Fields
- Schema
+
+## Test data generation
+
+Test data files for the reader tests are generated by an executable built in go whose source is included in the data-generator directory.
+```sh
Review Comment:
```suggestion
```console
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1180479976
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
Do you know if we will need to add golang to the swift docker image for this?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1170597326
##########
swift/gen-flatbuffers.sh:
##########
@@ -0,0 +1,26 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
Review Comment:
Will update.
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
Generated with golang
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1181088721
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
Good catch, will do.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1157128294
##########
swift/Arrow/Tests/ArrowTests/testdata.arrow:
##########
Review Comment:
I created them using python but can switch to C++ and build executable.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1158086446
##########
swift/Arrow/Sources/Arrow/ArrowReader.swift:
##########
@@ -0,0 +1,163 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+import FlatBuffers
+import Foundation
+
+public enum ArrowError: Error {
+ case runtimeError(String)
+}
+
+let FILEMARKER = "ARROW1"
+let CONTINUATIONMARKER = -1
+
+public class ArrowReader {
Review Comment:
> This class contains a method for fromFile and fromStream.
I see. I misread the implementation. Sorry. I think that it's OK with the current implementation as the first step. We can improve it in follow-up pull requests.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou merged pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou merged PR #34842:
URL: https://github.com/apache/arrow/pull/34842
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1161043867
##########
swift/Arrow/Tests/ArrowTests/testdata.arrow:
##########
Review Comment:
Do we need to install python into the docker image to build the test data or will it run before the image is built and be pulled into the image
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1170598996
##########
swift/gen-flatbuffers.sh:
##########
@@ -0,0 +1,26 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+pushd Arrow/Sources/Arrow/
+flatc --swift ../../../../format/Message.fbs
+flatc --swift ../../../../format/Schema.fbs
+flatc --swift ../../../../format/SparseTensor.fbs
+flatc --swift ../../../../format/Tensor.fbs
+flatc --swift ../../../../format/File.fbs
+popd
Review Comment:
Will do.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1170682882
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
I actually didn't keep the code :). This is only temporary until we get the writer working, so maybe this is ok until then?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1170558285
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
How did you generate this binary?
##########
swift/gen-flatbuffers.sh:
##########
@@ -0,0 +1,26 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
Review Comment:
FYI: We can check "failed to execute a command" and "referred a nonexistent variable" automatically by `set -eu` (`-e` is for "failed to execute a command" and `-u` is for "referred a nonexistent variable"):
```suggestion
set -eu
```
##########
swift/gen-flatbuffers.sh:
##########
@@ -0,0 +1,26 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+pushd Arrow/Sources/Arrow/
+flatc --swift ../../../../format/Message.fbs
+flatc --swift ../../../../format/Schema.fbs
+flatc --swift ../../../../format/SparseTensor.fbs
+flatc --swift ../../../../format/Tensor.fbs
+flatc --swift ../../../../format/File.fbs
+popd
Review Comment:
Could you prepend our license header to the generated files?
```suggestion
cat <<HEADER > header.swift
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
HEADER
for generated_swift in *_generated.swift; do
mv ${generated_swift} ${generated_swift}.orig
cat header.swift ${generated_swift}.orig > ${generated_swift}
rm ${generated_swift}.orig
done
rm header.swift
popd
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1170596151
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
Generate with golang
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1170662738
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
I see.
Could you add source code instead of built binary so that anyone can update the program?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1180479016
##########
swift/Arrow/README.md:
##########
@@ -47,3 +47,10 @@ An implementation of Arrow targeting Swift.
- Data Types
- Fields
- Schema
+
+## Test data generation
+
+Test data files for the reader tests are generated by an executable built in go whose source is included in the data-generator directory.
+```sh
Review Comment:
Will update.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1170708137
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
Hmm.
Does it mean that your next pull request implements double/bool writer? And the following pull requests implement reader and writer at once instead of implementing reader with test data then implementing writer and remove the test data like this, right?
If so, we can embed the two test data into our test in base64 as you did because they are temporary, we don't update them and we don't add other test data.
If we still keep implementing reader with test data then implementing writer and remove the test data style, I think that we should add source code of test data generator program so that anyone can join the Swift implementation development.
See also: The Apache Way: Open: https://theapacheway.com/open/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1157937391
##########
swift/Arrow/Tests/ArrowTests/testdata.arrow:
##########
Review Comment:
Add the test data to the test as base64 encoded strings.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1158076786
##########
swift/Arrow/Sources/Arrow/ArrowReader.swift:
##########
@@ -0,0 +1,163 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+import FlatBuffers
+import Foundation
+
+public enum ArrowError: Error {
+ case runtimeError(String)
+}
+
+let FILEMARKER = "ARROW1"
+let CONTINUATIONMARKER = -1
+
+public class ArrowReader {
Review Comment:
> the file format wraps the stream format with some extra data
Correct.
But the file format can random read:
https://arrow.apache.org/docs/format/Columnar.html#ipc-file-format
> This enables random access to any record batch in the file.
The streaming format needs to read from start to end. Because delta dictionary replacement is supported in the streaming format.
See also: https://arrow.apache.org/docs/format/Columnar.html#ipc-file-format
> In the file format, there is no requirement that dictionary keys should be defined in a DictionaryBatch before they are used in a RecordBatch, as long as the keys are defined somewhere in the file. Further more, it is invalid to have more than one non-delta dictionary batch per dictionary ID (i.e. dictionary replacement is not supported). Delta dictionaries are applied in the order they appear in the file footer.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1173235178
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
Gotcha, good idea, I will add test data generator source code under swift/data-generator/swift-datagen and add a section to the README on how to build it.
I do plan to add the writer for double/bools in the next PR but having the golang test data will be a good way to test the reader impl.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1156400638
##########
swift/Arrow/Sources/Arrow/File_generated.swift:
##########
Review Comment:
Should we add these generated files to this repository?
Can we generate them in build time instead of adding them?
##########
swift/Arrow/Tests/ArrowTests/testdata.arrow:
##########
Review Comment:
How did you create these test data?
If you created them by using other Apache Arrow implementation such as C++ implementation, how about adding a program that generates them and generating them in test time instead of adding these test data?
##########
swift/Arrow/Sources/Arrow/ChunkedArray.swift:
##########
@@ -42,8 +45,32 @@ public class ChunkedArray<T> {
self.nullCount = nullCount
}
- subscript(_ index: UInt) -> ArrowArray<T> {
- return arrays[Int(index)]
+ public subscript(_ index: UInt) -> T? {
+ if arrays.count == 0 {
+ return nil
+ }
+
+ var localIndex = index
+ var arrayIndex = 0;
+ var len: UInt = arrays[arrayIndex].length
+ while(localIndex > len) {
Review Comment:
Is a space missing here?
```suggestion
while (localIndex > len) {
```
##########
swift/Arrow/Sources/Arrow/ArrowReader.swift:
##########
@@ -0,0 +1,163 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+import FlatBuffers
+import Foundation
+
+public enum ArrowError: Error {
+ case runtimeError(String)
+}
+
+let FILEMARKER = "ARROW1"
+let CONTINUATIONMARKER = -1
+
+public class ArrowReader {
Review Comment:
We have "streaming format" https://arrow.apache.org/docs/format/Columnar.html#ipc-streaming-format and "file format" https://arrow.apache.org/docs/format/Columnar.html#ipc-file-format .
It seems that this is a reader for "file format".
We may want to add "file" to this class name such as `ArrowFileReader`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1157939806
##########
swift/Arrow/Sources/Arrow/File_generated.swift:
##########
Review Comment:
I misread the intent of this change. I believe these generated files need to be included in the repo. If someone wants to use this lib in there code then they would reference this package in their package.swift file and would need these generated files to be part of the code base.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1157130160
##########
swift/Arrow/Sources/Arrow/ChunkedArray.swift:
##########
@@ -42,8 +45,32 @@ public class ChunkedArray<T> {
self.nullCount = nullCount
}
- subscript(_ index: UInt) -> ArrowArray<T> {
- return arrays[Int(index)]
+ public subscript(_ index: UInt) -> T? {
+ if arrays.count == 0 {
+ return nil
+ }
+
+ var localIndex = index
+ var arrayIndex = 0;
+ var len: UInt = arrays[arrayIndex].length
+ while(localIndex > len) {
Review Comment:
Good catch! I will remove the parenthesis.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] ursabot commented on pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #34842:
URL: https://github.com/apache/arrow/pull/34842#issuecomment-1528933462
['Python', 'R'] benchmarks have high level of regressions.
[test-mac-arm](https://conbench.ursa.dev/compare/runs/14a7045811694c979985a5eb84833db6...10e4d724382a4301a13c7aa0ce29ecc7/)
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/e16d1045eab54f8ea631aaef6c113eaa...e977f7ce25ff471b8479da32a205df65/)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1181068739
##########
swift/data-generator/swift-datagen/go.sum:
##########
Review Comment:
Could you apply this to accept no-license header for this file?
```diff
diff --git a/dev/release/rat_exclude_files.txt b/dev/release/rat_exclude_files.txt
index d37790912..f61c21776 100644
--- a/dev/release/rat_exclude_files.txt
+++ b/dev/release/rat_exclude_files.txt
@@ -149,3 +149,4 @@ r/tools/nixlibs-allowlist.txt
.gitattributes
ruby/red-arrow/.yardopts
.github/pull_request_template.md
+swift/data-generator/swift-datagen/go.sum
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1161076001
##########
swift/Arrow/Tests/ArrowTests/testdata.arrow:
##########
Review Comment:
Could you use the former approach?
If we use the former approach, we don't need PyArrow available environment on host.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1157937391
##########
swift/Arrow/Tests/ArrowTests/testdata.arrow:
##########
Review Comment:
Added the test data to the test as base64 encoded strings. We can remove this data and generate it once the FileWriter has been implemented.
##########
swift/Arrow/Tests/ArrowTests/testdata.arrow:
##########
Review Comment:
Added the test data to the test as base64 encoded strings. We can remove this data and generate it once the FileWriter has been implemented?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1158069619
##########
swift/Arrow/Tests/ArrowTests/testdata.arrow:
##########
Review Comment:
> Added the test data to the test as base64 encoded strings.
It'll work for now but it will not work (too large data) when we have more supported patterns.
How about `generate_test_data.py` or something and generating test data in test time?
> We can remove this data and generate it once the FileWriter has been implemented?
Yes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #34842: GH-34858: [SWIFT] Initial arrow reader impl
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34842:
URL: https://github.com/apache/arrow/pull/34842#issuecomment-1494272330
* Closes: #34858
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1157937095
##########
swift/Arrow/Sources/Arrow/File_generated.swift:
##########
Review Comment:
Moved the contents of the file into the test as base64 encoded strings.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1161106518
##########
swift/Arrow/Tests/ArrowTests/testdata.arrow:
##########
Review Comment:
Gotcha, will do.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #34842: GH-34858: [SWIFT] Initial arrow reader impl
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34842:
URL: https://github.com/apache/arrow/pull/34842#issuecomment-1494272408
:warning: GitHub issue #34858 **has been automatically assigned in GitHub** to PR creator.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1159736248
##########
swift/Arrow/Tests/ArrowTests/testdata.arrow:
##########
Review Comment:
Will do.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1157132310
##########
swift/Arrow/Sources/Arrow/ArrowReader.swift:
##########
@@ -0,0 +1,163 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+import FlatBuffers
+import Foundation
+
+public enum ArrowError: Error {
+ case runtimeError(String)
+}
+
+let FILEMARKER = "ARROW1"
+let CONTINUATIONMARKER = -1
+
+public class ArrowReader {
Review Comment:
This class contains a method for fromFile and fromStream. AFAIK the file format wraps the stream format with some extra data, so the fromFile method basically just unwraps the Stream formatted message and sends that to the fromStream method. Does that sound correct?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] ursabot commented on pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #34842:
URL: https://github.com/apache/arrow/pull/34842#issuecomment-1528932288
Benchmark runs are scheduled for baseline = 16dbd98e8f525884649e5a33096c74a455f64610 and contender = 0ea1a103dfc2fc4ab5a5d839369a805e2ae657a2. 0ea1a103dfc2fc4ab5a5d839369a805e2ae657a2 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished :arrow_down:25.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/620273bd7942471f9e7acedb2af76abc...fe87885954c5458baf3d19f5f5c745e1/)
[Failed :arrow_down:1.31% :arrow_up:0.47%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/14a7045811694c979985a5eb84833db6...10e4d724382a4301a13c7aa0ce29ecc7/)
[Failed :arrow_down:12.27% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/e16d1045eab54f8ea631aaef6c113eaa...e977f7ce25ff471b8479da32a205df65/)
[Finished :arrow_down:0.99% :arrow_up:0.36%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/899737922d6f4ac783aaaa96bdc030a2...c7510f47499147138bc4d1b6fb890100/)
Buildkite builds:
[Finished] [`0ea1a103` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2798)
[Failed] [`0ea1a103` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2832)
[Failed] [`0ea1a103` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2796)
[Finished] [`0ea1a103` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2823)
[Finished] [`16dbd98e` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2797)
[Failed] [`16dbd98e` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2831)
[Failed] [`16dbd98e` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2795)
[Finished] [`16dbd98e` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2822)
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1181068298
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
Cool, I have updated the swift image to install golang and updated the script to build swift-datagen.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on PR #34842:
URL: https://github.com/apache/arrow/pull/34842#issuecomment-1513136678
@kou I think I made all the requested updates. Please review again and approve if all looks good?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #34842: [Swift]Initial arrow reader impl
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34842:
URL: https://github.com/apache/arrow/pull/34842#issuecomment-1493215809
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
Thanks for opening a pull request!
If this is not a [minor PR](https://github.com/apache/arrow/blob/main/CONTRIBUTING.md#Minor-Fixes). Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose
Opening GitHub issues ahead of time contributes to the [Openness](http://theapacheway.com/open/#:~:text=Openness%20allows%20new%20users%20the,must%20happen%20in%20the%20open.) of the Apache Arrow project.
Then could you also rename the pull request title in the following format?
GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}
or
MINOR: [${COMPONENT}] ${SUMMARY}
In the case of PARQUET issues on JIRA the title also supports:
PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}
See also:
* [Other pull requests](https://github.com/apache/arrow/pulls/)
* [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1181074608
##########
swift/data-generator/swift-datagen/go.sum:
##########
Review Comment:
Will do.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1180933056
##########
swift/Arrow/swift-datagen:
##########
Review Comment:
Yes. I think so.
I think that we can use `go run swift-datagen` instead of `./swift-datagen` in `ci/scripts/swift_test.sh`.
See also: https://pkg.go.dev/cmd/go
If it's difficult for you, I can help it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on PR #34842:
URL: https://github.com/apache/arrow/pull/34842#issuecomment-1526287573
Hi @kou, I hope all is well. I believe I have all the changes in. The failed test doesn't seem related to this change. Please review when you get a chance.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on PR #34842:
URL: https://github.com/apache/arrow/pull/34842#issuecomment-1529645523
Thank you @kou!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1157126522
##########
swift/Arrow/Sources/Arrow/File_generated.swift:
##########
Review Comment:
I guess we could generate them during build/test time.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] abandy commented on a diff in pull request #34842: GH-34858: [Swift] Initial reader impl
Posted by "abandy (via GitHub)" <gi...@apache.org>.
abandy commented on code in PR #34842:
URL: https://github.com/apache/arrow/pull/34842#discussion_r1159734067
##########
swift/Arrow/Sources/Arrow/File_generated.swift:
##########
Review Comment:
Will do.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org