Posted to issues@tajo.apache.org by eminency <gi...@git.apache.org> on 2016/01/12 08:35:16 UTC
[GitHub] tajo pull request: TAJO-2052: Upgrading ORC reader version
GitHub user eminency opened a pull request:
https://github.com/apache/tajo/pull/937
TAJO-2052: Upgrading ORC reader version
The ORC reader code is refined and is now based on presto-orc-0.132.
* The base data structure is changed from Vector to Block.
* Some bugs are fixed.
* Compatibility with Hive is improved.
* Unfortunately, there is no speed improvement.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/eminency/tajo orc_upgrade
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tajo/pull/937.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #937
----
commit 4b9b992fcdbfba0901145a6df494a12a96d0a974
Author: Jongyoung Park <em...@gmail.com>
Date: 2016-01-12T03:26:46Z
ORC version upgraded
commit 3c562d3eea0b421604167695e8151c9f1adc7085
Author: Jongyoung Park <em...@gmail.com>
Date: 2016-01-12T05:32:12Z
Types added
commit 696a8162056314c7975c2961917f080ed9943cf9
Author: Jongyoung Park <em...@gmail.com>
Date: 2016-01-12T07:19:34Z
comment modified
commit 9a2b54b381f35317ad71b9f5102f48279878edc8
Author: Jongyoung Park <em...@gmail.com>
Date: 2016-01-12T07:28:13Z
The document is refined
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] tajo pull request: TAJO-2052: Upgrading ORC reader version
Posted by blrunner <gi...@git.apache.org>.
Github user blrunner commented on a diff in the pull request:
https://github.com/apache/tajo/pull/937#discussion_r51085103
--- Diff: tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/thirdparty/orc/HdfsOrcDataSource.java ---
@@ -114,9 +117,9 @@ public void readFully(long position, byte[] buffer, int bufferOffset, int buffer
buffers.put(mergedRange, buffer);
}
- ImmutableMap.Builder<K, Slice> slices = ImmutableMap.builder();
+ ImmutableMap.Builder<K, FixedLengthSliceInput> slices = ImmutableMap.builder();
for (Entry<K, DiskRange> entry : diskRanges.entrySet()) {
--- End diff --
If you use a lambda expression, the following code would be simpler.
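For illustration only, here is a minimal sketch of what a lambda-based version of such a loop might look like. It uses only the JDK; `Range`, `sliceWithLoop`, and `sliceWithLambda` are hypothetical stand-ins, not the `DiskRange` or `FixedLengthSliceInput` types from the diff, and a `String` buffer stands in for the byte buffer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class LambdaSliceSketch {
    // Hypothetical stand-in for a disk range: an offset and a length within a buffer.
    static final class Range {
        final int offset;
        final int length;

        Range(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // Loop form, analogous to iterating over diskRanges.entrySet() in the diff.
    static <K> Map<K, String> sliceWithLoop(Map<K, Range> ranges, String buffer) {
        Map<K, String> slices = new LinkedHashMap<>();
        for (Map.Entry<K, Range> entry : ranges.entrySet()) {
            Range r = entry.getValue();
            slices.put(entry.getKey(), buffer.substring(r.offset, r.offset + r.length));
        }
        return slices;
    }

    // Lambda form: the same mapping expressed with a stream and Collectors.toMap.
    static <K> Map<K, String> sliceWithLambda(Map<K, Range> ranges, String buffer) {
        return ranges.entrySet().stream().collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> buffer.substring(e.getValue().offset,
                                      e.getValue().offset + e.getValue().length),
                (first, second) -> first, // never invoked: keys of a map are unique
                LinkedHashMap::new));     // keeps the iteration order of the input
    }

    public static void main(String[] args) {
        Map<String, Range> ranges = new LinkedHashMap<>();
        ranges.put("a", new Range(0, 3));
        ranges.put("b", new Range(3, 2));
        System.out.println(sliceWithLoop(ranges, "abcdefgh"));
        System.out.println(sliceWithLambda(ranges, "abcdefgh"));
    }
}
```

Both methods return the same map; whether the stream form is actually simpler in the real code depends on how the slices are built there.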
---
[GitHub] tajo pull request: TAJO-2052: Upgrading ORC reader version
Posted by eminency <gi...@git.apache.org>.
Github user eminency commented on the pull request:
https://github.com/apache/tajo/pull/937#issuecomment-175999530
@blrunner
It's an already reported issue. Refer to this:
https://issues.apache.org/jira/browse/TAJO-1929
I will fix it after this PR because it needs the ORC upgrade.
---
[GitHub] tajo pull request: TAJO-2052: Upgrading ORC reader version
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/tajo/pull/937
---
[GitHub] tajo pull request: TAJO-2052: Upgrading ORC reader version
Posted by blrunner <gi...@git.apache.org>.
Github user blrunner commented on the pull request:
https://github.com/apache/tajo/pull/937#issuecomment-176010907
OK, I understand your comment. Could you check my trivial comment?
---
[GitHub] tajo pull request: TAJO-2052: Upgrading ORC reader version
Posted by blrunner <gi...@git.apache.org>.
Github user blrunner commented on the pull request:
https://github.com/apache/tajo/pull/937#issuecomment-175968378
@eminency
Thank you for your contribution.
With MySQLStore, this PR runs successfully. With HiveCatalogStore, however, it throws a NoClassDefFoundError as follows:
```
2016-01-28 13:25:53,025 ERROR org.apache.tajo.master.GlobalEngine:
Stack Trace:
java.lang.NoClassDefFoundError: com/facebook/presto/hive/protobuf/CodedInputStream
at com.facebook.presto.orc.metadata.OrcMetadataReader.readPostScript(OrcMetadataReader.java:48)
at com.facebook.presto.orc.OrcReader.<init>(OrcReader.java:99)
at org.apache.tajo.storage.orc.ORCScanner.init(ORCScanner.java:136)
at org.apache.tajo.engine.planner.physical.SeqScanExec.initScanner(SeqScanExec.java:286)
at org.apache.tajo.engine.planner.physical.SeqScanExec.init(SeqScanExec.java:191)
at org.apache.tajo.engine.planner.physical.PartitionMergeScanExec.initScanExecutors(PartitionMergeScanExec.java:80)
at org.apache.tajo.engine.planner.physical.PartitionMergeScanExec.init(PartitionMergeScanExec.java:67)
```
For reference, I added the installation directory of Apache Hive 1.2.1 to the tajo-env.sh file.
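As a rough way to narrow down an error like this, the sketch below checks whether a named class is visible to the current class loader. It is an assumption-laden illustration, not part of Tajo: `ClasspathProbe` and `isLoadable` are hypothetical names, and the default class name is simply the one reported in the stack trace above.

```java
class ClasspathProbe {
    // Returns true if the class can be located by this class's loader.
    // The class is not initialized, so no static initializers run.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace; pass another name as an argument.
        String name = args.length > 0
                ? args[0]
                : "com.facebook.presto.hive.protobuf.CodedInputStream";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

Running it with the same classpath as the failing process would show whether the class is missing there; the result says nothing about why it is missing.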
---
[GitHub] tajo pull request: TAJO-2052: Upgrading ORC reader version
Posted by eminency <gi...@git.apache.org>.
Github user eminency commented on the pull request:
https://github.com/apache/tajo/pull/937#issuecomment-176605183
@blrunner
I fixed it. Please verify that it matches your intention.
---
[GitHub] tajo pull request: TAJO-2052: Upgrading ORC reader version
Posted by blrunner <gi...@git.apache.org>.
Github user blrunner commented on the pull request:
https://github.com/apache/tajo/pull/937#issuecomment-177064844
+1
LGTM! I'll commit this PR soon. :)
---