Posted to dev@parquet.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/06/04 15:36:00 UTC

[jira] [Commented] (PARQUET-1311) Update README.md

    [ https://issues.apache.org/jira/browse/PARQUET-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500386#comment-16500386 ] 

ASF GitHub Bot commented on PARQUET-1311:
-----------------------------------------

zivanfi closed pull request #487: PARQUET-1311: Update README.md
URL: https://github.com/apache/parquet-mr/pull/487

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/README.md b/README.md
index f084f5075..4b6b96a87 100644
--- a/README.md
+++ b/README.md
@@ -20,9 +20,9 @@
 Parquet MR [![Build Status](https://travis-ci.org/apache/parquet-mr.svg?branch=master)](http://travis-ci.org/apache/parquet-mr)
 ======
 
-Parquet-MR contains the java implementation of the [Parquet format](https://github.com/apache/parquet-format). 
+Parquet-MR contains the java implementation of the [Parquet format](https://github.com/apache/parquet-format).
 Parquet is a columnar storage format for Hadoop; it provides efficient storage and encoding of data.
-Parquet uses the [record shredding and assembly algorithm](https://github.com/Parquet/parquet-mr/wiki/The-striping-and-assembly-algorithms-from-the-Dremel-paper) described in the Dremel paper to represent nested structures.
+Parquet uses the [record shredding and assembly algorithm](https://github.com/julienledem/redelm/wiki/The-striping-and-assembly-algorithms-from-the-Dremel-paper) described in the Dremel paper to represent nested structures.
 
 You can find some details about the format and intended use cases in our [Hadoop Summit 2013 presentation](http://www.slideshare.net/julienledem/parquet-hadoop-summit-2013)
 
@@ -49,11 +49,11 @@ sudo ldconfig
 To build and install the thrift compiler, run:
 
 ```
-wget -nv http://archive.apache.org/dist/thrift/0.7.0/thrift-0.7.0.tar.gz
-tar xzf thrift-0.7.0.tar.gz
-cd thrift-0.7.0
+wget -nv http://archive.apache.org/dist/thrift/0.9.3/thrift-0.9.3.tar.gz
+tar xzf thrift-0.9.3.tar.gz
+cd thrift-0.9.3
 chmod +x ./configure
-./configure --disable-gen-erl --disable-gen-hs --without-ruby --without-haskell --without-erlang
+./configure --disable-gen-erl --disable-gen-hs --without-ruby --without-haskell --without-erlang --without-php --without-nodejs
 sudo make install
 ```
 
@@ -67,31 +67,29 @@ LC_ALL=C mvn clean install
 
 ## Features
 
-Parquet is a very active project, and new features are being added quickly; below is the state as of June 2013.
-
-
-<table>
-  <tr><th>Feature</th><th>In trunk</th><th>In dev</th><th>Planned</th><th>Expected release</th></tr>
-  <tr><td>Type-specific encoding</td><td>YES</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Hive integration</td><td>YES (<a href ="https://github.com/Parquet/parquet-mr/pull/28">28</a>)</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Pig integration</td><td>YES</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Cascading integration</td><td>YES</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Crunch integration</td><td>YES (<a href ="https://issues.apache.org/jira/browse/CRUNCH-277">CRUNCH-277</a>)</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Impala integration</td><td>YES (non-nested)</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Java Map/Reduce API</td><td>YES</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Native Avro support</td><td>YES</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Native Thrift support</td><td>YES</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Complex structure support</td><td>YES</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Future-proofed versioning</td><td>YES</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>RLE</td><td>YES</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Bit Packing</td><td>YES</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Adaptive dictionary encoding</td><td>YES</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Predicate pushdown</td><td>YES (<a href ="https://github.com/Parquet/parquet-mr/pull/68">68</a>)</td><td></td></td><td></td><td>1.0</td></tr>
-  <tr><td>Column stats</td><td>YES</td><td></td></td><td></td><td>2.0</td></tr>  
-  <tr><td>Delta encoding</td><td>YES</td><td></td></td><td></td><td>2.0</td></tr>
-  <tr><td>Native Protocol Buffers support</td><td>YES</td><td></td><td></td><td>1.0</td></tr>
-  <tr><td>Index pages</td><td></td><td></td></td><td>YES</td><td>2.0</td></tr>
-</table>
+Parquet is a very active project, and new features are being added quickly. Here are a few features:
+
+
+* Type-specific encoding
+* Hive integration
+* Pig integration
+* Cascading integration
+* Crunch integration
+* Apache Arrow integration
+* Apache Scrooge integration
+* Impala integration (non-nested)
+* Java Map/Reduce API
+* Native Avro support
+* Native Thrift support
+* Native Protocol Buffers support
+* Complex structure support
+* Run-length encoding (RLE)
+* Bit Packing
+* Adaptive dictionary encoding
+* Predicate pushdown
+* Column stats
+* Delta encoding
+* Index pages
 
 ## Map/Reduce integration
 
@@ -138,46 +136,44 @@ Hive integration is provided via the [parquet-hive](https://github.com/apache/pa
 
 ## Build
 
-to run the unit tests:
-mvn test
+To run the unit tests: `mvn test`
 
-to build the jars:
-mvn package
+To build the jars: `mvn package`
 
 The build runs in [Travis CI](http://travis-ci.org/apache/parquet-mr):
 [![Build Status](https://travis-ci.org/apache/parquet-mr.svg?branch=master)](http://travis-ci.org/apache/parquet-mr)
 
 ## Add Parquet as a dependency in Maven
-The current release is version `1.8.1`
+The current release is version `1.10.0`
 
 ```xml
   <dependencies>
     <dependency>
       <groupId>org.apache.parquet</groupId>
       <artifactId>parquet-common</artifactId>
-      <version>1.8.1</version>
+      <version>1.10.0</version>
     </dependency>
     <dependency>
       <groupId>org.apache.parquet</groupId>
       <artifactId>parquet-encoding</artifactId>
-      <version>1.8.1</version>
+      <version>1.10.0</version>
     </dependency>
     <dependency>
       <groupId>org.apache.parquet</groupId>
       <artifactId>parquet-column</artifactId>
-      <version>1.8.1</version>
+      <version>1.10.0</version>
     </dependency>
     <dependency>
       <groupId>org.apache.parquet</groupId>
       <artifactId>parquet-hadoop</artifactId>
-      <version>1.8.1</version>
+      <version>1.10.0</version>
     </dependency>
   </dependencies>
 ```
 
 ### How To Contribute
 
-We prefer to receive contributions in the form of GitHub pull requests. Please send pull requests against the [github.com/apache/parquet-mr](https://github.com/apache/parquet-mr) repository. If you've previously forked Parquet from its old location, you will need to add a remote or update your origin remote to https://github.com/apache/parquet-mr.git
+We prefer to receive contributions in the form of GitHub pull requests. Please send pull requests against the [parquet-mr](https://github.com/apache/parquet-mr) Git repository. If you've previously forked Parquet from its old location, you will need to add a remote or update your origin remote to https://github.com/apache/parquet-mr.git
 
 If you are looking for some ideas on what to contribute, check out jira issues for this project labeled ["pick-me-up"](https://issues.apache.org/jira/browse/PARQUET-5?jql=project%20%3D%20PARQUET%20and%20labels%20%3D%20pick-me-up%20and%20status%20%3D%20open).
 Comment on the issue and/or contact [dev@parquet.apache.org](http://mail-archives.apache.org/mod_mbox/parquet-dev/) with your questions and ideas.
@@ -189,8 +185,8 @@ To contribute a patch:
   1. Break your work into small, single-purpose patches if possible. It’s much harder to merge in a large change with a lot of disjoint features.
   2. Create a JIRA for your patch on the [Parquet Project JIRA](https://issues.apache.org/jira/browse/PARQUET).
   3. Submit the patch as a GitHub pull request against the master branch. For a tutorial, see the GitHub guides on forking a repo and sending a pull request. Prefix your pull request name with the JIRA name (ex: https://github.com/apache/parquet-mr/pull/240).
-  4. Make sure that your code passes the unit tests. You can run the tests with `mvn test` in the root directory. 
-  5. Add new unit tests for your code. 
+  4. Make sure that your code passes the unit tests. You can run the tests with `mvn test` in the root directory.
+  5. Add new unit tests for your code.
 
 We tend to do fairly close readings of pull requests, and you may get a lot of comments. Some common issues that are not code structure related, but still important:
   * Use 2 spaces for whitespace. Not tabs, not 4 spaces. The number of the spacing shall be 2.
@@ -212,11 +208,11 @@ We hold ourselves and the Parquet developer community to two codes of conduct:
   2. [The Twitter OSS Code of Conduct](https://github.com/twitter/code-of-conduct/blob/master/code-of-conduct.md)
 
 ## Discussions
-* Mailing list: [dev@parquet.apache.org](http://mail-archives.apache.org/mod_mbox/parquet-dev/) 
+* Mailing list: [dev@parquet.apache.org](http://mail-archives.apache.org/mod_mbox/parquet-dev/)
 * Bug trackter: [jira](https://issues.apache.org/jira/browse/PARQUET)
 * Discussions also take place in github pull requests
 
 ## License
 
 Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
-See also: 
+See also:
diff --git a/dev/README.md b/dev/README.md
index 8fe30e077..b984b117a 100644
--- a/dev/README.md
+++ b/dev/README.md
@@ -27,7 +27,7 @@ Merging a pull request requires being a committer on the project.
 have an apache and apache-github remote setup
 ```
 git remote add apache-github https://github.com/apache/parquet-mr.git
-git remote add apache https://git-wip-us.apache.org/repos/asf/parquet-mr.git
+git remote add apache https://gitbox.apache.org/repos/asf?p=parquet-mr.git
 ```
 run the following command
 ```
@@ -50,7 +50,7 @@ source	repo/branch
 target	master
 url	https://api.github.com/repos/apache/parquet-mr/pulls/X
 
-Proceed with merging pull request #3? (y/n): 
+Proceed with merging pull request #3? (y/n):
 ```
 If this looks good, type y and hit enter.
 ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Update README.md
> ----------------
>
>                 Key: PARQUET-1311
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1311
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>            Reporter: Nandor Kollar
>            Assignee: Nandor Kollar
>            Priority: Minor
>
> The parquet-mr documentation is not up to date:
>  * it points to broken URLs
>  * it tells users to install an old Thrift version (while the build uses a newer one)
>  * it lists 1.8.1 as the current version, but 1.10.0 has already been released



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)