Posted to dev@parquet.apache.org by Sally Khudairi <sk...@apache.org> on 2015/04/24 22:17:25 UTC

FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]

Hello again, everyone --below is the latest draft.

Please review and forward any changes/additions no later than 5PM ET on Sunday in order for us to announce on Monday morning. I was aiming to go live by 7AM ET if that works for you. 

Kindly confirm. 

Thanks in advance,
Sally

= = =

DRAFT :: NOT FOR DISTRIBUTION

The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level Project 

Open Source storage format for the Apache™ Hadoop® ecosystem in use at Cloudera, NASA, Netflix, Stripe and Twitter, among other organizations 

Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Parquet™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles. 

"The incubation process at Apache has been fantastic and really the last step of making Parquet a community driven standard fully integrated within the greater Hadoop ecosystem," said Julien Le Dem, Vice President of Apache Parquet. 

Apache Parquet is an Open Source columnar storage format for the Apache™ Hadoop® ecosystem, built to work across programming languages and much more: 


 - processing frameworks (MapReduce, Apache Spark, Scalding, Cascading, Crunch, Kite) 
 - data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs) 
 - query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache Pig, Presto, Apache Spark SQL) 

"At Twitter, Parquet has helped us scale our big data usage by in some cases reducing storage requirements by one third on large datasets as well as scan and deserialization time. This translated into hardware savings as well as reduced latency for accessing the data. Furthermore, Parquet being integrated with so many tools creates opportunities and flexibility regarding query engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally, it's just fantastic to see it graduate to a top-level project and we look forward to further collaborating with the Apache Parquet community to continually improve performance." 

"Parquet’s integration with other object models, like Avro and Thrift, has been a key feature for our customers," said Ryan Blue, Software Engineer at Cloudera. "They can take advantage of columnar storage without changing the classes they already use in their production applications." 

"At Netflix, Parquet is the primary storage format for data warehousing. More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that we query across a wide range of tools including Apache Hive, Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce. The performance benefit of columnar projection and statistics is a game changer for our big data platform," said Daniel Weeks, Software Engineer at Netflix. "We look forward to working with the Apache community to advance the state of big data storage with Parquet and are excited to see the project graduate to full Apache status." 

"Stripe's data warehouse has been built on Parquet from the beginning," said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, from data import to machine learning to adhoc SQL analysis, uses Apache Parquet as the common interchange format." 

"I was extremely happy to see Parquet arrive as an Incubator project," said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect, Instrument and Science Data Systems Section at NASA Jet Propulsion Laboratory. "After talking with some in its community there was a real match with this columnar data format technology and its community with the way that we do things here at the ASF. Parquet has had an exemplar Incubation, and the project has big things ahead of it. I am encouraging my Data Science Team at NASA to evaluate it for data representation especially as it relates to our science holdings in Earth, planetary and space sciences, and astrophysics." 

The Apache Parquet project welcomes contributions and community participation through mailing lists, face-to-face MeetUps, and user events. For more information, visit http://parquet.apache.org/community/ 

Availability and Oversight 
Apache Parquet software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Parquet, visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet 

About the Apache Incubator 
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/. 

About The Apache Software Foundation (ASF) 
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Bloomberg, Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter. 

© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners. 

# # # 

[MEDIA CONTACT:SALLY]
________________________________
From: Sally Khudairi <sa...@yahoo.com>
To: Sally Khudairi <sa...@yahoo.com>; Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org> 
Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org> 
Sent: Friday, 24 April 2015, 13:56
Subject: Re: Graduation blog post?



Done.

ALL: can you please let me know if there are any events that Parquet will be at? Presenting? Hosting? etc. 

Thank you!
 
-Sally





________________________________
From: Sally Khudairi <sa...@yahoo.com>
To: Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org> 
Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org> 
Sent: Friday, 24 April 2015, 13:40
Subject: Re: Graduation blog post?



Of course --I'll fix that now!

Sorry about that, Daniel.

-Sally
 





________________________________
From: Daniel Weeks <dw...@netflix.com>
To: dev@parquet.incubator.apache.org; Sally Khudairi <sa...@yahoo.com> 
Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org> 
Sent: Friday, 24 April 2015, 13:38
Subject: Re: Graduation blog post?



Sally,

Just wanted to comment that my last name is misspelled in the Netflix testimonial.  Can someone fix that?  (it's Weeks, not Week)

Thanks,
Dan




On Fri, Apr 24, 2015 at 10:23 AM, Sally Khudairi <sa...@yahoo.com.invalid> wrote:

>Hi everyone --there's been the addition of a quote from Stripe:
>
>"Stripe's data warehouse has been built on Parquet from the beginning," said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, from data import to machine learning to adhoc SQL analysis, uses Apache Parquet as the common interchange format."
>
>
>--please note that I added "Apache" to "Parquet" in the second sentence. Stripe has also been added to the sub-head.
>
>Are we waiting for quotes from anyone else? If not, I can add a closing sentence and forward the final copy later today.
>
>Thanks so much,
>Sally
>
>
>
>----- Original Message -----
>
>From: Sally Khudairi <sa...@yahoo.com>
>To: Chris Aniszczyk <ca...@gmail.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>Cc: Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>Sent: Thursday, 23 April 2015, 15:25
>Subject: Re: Graduation blog post?
>
>Hello everyone --below is the draft thus far.
>
>
>I was aiming to announce on Monday by 7AM ET, but noticed that we're waiting for additional quotes.
>
>Also, should we get a closing quote from Julien? Perhaps something that invites additional community participation?
>
>Please let me know your thoughts.
>
>Thanks so much,
>Sally
>
>= = =
>
>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level Project
>
>Open Source storage format for the Apache™ Hadoop® ecosystem in use at Cloudera, NASA, Netflix, and Twitter, among other organizations
>
>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Parquet™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.
>
>"The incubation process at Apache has been fantastic and really the last step of making Parquet a community driven standard fully integrated within the greater Hadoop ecosystem." said Julien Le Dem, Vice President of Apache Parquet.
>
>Apache Parquet is an Open Source columnar storage format for the Apache™ Hadoop® ecosystem, built to work across programming languages and much more:
>- processing frameworks (MapReduce, Apache Spark, Scalding, Cascading, Crunch, Kite)
>- data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
>- query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache Pig, Presto, Apache Spark SQL)
>
>"At Twitter, Parquet has helped us scale our big data usage by in some cases reducing storage requirements by one third on large datasets as well as scan and deserialization time. This translated into hardware savings as well as reduced latency for accessing the data. Furthermore, Parquet being integrated with so many tools creates opportunities and flexibility regarding query engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally, it's just fantastic to see it graduate to a top-level project and we look forward to further collaborating with the Apache Parquet community to continually improve performance."
>
>"Parquet’s integration with other object models, like Avro and Thrift, has been a key feature for our customers," said Ryan Blue, Software Engineer at Cloudera. "They can take advantage of columnar storage without changing the classes they already use in their production applications."
>
>"At Netflix, Parquet is the primary storage format for data warehousing. More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that we query across a wide range of tools including Apache Hive, Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The performance benefit of columnar projection and statistics is a game changer for our big data platform," said Daniel Week, Software Engineer at Netflix. "We look forward to working with the Apache community to advance the state of big data storage with Parquet and are excited to see the project graduate to full Apache status."
>
>"I was extremely happy to see Parquet arrive as an Incubator project," said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect, Instrument and Science Data Systems Section at NASA Jet Propulsion Laboratory. "After talking with some in its community there was a real match with this columnar data format technology and its community with the way that we do things here at the ASF. Parquet has had an exemplar Incubation, and the project has big things ahead of it. I am encouraging my Data Science Team at NASA to evaluate it for data representation especially as it relates to our science holdings in Earth, planetary and space sciences, and astrophysics."
>
>
>Stripe? @cra reached out to Avi, said he would get something by Monday
>Criteo?
>
>@@CLOSING QUOTE FROM JULIEN?
>
>Availability and Oversight
>Apache Parquet software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Parquet, visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>
>About the Apache Incubator
>The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/.
>
>About The Apache Software Foundation (ASF)
>Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Bloomberg, Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter.
>
>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
>
># # #
>
>
>________________________________
>
>From: Chris Aniszczyk <ca...@gmail.com>
>To: "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>Cc: Sally Khudairi <sa...@yahoo.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>Sent: Wednesday, 22 April 2015, 14:51
>Subject: Re: Graduation blog post?
>
>
>
>Thanks Daniel, I added your quote.
>
>
>
>
>On Wed, Apr 22, 2015 at 12:14 PM, Daniel Weeks <dw...@netflix.com.invalid> wrote:
>
>>Netflix Testimonial:
>>
>>At Netflix, Parquet is the primary storage format for data warehousing.
>>More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
>>data that we query across a wide range of tools including Apache Hive,
>>Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The
>>performance benefit of columnar projection and statistics is a game changer
>>for our big data platform.  We look forward to working with the Apache
>>community to advance the state of big data storage with Parquet and are
>>excited to see the project graduate to full Apache status.
>>
>>Daniel Weeks
>>Engineering Manager - Big Data Compute
>>Netflix
>>
>>
>>On Wed, Apr 22, 2015 at 9:36 AM, Sally Khudairi <
>>sallykhudairi@yahoo.com.invalid> wrote:
>>
>>> Thanks for the draft thus far, Ryan.
>>> Can we please include at least one more industry testimonial?
>>> Also, if you can please provide edit access to my account at
>>> khudairi@gmail.com, that would be great.
>>> Thanks in advance for this!
>>> -Sally
>>>
>>>
>>>       From: Ryan Blue <bl...@cloudera.com>
>>>  To: jfarrell@apache.org; Sally Khudairi <sa...@yahoo.com>
>>> Cc: "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "
>>> press@apache.org" <pr...@apache.org>; "dev@parquet.incubator.apache.org" <
>>> dev@parquet.incubator.apache.org>
>>>  Sent: Monday, 20 April 2015, 15:48
>>>  Subject: Re: Graduation blog post?
>>>
>>> On 04/20/2015 12:36 PM, Jake Farrell wrote:
>>> > Hey Sally
>>> > i've got root@ karma and will take care of the infra side of things for
>>> > us once the board has successfully voted on our resolution
>>> >
>>> > -Jake
>>>
>>> Thanks, Jake! I've already sent an e-mail to Infra, but I'll follow up
>>> with this news so they don't worry about it.
>>>
>>> rb
>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Cloudera, Inc.
>>>
>>>
>>>
>>>
>>
>
>
>--
>
>Cheers,
>
>Chris Aniszczyk
>http://aniszczyk.org
>+1 512 961 6719
>

Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]

Posted by Sally Khudairi <sk...@apache.org>.
Hello, everyone --as promised, we are live:


 - NASDAQ Globenewswire http://globenewswire.com/news-release/2015/04/27/728529/10130773/en/The-Apache-Software-Foundation-Announces-Apache-tm-Parquet-tm-as-a-Top-Level-Project.html
 - ASF "Foundation" blog http://s.apache.org/L0H
 - @TheASF Twitter feed https://twitter.com/TheASF/status/592644433813884929

...plus announce@apache.org and our dedicated media/analyst list. This will appear on the apache.org homepage and the mail archives during the next auto-update, which should take place within the hour.

Thanks again for all your help, and congratulations on reaching this milestone!

Warmly,
Sally

________________________________
From: Sally Khudairi <sa...@yahoo.com>
To: Julien Le Dem <ju...@twitter.com> 
Cc: "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>; Sally Khudairi <sk...@apache.org>; Daniel Weeks <dw...@netflix.com>; Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org> 
Sent: Sunday, 26 April 2015, 19:22
Subject: Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]



Perfect. Thank you, Julien!

I'll confirm once we're live tomorrow morning.

Warmly,
Sally





________________________________
From: Julien Le Dem <ju...@twitter.com>
To: Sally Khudairi <sa...@yahoo.com> 
Cc: "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>; Sally Khudairi <sk...@apache.org>; Daniel Weeks <dw...@netflix.com>; Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org> 
Sent: Sunday, 26 April 2015, 19:21
Subject: Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]



Sounds good.
Thank you!




On Sunday, April 26, 2015, Sally Khudairi <sa...@yahoo.com> wrote:

>Thanks, Julien --I can include that, yes.
>
>Does this work for you?
>
>
><snip>
>
>Catch Apache Parquet in action at the Hadoop Summit, 9-11 June 2015 in San Jose, California. The Apache Parquet project welcomes contributions and community participation through mailing lists, face-to-face MeetUps, and user events. For more information, visit http://parquet.apache.org/community/
>
></snip>
>
>
>Warmest regards,
>Sally
>
>
>
>Did you want to mention the parquet talks at the Hadoop summit in June?
>Otherwise this looks good to me.
>
>
>
>
>On Sunday, April 26, 2015, Sally Khudairi <sa...@yahoo.com.invalid> wrote:
>
>>Hi everyone --I haven't received any other feedback, so I think we're all set to announce tomorrow.
>>I'd like to issue the press release at 7AM ET. I'll confirm when we're live.
>>If there are any showstoppers, please let me know ASAP.
>>Thanks so much,
>>Sally
>>
>>      From: Sally Khudairi <sk...@apache.org>
>> To: Sally Khudairi <sa...@yahoo.com>; Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>>Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>> Sent: Friday, 24 April 2015, 16:17
>> Subject: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]
>>
>>Hello again, everyone --below is the latest draft.
>>
>>Please review and forward any changes/additions no later than 5PM ET on Sunday in order for us to announce on Monday morning. I was aiming to go live by 7AM ET if that works for you.
>>
>>Kindly confirm.
>>
>>Thanks in advance,
>>Sally
>>
>>= = =
>>
>>DRAFT :: NOT FOR DISTRIBUTION
>>
>>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level Project
>>
>>Open Source storage format for the Apache™ Hadoop® ecosystem in use at Cloudera, NASA, Netflix, Stripe and Twitter, among other organizations
>>
>>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Parquet™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.
>>
>>"The incubation process at Apache has been fantastic and really the last step of making Parquet a community driven standard fully integrated within the greater Hadoop ecosystem," said Julien Le Dem, Vice President of Apache Parquet.
>>
>>Apache Parquet is an Open Source columnar storage format for the Apache™ Hadoop® ecosystem, built to work across programming languages and much more:
>>
>>
>> - processing frameworks (MapReduce, Apache Spark, Scalding, Cascading, Crunch, Kite)
>> - data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
>> - query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache Pig, Presto, Apache Spark SQL)
>>
>>"At Twitter, Parquet has helped us scale our big data usage by in some cases reducing storage requirements by one third on large datasets as well as scan and deserialization time. This translated into hardware savings as well as reduced latency for accessing the data. Furthermore, Parquet being integrated with so many tools creates opportunities and flexibility regarding query engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally, it's just fantastic to see it graduate to a top-level project and we look forward to further collaborating with the Apache Parquet community to continually improve performance."
>>
>>"Parquet’s integration with other object models, like Avro and Thrift, has been a key feature for our customers," said Ryan Blue, Software Engineer at Cloudera. "They can take advantage of columnar storage without changing the classes they already use in their production applications."
>>
>>"At Netflix, Parquet is the primary storage format for data warehousing. More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that we query across a wide range of tools including Apache Hive, Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce. The performance benefit of columnar projection and statistics is a game changer for our big data platform," said Daniel Weeks, Software Engineer at Netflix. "We look forward to working with the Apache community to advance the state of big data storage with Parquet and are excited to see the project graduate to full Apache status."
>>
>>"Stripe's data warehouse has been built on Parquet from the beginning," said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, from data import to machine learning to adhoc SQL analysis, uses Apache Parquet as the common interchange format."
>>
>>"I was extremely happy to see Parquet arrive as an Incubator project," said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect, Instrument and Science Data Systems Section at NASA Jet Propulsion Laboratory. "After talking with some in its community there was a real match with this columnar data format technology and its community with the way that we do things here at the ASF. Parquet has had an exemplar Incubation, and the project has big things ahead of it. I am encouraging my Data Science Team at NASA to evaluate it for data representation especially as it relates to our science holdings in Earth, planetary and space sciences, and astrophysics."
>>
>>The Apache Parquet project welcomes contributions and community participation through mailing lists, face-to-face MeetUps, and user events. For more information, visit http://parquet.apache.org/community/
>>
>>Availability and Oversight
>>Apache Parquet software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Parquet, visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>>
>>About the Apache Incubator
>>The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/.
>>
>>About The Apache Software Foundation (ASF)
>>Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Bloomberg, Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter.
>>
>>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
>>
>># # #
>>
>>[MEDIA CONTACT:SALLY]
>>________________________________
>>
>>
>>From: Sally Khudairi <sa...@yahoo.com>
>>To: Sally Khudairi <sa...@yahoo.com>; Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>>Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>>Sent: Friday, 24 April 2015, 13:56
>>Subject: Re: Graduation blog post?
>>
>>
>>
>>Done.
>>
>>ALL: can you please let me know if there are any events that Parquet will be at? Presenting? Hosting? etc.
>>
>>Thank you!
>>
>>-Sally
>>
>>
>>
>>
>>
>>________________________________
>>From: Sally Khudairi <sa...@yahoo.com>
>>To: Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>>Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>>Sent: Friday, 24 April 2015, 13:40
>>Subject: Re: Graduation blog post?
>>
>>
>>
>>Of course --I'll fix that now!
>>
>>Sorry about that, Daniel.
>>
>>-Sally
>>
>>
>>
>>
>>
>>
>>________________________________
>>From: Daniel Weeks <dw...@netflix.com>
>>To: dev@parquet.incubator.apache.org; Sally Khudairi <sa...@yahoo.com>
>>Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>>Sent: Friday, 24 April 2015, 13:38
>>Subject: Re: Graduation blog post?
>>
>>
>>
>>Sally,
>>
>>Just wanted to comment that my last name is misspelled in the Netflix testimonial.  Can someone fix that?  (it's Weeks, not Week)
>>
>>Thanks,
>>Dan
>>
>>
>>
>>
>>On Fri, Apr 24, 2015 at 10:23 AM, Sally Khudairi <sa...@yahoo.com.invalid> wrote:
>>
>>Hi everyone --there's been the addition of a quote from Stripe:
>>>
>>>"Stripe's data warehouse has been built on Parquet from the beginning," said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, from data import to machine learning to adhoc SQL analysis, uses Apache Parquet as the common interchange format."
>>>
>>>
>>>--please note that I added "Apache" to "Parquet" in the second sentence. Stripe has also been added to the sub-head.
>>>
>>>Are we waiting for quotes from anyone else? If not, I can add a closing sentence and forward the final copy later today.
>>>
>>>Thanks so much,
>>>Sally
>>>
>>>
>>>
>>>----- Original Message -----
>>>
>>>From: Sally Khudairi <sa...@yahoo.com>
>>>To: Chris Aniszczyk <ca...@gmail.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>>>Cc: Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>>>Sent: Thursday, 23 April 2015, 15:25
>>>Subject: Re: Graduation blog post?
>>>
>>>Hello everyone --below is the draft thus far.
>>>
>>>
>>>I was aiming to announce on Monday by 7AM ET, but noticed that we're waiting for additional quotes.
>>>
>>>Also, should we get a closing quote from Julien? Perhaps something that invites additional community participation?
>>>
>>>Please let me know your thoughts.
>>>
>>>Thanks so much,
>>>Sally
>>>
>>>= = =
>>>
>>>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level Project
>>>
>>>Open Source storage format for the Apache™ Hadoop® ecosystem in use at Cloudera, NASA, Netflix, and Twitter, among other organizations
>>>
>>>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Parquet™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.
>>>
>>>"The incubation process at Apache has been fantastic and really the last step of making Parquet a community driven standard fully integrated within the greater Hadoop ecosystem." said Julien Le Dem, Vice President of Apache Parquet.
>>>
>>>Apache Parquet is an Open Source columnar storage format for the Apache™ Hadoop® ecosystem, built to work across programming languages and much more:
>>>- processing frameworks (MapReduce, Apache Spark, Scalding, Cascading, Crunch, Kite)
>>>- data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
>>>- query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache Pig, Presto, Apache Spark SQL)
>>>
>>>"At Twitter, Parquet has helped us scale our big data usage by in some cases reducing storage requirements by one third on large datasets as well as scan and deserialization time. This translated into hardware savings as well as reduced latency for accessing the data. Furthermore, Parquet being integrated with so many tools creates opportunities and flexibility regarding query engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally, it's just fantastic to see it graduate to a top-level project and we look forward to further collaborating with the Apache Parquet community to continually improve performance."
>>>
>>>"Parquet’s integration with other object models, like Avro and Thrift, has been a key feature for our customers," said Ryan Blue, Software Engineer at Cloudera. "They can take advantage of columnar storage without changing the classes they already use in their production applications."
>>>
>>>"At Netflix, Parquet is the primary storage format for data warehousing. More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that we query across a wide range of tools including Apache Hive, Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The performance benefit of columnar projection and statistics is a game changer for our big data platform," said Daniel Week, Software Engineer at Netflix. "We look forward to working with the Apache community to advance the state of big data storage with Parquet and are excited to see the project graduate to full Apache status."
>>>
>>>"I was extremely happy to see Parquet arrive as an Incubator project," said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect, Instrument and Science Data Systems Section at NASA Jet Propulsion Laboratory. "After talking with some in its community there was a real match with this columnar data format technology and its community with the way that we do things here at the ASF. Parquet has had an exemplar Incubation, and the project has big things ahead of it. I am encouraging my Data Science Team at NASA to evaluate it for data representation especially as it relates to our science holdings in Earth, planetary and space sciences, and astrophysics."
>>>
>>>
>>>Stripe? @cra reached out to Avi, said he would get something by Monday
>>>Criteo?
>>>
>>>@@CLOSING QUOTE FROM JULIEN?
>>>
>>>Availability and Oversight
>>>Apache Parquet software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Parquet, visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>>>
>>>About the Apache Incubator
>>>The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/.
>>>
>>>About The Apache Software Foundation (ASF)
>>>Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Bloomberg, Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter.
>>>
>>>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
>>>
>>># # #
>>>
>>>
>>>________________________________
>>>
>>>From: Chris Aniszczyk <ca...@gmail.com>
>>>To: "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>>>Cc: Sally Khudairi <sa...@yahoo.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>>>Sent: Wednesday, 22 April 2015, 14:51
>>>Subject: Re: Graduation blog post?
>>>
>>>
>>>
>>>Thanks Daniel, I added your quote.
>>>
>>>
>>>
>>>
>>>On Wed, Apr 22, 2015 at 12:14 PM, Daniel Weeks <dw...@netflix.com.invalid> wrote:
>>>
>>>Netflix Testimonial:
>>>>
>>>>At Netflix, Parquet is the primary storage format for data warehousing.
>>>>More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
>>>>data that we query across a wide range of tools including Apache Hive,
>>>>Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The
>>>>performance benefit of columnar projection and statistics is a game changer
>>>>for our big data platform.  We look forward to working with the Apache
>>>>community to advance the state of big data storage with Parquet and are
>>>>excited to see the project graduate to full Apache status.
>>>>
>>>>Daniel Weeks
>>>>Engineering Manager - Big Data Compute
>>>>Netflix
>>>>
>>>>
>>>>On Wed, Apr 22, 2015 at 9:36 AM, Sally Khudairi <
>>>>sallykhudairi@yahoo.com.invalid> wrote:
>>>>
>>>>> Thanks for the draft thus far, Ryan.
>>>>> Can we please include at least one more industry testimonial?
>>>>> Also, if you can please provide edit access to my account at
>>>>> khudairi@gmail.com, that would be great.
>>>>> Thanks in advance for this!
>>>>> -Sally
>>>>>
>>>>>
>>>>>      From: Ryan Blue <bl...@cloudera.com>
>>>>>  To: jfarrell@apache.org; Sally Khudairi <sa...@yahoo.com>
>>>>> Cc: "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "
>>>>> press@apache.org" <pr...@apache.org>; "dev@parquet.incubator.apache.org" <
>>>>> dev@parquet.incubator.apache.org>
>>>>>  Sent: Monday, 20 April 2015, 15:48
>>>>>  Subject: Re: Graduation blog post?
>>>>>
>>>>> On 04/20/2015 12:36 PM, Jake Farrell wrote:
>>>>> > Hey Sally
>>>>> > i've got root@ karma and will take care of the infra side of things for
>>>>> > us once the board has successfully voted on our resolution
>>>>> >
>>>>> > -Jake
>>>>>
>>>>> Thanks, Jake! I've already sent an e-mail to Infra, but I'll follow up
>>>>> with this news so they don't worry about it.
>>>>>
>>>>> rb
>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Software Engineer
>>>>> Cloudera, Inc.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>--
>>>
>>>Cheers,
>>>
>>>Chris Aniszczyk
>>>http://aniszczyk.org
>>>+1 512 961 6719
>>>
>>
>>
>

Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]

Posted by Sally Khudairi <sa...@yahoo.com.INVALID>.
Perfect. Thank you, Julien!
I'll confirm once we're live tomorrow morning.
Warmly, Sally

      From: Julien Le Dem <ju...@twitter.com>
 To: Sally Khudairi <sa...@yahoo.com> 
Cc: "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>; Sally Khudairi <sk...@apache.org>; Daniel Weeks <dw...@netflix.com>; Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org> 
 Sent: Sunday, 26 April 2015, 19:21
 Subject: Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]
   
Sounds good. Thank you!



On Sunday, April 26, 2015, Sally Khudairi <sa...@yahoo.com> wrote:

Thanks, Julien --I can include that, yes.

Does this work for you?


<snip>

Catch Apache Parquet in action at the Hadoop Summit, 9-11 June 2015 in San Jose, California. The Apache Parquet project welcomes contributions and community participation through mailing lists, face-to-face MeetUps, and user events. For more information, visit http://parquet.apache.org/community/

</snip>


Warmest regards,
Sally



Did you want to mention the parquet talks at the Hadoop summit in June?
Otherwise this looks good to me.




On Sunday, April 26, 2015, Sally Khudairi <sa...@yahoo.com.invalid> wrote:

Hi everyone --I haven't received any other feedback, so I think we're all set to announce tomorrow.
>I'd like to issue the press release at 7AM ET. I'll confirm when we're live.
>If there are any showstoppers, please let me know ASAP.
>Thanks so much, Sally
>
>      From: Sally Khudairi <sk...@apache.org>
> To: Sally Khudairi <sa...@yahoo.com>; Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
> Sent: Friday, 24 April 2015, 16:17
> Subject: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]
>
>Hello again, everyone --below is the latest draft.
>
>Please review and forward any changes/additions no later than 5PM ET on Sunday in order for us to announce on Monday morning. I was aiming to go live by 7AM ET if that works for you.
>
>Kindly confirm.
>
>Thanks in advance,
>Sally
>
>= = =
>
>DRAFT :: NOT FOR DISTRIBUTION
>
>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level Project
>
>Open Source storage format for the Apache™ Hadoop® ecosystem in use at Cloudera, NASA, Netflix, Stripe and Twitter, among other organizations
>
>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Parquet™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.
>
>"The incubation process at Apache has been fantastic and really the last step of making Parquet a community driven standard fully integrated within the greater Hadoop ecosystem," said Julien Le Dem, Vice President of Apache Parquet.
>
>Apache Parquet is an Open Source columnar storage format for the Apache™ Hadoop® ecosystem, built to work across programming languages and much more:
>
>
> - processing frameworks (MapReduce, Apache Spark, Scalding, Cascading, Crunch, Kite)
> - data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
> - query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache Pig, Presto, Apache Spark SQL)
>
>"At Twitter, Parquet has helped us scale our big data usage by in some cases reducing storage requirements by one third on large datasets as well as scan and deserialization time. This translated into hardware savings as well as reduced latency for accessing the data. Furthermore, Parquet being integrated with so many tools creates opportunities and flexibility regarding query engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally, it's just fantastic to see it graduate to a top-level project and we look forward to further collaborating with the Apache Parquet community to continually improve performance."
>
>"Parquet’s integration with other object models, like Avro and Thrift, has been a key feature for our customers," said Ryan Blue, Software Engineer at Cloudera. "They can take advantage of columnar storage without changing the classes they already use in their production applications."
>
>"At Netflix, Parquet is the primary storage format for data warehousing. More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that we query across a wide range of tools including Apache Hive, Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce. The performance benefit of columnar projection and statistics is a game changer for our big data platform," said Daniel Weeks, Software Engineer at Netflix. "We look forward to working with the Apache community to advance the state of big data storage with Parquet and are excited to see the project graduate to full Apache status."
>
>"Stripe's data warehouse has been built on Parquet from the beginning," said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, from data import to machine learning to adhoc SQL analysis, uses Apache Parquet as the common interchange format."
>
>"I was extremely happy to see Parquet arrive as an Incubator project," said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect, Instrument and Science Data Systems Section at NASA Jet Propulsion Laboratory. "After talking with some in its community there was a real match with this columnar data format technology and its community with the way that we do things here at the ASF. Parquet has had an exemplar Incubation, and the project has big things ahead of it. I am encouraging my Data Science Team at NASA to evaluate it for data representation especially as it relates to our science holdings in Earth, planetary and space sciences, and astrophysics."
>
>The Apache Parquet project welcomes contributions and community participation through mailing lists, face-to-face MeetUps, and user events. For more information, visit http://parquet.apache.org/community/
>
>Availability and Oversight
>Apache Parquet software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Parquet, visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>
>About the Apache Incubator
>The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/.
>
>About The Apache Software Foundation (ASF)
>Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Bloomberg, Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter.
>
>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
>
># # #
>
>[MEDIA CONTACT:SALLY]
>________________________________
>
>
>From: Sally Khudairi <sa...@yahoo.com>
>To: Sally Khudairi <sa...@yahoo.com>; Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>Sent: Friday, 24 April 2015, 13:56
>Subject: Re: Graduation blog post?
>
>
>
>Done.
>
>ALL: can you please let me know if there are any events that Parquet will be at? Presenting? Hosting? etc.
>
>Thank you!
>
>-Sally
>
>
>
>
>
>________________________________
>From: Sally Khudairi <sa...@yahoo.com>
>To: Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>Sent: Friday, 24 April 2015, 13:40
>Subject: Re: Graduation blog post?
>
>
>
>Of course --I'll fix that now!
>
>Sorry about that, Daniel.
>
>-Sally
>
>
>
>
>
>
>________________________________
>From: Daniel Weeks <dw...@netflix.com>
>To: dev@parquet.incubator.apache.org; Sally Khudairi <sa...@yahoo.com>
>Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>Sent: Friday, 24 April 2015, 13:38
>Subject: Re: Graduation blog post?
>
>
>
>Sally,
>
>Just wanted to comment that my last name is misspelled in the Netflix testimonial.  Can someone fix that?  (it's Weeks, not Week)
>
>Thanks,
>Dan
>
>
>
>
>On Fri, Apr 24, 2015 at 10:23 AM, Sally Khudairi <sa...@yahoo.com.invalid> wrote:
>
>Hi everyone --there's been the addition of a quote from Stripe:
>>
>>"Stripe's data warehouse has been built on Parquet from the beginning," said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, from data import to machine learning to adhoc SQL analysis, uses Apache Parquet as the common interchange format."
>>
>>
>>--please note that I added "Apache" to "Parquet" in the second sentence. Stripe has also been added to the sub-head.
>>
>>Are we waiting for quotes from anyone else? If not, I can add a closing sentence and forward the final copy later today.
>>
>>Thanks so much,
>>Sally
>>
>>
>>
>>----- Original Message -----
>>
>>From: Sally Khudairi <sa...@yahoo.com>
>>To: Chris Aniszczyk <ca...@gmail.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>>Cc: Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>>Sent: Thursday, 23 April 2015, 15:25
>>Subject: Re: Graduation blog post?
>>
>>Hello everyone --below is the draft thus far.
>>
>>
>>I was aiming to announce on Monday by 7AM ET, but noticed that we're waiting for additional quotes.
>>
>>Also, should we get a closing quote from Julien? Perhaps something that invites additional community participation?
>>
>>Please let me know your thoughts.
>>
>>Thanks so much,
>>Sally
>>
>>= = =
>>
>>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level Project
>>
>>Open Source storage format for the Apache™ Hadoop® ecosystem in use at Cloudera, NASA, Netflix, and Twitter, among other organizations
>>
>>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Parquet™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.
>>
>>"The incubation process at Apache has been fantastic and really the last step of making Parquet a community driven standard fully integrated within the greater Hadoop ecosystem." said Julien Le Dem, Vice President of Apache Parquet.
>>
>>Apache Parquet is an Open Source columnar storage format for the Apache™ Hadoop® ecosystem, built to work across programming languages and much more:
>>- processing frameworks (MapReduce, Apache Spark, Scalding, Cascading, Crunch, Kite)
>>- data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
>>- query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache Pig, Presto, Apache Spark SQL)
>>
>>"At Twitter, Parquet has helped us scale our big data usage by in some cases reducing storage requirements by one third on large datasets as well as scan and deserialization time. This translated into hardware savings as well as reduced latency for accessing the data. Furthermore, Parquet being integrated with so many tools creates opportunities and flexibility regarding query engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally, it's just fantastic to see it graduate to a top-level project and we look forward to further collaborating with the Apache Parquet community to continually improve performance."
>>
>>"Parquet’s integration with other object models, like Avro and Thrift, has been a key feature for our customers," said Ryan Blue, Software Engineer at Cloudera. "They can take advantage of columnar storage without changing the classes they already use in their production applications."
>>
>>"At Netflix, Parquet is the primary storage format for data warehousing. More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that we query across a wide range of tools including Apache Hive, Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The performance benefit of columnar projection and statistics is a game changer for our big data platform," said Daniel Week, Software Engineer at Netflix. "We look forward to working with the Apache community to advance the state of big data storage with Parquet and are excited to see the project graduate to full Apache status."
>>
>>"I was extremely happy to see Parquet arrive as an Incubator project," said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect, Instrument and Science Data Systems Section at NASA Jet Propulsion Laboratory. "After talking with some in its community there was a real match with
>>this columnar data format technology and its community with the way that we do things here at the ASF. Parquet has had an exemplar Incubation, and the project has big things ahead of it. I am encouraging my Data Science Team at NASA to evaluate it for data representation especially
>>as it relates to our science holdings in Earth, planetary and space sciences, and astrophysics."
>>
>>
>>Stripe? @cra reached out to Avi, said he would get something by Monday
>>Criteo?
>>
>>@@CLOSING QUOTE FROM JULIEN?
>>
>>Availability and Oversight
>>Apache Parquet software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Parquet, visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>>
>>About the Apache Incubator
>>The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/.
>>
>>About The Apache Software Foundation (ASF)
>>Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Bloomberg, Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter.
>>
>>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
>>
>># # #
>>
>>
>>________________________________
>>
>>From: Chris Aniszczyk <ca...@gmail.com>
>>To: "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>>Cc: Sally Khudairi <sa...@yahoo.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>>Sent: Wednesday, 22 April 2015, 14:51
>>Subject: Re: Graduation blog post?
>>
>>
>>
>>Thanks Daniel, I added your quote.
>>
>>
>>
>>
>>On Wed, Apr 22, 2015 at 12:14 PM, Daniel Weeks <dw...@netflix.com.invalid> wrote:
>>
>>Netflix Testimonial:
>>>
>>>At Netflix, Parquet is the primary storage format for data warehousing.
>>>More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
>>>data that we query across a wide range of tools including Apache Hive,
>>>Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The
>>>performance benefit of columnar projection and statistics is a game changer
>>>for our big data platform.  We look forward to working with the Apache
>>>community to advance the state of big data storage with Parquet and are
>>>excited to see the project graduate to full Apache status.
>>>
>>>Daniel Weeks
>>>Engineering Manager - Big Data Compute
>>>Netflix
>>>
>>>
>>>On Wed, Apr 22, 2015 at 9:36 AM, Sally Khudairi <
>>>sallykhudairi@yahoo.com.invalid> wrote:
>>>
>>>> Thanks for the draft thus far, Ryan.
>>>> Can we please include at least one more industry testimonial?
>>>> Also, if you can please provide edit access to my account at
>>>> khudairi@gmail.com, that would be great.
>>>> Thanks in advance for this!
>>>> -Sally
>>>>
>>>>
>>>>      From: Ryan Blue <bl...@cloudera.com>
>>>>  To: jfarrell@apache.org; Sally Khudairi <sa...@yahoo.com>
>>>> Cc: "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "
>>>> press@apache.org" <pr...@apache.org>; "dev@parquet.incubator.apache.org" <
>>>> dev@parquet.incubator.apache.org>
>>>>  Sent: Monday, 20 April 2015, 15:48
>>>>  Subject: Re: Graduation blog post?
>>>>
>>>> On 04/20/2015 12:36 PM, Jake Farrell wrote:
>>>> > Hey Sally
>>>> > i've got root@ karma and will take care of the infra side of things for
>>>> > us once the board has successfully voted on our resolution
>>>> >
>>>> > -Jake
>>>>
>>>> Thanks, Jake! I've already sent an e-mail to Infra, but I'll follow up
>>>> with this news so they don't worry about it.
>>>>
>>>> rb
>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Software Engineer
>>>> Cloudera, Inc.
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>--
>>
>>Cheers,
>>
>>Chris Aniszczyk
>>http://aniszczyk.org
>>+1 512 961 6719
>>
>
>



  

Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]

Posted by Julien Le Dem <ju...@twitter.com.INVALID>.
Sounds good.
Thank you!

On Sunday, April 26, 2015, Sally Khudairi <sa...@yahoo.com> wrote:

> Thanks, Julien --I can include that, yes.
>
> Does this work for you?
>
>
> <snip>
>
> Catch Apache Parquet in action at the Hadoop Summit, 9-11 June 2015 in San
> Jose, California. The Apache Parquet project welcomes contributions and
> community participation through mailing lists, face-to-face MeetUps, and
> user events. For more information, visit
> http://parquet.apache.org/community/
>
> </snip>
>
>
> Warmest regards,
> Sally
>
>
>
> Did you want to mention the parquet talks at the Hadoop summit in June?
> Otherwise this looks good to me.
>
>
>
>
> On Sunday, April 26, 2015, Sally Khudairi <sa...@yahoo.com.invalid>
> wrote:
>
> Hi everyone --I haven't received any other feedback, so I think we're all
> set to announce tomorrow.
> >I'd like to issue the press release at 7AM ET. I'll confirm when we're
> live.
> >If there are any showstoppers, please let me know ASAP.
> >Thanks so much, Sally
> >
> >      From: Sally Khudairi <sk@apache.org>
> > To: Sally Khudairi <sallykhudairi@yahoo.com>; Daniel Weeks <dweeks@netflix.com>; "dev@parquet.incubator.apache.org" <dev@parquet.incubator.apache.org>
> >Cc: Chris Aniszczyk <caniszczyk@gmail.com>; Ryan Blue <blue@cloudera.com>; "jfarrell@apache.org" <jfarrell@apache.org>; "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>
> > Sent: Friday, 24 April 2015, 16:17
> > Subject: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation
> blog post?]
> >
> >Hello again, everyone --below is the latest draft.
> >
> >Please review and forward any changes/additions no later than 5PM ET on
> Sunday in order for us to announce on Monday morning. I was aiming to go
> live by 7AM ET if that works for you.
> >
> >Kindly confirm.
> >
> >Thanks in advance,
> >Sally
> >
> >= = =
> >
> >DRAFT :: NOT FOR DISTRIBUTION
> >
> >The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level
> Project
> >
> >Open Source storage format for the Apache™ Hadoop® ecosystem in use at
> Cloudera, NASA, Netflix, Stripe and Twitter, among other organizations
> >
> >Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the
> all-volunteer developers, stewards, and incubators of more than 350 Open
> Source projects and initiatives, announced today that Apache™ Parquet™ has
> graduated from the Apache Incubator to become a Top-Level Project (TLP),
> signifying that the project's community and products have been
> well-governed under the ASF's meritocratic process and principles.
> >
> >"The incubation process at Apache has been fantastic and really the last
> step of making Parquet a community driven standard fully integrated within
> the greater Hadoop ecosystem," said Julien Le Dem, Vice President of Apache
> Parquet.
> >
> >Apache Parquet is an Open Source columnar storage format for the Apache™
> Hadoop® ecosystem, built to work across programming languages and much more:
> >
> >
> > - processing frameworks (MapReduce, Apache Spark, Scalding, Cascading,
> Crunch, Kite)
> > - data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
> > - query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo,
> Apache Pig, Presto, Apache Spark SQL)
> >
> >"At Twitter, Parquet has helped us scale our big data usage by in some
> cases reducing storage requirements by one third on large datasets as well
> as scan and deserialization time. This translated into hardware savings as
> well as reduced latency for accessing the data. Furthermore, Parquet being
> integrated with so many tools creates opportunities and flexibility
> regarding query engines," said Chris Aniszczyk, Head of Open Source at
> Twitter. "Finally, it's just fantastic to see it graduate to a top-level
> project and we look forward to further collaborating with the Apache
> Parquet community to continually improve performance."
> >
> >"Parquet’s integration with other object models, like Avro and Thrift,
> has been a key feature for our customers," said Ryan Blue, Software
> Engineer at Cloudera. "They can take advantage of columnar storage without
> changing the classes they already use in their production applications."
> >
> >"At Netflix, Parquet is the primary storage format for data warehousing.
> More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
> data that we query across a wide range of tools including Apache Hive,
> Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce. The
> performance benefit of columnar projection and statistics is a game changer
> for our big data platform," said Daniel Weeks, Software Engineer at
> Netflix. "We look forward to working with the Apache community to advance
> the state of big data storage with Parquet and are excited to see the
> project graduate to full Apache status."
> >
> >"Stripe's data warehouse has been built on Parquet from the beginning,"
> said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our
> pipeline, from data import to machine learning to adhoc SQL analysis, uses
> Apache Parquet as the common interchange format."
> >
> >"I was extremely happy to see Parquet arrive as an Incubator project,"
> said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect,
> Instrument and Science Data Systems Section at NASA Jet Propulsion
> Laboratory. "After talking with some in its community there was a real
> match with this columnar data format technology and its community with the
> way that we do things here at the ASF. Parquet has had an exemplar
> Incubation, and the project has big things ahead of it. I am encouraging my
> Data Science Team at NASA to evaluate it for data representation especially
> as it relates to our science holdings in Earth, planetary and space
> sciences, and astrophysics."
> >
> >The Apache Parquet project welcomes contributions and community
> participation through mailing lists, face-to-face MeetUps, and user events.
> For more information, visit http://parquet.apache.org/community/
> >
> >Availability and Oversight
> >Apache Parquet software is released under the Apache License v2.0 and is
> overseen by a self-selected team of active contributors to the project. A
> Project Management Committee (PMC) guides the Project's day-to-day
> operations, including community development and product releases. For
> downloads, documentation, and ways to become involved with Apache Parquet,
> visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
> >
> >About the Apache Incubator
> >The Apache Incubator is the entry path for projects and codebases wishing
> to become part of the efforts at The Apache Software Foundation. All code
> donations from external organizations and existing external projects
> wishing to join the ASF enter through the Incubator to: 1) ensure all
> donations are in accordance with the ASF legal standards; and 2) develop
> new communities that adhere to our guiding principles. Incubation is
> required of all newly accepted projects until a further review indicates
> that the infrastructure, communications, and decision making process have
> stabilized in a manner consistent with other successful ASF projects. While
> incubation status is not necessarily a reflection of the completeness or
> stability of the code, it does indicate that the project has yet to be
> fully endorsed by the ASF. For more information, visit
> http://incubator.apache.org/.
> >
> >About The Apache Software Foundation (ASF)
> >Established in 1999, the all-volunteer Foundation oversees more than 350
> leading Open Source projects, including Apache HTTP Server --the world's
> most popular Web server software. Through the ASF's meritocratic process
> known as "The Apache Way," more than 500 individual Members and 4,500
> Committers successfully collaborate to develop freely available
> enterprise-grade software, benefiting millions of users worldwide:
> thousands of software solutions are distributed under the Apache License;
> and the community actively participates in ASF mailing lists, mentoring
> initiatives, and ApacheCon, the Foundation's official user conference,
> trainings, and expo. The ASF is a US 501(c)(3) charitable organization,
> funded by individual donations and corporate sponsors including Bloomberg,
> Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google,
> Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft,
> Pivotal, Produban, WANdisco, and Yahoo. For more information, visit
> http://www.apache.org/ or follow @TheASF on Twitter.
> >
> >© The Apache Software Foundation. "Apache", "Avro", "Apache Avro",
> "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache
> Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache
> Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks
> or trademarks of the Apache Software Foundation in the United States and/or
> other countries. All other brands and trademarks are the property of their
> respective owners.
> >
> ># # #
> >
> >[MEDIA CONTACT:SALLY]
> >________________________________
> >
> >
> >From: Sally Khudairi <sallykhudairi@yahoo.com>
> >To: Sally Khudairi <sallykhudairi@yahoo.com>; Daniel Weeks <dweeks@netflix.com>; "dev@parquet.incubator.apache.org" <dev@parquet.incubator.apache.org>
> >Cc: Chris Aniszczyk <caniszczyk@gmail.com>; Ryan Blue <blue@cloudera.com>; "jfarrell@apache.org" <jfarrell@apache.org>; "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>
> >Sent: Friday, 24 April 2015, 13:56
> >Subject: Re: Graduation blog post?
> >
> >
> >
> >Done.
> >
> >ALL: can you please let me know if there are any events that Parquet will
> be at? Presenting? Hosting? etc.
> >
> >Thank you!
> >
> >-Sally
> >
> >
> >
> >
> >
> >________________________________
> >From: Sally Khudairi <sallykhudairi@yahoo.com>
> >To: Daniel Weeks <dweeks@netflix.com>; "dev@parquet.incubator.apache.org" <dev@parquet.incubator.apache.org>
> >Cc: Chris Aniszczyk <caniszczyk@gmail.com>; Ryan Blue <blue@cloudera.com>; "jfarrell@apache.org" <jfarrell@apache.org>; "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>
> >Sent: Friday, 24 April 2015, 13:40
> >Subject: Re: Graduation blog post?
> >
> >
> >
> >Of course --I'll fix that now!
> >
> >Sorry about that, Daniel.
> >
> >-Sally
> >
> >
> >
> >
> >
> >
> >________________________________
> >From: Daniel Weeks <dweeks@netflix.com>
> >To: dev@parquet.incubator.apache.org; Sally Khudairi <sallykhudairi@yahoo.com>
> >Cc: Chris Aniszczyk <caniszczyk@gmail.com>; Ryan Blue <blue@cloudera.com>; "jfarrell@apache.org" <jfarrell@apache.org>; "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>
> >Sent: Friday, 24 April 2015, 13:38
> >Subject: Re: Graduation blog post?
> >
> >
> >
> >Sally,
> >
> >Just wanted to comment that my last name is misspelled in the Netflix
> testimonial.  Can someone fix that?  (it's Weeks, not Week)
> >
> >Thanks,
> >Dan
> >
> >
> >
> >
> >On Fri, Apr 24, 2015 at 10:23 AM, Sally Khudairi
> <sa...@yahoo.com.invalid> wrote:
> >
> >Hi everyone --there's been the addition of a quote from Stripe:
> >>
> >>"Stripe's data warehouse has been built on Parquet from the beginning,"
> said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our
> pipeline, from data import to machine learning to adhoc SQL analysis, uses
> Apache Parquet as the common interchange format."
> >>
> >>
> >>--please note that I added "Apache" to "Parquet" in the second sentence.
> Stripe has also been added to the sub-head.
> >>
> >>Are we waiting for quotes from anyone else? If not, I can add a closing
> sentence and forward the final copy later today.
> >>
> >>Thanks so much,
> >>Sally
> >>
> >>
> >>
> >>----- Original Message -----
> >>
> >>From: Sally Khudairi <sallykhudairi@yahoo.com>
> >>To: Chris Aniszczyk <caniszczyk@gmail.com>; "dev@parquet.incubator.apache.org" <dev@parquet.incubator.apache.org>
> >>Cc: Ryan Blue <blue@cloudera.com>; "jfarrell@apache.org" <jfarrell@apache.org>; "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>
> >>Sent: Thursday, 23 April 2015, 15:25
> >>Subject: Re: Graduation blog post?
> >>
> >>Hello everyone --below is the draft thus far.
> >>
> >>
> >>I was aiming to announce on Monday by 7AM ET, but noticed that we're
> waiting for additional quotes.
> >>
> >>Also, should we get a closing quote from Julien? Perhaps something that
> invites additional community participation?
> >>
> >>Please let me know your thoughts.
> >>
> >>Thanks so much,
> >>Sally
> >>
> >>= = =
> >>
> >>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level
> Project
> >>
> >>Open Source storage format for the Apache™ Hadoop® ecosystem in use at
> Cloudera, NASA, Netflix, and Twitter, among other organizations
> >>
> >>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF),
> the all-volunteer developers, stewards, and incubators of more than 350
> Open Source projects and initiatives, announced today that Apache™ Parquet™
> has graduated from the Apache Incubator to become a Top-Level Project
> (TLP), signifying that the project's community and products have been
> well-governed under the ASF's meritocratic process and principles.
> >>
> >>"The incubation process at Apache has been fantastic and really the last
> step of making Parquet a community driven standard fully integrated within
> the greater Hadoop ecosystem." said Julien Le Dem, Vice President of Apache
> Parquet.
> >>
> >>Apache Parquet is an Open Source columnar storage format for the Apache™
> Hadoop® ecosystem, built to work across programming languages and much more:
> >>- processing frameworks (MapReduce, Apache Spark, Scalding, Cascading,
> Crunch, Kite)
> >>- data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
> >>- query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo,
> Apache Pig, Presto, Apache Spark SQL)
> >>
> >>"At Twitter, Parquet has helped us scale our big data usage by in some
> cases reducing storage requirements by one third on large datasets as well
> as scan and deserialization time. This translated into hardware savings as
> well as reduced latency for accessing the data. Furthermore, Parquet being
> integrated with so many tools creates opportunities and flexibility
> regarding query engines," said Chris Aniszczyk, Head of Open Source at
> Twitter. "Finally, it's just fantastic to see it graduate to a top-level
> project and we look forward to further collaborating with the Apache
> Parquet community to continually improve performance."
> >>
> >>"Parquet’s integration with other object models, like Avro and Thrift,
> has been a key feature for our customers," said Ryan Blue, Software
> Engineer at Cloudera. "They can take advantage of columnar storage without
> changing the classes they already use in their production applications."
> >>
> >>"At Netflix, Parquet is the primary storage format for data warehousing.
> More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
> data that we query across a wide range of tools including Apache Hive,
> Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The
> performance benefit of columnar projection and statistics is a game changer
> for our big data platform," said Daniel Week, Software Engineer at Netflix.
> "We look forward to working with the Apache community to advance the state
> of big data storage with Parquet and are excited to see the project
> graduate to full Apache status."
> >>
> >>"I was extremely happy to see Parquet arrive as an Incubator project,"
> said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect,
> Instrument and Science Data Systems Section at NASA Jet Propulsion
> Laboratory. "After talking with some in its community there was a real
> match with
> >>this columnar data format technology and its community with the way that
> we do things here at the ASF. Parquet has had an exemplar Incubation, and
> the project has big things ahead of it. I am encouraging my Data Science
> Team at NASA to evaluate it for data representation especially
> >>as it relates to our science holdings in Earth, planetary and space
> sciences, and astrophysics."
> >>
> >>
> >>Stripe? @cra reached out to Avi, said he would get something by Monday
> >>Criteo?
> >>
> >>@@CLOSING QUOTE FROM JULIEN?
> >>
> >>Availability and Oversight
> >>Apache Parquet software is released under the Apache License v2.0 and is
> overseen by a self-selected team of active contributors to the project. A
> Project Management Committee (PMC) guides the Project's day-to-day
> operations, including community development and product releases. For
> downloads, documentation, and ways to become involved with Apache Parquet,
> visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
> >>
> >>About the Apache Incubator
> >>The Apache Incubator is the entry path for projects and codebases
> wishing to become part of the efforts at The Apache Software Foundation.
> All code donations from external organizations and existing external
> projects wishing to join the ASF enter through the Incubator to: 1) ensure
> all donations are in accordance with the ASF legal standards; and 2)
> develop new communities that adhere to our guiding principles. Incubation
> is required of all newly accepted projects until a further review indicates
> that the infrastructure, communications, and decision making process have
> stabilized in a manner consistent with other successful ASF projects. While
> incubation status is not necessarily a reflection of the completeness or
> stability of the code, it does indicate that the project has yet to be
> fully endorsed by the ASF. For more information, visit
> http://incubator.apache.org/.
> >>
> >>About The Apache Software Foundation (ASF)
> >>Established in 1999, the all-volunteer Foundation oversees more than 350
> leading Open Source projects, including Apache HTTP Server --the world's
> most popular Web server software. Through the ASF's meritocratic process
> known as "The Apache Way," more than 500 individual Members and 4,500
> Committers successfully collaborate to develop freely available
> enterprise-grade software, benefiting millions of users worldwide:
> thousands of software solutions are distributed under the Apache License;
> and the community actively participates in ASF mailing lists, mentoring
> initiatives, and ApacheCon, the Foundation's official user conference,
> trainings, and expo. The ASF is a US 501(c)(3) charitable organization,
> funded by individual donations and corporate sponsors including Bloomberg,
> Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google,
> Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft,
> Pivotal, Produban, WANdisco, and Yahoo. For more information, visit
> http://www.apache.org/ or follow @TheASF on Twitter.
> >>
> >>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro",
> "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache
> Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache
> Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks
> or trademarks of the Apache Software Foundation in the United States and/or
> other countries. All other brands and trademarks are the property of their
> respective owners.
> >>
> >># # #
> >>
> >>
> >>________________________________
> >>
> >>From: Chris Aniszczyk <caniszczyk@gmail.com>
> >>To: "dev@parquet.incubator.apache.org" <dev@parquet.incubator.apache.org>
> >>Cc: Sally Khudairi <sallykhudairi@yahoo.com>; Ryan Blue <blue@cloudera.com>; "jfarrell@apache.org" <jfarrell@apache.org>; "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>
> >>Sent: Wednesday, 22 April 2015, 14:51
> >>Subject: Re: Graduation blog post?
> >>
> >>
> >>
> >>Thanks Daniel, I added your quote.
> >>
> >>
> >>
> >>
> >>On Wed, Apr 22, 2015 at 12:14 PM, Daniel Weeks
> <dw...@netflix.com.invalid> wrote:
> >>
> >>Netflix Testimonial:
> >>>
> >>>At Netflix, Parquet is the primary storage format for data warehousing.
> >>>More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
> >>>data that we query across a wide range of tools including Apache Hive,
> >>>Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The
> >>>performance benefit of columnar projection and statistics is a game
> changer
> >>>for our big data platform.  We look forward to working with the Apache
> >>>community to advance the state of big data storage with Parquet and are
> >>>excited to see the project graduate to full Apache status.
> >>>
> >>>Daniel Weeks
> >>>Engineering Manager - Big Data Compute
> >>>Netflix
> >>>
> >>>
> >>>On Wed, Apr 22, 2015 at 9:36 AM, Sally Khudairi <
> >>>sallykhudairi@yahoo.com.invalid> wrote:
> >>>
> >>>> Thanks for the draft thus far, Ryan.
> >>>> Can we please include at least one more industry testimonial?
> >>>> Also, if you can please provide edit access to my account at
> >>>> khudairi@gmail.com, that would be great.
> >>>> Thanks in advance for this!
> >>>> -Sally
> >>>>
> >>>>
> >>>>      From: Ryan Blue <blue@cloudera.com>
> >>>>  To: jfarrell@apache.org; Sally Khudairi <sallykhudairi@yahoo.com>
> >>>> Cc: "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>; "dev@parquet.incubator.apache.org" <dev@parquet.incubator.apache.org>
> >>>>  Sent: Monday, 20 April 2015, 15:48
> >>>>  Subject: Re: Graduation blog post?
> >>>>
> >>>> On 04/20/2015 12:36 PM, Jake Farrell wrote:
> >>>> > Hey Sally
> >>>> > i've got root@ karma and will take care of the infra side of
> things for
> >>>> > us once the board has successfully voted on our resolution
> >>>> >
> >>>> > -Jake
> >>>>
> >>>> Thanks, Jake! I've already sent an e-mail to Infra, but I'll follow up
> >>>> with this news so they don't worry about it.
> >>>>
> >>>> rb
> >>>>
> >>>>
> >>>> --
> >>>> Ryan Blue
> >>>> Software Engineer
> >>>> Cloudera, Inc.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>--
> >>
> >>Cheers,
> >>
> >>Chris Aniszczyk
> >>http://aniszczyk.org
> >>+1 512 961 6719
> >>
> >
> >
>

Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]

Posted by Sally Khudairi <sa...@yahoo.com.INVALID>.
Thanks, Julien --I can include that, yes.

Does this work for you?


<snip>

Catch Apache Parquet in action at the Hadoop Summit, 9-11 June 2015 in San Jose, California. The Apache Parquet project welcomes contributions and community participation through mailing lists, face-to-face MeetUps, and user events. For more information, visit http://parquet.apache.org/community/ 

</snip>


Warmest regards,
Sally
________________________________
From: Julien Le Dem <ju...@twitter.com>
To: "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>; Sally Khudairi <sa...@yahoo.com> 
Cc: Sally Khudairi <sk...@apache.org>; Daniel Weeks <dw...@netflix.com>; Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org> 
Sent: Sunday, 26 April 2015, 19:14
Subject: Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]



Did you want to mention the parquet talks at the Hadoop summit in June?
Otherwise this looks good to me.




On Sunday, April 26, 2015, Sally Khudairi <sa...@yahoo.com.invalid> wrote:

Hi everyone --I haven't received any other feedback, so I think we're all set to announce tomorrow.
>I'd like to issue the press release at 7AM ET. I'll confirm when we're live.
>If there are any showstoppers, please let me know ASAP.
>Thanks so much, Sally
>
>      From: Sally Khudairi <sk...@apache.org>
> To: Sally Khudairi <sa...@yahoo.com>; Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
> Sent: Friday, 24 April 2015, 16:17
> Subject: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]
>
>Hello again, everyone --below is the latest draft.
>
>Please review and forward any changes/additions no later than 5PM ET on Sunday in order for us to announce on Monday morning. I was aiming to go live by 7AM ET if that works for you.
>
>Kindly confirm.
>
>Thanks in advance,
>Sally
>
>= = =
>
>DRAFT :: NOT FOR DISTRIBUTION
>
>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level Project
>
>Open Source storage format for the Apache™ Hadoop® ecosystem in use at Cloudera, NASA, Netflix, Stripe and Twitter, among other organizations
>
>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Parquet™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.
>
>"The incubation process at Apache has been fantastic and really the last step of making Parquet a community driven standard fully integrated within the greater Hadoop ecosystem," said Julien Le Dem, Vice President of Apache Parquet.
>
>Apache Parquet is an Open Source columnar storage format for the Apache™ Hadoop® ecosystem, built to work across programming languages and much more:
>
>
> - processing frameworks (MapReduce, Apache Spark, Scalding, Cascading, Crunch, Kite)
> - data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
> - query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache Pig, Presto, Apache Spark SQL)
>
>"At Twitter, Parquet has helped us scale our big data usage by in some cases reducing storage requirements by one third on large datasets as well as scan and deserialization time. This translated into hardware savings as well as reduced latency for accessing the data. Furthermore, Parquet being integrated with so many tools creates opportunities and flexibility regarding query engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally, it's just fantastic to see it graduate to a top-level project and we look forward to further collaborating with the Apache Parquet community to continually improve performance."
>
>"Parquet’s integration with other object models, like Avro and Thrift, has been a key feature for our customers," said Ryan Blue, Software Engineer at Cloudera. "They can take advantage of columnar storage without changing the classes they already use in their production applications."
>
>"At Netflix, Parquet is the primary storage format for data warehousing. More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that we query across a wide range of tools including Apache Hive, Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce. The performance benefit of columnar projection and statistics is a game changer for our big data platform," said Daniel Weeks, Software Engineer at Netflix. "We look forward to working with the Apache community to advance the state of big data storage with Parquet and are excited to see the project graduate to full Apache status."
>
>"Stripe's data warehouse has been built on Parquet from the beginning," said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, from data import to machine learning to adhoc SQL analysis, uses Apache Parquet as the common interchange format."
>
>"I was extremely happy to see Parquet arrive as an Incubator project," said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect, Instrument and Science Data Systems Section at NASA Jet Propulsion Laboratory. "After talking with some in its community there was a real match with this columnar data format technology and its community with the way that we do things here at the ASF. Parquet has had an exemplar Incubation, and the project has big things ahead of it. I am encouraging my Data Science Team at NASA to evaluate it for data representation especially as it relates to our science holdings in Earth, planetary and space sciences, and astrophysics."
>
>The Apache Parquet project welcomes contributions and community participation through mailing lists, face-to-face MeetUps, and user events. For more information, visit http://parquet.apache.org/community/
>
>Availability and Oversight
>Apache Parquet software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Parquet, visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>
>About the Apache Incubator
>The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/.
>
>About The Apache Software Foundation (ASF)
>Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Bloomberg, Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter.
>
>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
>
># # #
>
>[MEDIA CONTACT:SALLY]
>________________________________
>
>
>From: Sally Khudairi <sa...@yahoo.com>
>To: Sally Khudairi <sa...@yahoo.com>; Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>Sent: Friday, 24 April 2015, 13:56
>Subject: Re: Graduation blog post?
>
>
>
>Done.
>
>ALL: can you please let me know if there are any events that Parquet will be at? Presenting? Hosting? etc.
>
>Thank you!
>
>-Sally
>
>
>
>
>
>________________________________
>From: Sally Khudairi <sa...@yahoo.com>
>To: Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>Sent: Friday, 24 April 2015, 13:40
>Subject: Re: Graduation blog post?
>
>
>
>Of course --I'll fix that now!
>
>Sorry about that, Daniel.
>
>-Sally
>
>
>
>
>
>
>________________________________
>From: Daniel Weeks <dw...@netflix.com>
>To: dev@parquet.incubator.apache.org; Sally Khudairi <sa...@yahoo.com>
>Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>Sent: Friday, 24 April 2015, 13:38
>Subject: Re: Graduation blog post?
>
>
>
>Sally,
>
>Just wanted to comment that my last name is misspelled in the Netflix testimonial.  Can someone fix that?  (it's Weeks, not Week)
>
>Thanks,
>Dan
>
>
>
>
>On Fri, Apr 24, 2015 at 10:23 AM, Sally Khudairi <sa...@yahoo.com.invalid> wrote:
>
>Hi everyone --there's been the addition of a quote from Stripe:
>>
>>"Stripe's data warehouse has been built on Parquet from the beginning," said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, from data import to machine learning to adhoc SQL analysis, uses Apache Parquet as the common interchange format."
>>
>>
>>--please note that I added "Apache" to "Parquet" in the second sentence. Stripe has also been added to the sub-head.
>>
>>Are we waiting for quotes from anyone else? If not, I can add a closing sentence and forward the final copy later today.
>>
>>Thanks so much,
>>Sally
>>
>>
>>
>>----- Original Message -----
>>
>>From: Sally Khudairi <sa...@yahoo.com>
>>To: Chris Aniszczyk <ca...@gmail.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>>Cc: Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>>Sent: Thursday, 23 April 2015, 15:25
>>Subject: Re: Graduation blog post?
>>
>>Hello everyone --below is the draft thus far.
>>
>>
>>I was aiming to announce on Monday by 7AM ET, but noticed that we're waiting for additional quotes.
>>
>>Also, should we get a closing quote from Julien? Perhaps something that invites additional community participation?
>>
>>Please let me know your thoughts.
>>
>>Thanks so much,
>>Sally
>>
>>= = =
>>
>>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level Project
>>
>>Open Source storage format for the Apache™ Hadoop® ecosystem in use at Cloudera, NASA, Netflix, and Twitter, among other organizations
>>
>>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Parquet™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.
>>
>>"The incubation process at Apache has been fantastic and really the last step of making Parquet a community driven standard fully integrated within the greater Hadoop ecosystem." said Julien Le Dem, Vice President of Apache Parquet.
>>
>>Apache Parquet is an Open Source columnar storage format for the Apache™ Hadoop® ecosystem, built to work across programming languages and much more:
>>- processing frameworks (MapReduce, Apache Spark, Scalding, Cascading, Crunch, Kite)
>>- data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
>>- query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache Pig, Presto, Apache Spark SQL)
>>
>>"At Twitter, Parquet has helped us scale our big data usage by in some cases reducing storage requirements by one third on large datasets as well as scan and deserialization time. This translated into hardware savings as well as reduced latency for accessing the data. Furthermore, Parquet being integrated with so many tools creates opportunities and flexibility regarding query engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally, it's just fantastic to see it graduate to a top-level project and we look forward to further collaborating with the Apache Parquet community to continually improve performance."
>>
>>"Parquet’s integration with other object models, like Avro and Thrift, has been a key feature for our customers," said Ryan Blue, Software Engineer at Cloudera. "They can take advantage of columnar storage without changing the classes they already use in their production applications."
>>
>>"At Netflix, Parquet is the primary storage format for data warehousing. More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that we query across a wide range of tools including Apache Hive, Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The performance benefit of columnar projection and statistics is a game changer for our big data platform," said Daniel Week, Software Engineer at Netflix. "We look forward to working with the Apache community to advance the state of big data storage with Parquet and are excited to see the project graduate to full Apache status."
>>
>>"I was extremely happy to see Parquet arrive as an Incubator project," said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect, Instrument and Science Data Systems Section at NASA Jet Propulsion Laboratory. "After talking with some in its community there was a real match with this columnar data format technology and its community with the way that we do things here at the ASF. Parquet has had an exemplar Incubation, and the project has big things ahead of it. I am encouraging my Data Science Team at NASA to evaluate it for data representation especially as it relates to our science holdings in Earth, planetary and space sciences, and astrophysics."
>>
>>
>>Stripe? @cra reached out to Avi, said he would get something by Monday
>>Criteo?
>>
>>@@CLOSING QUOTE FROM JULIEN?
>>
>>Availability and Oversight
>>Apache Parquet software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Parquet, visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>>
>>About the Apache Incubator
>>The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/.
>>
>>About The Apache Software Foundation (ASF)
>>Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Bloomberg, Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter.
>>
>>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
>>
>># # #
>>
>>
>>________________________________
>>
>>From: Chris Aniszczyk <ca...@gmail.com>
>>To: "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>>Cc: Sally Khudairi <sa...@yahoo.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>>Sent: Wednesday, 22 April 2015, 14:51
>>Subject: Re: Graduation blog post?
>>
>>
>>
>>Thanks Daniel, I added your quote.
>>
>>
>>
>>
>>On Wed, Apr 22, 2015 at 12:14 PM, Daniel Weeks <dw...@netflix.com.invalid> wrote:
>>
>>Netflix Testimonial:
>>>
>>>At Netflix, Parquet is the primary storage format for data warehousing.
>>>More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
>>>data that we query across a wide range of tools including Apache Hive,
>>>Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The
>>>performance benefit of columnar projection and statistics is a game changer
>>>for our big data platform.  We look forward to working with the Apache
>>>community to advance the state of big data storage with Parquet and are
>>>excited to see the project graduate to full Apache status.
>>>
>>>Daniel Weeks
>>>Engineering Manager - Big Data Compute
>>>Netflix
>>>
>>>
>>>On Wed, Apr 22, 2015 at 9:36 AM, Sally Khudairi <
>>>sallykhudairi@yahoo.com.invalid> wrote:
>>>
>>>> Thanks for the draft thus far, Ryan.
>>>> Can we please include at least one more industry testimonial?
>>>> Also, if you can please provide edit access to my account at
>>>> khudairi@gmail.com, that would be great.
>>>> Thanks in advance for this!
>>>> -Sally
>>>>
>>>>
>>>>      From: Ryan Blue <bl...@cloudera.com>
>>>>  To: jfarrell@apache.org; Sally Khudairi <sa...@yahoo.com>
>>>> Cc: "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "
>>>> press@apache.org" <pr...@apache.org>; "dev@parquet.incubator.apache.org" <
>>>> dev@parquet.incubator.apache.org>
>>>>  Sent: Monday, 20 April 2015, 15:48
>>>>  Subject: Re: Graduation blog post?
>>>>
>>>> On 04/20/2015 12:36 PM, Jake Farrell wrote:
>>>> > Hey Sally
>>>> > i've got root@ karma and will take care of the infra side of things for
>>>> > us once the board has successfully voted on our resolution
>>>> >
>>>> > -Jake
>>>>
>>>> Thanks, Jake! I've already sent an e-mail to Infra, but I'll follow up
>>>> with this news so they don't worry about it.
>>>>
>>>> rb
>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Software Engineer
>>>> Cloudera, Inc.
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>--
>>
>>Cheers,
>>
>>Chris Aniszczyk
>>http://aniszczyk.org
>>+1 512 961 6719
>>
>
> 

Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]

Posted by Julien Le Dem <ju...@twitter.com.INVALID>.
Did you want to mention the Parquet talks at the Hadoop Summit in June?
Otherwise this looks good to me.

On Sunday, April 26, 2015, Sally Khudairi <sa...@yahoo.com.invalid>
wrote:

> Hi everyone --I haven't received any other feedback, so I think we're all
> set to announce tomorrow.
> I'd like to issue the press release at 7AM ET. I'll confirm when we're
> live.
> If there are any showstoppers, please let me know ASAP.
> Thanks so much, Sally
>
> From: Sally Khudairi <sk@apache.org>
> To: Sally Khudairi <sallykhudairi@yahoo.com>; Daniel Weeks <dweeks@netflix.com>; "dev@parquet.incubator.apache.org" <dev@parquet.incubator.apache.org>
> Cc: Chris Aniszczyk <caniszczyk@gmail.com>; Ryan Blue <blue@cloudera.com>; "jfarrell@apache.org" <jfarrell@apache.org>; "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>
> Sent: Friday, 24 April 2015, 16:17
> Subject: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]
>
> Hello again, everyone --below is the latest draft.
>
> Please review and forward any changes/additions no later than 5PM ET on
> Sunday in order for us to announce on Monday morning. I was aiming to go
> live by 7AM ET if that works for you.
>
> Kindly confirm.
>
> Thanks in advance,
> Sally
>
> = = =
>
> DRAFT :: NOT FOR DISTRIBUTION
>
> The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level
> Project
>
> Open Source storage format for the Apache™ Hadoop® ecosystem in use at
> Cloudera, NASA, Netflix, Stripe and Twitter, among other organizations
>
> Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the
> all-volunteer developers, stewards, and incubators of more than 350 Open
> Source projects and initiatives, announced today that Apache™ Parquet™ has
> graduated from the Apache Incubator to become a Top-Level Project (TLP),
> signifying that the project's community and products have been
> well-governed under the ASF's meritocratic process and principles.
>
> "The incubation process at Apache has been fantastic and really the last
> step of making Parquet a community driven standard fully integrated within
> the greater Hadoop ecosystem," said Julien Le Dem, Vice President of Apache
> Parquet.
>
> Apache Parquet is an Open Source columnar storage format for the Apache™
> Hadoop® ecosystem, built to work across programming languages and much more:
>
>
>  - processing frameworks (MapReduce, Apache Spark, Scalding, Cascading,
> Crunch, Kite)
>  - data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
>  - query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo,
> Apache Pig, Presto, Apache Spark SQL)
>
> "At Twitter, Parquet has helped us scale our big data usage by in some
> cases reducing storage requirements by one third on large datasets as well
> as scan and deserialization time. This translated into hardware savings as
> well as reduced latency for accessing the data. Furthermore, Parquet being
> integrated with so many tools creates opportunities and flexibility
> regarding query engines," said Chris Aniszczyk, Head of Open Source at
> Twitter. "Finally, it's just fantastic to see it graduate to a top-level
> project and we look forward to further collaborating with the Apache
> Parquet community to continually improve performance."
>
> "Parquet’s integration with other object models, like Avro and Thrift, has
> been a key feature for our customers," said Ryan Blue, Software Engineer at
> Cloudera. "They can take advantage of columnar storage without changing the
> classes they already use in their production applications."
>
> "At Netflix, Parquet is the primary storage format for data warehousing.
> More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
> data that we query across a wide range of tools including Apache Hive,
> Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce. The
> performance benefit of columnar projection and statistics is a game changer
> for our big data platform," said Daniel Weeks, Software Engineer at
> Netflix. "We look forward to working with the Apache community to advance
> the state of big data storage with Parquet and are excited to see the
> project graduate to full Apache status."
>
> "Stripe's data warehouse has been built on Parquet from the beginning,"
> said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our
> pipeline, from data import to machine learning to adhoc SQL analysis, uses
> Apache Parquet as the common interchange format."
>
> "I was extremely happy to see Parquet arrive as an Incubator project,"
> said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect,
> Instrument and Science Data Systems Section at NASA Jet Propulsion
> Laboratory. "After talking with some in its community there was a real
> match with this columnar data format technology and its community with the
> way that we do things here at the ASF. Parquet has had an exemplar
> Incubation, and the project has big things ahead of it. I am encouraging my
> Data Science Team at NASA to evaluate it for data representation especially
> as it relates to our science holdings in Earth, planetary and space
> sciences, and astrophysics."
>
> The Apache Parquet project welcomes contributions and community
> participation through mailing lists, face-to-face MeetUps, and user events.
> For more information, visit http://parquet.apache.org/community/
>
> Availability and Oversight
> Apache Parquet software is released under the Apache License v2.0 and is
> overseen by a self-selected team of active contributors to the project. A
> Project Management Committee (PMC) guides the Project's day-to-day
> operations, including community development and product releases. For
> downloads, documentation, and ways to become involved with Apache Parquet,
> visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>
> About the Apache Incubator
> The Apache Incubator is the entry path for projects and codebases wishing
> to become part of the efforts at The Apache Software Foundation. All code
> donations from external organizations and existing external projects
> wishing to join the ASF enter through the Incubator to: 1) ensure all
> donations are in accordance with the ASF legal standards; and 2) develop
> new communities that adhere to our guiding principles. Incubation is
> required of all newly accepted projects until a further review indicates
> that the infrastructure, communications, and decision making process have
> stabilized in a manner consistent with other successful ASF projects. While
> incubation status is not necessarily a reflection of the completeness or
> stability of the code, it does indicate that the project has yet to be
> fully endorsed by the ASF. For more information, visit
> http://incubator.apache.org/.
>
> About The Apache Software Foundation (ASF)
> Established in 1999, the all-volunteer Foundation oversees more than 350
> leading Open Source projects, including Apache HTTP Server --the world's
> most popular Web server software. Through the ASF's meritocratic process
> known as "The Apache Way," more than 500 individual Members and 4,500
> Committers successfully collaborate to develop freely available
> enterprise-grade software, benefiting millions of users worldwide:
> thousands of software solutions are distributed under the Apache License;
> and the community actively participates in ASF mailing lists, mentoring
> initiatives, and ApacheCon, the Foundation's official user conference,
> trainings, and expo. The ASF is a US 501(c)(3) charitable organization,
> funded by individual donations and corporate sponsors including Bloomberg,
> Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google,
> Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft,
> Pivotal, Produban, WANdisco, and Yahoo. For more information, visit
> http://www.apache.org/ or follow @TheASF on Twitter.
>
> © The Apache Software Foundation. "Apache", "Avro", "Apache Avro",
> "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache
> Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache
> Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks
> or trademarks of the Apache Software Foundation in the United States and/or
> other countries. All other brands and trademarks are the property of their
> respective owners.
>
> # # #
>
> [MEDIA CONTACT:SALLY]
> ________________________________
>
>
> From: Sally Khudairi <sallykhudairi@yahoo.com>
> To: Sally Khudairi <sallykhudairi@yahoo.com>; Daniel Weeks <dweeks@netflix.com>; "dev@parquet.incubator.apache.org" <dev@parquet.incubator.apache.org>
> Cc: Chris Aniszczyk <caniszczyk@gmail.com>; Ryan Blue <blue@cloudera.com>; "jfarrell@apache.org" <jfarrell@apache.org>; "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>
> Sent: Friday, 24 April 2015, 13:56
> Subject: Re: Graduation blog post?
>
>
>
> Done.
>
> ALL: can you please let me know if there are any events that Parquet will
> be at? Presenting? Hosting? etc.
>
> Thank you!
>
> -Sally
>
>
>
>
>
> ________________________________
> From: Sally Khudairi <sallykhudairi@yahoo.com>
> To: Daniel Weeks <dweeks@netflix.com>; "dev@parquet.incubator.apache.org" <dev@parquet.incubator.apache.org>
> Cc: Chris Aniszczyk <caniszczyk@gmail.com>; Ryan Blue <blue@cloudera.com>; "jfarrell@apache.org" <jfarrell@apache.org>; "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>
> Sent: Friday, 24 April 2015, 13:40
> Subject: Re: Graduation blog post?
>
>
>
> Of course --I'll fix that now!
>
> Sorry about that, Daniel.
>
> -Sally
>
>
>
>
>
>
> ________________________________
> From: Daniel Weeks <dweeks@netflix.com>
> To: dev@parquet.incubator.apache.org; Sally Khudairi <sallykhudairi@yahoo.com>
> Cc: Chris Aniszczyk <caniszczyk@gmail.com>; Ryan Blue <blue@cloudera.com>; "jfarrell@apache.org" <jfarrell@apache.org>; "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>
> Sent: Friday, 24 April 2015, 13:38
> Subject: Re: Graduation blog post?
>
>
>
> Sally,
>
> Just wanted to comment that my last name is misspelled in the Netflix
> testimonial.  Can someone fix that?  (it's Weeks, not Week)
>
> Thanks,
> Dan
>
>
>
>
> On Fri, Apr 24, 2015 at 10:23 AM, Sally Khudairi
> <sa...@yahoo.com.invalid> wrote:
>
> Hi everyone --there's been the addition of a quote from Stripe:
> >
> >"Stripe's data warehouse has been built on Parquet from the beginning,"
> said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our
> pipeline, from data import to machine learning to adhoc SQL analysis, uses
> Apache Parquet as the common interchange format."
> >
> >
> >--please note that I added "Apache" to "Parquet" in the second sentence.
> Stripe has also been added to the sub-head.
> >
> >Are we waiting for quotes from anyone else? If not, I can add a closing
> sentence and forward the final copy later today.
> >
> >Thanks so much,
> >Sally
> >
> >
> >
> >----- Original Message -----
> >
> >From: Sally Khudairi <sallykhudairi@yahoo.com>
> >To: Chris Aniszczyk <caniszczyk@gmail.com>; "dev@parquet.incubator.apache.org" <dev@parquet.incubator.apache.org>
> >Cc: Ryan Blue <blue@cloudera.com>; "jfarrell@apache.org" <jfarrell@apache.org>; "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>
> >Sent: Thursday, 23 April 2015, 15:25
> >Subject: Re: Graduation blog post?
> >
> >Hello everyone --below is the draft thus far.
> >
> >
> >I was aiming to announce on Monday by 7AM ET, but noticed that we're
> waiting for additional quotes.
> >
> >Also, should we get a closing quote from Julien? Perhaps something that
> invites additional community participation?
> >
> >Please let me know your thoughts.
> >
> >Thanks so much,
> >Sally
> >
> >= = =
> >
> >The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level
> Project
> >
> >Open Source storage format for the Apache™ Hadoop® ecosystem in use at
> Cloudera, NASA, Netflix, and Twitter, among other organizations
> >
> >Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the
> all-volunteer developers, stewards, and incubators of more than 350 Open
> Source projects and initiatives, announced today that Apache™ Parquet™ has
> graduated from the Apache Incubator to become a Top-Level Project (TLP),
> signifying that the project's community and products have been
> well-governed under the ASF's meritocratic process and principles.
> >
> >"The incubation process at Apache has been fantastic and really the last
> step of making Parquet a community driven standard fully integrated within
> the greater Hadoop ecosystem." said Julien Le Dem, Vice President of Apache
> Parquet.
> >
> >Apache Parquet is an Open Source columnar storage format for the Apache™
> Hadoop® ecosystem, built to work across programming languages and much more:
> >- processing frameworks (MapReduce, Apache Spark, Scalding, Cascading,
> Crunch, Kite)
> >- data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
> >- query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo,
> Apache Pig, Presto, Apache Spark SQL)
> >
> >"At Twitter, Parquet has helped us scale our big data usage by in some
> cases reducing storage requirements by one third on large datasets as well
> as scan and deserialization time. This translated into hardware savings as
> well as reduced latency for accessing the data. Furthermore, Parquet being
> integrated with so many tools creates opportunities and flexibility
> regarding query engines," said Chris Aniszczyk, Head of Open Source at
> Twitter. "Finally, it's just fantastic to see it graduate to a top-level
> project and we look forward to further collaborating with the Apache
> Parquet community to continually improve performance."
> >
> >"Parquet’s integration with other object models, like Avro and Thrift,
> has been a key feature for our customers," said Ryan Blue, Software
> Engineer at Cloudera. "They can take advantage of columnar storage without
> changing the classes they already use in their production applications."
> >
> >"At Netflix, Parquet is the primary storage format for data warehousing.
> More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
> data that we query across a wide range of tools including Apache Hive,
> Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The
> performance benefit of columnar projection and statistics is a game changer
> for our big data platform," said Daniel Week, Software Engineer at Netflix.
> "We look forward to working with the Apache community to advance the state
> of big data storage with Parquet and are excited to see the project
> graduate to full Apache status."
> >
> >"I was extremely happy to see Parquet arrive as an Incubator project,"
> said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect,
> Instrument and Science Data Systems Section at NASA Jet Propulsion
> Laboratory. "After talking with some in its community there was a real
> match with
> >this columnar data format technology and its community with the way that
> we do things here at the ASF. Parquet has had an exemplar Incubation, and
> the project has big things ahead of it. I am encouraging my Data Science
> Team at NASA to evaluate it for data representation especially
> >as it relates to our science holdings in Earth, planetary and space
> sciences, and astrophysics."
> >
> >
> >Stripe? @cra reached out to Avi, said he would get something by Monday
> >Criteo?
> >
> >@@CLOSING QUOTE FROM JULIEN?
> >
> >Availability and Oversight
> >Apache Parquet software is released under the Apache License v2.0 and is
> overseen by a self-selected team of active contributors to the project. A
> Project Management Committee (PMC) guides the Project's day-to-day
> operations, including community development and product releases. For
> downloads, documentation, and ways to become involved with Apache Parquet,
> visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
> >
> >About the Apache Incubator
> >The Apache Incubator is the entry path for projects and codebases wishing
> to become part of the efforts at The Apache Software Foundation. All code
> donations from external organizations and existing external projects
> wishing to join the ASF enter through the Incubator to: 1) ensure all
> donations are in accordance with the ASF legal standards; and 2) develop
> new communities that adhere to our guiding principles. Incubation is
> required of all newly accepted projects until a further review indicates
> that the infrastructure, communications, and decision making process have
> stabilized in a manner consistent with other successful ASF projects. While
> incubation status is not necessarily a reflection of the completeness or
> stability of the code, it does indicate that the project has yet to be
> fully endorsed by the ASF. For more information, visit
> http://incubator.apache.org/.
> >
> >About The Apache Software Foundation (ASF)
> >Established in 1999, the all-volunteer Foundation oversees more than 350
> leading Open Source projects, including Apache HTTP Server --the world's
> most popular Web server software. Through the ASF's meritocratic process
> known as "The Apache Way," more than 500 individual Members and 4,500
> Committers successfully collaborate to develop freely available
> enterprise-grade software, benefiting millions of users worldwide:
> thousands of software solutions are distributed under the Apache License;
> and the community actively participates in ASF mailing lists, mentoring
> initiatives, and ApacheCon, the Foundation's official user conference,
> trainings, and expo. The ASF is a US 501(c)(3) charitable organization,
> funded by individual donations and corporate sponsors including Bloomberg,
> Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google,
> Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft,
> Pivotal, Produban, WANdisco, and Yahoo. For more information, visit
> http://www.apache.org/ or follow @TheASF on Twitter.
> >
> >© The Apache Software Foundation. "Apache", "Avro", "Apache Avro",
> "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache
> Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache
> Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks
> or trademarks of the Apache Software Foundation in the United States and/or
> other countries. All other brands and trademarks are the property of their
> respective owners.
> >
> ># # #
> >
> >
> >________________________________
> >
> >From: Chris Aniszczyk <caniszczyk@gmail.com>
> >To: "dev@parquet.incubator.apache.org" <dev@parquet.incubator.apache.org>
> >Cc: Sally Khudairi <sallykhudairi@yahoo.com>; Ryan Blue <blue@cloudera.com>; "jfarrell@apache.org" <jfarrell@apache.org>; "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>
> >Sent: Wednesday, 22 April 2015, 14:51
> >Subject: Re: Graduation blog post?
> >
> >
> >
> >Thanks Daniel, I added your quote.
> >
> >
> >
> >
> >On Wed, Apr 22, 2015 at 12:14 PM, Daniel Weeks <dw...@netflix.com.invalid>
> wrote:
> >
> >Netflix Testimonial:
> >>
> >>At Netflix, Parquet is the primary storage format for data warehousing.
> >>More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
> >>data that we query across a wide range of tools including Apache Hive,
> >>Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The
> >>performance benefit of columnar projection and statistics is a game
> changer
> >>for our big data platform.  We look forward to working with the Apache
> >>community to advance the state of big data storage with Parquet and are
> >>excited to see the project graduate to full Apache status.
> >>
> >>Daniel Weeks
> >>Engineering Manager - Big Data Compute
> >>Netflix
> >>
> >>
> >>On Wed, Apr 22, 2015 at 9:36 AM, Sally Khudairi <
> >>sallykhudairi@yahoo.com.invalid> wrote:
> >>
> >>> Thanks for the draft thus far, Ryan.
> >>> Can we please include at least one more industry testimonial?
> >>> Also, if you can please provide edit access to my account at
> >>> khudairi@gmail.com, that would be great.
> >>> Thanks in advance for this!
> >>> -Sally
> >>>
> >>>
> >>> From: Ryan Blue <blue@cloudera.com>
> >>> To: jfarrell@apache.org; Sally Khudairi <sallykhudairi@yahoo.com>
> >>> Cc: "Mattmann, Chris A (3980)" <chris.a.mattmann@jpl.nasa.gov>; "press@apache.org" <press@apache.org>; "dev@parquet.incubator.apache.org" <dev@parquet.incubator.apache.org>
> >>>  Sent: Monday, 20 April 2015, 15:48
> >>>  Subject: Re: Graduation blog post?
> >>>
> >>> On 04/20/2015 12:36 PM, Jake Farrell wrote:
> >>> > Hey Sally
> >>> > i've got root@ karma and will take care of the infra side of things
> for
> >>> > us once the board has successfully voted on our resolution
> >>> >
> >>> > -Jake
> >>>
> >>> Thanks, Jake! I've already sent an e-mail to Infra, but I'll follow up
> >>> with this news so they don't worry about it.
> >>>
> >>> rb
> >>>
> >>>
> >>> --
> >>> Ryan Blue
> >>> Software Engineer
> >>> Cloudera, Inc.
> >>>
> >>>
> >>>
> >>>
> >>
> >
> >
> >--
> >
> >Cheers,
> >
> >Chris Aniszczyk
> >http://aniszczyk.org
> >+1 512 961 6719
> >
>
>

Re: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]

Posted by Sally Khudairi <sa...@yahoo.com.INVALID>.
Hi everyone --I haven't received any other feedback, so I think we're all set to announce tomorrow.
I'd like to issue the press release at 7AM ET. I'll confirm when we're live.
If there are any showstoppers, please let me know ASAP.
Thanks so much,
Sally

From: Sally Khudairi <sk...@apache.org>
To: Sally Khudairi <sa...@yahoo.com>; Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
Sent: Friday, 24 April 2015, 16:17
Subject: FINAL CALL: Apache Parquet TLP announcement [was Re: Graduation blog post?]

Hello again, everyone --below is the latest draft.

Please review and forward any changes/additions no later than 5PM ET on Sunday in order for us to announce on Monday morning. I was aiming to go live by 7AM ET if that works for you. 

Kindly confirm. 

Thanks in advance,
Sally

= = =

DRAFT :: NOT FOR DISTRIBUTION

The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level Project 

Open Source storage format for the Apache™ Hadoop® ecosystem in use at Cloudera, NASA, Netflix, Stripe and Twitter, among other organizations 

Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Parquet™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles. 

"The incubation process at Apache has been fantastic and really the last step of making Parquet a community driven standard fully integrated within the greater Hadoop ecosystem," said Julien Le Dem, Vice President of Apache Parquet. 

Apache Parquet is an Open Source columnar storage format for the Apache™ Hadoop® ecosystem, built to work across programming languages and much more: 


 - processing frameworks (MapReduce, Apache Spark, Scalding, Cascading, Crunch, Kite) 
 - data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs) 
 - query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache Pig, Presto, Apache Spark SQL) 

"At Twitter, Parquet has helped us scale our big data usage by in some cases reducing storage requirements by one third on large datasets as well as scan and deserialization time. This translated into hardware savings as well as reduced latency for accessing the data. Furthermore, Parquet being integrated with so many tools creates opportunities and flexibility regarding query engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally, it's just fantastic to see it graduate to a top-level project and we look forward to further collaborating with the Apache Parquet community to continually improve performance." 

"Parquet’s integration with other object models, like Avro and Thrift, has been a key feature for our customers," said Ryan Blue, Software Engineer at Cloudera. "They can take advantage of columnar storage without changing the classes they already use in their production applications." 

"At Netflix, Parquet is the primary storage format for data warehousing. More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that we query across a wide range of tools including Apache Hive, Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce. The performance benefit of columnar projection and statistics is a game changer for our big data platform," said Daniel Weeks, Software Engineer at Netflix. "We look forward to working with the Apache community to advance the state of big data storage with Parquet and are excited to see the project graduate to full Apache status." 

"Stripe's data warehouse has been built on Parquet from the beginning," said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, from data import to machine learning to adhoc SQL analysis, uses Apache Parquet as the common interchange format." 

"I was extremely happy to see Parquet arrive as an Incubator project," said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect, Instrument and Science Data Systems Section at NASA Jet Propulsion Laboratory. "After talking with some in its community there was a real match with this columnar data format technology and its community with the way that we do things here at the ASF. Parquet has had an exemplar Incubation, and the project has big things ahead of it. I am encouraging my Data Science Team at NASA to evaluate it for data representation especially as it relates to our science holdings in Earth, planetary and space sciences, and astrophysics." 

The Apache Parquet project welcomes contributions and community participation through mailing lists, face-to-face MeetUps, and user events. For more information, visit http://parquet.apache.org/community/ 

Availability and Oversight 
Apache Parquet software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Parquet, visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet 

About the Apache Incubator 
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/. 

About The Apache Software Foundation (ASF) 
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Bloomberg, Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter. 

© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners. 

# # # 

[MEDIA CONTACT:SALLY]
________________________________


From: Sally Khudairi <sa...@yahoo.com>
To: Sally Khudairi <sa...@yahoo.com>; Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org> 
Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org> 
Sent: Friday, 24 April 2015, 13:56
Subject: Re: Graduation blog post?



Done.

ALL: can you please let me know if there are any events that Parquet will be at? Presenting? Hosting? etc. 

Thank you!
 
-Sally





________________________________
From: Sally Khudairi <sa...@yahoo.com>
To: Daniel Weeks <dw...@netflix.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org> 
Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org> 
Sent: Friday, 24 April 2015, 13:40
Subject: Re: Graduation blog post?



Of course --I'll fix that now!

Sorry about that, Daniel.

-Sally
 





________________________________
From: Daniel Weeks <dw...@netflix.com>
To: dev@parquet.incubator.apache.org; Sally Khudairi <sa...@yahoo.com> 
Cc: Chris Aniszczyk <ca...@gmail.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org> 
Sent: Friday, 24 April 2015, 13:38
Subject: Re: Graduation blog post?



Sally,

Just wanted to comment that my last name is misspelled in the Netflix testimonial.  Can someone fix that?  (it's Weeks, not Week)

Thanks,
Dan




On Fri, Apr 24, 2015 at 10:23 AM, Sally Khudairi <sa...@yahoo.com.invalid> wrote:

Hi everyone --there's been the addition of a quote from Stripe:
>
>"Stripe's data warehouse has been built on Parquet from the beginning," said Avi Bryant, Engineering Manager at Stripe. "Every aspect of our pipeline, from data import to machine learning to adhoc SQL analysis, uses Apache Parquet as the common interchange format."
>
>
>--please note that I added "Apache" to "Parquet" in the second sentence. Stripe has also been added to the sub-head.
>
>Are we waiting for quotes from anyone else? If not, I can add a closing sentence and forward the final copy later today.
>
>Thanks so much,
>Sally
>
>
>
>----- Original Message -----
>
>From: Sally Khudairi <sa...@yahoo.com>
>To: Chris Aniszczyk <ca...@gmail.com>; "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>Cc: Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>Sent: Thursday, 23 April 2015, 15:25
>Subject: Re: Graduation blog post?
>
>Hello everyone --below is the draft thus far.
>
>
>I was aiming to announce on Monday by 7AM ET, but noticed that we're waiting for additional quotes.
>
>Also, should we get a closing quote from Julien? Perhaps something that invites additional community participation?
>
>Please let me know your thoughts.
>
>Thanks so much,
>Sally
>
>= = =
>
>The Apache Software Foundation Announces Apache™ Parquet™ as a Top-Level Project
>
>Open Source storage format for the Apache™ Hadoop® ecosystem in use at Cloudera, NASA, Netflix, and Twitter, among other organizations
>
>Forest Hill, MD –27 April 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Parquet™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.
>
>"The incubation process at Apache has been fantastic and really the last step of making Parquet a community driven standard fully integrated within the greater Hadoop ecosystem." said Julien Le Dem, Vice President of Apache Parquet.
>
>Apache Parquet is an Open Source columnar storage format for the Apache™ Hadoop® ecosystem, built to work across programming languages and much more:
>- processing frameworks (MapReduce, Apache Spark, Scalding, Cascading, Crunch, Kite)
>- data models (Apache Avro, Apache Thrift, Protocol Buffers, POJOs)
>- query engines (Apache Hive, Impala, HAWQ, Apache Drill, Apache Tajo, Apache Pig, Presto, Apache Spark SQL)
>
>"At Twitter, Parquet has helped us scale our big data usage by in some cases reducing storage requirements by one third on large datasets as well as scan and deserialization time. This translated into hardware savings as well as reduced latency for accessing the data. Furthermore, Parquet being integrated with so many tools creates opportunities and flexibility regarding query engines," said Chris Aniszczyk, Head of Open Source at Twitter. "Finally, it's just fantastic to see it graduate to a top-level project and we look forward to further collaborating with the Apache Parquet community to continually improve performance."
>
>"Parquet’s integration with other object models, like Avro and Thrift, has been a key feature for our customers," said Ryan Blue, Software Engineer at Cloudera. "They can take advantage of columnar storage without changing the classes they already use in their production applications."
>
>"At Netflix, Parquet is the primary storage format for data warehousing. More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted data that we query across a wide range of tools including Apache Hive, Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The performance benefit of columnar projection and statistics is a game changer for our big data platform," said Daniel Week, Software Engineer at Netflix. "We look forward to working with the Apache community to advance the state of big data storage with Parquet and are excited to see the project graduate to full Apache status."
>
>"I was extremely happy to see Parquet arrive as an Incubator project," said Chris Mattmann, Apache Parquet Incubator Mentor, and Chief Architect, Instrument and Science Data Systems Section at NASA Jet Propulsion Laboratory. "After talking with some in its community there was a real match with
>this columnar data format technology and its community with the way that we do things here at the ASF. Parquet has had an exemplar Incubation, and the project has big things ahead of it. I am encouraging my Data Science Team at NASA to evaluate it for data representation especially
>as it relates to our science holdings in Earth, planetary and space sciences, and astrophysics."
>
>
>Stripe? @cra reached out to Avi, said he would get something by Monday
>Criteo?
>
>@@CLOSING QUOTE FROM JULIEN?
>
>Availability and Oversight
>Apache Parquet software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Parquet, visit http://parquet.apache.org/ and https://twitter.com/ApacheParquet
>
>About the Apache Incubator
>The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/.
>
>About The Apache Software Foundation (ASF)
>Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Bloomberg, Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter.
>
>© The Apache Software Foundation. "Apache", "Avro", "Apache Avro", "Drill", "Apache Drill", "Hadoop", "Apache Hadoop", "Parquet", "Apache Parquet", "Pig", "Apache Pig", "Spark", "Apache Spark", "Tajo", "Apache Tajo", "Thrift", "Apache Thrift", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
>
># # #
>
>
>________________________________
>
>From: Chris Aniszczyk <ca...@gmail.com>
>To: "dev@parquet.incubator.apache.org" <de...@parquet.incubator.apache.org>
>Cc: Sally Khudairi <sa...@yahoo.com>; Ryan Blue <bl...@cloudera.com>; "jfarrell@apache.org" <jf...@apache.org>; "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "press@apache.org" <pr...@apache.org>
>Sent: Wednesday, 22 April 2015, 14:51
>Subject: Re: Graduation blog post?
>
>
>
>Thanks Daniel, I added your quote.
>
>
>
>
>On Wed, Apr 22, 2015 at 12:14 PM, Daniel Weeks <dw...@netflix.com.invalid> wrote:
>
>Netflix Testimonial:
>>
>>At Netflix, Parquet is the primary storage format for data warehousing.
>>More than 7 petabytes of our 10+ Petabyte warehouse is Parquet formatted
>>data that we query across a wide range of tools including Apache Hive,
>>Apache Pig, Apache Spark, PigPen, Presto, and native MapReduce.  The
>>performance benefit of columnar projection and statistics is a game changer
>>for our big data platform.  We look forward to working with the Apache
>>community to advance the state of big data storage with Parquet and are
>>excited to see the project graduate to full Apache status.
>>
>>Daniel Weeks
>>Engineering Manager - Big Data Compute
>>Netflix
>>
>>
>>On Wed, Apr 22, 2015 at 9:36 AM, Sally Khudairi <
>>sallykhudairi@yahoo.com.invalid> wrote:
>>
>>> Thanks for the draft thus far, Ryan.
>>> Can we please include at least one more industry testimonial?
>>> Also, if you can please provide edit access to my account at
>>> khudairi@gmail.com, that would be great.
>>> Thanks in advance for this!
>>> -Sally
>>>
>>>
>>>      From: Ryan Blue <bl...@cloudera.com>
>>>  To: jfarrell@apache.org; Sally Khudairi <sa...@yahoo.com>
>>> Cc: "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>; "
>>> press@apache.org" <pr...@apache.org>; "dev@parquet.incubator.apache.org" <
>>> dev@parquet.incubator.apache.org>
>>>  Sent: Monday, 20 April 2015, 15:48
>>>  Subject: Re: Graduation blog post?
>>>
>>> On 04/20/2015 12:36 PM, Jake Farrell wrote:
>>> > Hey Sally
>>> > i've got root@ karma and will take care of the infra side of things for
>>> > us once the board has successfully voted on our resolution
>>> >
>>> > -Jake
>>>
>>> Thanks, Jake! I've already sent an e-mail to Infra, but I'll follow up
>>> with this news so they don't worry about it.
>>>
>>> rb
>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Cloudera, Inc.
>>>
>>>
>>>
>>>
>>
>
>
>--
>
>Cheers,
>
>Chris Aniszczyk
>http://aniszczyk.org
>+1 512 961 6719
>