Posted to user@hawq.apache.org by Bernard Fraenkel <bf...@gmail.com> on 2015/11/13 16:35:21 UTC
Unsubscribe

Unsubscribe


On Fri, Nov 13, 2015 at 6:59 AM, Dan Baskette <db...@gmail.com> wrote:

> Hive doesn't have the level of SQL support that HAWQ provides, especially
> around sub-selects. Spark SQL supports only a subset of HiveQL, so the
> difference there is even bigger.
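
To make the sub-select point concrete, here is a minimal sketch of a correlated sub-select, the kind of query where engines tend to differ in support. The table and column names are made up for illustration, and SQLite (from Python's standard library) stands in for the database, so this shows the shape of the SQL only and says nothing about what a particular HAWQ, Hive, or Spark SQL version accepts.

```python
import sqlite3

# Assumption: SQLite is only a stand-in here; HAWQ itself is not involved.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'alice', 10.0),
        (2, 'alice', 30.0),
        (3, 'bob',    5.0),
        (4, 'bob',    7.0);
""")

# Correlated sub-select: the inner query refers to the outer row
# (o.customer), so each order is compared with the average amount
# of the same customer's orders.
query = """
    SELECT o.id
    FROM orders AS o
    WHERE o.amount > (
        SELECT AVG(i.amount)
        FROM orders AS i
        WHERE i.customer = o.customer
    )
    ORDER BY o.id
"""
above_average = [row[0] for row in conn.execute(query)]
print(above_average)
```

The reference from the inner query to `o.customer` is what makes the sub-select correlated; an engine without that feature would need the query rewritten as a join against a pre-aggregated table.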
>
> Sent from my iPhone
>
> On Nov 13, 2015, at 9:39 AM, Biswas, Supriya <Su...@nielsen.com>
> wrote:
>
> Hello All –
>
>
>
> Hive 0.14 supports ACID and also supports transactions. Spark supports
> Hive queries (HQL).
>
>
>
> Has anyone compared HAWQ with Spark SQL or with Hive (HQL) on Spark?
>
>
>
> Thanks.
>
>
>
>
> *Supriyo Biswas *Architect – CPS Service Delivery
> The Nielsen Company
> Office (516) 682-6021/NETS 249-6021
>
> Cell     (516) 353-6795
> www.nielsen.com
>
>
>
> *From:* Atri Sharma [mailto:atri@apache.org <at...@apache.org>]
> *Sent:* Friday, November 13, 2015 3:53 AM
> *To:* user@hawq.incubator.apache.org
> *Subject:* Re: what is Hawq?
>
>
>
> Greenplum is open sourced.
>
> The main difference between the two engines is that HAWQ is geared toward
> Hadoop-based systems, whereas Greenplum targets a regular file system. This
> is a very high-level distinction; the actual differences are more detailed,
> but that is the one-line summary.
>
> On 13 Nov 2015 14:20, "Adaryl "Bob" Wakefield, MBA" <
> adaryl.wakefield@hotmail.com> wrote:
>
> Is Greenplum free? I heard they open sourced it but I haven’t found
> anything but a community edition.
>
>
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics, LLC
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>
>
>
> *From:* dortmont <do...@gmail.com>
>
> *Sent:* Friday, November 13, 2015 2:42 AM
>
> *To:* user@hawq.incubator.apache.org
>
> *Subject:* Re: what is Hawq?
>
>
>
> I see the advantage of HAWQ compared to other Hadoop SQL engines. It looks
> like the most mature solution on Hadoop thanks to the PostgreSQL-based
> engine.
>
>
>
> But why wouldn't I use Greenplum instead of HAWQ? It has even better
> performance and it supports updates.
>
>
> Cheers
>
>
>
> 2015-11-13 7:45 GMT+01:00 Atri Sharma <at...@apache.org>:
>
> +1 for transactions.
>
> I think a major plus point is that HAWQ supports transactions, and this
> enables a lot of critical workloads to be run on HAWQ.
>
> On 13 Nov 2015 12:13, "Lei Chang" <ch...@gmail.com> wrote:
>
>
>
> As Bob said, HAWQ is a complete database and Drill is just a query
> engine.
>
>
>
> HAWQ also has a lot of other benefits over Drill, for example:
>
>
>
> 1. SQL completeness: HAWQ is the best of the SQL-on-Hadoop engines and can
> run all TPC-DS queries without any changes. It also supports almost all
> third-party tools, such as Tableau.
>
> 2. Performance: proven to be the best in the Hadoop world.
>
> 3. Scalability: highly scalable via a high-speed UDP-based interconnect.
>
> 4. Transactions: as far as I know, Drill does not support transactions.
> Without them, it is a nightmare for end users to maintain consistency.
>
> 5. Advanced resource management: HAWQ has the most advanced resource
> management. It natively supports YARN and easy-to-use hierarchical resource
> queues. Resources can be managed and enforced at the query and operator level.
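
On point 4, a rough sketch of what transactions buy an end user: a batch load that either lands completely or not at all. This is an illustration under stated assumptions, not HAWQ code. SQLite from Python's standard library stands in for the database, and the `events` table and `load_batch` helper are hypothetical names.

```python
import sqlite3

# Assumption: SQLite is a stand-in for any transactional SQL database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)")
conn.commit()

def load_batch(conn, rows):
    """Insert all rows in one transaction; return True if it committed."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if any statement raises.
        with conn:
            conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False

first = load_batch(conn, [(1, "a"), (2, "b")])
# The second row reuses id 3, so the whole batch is rolled back,
# including the first row that had already been inserted.
second = load_batch(conn, [(3, "c"), (3, "duplicate id")])
row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(first, second, row_count)
```

Without a transaction around the batch, the failed load would leave its first row behind, and the caller would have to clean up by hand.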
>
>
>
> Cheers
>
> Lei
>
>
>
>
>
> On Fri, Nov 13, 2015 at 9:34 AM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
> There are a lot of tools that do a lot of things. Believe me, it’s a
> full-time job keeping track of what is going on in the Apache world. As I
> understand it, Drill is just a query engine while HAWQ is an actual
> database...somewhat, anyway.
>
>
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics, LLC
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>
>
>
> *From:* Will Wagner <wo...@gmail.com>
>
> *Sent:* Thursday, November 12, 2015 7:42 AM
>
> *To:* user@hawq.incubator.apache.org
>
> *Subject:* Re: what is Hawq?
>
>
>
> Hi Lei,
>
> Great answer.
>
> I have a follow up question.
> Everything HAWQ is capable of doing is already covered by Apache Drill.
> Why do we need another tool?
>
> Thank you,
> Will W
>
> On Nov 12, 2015 12:25 AM, "Lei Chang" <ch...@gmail.com> wrote:
>
>
>
> Hi Bob,
>
>
>
> Apache HAWQ is a Hadoop-native SQL query engine that combines the key
> technological advantages of an MPP database with the scalability and
> convenience of Hadoop. HAWQ reads data from and writes data to HDFS
> natively. HAWQ delivers industry-leading performance and linear
> scalability. It provides users the tools to confidently and successfully
> interact with petabyte-range data sets. HAWQ provides users with a
> complete, standards-compliant SQL interface. More specifically, HAWQ has
> the following features:
>
> · On-premises or cloud deployment
> · Robust ANSI SQL compliance: SQL-92, SQL-99, SQL-2003, OLAP extensions
> · Extremely high performance: many times faster than other Hadoop SQL
>   engines
> · World-class parallel optimizer
> · Full transaction capability and consistency guarantees: ACID
> · Dynamic data flow engine through a high-speed UDP-based interconnect
> · Elastic execution engine based on virtual segments and data locality
> · Support for multi-level partitioning and list/range-based partitioned
>   tables
> · Multiple compression methods: snappy, gzip, quicklz, RLE
> · Multi-language user-defined function support: Python, Perl, Java,
>   C/C++, R
> · Advanced machine learning and data mining functionality through MADlib
> · Dynamic node expansion: in seconds
> · Most advanced three-level resource management: integrates with YARN and
>   hierarchical resource queues
> · Easy access to all HDFS data and external system data (for example,
>   HBase)
> · Hadoop native: from storage (HDFS) and resource management (YARN) to
>   deployment (Ambari)
> · Authentication and granular authorization: Kerberos, SSL, and
>   role-based access
> · Advanced C/C++ access libraries for HDFS and YARN: libhdfs3 and libYARN
> · Support for most third-party tools: Tableau, SAS, and others
> · Standard connectivity: JDBC/ODBC
>
>
>
> The following link gives more information about HAWQ:
> https://cwiki.apache.org/confluence/display/HAWQ/About+HAWQ
>
>
>
>
>
> Please also see the inline answers to your specific questions:
>
>
>
> On Thu, Nov 12, 2015 at 4:09 PM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
> Silly question, right? Thing is, I’ve read a bit and watched some YouTube
> videos and I’m still not quite sure what I can and can’t do with HAWQ. Is
> it a true database or is it like Hive where I need to use HCatalog?
>
>
>
> It is a true database; you can think of it as a parallel Postgres with
> much more functionality, and it works natively in the Hadoop world.
> HCatalog is not necessary, but you can read data registered in HCatalog
> with the new "HCatalog integration" feature.
>
>
>
> Can I write data intensive applications against it using ODBC? Does it
> enforce referential integrity? Does it have stored procedures?
>
>
>
> ODBC: yes, both JDBC and ODBC are supported.
>
> Referential integrity: currently not supported.
>
> Stored procedures: yes.
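
Since referential integrity is not enforced, an application that depends on it would have to check for orphaned rows itself. Below is a minimal sketch of such a check using an anti-join. The `customers` and `orders` tables are hypothetical, and SQLite from Python's standard library is used as a stand-in, so treat it as an outline of the query rather than something tested against HAWQ.

```python
import sqlite3

# Assumption: SQLite is a stand-in; no foreign key is declared,
# which mirrors a database that does not enforce the relationship.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2);
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 99);
""")

# Anti-join: keep the orders whose customer_id matches no customer row.
orphan_query = """
    SELECT o.id
    FROM orders AS o
    LEFT JOIN customers AS c ON c.id = o.customer_id
    WHERE c.id IS NULL
    ORDER BY o.id
"""
orphans = [row[0] for row in conn.execute(orphan_query)]
print(orphans)
```

Running a check like this after each load, inside the same transaction as the load, is one way to keep child rows from referring to parents that do not exist.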
>
>
>
> B.
>
>
>
>
>
> Please let us know if you have any other questions.
>
>
>
> Cheers
>
> Lei
>