You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@impala.apache.org by "Wes McKinney (Code Review)" <ge...@cloudera.org> on 2016/06/03 06:41:44 UTC

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Wes McKinney has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3296

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................

Rough draft of Cython interface to libhs2client and start of DB API 2.0
compliance layer (for compatibility with other systems that expect the standard
Python database driver interface).

Draft fetchall code path to a pandas-compatible NumPy representation

- Integers (with and without nulls)
- Floating point
- String (UTF-8 only for now)
- Boolean
- Timestamps (using pandas.to_datetime for ISO-8601 parsing)
- Decimal (to Python Decimal objects)

There is some refactoring of the C++ ColumnarRowSet to enable template-based
column access.

Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
---
M .gitignore
M CMakeLists.txt
A python/.gitignore
A python/README.md
A python/hs2client/__init__.pxd
A python/hs2client/__init__.py
A python/hs2client/compat.py
A python/hs2client/converters.h
A python/hs2client/dbapi.py
A python/hs2client/error.py
A python/hs2client/ext.pyx
A python/hs2client/libhs2client.pxd
A python/hs2client/tests/__init__.py
A python/hs2client/tests/common.py
A python/hs2client/tests/test_impala.py
A python/setup.py
M src/hs2client/api.h
M src/hs2client/columnar-row-set.cc
M src/hs2client/columnar-row-set.h
M src/hs2client/service.h
M src/hs2client/session.h
M src/hs2client/types.h
M src/hs2client/util.h
23 files changed, 2,175 insertions(+), 74 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/hs2client refs/changes/96/3296/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 1
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................


Patch Set 3:

You can close by marking as Verified and then a "Submit" box shows up, which just closes it since we don't have replication to github set up.

-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 3
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Wes McKinney <we...@apache.org>
Gerrit-HasComments: No

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 3
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Wes McKinney <we...@apache.org>
Gerrit-HasComments: No

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Posted by "Wes McKinney (Code Review)" <ge...@cloudera.org>.
Wes McKinney has uploaded a new patch set (#3).

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................

Rough draft of Cython interface to libhs2client and start of DB API 2.0
compliance layer (for compatibility with other systems that expect the standard
Python database driver interface).

Draft fetchall code path to a pandas-compatible NumPy representation

- Integers (with and without nulls)
- Floating point
- String (UTF-8 only for now)
- Boolean
- Timestamps (using pandas.to_datetime for ISO-8601 parsing)
- Decimal (to Python Decimal objects)

There is some refactoring of the C++ ColumnarRowSet to enable template-based
column access.

Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
---
M .gitignore
M CMakeLists.txt
A python/.gitignore
A python/README.md
A python/hs2client/__init__.pxd
A python/hs2client/__init__.py
A python/hs2client/compat.py
A python/hs2client/converters.h
A python/hs2client/dbapi.py
A python/hs2client/error.py
A python/hs2client/ext.pyx
A python/hs2client/libhs2client.pxd
A python/hs2client/tests/__init__.py
A python/hs2client/tests/common.py
A python/hs2client/tests/test_impala.py
A python/setup.py
M src/hs2client/api.h
M src/hs2client/columnar-row-set.cc
M src/hs2client/columnar-row-set.h
M src/hs2client/service.h
M src/hs2client/session.h
M src/hs2client/types.h
M src/hs2client/util.h
23 files changed, 2,186 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/hs2client refs/changes/96/3296/3
-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 3
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Wes McKinney <we...@apache.org>

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Posted by "Wes McKinney (Code Review)" <ge...@cloudera.org>.
Wes McKinney has uploaded a new patch set (#2).

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................

Rough draft of Cython interface to libhs2client and start of DB API 2.0
compliance layer (for compatibility with other systems that expect the standard
Python database driver interface).

Draft fetchall code path to a pandas-compatible NumPy representation

- Integers (with and without nulls)
- Floating point
- String (UTF-8 only for now)
- Boolean
- Timestamps (using pandas.to_datetime for ISO-8601 parsing)
- Decimal (to Python Decimal objects)

There is some refactoring of the C++ ColumnarRowSet to enable template-based
column access.

Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
---
M .gitignore
M CMakeLists.txt
A python/.gitignore
A python/README.md
A python/hs2client/__init__.pxd
A python/hs2client/__init__.py
A python/hs2client/compat.py
A python/hs2client/converters.h
A python/hs2client/dbapi.py
A python/hs2client/error.py
A python/hs2client/ext.pyx
A python/hs2client/libhs2client.pxd
A python/hs2client/tests/__init__.py
A python/hs2client/tests/common.py
A python/hs2client/tests/test_impala.py
A python/setup.py
M src/hs2client/api.h
M src/hs2client/columnar-row-set.cc
M src/hs2client/columnar-row-set.h
M src/hs2client/service.h
M src/hs2client/session.h
M src/hs2client/types.h
M src/hs2client/util.h
23 files changed, 2,174 insertions(+), 74 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/hs2client refs/changes/96/3296/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 2
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Wes McKinney <we...@apache.org>

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................


Patch Set 2:

(6 comments)

(Just a review of the C++ code)
Looks good. I think the refactoring makes sense, just a few small nits/questions. Thanks for fixing the header guards.

http://gerrit.cloudera.org:8080/#/c/3296/2/src/hs2client/columnar-row-set.cc
File src/hs2client/columnar-row-set.cc:

PS2, Line 33: ATTR
nit: ATTR_NAME ?


Line 52: 
nit: #undef VALUE_GETTER


PS2, Line 56: columns[i]
It might be worth adding a DCHECK to make sure i < columns.size()


Line 73: 
nit: #undef TYPED_GETTER


http://gerrit.cloudera.org:8080/#/c/3296/2/src/hs2client/columnar-row-set.h
File src/hs2client/columnar-row-set.h:

PS2, Line 56: std::string*
I'm fine with this, but just curious why switch from uint8_t*?


PS2, Line 71: nulls_->c_str()
Are we sure this won't incur a virtual function call? May be unexpectedly expensive if IsNull() is called in a loop.


-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 2
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Wes McKinney <we...@apache.org>
Gerrit-HasComments: Yes

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Posted by "Wes McKinney (Code Review)" <ge...@cloudera.org>.
Wes McKinney has posted comments on this change.

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/3296/2/src/hs2client/columnar-row-set.cc
File src/hs2client/columnar-row-set.cc:

Line 52: 
> nit: #undef VALUE_GETTER
Done (is this generally preferred even when defines have no visibility outside a particular .cc file?)


PS2, Line 56: columns[i]
> It might be worth adding a DCHECK to make sure i < columns.size()
Done


Line 73: 
> nit: #undef TYPED_GETTER
Done


http://gerrit.cloudera.org:8080/#/c/3296/2/src/hs2client/columnar-row-set.h
File src/hs2client/columnar-row-set.h:

PS2, Line 56: std::string*
> I'm fine with this, but just curious why switch from uint8_t*?
See https://github.com/cloudera/hs2client/issues/16. Some versions of Hive truncate the bitmap, so there may be fewer bits available than there are values. So we will need the bitmap size to be able to work around this problem.


PS2, Line 71: nulls_->c_str()
> Are we sure this won't incur a virtual function call? May be unexpectedly e
Reverting to uint8_t* for nulls (otherwise you have to do the reinterpret_cast everywhere you want the bitmap as const uint8_t*) and also adding a member for the length to make things more explicit (+ a comment explaining).


-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 2
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Wes McKinney <we...@apache.org>
Gerrit-HasComments: Yes

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 3
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Wes McKinney <we...@apache.org>
Gerrit-HasComments: No

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................


Patch Set 3:

Hm, I think maybe you needed to be in the Impala group on gerrit. I just added you.

-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 3
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Wes McKinney <we...@apache.org>
Gerrit-HasComments: No

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Posted by "Wes McKinney (Code Review)" <ge...@cloudera.org>.
Wes McKinney has posted comments on this change.

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................


Patch Set 3:

Merged (not sure how to close this on Gerrit)

-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 3
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Wes McKinney <we...@apache.org>
Gerrit-HasComments: No

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has submitted this change and it was merged.

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................


Rough draft of Cython interface to libhs2client and start of DB API 2.0
compliance layer (for compatibility with other systems that expect the standard
Python database driver interface).

Draft fetchall code path to a pandas-compatible NumPy representation

- Integers (with and without nulls)
- Floating point
- String (UTF-8 only for now)
- Boolean
- Timestamps (using pandas.to_datetime for ISO-8601 parsing)
- Decimal (to Python Decimal objects)

There is some refactoring of the C++ ColumnarRowSet to enable template-based
column access.

Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
---
M .gitignore
M CMakeLists.txt
A python/.gitignore
A python/README.md
A python/hs2client/__init__.pxd
A python/hs2client/__init__.py
A python/hs2client/compat.py
A python/hs2client/converters.h
A python/hs2client/dbapi.py
A python/hs2client/error.py
A python/hs2client/ext.pyx
A python/hs2client/libhs2client.pxd
A python/hs2client/tests/__init__.py
A python/hs2client/tests/common.py
A python/hs2client/tests/test_impala.py
A python/setup.py
M src/hs2client/api.h
M src/hs2client/columnar-row-set.cc
M src/hs2client/columnar-row-set.h
M src/hs2client/service.h
M src/hs2client/session.h
M src/hs2client/types.h
M src/hs2client/util.h
23 files changed, 2,186 insertions(+), 73 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 3
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Wes McKinney <we...@apache.org>

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Posted by "Wes McKinney (Code Review)" <ge...@cloudera.org>.
Wes McKinney has posted comments on this change.

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................


Patch Set 3:

I appear to not have the karma to perform those actions on this project.

-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 3
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Wes McKinney <we...@apache.org>
Gerrit-HasComments: No

[hs2client-CR] Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).

Posted by "Wes McKinney (Code Review)" <ge...@cloudera.org>.
Wes McKinney has posted comments on this change.

Change subject: Rough draft of Cython interface to libhs2client and start of DB API 2.0 compliance layer (for compatibility with other systems that expect the standard Python database driver interface).
......................................................................


Patch Set 1:

@Matt, feel free to gloss over the Python/Cython stuff, but let me know if you have any more thoughts on the refactoring I did in columnar-row-set.cc. The other changes to the core library are to fix linking / installation issues associated with using libhs2client as a dynamic library

-- 
To view, visit http://gerrit.cloudera.org:8080/3296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26be5932548d096088aa62bde0fe03e85a9a1e
Gerrit-PatchSet: 1
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Wes McKinney <we...@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Wes McKinney <we...@apache.org>
Gerrit-HasComments: No