You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/10/08 13:39:03 UTC

[GitHub] [arrow] naman1996 opened a new pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

naman1996 opened a new pull request #8201:
URL: https://github.com/apache/arrow/pull/8201


   Function takes in a normal string or a hexadecimal encoded string( Eg: \xca\xfe\xba\xbe) and converts it to VARBINARY (byte array).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r501024248



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -84,6 +83,15 @@ VAR_LEN_OP_TYPES(BINARY_RELATIONAL, greater_than_or_equal_to, >=)
   INNER(NAME, utf8)                \
   INNER(NAME, binary)
 
+int to_binary_from_hex(char ch) {

Review comment:
       will handle only valid hexadecimal characters
   a-f,  A-F,  0 - 9
   
   There is a check too before invoking this method _isxdigit(hd1) && isxdigit(hd2)_




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r498089255



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,51 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");
+    *out_len = 0;
+    return "";
+  }
+
+  if (text_len == 0) {
+    *out_len = 0;
+    return "";
+  }
+
+  // converting hex encoded string to normal string
+  int j = 0;
+  for (int i = 0; i < text_len; i++, j++) {
+    if (text[i] == '\\' && i + 3 < text_len &&
+        (text[i + 1] == 'x' || text[i + 1] == 'X')) {
+      char hex_string[3];
+      hex_string[0] = toupper(text[i + 2]);
+      hex_string[1] = toupper(text[i + 3]);
+      hex_string[2] = '\0';
+      uint8_t out;
+      arrow::Status st;
+      st = arrow::ParseHexValue(hex_string, &out);
+      st = arrow::ParseHexValue(hex_string, &out);

Review comment:
       just a typo during refactor. Changed it




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] praveenbingo closed pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
praveenbingo closed pull request #8201:
URL: https://github.com/apache/arrow/pull/8201


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] praveenbingo commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
praveenbingo commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r497996318



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,48 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");

Review comment:
       i still see the same above?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 edited a comment on pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 edited a comment on pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#issuecomment-694112182


   > We have hex encoding / decoding functions in arrow/util/string.h, is it possible for you to use them?
   
   I think I can use the arrow::ParseHexValue from string arrow/string/util. This would require a lot to type conversions though as that function takes in a string and returns the hex value an unsigned int. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 closed pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 closed pull request #8201:
URL: https://github.com/apache/arrow/pull/8201


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r498016721



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,48 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");

Review comment:
       Resolving based on offline discussion.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8201: ARROW-9956: [C++] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#issuecomment-693180918


   https://issues.apache.org/jira/browse/ARROW-9956


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] praveenbingo commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
praveenbingo commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r500985340



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -84,6 +83,15 @@ VAR_LEN_OP_TYPES(BINARY_RELATIONAL, greater_than_or_equal_to, >=)
   INNER(NAME, utf8)                \
   INNER(NAME, binary)
 
+int to_binary_from_hex(char ch) {

Review comment:
       inline..is this only to handle ascii characters?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 commented on pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 commented on pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#issuecomment-694185948


   Sorry was closed by mistake 😓 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#issuecomment-693453104


   We have hex encoding / decoding functions in arrow/util/string.h, is it possible for you to use them?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] praveenbingo commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
praveenbingo commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r495826677



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,48 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");

Review comment:
       function that can return errors needs to set that in the function definition : NativeFunction::kNeedsContext | NativeFunction::kCanReturnErrors

##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,48 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");
+    *out_len = 0;
+    return "";
+  }
+
+  if (text_len == 0) {
+    *out_len = 0;
+    return "";
+  }
+
+  // converting hex encoded string to normal string
+  int j = 0;
+  for (int i = 0; i < text_len; i++, j++) {
+    if (text[i] == '\\' && i + 3 < text_len &&
+        (text[i + 1] == 'x' || text[i + 1] == 'X')) {
+      std::string hex_string;

Review comment:
       same as before - please avoid use of std::string




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] praveenbingo commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
praveenbingo commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r497996878



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,51 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");
+    *out_len = 0;
+    return "";
+  }
+
+  if (text_len == 0) {
+    *out_len = 0;
+    return "";
+  }
+
+  // converting hex encoded string to normal string
+  int j = 0;
+  for (int i = 0; i < text_len; i++, j++) {
+    if (text[i] == '\\' && i + 3 < text_len &&
+        (text[i + 1] == 'x' || text[i + 1] == 'X')) {
+      char hex_string[3];
+      hex_string[0] = toupper(text[i + 2]);
+      hex_string[1] = toupper(text[i + 3]);
+      hex_string[2] = '\0';
+      uint8_t out;
+      arrow::Status st;

Review comment:
       have we used this in other places..are we sure it does not link in any streams which might cause issues..

##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,51 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");
+    *out_len = 0;
+    return "";
+  }
+
+  if (text_len == 0) {
+    *out_len = 0;
+    return "";
+  }
+
+  // converting hex encoded string to normal string
+  int j = 0;
+  for (int i = 0; i < text_len; i++, j++) {
+    if (text[i] == '\\' && i + 3 < text_len &&
+        (text[i + 1] == 'x' || text[i + 1] == 'X')) {
+      char hex_string[3];
+      hex_string[0] = toupper(text[i + 2]);
+      hex_string[1] = toupper(text[i + 3]);
+      hex_string[2] = '\0';
+      uint8_t out;
+      arrow::Status st;
+      st = arrow::ParseHexValue(hex_string, &out);
+      st = arrow::ParseHexValue(hex_string, &out);

Review comment:
       why are we doing this twice..




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r498158806



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,51 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");
+    *out_len = 0;
+    return "";
+  }
+
+  if (text_len == 0) {
+    *out_len = 0;
+    return "";
+  }
+
+  // converting hex encoded string to normal string
+  int j = 0;
+  for (int i = 0; i < text_len; i++, j++) {
+    if (text[i] == '\\' && i + 3 < text_len &&
+        (text[i + 1] == 'x' || text[i + 1] == 'X')) {
+      char hex_string[3];
+      hex_string[0] = toupper(text[i + 2]);
+      hex_string[1] = toupper(text[i + 3]);
+      hex_string[2] = '\0';
+      uint8_t out;
+      arrow::Status st;

Review comment:
       Checked the function. They are including string. So I have removed it's usages.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r496434956



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,48 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");

Review comment:
       added




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r498037287



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,51 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");
+    *out_len = 0;
+    return "";
+  }
+
+  if (text_len == 0) {
+    *out_len = 0;
+    return "";
+  }
+
+  // converting hex encoded string to normal string
+  int j = 0;
+  for (int i = 0; i < text_len; i++, j++) {
+    if (text[i] == '\\' && i + 3 < text_len &&
+        (text[i + 1] == 'x' || text[i + 1] == 'X')) {
+      char hex_string[3];
+      hex_string[0] = toupper(text[i + 2]);
+      hex_string[1] = toupper(text[i + 3]);
+      hex_string[2] = '\0';
+      uint8_t out;
+      arrow::Status st;

Review comment:
       Not used in other places in gandiva.
    I have run Dremio unit tests which were using this function on macOS. For linux a Gerrit build is currently running which should run unit tests which must us this function. So should be good IMO.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r498037287



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,51 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");
+    *out_len = 0;
+    return "";
+  }
+
+  if (text_len == 0) {
+    *out_len = 0;
+    return "";
+  }
+
+  // converting hex encoded string to normal string
+  int j = 0;
+  for (int i = 0; i < text_len; i++, j++) {
+    if (text[i] == '\\' && i + 3 < text_len &&
+        (text[i + 1] == 'x' || text[i + 1] == 'X')) {
+      char hex_string[3];
+      hex_string[0] = toupper(text[i + 2]);
+      hex_string[1] = toupper(text[i + 3]);
+      hex_string[2] = '\0';
+      uint8_t out;
+      arrow::Status st;

Review comment:
       Not used in other places.
    I have run Dremio unit tests which were using this function on macOS. For linux a Gerrit build is currently running which should run unit tests which must us this function. So should be good IMO.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 closed pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 closed pull request #8201:
URL: https://github.com/apache/arrow/pull/8201


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 closed pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 closed pull request #8201:
URL: https://github.com/apache/arrow/pull/8201


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r498158806



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,51 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");
+    *out_len = 0;
+    return "";
+  }
+
+  if (text_len == 0) {
+    *out_len = 0;
+    return "";
+  }
+
+  // converting hex encoded string to normal string
+  int j = 0;
+  for (int i = 0; i < text_len; i++, j++) {
+    if (text[i] == '\\' && i + 3 < text_len &&
+        (text[i + 1] == 'x' || text[i + 1] == 'X')) {
+      char hex_string[3];
+      hex_string[0] = toupper(text[i + 2]);
+      hex_string[1] = toupper(text[i + 3]);
+      hex_string[2] = '\0';
+      uint8_t out;
+      arrow::Status st;

Review comment:
       Checked the function. They are not including streams but are including string. So I have removed it's usages.

##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,51 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");
+    *out_len = 0;
+    return "";
+  }
+
+  if (text_len == 0) {
+    *out_len = 0;
+    return "";
+  }
+
+  // converting hex encoded string to normal string
+  int j = 0;
+  for (int i = 0; i < text_len; i++, j++) {
+    if (text[i] == '\\' && i + 3 < text_len &&
+        (text[i + 1] == 'x' || text[i + 1] == 'X')) {
+      char hex_string[3];
+      hex_string[0] = toupper(text[i + 2]);
+      hex_string[1] = toupper(text[i + 3]);
+      hex_string[2] = '\0';
+      uint8_t out;
+      arrow::Status st;

Review comment:
       Checked the includes. They are not including streams but are including string. So I have removed it's usages.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 commented on pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 commented on pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#issuecomment-694112182


   > We have hex encoding / decoding functions in arrow/util/string.h, is it possible for you to use them?
   
   I think I can use the arrow::ParseHexValue from string util. This would require a lot to type conversions though as that function takes in a string and returns the hex value an unsigned int. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 commented on pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 commented on pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#issuecomment-694185255


   > We have hex encoding / decoding functions in arrow/util/string.h, is it possible for you to use them?
   
   @wesm I have refactored the code to use arrow::ParseHexValue. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] praveenbingo closed pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
praveenbingo closed pull request #8201:
URL: https://github.com/apache/arrow/pull/8201


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] naman1996 commented on a change in pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

Posted by GitBox <gi...@apache.org>.
naman1996 commented on a change in pull request #8201:
URL: https://github.com/apache/arrow/pull/8201#discussion_r498016721



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -835,6 +835,48 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* binary_string(gdv_int64 context, const char* text, gdv_int32 text_len,
+                          gdv_int32* out_len) {
+  gdv_binary ret =
+      reinterpret_cast<gdv_binary>(gdv_fn_context_arena_malloc(context, text_len));
+
+  if (ret == nullptr) {
+    gdv_fn_context_set_error_msg(context, "Could not allocate memory for output string");

Review comment:
       Resolving based on offline discussion. Needed only for null internal functions




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org