Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/03 14:04:53 UTC

[GitHub] [arrow-datafusion] tustvold opened a new issue #1915: LIKE operator incorrectly handles newlines

tustvold opened a new issue #1915:
URL: https://github.com/apache/arrow-datafusion/issues/1915


   **Describe the bug**
   
   The LIKE operator doesn't appear to behave correctly with strings containing newlines.
   
   **To Reproduce**
   
   ```rust
   #[tokio::test]
   async fn like_on_multiline_string() -> Result<()> {
       let input = vec![Some("foo\nbar\nbaz")]
           .into_iter()
           .collect::<StringArray>();
   
       let batch = RecordBatch::try_from_iter(vec![("c1", Arc::new(input) as _)]).unwrap();
   
       let table = MemTable::try_new(batch.schema(), vec![vec![batch]])?;
       let mut ctx = ExecutionContext::new();
       ctx.register_table("test", Arc::new(table))?;
   
       let sql = "SELECT * FROM test WHERE c1 ~ 'bar'";
       let regex = execute_to_batches(&mut ctx, sql).await;
   
       let sql = "SELECT * FROM test WHERE c1 LIKE '%bar%'";
       let like = execute_to_batches(&mut ctx, sql).await;
   
       let expected = vec![
           "+-----+",
           "| c1  |",
           "+-----+",
           "| foo |",
           "| bar |",
           "| baz |",
           "+-----+",
       ];
   
       assert_batches_eq!(expected, &regex);
       assert_batches_eq!(expected, &like);
       Ok(())
   }
   ```
   
   The regex assertion passes, but the assertion for the LIKE expression, which should be equivalent, fails.
   
   **Expected behavior**
   
   The test should pass: `LIKE '%bar%'` should match the multi-line row, just as `~ 'bar'` does.
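   
   To make the expected semantics concrete, here is a minimal standalone sketch of LIKE matching in which `%` matches any run of characters, newlines included. The name `like_match` is hypothetical and this is not how arrow or DataFusion implement the operator; it ignores escape characters and only illustrates the behaviour the test above expects.
   
   ```rust
   /// Hypothetical reference matcher, not the arrow kernel.
   /// `%` matches any run of characters (possibly empty), `_` matches exactly
   /// one character. A newline is treated like any other character.
   /// Escape sequences are deliberately not handled.
   fn like_match(text: &str, pattern: &str) -> bool {
       let t: Vec<char> = text.chars().collect();
       let p: Vec<char> = pattern.chars().collect();
   
       let (mut ti, mut pi) = (0usize, 0usize);
       // Pattern index of the most recent `%` and the text index it was tried at.
       let mut backtrack: Option<(usize, usize)> = None;
   
       while ti < t.len() {
           if pi < p.len() && p[pi] == '%' {
               // Remember the wildcard and first try letting it match nothing.
               backtrack = Some((pi, ti));
               pi += 1;
           } else if pi < p.len() && (p[pi] == '_' || p[pi] == t[ti]) {
               // Single-character match, including '\n' against `_`.
               ti += 1;
               pi += 1;
           } else if let Some((wild_pi, wild_ti)) = backtrack {
               // Mismatch: let the last `%` absorb one more character and retry.
               backtrack = Some((wild_pi, wild_ti + 1));
               pi = wild_pi + 1;
               ti = wild_ti + 1;
           } else {
               return false;
           }
       }
   
       // The text is consumed; any pattern left over must be all `%`.
       p[pi..].iter().all(|&c| c == '%')
   }
   ```
   
   With this sketch, `like_match("foo\nbar\nbaz", "%bar%")` returns `true`, which is the result the LIKE query in the test is expected to produce.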
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on issue #1915: LIKE operator incorrectly handles newlines

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1915:
URL: https://github.com/apache/arrow-datafusion/issues/1915#issuecomment-1058088692


   This will likely require a fix in the arrow kernel. DataFusion just calls functions such as `like_utf8` here:
   https://github.com/apache/arrow-datafusion/blob/master/datafusion-physical-expr/src/expressions/binary.rs#L47-L51
   
   https://docs.rs/arrow/9.1.0/arrow/compute/kernels/comparison/fn.like_utf8.html

