Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/01/26 15:46:45 UTC

[GitHub] [arrow-cookbook] davisusanibar opened a new pull request #135: [Java]: Java cookbook for create arrow data manipulation

davisusanibar opened a new pull request #135:
URL: https://github.com/apache/arrow-cookbook/pull/135


   Java cookbook for create arrow data manipulation


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-cookbook] amol- commented on a change in pull request #135: [Java]: Java cookbook for create arrow data manipulation

Posted by GitBox <gi...@apache.org>.
amol- commented on a change in pull request #135:
URL: https://github.com/apache/arrow-cookbook/pull/135#discussion_r797532351



##########
File path: java/source/data.rst
##########
@@ -0,0 +1,360 @@
+=================
+Data manipulation
+=================
+
+Recipes related to comparing, filtering or transforming data.
+
+.. contents::
+
+Compare Vectors for Field Equality
+==================================
+
+.. testcode::
+
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.vector.compare.TypeEqualsVisitor;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector right = new IntVector("int", rootAllocator);
+    right.allocateNew(3);
+    right.set(0, 10);
+    right.set(1, 20);
+    right.set(2, 30);
+    right.setValueCount(3);
+    IntVector left1 = new IntVector("int", rootAllocator);
+    IntVector left2 = new IntVector("int2", rootAllocator);
+    TypeEqualsVisitor visitor = new TypeEqualsVisitor(right);
+
+    System.out.println(visitor.equals(left1));
+    System.out.println(visitor.equals(left2));
+
+.. testoutput::
+
+    true
+    false
+
+Compare Values on the Array
+===========================
+
+Comparing two values at the given indices in the vectors:
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.sort.StableVectorComparator;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.VarCharVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    class TestVectorValueComparator extends VectorValueComparator<VarCharVector> {
+        @Override
+        public int compareNotNull(int index1, int index2) {
+            byte b1 = vector1.get(index1)[0];
+            byte b2 = vector2.get(index2)[0];
+            return b1 - b2;
+        }
+
+        @Override
+        public VectorValueComparator<VarCharVector> createNew() {
+            return new TestVectorValueComparator();
+        }
+    }
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+
+    VarCharVector vec = new VarCharVector("valueindexcomparator", rootAllocator);
+    vec.allocateNew(3);
+    vec.setValueCount(3);
+    vec.set(0, "ba".getBytes());
+    vec.set(1, "abc".getBytes());
+    vec.set(2, "aa".getBytes());
+
+    VectorValueComparator<VarCharVector> comparatorValues = new TestVectorValueComparator();
+    VectorValueComparator<VarCharVector> stableComparator = new StableVectorComparator<>(comparatorValues);
+    stableComparator.attachVector(vec);
+
+    System.out.println(stableComparator.compare(0, 1) > 0);
+    System.out.println(stableComparator.compare(1, 2) < 0);
+
+.. testoutput::
+
+    true
+    true
+
+Search Values on the Array
+==========================
+
+Linear Search - O(n)
+********************
+
+Algorithm: org.apache.arrow.algorithm.search.VectorSearcher#linearSearch - O(n)
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.search.VectorSearcher;
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector linearSearchVector = new IntVector("linearSearchVector", rootAllocator);
+    linearSearchVector.allocateNew(10);
+    linearSearchVector.setValueCount(10);
+    for (int i = 0; i < 10; i++) {
+        linearSearchVector.set(i, i);
+    }
+    VectorValueComparator<IntVector> comparatorInt = DefaultVectorComparators.createDefaultComparator(linearSearchVector);
+    List<Integer> listResultLinearSearch = new ArrayList<Integer>();
+    for (int i = 0; i < 10; i++) {
+       int result = VectorSearcher.linearSearch(linearSearchVector, comparatorInt, linearSearchVector, i);
+       listResultLinearSearch.add(result);
+    }

Review comment:
       I think the for loop that searches for every value makes the recipe harder to understand. Let's stick to showing how to search for a single value; that will make the recipe easier to follow.
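
       A minimal plain-Java sketch of the single-value shape suggested here. It assumes an `int[]` standing in for the `IntVector` and a hypothetical `linearSearch` helper standing in for `VectorSearcher.linearSearch`; it only illustrates searching for one key and reading back one index, and is not the Arrow API itself.

```java
public class SingleValueLinearSearch {
    // Hypothetical stand-in for VectorSearcher.linearSearch (assumption, not Arrow code):
    // returns the index of the first element equal to key, or -1 when the key is absent.
    public static int linearSearch(int[] values, int key) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] == key) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Same data as the recipe: the values 0..9 stored at indices 0..9.
        int[] values = new int[10];
        for (int i = 0; i < 10; i++) {
            values[i] = i;
        }
        // Search for one value only, instead of looping over every key.
        int result = linearSearch(values, 3);
        System.out.println(result);
    }
}
```

       With a single key the expected output is a single index, which keeps the recipe's `testoutput` block short.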

##########
File path: java/source/data.rst
##########
@@ -0,0 +1,360 @@
+=================
+Data manipulation
+=================
+
+Recipes related to comparing, filtering or transforming data.
+
+.. contents::
+
+Compare Vectors for Field Equality
+==================================
+
+.. testcode::
+
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.vector.compare.TypeEqualsVisitor;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector right = new IntVector("int", rootAllocator);
+    right.allocateNew(3);
+    right.set(0, 10);
+    right.set(1, 20);
+    right.set(2, 30);
+    right.setValueCount(3);
+    IntVector left1 = new IntVector("int", rootAllocator);
+    IntVector left2 = new IntVector("int2", rootAllocator);
+    TypeEqualsVisitor visitor = new TypeEqualsVisitor(right);
+
+    System.out.println(visitor.equals(left1));
+    System.out.println(visitor.equals(left2));
+
+.. testoutput::
+
+    true
+    false
+
+Compare Values on the Array
+===========================
+
+Comparing two values at the given indices in the vectors:
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.sort.StableVectorComparator;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.VarCharVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    class TestVectorValueComparator extends VectorValueComparator<VarCharVector> {
+        @Override
+        public int compareNotNull(int index1, int index2) {
+            byte b1 = vector1.get(index1)[0];
+            byte b2 = vector2.get(index2)[0];
+            return b1 - b2;
+        }
+
+        @Override
+        public VectorValueComparator<VarCharVector> createNew() {
+            return new TestVectorValueComparator();
+        }
+    }
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+
+    VarCharVector vec = new VarCharVector("valueindexcomparator", rootAllocator);
+    vec.allocateNew(3);
+    vec.setValueCount(3);
+    vec.set(0, "ba".getBytes());
+    vec.set(1, "abc".getBytes());
+    vec.set(2, "aa".getBytes());
+
+    VectorValueComparator<VarCharVector> comparatorValues = new TestVectorValueComparator();
+    VectorValueComparator<VarCharVector> stableComparator = new StableVectorComparator<>(comparatorValues);
+    stableComparator.attachVector(vec);
+
+    System.out.println(stableComparator.compare(0, 1) > 0);
+    System.out.println(stableComparator.compare(1, 2) < 0);
+
+.. testoutput::
+
+    true
+    true
+
+Search Values on the Array
+==========================
+
+Linear Search - O(n)
+********************
+
+Algorithm: org.apache.arrow.algorithm.search.VectorSearcher#linearSearch - O(n)
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.search.VectorSearcher;
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector linearSearchVector = new IntVector("linearSearchVector", rootAllocator);
+    linearSearchVector.allocateNew(10);
+    linearSearchVector.setValueCount(10);
+    for (int i = 0; i < 10; i++) {
+        linearSearchVector.set(i, i);
+    }
+    VectorValueComparator<IntVector> comparatorInt = DefaultVectorComparators.createDefaultComparator(linearSearchVector);
+    List<Integer> listResultLinearSearch = new ArrayList<Integer>();
+    for (int i = 0; i < 10; i++) {
+       int result = VectorSearcher.linearSearch(linearSearchVector, comparatorInt, linearSearchVector, i);
+       listResultLinearSearch.add(result);
+    }
+
+    System.out.println(listResultLinearSearch);
+
+.. testoutput::
+
+    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
+
+Binary Search - O(log(n))
+*************************
+
+Algorithm: org.apache.arrow.algorithm.search.VectorSearcher#binarySearch - O(log(n))
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.search.VectorSearcher;
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector binarySearchVector = new IntVector("", rootAllocator);
+    binarySearchVector.allocateNew(10);
+    binarySearchVector.setValueCount(10);
+    for (int i = 0; i < 10; i++) {
+        binarySearchVector.set(i, i);
+    }
+    VectorValueComparator<IntVector> comparatorInt = DefaultVectorComparators.createDefaultComparator(binarySearchVector);
+    List<Integer> listResultBinarySearch = new ArrayList<Integer>();
+    for (int i = 0; i < 10; i++) {
+       int result = VectorSearcher.binarySearch(binarySearchVector, comparatorInt, binarySearchVector, i);
+       listResultBinarySearch.add(result);
+    }

Review comment:
       I think the for loop that searches for every value makes the recipe harder to understand. Let's stick to showing how to search for a single value; that will make the recipe easier to follow.
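
       A minimal plain-Java sketch of a single-key binary search, assuming a sorted `int[]` in place of the `IntVector` and a hypothetical `binarySearch` helper in place of `VectorSearcher.binarySearch`. It shows the O(log(n)) halving step for one key and is not the Arrow implementation.

```java
public class SingleValueBinarySearch {
    // Hypothetical stand-in for VectorSearcher.binarySearch (assumption, not Arrow code):
    // the input must be sorted in ascending order; returns the index of key, or -1 if absent.
    public static int binarySearch(int[] sorted, int key) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            // Unsigned shift avoids overflow when low + high exceeds Integer.MAX_VALUE.
            int mid = (low + high) >>> 1;
            if (sorted[mid] < key) {
                low = mid + 1;   // key can only be in the upper half
            } else if (sorted[mid] > key) {
                high = mid - 1;  // key can only be in the lower half
            } else {
                return mid;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Same data as the recipe: the sorted values 0..9.
        int[] values = new int[10];
        for (int i = 0; i < 10; i++) {
            values[i] = i;
        }
        // Search for one value only.
        System.out.println(binarySearch(values, 7));
    }
}
```

       Binary search only gives a meaningful answer on sorted input, which is worth stating next to the recipe.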







[GitHub] [arrow-cookbook] amol- merged pull request #135: [Java]: Java cookbook for create arrow data manipulation

Posted by GitBox <gi...@apache.org>.
amol- merged pull request #135:
URL: https://github.com/apache/arrow-cookbook/pull/135


   





[GitHub] [arrow-cookbook] amol- commented on a change in pull request #135: [Java]: Java cookbook for create arrow data manipulation

Posted by GitBox <gi...@apache.org>.
amol- commented on a change in pull request #135:
URL: https://github.com/apache/arrow-cookbook/pull/135#discussion_r797521878



##########
File path: java/source/data.rst
##########
@@ -0,0 +1,360 @@
+=================
+Data manipulation
+=================
+
+Recipes related to comparing, filtering or transforming data.
+
+.. contents::
+
+Compare Vectors for Field Equality
+==================================
+
+.. testcode::
+
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.vector.compare.TypeEqualsVisitor;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector right = new IntVector("int", rootAllocator);
+    right.allocateNew(3);
+    right.set(0, 10);
+    right.set(1, 20);
+    right.set(2, 30);
+    right.setValueCount(3);
+    IntVector left1 = new IntVector("int", rootAllocator);
+    IntVector left2 = new IntVector("int2", rootAllocator);
+    TypeEqualsVisitor visitor = new TypeEqualsVisitor(right);
+
+    System.out.println(visitor.equals(left1));
+    System.out.println(visitor.equals(left2));
+
+.. testoutput::
+
+    true
+    false
+
+Compare Values on the Array

Review comment:
       I think we might also want a recipe that shows usage of `VectorEqualsVisitor`, so readers can see how to compare the arrays themselves. It should probably be the first recipe in this chapter.







[GitHub] [arrow-cookbook] davisusanibar commented on a change in pull request #135: [Java]: Java cookbook for create arrow data manipulation

Posted by GitBox <gi...@apache.org>.
davisusanibar commented on a change in pull request #135:
URL: https://github.com/apache/arrow-cookbook/pull/135#discussion_r797980225



##########
File path: java/source/data.rst
##########
@@ -0,0 +1,360 @@
+=================
+Data manipulation
+=================
+
+Recipes related to comparing, filtering or transforming data.
+
+.. contents::
+
+Compare Vectors for Field Equality
+==================================
+
+.. testcode::
+
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.vector.compare.TypeEqualsVisitor;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector right = new IntVector("int", rootAllocator);
+    right.allocateNew(3);
+    right.set(0, 10);
+    right.set(1, 20);
+    right.set(2, 30);
+    right.setValueCount(3);
+    IntVector left1 = new IntVector("int", rootAllocator);
+    IntVector left2 = new IntVector("int2", rootAllocator);
+    TypeEqualsVisitor visitor = new TypeEqualsVisitor(right);
+
+    System.out.println(visitor.equals(left1));
+    System.out.println(visitor.equals(left2));
+
+.. testoutput::
+
+    true
+    false
+
+Compare Values on the Array
+===========================
+
+Comparing two values at the given indices in the vectors:
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.sort.StableVectorComparator;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.VarCharVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    class TestVectorValueComparator extends VectorValueComparator<VarCharVector> {
+        @Override
+        public int compareNotNull(int index1, int index2) {
+            byte b1 = vector1.get(index1)[0];
+            byte b2 = vector2.get(index2)[0];
+            return b1 - b2;
+        }
+
+        @Override
+        public VectorValueComparator<VarCharVector> createNew() {
+            return new TestVectorValueComparator();
+        }
+    }

Review comment:
       Deleted

##########
File path: java/source/data.rst
##########
@@ -0,0 +1,360 @@
+=================
+Data manipulation
+=================
+
+Recipes related to comparing, filtering or transforming data.
+
+.. contents::
+
+Compare Vectors for Field Equality
+==================================
+
+.. testcode::
+
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.vector.compare.TypeEqualsVisitor;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector right = new IntVector("int", rootAllocator);
+    right.allocateNew(3);
+    right.set(0, 10);
+    right.set(1, 20);
+    right.set(2, 30);
+    right.setValueCount(3);
+    IntVector left1 = new IntVector("int", rootAllocator);
+    IntVector left2 = new IntVector("int2", rootAllocator);
+    TypeEqualsVisitor visitor = new TypeEqualsVisitor(right);
+
+    System.out.println(visitor.equals(left1));
+    System.out.println(visitor.equals(left2));
+
+.. testoutput::
+
+    true
+    false
+
+Compare Values on the Array

Review comment:
       Added

##########
File path: java/source/data.rst
##########
@@ -0,0 +1,360 @@
+=================
+Data manipulation
+=================
+
+Recipes related to comparing, filtering or transforming data.
+
+.. contents::
+
+Compare Vectors for Field Equality
+==================================
+
+.. testcode::
+
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.vector.compare.TypeEqualsVisitor;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector right = new IntVector("int", rootAllocator);
+    right.allocateNew(3);
+    right.set(0, 10);
+    right.set(1, 20);
+    right.set(2, 30);
+    right.setValueCount(3);
+    IntVector left1 = new IntVector("int", rootAllocator);
+    IntVector left2 = new IntVector("int2", rootAllocator);
+    TypeEqualsVisitor visitor = new TypeEqualsVisitor(right);
+
+    System.out.println(visitor.equals(left1));
+    System.out.println(visitor.equals(left2));
+
+.. testoutput::
+
+    true
+    false
+
+Compare Values on the Array
+===========================
+
+Comparing two values at the given indices in the vectors:
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.sort.StableVectorComparator;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.VarCharVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    class TestVectorValueComparator extends VectorValueComparator<VarCharVector> {
+        @Override
+        public int compareNotNull(int index1, int index2) {
+            byte b1 = vector1.get(index1)[0];
+            byte b2 = vector2.get(index2)[0];
+            return b1 - b2;
+        }
+
+        @Override
+        public VectorValueComparator<VarCharVector> createNew() {
+            return new TestVectorValueComparator();
+        }
+    }
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+
+    VarCharVector vec = new VarCharVector("valueindexcomparator", rootAllocator);
+    vec.allocateNew(3);
+    vec.setValueCount(3);
+    vec.set(0, "ba".getBytes());
+    vec.set(1, "abc".getBytes());
+    vec.set(2, "aa".getBytes());
+
+    VectorValueComparator<VarCharVector> comparatorValues = new TestVectorValueComparator();
+    VectorValueComparator<VarCharVector> stableComparator = new StableVectorComparator<>(comparatorValues);
+    stableComparator.attachVector(vec);
+
+    System.out.println(stableComparator.compare(0, 1) > 0);
+    System.out.println(stableComparator.compare(1, 2) < 0);
+
+.. testoutput::
+
+    true
+    true
+
+Search Values on the Array
+==========================
+
+Linear Search - O(n)
+********************
+
+Algorithm: org.apache.arrow.algorithm.search.VectorSearcher#linearSearch - O(n)
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.search.VectorSearcher;
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector linearSearchVector = new IntVector("linearSearchVector", rootAllocator);
+    linearSearchVector.allocateNew(10);
+    linearSearchVector.setValueCount(10);
+    for (int i = 0; i < 10; i++) {
+        linearSearchVector.set(i, i);
+    }
+    VectorValueComparator<IntVector> comparatorInt = DefaultVectorComparators.createDefaultComparator(linearSearchVector);
+    List<Integer> listResultLinearSearch = new ArrayList<Integer>();
+    for (int i = 0; i < 10; i++) {
+       int result = VectorSearcher.linearSearch(linearSearchVector, comparatorInt, linearSearchVector, i);
+       listResultLinearSearch.add(result);
+    }

Review comment:
       Changed

##########
File path: java/source/data.rst
##########
@@ -0,0 +1,360 @@
+=================
+Data manipulation
+=================
+
+Recipes related to comparing, filtering or transforming data.
+
+.. contents::
+
+Compare Vectors for Field Equality
+==================================
+
+.. testcode::
+
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.vector.compare.TypeEqualsVisitor;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector right = new IntVector("int", rootAllocator);
+    right.allocateNew(3);
+    right.set(0, 10);
+    right.set(1, 20);
+    right.set(2, 30);
+    right.setValueCount(3);
+    IntVector left1 = new IntVector("int", rootAllocator);
+    IntVector left2 = new IntVector("int2", rootAllocator);
+    TypeEqualsVisitor visitor = new TypeEqualsVisitor(right);
+
+    System.out.println(visitor.equals(left1));
+    System.out.println(visitor.equals(left2));
+
+.. testoutput::
+
+    true
+    false
+
+Compare Values on the Array
+===========================
+
+Comparing two values at the given indices in the vectors:
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.sort.StableVectorComparator;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.VarCharVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    class TestVectorValueComparator extends VectorValueComparator<VarCharVector> {
+        @Override
+        public int compareNotNull(int index1, int index2) {
+            byte b1 = vector1.get(index1)[0];
+            byte b2 = vector2.get(index2)[0];
+            return b1 - b2;
+        }
+
+        @Override
+        public VectorValueComparator<VarCharVector> createNew() {
+            return new TestVectorValueComparator();
+        }
+    }
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+
+    VarCharVector vec = new VarCharVector("valueindexcomparator", rootAllocator);
+    vec.allocateNew(3);
+    vec.setValueCount(3);
+    vec.set(0, "ba".getBytes());
+    vec.set(1, "abc".getBytes());
+    vec.set(2, "aa".getBytes());
+
+    VectorValueComparator<VarCharVector> comparatorValues = new TestVectorValueComparator();
+    VectorValueComparator<VarCharVector> stableComparator = new StableVectorComparator<>(comparatorValues);
+    stableComparator.attachVector(vec);
+
+    System.out.println(stableComparator.compare(0, 1) > 0);
+    System.out.println(stableComparator.compare(1, 2) < 0);
+
+.. testoutput::
+
+    true
+    true
+
+Search Values on the Array
+==========================
+
+Linear Search - O(n)
+********************
+
+Algorithm: org.apache.arrow.algorithm.search.VectorSearcher#linearSearch - O(n)
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.search.VectorSearcher;
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector linearSearchVector = new IntVector("linearSearchVector", rootAllocator);
+    linearSearchVector.allocateNew(10);
+    linearSearchVector.setValueCount(10);
+    for (int i = 0; i < 10; i++) {
+        linearSearchVector.set(i, i);
+    }
+    VectorValueComparator<IntVector> comparatorInt = DefaultVectorComparators.createDefaultComparator(linearSearchVector);
+    List<Integer> listResultLinearSearch = new ArrayList<Integer>();
+    for (int i = 0; i < 10; i++) {
+       int result = VectorSearcher.linearSearch(linearSearchVector, comparatorInt, linearSearchVector, i);
+       listResultLinearSearch.add(result);
+    }
+
+    System.out.println(listResultLinearSearch);
+
+.. testoutput::
+
+    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
+
+Binary Search - O(log(n))
+*************************
+
+Algorithm: org.apache.arrow.algorithm.search.VectorSearcher#binarySearch - O(log(n))
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.search.VectorSearcher;
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector binarySearchVector = new IntVector("", rootAllocator);
+    binarySearchVector.allocateNew(10);
+    binarySearchVector.setValueCount(10);
+    for (int i = 0; i < 10; i++) {
+        binarySearchVector.set(i, i);
+    }
+    VectorValueComparator<IntVector> comparatorInt = DefaultVectorComparators.createDefaultComparator(binarySearchVector);
+    List<Integer> listResultBinarySearch = new ArrayList<Integer>();
+    for (int i = 0; i < 10; i++) {
+       int result = VectorSearcher.binarySearch(binarySearchVector, comparatorInt, binarySearchVector, i);
+       listResultBinarySearch.add(result);
+    }

Review comment:
       Changed

##########
File path: java/source/data.rst
##########
@@ -0,0 +1,360 @@
+=================
+Data manipulation
+=================
+
+Recipes related to comparing, filtering or transforming data.
+
+.. contents::
+
+Compare Vectors for Field Equality
+==================================
+
+.. testcode::
+
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.vector.compare.TypeEqualsVisitor;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector right = new IntVector("int", rootAllocator);
+    right.allocateNew(3);
+    right.set(0, 10);
+    right.set(1, 20);
+    right.set(2, 30);
+    right.setValueCount(3);
+    IntVector left1 = new IntVector("int", rootAllocator);
+    IntVector left2 = new IntVector("int2", rootAllocator);
+    TypeEqualsVisitor visitor = new TypeEqualsVisitor(right);
+
+    System.out.println(visitor.equals(left1));
+    System.out.println(visitor.equals(left2));
+
+.. testoutput::
+
+    true
+    false
+
+Compare Values on the Array
+===========================
+
+Comparing two values at the given indices in the vectors:
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.sort.StableVectorComparator;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.VarCharVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    class TestVectorValueComparator extends VectorValueComparator<VarCharVector> {
+        @Override
+        public int compareNotNull(int index1, int index2) {
+            byte b1 = vector1.get(index1)[0];
+            byte b2 = vector2.get(index2)[0];
+            return b1 - b2;
+        }
+
+        @Override
+        public VectorValueComparator<VarCharVector> createNew() {
+            return new TestVectorValueComparator();
+        }
+    }
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+
+    VarCharVector vec = new VarCharVector("valueindexcomparator", rootAllocator);
+    vec.allocateNew(3);
+    vec.setValueCount(3);
+    vec.set(0, "ba".getBytes());
+    vec.set(1, "abc".getBytes());
+    vec.set(2, "aa".getBytes());
+
+    VectorValueComparator<VarCharVector> comparatorValues = new TestVectorValueComparator();
+    VectorValueComparator<VarCharVector> stableComparator = new StableVectorComparator<>(comparatorValues);
+    stableComparator.attachVector(vec);
+
+    System.out.println(stableComparator.compare(0, 1) > 0);
+    System.out.println(stableComparator.compare(1, 2) < 0);
+
+.. testoutput::
+
+    true
+    true
+
+Search Values on the Array
+==========================
+
+Linear Search - O(n)
+********************
+
+Algorithm: org.apache.arrow.algorithm.search.VectorSearcher#linearSearch - O(n)
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.search.VectorSearcher;
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector linearSearchVector = new IntVector("linearSearchVector", rootAllocator);
+    linearSearchVector.allocateNew(10);
+    linearSearchVector.setValueCount(10);
+    for (int i = 0; i < 10; i++) {
+        linearSearchVector.set(i, i);
+    }
+    VectorValueComparator<IntVector> comparatorInt = DefaultVectorComparators.createDefaultComparator(linearSearchVector);
+    List<Integer> listResultLinearSearch = new ArrayList<Integer>();
+    for (int i = 0; i < 10; i++) {
+       int result = VectorSearcher.linearSearch(linearSearchVector, comparatorInt, linearSearchVector, i);
+       listResultLinearSearch.add(result);
+    }
+
+    System.out.println(listResultLinearSearch);
+
+.. testoutput::
+
+    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
+
+Binary Search - O(log(n))
+*************************
+
+Algorithm: org.apache.arrow.algorithm.search.VectorSearcher#binarySearch - O(log(n))
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.search.VectorSearcher;
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector binarySearchVector = new IntVector("", rootAllocator);
+    binarySearchVector.allocateNew(10);
+    binarySearchVector.setValueCount(10);
+    for (int i = 0; i < 10; i++) {
+        binarySearchVector.set(i, i);
+    }
+    VectorValueComparator<IntVector> comparatorInt = DefaultVectorComparators.createDefaultComparator(binarySearchVector);
+    List<Integer> listResultBinarySearch = new ArrayList<Integer>();
+    for (int i = 0; i < 10; i++) {
+       int result = VectorSearcher.binarySearch(binarySearchVector, comparatorInt, binarySearchVector, i);
+       listResultBinarySearch.add(result);
+    }
+
+    System.out.println(listResultBinarySearch);
+
+.. testoutput::
+
+    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
+
+Sort Values on the Array
+========================
+
+In-place Sorter - O(nlog(n))
+****************************
+
+Sorting by manipulating the original vector.
+Algorithm: org.apache.arrow.algorithm.sort.FixedWidthInPlaceVectorSorter - O(nlog(n))
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.FixedWidthInPlaceVectorSorter;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector intVectorNotSorted = new IntVector("intvectornotsorted", rootAllocator);
+    intVectorNotSorted.allocateNew(3);
+    intVectorNotSorted.setValueCount(3);
+    intVectorNotSorted.set(0, 10);
+    intVectorNotSorted.set(1, 8);
+    intVectorNotSorted.setNull(2);
+    FixedWidthInPlaceVectorSorter<IntVector> sorter = new FixedWidthInPlaceVectorSorter<IntVector>();
+    VectorValueComparator<IntVector> comparator = DefaultVectorComparators.createDefaultComparator(intVectorNotSorted);
+    sorter.sortInPlace(intVectorNotSorted, comparator);
+
+    System.out.println(intVectorNotSorted);
+
+.. testoutput::
+
+    [null, 8, 10]
+
+Out-place Sorter - O(nlog(n))
+*****************************
+
+Sorting by copying vector elements to a new vector in sorted order - O(nlog(n))
+Algorithm: org.apache.arrow.algorithm.sort.FixedWidthInPlaceVectorSorter.
+FixedWidthOutOfPlaceVectorSorter & VariableWidthOutOfPlaceVectorSor
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.FixedWidthOutOfPlaceVectorSorter;
+    import org.apache.arrow.algorithm.sort.OutOfPlaceVectorSorter;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector intVectorNotSorted = new IntVector("intvectornotsorted", rootAllocator);
+    intVectorNotSorted.allocateNew(3);
+    intVectorNotSorted.setValueCount(3);
+    intVectorNotSorted.set(0, 10);
+    intVectorNotSorted.set(1, 8);
+    intVectorNotSorted.setNull(2);
+    OutOfPlaceVectorSorter<IntVector> sorterOutOfPlaceSorter = new FixedWidthOutOfPlaceVectorSorter<>();
+    VectorValueComparator<IntVector> comparatorOutOfPlaceSorter = DefaultVectorComparators.createDefaultComparator(intVectorNotSorted);
+    IntVector intVectorSorted = (IntVector) intVectorNotSorted.getField().getFieldType().createNewSingleVector("new-out-of-place-sorter", rootAllocator, null);
+    intVectorSorted.allocateNew(intVectorNotSorted.getValueCount());
+    intVectorSorted.setValueCount(intVectorNotSorted.getValueCount());
+    sorterOutOfPlaceSorter.sortOutOfPlace(intVectorNotSorted, intVectorSorted, comparatorOutOfPlaceSorter);
+
+    System.out.println(intVectorSorted);
+
+.. testoutput::
+
+    [null, 8, 10]
+
+Data Filter & Aggregation

Review comment:
       Deleted







[GitHub] [arrow-cookbook] amol- commented on a change in pull request #135: [Java]: Java cookbook for create arrow data manipulation

Posted by GitBox <gi...@apache.org>.
amol- commented on a change in pull request #135:
URL: https://github.com/apache/arrow-cookbook/pull/135#discussion_r797535468



##########
File path: java/source/data.rst
##########
@@ -0,0 +1,360 @@
+=================
+Data manipulation
+=================
+
+Recipes related to comparing, filtering or transforming data.
+
+.. contents::
+
+Compare Vectors for Field Equality
+==================================
+
+.. testcode::
+
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.vector.compare.TypeEqualsVisitor;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector right = new IntVector("int", rootAllocator);
+    right.allocateNew(3);
+    right.set(0, 10);
+    right.set(1, 20);
+    right.set(2, 30);
+    right.setValueCount(3);
+    IntVector left1 = new IntVector("int", rootAllocator);
+    IntVector left2 = new IntVector("int2", rootAllocator);
+    TypeEqualsVisitor visitor = new TypeEqualsVisitor(right);
+
+    System.out.println(visitor.equals(left1));
+    System.out.println(visitor.equals(left2));
+
+.. testoutput::
+
+    true
+    false
+
+Compare Values on the Array
+===========================
+
+Comparing two values at the given indices in the vectors:
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.sort.StableVectorComparator;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.VarCharVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    class TestVectorValueComparator extends VectorValueComparator<VarCharVector> {
+        @Override
+        public int compareNotNull(int index1, int index2) {
+            byte b1 = vector1.get(index1)[0];
+            byte b2 = vector2.get(index2)[0];
+            return b1 - b2;
+        }
+
+        @Override
+        public VectorValueComparator<VarCharVector> createNew() {
+            return new TestVectorValueComparator();
+        }
+    }
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+
+    VarCharVector vec = new VarCharVector("valueindexcomparator", rootAllocator);
+    vec.allocateNew(3);
+    vec.setValueCount(3);
+    vec.set(0, "ba".getBytes());
+    vec.set(1, "abc".getBytes());
+    vec.set(2, "aa".getBytes());
+
+    VectorValueComparator<VarCharVector> comparatorValues = new TestVectorValueComparator();
+    VectorValueComparator<VarCharVector> stableComparator = new StableVectorComparator<>(comparatorValues);
+    stableComparator.attachVector(vec);
+
+    System.out.println(stableComparator.compare(0, 1) > 0);
+    System.out.println(stableComparator.compare(1, 2) < 0);
+
+.. testoutput::
+
+    true
+    true
+
+Search Values on the Array
+==========================
+
+Linear Search - O(n)
+********************
+
+Algorithm: org.apache.arrow.algorithm.search.VectorSearcher#linearSearch - O(n)
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.search.VectorSearcher;
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector linearSearchVector = new IntVector("linearSearchVector", rootAllocator);
+    linearSearchVector.allocateNew(10);
+    linearSearchVector.setValueCount(10);
+    for (int i = 0; i < 10; i++) {
+        linearSearchVector.set(i, i);
+    }
+    VectorValueComparator<IntVector> comparatorInt = DefaultVectorComparators.createDefaultComparator(linearSearchVector);
+    List<Integer> listResultLinearSearch = new ArrayList<Integer>();
+    for (int i = 0; i < 10; i++) {
+       int result = VectorSearcher.linearSearch(linearSearchVector, comparatorInt, linearSearchVector, i);
+       listResultLinearSearch.add(result);
+    }
+
+    System.out.println(listResultLinearSearch);
+
+.. testoutput::
+
+    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
+
+Binary Search - O(log(n))
+*************************
+
+Algorithm: org.apache.arrow.algorithm.search.VectorSearcher#binarySearch - O(log(n))
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.search.VectorSearcher;
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector binarySearchVector = new IntVector("", rootAllocator);
+    binarySearchVector.allocateNew(10);
+    binarySearchVector.setValueCount(10);
+    for (int i = 0; i < 10; i++) {
+        binarySearchVector.set(i, i);
+    }
+    VectorValueComparator<IntVector> comparatorInt = DefaultVectorComparators.createDefaultComparator(binarySearchVector);
+    List<Integer> listResultBinarySearch = new ArrayList<Integer>();
+    for (int i = 0; i < 10; i++) {
+       int result = VectorSearcher.binarySearch(binarySearchVector, comparatorInt, binarySearchVector, i);
+       listResultBinarySearch.add(result);
+    }
+
+    System.out.println(listResultBinarySearch);
+
+.. testoutput::
+
+    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
+
+Sort Values on the Array
+========================
+
+In-place Sorter - O(nlog(n))
+****************************
+
+Sorting by manipulating the original vector.
+Algorithm: org.apache.arrow.algorithm.sort.FixedWidthInPlaceVectorSorter - O(nlog(n))
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.FixedWidthInPlaceVectorSorter;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector intVectorNotSorted = new IntVector("intvectornotsorted", rootAllocator);
+    intVectorNotSorted.allocateNew(3);
+    intVectorNotSorted.setValueCount(3);
+    intVectorNotSorted.set(0, 10);
+    intVectorNotSorted.set(1, 8);
+    intVectorNotSorted.setNull(2);
+    FixedWidthInPlaceVectorSorter<IntVector> sorter = new FixedWidthInPlaceVectorSorter<IntVector>();
+    VectorValueComparator<IntVector> comparator = DefaultVectorComparators.createDefaultComparator(intVectorNotSorted);
+    sorter.sortInPlace(intVectorNotSorted, comparator);
+
+    System.out.println(intVectorNotSorted);
+
+.. testoutput::
+
+    [null, 8, 10]
+
+Out-place Sorter - O(nlog(n))
+*****************************
+
+Sorting by copying vector elements to a new vector in sorted order - O(nlog(n))
+Algorithm: org.apache.arrow.algorithm.sort.FixedWidthInPlaceVectorSorter.
+FixedWidthOutOfPlaceVectorSorter & VariableWidthOutOfPlaceVectorSor
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
+    import org.apache.arrow.algorithm.sort.FixedWidthOutOfPlaceVectorSorter;
+    import org.apache.arrow.algorithm.sort.OutOfPlaceVectorSorter;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector intVectorNotSorted = new IntVector("intvectornotsorted", rootAllocator);
+    intVectorNotSorted.allocateNew(3);
+    intVectorNotSorted.setValueCount(3);
+    intVectorNotSorted.set(0, 10);
+    intVectorNotSorted.set(1, 8);
+    intVectorNotSorted.setNull(2);
+    OutOfPlaceVectorSorter<IntVector> sorterOutOfPlaceSorter = new FixedWidthOutOfPlaceVectorSorter<>();
+    VectorValueComparator<IntVector> comparatorOutOfPlaceSorter = DefaultVectorComparators.createDefaultComparator(intVectorNotSorted);
+    IntVector intVectorSorted = (IntVector) intVectorNotSorted.getField().getFieldType().createNewSingleVector("new-out-of-place-sorter", rootAllocator, null);
+    intVectorSorted.allocateNew(intVectorNotSorted.getValueCount());
+    intVectorSorted.setValueCount(intVectorNotSorted.getValueCount());
+    sorterOutOfPlaceSorter.sortOutOfPlace(intVectorNotSorted, intVectorSorted, comparatorOutOfPlaceSorter);
+
+    System.out.println(intVectorSorted);
+
+.. testoutput::
+
+    [null, 8, 10]
+
+Data Filter & Aggregation

Review comment:
       I question the purpose of this recipe, as it doesn't show any Arrow capability: it ends up implementing custom filtering and aggregation. We should probably postpone this recipe until Arrow Java gains support for pieces of the compute engine, such as filtering.







[GitHub] [arrow-cookbook] amol- commented on a change in pull request #135: [Java]: Java cookbook for create arrow data manipulation

Posted by GitBox <gi...@apache.org>.
amol- commented on a change in pull request #135:
URL: https://github.com/apache/arrow-cookbook/pull/135#discussion_r797521070



##########
File path: java/source/data.rst
##########
@@ -0,0 +1,360 @@
+=================
+Data manipulation
+=================
+
+Recipes related to comparing, filtering or transforming data.
+
+.. contents::
+
+Compare Vectors for Field Equality
+==================================
+
+.. testcode::
+
+    import org.apache.arrow.vector.IntVector;
+    import org.apache.arrow.vector.compare.TypeEqualsVisitor;
+    import org.apache.arrow.memory.RootAllocator;
+
+    RootAllocator rootAllocator = new RootAllocator(Long.MAX_VALUE);
+    IntVector right = new IntVector("int", rootAllocator);
+    right.allocateNew(3);
+    right.set(0, 10);
+    right.set(1, 20);
+    right.set(2, 30);
+    right.setValueCount(3);
+    IntVector left1 = new IntVector("int", rootAllocator);
+    IntVector left2 = new IntVector("int2", rootAllocator);
+    TypeEqualsVisitor visitor = new TypeEqualsVisitor(right);
+
+    System.out.println(visitor.equals(left1));
+    System.out.println(visitor.equals(left2));
+
+.. testoutput::
+
+    true
+    false
+
+Compare Values on the Array
+===========================
+
+Comparing two values at the given indices in the vectors:
+
+.. testcode::
+
+    import org.apache.arrow.algorithm.sort.StableVectorComparator;
+    import org.apache.arrow.algorithm.sort.VectorValueComparator;
+    import org.apache.arrow.vector.VarCharVector;
+    import org.apache.arrow.memory.RootAllocator;
+
+    class TestVectorValueComparator extends VectorValueComparator<VarCharVector> {
+        @Override
+        public int compareNotNull(int index1, int index2) {
+            byte b1 = vector1.get(index1)[0];
+            byte b2 = vector2.get(index2)[0];
+            return b1 - b2;
+        }
+
+        @Override
+        public VectorValueComparator<VarCharVector> createNew() {
+            return new TestVectorValueComparator();
+        }
+    }

Review comment:
       I'm not sure why we are making our own implementation of `VectorValueComparator`.
   A user who comes here looking for how to compare the values in two arrays might be confused by this and think that a custom subclass is required, instead of just using one of the provided subclasses. It's important that recipes don't include any more than the minimum code required to answer the question.
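
   As a rough illustration of the simpler shape, here is a minimal sketch that relies on `DefaultVectorComparators.createDefaultComparator` (the same factory the search and sort recipes in this file already use) instead of a hand-written subclass. The class name `DefaultComparatorSketch` and the helper `compare` are hypothetical names used only for this example, and it assumes the `arrow-vector`, `arrow-algorithm` and an Arrow memory implementation are on the classpath. For a `VarCharVector` the default comparator is expected to compare the stored bytes, so the ordering may differ from the first-byte-only comparison in the recipe above.

```java
import java.nio.charset.StandardCharsets;

import org.apache.arrow.algorithm.sort.DefaultVectorComparators;
import org.apache.arrow.algorithm.sort.VectorValueComparator;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.VarCharVector;

public class DefaultComparatorSketch {

    // Hypothetical helper: loads the strings into a VarCharVector and compares
    // the values at positions i and j with the comparator Arrow provides by default.
    static int compare(String[] values, int i, int j) {
        // Resources are closed in reverse order, so the vector is released
        // before the allocator checks for leaks.
        try (BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE);
             VarCharVector vec = new VarCharVector("values", allocator)) {
            vec.allocateNew(values.length);
            for (int k = 0; k < values.length; k++) {
                vec.setSafe(k, values[k].getBytes(StandardCharsets.UTF_8));
            }
            vec.setValueCount(values.length);

            // No custom subclass: ask Arrow for the default comparator of this vector type.
            VectorValueComparator<VarCharVector> comparator =
                DefaultVectorComparators.createDefaultComparator(vec);
            comparator.attachVector(vec);
            return comparator.compare(i, j);
        }
    }

    public static void main(String[] args) {
        String[] values = {"ba", "abc", "aa"};
        System.out.println(compare(values, 0, 1) > 0);
        System.out.println(compare(values, 1, 2) > 0);
    }
}
```

   Something along these lines keeps the recipe focused on the comparison itself; a custom `VectorValueComparator` could then move to a separate recipe if it is needed at all.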



