You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/04/24 22:00:16 UTC

[GitHub] [arrow] domoritz opened a new pull request #10151: ARROW-12528: [JS] Support typed arrays in Vector.from and Table.new

domoritz opened a new pull request #10151:
URL: https://github.com/apache/arrow/pull/10151


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on a change in pull request #10151: ARROW-12528: [JS] Support typed arrays in Vector.from and Table.new

Posted by GitBox <gi...@apache.org>.
domoritz commented on a change in pull request #10151:
URL: https://github.com/apache/arrow/pull/10151#discussion_r619719176



##########
File path: js/src/util/args.ts
##########
@@ -21,11 +21,26 @@ import { Column } from '../column';
 import { Vector } from '../vector';
 import { DataType } from '../type';
 import { Chunked } from '../vector/chunked';
+import { BigIntArray, TypedArray } from '../interfaces';
+import { FloatVector, IntVector } from '../vector/index';
 
 type RecordBatchCtor = typeof import('../recordbatch').RecordBatch;
 
 const isArray = Array.isArray;
 
+/** @ignore */
+export function isTypedArray(arr: any): arr is TypedArray | BigIntArray {
+    return ArrayBuffer.isView(arr) && 'BYTES_PER_ELEMENT' in arr;

Review comment:
       I think this should be enough. If not, we can use something like https://underscorejs.org/docs/modules/isTypedArray.html. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on a change in pull request #10151: ARROW-12528: [JS] Support typed arrays in Vector.from and Table.new

Posted by GitBox <gi...@apache.org>.
domoritz commented on a change in pull request #10151:
URL: https://github.com/apache/arrow/pull/10151#discussion_r619719176



##########
File path: js/src/util/args.ts
##########
@@ -21,11 +21,26 @@ import { Column } from '../column';
 import { Vector } from '../vector';
 import { DataType } from '../type';
 import { Chunked } from '../vector/chunked';
+import { BigIntArray, TypedArray } from '../interfaces';
+import { FloatVector, IntVector } from '../vector/index';
 
 type RecordBatchCtor = typeof import('../recordbatch').RecordBatch;
 
 const isArray = Array.isArray;
 
+/** @ignore */
+export function isTypedArray(arr: any): arr is TypedArray | BigIntArray {
+    return ArrayBuffer.isView(arr) && 'BYTES_PER_ELEMENT' in arr;

Review comment:
       I think this should be enough. Otherwise, we can use something like https://underscorejs.org/docs/modules/isTypedArray.html. 

##########
File path: js/src/util/args.ts
##########
@@ -21,11 +21,26 @@ import { Column } from '../column';
 import { Vector } from '../vector';
 import { DataType } from '../type';
 import { Chunked } from '../vector/chunked';
+import { BigIntArray, TypedArray } from '../interfaces';
+import { FloatVector, IntVector } from '../vector/index';
 
 type RecordBatchCtor = typeof import('../recordbatch').RecordBatch;
 
 const isArray = Array.isArray;
 
+/** @ignore */
+export function isTypedArray(arr: any): arr is TypedArray | BigIntArray {
+    return ArrayBuffer.isView(arr) && 'BYTES_PER_ELEMENT' in arr;

Review comment:
       I think this should be enough. If not, we can use something like https://underscorejs.org/docs/modules/isTypedArray.html. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] kou closed pull request #10151: ARROW-12528: [JS] Support typed arrays in Table.new

Posted by GitBox <gi...@apache.org>.
kou closed pull request #10151:
URL: https://github.com/apache/arrow/pull/10151


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on a change in pull request #10151: ARROW-12528: [JS] Support typed arrays in Table.new

Posted by GitBox <gi...@apache.org>.
domoritz commented on a change in pull request #10151:
URL: https://github.com/apache/arrow/pull/10151#discussion_r620743520



##########
File path: js/src/util/args.ts
##########
@@ -21,11 +21,26 @@ import { Column } from '../column';
 import { Vector } from '../vector';
 import { DataType } from '../type';
 import { Chunked } from '../vector/chunked';
+import { BigIntArray, TypedArray } from '../interfaces';
+import { FloatVector, IntVector } from '../vector/index';
 
 type RecordBatchCtor = typeof import('../recordbatch').RecordBatch;
 
 const isArray = Array.isArray;
 
+/** @ignore */
+export function isTypedArray(arr: any): arr is TypedArray | BigIntArray {
+    return ArrayBuffer.isView(arr) && 'BYTES_PER_ELEMENT' in arr;
+}
+
+/** @ignore */
+function vectorFromTypedArray(array: TypedArray): Vector {
+    if (array instanceof Float32Array || array instanceof Float64Array) {
+        return FloatVector.from(array);
+    }
+    return IntVector.from(array);

Review comment:
       @trxcllnt I took your feedback but this feels overly verbose and also now I expose a previously internal function. 
   
   <img width="855" alt="Screen Shot 2021-04-26 at 17 33 23" src="https://user-images.githubusercontent.com/589034/116167669-86746b00-a6b5-11eb-812f-103f0daaf67f.png">
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10151: ARROW-12528: [JS] Support typed arrays in Vector.from and Table.new

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10151:
URL: https://github.com/apache/arrow/pull/10151#issuecomment-826158934


   https://issues.apache.org/jira/browse/ARROW-12528


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on a change in pull request #10151: ARROW-12528: [JS] Support typed arrays in Table.new

Posted by GitBox <gi...@apache.org>.
domoritz commented on a change in pull request #10151:
URL: https://github.com/apache/arrow/pull/10151#discussion_r620530082



##########
File path: js/src/util/args.ts
##########
@@ -21,11 +21,26 @@ import { Column } from '../column';
 import { Vector } from '../vector';
 import { DataType } from '../type';
 import { Chunked } from '../vector/chunked';
+import { BigIntArray, TypedArray } from '../interfaces';
+import { FloatVector, IntVector } from '../vector/index';
 
 type RecordBatchCtor = typeof import('../recordbatch').RecordBatch;
 
 const isArray = Array.isArray;
 
+/** @ignore */
+export function isTypedArray(arr: any): arr is TypedArray | BigIntArray {
+    return ArrayBuffer.isView(arr) && 'BYTES_PER_ELEMENT' in arr;
+}
+
+/** @ignore */
+function vectorFromTypedArray(array: TypedArray): Vector {
+    if (array instanceof Float32Array || array instanceof Float64Array) {
+        return FloatVector.from(array);
+    }
+    return IntVector.from(array);

Review comment:
       Use Vector.new with Data.new here. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on a change in pull request #10151: ARROW-12528: [JS] Support typed arrays in Vector.from and Table.new

Posted by GitBox <gi...@apache.org>.
domoritz commented on a change in pull request #10151:
URL: https://github.com/apache/arrow/pull/10151#discussion_r619719176



##########
File path: js/src/util/args.ts
##########
@@ -21,11 +21,26 @@ import { Column } from '../column';
 import { Vector } from '../vector';
 import { DataType } from '../type';
 import { Chunked } from '../vector/chunked';
+import { BigIntArray, TypedArray } from '../interfaces';
+import { FloatVector, IntVector } from '../vector/index';
 
 type RecordBatchCtor = typeof import('../recordbatch').RecordBatch;
 
 const isArray = Array.isArray;
 
+/** @ignore */
+export function isTypedArray(arr: any): arr is TypedArray | BigIntArray {
+    return ArrayBuffer.isView(arr) && 'BYTES_PER_ELEMENT' in arr;

Review comment:
       I think this should be enough. Otherwise, we can use something like https://underscorejs.org/docs/modules/isTypedArray.html. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10151: ARROW-12528: [JS] Support typed arrays in Vector.from and Table.new

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10151:
URL: https://github.com/apache/arrow/pull/10151#issuecomment-826158934


   https://issues.apache.org/jira/browse/ARROW-12528


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on a change in pull request #10151: ARROW-12528: [JS] Support typed arrays in Table.new

Posted by GitBox <gi...@apache.org>.
domoritz commented on a change in pull request #10151:
URL: https://github.com/apache/arrow/pull/10151#discussion_r621665892



##########
File path: js/src/util/args.ts
##########
@@ -21,11 +21,26 @@ import { Column } from '../column';
 import { Vector } from '../vector';
 import { DataType } from '../type';
 import { Chunked } from '../vector/chunked';
+import { BigIntArray, TypedArray } from '../interfaces';
+import { FloatVector, IntVector } from '../vector/index';
 
 type RecordBatchCtor = typeof import('../recordbatch').RecordBatch;
 
 const isArray = Array.isArray;
 
+/** @ignore */
+export function isTypedArray(arr: any): arr is TypedArray | BigIntArray {
+    return ArrayBuffer.isView(arr) && 'BYTES_PER_ELEMENT' in arr;
+}
+
+/** @ignore */
+function vectorFromTypedArray(array: TypedArray): Vector {
+    if (array instanceof Float32Array || array instanceof Float64Array) {
+        return FloatVector.from(array);
+    }
+    return IntVector.from(array);

Review comment:
       Thanks for the help on Slack. I updated the pull request. 

##########
File path: js/src/util/args.ts
##########
@@ -19,13 +19,55 @@ import { Data } from '../data';
 import { Field } from '../schema';
 import { Column } from '../column';
 import { Vector } from '../vector';
-import { DataType } from '../type';
+import { DataType, Float32, Float64, FloatArray, IntArray, Int16, Int32, Int64, Int8, Uint16, Uint32, Uint64, Uint8 } from '../type';
 import { Chunked } from '../vector/chunked';
+import { BigIntArray, TypedArray as TypedArray_ } from '../interfaces';
+import { FloatArrayCtor } from '../vector/float';
+import { IntArrayCtor } from '../vector/int';
 
 type RecordBatchCtor = typeof import('../recordbatch').RecordBatch;
 
 const isArray = Array.isArray;
 
+type TypedArray = Exclude<TypedArray_ | BigIntArray, Uint8ClampedArray>;
+
+/** @ignore */
+export function isTypedArray(arr: any): arr is TypedArray {
+    return ArrayBuffer.isView(arr) && 'BYTES_PER_ELEMENT' in arr;
+}
+
+
+/** @ignore */
+type ArrayCtor = FloatArrayCtor | IntArrayCtor;
+
+/** @ignore */
+export function arrayTypeToDataType(ctor: ArrayCtor) {
+    switch (ctor) {
+        case Int8Array:         return Int8;
+        case Int16Array:        return Int16;
+        case Int32Array:        return Int32;
+        case BigInt64Array:     return Int64;
+        case Uint8Array:        return Uint8;
+        case Uint16Array:       return Uint16;
+        case Uint32Array:       return Uint32;
+        case BigUint64Array:    return Uint64;
+        case Float32Array:      return Float32;
+        case Float64Array:      return Float64;
+        default: return null;
+    }
+}
+
+/** @ignore */
+function vectorFromTypedArray(array: TypedArray): Vector {

Review comment:
       We could make this function have the right types but since it's internal, it doesn't matter as much. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on a change in pull request #10151: ARROW-12528: [JS] Support typed arrays in Table.new

Posted by GitBox <gi...@apache.org>.
domoritz commented on a change in pull request #10151:
URL: https://github.com/apache/arrow/pull/10151#discussion_r621797335



##########
File path: js/src/util/args.ts
##########
@@ -19,13 +19,55 @@ import { Data } from '../data';
 import { Field } from '../schema';
 import { Column } from '../column';
 import { Vector } from '../vector';
-import { DataType } from '../type';
+import { DataType, Float32, Float64, FloatArray, IntArray, Int16, Int32, Int64, Int8, Uint16, Uint32, Uint64, Uint8 } from '../type';
 import { Chunked } from '../vector/chunked';
+import { BigIntArray, TypedArray as TypedArray_ } from '../interfaces';
+import { FloatArrayCtor } from '../vector/float';
+import { IntArrayCtor } from '../vector/int';
 
 type RecordBatchCtor = typeof import('../recordbatch').RecordBatch;
 
 const isArray = Array.isArray;
 
+type TypedArray = Exclude<TypedArray_ | BigIntArray, Uint8ClampedArray>;
+
+/** @ignore */
+export function isTypedArray(arr: any): arr is TypedArray {
+    return ArrayBuffer.isView(arr) && 'BYTES_PER_ELEMENT' in arr;
+}
+
+
+/** @ignore */
+type ArrayCtor = FloatArrayCtor | IntArrayCtor;
+
+/** @ignore */
+export function arrayTypeToDataType(ctor: ArrayCtor) {
+    switch (ctor) {

Review comment:
       Arquero uses an object lookup for this. Perf wise it doesn't seem to matter too much: https://jsben.ch/JYZLQ. 
   
   https://github.com/uwdata/arquero/blob/af843a4377254bab970d40559879f9191fe81079/src/arrow/encode/data-from-table.js#L50




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on a change in pull request #10151: ARROW-12528: [JS] Support typed arrays in Table.new

Posted by GitBox <gi...@apache.org>.
domoritz commented on a change in pull request #10151:
URL: https://github.com/apache/arrow/pull/10151#discussion_r621796532



##########
File path: js/src/util/args.ts
##########
@@ -21,11 +21,26 @@ import { Column } from '../column';
 import { Vector } from '../vector';
 import { DataType } from '../type';
 import { Chunked } from '../vector/chunked';
+import { BigIntArray, TypedArray } from '../interfaces';
+import { FloatVector, IntVector } from '../vector/index';
 
 type RecordBatchCtor = typeof import('../recordbatch').RecordBatch;
 
 const isArray = Array.isArray;
 
+/** @ignore */
+export function isTypedArray(arr: any): arr is TypedArray | BigIntArray {
+    return ArrayBuffer.isView(arr) && 'BYTES_PER_ELEMENT' in arr;

Review comment:
       Arquero just uses `instanceof TypedArray`. 
   
   https://github.com/uwdata/arquero/blob/af843a4377254bab970d40559879f9191fe81079/src/util/is-typed-array.js#L4




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org