You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "jeremyosterhoudt (via GitHub)" <gi...@apache.org> on 2023/06/16 21:57:53 UTC

[GitHub] [arrow] jeremyosterhoudt opened a new pull request, #36134: GH-36133: [C#] Support Set on the `BinaryArray` Builder

jeremyosterhoudt opened a new pull request, #36134:
URL: https://github.com/apache/arrow/pull/36134

   ### What changes are included in this PR?
   
   Implemented the Set method on both the `BinaryArray` and `StringArray` builders to match functionality of other Builders (`PrimitiveArray.Builder` for example).
   
   Closes #36133


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-36133: [C#] Support Set on the `BinaryArray` Builder [arrow]

Posted by "CurtHagenlocher (via GitHub)" <gi...@apache.org>.
CurtHagenlocher commented on code in PR #36134:
URL: https://github.com/apache/arrow/pull/36134#discussion_r1396675290


##########
csharp/src/Apache.Arrow/Arrays/BinaryArray.cs:
##########
@@ -258,8 +259,77 @@ public TBuilder Swap(int i, int j)
 
             public TBuilder Set(int index, byte value)
             {
-                // TODO: Implement
+                ValueBuffer.Span[index] = value;
+                return Instance;
+            }
+
+            public TBuilder SetNull(int offset)

Review Comment:
   So this is nulling out the original bytes of the value but not freeing the space it uses?



##########
csharp/src/Apache.Arrow/Arrays/BinaryArray.cs:
##########
@@ -258,8 +259,77 @@ public TBuilder Swap(int i, int j)
 
             public TBuilder Set(int index, byte value)
             {
-                // TODO: Implement
+                ValueBuffer.Span[index] = value;
+                return Instance;
+            }
+
+            public TBuilder SetNull(int offset)
+            {
+                int index = ValueOffsets.Span[offset];
+                int length = GetOffsetValueLength(offset);
+
+                for (int i = 0; i < length; i++)
+                {
+                    ValueBuffer.Span[index + i] = 0;
+                }
+
+                ValidityBuffer.Set(offset, false);
+
+                return Instance;
+            }
+
+            public TBuilder Set(int offset, ReadOnlySpan<byte> values)
+            {
+#if NETSTANDARD2_1_OR_GREATER || NET6_0_OR_GREATER
+                byte[] newValues = values.ToArray();

Review Comment:
   It shouldn't be necessary to allocate temporary memory here, should it?



##########
csharp/src/Apache.Arrow/Arrays/BinaryArray.cs:
##########
@@ -258,8 +259,77 @@ public TBuilder Swap(int i, int j)
 
             public TBuilder Set(int index, byte value)
             {
-                // TODO: Implement
+                ValueBuffer.Span[index] = value;

Review Comment:
   I know this is a pre-existing function, but I don't understand what the original author expected it to accomplish. Unless you can elucidate a use case for it, I would consider keeping the original implementation and marking it as obsolete.



##########
csharp/src/Apache.Arrow/Arrays/BinaryArray.cs:
##########
@@ -258,8 +259,77 @@ public TBuilder Swap(int i, int j)
 
             public TBuilder Set(int index, byte value)
             {
-                // TODO: Implement
+                ValueBuffer.Span[index] = value;
+                return Instance;
+            }
+
+            public TBuilder SetNull(int offset)
+            {
+                int index = ValueOffsets.Span[offset];
+                int length = GetOffsetValueLength(offset);
+
+                for (int i = 0; i < length; i++)
+                {
+                    ValueBuffer.Span[index + i] = 0;
+                }
+
+                ValidityBuffer.Set(offset, false);
+
+                return Instance;
+            }
+
+            public TBuilder Set(int offset, ReadOnlySpan<byte> values)
+            {
+#if NETSTANDARD2_1_OR_GREATER || NET6_0_OR_GREATER

Review Comment:
   Nit: these are redundant in that NETSTANDARD2_1_OR_GREATER is always true if NET6_0_OR_GREATER is also true.
   
   Rather than throwing an exception when the functionality can't be supported, I think it's better to simply not expose the function on downlevel .NET. Otherwise, the experience is that someone discovers a method via Intellisense, uses it and it always fails.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-36133: [C#] Support Set on the `BinaryArray` Builder [arrow]

Posted by "danmoseley (via GitHub)" <gi...@apache.org>.
danmoseley commented on code in PR #36134:
URL: https://github.com/apache/arrow/pull/36134#discussion_r1393599083


##########
csharp/src/Apache.Arrow/Arrays/BinaryArray.cs:
##########
@@ -258,8 +259,77 @@ public TBuilder Swap(int i, int j)
 
             public TBuilder Set(int index, byte value)
             {
-                // TODO: Implement
+                ValueBuffer.Span[index] = value;
+                return Instance;
+            }
+
+            public TBuilder SetNull(int offset)
+            {
+                int index = ValueOffsets.Span[offset];
+                int length = GetOffsetValueLength(offset);
+
+                for (int i = 0; i < length; i++)
+                {
+                    ValueBuffer.Span[index + i] = 0;
+                }
+
+                ValidityBuffer.Set(offset, false);
+
+                return Instance;
+            }
+
+            public TBuilder Set(int offset, ReadOnlySpan<byte> values)
+            {
+#if NETSTANDARD2_1_OR_GREATER || NET6_0_OR_GREATER
+                byte[] newValues = values.ToArray();
+                int index = ValueOffsets.Span[offset];
+                int existingValueLength = GetOffsetValueLength(offset);
+                int newValueLength = newValues.Length;
+
+                // Resize and shift the value and offset buffers
+                if (existingValueLength != newValueLength)
+                {
+                    int indexShift = newValueLength - existingValueLength;
+
+                    if (offset != ValueOffsets.Length)
+                    {
+                        //Shift the existing values after our change in size
+                        var afterValues = ValueBuffer.Span[(index + existingValueLength)..];
+                        ValueBuffer.Resize(ValueBuffer.Length + indexShift + 1);
+                        afterValues.CopyTo(ValueBuffer.Span[(index + newValueLength)..]);
+                    }
+                    else
+                    {
+                        ValueBuffer.Resize(ValueBuffer.Length + indexShift + 1);
+                    }
+
+                    //Update out offset indexes
+                    for (int i = offset + 1; i < ValueOffsets.Length; i++)
+                    {
+                        ValueOffsets.Span[i] += indexShift;
+                    }
+                }
+
+                for (int i = 0; i < newValues.Length; i++)
+                {
+                    Set(i + index, newValues[i]);
+                }
+
+                ValidityBuffer.Set(offset, true);
+
+                return Instance;
+#else
+                //There is not a clean way to shift the bytes with Span.CopyTo which was added in .NetStandard 2.1/.NET5
+                //Given .NetStandard past EOL, keep existing functionality the same for those users.

Review Comment:
   nit, .NET Standard 2.0/2.1 do not have an EOL, they define a standard, not a runtime or product. All future .NET's including .NET 6, 7, and 8 are expected to support .NET Standard 2.1. It's true that .NET Standard is getting increasingly less useful as .NET evolves while .NET Framework is static.
   
   you probably know this -- just suggesting this comment ought to say ".NET 5 and lower are EOL"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-36133: [C#] Support Set on the `BinaryArray` Builder [arrow]

Posted by "jeremyosterhoudt (via GitHub)" <gi...@apache.org>.
jeremyosterhoudt commented on code in PR #36134:
URL: https://github.com/apache/arrow/pull/36134#discussion_r1394868597


##########
csharp/src/Apache.Arrow/Arrays/BinaryArray.cs:
##########
@@ -258,8 +259,77 @@ public TBuilder Swap(int i, int j)
 
             public TBuilder Set(int index, byte value)
             {
-                // TODO: Implement
+                ValueBuffer.Span[index] = value;
+                return Instance;
+            }
+
+            public TBuilder SetNull(int offset)
+            {
+                int index = ValueOffsets.Span[offset];
+                int length = GetOffsetValueLength(offset);
+
+                for (int i = 0; i < length; i++)
+                {
+                    ValueBuffer.Span[index + i] = 0;
+                }
+
+                ValidityBuffer.Set(offset, false);
+
+                return Instance;
+            }
+
+            public TBuilder Set(int offset, ReadOnlySpan<byte> values)
+            {
+#if NETSTANDARD2_1_OR_GREATER || NET6_0_OR_GREATER
+                byte[] newValues = values.ToArray();
+                int index = ValueOffsets.Span[offset];
+                int existingValueLength = GetOffsetValueLength(offset);
+                int newValueLength = newValues.Length;
+
+                // Resize and shift the value and offset buffers
+                if (existingValueLength != newValueLength)
+                {
+                    int indexShift = newValueLength - existingValueLength;
+
+                    if (offset != ValueOffsets.Length)
+                    {
+                        //Shift the existing values after our change in size
+                        var afterValues = ValueBuffer.Span[(index + existingValueLength)..];
+                        ValueBuffer.Resize(ValueBuffer.Length + indexShift + 1);
+                        afterValues.CopyTo(ValueBuffer.Span[(index + newValueLength)..]);
+                    }
+                    else
+                    {
+                        ValueBuffer.Resize(ValueBuffer.Length + indexShift + 1);
+                    }
+
+                    //Update out offset indexes
+                    for (int i = offset + 1; i < ValueOffsets.Length; i++)
+                    {
+                        ValueOffsets.Span[i] += indexShift;
+                    }
+                }
+
+                for (int i = 0; i < newValues.Length; i++)
+                {
+                    Set(i + index, newValues[i]);
+                }
+
+                ValidityBuffer.Set(offset, true);
+
+                return Instance;
+#else
+                //There is not a clean way to shift the bytes with Span.CopyTo which was added in .NetStandard 2.1/.NET5
+                //Given .NetStandard past EOL, keep existing functionality the same for those users.

Review Comment:
   Great point!  I've updated the comment to reflect that the feature is only available to users running NetStandard 2.1/NET 5 and higher.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] GH-36133: [C#] Support Set on the `BinaryArray` Builder [arrow]

Posted by "danmoseley (via GitHub)" <gi...@apache.org>.
danmoseley commented on code in PR #36134:
URL: https://github.com/apache/arrow/pull/36134#discussion_r1394878345


##########
csharp/src/Apache.Arrow/Arrays/BinaryArray.cs:
##########
@@ -258,8 +259,77 @@ public TBuilder Swap(int i, int j)
 
             public TBuilder Set(int index, byte value)
             {
-                // TODO: Implement
+                ValueBuffer.Span[index] = value;
+                return Instance;
+            }
+
+            public TBuilder SetNull(int offset)
+            {
+                int index = ValueOffsets.Span[offset];
+                int length = GetOffsetValueLength(offset);
+
+                for (int i = 0; i < length; i++)
+                {
+                    ValueBuffer.Span[index + i] = 0;
+                }
+
+                ValidityBuffer.Set(offset, false);
+
+                return Instance;
+            }
+
+            public TBuilder Set(int offset, ReadOnlySpan<byte> values)
+            {
+#if NETSTANDARD2_1_OR_GREATER || NET6_0_OR_GREATER
+                byte[] newValues = values.ToArray();
+                int index = ValueOffsets.Span[offset];
+                int existingValueLength = GetOffsetValueLength(offset);
+                int newValueLength = newValues.Length;
+
+                // Resize and shift the value and offset buffers
+                if (existingValueLength != newValueLength)
+                {
+                    int indexShift = newValueLength - existingValueLength;
+
+                    if (offset != ValueOffsets.Length)
+                    {
+                        //Shift the existing values after our change in size
+                        var afterValues = ValueBuffer.Span[(index + existingValueLength)..];
+                        ValueBuffer.Resize(ValueBuffer.Length + indexShift + 1);
+                        afterValues.CopyTo(ValueBuffer.Span[(index + newValueLength)..]);
+                    }
+                    else
+                    {
+                        ValueBuffer.Resize(ValueBuffer.Length + indexShift + 1);
+                    }
+
+                    //Update out offset indexes
+                    for (int i = offset + 1; i < ValueOffsets.Length; i++)
+                    {
+                        ValueOffsets.Span[i] += indexShift;
+                    }
+                }
+
+                for (int i = 0; i < newValues.Length; i++)
+                {
+                    Set(i + index, newValues[i]);
+                }
+
+                ValidityBuffer.Set(offset, true);
+
+                return Instance;
+#else
+                //There is not a clean way to shift the bytes with Span.CopyTo which was added in .NetStandard 2.1/.NET5
+                //Given .NetStandard past EOL, keep existing functionality the same for those users.

Review Comment:
   Looks good to me, but I am not an Arrow person I just happened to come here so I'll let someone else review...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #36134: GH-36133: [C#] Support Set on the `BinaryArray` Builder

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #36134:
URL: https://github.com/apache/arrow/pull/36134#issuecomment-1595363306

   :warning: GitHub issue #36133 **has been automatically assigned in GitHub** to PR creator.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org