Posted to dev@arrow.apache.org by Kirill Lykov <ly...@gmail.com> on 2020/06/22 13:50:58 UTC

Generate random arrow table

Hi,

I wonder if there is existing C++ code that can generate a random
Arrow table from given metadata. Maybe it is not part of the Arrow
library, but someone has written such a builder. I want to use it for
benchmarking purposes (using Google Benchmark).

-- 
Best regards,
Kirill Lykov,
personal page: http://kirilllykov.github.com/blog/about/
tel.: +41 765 27 6229

Re: Generate random arrow table

Posted by Francois Saint-Jacques <fs...@gmail.com>.
If you configured CMake to build the tests (-DARROW_BUILD_TESTS=ON) and
installed Arrow locally, there should be a `libarrow_testing.so` that
you need to link against. What I meant is that this library is _not_
part of the pip/conda/dpkg/rpm packages.

François

Re: Generate random arrow table

Posted by Kirill Lykov <ly...@gmail.com>.
Yes, I had already figured that out from the "undefined reference" errors.
Do you know if there is any way to build Arrow so that the testing
code is part of the library?
For some reason, I see random.h in the installation directory, but when
I inspect libarrow.a with nm I don't see definitions of its functions.

On Tue, Jun 23, 2020 at 5:44 PM Francois Saint-Jacques
<fs...@gmail.com> wrote:
>
> It doesn't support it yet, but you can probably trivially add a method
> for it. Time (timestamp, interval) types might require more changes
> since the type factories are not always parameter-less. A small note,
> this functionality (random generation) is part of the testing shared
> library. AFAIK this is _not_ exported in the binary packages.
>
> Regards,
> François

-- 
Best regards,
Kirill Lykov

Re: Generate random arrow table

Posted by Francois Saint-Jacques <fs...@gmail.com>.
It doesn't support it yet, but you can probably trivially add a method
for it. Time (timestamp, interval) types might require more changes
since the type factories are not always parameter-less. A small note,
this functionality (random generation) is part of the testing shared
library. AFAIK this is _not_ exported in the binary packages.
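
To illustrate why adding date support can be simple: in the Arrow format, Date32 is stored as a signed 32-bit count of days since the UNIX epoch, and Date64 as a signed 64-bit count of milliseconds since the epoch that should be a whole number of days. The sketch below uses only the C++ standard library and does not call Arrow at all; the function names `RandomDate32Days` and `ToDate64Millis` are hypothetical and are not part of `arrow/testing/random.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Number of milliseconds in one day, used to widen Date32 days to Date64.
constexpr int64_t kMillisPerDay = 86400000;

// Hypothetical helper (not Arrow API): draws `length` random day offsets in
// [min_day, max_day]. Each value is the physical representation of a Date32,
// i.e. int32 days since 1970-01-01.
std::vector<int32_t> RandomDate32Days(std::size_t length, int32_t min_day,
                                      int32_t max_day, uint64_t seed) {
  assert(min_day <= max_day);
  std::mt19937_64 rng(seed);
  std::uniform_int_distribution<int32_t> dist(min_day, max_day);
  std::vector<int32_t> days(length);
  for (auto& d : days) d = dist(rng);
  return days;
}

// Hypothetical helper (not Arrow API): converts day offsets to the physical
// representation of Date64, i.e. int64 milliseconds since the epoch. The
// result is always a multiple of kMillisPerDay.
std::vector<int64_t> ToDate64Millis(const std::vector<int32_t>& days) {
  std::vector<int64_t> millis;
  millis.reserve(days.size());
  for (int32_t d : days) {
    millis.push_back(static_cast<int64_t>(d) * kMillisPerDay);
  }
  return millis;
}
```

For example, `RandomDate32Days(1000, 0, 18628, 7)` would give 1000 day offsets between 1970-01-01 and 2021-01-01. Wrapping such a buffer in an actual Arrow date array is left out here because it depends on the Arrow version in use.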

Regards,
François

On Tue, Jun 23, 2020 at 11:33 AM Kirill Lykov <ly...@gmail.com> wrote:
>
> Yeah, I wrote some code that does this (I cannot use the latest
> Arrow version so far). The only missing feature is Date32/Date64.
> Are they supported by ArrayOf, or are they not implemented because
> (I guess) one could convert uint32 to Date32, for example?

Re: Generate random arrow table

Posted by Kirill Lykov <ly...@gmail.com>.
Yeah, I wrote some code that does this (I cannot use the latest
Arrow version so far). The only missing feature is Date32/Date64.
Are they supported by ArrayOf, or are they not implemented because
(I guess) one could convert uint32 to Date32, for example?

On Mon, Jun 22, 2020 at 7:31 PM Wes McKinney <we...@gmail.com> wrote:
>
> I think you can pretty easily use the new
> `RandomArrayGenerator::ArrayOf` function to generate a random
> RecordBatch given a schema and length.

-- 
Best regards,
Kirill Lykov

Re: Generate random arrow table

Posted by Wes McKinney <we...@gmail.com>.
I think you can pretty easily use the new
`RandomArrayGenerator::ArrayOf` function to generate a random
RecordBatch given a schema and length.
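
The idea of building a batch column by column from a schema and a length can be sketched as follows. This is a standard-library-only illustration, not Arrow code: `FieldSpec`, `Column`, `RandomBatch`, `RandomColumn` and `MakeRandomBatch` are hypothetical stand-ins for Arrow's field, array and record batch types and for a call such as `RandomArrayGenerator::ArrayOf`, whose exact signature should be checked in `arrow/testing/random.h` for your version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins for Arrow concepts; none of these names are Arrow API.
enum class ColumnType { kInt32, kDouble };

struct FieldSpec {
  std::string name;
  ColumnType type;
};

using Column = std::variant<std::vector<int32_t>, std::vector<double>>;

struct RandomBatch {
  std::vector<FieldSpec> schema;
  std::vector<Column> columns;
  std::size_t num_rows;
};

// Generates one random column of `length` values for the given type.
// The value ranges are arbitrary choices made for this sketch.
Column RandomColumn(ColumnType type, std::size_t length, std::mt19937_64& rng) {
  if (type == ColumnType::kInt32) {
    std::uniform_int_distribution<int32_t> dist(-1000, 1000);
    std::vector<int32_t> values(length);
    for (auto& v : values) v = dist(rng);
    return Column(std::move(values));
  }
  std::uniform_real_distribution<double> dist(0.0, 1.0);
  std::vector<double> values(length);
  for (auto& v : values) v = dist(rng);
  return Column(std::move(values));
}

// Walks the schema and generates one column per field, all of equal length.
// A single seeded engine is shared so that the same seed reproduces the batch.
RandomBatch MakeRandomBatch(const std::vector<FieldSpec>& schema,
                            std::size_t length, uint64_t seed) {
  std::mt19937_64 rng(seed);
  RandomBatch batch{schema, {}, length};
  batch.columns.reserve(schema.size());
  for (const auto& field : schema) {
    batch.columns.push_back(RandomColumn(field.type, length, rng));
  }
  assert(batch.columns.size() == schema.size());
  return batch;
}
```

Fixing the seed matters for benchmarking, since repeated runs then operate on identical data. With Arrow itself, the loop body would presumably call the generator once per schema field and the resulting arrays would be assembled into a RecordBatch.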

On Mon, Jun 22, 2020 at 10:52 AM Kirill Lykov <ly...@gmail.com> wrote:
>
> Thanks for the reply.
> I saw an Array generator and decided to ask whether there is already
> something like a RandomTableGenerator before implementing one myself
> using RandomArrayGenerator.

Re: Generate random arrow table

Posted by Kirill Lykov <ly...@gmail.com>.
Thanks for the reply.
I saw an Array generator and decided to ask whether there is already
something like a RandomTableGenerator before implementing one myself
using RandomArrayGenerator.

On Mon, Jun 22, 2020 at 4:49 PM Francois Saint-Jacques
<fs...@gmail.com> wrote:
>
> Hello,
>
> We use this extensively in unit tests, see [1]
>
> François
> [1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/testing/random.h

-- 
Best regards,
Kirill Lykov

Re: Generate random arrow table

Posted by Francois Saint-Jacques <fs...@gmail.com>.
Hello,

We use this extensively in unit tests, see [1]

François
[1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/testing/random.h

On Mon, Jun 22, 2020 at 9:51 AM Kirill Lykov <ly...@gmail.com> wrote:
>
> Hi,
>
> I wonder if there is existing C++ code that can generate a random
> Arrow table from given metadata. Maybe it is not part of the Arrow
> library, but someone has written such a builder. I want to use it for
> benchmarking purposes (using Google Benchmark).
>
> --
> Best regards,
> Kirill Lykov,
> personal page: http://kirilllykov.github.com/blog/about/
> tel.: +41 765 27 6229