Posted to dev@datasketches.apache.org by Jon Malkin <jm...@apache.org> on 2020/01/19 01:31:48 UTC

Feature tracking spreadsheet

Since we're supporting multiple languages, is there any known, useful way
to have some sort of spreadsheet tracking what exists where? In this case
not just the base sketches but features within them.

I'm working on porting var opt sampling, which re-uses some classes from
other sketches for estimation. But I found that the parts that I need don't
yet exist in C++, and it seems to be in part because we currently only have
Jaccard Similarity for theta sketches. But it'd be nice to have a reference
rather than digging through another repo's code directly.

  jon

Re: Feature tracking spreadsheet

Posted by Kenneth Knowles <ke...@apache.org>.
It is created by hand. We periodically discuss generating it from a test
suite, but it changes infrequently enough that it isn't really worth it.
(Also, empirically, no one has found it worth their time to do it.)

Kenn

On Wed, Jan 22, 2020 at 12:09 PM leerho <le...@gmail.com> wrote:

> Thanks, I was looking for the data file and couldn't find it :) Is that
> yml file created by hand? Or do you have some script tool that generates it?
>
> Lee.

Re: Feature tracking spreadsheet

Posted by leerho <le...@gmail.com>.
Thanks, I was looking for the data file and couldn't find it :) Is that
yml file created by hand? Or do you have some script tool that generates it?

Lee.

On Wed, Jan 22, 2020 at 11:18 AM Kenneth Knowles <ke...@apache.org> wrote:

> It is generated by Jekyll (along with the rest of the site).
>
> Here's the data file:
> https://github.com/apache/beam/blob/master/website/src/_data/capability-matrix.yml
> Here's the thing that rolls it out:
> https://github.com/apache/beam/blob/master/website/src/documentation/runners/capability-matrix.md
>
> Kenn

Re: Feature tracking spreadsheet

Posted by Kenneth Knowles <ke...@apache.org>.
It is generated by Jekyll (along with the rest of the site).

Here's the data file:
https://github.com/apache/beam/blob/master/website/src/_data/capability-matrix.yml
Here's the thing that rolls it out:
https://github.com/apache/beam/blob/master/website/src/documentation/runners/capability-matrix.md

Kenn
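
A minimal sketch of the same idea applied to sketch features, not Beam's actual Jekyll setup: a hand-maintained data structure with one row per feature and one column per language, rendered into a plain-text table. The names `LANGUAGES`, `FEATURES`, and `render_matrix` are hypothetical, and the status values are illustrative placeholders rather than an audit of any library.

```python
# Hand-maintained feature matrix.
# All names and statuses below are illustrative placeholders (assumptions).
LANGUAGES = ["Java", "C++"]

FEATURES = {
    "theta: Jaccard similarity": {"Java": "yes", "C++": "yes"},
    "sampling: VarOpt": {"Java": "yes", "C++": "in progress"},
}


def render_matrix(features, languages):
    """Return the matrix as a plain-text table, one row per feature."""
    header = ["Feature"] + list(languages)
    rows = [header]
    for name, support in features.items():
        # A language missing from a row is reported as "no" rather than raising.
        rows.append([name] + [support.get(lang, "no") for lang in languages])

    # Pad every column to the width of its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(len(header))]
    lines = [
        " | ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    ]
    # Separator line under the header row.
    lines.insert(1, "-+-".join("-" * w for w in widths))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_matrix(FEATURES, LANGUAGES))
```

Keeping the data separate from the rendering means the same rows could later be emitted as a site table instead of plain text without touching the hand-edited part.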

On Mon, Jan 20, 2020 at 10:45 AM leerho <le...@gmail.com> wrote:

> Kenn,
>
> What tool do you use to generate those capability matrices?

Re: Feature tracking spreadsheet

Posted by leerho <le...@gmail.com>.
Kenn,

What tool do you use to generate those capability matrices?

On Sun, Jan 19, 2020 at 9:46 PM leerho <le...@gmail.com> wrote:

> I am working on something similar, but it won’t be as pretty 😊
>
> Lee

Re: Feature tracking spreadsheet

Posted by leerho <le...@gmail.com>.
I am working on something similar, but it won’t be as pretty 😊

Lee

On Sun, Jan 19, 2020 at 4:00 PM Kenneth Knowles <ke...@apache.org> wrote:

> Is it relevant to users? If so, maybe a table on the site is better than a
> spreadsheet? Here's what Beam does:
> https://beam.apache.org/documentation/runners/capability-matrix/
>
> Kenn

Re: Feature tracking spreadsheet

Posted by Kenneth Knowles <ke...@apache.org>.
Is it relevant to users? If so, maybe a table on the site is better than a
spreadsheet? Here's what Beam does:
https://beam.apache.org/documentation/runners/capability-matrix/

Kenn

On Sat, Jan 18, 2020 at 5:32 PM Jon Malkin <jm...@apache.org> wrote:

> Since we're supporting multiple languages, is there any known, useful way
> to have some sort of spreadsheet tracking what exists where? In this case
> not just the base sketches but features within them.
>
> I'm working on porting var opt sampling, which re-uses some classes from
> other sketches for estimation. But I found that the parts that I need don't
> yet exist in C++, and it seems to be in part because we currently only have
> Jaccard Similarity for theta sketches. But it'd be nice to have a reference
> rather than digging through another repo's code directly.
>
>   jon
>