Posted to marketing@couchdb.apache.org by Noah Slater <ns...@apache.org> on 2014/05/01 20:50:32 UTC

Sharing data with third-parties

Marketing team,

I'd like to discuss the privacy of data the project collects, so we
know what can be shared on the lists, and with interested
third-parties.

I'd like to cover these types of data:

1) Download stats
2) Web analytics for the website and the blog
3) Size of the mailing lists
4) Composition of the mailing lists

Here's what that data can be used for:

(1) and (2) can be used to measure popularity of CouchDB and how our
marketing efforts and releases have an impact on the project.

(3) can be used to get a sense of how big the community is, and how
that is changing over time.

(4) can be used to get demographic information. There are plenty of
tools that will mine information from a list of email addresses,
linking people up with employers, LinkedIn profiles, and other social
media.

Now, as far as I am concerned:

(1) is fine and I am happy sharing that on this list.

(2) should be for committers only by default, but relevant info can be
posted if it helps us in our marketing efforts: popular page views,
specific page views, paths/clicktrails, browser demographics, and so
on.

But what about information like location, age, gender, interests and
so on (which you can get from some analytics tools)? I think we'd have
to make sure that whatever was being shared publicly was 1) actually
useful to us and needed to be shared, and 2) heavily aggregated so as
not to cause any privacy concerns.

(I am more interested in doing the right thing here than I am in
getting around the legalities of the situation.)

(3) is fine, and we share this every three months already in our board reports.

(4) seems like something that ought to be confidential and not shared
with anyone outside of the PMC for any purpose.

If we shared (4) with anyone, we'd have to share it with everyone, per
our strict vendor-neutral position. And there is already a strong
consensus on the internal ASF press@ list that this is out of the
question.

What do other people think?

Thank you,

-- 
Noah Slater
https://twitter.com/nslater

Re: Sharing data with third-parties

Posted by Noah Slater <ns...@apache.org>.
The goal of vendor neutrality is to not give any specific vendor of
CouchDB preference or favour over any other. It does not preclude us
from using the services of unrelated third-party organisations.

On 1 May 2014 23:55, matt j. sorenson <ma...@sorensonbros.net> wrote:
> Okay, how does employing google analytics pass the vendor-neutrality test?!
> Which, yes, I've lamented in the past on these lists. But frankly, I think
> I'd rather you tell everybody on the planet my email address without my
> express permission, than position google analytics as vendor-neutral.
>
> How about piwik?
>
> --
> *Matt*
>
>
> On Thu, May 1, 2014 at 4:43 PM, Noah Slater <ns...@apache.org> wrote:
>
>> Right now we are using Google Analytics, but the configuration is
>> messed up. I'm open to suggestions of alternatives. But I think
>> it is safe to assume the feature set of GA.
>>
>> As for what we want to share there, I don't know yet. Immediately, I
>> can think of:
>>
>> - Overall traffic (so we can spot trends)
>> - Traffic graphs for blog posts (page views over time)
>> - Top referrers to our various properties (helps us to target our promotion)
>> - Search terms bringing people to the blog, docs, etc. (so we know
>> what's resonating)
>> - Top countries, languages (so we know who our audience is for events, etc.)
>> - Browser stats (so we have an idea of who to target with our product and
>> site)
>>
>> Can you think of any other stats we'd want? We don't have to list them
>> all here right away. We can discuss specific stats as we need or want
>> to share them. But good to get a general impression of what we think
>> is acceptable now.
>>
>> On 1 May 2014 23:10, Joan Touzet <wo...@apache.org> wrote:
>> > Agree to all and will restate what I said before: "Web Analytics" is too
>> > broad, we need to define that term more narrowly, at which point I think
>> > we can share only aggregate technical info and avoid anything that
>> > attempts to get into individual demographics as pure conjecture only.
>> >
>> > -Joan
>> >
>> > ----- Original Message -----
>> > From: "Andy Wenk" <an...@nms.de>
>> > To: marketing@couchdb.apache.org
>> > Sent: Thursday, May 1, 2014 3:32:33 PM
>> > Subject: Re: Sharing data with third-parties
>> >
>> > Hi Noah,
>> >
>> > +1 for your suggestions on where and with whom to share the data for
>> > 1), 2), 3) and 4).
>> >
>> > Cheers
>> >
>> > Andy
>> >
>> >
>> >
>> > --
>> > Andy Wenk
>> > Hamburg - Germany
>> > RockIt!
>> >
>> > http://www.couchdb-buch.de
>> > http://www.pg-praxisbuch.de
>> >
>> > GPG fingerprint: C044 8322 9E12 1483 4FEC 9452 B65D 6BE3 9ED3 9588
>> >
>> > https://people.apache.org/keys/committer/andywenk.asc
>>
>>
>>
>> --
>> Noah Slater
>> https://twitter.com/nslater
>>



-- 
Noah Slater
https://twitter.com/nslater

Re: Sharing data with third-parties

Posted by "matt j. sorenson" <ma...@sorensonbros.net>.
Okay, how does employing google analytics pass the vendor-neutrality test?!
Which, yes, I've lamented in the past on these lists. But frankly, I think
I'd rather you tell everybody on the planet my email address without my
express permission, than position google analytics as vendor-neutral.

How about piwik?

--
*Matt*


On Thu, May 1, 2014 at 4:43 PM, Noah Slater <ns...@apache.org> wrote:

> Right now we are using Google Analytics, but the configuration is
> messed up. I'm open to suggestions of alternatives. But I think
> it is safe to assume the feature set of GA.
>
> As for what we want to share there, I don't know yet. Immediately, I
> can think of:
>
> - Overall traffic (so we can spot trends)
> - Traffic graphs for blog posts (page views over time)
> - Top referrers to our various properties (helps us to target our promotion)
> - Search terms bringing people to the blog, docs, etc. (so we know
> what's resonating)
> - Top countries, languages (so we know who our audience is for events, etc.)
> - Browser stats (so we have an idea of who to target with our product and
> site)
>
> Can you think of any other stats we'd want? We don't have to list them
> all here right away. We can discuss specific stats as we need or want
> to share them. But good to get a general impression of what we think
> is acceptable now.
>
> On 1 May 2014 23:10, Joan Touzet <wo...@apache.org> wrote:
> > Agree to all and will restate what I said before: "Web Analytics" is too
> > broad, we need to define that term more narrowly, at which point I think
> > we can share only aggregate technical info and avoid anything that
> > attempts to get into individual demographics as pure conjecture only.
> >
> > -Joan
> >
> > ----- Original Message -----
> > From: "Andy Wenk" <an...@nms.de>
> > To: marketing@couchdb.apache.org
> > Sent: Thursday, May 1, 2014 3:32:33 PM
> > Subject: Re: Sharing data with third-parties
> >
> > Hi Noah,
> >
> > +1 for your suggestions on where and with whom to share the data for
> > 1), 2), 3) and 4).
> >
> > Cheers
> >
> > Andy
> >
> >
> >
> > --
> > Andy Wenk
> > Hamburg - Germany
> > RockIt!
> >
> > http://www.couchdb-buch.de
> > http://www.pg-praxisbuch.de
> >
> > GPG fingerprint: C044 8322 9E12 1483 4FEC 9452 B65D 6BE3 9ED3 9588
> >
> > https://people.apache.org/keys/committer/andywenk.asc
>
>
>
> --
> Noah Slater
> https://twitter.com/nslater
>

Re: Sharing data with third-parties

Posted by Noah Slater <ns...@apache.org>.
Right now we are using Google Analytics, but the configuration is
messed up. I'm open to suggestions of alternatives. But I think
it is safe to assume the feature set of GA.

As for what we want to share there, I don't know yet. Immediately, I
can think of:

- Overall traffic (so we can spot trends)
- Traffic graphs for blog posts (page views over time)
- Top referrers to our various properties (helps us to target our promotion)
- Search terms bringing people to the blog, docs, etc. (so we know
what's resonating)
- Top countries, languages (so we know who our audience is for events, etc.)
- Browser stats (so we have an idea of who to target with our product and site)

Can you think of any other stats we'd want? We don't have to list them
all here right away. We can discuss specific stats as we need or want
to share them. But good to get a general impression of what we think
is acceptable now.

On 1 May 2014 23:10, Joan Touzet <wo...@apache.org> wrote:
> Agree to all and will restate what I said before: "Web Analytics" is too
> broad, we need to define that term more narrowly, at which point I think
> we can share only aggregate technical info and avoid anything that
> attempts to get into individual demographics as pure conjecture only.
>
> -Joan
>
> ----- Original Message -----
> From: "Andy Wenk" <an...@nms.de>
> To: marketing@couchdb.apache.org
> Sent: Thursday, May 1, 2014 3:32:33 PM
> Subject: Re: Sharing data with third-parties
>
> Hi Noah,
>
> +1 for your suggestions on where and with whom to share the data for
> 1), 2), 3) and 4).
>
> Cheers
>
> Andy
>
>
>
> --
> Andy Wenk
> Hamburg - Germany
> RockIt!
>
> http://www.couchdb-buch.de
> http://www.pg-praxisbuch.de
>
> GPG fingerprint: C044 8322 9E12 1483 4FEC 9452 B65D 6BE3 9ED3 9588
>
> https://people.apache.org/keys/committer/andywenk.asc



-- 
Noah Slater
https://twitter.com/nslater

Re: Sharing data with third-parties

Posted by Joan Touzet <wo...@apache.org>.
Agree to all and will restate what I said before: "Web Analytics" is too
broad, we need to define that term more narrowly, at which point I think
we can share only aggregate technical info and avoid anything that
attempts to get into individual demographics as pure conjecture only.

-Joan

----- Original Message -----
From: "Andy Wenk" <an...@nms.de>
To: marketing@couchdb.apache.org
Sent: Thursday, May 1, 2014 3:32:33 PM
Subject: Re: Sharing data with third-parties

Hi Noah,

+1 for your suggestions on where and with whom to share the data for
1), 2), 3) and 4).

Cheers

Andy



-- 
Andy Wenk
Hamburg - Germany
RockIt!

http://www.couchdb-buch.de
http://www.pg-praxisbuch.de

GPG fingerprint: C044 8322 9E12 1483 4FEC 9452 B65D 6BE3 9ED3 9588

https://people.apache.org/keys/committer/andywenk.asc

Re: Sharing data with third-parties

Posted by Andy Wenk <an...@nms.de>.
Hi Noah,

+1 for your suggestions on where and with whom to share the data for
1), 2), 3) and 4).

Cheers

Andy



-- 
Andy Wenk
Hamburg - Germany
RockIt!

http://www.couchdb-buch.de
http://www.pg-praxisbuch.de

GPG fingerprint: C044 8322 9E12 1483 4FEC 9452 B65D 6BE3 9ED3 9588

https://people.apache.org/keys/committer/andywenk.asc