Posted to diversity@apache.org by Daniel Gruno <hu...@apache.org> on 2021/05/20 13:22:36 UTC

Inclusive terminology in Apache Software - a practical approach

Hi folks,
following the discussion on inclusive naming, I got to thinking and 
discussing a bit more with Rich and others about practical approaches we 
could explore under the EDI banner.

And thus, I started work on a scanner service for repositories that 
could identify places where we can improve wording. The resulting 
service I have dubbed CLC (Conscious Language Checker), and a demo is up 
and running at https://clcdemo.net/
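
As a rough illustration of what such a scanner does at its core (a minimal sketch, not CLC's actual code; the word list and the function name are made up for the example), flagging terms line by line could look like this:

```python
import re
from typing import Iterable, List, Tuple

# Illustrative word list only; these are NOT CLC's actual defaults.
EXAMPLE_TERMS = ["blacklist", "whitelist"]


def find_terms(text: str, terms: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, matched_term) pairs, matching case-insensitively."""
    # \b restricts matches to whole words, so "xblacklisty" is not flagged.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((line_number, match.group(1).lower()))
    return hits


sample = "Add host to the Whitelist.\nNothing to see here.\nblacklist = []\n"
print(find_terms(sample, EXAMPLE_TERMS))
```

A real service would also have to walk the repository, skip binary files, and honour per-project settings; none of that is shown here.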

I picked a few Apache projects to test it on, and you can see individual 
analyses by clicking on the charts, for instance 
https://clcdemo.net/analysis.html?project=httpd.git

Anyone can currently edit settings and see the updated result once a new 
scan runs (twice daily currently). This will eventually be committers only.

I am proposing we put this under the EDI banner with a public repository 
for the service and a .apache.org hostname, for instance 
clc.diversity.apache.org and then invite projects to participate once 
the service is production ready. I believe this could be a good 
practical goal to achieve at EDI, and that it could help projects more 
easily adjust their terminologies.

WDYT?

With regards,
Daniel.

---------------------------------------------------------------------
To unsubscribe, e-mail: diversity-unsubscribe@apache.org
For additional commands, e-mail: diversity-help@apache.org


Fwd: Inclusive terminology in Apache Software - a practical approach

Posted by Kenneth Knowles <ke...@apache.org>.
Check out this tool. I tried it and adjusted Beam's settings a bit.

https://clcdemo.net/analysis.html?project=beam.git

Kenn

---------- Forwarded message ---------
From: Daniel Gruno <hu...@apache.org>
Date: Thu, May 20, 2021 at 6:22 AM
Subject: Inclusive terminology in Apache Software - a practical approach
To: <di...@apache.org>


Hi folks,
following the discussion on inclusive naming, I got to thinking and
discussing a bit more with Rich and others about practical approaches we
could explore under the EDI banner.

And thus, I started work on a scanner service for repositories that
could identify places where we can improve wording. The resulting
service I have dubbed CLC (Conscious Language Checker), and a demo is up
and running at https://clcdemo.net/

I picked a few Apache projects to test it on, and you can see individual
analyses by clicking on the charts, for instance
https://clcdemo.net/analysis.html?project=httpd.git

Anyone can currently edit settings and see the updated result once a new
scan runs (twice daily currently). This will eventually be committers only.

I am proposing we put this under the EDI banner with a public repository
for the service and a .apache.org hostname, for instance
clc.diversity.apache.org and then invite projects to participate once
the service is production ready. I believe this could be a good
practical goal to achieve at EDI, and that it could help projects more
easily adjust their terminologies.

WDYT?

With regards,
Daniel.


Re: Inclusive terminology in Apache Software - a practical approach

Posted by Daniel Gruno <hu...@apache.org>.
On 20/05/2021 15.27, Andrew Wetmore wrote:
> This is a very good idea,

Thanks!

> 
> On the left and right Y axes of the charts, what are the units of
> measurement?

Issues found on the left; files processed (and time spent) on the right.
I hope I have added that to the charts now :)





Re: Inclusive terminology in Apache Software - a practical approach

Posted by Andrew Wetmore <an...@apache.org>.
This is a very good idea.

On the left and right Y axes of the charts, what are the units of
measurement?

-- 
Andrew Wetmore
Technical Writer-Editor
Infra
*Apache Software Foundation*
andreww@apache.org

Re: Inclusive terminology in Apache Software - a practical approach

Posted by Rich Bowen <rb...@rcbowen.com>.

On 5/21/21 7:50 AM, Daniel Gruno wrote:
> The repository for CLC is now live at https://github.com/Humbedooh/clc
> Anyone can clone the repo, run the install and have a service up and 
> running in minutes. Requires Python 3.7 and some free disk space, but 
> that's about it - the rest is all self-contained.

Confirmed: 2 minutes to get everything set up, and start scanning my 
projects. Thanks!


> 
> Feedback much appreciated!
> 
> I'm going to ask for lazy consensus on asking infra for a VM where we 
> can have this run for committers, so people can perhaps assess their 
> projects.
> 
> With regards,
> Daniel.
> 

-- 
Rich Bowen - rbowen@rcbowen.com
@rbowen



Re: Inclusive terminology in Apache Software - a practical approach

Posted by Daniel Gruno <hu...@apache.org>.
The repository for CLC is now live at https://github.com/Humbedooh/clc
Anyone can clone the repo, run the install and have a service up and 
running in minutes. Requires Python 3.7 and some free disk space, but 
that's about it - the rest is all self-contained.

Feedback much appreciated!

I'm going to ask for lazy consensus on asking infra for a VM where we 
can have this run for committers, so people can perhaps assess their 
projects.

With regards,
Daniel.





Re: Inclusive terminology in Apache Software - a practical approach

Posted by Daniel Gruno <hu...@apache.org>.
On 20/05/2021 19.43, Kenneth Knowles wrote:
> I tried it on Beam and noticed that the default exclusion glob of
> *shakespeare* did not match /sdks/go/data/shakespeare/hamlet.txt

It _does_ match, but Rich put that exclusion in *after* the scan had run.

On next scan, it should exclude the shakespeare stuff :)
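
To illustrate why the pattern matches: assuming the exclusion globs behave like Python's fnmatch (an assumption on my part; nothing in this thread says how CLC evaluates them), a "*" matches any run of characters, including "/", so no "**" is needed to cross directories. A small sketch:

```python
from fnmatch import fnmatchcase

path = "/sdks/go/data/shakespeare/hamlet.txt"

# In fnmatch-style globs, "*" matches any run of characters, "/" included,
# so the pattern can match anywhere inside the full path.
matches_default = fnmatchcase(path, "*shakespeare*")

# Without the wildcards, the pattern must equal the whole path, so it fails.
matches_bare = fnmatchcase(path, "shakespeare")

print(matches_default, matches_bare)  # True False
```

Shell-style globbing (as in pathlib or a shell) treats "/" differently, so this only holds under the fnmatch assumption.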


> 
> Is this the sort of glob where I would use ** to traverse directories, or
> some such?
> 
> Kenn




Re: Inclusive terminology in Apache Software - a practical approach

Posted by Kenneth Knowles <ke...@apache.org>.
I tried it on Beam and noticed that the default exclusion glob of
*shakespeare* did not match /sdks/go/data/shakespeare/hamlet.txt

Is this the sort of glob where I would use ** to traverse directories, or
some such?

Kenn


Re: Inclusive terminology in Apache Software - a practical approach

Posted by Daniel Gruno <hu...@apache.org>.
On 20/05/2021 19.37, Andrew Wetmore wrote:
> The "about" link doesn't seem to lead anywhere.

Work in progress :D
I've yet to figure out what to put there...





Re: Inclusive terminology in Apache Software - a practical approach

Posted by Andrew Wetmore <co...@gmail.com>.
The "about" link doesn't seem to lead anywhere.

-- 
Andrew Wetmore

http://cottage14.blogspot.com/

Re: Inclusive terminology in Apache Software - a practical approach

Posted by Daniel Gruno <hu...@apache.org>.
On 20/05/2021 16.15, Mark Thomas wrote:
> 
> I like it.
> 
> Looks like Tomcat has managed to trigger a false positive in 
> CharsetCache.java.

You can exclude files and contexts if you click on 'scan settings...' 
and edit them. Later on there will be a per-occurrence action to ignore 
a hit or mark it as intended/false-positive, which the scanner will remember.

Though remember, the scanner only rescans every 12 hours, so it'll be a 
while before your changes show up. I might change that to something shorter.
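
To make the idea of file and context exclusions concrete, here is a minimal sketch. The setting names, patterns, and paths are hypothetical and do not reflect CLC's actual settings format; it also assumes fnmatch-style globs for files and regular expressions for contexts.

```python
import re
from fnmatch import fnmatchcase
from typing import Dict

# Hypothetical settings; names and values are illustrative, not CLC's schema.
EXCLUDED_FILES = ["*.min.js", "*shakespeare*"]
EXCLUDED_CONTEXTS = [r"master\s+branch"]


def keep_hit(hit: Dict[str, str]) -> bool:
    """Return False if the hit's file or its surrounding line is excluded."""
    if any(fnmatchcase(hit["path"], glob) for glob in EXCLUDED_FILES):
        return False
    if any(re.search(ctx, hit["line"], re.IGNORECASE) for ctx in EXCLUDED_CONTEXTS):
        return False
    return True


hits = [
    {"path": "docs/setup.txt", "line": "merge into the master branch"},
    {"path": "data/shakespeare/hamlet.txt", "line": "my master calls"},
    {"path": "src/replica.py", "line": "promote the master node"},
]
kept = [h for h in hits if keep_hit(h)]
print([h["path"] for h in kept])  # ['src/replica.py']
```

The first hit is dropped because of its context, the second because of its path, and only the third is reported.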

> 
> Might want to consider a way to handle such cases depending on how 
> frequent they are.
> 
> Mark
> 




Re: Inclusive terminology in Apache Software - a practical approach

Posted by Mark Thomas <ma...@apache.org>.
On 20/05/2021 14:22, Daniel Gruno wrote:
> WDYT?

I like it.

Looks like Tomcat has managed to trigger a false positive in 
CharsetCache.java.

Might want to consider a way to handle such cases depending on how 
frequent they are.

Mark
