Posted to dev@beam.apache.org by Valentyn Tymofieiev <va...@google.com> on 2021/09/23 02:12:58 UTC
Re: Please help triage issues!
Thanks, Kyle!
Since adding labels is one of the possible triage actions, it would be good
to make labels more useful.
Possible ideas:
- Add a triaging recommendation page that shows commonly used labels that
still make sense to use, with a few words on when to use each one where that
is not obvious. It would be nice if this page were easily discoverable when
users are looking for how to report issues.
- Given that labels are arbitrary and we don't control them, we should
create filters[1] and Kanban boards[2], which we control, and use them to
look up triaged issues in the future. Filters can pull information from
multiple labels. For example the flaky test filter[3] should pull issues
with labels 'flake', 'flaky', 'flaky-test', 'flakey'... We may need to
update filters periodically when synonymous labels start to appear.
- Some filters/boards I'd like to see: issues suitable for a first-time
PR, ease-of-use issues, filters for medium-size self-contained projects
suitable for new contributors who are ramping up on a certain area of Beam.
- Use components instead of labels when an identical component is available.
- Maybe clean up some of the labels if they are no longer useful. Perhaps
there is an automated way to do that easily.
[1]
https://issues.apache.org/jira/secure/ManageFilters.jspa?filterView=search
[2] https://issues.apache.org/jira/secure/ManageRapidViews.jspa
[3]
https://issues.apache.org/jira/secure/RapidView.jspa?rapidView=464&tab=filter
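To illustrate the idea of one filter covering several synonymous labels, here is a minimal sketch that builds the JQL text for such a filter. The helper name build_label_filter and the choice to restrict the result to unresolved issues are assumptions made for illustration; this is not the query behind any existing Beam filter, and the code only formats a string without talking to Jira.

```python
def build_label_filter(project, labels, unresolved_only=True):
    """Return a JQL string matching issues that carry any of the given labels.

    Hypothetical helper: it only formats text and never contacts Jira.
    """
    if not labels:
        raise ValueError("at least one label is required")
    # Quote every label; quoted string values are safe for names with punctuation.
    quoted = ", ".join('"{}"'.format(label) for label in sorted(set(labels)))
    clauses = ["project = {}".format(project), "labels in ({})".format(quoted)]
    if unresolved_only:
        clauses.append("resolution = Unresolved")
    return " AND ".join(clauses)


if __name__ == "__main__":
    # Synonyms taken from the flaky-test example above.
    flaky_synonyms = ["flake", "flaky", "flaky-test", "flakey"]
    print(build_label_filter("BEAM", flaky_synonyms))
```

When a new synonym shows up, only the list of labels has to change; the resulting text can be pasted into the filter's query.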
Here are some more labels I collected among all open and closed issues.
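    :jira$ cat ./get_all_labels.sh
    # Count label usage across all Jira CSV exports in the current directory.
    # Requires csvtool. First collect "<file>,<column index>" for every
    # column whose header is "Labels", then print and count those columns.
    for file_column in $(
      for file in *.csv; do
        head -n1 "$file" | awk -F ',' -v fname="$file" \
          '{for (c = 1; c <= NF; c++) if ($c == "Labels") print fname "," c}'
      done
    ); do
      file=$(echo "$file_column" | cut -d',' -f1)
      column=$(echo "$file_column" | cut -d',' -f2)
      # Alternative without csvtool (misparses quoted fields containing commas):
      # tail -n +2 "$file" | awk -F '"*,"*' -v col="$column" '{print $col}'
      csvtool format '%('"$column"')\n' "$file" | tail -n +2
    done | grep -v '^\s*$' | sort | uniq -c | sort -h -r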
    :jira$ ./get_all_labels.sh
    1166 Done
    530 Clarified
    382 starter
    332 portability
    252 flake
    222 stale-assigned
    194 newbie
    185 currently-failing
    154 stale-P2
    128 portability-spark
    122 website-revamp-2020
    115 beam-fixit
    113 portability-flink
    111 backward-incompatible
    71 dataframe-api
    69 dsl_sql_merge
    60 easyfix
    49 triaged
    48 errorprone
    47 beginner
    39 documentation
    35 zetasql-compliance
    31 python
    31 flaky-test
    30 sickbay
    30 portability-samza
    28 gradle
    25 structured-streaming
    24 BeamSummitEU2019
    24 beamsummit
    22 zetasql-java-udf
    22 findbugs
    22 easy
    22 BeamSummitWebsite
    21 io
    21 dataflow
    20 performance
    20 nexmark
    20 features
    20 beamevents
    19 GCP
    19 dsl_sql_review
    16 types
    16 pipeline-patterns
    15 pull-request-available
    15 ccoss2019
    15 bigquery
    14 test
    14 simple
    14 beam-site-automation-reliability
    13 mentor
    13 gcp
    13 flaky
    12 sdk-consistency
    12 gsoc
    12 beam
    11 website-revamp-sprint-8
    10 java
    9 website-revamp-sprint-5
    9 schema-io
    8 website-revamp-sprint-4
    8 usability
    7 test-failure
    7 patch
    7 JdbcIO
    6 Windowing
    6 security
    6 mongodb
    6 KafkaIO
    6 infra
    6 gsoc2020
    6 gsoc2019
    6 flink
    6 build
    6 backwards-incompatible
    5 website-revamp-sprint-9
    5 website-revamp-sprint-6
    5 website-revamp-sprint-10
    5 Triggers
    5 State
    5 outreachy19dec
    5 jenkins
    5 google-cloud-spanner
    5 easy-fix
    5 datastore
    4 windowing
    4 website-revamp-sprint-3
    4 thrift
    4 SQL
    4 Python
    4 MongoDB
    4 java11
    4 infrastructure
    4 FlinkRunner
    4 community-metrics
    4 beam-website-sprint-2
    4 beamsummitsponsor
    4 apache-beam
    3 website-revamp-sprint-12
    3 website-revamp-sprint-11
    3 testing
    3 sql
    3 spark
    3 python-wheel
    3 PubSubIO
    3 pubsubio
    3 pubsub
    3 portable-metrics-bugs
    3 P2
    3 noob
    3 metrics
    3 maven
    3 kafka
    3 jdbc
    3 intellij
    3 GSoC2019
    3 google-dataflow
    3 document
    3 community-onboarding
    3 cloud
    3 CI
    3 bundle
    3 bug
    3 azureblob
    2 website-revamp-sprint-7
    2 watermark
    2 test-failures
    2 test-fail
    2 Starter
    2 split
    2 spark-runner
    2 schema
    2 release
    2 regression
    2 reference
    2 python-packages
    2 PubsubLiteIO
    2 perfomance
    2 parallel-deployment
    2 MySQL
    2 MQTT
    2 mitigated
    2 join
    2 Jenkins
    2 javadoc
    2 Java8
    2 Java11
    2 IO
    2 has-pr
    2 gsod2019
    2 gsod
    2 gsoc2018
    2 gsoc2017
    2 google-cloud-bigquery
    2 golang
    2 gcs
    2 documentaion
    2 docker
    2 dataflow-runner-v2
    2 containers
    2 cassandra
    2 buid
    2 blocking-postcommit
    2 bigdata
    2 azure
    2 AWS
    1 www
    1 windows
    1 web
    1 Watermark
    1 vulnerabilities
    1 Update
    1 typo
    1 Triaged
    1 TFX+Beam
    1 text
    1 tests
    1 test-patch
    1 testlabel
    1 test-infra
    1 test-framework
    1 tensorflow-datasets
    1 tensorflow
    1 T5
    1 streaming
    1 storage
    1 starer
    1 SSLException
    1 SQS
    1 sql-engine
    1 spring-boot
    1 spotbugs
    1 spark-streaming
    1 sparkrunner
    1 spam
    1 Snappy
    1 SLF4J
    1 sideinput
    1 shade
    1 SESSION
    1 session
    1 serialization
    1 serializable
    1 sdk-py-core
    1 sdk
    1 savepoints
    1 S3
    1 runner
    1 restful
    1 requirements
    1 rabbitmq
    1 quickstart
    1 python-sqltransform
    1 python-conversion
    1 python3
    1 precommit
    1 pom.xml
    1 Periodic
    1 Parquet
    1 parquet
    1 parameter
    1 P3
    1 p2
    1 p1
    1 oracle
    1 OOM
    1 on-hold
    1 offset
    1 Novice
    1 node.js
    1 newbie,
    1 n00b
    1 multi-threading
    1 mongo
    1 low-hanging-fruit
    1 logging,
    1 log-aggregation
    1 log4j
    1 log
    1 Learning
    1 label123
    1 kubernetes
    1 kotlin
    1 kafkaio
    1 jdbc_connector
    1 JavaDoc
    1 java9
    1 Java
    1 I/O
    1 hash
    1 Guava
    1 gsoc2021
    1 Grouping
    1 gradle-wrapper
    1 google-cloud-dataflow
    1 google
    1 github
    1 Flink
    1 flakey
    1 file-component
    1 fieldtype
    1 feature-request
    1 failed-test
    1 experimental
    1 examples
    1 eos
    1 elasticsearch
    1 Eclipse
    1 EaseOfUse
    1 duplicate
    1 Documentation
    1 docuentation
    1 doc_cleanup
    1 Doc
    1 doc
    1 dependencies
    1 cross-platform
    1 Couchbase
    1 contribution-guide
    1 compile-error
    1 codehealth
    1 ClassNotFoundException
    1 ClassCastException
    1 CI/CD
    1 ci-builds
    1 ci
    1 calcite
    1 C4
    1 c
    1 blog
    1 blocking
    1 Bigtable
    1 aws-s3
    1 aws
    1 auth
    1 apex-runner
    1 apache
    1 annotation
    1 2.2.0
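Many of the labels above differ only in case or punctuation (for example python/Python or easyfix/easy-fix). Below is a minimal sketch of how such variants could be grouped automatically, assuming the counts are available as label/count pairs. The normalization rule (lowercase, keep only letters and digits) is my own assumption; it will not catch true synonyms such as newbie/starter or misspellings such as starer.

```python
import re
from collections import defaultdict


def normalize(label):
    """Lowercase the label and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", label.lower())


def group_variants(label_counts):
    """Group labels that normalize to the same key.

    Returns {key: [(label, count), ...]} only for keys with several spellings,
    with the most frequently used spelling listed first.
    """
    groups = defaultdict(list)
    for label, count in label_counts.items():
        groups[normalize(label)].append((label, count))
    return {
        key: sorted(variants, key=lambda item: (-item[1], item[0]))
        for key, variants in groups.items()
        if len(variants) > 1
    }


if __name__ == "__main__":
    # A few label counts copied from the list above.
    sample = {
        "python": 31, "Python": 4,
        "easyfix": 60, "easy-fix": 5,
        "GCP": 19, "gcp": 13,
        "io": 21, "IO": 2, "I/O": 1,
        "flake": 252,
    }
    for key, variants in sorted(group_variants(sample).items()):
        print(key, variants)
```

The most common spelling in each group is a natural candidate to keep if the variants are ever merged.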
On Thu, May 13, 2021 at 12:08 PM Kyle Weaver <kc...@google.com> wrote:
> It's a little cumbersome, but you can query JIRA and export a CSV with the
> labels, and run a script to count them. Also, it won't let you export
> results from a query with more than 1000 results.
>
> Here's the list from query "project = beam and created > startOfYear()"
>
> dataframe-api 45
> stale-P2 36
> currently-failing 31
> stale-assigned 31
> website-revamp-2020 28
> flake 27
> zetasql-java-udf 17
> portability-spark 6
> portability-flink 4
> test-failure 4
> starter 4
> MongoDB 3
> Python 3
> PubSubIO 3
> GCP 3
> pipeline-patterns 3
> newbie 2
> python 2
> PubsubLiteIO 2
> beam-fixit 2
> vulnerabilities 1
> documentation 1
> containers 1
> types 1
> mongo 1
> mongodb 1
> elasticsearch 1
> dataflow 1
> java 1
> Grouping 1
> Windowing 1
> Doc 1
> Learning 1
> ClassNotFoundException 1
> jdbc 1
> gcp 1
> pubsub 1
> pubsubio 1
> apache-beam 1
> ClassCastException 1
> JdbcIO 1
> MySQL 1
> easyfix 1
> pull-request-available 1
> gsoc 1
> gsoc2021 1
> mentor 1
> python-sqltransform 1
> OOM 1
> AWS 1
> multi-threading 1
> S3 1
> log4j 1
> log-aggregation 1
> "logging 1
> " 1
> SLF4J 1
> google-cloud-spanner 1
> kafka 1
> savepoints 1
> flaky-test 1
> website-revamp-sprint-12 1
> structured-streaming 1
> nexmark 1
>
>
> On Wed, May 12, 2021 at 3:10 PM Valentyn Tymofieiev <va...@google.com>
> wrote:
>
>> Is there a way to see the list of labels used in Beam? I found a
>> discussion on using the labels gadget and some SQL queries to pull the
>> labels[1], but did not find a way to use them. Does anyone have hands-on
>> experience with any of these approaches? Does adding a gadget require PMC
>> privileges?
>>
>> Thanks!
>>
>> [1]
>> https://community.atlassian.com/t5/Jira-questions/Is-there-a-way-to-get-a-list-of-all-labels-being-used-in-a/qaq-p/344778
>>
>> On Mon, Mar 29, 2021 at 10:59 AM Kenneth Knowles <ke...@apache.org> wrote:
>>
>>> We are down to about 550.
>>>
>>> I randomly selected some long-time contributors who I am sure know about
>>> components and priorities well enough. There are 10-15 issues across a
>>> number of people. If these are already good, then it would close out a lot
>>> of them and help focus on the ones that need attention.
>>>
>>> This Jira search searches by "current user" so you should see the bugs
>>> that you have reported that are still marked as "Triage Needed". Take a
>>> quick look and if you are confident you got the components, priority,
>>> labels (especially "currently-failing" and "flake") then you could bulk
>>> edit them to "Open" status:
>>>
>>>
>>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20%3D%20%22Triage%20Needed%22%20AND%20reporter%20in%20(currentUser())
>>>
>>> Kenn
>>>
>>> On Mon, Mar 15, 2021 at 10:28 AM Tyson Hamilton <ty...@google.com>
>>> wrote:
>>>
>>>> There is a 'Triaged' button that I click:
>>>> https://photos.app.goo.gl/Ub5Qwnpp6aFrmaDZ9
>>>>
>>>> On Mon, Mar 15, 2021 at 9:48 AM Alex Amato <aj...@google.com> wrote:
>>>>
>>>>> (Do I need certain permissions to be able to do this?)
>>>>>
>>>>> On Mon, Mar 15, 2021 at 9:47 AM Alex Amato <aj...@google.com> wrote:
>>>>>
>>>>>> Would you mind posting a screenshot of exactly where you are supposed
>>>>>> to click to move a Jira issue to "Open" status? I honestly can't find where
>>>>>> to click. I don't see the option in the edit dialog box.
>>>>>>
>>>>>> On Sun, Mar 14, 2021 at 8:03 PM Kenneth Knowles <ke...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> No need for feeling any guilt :-)
>>>>>>>
>>>>>>> I'm just hoping that by everyone randomly doing a very small amount
>>>>>>> of work, this could be in good shape very quickly. I've done a number of
>>>>>>> bulk edits like automated dependency upgrade requests which brings the
>>>>>>> number down to just over 600.
>>>>>>>
>>>>>>> Your message does highlight some easy cases: issues filed to track
>>>>>>> your own feature work. I did build automation for this: "On Issue Created"
>>>>>>> -> "If Assignee == Issue Creator" -> "Transition to 'Open'". If the
>>>>>>> automation isn't working, that can probably be fixed. Some of the issues
>>>>>>> might just predate the automation.
>>>>>>>
>>>>>>> To be super clear: I don't mean to ask anyone to waste time looking
>>>>>>> at things that don't need attention, but to be able to notice things that
>>>>>>> do need attention. I did a few manually too, and the components, issue
>>>>>>> type, and priority very often need fixing up. I especially want to get
>>>>>>> untriaged P0s and P1s to zero.
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Fri, Mar 12, 2021 at 5:07 PM Tyson Hamilton <ty...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm guilty of creating issues and not moving them to 'open'. I'll
>>>>>>>> do better to move them to open in the future. To recompense I will spend
>>>>>>>> some additional time triaging =)
>>>>>>>>
>>>>>>>> Thanks for the review of the flow.
>>>>>>>>
>>>>>>>> On Thu, Mar 11, 2021 at 12:39 PM Kenneth Knowles <ke...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> You may or may not think about this very often, but our Jira
>>>>>>>>> workflow goes like this:
>>>>>>>>>
>>>>>>>>> Needs Triage --> Open --> In Progress --> Resolved
>>>>>>>>>
>>>>>>>>> "Needs Triage" means someone needs to look at it briefly:
>>>>>>>>>
>>>>>>>>> - component(s)
>>>>>>>>> - label(s)
>>>>>>>>> - issue type
>>>>>>>>> - priority (see
>>>>>>>>> https://beam.apache.org/contribute/jira-priorities/)
>>>>>>>>> - if appropriate, ping someone or write to dev@ especially for
>>>>>>>>> P1 and P0
>>>>>>>>>
>>>>>>>>> Then transition the issue to "Open".
>>>>>>>>>
>>>>>>>>> Currently there is a big backlog but I don't think it is actually
>>>>>>>>> accurate. I also think we have enough people to keep up with this and even
>>>>>>>>> to eliminate the backlog pretty quickly.
>>>>>>>>>
>>>>>>>>> Here are some things you can do when you are waiting for Jenkins
>>>>>>>>> tests to complete:
>>>>>>>>>
>>>>>>>>> - check your assigned issues
>>>>>>>>> - open up this filter and triage a couple issues at random:
>>>>>>>>> https://issues.apache.org/jira/issues/?filter=12345682
>>>>>>>>>
>>>>>>>>> 800+ may seem like a lot, but dev@ had 65 participants in the
>>>>>>>>> last 28 days (126 participants in the last 3 months). I would guess it
>>>>>>>>> averages less than a minute per issue so this could be done in less than a
>>>>>>>>> day, especially considering our CI times :-)
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>>
Re: Please help triage issues!
Posted by Valentyn Tymofieiev <va...@google.com>.
Re: using filters to query several labels:
I created a filter [1] for open issues that carry any of the various
'flake' labels. The query should be editable by committers. The filter is
accessible via the short link http://s.apache.org/beam-flakes
There is already a filter for starter tasks[2], accessible by
http://s.apache.org/beam-starter-tasks. That filter is currently not editable,
but it captures the currently open starter issues. Other labels (simple,
noob, n00b, novice, Learning) are not included in this filter, but every
issue carrying one of them either also carries a label that is in the query
or is fixed now.
Finding all relevant filters/hotlists is tricky: there appear to be at
least 3 different ways, each returning some unique results:
- Filters with the word 'Beam' [3]
- Filters shared with 'Project' 'Beam' [4]
- Filters shared with 'Group' beam[5]
The links in [3]-[5] are not readily copyable from the Jira URLs (I had to
fish out the form IDs).
As a best practice, perhaps we should include the word 'Beam' in the filter
name, given that filter names share a global namespace, and make queries
editable by Beam committers. Then we could use a shortlink
http://s.apache.org/beam-filters (new link pointing to [5]) to look up
relevant filters, and modify queries as necessary.
To edit filter settings one can use a link like [6].
[1] https://issues.apache.org/jira/issues/?filter=12350929
[2] https://issues.apache.org/jira/issues/?filter=12343676
[3]
https://issues.apache.org/jira/secure/ManageFilters.jspa?search=search&searchName=Beam&searchShareType=any&roleShare=&returnUrl=ManageFilters.jspa&Search=Search
[4]
https://issues.apache.org/jira/secure/ManageFilters.jspa?search=search&searchName=&searchShareType=project&projectShare=12319527&roleShare=&returnUrl=ManageFilters.jspa&Search=Search&filterView=search
[5]
https://issues.apache.org/jira/secure/ManageFilters.jspa?search=search&searchName=&searchShareType=group&groupShare=beam&roleShare=&returnUrl=ManageFilters.jspa&Search=Search&filterView=search
[6]
https://issues.apache.org/jira/secure/EditFilter!default.jspa?filterId=12349495
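The filter-search links [3]-[5] differ only in a few query parameters, so they can be assembled instead of fished out of the page. A small sketch, assuming the parameter names seen in those links keep working; the function name filter_search_url and its defaults are hypothetical.

```python
from urllib.parse import urlencode

BASE = "https://issues.apache.org/jira/secure/ManageFilters.jspa"


def filter_search_url(share_type="any", name="", **share_params):
    """Build a ManageFilters search link.

    Parameter names are copied from links [3]-[5]. Extra keyword arguments
    (e.g. groupShare="beam") are placed right after searchShareType.
    """
    params = {"search": "search", "searchName": name, "searchShareType": share_type}
    params.update(share_params)
    params.update({"roleShare": "", "returnUrl": "ManageFilters.jspa", "Search": "Search"})
    return BASE + "?" + urlencode(params)


if __name__ == "__main__":
    # Same parameters as [3].
    print(filter_search_url(name="Beam"))
    # Same parameters as [4] and [5], without the trailing filterView=search.
    print(filter_search_url("project", projectShare="12319527"))
    print(filter_search_url("group", groupShare="beam"))
```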