Posted to dev@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/09/23 03:28:29 UTC

[GitHub] [pulsar] tisonkun created a discussion: Lightweight Documentation Translation Solution

GitHub user tisonkun created a discussion: Lightweight Documentation Translation Solution

## Motivation

Three years ago we created the [pulsar-translation](https://github.com/apache/pulsar-translation) repository to try to handle documentation translation with Crowdin.

However, after three years, little (if any) content has been translated:

<img width="1728" alt="image" src="https://user-images.githubusercontent.com/18818196/191883024-498d920a-7397-4a2d-869c-e56e5ed98225.png">

As we migrate the official site to the new framework, several incompatibilities have surfaced between Crowdin and MDX files. MDX makes it much easier to embed JSX code or widgets, such as:

```
:::tip

blablabla

:::
```

Crowdin treats `tip` here as a translatable item and mangles the result.
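
To make this failure mode concrete, here is a minimal sketch of a check that a translated file still carries the same admonition markers as its source. The function names and the mangled sample string are hypothetical, made up for illustration; they are not taken from the Pulsar site tooling or from Crowdin's actual output.

```python
import re

# Matches a line that opens or closes an admonition, e.g. ":::tip" or ":::".
# The keyword after ":::" is captured; a closing ":::" captures "".
ADMONITION_LINE = re.compile(r"^\s*:::(\w*)")


def admonition_markers(text: str) -> list[str]:
    """Return the admonition keywords in document order ("" for a closing ':::')."""
    markers = []
    for line in text.splitlines():
        match = ADMONITION_LINE.match(line)
        if match:
            markers.append(match.group(1))
    return markers


def markers_preserved(source: str, translated: str) -> bool:
    """True if the translation kept every admonition marker exactly as in the source."""
    return admonition_markers(source) == admonition_markers(translated)


source = ":::tip\n\nblablabla\n\n:::\n"
# Hypothetical translations: one keeps the keyword, one translates it.
good = ":::tip\n\n等等等等\n\n:::\n"
mangled = ":::提示\n\n等等等等\n\n:::\n"

print(markers_preserved(source, good))
print(markers_preserved(source, mangled))
```

A check like this could run before the site build, so that a translated keyword is reported as a translation problem rather than surfacing as a broken build.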

@urfreespace may have more input on this kind of issue, which can break the website build. After [a failed full build yesterday](https://github.com/apache/pulsar-site/actions/runs/3103382004), we already [redirect all "i18n" pages to the default language](https://github.com/apache/pulsar-site/blob/607cee490f9cfb8de6fce918204742a1b7704b78/content/.htaccess#L5-L6).

## Proposal

Crowdin works well for documentation specialists who are already familiar with it. But most translation contributors to our project are likely to be developers, as is the case for [Flink](https://github.com/apache/flink/tree/master/docs/content.zh). For them, Git is preferable to Crowdin.

Also, Crowdin by itself doesn't cover the whole i18n story. Technically, we use [Docusaurus's i18n functionality](https://docusaurus.io/docs/i18n/introduction#translation-files-location) and generate files under `i18n` folders from Crowdin inputs with [homemade scripts](https://github.com/apache/pulsar-site/blob/1d116a036ce26a3321b3089b8a5406cddebda777/site2/tools/build-site.sh#L40-L68).

So we can use Docusaurus's i18n support directly, [as InLong does](https://github.com/apache/inlong-website/tree/master/i18n/zh-CN), to overcome these issues.
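
As a rough illustration of what using Docusaurus's i18n support directly means for contributors, the sketch below maps a source document to the location where Docusaurus looks for its translation. It assumes the default docs plugin layout, with sources under `docs/` and translations under `i18n/<locale>/docusaurus-plugin-content-docs/<version>/`. The function name is hypothetical, and the actual Pulsar site layout may differ.

```python
from pathlib import PurePosixPath


def translated_doc_path(doc_path: str, locale: str, version: str = "current") -> str:
    """Map a source doc path to its translated counterpart.

    Assumes the default Docusaurus docs plugin layout; `version` is "current"
    for unversioned docs, or a folder name such as "version-2.10.x".
    """
    # Drop the leading "docs/" so only the path inside the docs folder remains.
    relative = PurePosixPath(doc_path).relative_to("docs")
    return str(
        PurePosixPath("i18n", locale, "docusaurus-plugin-content-docs", version, relative)
    )


print(translated_doc_path("docs/getting-started.md", "zh-CN"))
```

With this layout a translation is an ordinary file in the repository, so it can be contributed and reviewed through a normal pull request.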

## Implementation

1. Remove the scaffolding that integrates with Crowdin. cc @urfreespace
2. Archive the pulsar-translation repo as it's no longer valid.
3. (Optional) If a new translation initiative starts, use Docusaurus's i18n support directly. As I have learned, there are several pieces of translated content to pick up.

## Risk

No risk, as these translations currently don't work at all.

## Appendix

Docusaurus does provide [support for integrating with Crowdin and discusses an MDX workaround](https://docusaurus.io/docs/i18n/crowdin#mdx-solutions). But it's less than ideal, and we don't need Crowdin in the first place. Also, this workaround cannot handle the case where we write descriptions inside an MDX block, such as the prompts for Tabs.

cc @tuhaihe @urfreespace @michaeljmarshall @dave2wave

GitHub link: https://github.com/apache/pulsar/discussions/17810

----
This is an automatically sent email for dev@pulsar.apache.org.
To unsubscribe, please send an email to: dev-unsubscribe@pulsar.apache.org


[GitHub] [pulsar] tisonkun edited a comment on the discussion: Archive Crowdin based translation initiative

Posted by GitBox <gi...@apache.org>.
GitHub user tisonkun edited a comment on the discussion: Archive Crowdin based translation initiative

@tuhaihe @Anonymitaet @urfreespace Thanks for your feedback. To keep the discussion from spreading to what the next translation solution should be, I have renamed the topic to "Archive Crowdin-based translation initiative", since an initiative without further contributors and no maintainer to shepherd it can be archived instead of being left alone.

Whether a CAT tool is a better choice, or whether Crowdin can be added back later, should be a separate topic. For now, we archive a stalled initiative to avoid the current build issues.

GitHub link: https://github.com/apache/pulsar/discussions/17810#discussioncomment-3738880


[GitHub] [pulsar] urfreespace added a comment to the discussion: Lightweight Documentation Translation Solution

Posted by GitBox <gi...@apache.org>.
GitHub user urfreespace added a comment to the discussion: Lightweight Documentation Translation Solution

Crowdin brings a lot of problems. It sometimes breaks tag structures in MDX, unpredictably, and these problems occur frequently. From a technical point of view, I think it is better to use Git to manage translated documents and to take part in translation within the community. Most of the people involved will be developers, and `git` is very friendly to them.

GitHub link: https://github.com/apache/pulsar/discussions/17810#discussioncomment-3731749


[GitHub] [pulsar] tuhaihe added a comment to the discussion: Lightweight Documentation Translation Solution

Posted by GitBox <gi...@apache.org>.
GitHub user tuhaihe added a comment to the discussion: Lightweight Documentation Translation Solution

> If Crowdin does not work well with Pulsar, it makes sense to choose another open-source CAT tool.

Yes, agree with this.

There are some great candidate CAT tools, such as:

- Weblate: https://weblate.org/en/
- Transifex: https://www.transifex.com

Both are used by some well-known open source projects. We can take them as references.

GitHub link: https://github.com/apache/pulsar/discussions/17810#discussioncomment-3731372


[GitHub] [pulsar] tisonkun edited a discussion: Archive Crowdin based translation initiative

Posted by GitBox <gi...@apache.org>.
GitHub user tisonkun edited a discussion: Archive Crowdin based translation initiative

## Motivation

Three years ago we created the [pulsar-translation](https://github.com/apache/pulsar-translation) repository to try to handle documentation translation with Crowdin.

However, after three years, little (if any) content has been translated:

> **NOTE**: I learned that we did once translate a few pages, but since we moved all documents to the new website framework that content was lost, and it would take another pass to pick up all the translations.

<img width="1728" alt="image" src="https://user-images.githubusercontent.com/18818196/191883024-498d920a-7397-4a2d-869c-e56e5ed98225.png">

As we migrate the official site to the new framework, several incompatibilities have surfaced between Crowdin and MDX files. MDX makes it much easier to embed JSX code or widgets, such as:

```
:::tip

blablabla

:::
```

Crowdin treats `tip` here as a translatable item and mangles the result.

@urfreespace may have more input on this kind of issue, which can break the website build. After [a failed full build yesterday](https://github.com/apache/pulsar-site/actions/runs/3103382004), we already [redirect all "i18n" pages to the default language](https://github.com/apache/pulsar-site/blob/607cee490f9cfb8de6fce918204742a1b7704b78/content/.htaccess#L5-L6).

Generally, an initiative without further contributors and no maintainer to shepherd it can be archived instead of being left alone.

## Proposal

Crowdin works well for documentation specialists who are already familiar with it. But most translation contributors to our project are likely to be developers, as is the case for [Flink](https://github.com/apache/flink/tree/master/docs/content.zh). For them, Git is preferable to Crowdin.

Also, Crowdin by itself doesn't cover the whole i18n story. Technically, we use [Docusaurus's i18n functionality](https://docusaurus.io/docs/i18n/introduction#translation-files-location) and generate files under `i18n` folders from Crowdin inputs with [homemade scripts](https://github.com/apache/pulsar-site/blob/1d116a036ce26a3321b3089b8a5406cddebda777/site2/tools/build-site.sh#L40-L68).

So we can use Docusaurus's i18n support directly, [as InLong does](https://github.com/apache/inlong-website/tree/master/i18n/zh-CN), to overcome these issues.

## Implementation

1. Remove the scaffolding that integrates with Crowdin. cc @urfreespace
2. Archive the pulsar-translation repo as it's no longer valid.
3. (Optional) If a new translation initiative starts, use Docusaurus's i18n support directly.

## Risk

No risk, as these translations currently don't work at all.

## Appendix

Docusaurus does provide [support for integrating with Crowdin and discusses an MDX workaround](https://docusaurus.io/docs/i18n/crowdin#mdx-solutions). But it's less than ideal, and we don't need Crowdin in the first place. Also, this workaround cannot handle the case where we write descriptions needing translation inside an MDX block, such as the prompts for Tabs.

cc @tuhaihe @urfreespace @michaeljmarshall @dave2wave

GitHub link: https://github.com/apache/pulsar/discussions/17810


[GitHub] [pulsar] comradekingu added a comment to the discussion: Archive Crowdin based translation initiative

Posted by GitBox <gi...@apache.org>.
GitHub user comradekingu added a comment to the discussion: Archive Crowdin based translation initiative

Par for the course when using Crowdin. There are no (?) successful libre software translation efforts on Transifex, but there are plenty on Weblate.

GitHub link: https://github.com/apache/pulsar/discussions/17810#discussioncomment-4200151


[GitHub] [pulsar] Anonymitaet added a comment to the discussion: Lightweight Documentation Translation Solution

Posted by GitBox <gi...@apache.org>.
GitHub user Anonymitaet added a comment to the discussion: Lightweight Documentation Translation Solution

@tisonkun thanks for raising this!

If Crowdin does not work well with Pulsar, it makes sense to choose another open-source CAT tool rather than git/markdown since CAT can improve overall efficiency with various benefits/features like:

- Terminology management
- Workflow management
- Translation quality assurance
- Localisation automation
- Integration with a variety of CMSs and developer’s tools
- Translating via API
- Project management automation



GitHub link: https://github.com/apache/pulsar/discussions/17810#discussioncomment-3730319
