Posted to dev@joshua.apache.org by Henry Saputra <he...@gmail.com> on 2016/07/22 19:50:04 UTC

Re: Language Pack English-Japanese

Hi Toshiki,

Let's have this kind of discussion on the dev@ list.

You can send the question to dev@joshua.incubator.apache.org.

Thanks,

Henry

On Thu, Jul 21, 2016 at 9:46 PM, IGA Tosiki <ig...@gmail.com> wrote:

> Hi Matt,
>
> Thanks for your reply!
>
> I'm happy to read your mail. I want to help you with the Japanese-English
> language pack.
> And yes, I mean translation memories in TMS/XLIFF. I can convert the
> TMS to whatever format you specify.
>
> I also know that English-to-Japanese translation is very difficult, but
> I believe a sample English-Japanese language pack will attract many
> Japanese people to use Joshua.
>
> Regards,
> Toshiki
>
> 2016-07-22 12:42 GMT+09:00 Matt Post <po...@cs.jhu.edu>:
> > Hi,
> >
> > There is no Japanese--English language pack, but I would be happy to
> > build one if you could help by pointing me to data. What we need is
> > parallel data in the form of sentences that are translations of each
> > other. If you have access to this or pointers to where I could find
> > some, I would be happy to build it. There are likely standard datasets
> > available; people like Graham Neubig (http://www.phontron.com) have
> > been working on this for a while.
> >
> > What are TMS and LTIFF? Are you talking about translation memories?
> >
> > As a side note, translation between English and Japanese is very
> > difficult and tends not to be very good. One approach that helps is
> > translating from trees and forests. Joshua does not have this
> > capability at the moment.
> >
> > Sincerely,
> > matt
> >
> >
> >> On Jul 21, 2016, at 11:28 PM, IGA Tosiki <ig...@gmail.com> wrote:
> >>
> >> Hi team,
> >>
> >> I have become interested in Joshua and its language packs. I am
> >> Japanese, and I would like to know about a Japanese language pack.
> >>
> >> Is there any plan to build a Japanese-English language pack?
> >> I believe TMS or LTIFF will be useful for building such a language
> >> pack. I have many OSS-based TMS between English and Japanese. Is there
> >> any path for using TMX or LTIFF as input for a Joshua language pack?
> >>
> >> Best regards,
> >> Toshiki Iga
> >
>

Re: Language Pack English-Japanese

Posted by Matt Post <po...@cs.jhu.edu>.
Hi IGA,

That would be great. 

There is also this collection of data for English/Japanese translation. If you collect and prepare all of this, I can then either help you build a model, or build it myself.

	http://www.phontron.com/japanese-translation-data.php

Sincerely,
Matt



> On Aug 5, 2016, at 5:22 AM, IGA Tosiki <ig...@gmail.com> wrote:
> 
> Hi Matt,
> 
> I can convert those XML en-ja pairs into whatever format you specify, if
> you think the pairs are useful and would like me to do so.
> 
> Regards,
> Toshiki
> 
> 2016-08-05 17:53 GMT+09:00 IGA Tosiki <ig...@gmail.com>:
>> Hi Matt,
>> 
>> I can share my en-ja parallel data.
>> 
>> https://osdn.jp/projects/blancofw/releases/52952
>> 
>> It consists of en-to-ja translation pairs for Eclipse IDE menus and
>> messages. It was translated by humans and also checked by humans.
>> 
>> Toshiki
>> 
>> 2016-08-04 22:02 GMT+09:00 Matt Post <po...@cs.jhu.edu>:
>>> Hi Toshiki,
>>> 
>>> Have you been able to gather any parallel data?
>>> 
>>> matt


Re: Language Pack English-Japanese

Posted by IGA Tosiki <ig...@gmail.com>.
Hi Matt,

I can convert those XML en-ja pairs into whatever format you specify, if
you think the pairs are useful and would like me to do so.

Regards,
Toshiki
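
As a rough illustration of the conversion offered above, here is a minimal sketch. It assumes the XML pairs are stored as TMX (tu/tuv/seg elements with an xml:lang attribute) and that the goal is two sentence lists aligned by index, i.e. the "sentences that are translations of each other" mentioned earlier in the thread. The function name tmx_to_parallel and the sample data are hypothetical and do not come from the thread; the exact input format Joshua expects is not stated here.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_to_parallel(tmx_text, src_lang="en", tgt_lang="ja"):
    """Return (source_sentences, target_sentences), aligned by index.

    Hypothetical helper: assumes TMX-style <tu>/<tuv>/<seg> markup.
    """
    root = ET.fromstring(tmx_text)
    src_lines, tgt_lines = [], []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            # Fall back to a plain "lang" attribute if xml:lang is absent.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Flatten inline markup and collapse whitespace to one line.
                text = " ".join("".join(seg.itertext()).split())
                # Treat "en-US" and "en" as the same language.
                segs[lang.split("-")[0]] = text
        # Keep only units that have non-empty text on both sides.
        if segs.get(src_lang) and segs.get(tgt_lang):
            src_lines.append(segs[src_lang])
            tgt_lines.append(segs[tgt_lang])
    return src_lines, tgt_lines


# Made-up sample; the second unit has no Japanese side and is skipped.
SAMPLE_TMX = """<tmx version="1.4">
  <header/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Open File</seg></tuv>
      <tuv xml:lang="ja"><seg>ファイルを開く</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Save</seg></tuv>
    </tu>
  </body>
</tmx>"""

if __name__ == "__main__":
    en, ja = tmx_to_parallel(SAMPLE_TMX)
    for e, j in zip(en, ja):
        print(e, "|||", j)
```

Writing each list to its own file, one sentence per line, would give a pair of line-aligned files; units missing either side are dropped so the two sides stay in step.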


Re: Language Pack English-Japanese

Posted by IGA Tosiki <ig...@gmail.com>.
Hi Matt,

I can share my en-ja parallel data.

https://osdn.jp/projects/blancofw/releases/52952

It consists of en-to-ja translation pairs for Eclipse IDE menus and
messages. It was translated by humans and also checked by humans.

Toshiki


Re: Language Pack English-Japanese

Posted by Matt Post <po...@cs.jhu.edu>.
Hi Toshiki,

Have you been able to gather any parallel data?

matt

