Posted to dev@joshua.apache.org by Henry Saputra <he...@gmail.com> on 2016/07/22 19:50:04 UTC
Re: Language Pack English-Japanese
Hi Toshiki,
Let's have this kind of discussion on the dev@ list.
You can send the question to dev@joshua.incubator.apache.org.
Thanks,
Henry
On Thu, Jul 21, 2016 at 9:46 PM, IGA Tosiki <ig...@gmail.com> wrote:
> Hi Matt,
>
> Thanks for your reply!
>
> I'm happy to read your mail. I want to help you with the Japanese-English
> language pack.
> And yes, I mean translation memories in TMS/XLIFF. I can convert the
> TMS to whatever format you specify.
>
> I also know that English-to-Japanese translation is very difficult, but I
> believe a sample English-Japanese language pack will attract many
> Japanese people to use Joshua.
>
> Regards,
> Toshiki
>
> 2016-07-22 12:42 GMT+09:00 Matt Post <po...@cs.jhu.edu>:
> > Hi,
> >
> > There is no Japanese--English language pack, but I would be happy to
> > build one if you could help by pointing me to data. What we need is
> > parallel data in the form of sentences that are translations of each other.
> > If you have access to this or pointers to where I could find some, I would
> > be happy to build it. There are likely standard datasets available; people
> > like Graham Neubig (http://www.phontron.com) have been working on this
> > for a while.
> >
> > What are TMS and LTIFF? Are you talking about translation memories?
> >
> > As a side note, translation between English and Japanese is very
> > difficult and tends not to be very good. One approach that helps is
> > translating from trees and forests. Joshua does not have this capability at
> > the moment.
> >
> > Sincerely,
> > matt
> >
> >
> >> On Jul 21, 2016, at 11:28 PM, IGA Tosiki <ig...@gmail.com> wrote:
> >>
> >> Hi team,
> >>
> >> I am interested in Joshua and its language packs. I am Japanese, and I
> >> would like to know about a Japanese language pack.
> >>
> >> Is there any plan to build a Japanese-English language pack?
> >> I believe TMS or LTIFF will be useful for building such a language pack. I
> >> have many OSS-based TMS between English and Japanese. Is there any path
> >> for using TMX or LTIFF as input for a Joshua language pack?
> >>
> >> Best regards,
> >> Toshiki Iga
> >
>
Re: Language Pack English-Japanese
Posted by Matt Post <po...@cs.jhu.edu>.
Hi IGA,
That would be great.
There is also this collection of data for English/Japanese translation. If you collect and prepare all of this, I can then either help you build a model, or build it myself.
http://www.phontron.com/japanese-translation-data.php
Sincerely,
Matt
> On Aug 5, 2016, at 5:22 AM, IGA Tosiki <ig...@gmail.com> wrote:
>
> Hi Matt,
>
> I can convert those XML en-ja pairs into another format that you specify,
> if you think the pairs are useful and you want me to do so.
>
> Regards,
> Toshiki
>
> 2016-08-05 17:53 GMT+09:00 IGA Tosiki <ig...@gmail.com>:
>> Hi Matt,
>>
>> I can share my en-ja parallel data.
>>
>> https://osdn.jp/projects/blancofw/releases/52952
>>
>> It consists of en-to-ja translation pairs for Eclipse IDE menus and
>> messages. It was translated by humans and also checked by humans.
>>
>> Toshiki
>>
>> 2016-08-04 22:02 GMT+09:00 Matt Post <po...@cs.jhu.edu>:
>>> Hi Toshiki,
>>>
>>> Have you been able to gather any parallel data?
>>>
>>> matt
Re: Language Pack English-Japanese
Posted by IGA Tosiki <ig...@gmail.com>.
Hi Matt,
I can convert those XML en-ja pairs into another format that you specify,
if you think the pairs are useful and you want me to do so.
Regards,
Toshiki
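For reference, a conversion of the kind discussed here might look roughly like the sketch below. It is only an illustration under stated assumptions: it assumes the XML is TMX-style (`<tu>` units holding `<tuv xml:lang="...">` variants, each with a `<seg>`), which the thread does not confirm for the shared data, and the names `tmx_to_pairs` and `write_parallel` are hypothetical. It produces two line-aligned plain-text files, one sentence per line, which is a common layout for parallel data; the exact input Joshua expects is not stated in the thread.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_to_pairs(tmx_text, src="en", tgt="ja"):
    """Return (source, target) segment pairs from TMX-style XML text.

    Assumption: each <tu> holds one <tuv> per language with a <seg> child.
    Units missing either language are skipped.
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            # Older TMX files may use a plain "lang" attribute instead.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is None:
                continue
            # Collapse whitespace so every segment stays on a single line.
            text = " ".join("".join(seg.itertext()).split())
            # Reduce tags such as "en-US" to the primary language code.
            segs[lang.split("-")[0]] = text
        if segs.get(src) and segs.get(tgt):
            pairs.append((segs[src], segs[tgt]))
    return pairs


def write_parallel(pairs, src_path, tgt_path):
    """Write pairs to two files so that line N of each file corresponds."""
    with open(src_path, "w", encoding="utf-8") as fs, \
         open(tgt_path, "w", encoding="utf-8") as ft:
        for s, t in pairs:
            fs.write(s + "\n")
            ft.write(t + "\n")
```

A caller would read the XML file as text, pass it to `tmx_to_pairs`, and hand the result to `write_parallel` with two output paths of its choosing (for example one file per language). Skipping incomplete units keeps the two files aligned line by line, which is the property that matters for training data.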
Re: Language Pack English-Japanese
Posted by IGA Tosiki <ig...@gmail.com>.
Hi Matt,
I can share my en-ja parallel data.
https://osdn.jp/projects/blancofw/releases/52952
It consists of en-to-ja translation pairs for Eclipse IDE menus and
messages. It was translated by humans and also checked by humans.
Toshiki
Re: Language Pack English-Japanese
Posted by Matt Post <po...@cs.jhu.edu>.
Hi Toshiki,
Have you been able to gather any parallel data?
matt