Posted to users@apex.apache.org by "Mukkamula, Suryavamshivardhan (CWM-NR)" <su...@rbc.com> on 2016/07/06 20:26:26 UTC

Multiple directories

Hi,

Can you please let me know how I would add multiple directories to an operator that extends 'AbstractFileInputOperator'?

I would like to read from multiple directories with a single operator, selecting multiple files using 'filePatternRegExp'.

Regards,
Surya Vamshi

_______________________________________________________________________
If you received this email in error, please advise the sender (by return email or otherwise) immediately. You have consented to receive the attached electronically at the above-noted email address; please retain a copy of this confirmation for future reference.  


Re: Multiple directories

Posted by ganesh borate <ga...@gmail.com>.
I request that, for the time being, you stop sending me mail of this kind
related to your topic.
If I need your help with Apache Apex or Hadoop in the future, I will contact
you, but for now please stop sending mail.

Re: Multiple directories

Posted by Priyanka Gugale <pr...@datatorrent.com>.
In fact, you can take a look at FSInputModule in Malhar. This module filters
and reads all file blocks (for files matching the regex in all input
directories) for you. Check this JSON-based application for sample usage:
https://github.com/apache/apex-malhar/tree/master/apps/filecopy

The only catch, as I understand from your previous mails, is that you need
to read the directories serially, i.e. one after the other. This is not
directly supported, but you can check whether modifying the scanner thread in
FileSplitterInput would serve your needs.
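
To illustrate what reading directories "one after the other" could mean, here is a minimal sketch in plain Java. It does not use any Malhar classes and is not the FileSplitterInput scanner; the class name SerialDirectoryIterator and its constructor arguments are hypothetical. It assumes each directory is listed non-recursively and that the regex is applied to the file name only. The next directory is listed only once every matching file from the current one has been handed out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Apex Malhar.
class SerialDirectoryIterator implements Iterator<Path> {
    private final Deque<Path> pendingDirs;      // directories not yet listed, in input order
    private final Pattern fileNamePattern;      // applied to the file name, not the full path
    private Deque<Path> currentFiles = new ArrayDeque<>();

    SerialDirectoryIterator(List<Path> dirs, String fileNameRegex) {
        this.pendingDirs = new ArrayDeque<>(dirs);
        this.fileNamePattern = Pattern.compile(fileNameRegex);
    }

    @Override
    public boolean hasNext() {
        // Move on to the next directory only when the current one is drained.
        while (currentFiles.isEmpty() && !pendingDirs.isEmpty()) {
            currentFiles = listMatching(pendingDirs.poll());
        }
        return !currentFiles.isEmpty();
    }

    @Override
    public Path next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return currentFiles.poll();
    }

    // Lists regular files directly under dir whose names match, sorted by path.
    private Deque<Path> listMatching(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                .filter(Files::isRegularFile)
                .filter(p -> fileNamePattern.matcher(p.getFileName().toString()).matches())
                .sorted()
                .collect(Collectors.toCollection(ArrayDeque::new));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: SerialDirectoryIterator <fileNameRegex> <dir> [<dir> ...]");
            return;
        }
        List<Path> dirs = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            dirs.add(Paths.get(args[i]));
        }
        Iterator<Path> files = new SerialDirectoryIterator(dirs, args[0]);
        while (files.hasNext()) {
            System.out.println(files.next());
        }
    }
}
```

Listing each directory lazily keeps the ordering guarantee simple: a file from the second directory can never be emitted before the first directory is exhausted. A real scanner thread would also need to track already-seen files and rescan over time, which this sketch leaves out.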

-Priyanka



Re: Multiple directories

Posted by Priyanka Gugale <pr...@datatorrent.com>.
Hi,

Take a look at TimeBasedDirectoryScanner in FileSplitterInput. This scanner
accepts a list of files/directories to scan, and it also accepts a regex to
filter on file names. I think you can pick up ideas on how to scan multiple
directories from there.
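
As a rough illustration of the idea (a list of directories plus a file-name regex), here is a minimal sketch in plain Java. It is not the TimeBasedDirectoryScanner code and uses no Malhar classes; the class name MultiDirScanner and the scan method are hypothetical. It assumes a single, non-recursive pass over each directory and applies the regex to the file name only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper, not part of Apex Malhar.
class MultiDirScanner {

    // Returns the regular files directly under each directory whose file
    // names fully match the regex, in directory order, sorted within each directory.
    static List<Path> scan(List<Path> dirs, String fileNameRegex) throws IOException {
        Pattern pattern = Pattern.compile(fileNameRegex);
        List<Path> matches = new ArrayList<>();
        for (Path dir : dirs) {
            // Files.list is not recursive; the stream must be closed after use.
            try (Stream<Path> entries = Files.list(dir)) {
                entries
                    .filter(Files::isRegularFile)
                    .filter(p -> pattern.matcher(p.getFileName().toString()).matches())
                    .sorted()
                    .forEach(matches::add);
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: MultiDirScanner <fileNameRegex> <dir> [<dir> ...]");
            return;
        }
        List<Path> dirs = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            dirs.add(Paths.get(args[i]));
        }
        for (Path file : scan(dirs, args[0])) {
            System.out.println(file);
        }
    }
}
```

For example, running it with the arguments '.*\.csv' /data/in1 /data/in2 (made-up paths) would print the matching files from the first directory followed by those from the second.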

-Priyanka

RE: Multiple directories

Posted by "Mukkamula, Suryavamshivardhan (CWM-NR)" <su...@rbc.com>.
Hi Yunhan,

I am already using this example to read data from multiple directories in parallel. Here, each directory is given to an operator in parallel.

My requirement is that I would like to add multiple directories to a single operator.

Regards,
Surya Vamshi

Re: Multiple directories

Posted by Yunhan Wang <yu...@datatorrent.com>.
Hi Surya,

Please check our fileIO-multiDir example:
https://github.com/DataTorrent/examples/tree/master/tutorials/fileIO-multiDir
Hope this helps.

Thanks,
Yunhan
