Posted to dev@apex.apache.org by Isha Arkatkar <is...@datatorrent.com> on 2015/12/03 01:43:49 UTC

Best way to trigger start reading directory

Hi all,

   I have an application with 2 input file reader operators. In this case, I
want to start reading from the 2nd input location only after the 1st
operator is done reading. What is the best way to do this?

Thanks!
Isha

Re: Best way to trigger start reading directory

Posted by Isha Arkatkar <is...@datatorrent.com>.
Thanks Gaurav and Sandeep. I finally went ahead with the IdleTimeHandler
approach, as it worked well in the given scenario and also with operator
restart.
I forgot to update the thread afterwards, hence the late mail. :)

Thanks!
Isha

On Wed, Dec 2, 2015 at 11:57 PM, Sandeep Deshmukh <sa...@datatorrent.com>
wrote:

> Yes. Thanks Gaurav.
>
> Regards,
> Sandeep

Re: Best way to trigger start reading directory

Posted by Sandeep Deshmukh <sa...@datatorrent.com>.
Yes. Thanks Gaurav.

Regards,
Sandeep

On Thu, Dec 3, 2015 at 12:31 PM, Gaurav Gupta <ga...@datatorrent.com>
wrote:

> Sandeep,
>
> I hope this documentation
> https://www.datatorrent.com/docs/apidocs/com/datatorrent/api/Operator.IdleTimeHandler.html
> answers your question.
>
> Thanks
> - Gaurav

Re: Best way to trigger start reading directory

Posted by Gaurav Gupta <ga...@datatorrent.com>.
Sandeep, 

I hope this documentation https://www.datatorrent.com/docs/apidocs/com/datatorrent/api/Operator.IdleTimeHandler.html answers your question.

Thanks
- Gaurav

> On Dec 2, 2015, at 10:56 PM, Sandeep Deshmukh <sa...@datatorrent.com> wrote:
> 
> Is handleIdleTime guaranteed to be called? Can there be a case where the
> machine is loaded and hence doesn't have any cycles to invoke this extra
> functionality?
> 
> Regards,
> Sandeep


Re: Best way to trigger start reading directory

Posted by Sandeep Deshmukh <sa...@datatorrent.com>.
Is handleIdleTime guaranteed to be called? Can there be a case where the
machine is loaded and hence doesn't have any cycles to invoke this extra
functionality?

Regards,
Sandeep

On Thu, Dec 3, 2015 at 12:17 PM, Isha Arkatkar <is...@datatorrent.com> wrote:

> Cool! I did not know that :)
> Will try this approach too!
>
> Thanks,
> Isha

Re: Best way to trigger start reading directory

Posted by Isha Arkatkar <is...@datatorrent.com>.
Cool! I did not know that :)
Will try this approach too!

Thanks,
Isha

On Wed, Dec 2, 2015 at 10:45 PM, Gaurav Gupta <ga...@datatorrent.com>
wrote:

> Here is code snippet
>
> public class DownStreamReceiver extends AbstractFileInputOperator
>     implements Operator.IdleTimeHandler {
>   @Override
>   public void handleIdleTime()
>   {
>     // this is set to true only after receiving the trigger from 1st reader
>     if (upstreamDoneReading) {
>       emitTuples();
>     }
>   }
> }
>
> Thanks
> - Gaurav

Re: Best way to trigger start reading directory

Posted by Gaurav Gupta <ga...@datatorrent.com>.
Here is a code snippet:

public class DownStreamReceiver extends AbstractFileInputOperator implements Operator.IdleTimeHandler
{
  // Set to true only after receiving the trigger from the 1st reader.
  private boolean upstreamDoneReading;

  @Override
  public void handleIdleTime()
  {
    if (upstreamDoneReading) {
      emitTuples();
    }
  }
}

Thanks
- Gaurav
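
The gating idea in the snippet above can be tried out without a cluster. The following is a minimal, self-contained sketch and not Apex code: the IdleHandler interface, the SecondReader class and the onTrigger method are hypothetical stand-ins invented for illustration, and the real operator would extend AbstractFileInputOperator and implement Operator.IdleTimeHandler instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GatedReaderSketch {
    /** Hypothetical stand-in for an idle-time callback; not the Apex interface itself. */
    interface IdleHandler {
        void handleIdleTime();
    }

    /** Second reader: emits pending lines only after the trigger has arrived. */
    static class SecondReader implements IdleHandler {
        private final List<String> pending;
        final List<String> emitted = new ArrayList<>();
        private boolean upstreamDoneReading = false;

        SecondReader(List<String> lines) {
            this.pending = new ArrayList<>(lines);
        }

        /** Called when the (hypothetical) trigger from the first reader is received. */
        void onTrigger() {
            upstreamDoneReading = true;
        }

        /** Emits at most one pending line per call, mimicking one emitTuples() pass. */
        void emitTuples() {
            if (!pending.isEmpty()) {
                emitted.add(pending.remove(0));
            }
        }

        @Override
        public void handleIdleTime() {
            // Do nothing until the first reader has signalled completion.
            if (upstreamDoneReading) {
                emitTuples();
            }
        }
    }

    public static void main(String[] args) {
        SecondReader reader = new SecondReader(Arrays.asList("a", "b"));
        reader.handleIdleTime();            // no trigger yet, so nothing is emitted
        System.out.println(reader.emitted); // prints []
        reader.onTrigger();
        reader.handleIdleTime();
        reader.handleIdleTime();
        System.out.println(reader.emitted); // prints [a, b]
    }
}
```

Keeping the flag as ordinary operator state is a deliberate choice in this sketch: if the flag were checkpointed together with the rest of the operator, a restart would restore it along with the read position, which may be why this approach held up under operator restart.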

> On Dec 2, 2015, at 10:41 PM, Gaurav Gupta <ga...@datatorrent.com> wrote:
> 
> Isha,
> 
> I know that if you add an input port it becomes a generic operator. For that case you can use IdleTimeHandler and, in handleIdleTime, call emitTuples only if you have received the trigger from the 1st reader. Hope that helps.
> 
> Thanks
> - Gaurav


Re: Best way to trigger start reading directory

Posted by Gaurav Gupta <ga...@datatorrent.com>.
Isha,

I know that if you add an input port it becomes a generic operator. For that case you can use IdleTimeHandler and, in handleIdleTime, call emitTuples only if you have received the trigger from the 1st reader. Hope that helps.

Thanks
- Gaurav

> On Dec 2, 2015, at 10:34 PM, Isha Arkatkar <is...@datatorrent.com> wrote:
> 
> Hey Gaurav,
> 
>  No, that does not work, I am afraid. I tried the same thing first. But when
> you have a connected input port, even if it is on an input operator, the
> operator type changes from INPUT to GENERIC. emitTuples is invoked in a loop
> only for InputNode, so the operator emits nothing if I add a connected input
> stream to it.
> I think, though, this might be nice to have if sending triggers between
> input operators is a commonly observed pattern.
> 
> I will try out the StatsListener approach.
> 
> Thanks,
> Isha
> 
> 
> 
> 
> 
> On Wed, Dec 2, 2015 at 10:10 PM, Gaurav Gupta <ga...@datatorrent.com>
> wrote:
> 
>> Can this not be done as follows?
>> 
>> The 2nd reader has an input port which is connected to an output port of
>> the 1st reader. Once the 1st reader is done reading, it can send a trigger
>> to the 2nd reader over the output port, and the 2nd reader starts reading
>> once it gets the trigger.
>> 
>> Thanks
>> - Gaurav
>> 
>>> On Dec 2, 2015, at 7:19 PM, Sandeep Deshmukh <sa...@datatorrent.com>
>> wrote:
>>> 
>>> You can use a StatsListener shared between the two operators to trigger this.
>>> 
>>> Regards,
>>> Sandeep
>>> 
>>> On Thu, Dec 3, 2015 at 8:08 AM, Isha Arkatkar <is...@datatorrent.com>
>> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> Yes, that should work. But suppose the first operator rolled back to a
>>>> previous checkpoint due to some failover; the state of the application
>>>> would be reset, except that the 2nd file reader, which was not emitting
>>>> anything in those windows, will now continue to emit tuples. That will
>>>> make the state inconsistent.
>>>> Maybe I can create the link from a downstream operator after the first
>>>> reader is done to handle that.
>>>> 
>>>> Thanks,
>>>> Isha
>>>> 
>>>> On Wed, Dec 2, 2015 at 5:02 PM, Munagala Ramanath <ra...@datatorrent.com>
>>>> wrote:
>>>> 
>>>>> One way (a bit hacky) is to have the 2nd operator monitor an empty
>>>>> directory. Then, when the 1st is done reading, it sends a control tuple
>>>>> to a "file-link" operator; that operator creates symbolic links from the
>>>>> directory monitored by the 2nd operator to the actual input files.
>>>>> That should then trigger the 2nd input operator to do its thing.
>>>>> 
>>>>> Ram
>>>>> 
>>>>> On Wed, Dec 2, 2015 at 4:43 PM, Isha Arkatkar <is...@datatorrent.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> I have an application with 2 input file reader operators. In this case,
>>>>>> I want to start reading from the 2nd input location only after the 1st
>>>>>> operator is done reading. What is the best way to do this?
>>>>>> 
>>>>>> Thanks!
>>>>>> Isha
>>>>>> 
>>>>> 
>>>> 
>> 
>> 


Re: Best way to trigger start reading directory

Posted by Isha Arkatkar <is...@datatorrent.com>.
Hey Gaurav,

  No, I am afraid that does not work. I tried the same thing first. But when
an operator has a connected input port, even if it is an input operator, its
type changes from INPUT to GENERIC. emitTuples is invoked in a loop only for
InputNode, so the operator emits nothing once I connect an input stream to it.
I think, though, this might be nice to have if sending triggers between
input operators is a commonly observed pattern.
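
To make the INPUT vs GENERIC behaviour described above concrete, here is a toy model in plain Java. It is only a sketch of the reasoning: NodeTypeDemo, Reader, classify and runWindow are hypothetical names invented for illustration and are not Apex classes or the engine's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types exist in Apex.
final class NodeTypeDemo {
    enum NodeType { INPUT, GENERIC }

    // Stand-in for a file reader whose emitTuples() produces one tuple per call.
    static final class Reader {
        final boolean hasConnectedInputPort;
        final List<String> emitted = new ArrayList<>();

        Reader(boolean hasConnectedInputPort) {
            this.hasConnectedInputPort = hasConnectedInputPort;
        }

        void emitTuples() {
            emitted.add("line-" + emitted.size());
        }
    }

    // Assumption taken from the message above: a connected input port
    // turns the operator into a GENERIC one.
    static NodeType classify(Reader reader) {
        return reader.hasConnectedInputPort ? NodeType.GENERIC : NodeType.INPUT;
    }

    // Only INPUT nodes get the emitTuples() loop; a GENERIC node is never asked to emit.
    static void runWindow(Reader reader, int iterations) {
        if (classify(reader) != NodeType.INPUT) {
            return;
        }
        for (int i = 0; i < iterations; i++) {
            reader.emitTuples();
        }
    }
}
```

In this model a Reader created with hasConnectedInputPort set to true never has emitTuples() called, which matches what I saw when I wired the trigger port.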

I will try out the StatsListener approach.

Thanks,
Isha





On Wed, Dec 2, 2015 at 10:10 PM, Gaurav Gupta <ga...@datatorrent.com>
wrote:

> Can this not be done as follows
>
> 2nd reader has an input port which is connected to output port of 1st
> reader. Once the 1st reader is done reading it can send trigger to 2nd
> reader over the output port and 2nd reader starts reading once it gets
> trigger.
>
> Thanks
> - Gaurav
>
> > On Dec 2, 2015, at 7:19 PM, Sandeep Deshmukh <sa...@datatorrent.com>
> wrote:
> >
> > You use StatsListener shared between two operators to trigger this.
> >
> > Regards,
> > Sandeep
> >
> > On Thu, Dec 3, 2015 at 8:08 AM, Isha Arkatkar <is...@datatorrent.com>
> wrote:
> >
> >> Hi,
> >>
> >>    Yes, that should work. But suppose first operator rolled back to
> >> previous checkpoint due to some fail over, the state of application
> would
> >> be reset. Except the part that the 2nd file reader which was not
> emitting
> >> anything in those windows, now will continue to emit tuples. That will
> make
> >> the state inconsistent.
> >> May be I can create the link from downstream operator after first
> reader is
> >> done to handle that.
> >>
> >> Thanks,
> >> Isha
> >>
> >> On Wed, Dec 2, 2015 at 5:02 PM, Munagala Ramanath <ra...@datatorrent.com>
> >> wrote:
> >>
> >>> One way (a bit hacky) is to have the 2nd operator monitor an empty
> >>> directory. Then, when the
> >>> 1st is done reading, it sends a control tuple to a "file-link"
> operator;
> >>> that operator creates
> >>> symbolic links from the directory monitored by the 2nd operator to the
> >>> actual input files.
> >>> That should then trigger the 2nd input operator to do its thing.
> >>>
> >>> Ram
> >>>
> >>> On Wed, Dec 2, 2015 at 4:43 PM, Isha Arkatkar <is...@datatorrent.com>
> >>> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>>   I have an application with 2 input file reader operators. In this
> >>> case,
> >>>> want to trigger start reading from 2nd input location, only after 1st
> >>>> operator is done reading. What is the best way to do this?
> >>>>
> >>>> Thanks!
> >>>> Isha
> >>>>
> >>>
> >>
>
>

Re: Best way to trigger start reading directory

Posted by Gaurav Gupta <ga...@datatorrent.com>.
Can this not be done as follows?

The 2nd reader has an input port that is connected to the output port of the 1st reader. Once the 1st reader is done reading, it can send a trigger to the 2nd reader over the output port, and the 2nd reader starts reading once it gets the trigger.

Thanks
- Gaurav

> On Dec 2, 2015, at 7:19 PM, Sandeep Deshmukh <sa...@datatorrent.com> wrote:
> 
> You use StatsListener shared between two operators to trigger this.
> 
> Regards,
> Sandeep
> 
> On Thu, Dec 3, 2015 at 8:08 AM, Isha Arkatkar <is...@datatorrent.com> wrote:
> 
>> Hi,
>> 
>>    Yes, that should work. But suppose first operator rolled back to
>> previous checkpoint due to some fail over, the state of application would
>> be reset. Except the part that the 2nd file reader which was not emitting
>> anything in those windows, now will continue to emit tuples. That will make
>> the state inconsistent.
>> May be I can create the link from downstream operator after first reader is
>> done to handle that.
>> 
>> Thanks,
>> Isha
>> 
>> On Wed, Dec 2, 2015 at 5:02 PM, Munagala Ramanath <ra...@datatorrent.com>
>> wrote:
>> 
>>> One way (a bit hacky) is to have the 2nd operator monitor an empty
>>> directory. Then, when the
>>> 1st is done reading, it sends a control tuple to a "file-link" operator;
>>> that operator creates
>>> symbolic links from the directory monitored by the 2nd operator to the
>>> actual input files.
>>> That should then trigger the 2nd input operator to do its thing.
>>> 
>>> Ram
>>> 
>>> On Wed, Dec 2, 2015 at 4:43 PM, Isha Arkatkar <is...@datatorrent.com>
>>> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>>   I have an application with 2 input file reader operators. In this
>>> case,
>>>> want to trigger start reading from 2nd input location, only after 1st
>>>> operator is done reading. What is the best way to do this?
>>>> 
>>>> Thanks!
>>>> Isha
>>>> 
>>> 
>> 


Re: Best way to trigger start reading directory

Posted by Sandeep Deshmukh <sa...@datatorrent.com>.
You can use a StatsListener shared between the two operators to trigger this.

Regards,
Sandeep

On Thu, Dec 3, 2015 at 8:08 AM, Isha Arkatkar <is...@datatorrent.com> wrote:

> Hi,
>
>     Yes, that should work. But suppose first operator rolled back to
> previous checkpoint due to some fail over, the state of application would
> be reset. Except the part that the 2nd file reader which was not emitting
> anything in those windows, now will continue to emit tuples. That will make
> the state inconsistent.
> May be I can create the link from downstream operator after first reader is
> done to handle that.
>
> Thanks,
> Isha
>
> On Wed, Dec 2, 2015 at 5:02 PM, Munagala Ramanath <ra...@datatorrent.com>
> wrote:
>
> > One way (a bit hacky) is to have the 2nd operator monitor an empty
> > directory. Then, when the
> > 1st is done reading, it sends a control tuple to a "file-link" operator;
> > that operator creates
> > symbolic links from the directory monitored by the 2nd operator to the
> > actual input files.
> > That should then trigger the 2nd input operator to do its thing.
> >
> > Ram
> >
> > On Wed, Dec 2, 2015 at 4:43 PM, Isha Arkatkar <is...@datatorrent.com>
> > wrote:
> >
> > > Hi all,
> > >
> > >    I have an application with 2 input file reader operators. In this
> > case,
> > > want to trigger start reading from 2nd input location, only after 1st
> > > operator is done reading. What is the best way to do this?
> > >
> > > Thanks!
> > > Isha
> > >
> >
>

Re: Best way to trigger start reading directory

Posted by Isha Arkatkar <is...@datatorrent.com>.
Hi,

    Yes, that should work. But suppose the first operator rolls back to a
previous checkpoint due to a failover: the state of the application would
be reset, except that the 2nd file reader, which was not emitting anything
in those windows, would now continue to emit tuples. That would make
the state inconsistent.
Maybe I can create the links from a downstream operator after the first
reader is done to handle that.

Thanks,
Isha

On Wed, Dec 2, 2015 at 5:02 PM, Munagala Ramanath <ra...@datatorrent.com>
wrote:

> One way (a bit hacky) is to have the 2nd operator monitor an empty
> directory. Then, when the
> 1st is done reading, it sends a control tuple to a "file-link" operator;
> that operator creates
> symbolic links from the directory monitored by the 2nd operator to the
> actual input files.
> That should then trigger the 2nd input operator to do its thing.
>
> Ram
>
> On Wed, Dec 2, 2015 at 4:43 PM, Isha Arkatkar <is...@datatorrent.com>
> wrote:
>
> > Hi all,
> >
> >    I have an application with 2 input file reader operators. In this
> case,
> > want to trigger start reading from 2nd input location, only after 1st
> > operator is done reading. What is the best way to do this?
> >
> > Thanks!
> > Isha
> >
>

Re: Best way to trigger start reading directory

Posted by Munagala Ramanath <ra...@datatorrent.com>.
One way (a bit hacky) is to have the 2nd operator monitor an empty
directory. Then, when the 1st is done reading, it sends a control tuple
to a "file-link" operator; that operator creates symbolic links from the
directory monitored by the 2nd operator to the actual input files.
That should then trigger the 2nd input operator to do its thing.
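
A minimal sketch of the linking step that the "file-link" operator would perform, assuming a local file system reachable through java.nio. FileLinker and linkAll are hypothetical names, not Apex APIs, and a Hadoop-based deployment would need the equivalent calls for its own file system.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

// Hypothetical helper, not part of Apex: populates the (initially empty)
// directory watched by the 2nd reader with links to the real input files.
final class FileLinker {
    private FileLinker() {
    }

    // Links every regular file in sourceDir into watchedDir and
    // returns the number of links created by this call.
    static int linkAll(Path sourceDir, Path watchedDir) throws IOException {
        Files.createDirectories(watchedDir);
        int created = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(sourceDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                Path link = watchedDir.resolve(file.getFileName());
                // Skip names that already exist so a replayed trigger is harmless.
                if (Files.exists(link, LinkOption.NOFOLLOW_LINKS)) {
                    continue;
                }
                Files.createSymbolicLink(link, file.toAbsolutePath());
                created++;
            }
        }
        return created;
    }
}
```

The operator would call linkAll once it receives the control tuple. Skipping existing links is a design choice in this sketch so that the call can be repeated safely if the tuple is delivered again.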

Ram

On Wed, Dec 2, 2015 at 4:43 PM, Isha Arkatkar <is...@datatorrent.com> wrote:

> Hi all,
>
>    I have an application with 2 input file reader operators. In this case,
> want to trigger start reading from 2nd input location, only after 1st
> operator is done reading. What is the best way to do this?
>
> Thanks!
> Isha
>