Posted to dev@spot.apache.org by Tadd Wood <ta...@apache.org> on 2020/01/14 23:51:10 UTC

[SPOT-INGEST] Ingest file organization

I noticed that SPOT-141 (a new kind of Spot Ingest, using PySpark
Streaming) overlaid its new code on top of the old code in /spot-ingest/.
When debugging, this makes it hard to determine which files belong to the
new ingest process and which to the old one. We should split them apart.
Thoughts?

Thank you,
Tadd Wood

Re: [SPOT-INGEST] Ingest file organization

Posted by Jeremy Nelson <je...@digitalminion.com>.
I think that's ok.  Since the ingester has to be copied to each node on
which it runs, we could probably just put in the documentation to make sure
users know they should pick one (or the other) based on their needs, and
adjust their automation scripts accordingly.

Additionally, your assistance in figuring out which files belong to which
ingester would be greatly appreciated.  I've looked at this, and I would
love to take a crack at moving the files to their new home(s).

Thanks,
Jeremy


On Wed, Jan 15, 2020 at 2:57 AM Kostas Tzoulas <kt...@gmail.com> wrote:

> Or even better
>
> ./spot-ingest/python
>
> ./spot-ingest/pyspark-streaming
>
> Also, I could help to track down the files that were created for ingestion.
>
>
> - kostas

Re: [SPOT-INGEST] Ingest file organization

Posted by Tadd Wood <ta...@apache.org>.
That would be really helpful, Kostas.  I'll create a JIRA for this today if
you can work on tracking the files down in the meantime.

Thank you,
Tadd Wood

On Wed, Jan 15, 2020 at 12:57 AM Kostas Tzoulas <kt...@gmail.com> wrote:

> Or even better
>
> ./spot-ingest/python
>
> ./spot-ingest/pyspark-streaming
>
> Also, I could help to track down the files that were created for ingestion.
>
>
> - kostas

Re: [SPOT-INGEST] Ingest file organization

Posted by Kostas Tzoulas <kt...@gmail.com>.
Or even better

./spot-ingest/python

./spot-ingest/pyspark-streaming

Also, I could help to track down the files that were created for ingestion.


- kostas
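
As a rough illustration of the split proposed above, here is a minimal
Python sketch that plans where each file under spot-ingest/ would land once
the files belonging to the PySpark Streaming ingester have been identified.
The function name plan_moves and the file names in the example are
hypothetical placeholders, not actual Spot files; the real list of
streaming files would have to come from tracking down what SPOT-141 added.

```python
from pathlib import PurePosixPath

# Target layout taken from the proposal in this thread.
ROOT = PurePosixPath("spot-ingest")
LEGACY_DIR = ROOT / "python"
STREAMING_DIR = ROOT / "pyspark-streaming"


def plan_moves(all_files, streaming_files):
    """Return (source, destination) pairs splitting files under ROOT by ingester.

    all_files: every tracked file path under spot-ingest/ (as strings).
    streaming_files: the subset known to belong to the streaming ingester.
    """
    streaming = set(streaming_files)
    moves = []
    for src in sorted(all_files):
        target_root = STREAMING_DIR if src in streaming else LEGACY_DIR
        # Keep the path relative to spot-ingest/ so sub-directories survive the move.
        relative = PurePosixPath(src).relative_to(ROOT)
        moves.append((src, str(target_root / relative)))
    return moves


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    files = [
        "spot-ingest/hypothetical_master.py",
        "spot-ingest/common/hypothetical_stream_listener.py",
    ]
    streaming_only = ["spot-ingest/common/hypothetical_stream_listener.py"]
    for src, dst in plan_moves(files, streaming_only):
        print(f"{src} -> {dst}")
```

The sketch only prints a plan and does not touch the repository, so the
resulting mapping could be reviewed on the JIRA before anything is moved.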


On 15/01/2020 06:28, Nate Smith wrote:
> Perhaps separating by framework would be good,
>
> ./spot-ingest/python
> ./spot-ingest/spark-streaming
>
> Just my 2 cents,
>
> - nathanael

Re: [SPOT-INGEST] Ingest file organization

Posted by Nate Smith <na...@gmail.com>.
Perhaps separating by framework would be good,

./spot-ingest/python
./spot-ingest/spark-streaming

Just my 2 cents,

- nathanael 

> On Jan 14, 2020, at 4:45 PM, Skip Cruse <f8...@gmail.com> wrote:
> 
> We should keep the name /spot-ingest/ for the original ingester, but move the new ingester to /spot-ingest-sparkstreaming/ or similar.  Hopefully we can use the ticket to track down the files that were created, so we can move them to a new home easily.

Re: [SPOT-INGEST] Ingest file organization

Posted by Skip Cruse <f8...@gmail.com>.
We should keep the name /spot-ingest/ for the original ingester, but move the new ingester to /spot-ingest-sparkstreaming/ or similar.  Hopefully we can use the ticket to track down the files that were created, so we can move them to a new home easily.

