Posted to dev@oodt.apache.org by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov> on 2014/08/13 18:04:52 UTC

Re: Remote data transfer

Thanks guys.

Etienne, I hope you don't mind but I've copied dev@oodt.apache.org
on this email. That way you can tap into the entire Apache OODT
community for help.

The "URI has an authority component" message usually indicates
that you have referenced an environment variable in your config
(e.g., filemgr.properties in the etc directory) but that variable
isn't defined. For example, maybe you have a *.policy.dirs property set
to file://[SOME_UNDEFINED_VARIABLE]/path/dir/ and SOME_UNDEFINED_VARIABLE
is undefined.

Can you check that to see if that's the root cause of this issue?
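
One way to run that check is sketched below in Python. It is only an
illustration and not part of OODT. It assumes variables are referenced as
[NAME], as in the example above; the property names in the sample are made
up, and the helper names undefined_refs and authority_of are hypothetical.
The second helper shows what "authority" means here: whatever sits between
file:// and the start of the path. Java's java.io.File(URI) constructor
rejects a file URI with a non-empty authority using this same message.

```python
import os
import re
from urllib.parse import urlparse

# Assumption: variables are referenced as [NAME] in the properties file,
# following the file://[SOME_UNDEFINED_VARIABLE]/path/dir/ form quoted above.
VAR_REF = re.compile(r"\[([A-Za-z_][A-Za-z0-9_]*)\]")


def undefined_refs(properties_text, env=None):
    """Return (line_number, line, variable) for each [VAR] that is not set in env."""
    env = os.environ if env is None else env
    missing = []
    for number, line in enumerate(properties_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        for name in VAR_REF.findall(stripped):
            if name not in env:
                missing.append((number, stripped, name))
    return missing


def authority_of(uri):
    """Return the authority part of a URI (the text between '//' and the path)."""
    return urlparse(uri).netloc


if __name__ == "__main__":
    # Hypothetical excerpt, not a real OODT configuration.
    sample = "\n".join([
        "# sample properties",
        "example.policy.dirs=file://[SOME_UNDEFINED_VARIABLE]/path/dir/",
        "example.other.dir=file://[HOME]/data",
    ])
    for number, line, name in undefined_refs(sample, env={"HOME": "/home/oodt"}):
        print(f"line {number}: {name} is not set -> {line}")

    # A well-formed local file URI has an empty authority,
    # while anything left between 'file://' and the path becomes one.
    print(repr(authority_of("file:///tmp/blah.txt")))
    print(repr(authority_of("file://leftover/path/dir/")))
```

To check a real config, read the file and pass its text, for example
undefined_refs(open("etc/filemgr.properties").read()), from the same shell
environment the file manager is started in. The path is an assumption based
on the etc directory mentioned above.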

Cheers,
Chris

------------------------
Chris Mattmann
chris.mattmann@gmail.com




-----Original Message-----
From: Etienne Koen <et...@scs-space.com>
Date: Wednesday, August 13, 2014 1:42 AM
To: Thomas Bennett <th...@ska.ac.za>
Cc: "cschollar@ska.ac.za" <cs...@ska.ac.za>, Chris Mattmann
<ch...@gmail.com>
Subject: RE: Remote data transfer

>Hi Tom,
>
>I get the following error when using the argument:
>
>ERROR: Failed to ingest product 'blah.txt' : URI has an authority
>component
>
>Here both the server and client were using port 9000
>
>I get this when both the server and client are running on the same port
>
>When communicating on different ports I get:
>
><-- some I/O / HTTP exceptions -->
>...
>...
>
>ERROR: Failed to ingest product 'blah.txt' : Connection refused
>
>Server:9000 and Client:431
>
>Do you know what any of this means?
>
>Cheers
>Etienne
>
>________________________________________
>From: Thomas Bennett [thomas@ska.ac.za]
>Sent: Wednesday, August 13, 2014 10:02 AM
>To: Etienne Koen
>Cc: cschollar@ska.ac.za; chris.mattmann@gmail.com
>Subject: Re: Remote data transfer
>
>Hey Etienne,
>
>I've been out of the office the last week but I'm back now.
>
>./filemgr-client --url http://localhost:9000 --operation --ingestProduct
>--productName blah.txt --productStructure Flat --productTypeName
>GenericFile --metadataFile file:///tmp/blah.txt.met --refs
>file:///tmp/blah.txt
>
>How would this line be modified to achieve what I want to do? I see there
>is also an argument --clientTransfer --dataTransfer but I am not sure
>which Java class to use for this.
>
>You will need to specify the filemgr remotely, i.e. --url
>http://192.168.0.1 - are you doing this?
>
>I've done remote file transfer before; I'll see if I can remember how to
>do it.
>
>Can I log into the CHPC with the usual credentials?
>
>Cheers,
>Tom
>--
>Thomas Bennett
>
>SKA South Africa
>Science Processing Team
>
>Office: +27 21 5067341
>Mobile: +27 79 5237105
>
>________________________________
>Disclaimer: This E-mail message, including any attachments, is intended
>only for the person or entity to which it is addressed, and may contain
>confidential information. Each page attached hereto must also be read in
>conjunction with this disclaimer.
>If you are not the intended recipient you are hereby notified that any
>disclosure, copying, distribution or reliance upon the contents of this
>e-mail is strictly prohibited. E.&O.E.


Re: Remote data transfer

Posted by Thomas Bennett <th...@ska.ac.za>.
Hi Etienne,

For a good description of the push pull architecture, check out the following
web link:

http://oodt.apache.org/components/maven/pushpull/development/developer.html

I think this is a good method to use because of its extensibility.
At some point I will start implementing this component to transfer data
from the site to our archive in Cape Town. Currently I use NFS, which is not
ideal.

As for a typical user scenario - this might be a natural extension point
into a Virtual Observatory interface.

Cheers,
Tom

On Mon, Aug 18, 2014 at 11:36 AM, Etienne Koen <et...@scs-space.com>
wrote:

> Hi Thomas and all,
>
> I came across the push/pull tutorial at
> https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide
>
> Would this guide be more appropriate for downloading files that have been
> archived by the file manager, and would that represent a typical user scenario?
>
> Regards
> Etienne
> ________________________________________
> From: Thomas Bennett [thomas@ska.ac.za]
> Sent: Friday, August 15, 2014 9:54 AM
> To: Etienne Koen
> Cc: Mattmann, Chris A (3980); cschollar@ska.ac.za; dev@oodt.apache.org
> Subject: Re: Remote data transfer
>
> Hi Etienne,
>
> There are various methods you can use to download the data.
>
> See this page:
>
> https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager
>
> Some great work has recently been done on a REST API. It exists on svn
> trunk, but I don't think it has been released yet.
>
> https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API
>
> To use these components you will need to deploy tomcat or jetty.
>
> Shout if you need some help.
>
> Cheers,
> Tom
>
>
>
>
> On Thu, Aug 14, 2014 at 4:31 PM, Etienne Koen <etiennek@scs-space.com> wrote:
> Hi Chris and Tom,
>
> As I mentioned in my previous email, I have managed to ingest
> a file to a remote location using the filemgr-client. I am also able to
> query the information remotely using, for example, the query_tool in this way:
>
> $ ./query_tool --url http://192.168.0.10:9000 --lucene -query
> 'CAS.ProductName:blah.txt'
>
> 978ca28e-23b0-11e4-87fb-4f1c29029486
>
> What component would I use for searching and downloading the actual
> product from the remote file manager? Is the filemgr-client or query_tool
> capable of doing this?
>
> Are there any tutorials you would recommend?
>
> Thanks
> Etienne
>



-- 
Thomas Bennett

SKA South Africa
Science Processing Team

Office: +27 21 5067341
Mobile: +27 79 5237105

Re: Remote data transfer

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
BOOM, thanks Tom.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Tom Barber <to...@meteorite.bi>
Reply-To: "dev@oodt.apache.org" <de...@oodt.apache.org>
Date: Monday, August 18, 2014 1:21 PM
To: "dev@oodt.apache.org" <de...@oodt.apache.org>
Subject: Re: Remote data transfer

>I promise to finish my Pentaho PDI plugins as well at some point, then
>you can all slurp and transform and ingest from pretty much anywhere.
>
>Tom
>
>On 18/08/14 19:39, Thomas Bennett wrote:
>> Thanks Chris.
>>
>> Just to add to the conversation - what protocols are currently supported?
>>
>> I've seen scp, FTP and http. Also Amazon S3?
>>
>> On Monday, August 18, 2014, Mattmann, Chris A (3980) <
>> chris.a.mattmann@jpl.nasa.gov> wrote:
>>
>>> Hi Etienne,
>>>
>>> Thanks. The Push Pull system is a way to pull down remote or ancillary
>>> files, usually *ahead* of file manager ingestion, since the crawler
>>> doesn't really have a protocol layer for handling remote content.
>>> The typical use case, if you use Push Pull, is:
>>>
>>> 1. Model remote/ancillary files on other sites
>>> 2. Download them with push pull into a "staging area"
>>> 3. Crawl and ingest with crawler, as if the content were
>>> local to start out with.
>>>
>>> There is a Push Pull user's guide here; it's a bit old but should
>>> explain it:
>>>
>>> http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/
>>>
>>>
>>> Cheers,
>>> Chris
>
>
>-- 
>*Tom Barber* | Technical Director
>
>meteorite bi
>*T:* +44 20 8133 3730
>*W:* www.meteorite.bi | *Skype:* meteorite.consulting
>*A:* Surrey Technology Centre, Surrey Research Park, Guildford, GU2 7YG,
>UK


Re: Remote data transfer

Posted by Tom Barber <to...@meteorite.bi>.
I promise to finish my Pentaho PDI plugins as well at some point, then 
you can all slurp and transform and ingest from pretty much anywhere.

Tom

On 18/08/14 19:39, Thomas Bennett wrote:
> Thanks Chris.
>
> Just to add to the conversation - what protocols are currently supported?
>
> I've seen scp, FTP and http. Also Amazon S3?
>
> On Monday, August 18, 2014, Mattmann, Chris A (3980) <
> chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> Hi Etienne,
>>
>> Thanks. The Push Pull system is a way to pull down remote or ancillary
>> files usually *ahead* of file manager ingestion, since the crawler
>> really doesn't have a protocol layer to mitigate remote content.
>> The typical use case if you use Push Pull is:
>>
>> 1. Model remote/ancillary files on other sites
>> 2. Download them with push pull into a "staging area"
>> 3. Crawl and ingest with crawler, as if the content were
>> local to start out with.
>>
>> There is a Push Pull users guide here, it's a bit old but should
>> explain it:
>>
>> http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/docu
>> mentation/
>>
>>
>> Cheers,
>> Chris
>>
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Chief Architect
>> Instrument Software and Science Data Systems Section (398)
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 168-519, Mailstop: 168-527
>> Email: chris.a.mattmann@nasa.gov <javascript:;>
>> WWW:  http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Associate Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: Etienne Koen <etiennek@scs-space.com <javascript:;>>
>> Date: Monday, August 18, 2014 2:36 AM
>> To: Thomas Bennett <thomas@ska.ac.za <javascript:;>>
>> Cc: Chris Mattmann <Chris.A.Mattmann@jpl.nasa.gov <javascript:;>>, "
>> cschollar@ska.ac.za <javascript:;>"
>> <cschollar@ska.ac.za <javascript:;>>, "dev@oodt.apache.org <javascript:;>"
>> <dev@oodt.apache.org <javascript:;>>
>> Subject: RE: Remote data transfer
>>
>>> Hi Tomas and all,
>>>
>>> I came across the push/pull tutorial on
>>>
>> https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide
>>> .
>>>
>>> Would this guide be more appropriate to download files that have  been
>>> archived by the file manager and represent a typical user scenario?
>>>
>>> Regards
>>> Etienne
>>> ________________________________________
>>> From: Thomas Bennett [thomas@ska.ac.za <javascript:;>]
>>> Sent: Friday, August 15, 2014 9:54 AM
>>> To: Etienne Koen
>>> Cc: Mattmann, Chris A (3980); cschollar@ska.ac.za <javascript:;>;
>> dev@oodt.apache.org <javascript:;>
>>> Subject: Re: Remote data transfer
>>>
>>> Hi Etienne,
>>>
>>> There are various methods you can use to download the data.
>>>
>>> See this page:
>>>
>> https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+r
>>> emote+FileManager
>>>
>>> Recently there is some great work that has been done on using a REST API
>>> - this exists on svn trunk. I don't think it has been released yet.
>>>
>>> https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API
>>>
>>> To use these components you will need to deploy tomcat or jetty.
>>>
>>> Shout if you need some help.
>>>
>>> Cheers,
>>> Tom
>>>
>>>
>>>
>>>
>>> On Thu, Aug 14, 2014 at 4:31 PM, Etienne Koen
>>> <etiennek@scs-space.com <javascript:;><mailto:etiennek@scs-space.com
>> <javascript:;>>> wrote:
>>> Hi Chris and Tom,
>>>
>>> As I have mentioned before in my previous email, I have managed to ingest
>>> a file to a remote location using the filemgr-client. I am also able to
>>> query the information remotely using for example the query_tool in this
>>> way:
>>>
>>> $ ./query_tool --url http://192.168.0.10:9000 --lucene -query
>>> 'CAS.ProductName:blah.txt'
>>>
>>> 978ca28e-23b0-11e4-87fb-4f1c29029486
>>>
>>> What component would I use for searching and downloading the actual
>>> product from the remote file manager? Is the filemgr-client or query_tool
>>> capable of doing this?
>>>
>>> Are there any tutorials you would recommend?
>>>
>>> Thanks
>>> Etienne
>>>
>>> ________________________________________
>>> From: Mattmann, Chris A (3980)
>>> [chris.a.mattmann@jpl.nasa.gov <javascript:;><mailto:
>> chris.a.mattmann@jpl.nasa.gov <javascript:;>>]
>>> Sent: Wednesday, August 13, 2014 6:04 PM
>>> To: Etienne Koen; Thomas Bennett
>>> Cc: cschollar@ska.ac.za <javascript:;><mailto:cschollar@ska.ac.za
>> <javascript:;>>;
>>> dev@oodt.apache.org <javascript:;><mailto:dev@oodt.apache.org
>> <javascript:;>>; Mattmann, Chris A (3980)
>>> Subject: Re: Remote data transfer
>>>
>>> Thanks guys.
>>>
>>> Etienne, I hope you don't mind but I've copied
>>> dev@oodt.apache.org <javascript:;><mailto:dev@oodt.apache.org
>> <javascript:;>>
>>> on this email. That way you can tap into the entire Apache OODT
>>> community for help.
>>>
>>> The URI has authority component is usually an error indicating
>>> that you have referenced some environment variable in your config
>>> (e.g., filemgr.properties in the etc directory) but that variable
>>> isn't defined. E.g., maybe you have a *.policy.dirs property set
>>> to file://[SOME_UNDEFINED_VARIABLE]/path/dir/ and SOME_UNDEFINED_VARIABLE
>>> is undefined.
>>>
>>> Can you check that to see if that's the root cause of this issue?
>>>
>>> Cheers,
>>> Chris
>>>
>>> ------------------------
>>> Chris Mattmann
>>> chris.mattmann@gmail.com <javascript:;><mailto:chris.mattmann@gmail.com
>> <javascript:;>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Etienne Koen <etiennek@scs-space.com>
>>> Date: Wednesday, August 13, 2014 1:42 AM
>>> To: Thomas Bennett <thomas@ska.ac.za>
>>> Cc: "cschollar@ska.ac.za" <cschollar@ska.ac.za>, Chris Mattmann
>>> <chris.mattmann@gmail.com>
>>> Subject: RE: Remote data transfer
>>>
>>>> Hi Tom,
>>>>
>>>> I get the following error when using the argument:
>>>>
>>>> ERROR: Failed to ingest product 'blah.txt' : URI has an authority
>>>> component
>>>>
>>>> Here both the server and client were using port 9000
>>>>
>>>> I get this when both the server and client are running on the same port
>>>>
>>>> When communicating on different ports I get:
>>>>
>>>> <-- some I/O / HTTP exceptions -->
>>>> ...
>>>> ...
>>>>
>>>> ERROR: Failed to ingest product 'blah.txt' : Connection refused
>>>>
>>>> Server:9000 and Client:431
>>>>
>>>> Do you know what any of this means?
>>>>
>>>> Cheers
>>>> Etienne
>>>>
>>>> ________________________________________
>>>> From: Thomas Bennett [thomas@ska.ac.za]
>>>> Sent: Wednesday, August 13, 2014 10:02 AM
>>>> To: Etienne Koen
>>>> Cc: cschollar@ska.ac.za; chris.mattmann@gmail.com
>>>> Subject: Re: Remote data transfer
>>>>
>>>> Hey Etienne,
>>>>
>>>> I've been out of the office the last week but I'm back now.
>>>>
>>>> ./filemgr-client --url http://localhost:9000 --operation --ingestProduct
>>>> --productName blah.txt --productStructure Flat --productTypeName
>>>> GenericFile --metadataFile file:///tmp/blah.txt.met --refs
>>>> file:///tmp/blah.txt
>>>>
>>>> How would this line be modified to achieve what I want to do? I see there
>>>> is also an argument --clientTransfer --dataTransfer but I am not sure
>>>> what java class to use for this?
>>>>
>>>> You will need to specify the filemgr remotely ie: --url
>>>> http://192.168.0.1 - are you doing this?
>>>>
>>>> I've done remote file transfer before I'll see if I can remember how to
>>>> do it.
>>>>
>>>> Can I log into the CHPC with the usual credentials?
>>>>
>>>> Cheers,
>>>> Tom
>>>> --
>>>> Thomas Bennett
>>>>
>>>> SKA South Africa
>>>> Science Processing Team
>>>>
>>>> Office: +27 21 5067341
>>>> Mobile: +27 79 5237105
>>>>
>>>> ________________________________
>>>> Disclaimer: This E-mail message, including any attachments, is intended
>>>> only for the person or entity to which it is addressed, and may contain
>>>> confidential information. Each page attached hereto must also be read in
>>>> conjunction with this disclaimer.
>>>> If you are not the intended recipient you are hereby notified that any
>>>> disclosure, copying, distribution or reliance upon the contents of this
>>>> e-mail is strictly prohibited. E.&O.E.
>>>
>>>
>>>
>>>
>>> --
>>> Thomas Bennett
>>>
>>> SKA South Africa
>>> Science Processing Team
>>>
>>> Office: +27 21 5067341
>>> Mobile: +27 79 5237105
>>>
>>


-- 
*Tom Barber* | Technical Director

meteorite bi
*T:* +44 20 8133 3730
*W:* www.meteorite.bi | *Skype:* meteorite.consulting
*A:* Surrey Technology Centre, Surrey Research Park, Guildford, GU2 7YG, UK

Re: Remote data transfer

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
Great job getting Push Pull running, Etienne!

Parallel transfer is supported in File Manager, Crawler, and
Push Pull.

By default:

1. File Manager does parallel transfers/ingests of files and can
ingest multiple files at once.

2. Crawler performs sequential file crawls, though you can run
multiple crawlers in parallel to achieve horizontal scalability.
(You can run the crawlers themselves as daemons, each on its own port.)

3. Push Pull itself is a parallel remote downloader, and
will download files in parallel.
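
As a rough illustration of the parallel-transfer idea only (this is not
OODT code; copy_one, copy_in_parallel, and the staging directory are names
made up for this sketch), a thread pool can copy several files into a
staging area at once:

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_one(src: Path, dest_dir: Path) -> Path:
    """Copy a single file into dest_dir and return the new path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest_dir / src.name))


def copy_in_parallel(sources, dest_dir, max_workers=4):
    """Copy every file in sources into dest_dir using a thread pool.

    Results come back in the same order as sources.
    """
    dest_dir = Path(dest_dir)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda s: copy_one(Path(s), dest_dir), sources))
```

Threads suit this kind of I/O-bound work; max_workers is the knob to vary
when comparing sequential against parallel transfer in a test plan.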

As for checksums:

Specific projects have implemented checksums as a crawler
action. In particular, JPSS-GRAVITE, a mission data system
for the Joint Polar Satellite System (JPSS), ended up
implementing their own checksums:

http://archive.apachecon.com/na2013/presentations/27-Wednesday/Apache_in_Science/11:15-OODT_GRAVITE.pptx

The NPP Sounder PEATE project also had similar crawler
actions.

Others have implemented these actions as specific data
transfer factory implementations.

So, you have quite a few options here. I'd be happy
to discuss more.
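
For the with/without-checksum comparison, here is a minimal sketch of the
check such an action would perform. It is plain Python rather than an OODT
crawler action, and file_checksum and verify_transfer are hypothetical
names; it hashes the file before and after transfer and compares the
digests:

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="md5", chunk_size=1024 * 1024):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(source, destination, algorithm="md5"):
    """True when the source and destination files hash to the same value."""
    return file_checksum(source, algorithm) == file_checksum(destination, algorithm)
```

The digest could equally be computed once at creation time and stored as
product metadata, then recomputed and compared after delivery.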

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Etienne Koen <et...@scs-space.com>
Date: Wednesday, August 20, 2014 12:34 AM
To: Chris Mattmann <Ch...@jpl.nasa.gov>, "dev@oodt.apache.org"
<de...@oodt.apache.org>
Cc: Thomas Bennett <th...@ska.ac.za>, "cschollar@ska.ac.za"
<cs...@ska.ac.za>
Subject: RE: Remote data transfer

>Hi Chris and Tom,
>
>We are still busy at SKA with refining the first test plan for data
>delivery and archive. A few requirements came out of yesterday's
>meeting that will be added to the tests, and I need to explore them
>using OODT as well:
>
>- parallel transfer
>- transfer with/without a checksum calculation
>
>Would it be possible to point me to some documentation/tutorial again
>that describes the use of both of these capabilities?
>
>btw, I got the pushpull component of OODT running on the CHPC with some
>elementary tests! :-)
>
>I will forward the documentation to you once it is completed.
>
>Regards
>Etienne
>________________________________________
>From: Mattmann, Chris A (3980) [chris.a.mattmann@jpl.nasa.gov]
>Sent: Monday, August 18, 2014 11:27 PM
>To: dev@oodt.apache.org
>Cc: Etienne Koen; Thomas Bennett; cschollar@ska.ac.za
>Subject: Re: Remote data transfer
>
>Hi Tom,
>
>Great question!
>
>By default, all of the supported protocols are here in the cas-protocol
>module of Apache OODT:
>
>http://svn.apache.org/repos/asf/oodt/trunk/protocol/
>
>
>* ftp
>* http(s)
>* imaps
>* sftp
>
>Note that there is an Amazon S3 "data transfer" module in
>the File Manager, but not explicitly in Push Pull. It would
>hopefully not be too difficult (and would be a welcome patch!) to
>incorporate this functionality into the cas-protocol layer.
>
>There are also these specific Push Pull plugins:
>
>https://cwiki.apache.org/confluence/display/OODT/OODT+Push+Pull+Plugins
>
>
>Note the Push Pull plugins in the wiki page above leverage LGPL libraries
>and I wasn't able to find a replacement for them. We aren't officially
>"recommending" them as Apache OODT PMC members, but they are useful
>FTP plugins if you can't get the existing protocol-ftp plugin to work.
>However, you knowingly accept that by explicitly downloading these plugins
>and building them into your OODT push pull installation.
>
>I would love it if someone were to find ALv2 compatible versions of the
>above plugins so we could manage them in our code base, but that hasn't
>been done yet.
>
>
>
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Chris Mattmann, Ph.D.
>Chief Architect
>Instrument Software and Science Data Systems Section (398)
>NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>Office: 168-519, Mailstop: 168-527
>Email: chris.a.mattmann@nasa.gov
>WWW:  http://sunset.usc.edu/~mattmann/
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Adjunct Associate Professor, Computer Science Department
>University of Southern California, Los Angeles, CA 90089 USA
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
>
>-----Original Message-----
>From: Thomas Bennett <lm...@gmail.com>
>Reply-To: "dev@oodt.apache.org" <de...@oodt.apache.org>
>Date: Monday, August 18, 2014 11:39 AM
>To: "dev@oodt.apache.org" <de...@oodt.apache.org>
>Cc: Etienne Koen <et...@scs-space.com>, Thomas Bennett
><th...@ska.ac.za>, "cschollar@ska.ac.za" <cs...@ska.ac.za>
>Subject: Re: Remote data transfer
>
>>Thanks Chris.
>>
>>Just to add to the conversation - what protocols are currently supported?
>>
>>I've seen scp, FTP and http. Also Amazon S3?
>>
>>On Monday, August 18, 2014, Mattmann, Chris A (3980) <
>>chris.a.mattmann@jpl.nasa.gov> wrote:
>>
>>> Hi Etienne,
>>>
>>> Thanks. The Push Pull system is a way to pull down remote or ancillary
>>> files usually *ahead* of file manager ingestion, since the crawler
>>> really doesn't have a protocol layer to mitigate remote content.
>>> The typical use case if you use Push Pull is:
>>>
>>> 1. Model remote/ancillary files on other sites
>>> 2. Download them with push pull into a "staging area"
>>> 3. Crawl and ingest with crawler, as if the content were
>>> local to start out with.
>>>
>>> There is a Push Pull users guide here, it's a bit old but should
>>> explain it:
>>>
>>>
>>> http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/
>>>
>>>
>>> Cheers,
>>> Chris
>>>
>>>
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Chris Mattmann, Ph.D.
>>> Chief Architect
>>> Instrument Software and Science Data Systems Section (398)
>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>> Office: 168-519, Mailstop: 168-527
>>> Email: chris.a.mattmann@nasa.gov
>>> WWW:  http://sunset.usc.edu/~mattmann/
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Adjunct Associate Professor, Computer Science Department
>>> University of Southern California, Los Angeles, CA 90089 USA
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>
>>>
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Etienne Koen <etiennek@scs-space.com>
>>> Date: Monday, August 18, 2014 2:36 AM
>>> To: Thomas Bennett <thomas@ska.ac.za>
>>> Cc: Chris Mattmann <Chris.A.Mattmann@jpl.nasa.gov>,
>>> "cschollar@ska.ac.za" <cschollar@ska.ac.za>,
>>> "dev@oodt.apache.org" <dev@oodt.apache.org>
>>> Subject: RE: Remote data transfer
>>>
>>> >Hi Tomas and all,
>>> >
>>> >I came across the push/pull tutorial at
>>> >
>>> >https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide
>>> >
>>> >Would this guide be more appropriate to download files that have  been
>>> >archived by the file manager and represent a typical user scenario?
>>> >
>>> >Regards
>>> >Etienne
>>> >________________________________________
>>> >From: Thomas Bennett [thomas@ska.ac.za]
>>> >Sent: Friday, August 15, 2014 9:54 AM
>>> >To: Etienne Koen
>>> >Cc: Mattmann, Chris A (3980); cschollar@ska.ac.za; dev@oodt.apache.org
>>> >Subject: Re: Remote data transfer
>>> >
>>> >Hi Etienne,
>>> >
>>> >There are various methods you can use to download the data.
>>> >
>>> >See this page:
>>> >
>>> >https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager
>>> >
>>> >Recently there is some great work that has been done on using a REST API
>>> >- this exists on svn trunk. I don't think it has been released yet.
>>> >
>>> >https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API
>>> >
>>> >To use these components you will need to deploy tomcat or jetty.
>>> >
>>> >Shout if you need some help.
>>> >
>>> >Cheers,
>>> >Tom
>>> >
>>> >
>>> >
>>> >
>>> >On Thu, Aug 14, 2014 at 4:31 PM, Etienne Koen
>>> ><etiennek@scs-space.com> wrote:
>>> >Hi Chris and Tom,
>>> >
>>> >As I have mentioned before in my previous email, I have managed to
>>> >ingest a file to a remote location using the filemgr-client. I am also
>>> >able to query the information remotely using for example the query_tool
>>> >in this way:
>>> >
>>> >$ ./query_tool --url http://192.168.0.10:9000 --lucene -query
>>> >'CAS.ProductName:blah.txt'
>>> >
>>> >978ca28e-23b0-11e4-87fb-4f1c29029486
>>> >
>>> >What component would I use for searching and downloading the actual
>>> >product from the remote file manager? Is the filemgr-client or
>>> >query_tool capable of doing this?
>>> >
>>> >Are there any tutorials you would recommend?
>>> >
>>> >Thanks
>>> >Etienne
>>> >
>>>
>>>
>
>


Re: Remote data transfer

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
Hey Guys,

The key here is that the data transfer factory that the File Manager
(Server and Client), the Crawler, CAS-PGE, etc., use could be leveraged
to try and test different data transfer technologies.

You guys may also be interested in scoping out my PhD dissertation
which covered this topic extensively:

http://sunset.usc.edu/~mattmann/Dissertation.pdf

Also the software framework for evaluating and classifying different
movement technologies is now up on Github:

https://github.com/chrismattmann/disco/

The really cool thing to do would be to make a "DISCOConnector", which uses
the evaluation framework to pick the right connector for the scenario.

BTW, iRODS can do data movement - we've used it before *with* OODT
as one example of a data movement technology. OODT is happy to
work with it in the context of the Data Transfer interface I mentioned
above.
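
To make the "swap the transfer technology behind one interface" idea
concrete, here is a small sketch. Everything in it is hypothetical and
written in Python for brevity: DataTransfer, CopyTransfer, InPlaceTransfer,
and create_data_transfer are illustrative names, not the actual OODT Java
classes. The caller asks a factory for an implementation by name and never
needs to know how the bytes move:

```python
import shutil
from abc import ABC, abstractmethod
from pathlib import Path


class DataTransfer(ABC):
    """Hypothetical transfer interface; not the actual OODT Java API."""

    @abstractmethod
    def transfer(self, source: Path, destination: Path) -> None:
        ...


class CopyTransfer(DataTransfer):
    """Moves data by copying bytes to the destination."""

    def transfer(self, source, destination):
        shutil.copyfile(source, destination)


class InPlaceTransfer(DataTransfer):
    """Leaves data where it is and only records a reference to it."""

    def __init__(self):
        self.references = {}

    def transfer(self, source, destination):
        self.references[str(destination)] = str(source)


# Registry of known implementations, keyed by a configured name.
TRANSFERERS = {"copy": CopyTransfer, "in-place": InPlaceTransfer}


def create_data_transfer(name: str) -> DataTransfer:
    """Factory: look up a transfer implementation by its configured name."""
    try:
        return TRANSFERERS[name]()
    except KeyError:
        raise ValueError(f"unknown data transfer: {name!r}") from None
```

Testing a new technology (UDT, iRODS, and so on) would then mean adding one
more class to the registry rather than changing the callers.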

Cheers,
Chris


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Etienne Koen <et...@scs-space.com>
Reply-To: "dev@oodt.apache.org" <de...@oodt.apache.org>
Date: Wednesday, August 20, 2014 5:35 AM
To: Thomas Bennett <th...@ska.ac.za>
Cc: "dev@oodt.apache.org" <de...@oodt.apache.org>
Subject: RE: Remote data transfer

>Hi Tom
>
>One of the other tools currently being tested, iRODS, has the
>capability/option you are referring to built into it. Does OODT have
>something similar, or how would it be done?
>
>Here is the definition of the checksum we want to perform given on the
>confluence page:
>
>"A checksum or hash sum is a small-size datum from an arbitrary block of
>digital data for the purpose of detecting errors which may have been
>introduced during its transmission or storage."
>
>Does this help?
>
>Etienne
>________________________________________
>From: Thomas Bennett [thomas@ska.ac.za]
>Sent: Wednesday, August 20, 2014 1:26 PM
>To: Etienne Koen
>Cc: dev@oodt.apache.org
>Subject: Re: Remote data transfer
>
>Hi Etienne,
>
>I am using TCP. My first thought was that a checksum over TCP is not
>necessary since the protocol takes care of data integrity, but the
>test plan seems to require a checksum capability.
>
>Not to confuse issues, but I perform an md5sum on all data files (hdf5
>files) once they have been created and store it as metadata for the file.
>Does the requirement maybe not refer to this?
>
>I would like to test a UDP protocol with a checksum calculation as well
>for some comparison. Please let me know if you have any success with
>implementing/using this functionality!
>
>I will also have a look at the UDT transport mechanism as you proposed.
>
>Woot!
>
>Cheers,
>Tom
>


RE: Remote data transfer

Posted by Etienne Koen <et...@scs-space.com>.
Hi Tom

One of the other tools currently being tested, iRODS, has the capability/option you are referring to built into it. Does OODT have something similar, or how would it be done?

Here is the definition of the checksum we want to perform given on the confluence page:

"A checksum or hash sum is a small-size datum from an arbitrary block of digital data for the purpose of detecting errors which may have been introduced during its transmission or storage."

Does this help?

Etienne
________________________________________
From: Thomas Bennett [thomas@ska.ac.za]
Sent: Wednesday, August 20, 2014 1:26 PM
To: Etienne Koen
Cc: dev@oodt.apache.org
Subject: Re: Remote data transfer

Hi Etienne,

I am using TCP. My first thought was that a checksum over TCP is not necessary since the protocol takes care of data integrity, but the test plan seems to require a checksum capability.

Not to confuse issues, but I perform an md5sum on all data files (hdf5 files) once they have been created and store it as metadata for the file.
Does the requirement maybe not refer to this?

I would like to test a UDP protocol with a checksum calculation as well for some comparison. Please let me know if you have any success with implementing/using this functionality!

I will also have a look at the UDT transport mechanism as you proposed.

Woot!

Cheers,
Tom


RE: Remote data transfer

Posted by Etienne Koen <et...@scs-space.com>.
Thanks Tom,

I will check out the site.

I am using TCP. My first thought was that a checksum over TCP is not necessary since the protocol takes care of data integrity, but the test plan seems to require a checksum capability.

I would like to test a UDP protocol with a checksum calculation as well for some comparison. Please let me know if you have any success with implementing/using this functionality!

I will also have a look at the UDT transport mechanism as you proposed.

Cheers
Etienne

From: Thomas Bennett [mailto:thomas@ska.ac.za]
Sent: Wednesday, August 20, 2014 10:49 AM
To: Etienne Koen
Cc: dev@oodt.apache.org
Subject: Re: Remote data transfer

Hi Etienne,

The following page: http://oodt.apache.org/components/maven/pushpull/development/developer.html lists one of its key capabilities as:

Fast Data-transfer - Support of Parallel File Transfers and Data Downloads.

So the capability must already be baked into the pushpull component. I'm unable to comment further until I get a chance to test it :)

W.r.t data transfer, are you referring to TCP vs UDP?

I'm very interested in looking at UDT<http://en.wikipedia.org/wiki/UDP-based_Data_Transfer_Protocol> as a possible transport mechanism. There is an "rsync over udt" implementation already...

Cheers,
Tom

We are still busy at SKA with refining the first test plan for data delivery and archive. A few requirements came out of yesterday's meeting that will be added to the tests, and I need to explore them using OODT as well:

- parallel transfer
- transfer with/without a checksum calculation

Would it be possible to point me to some documentation/tutorial again that describes the use of both of these capabilities?

btw, I got the pushpull component of OODT running on the CHPC with some elementary tests! :-)

I will forward the documentation to you once it is completed.

Regards
Etienne
________________________________________
From: Mattmann, Chris A (3980) [chris.a.mattmann@jpl.nasa.gov]
Sent: Monday, August 18, 2014 11:27 PM
To: dev@oodt.apache.org
Cc: Etienne Koen; Thomas Bennett; cschollar@ska.ac.za
Subject: Re: Remote data transfer

Hi Tom,

Great question!

By default, the supported protocols are all of those here in the cas-protocol
module of Apache OODT:

http://svn.apache.org/repos/asf/oodt/trunk/protocol/


* ftp
* http(s)
* imaps
* sftp

Note that there is an Amazon S3 "data transfer" module in
the File Manager, but not explicitly in Push Pull. It would
hopefully not be too difficult (and a welcome patch!) to
incorporate this functionality into the cas-protocol layer.

There are also these specific Push Pull plugins:

https://cwiki.apache.org/confluence/display/OODT/OODT+Push+Pull+Plugins


Note the Push Pull plugins in the wiki page above leverage LGPL libraries
and I wasn't able to find a replacement for them. We aren't officially
"recommending" them as Apache OODT PMC members, but they are useful
FTP plugins if you can't get the existing protocol-ftp plugin to work.
However, you use them knowingly, by explicitly downloading these plugins
and building them into your OODT Push Pull installation.

I would love it if someone were to find ALv2-compatible versions of the
above plugins so we could manage them in our code base, but that hasn't
been done yet.



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Thomas Bennett <lm...@gmail.com>
Reply-To: "dev@oodt.apache.org" <de...@oodt.apache.org>
Date: Monday, August 18, 2014 11:39 AM
To: "dev@oodt.apache.org" <de...@oodt.apache.org>
Cc: Etienne Koen <et...@scs-space.com>, Thomas Bennett
<th...@ska.ac.za>, "cschollar@ska.ac.za" <cs...@ska.ac.za>
Subject: Re: Remote data transfer

>Thanks Chris.
>
>Just to add to the conversation - what protocols are currently supported?
>
>I've seen scp, FTP and http. Also Amazon S3?
>
>On Monday, August 18, 2014, Mattmann, Chris A (3980) <
>chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> Hi Etienne,
>>
>> Thanks. The Push Pull system is a way to pull down remote or ancillary
>> files usually *ahead* of file manager ingestion, since the crawler
>> really doesn't have a protocol layer to mitigate remote content.
>> The typical use case if you use Push Pull is:
>>
>> 1. Model remote/ancillary files on other sites
>> 2. Download them with push pull into a "staging area"
>> 3. Crawl and ingest with crawler, as if the content were
>> local to start out with.
>>
>> There is a Push Pull users guide here, it's a bit old but should
>> explain it:
>>
>>
>> http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/
>>
>>
>> Cheers,
>> Chris
>>
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Chief Architect
>> Instrument Software and Science Data Systems Section (398)
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 168-519, Mailstop: 168-527
>> Email: chris.a.mattmann@nasa.gov
>> WWW:  http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Associate Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: Etienne Koen <et...@scs-space.com>
>> Date: Monday, August 18, 2014 2:36 AM
>> To: Thomas Bennett <th...@ska.ac.za>
>> Cc: Chris Mattmann <Ch...@jpl.nasa.gov>,
>> "cschollar@ska.ac.za" <cs...@ska.ac.za>,
>> "dev@oodt.apache.org" <de...@oodt.apache.org>
>> Subject: RE: Remote data transfer
>>
>> >Hi Tomas and all,
>> >
>> >I came across the push/pull tutorial at
>> >
>> >https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide
>> >
>> >Would this guide be more appropriate to download files that have  been
>> >archived by the file manager and represent a typical user scenario?
>> >
>> >Regards
>> >Etienne
>> >________________________________________
>> >From: Thomas Bennett [thomas@ska.ac.za]
>> >Sent: Friday, August 15, 2014 9:54 AM
>> >To: Etienne Koen
>> >Cc: Mattmann, Chris A (3980); cschollar@ska.ac.za; dev@oodt.apache.org
>> >Subject: Re: Remote data transfer
>> >
>> >Hi Etienne,
>> >
>> >There are various methods you can use to download the data.
>> >
>> >See this page:
>> >
>> >https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager
>> >
>> >Recently there is some great work that has been done on using a REST API
>> >- this exists on svn trunk. I don't think it has been released yet.
>> >
>> >https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API
>> >
>> >To use these components you will need to deploy tomcat or jetty.
>> >
>> >Shout if you need some help.
>> >
>> >Cheers,
>> >Tom
>> >
>> >
>> >
>> >
>> >On Thu, Aug 14, 2014 at 4:31 PM, Etienne Koen
>> ><et...@scs-space.com> wrote:
>> >Hi Chris and Tom,
>> >
>> >As I mentioned in my previous email, I have managed to ingest
>> >a file to a remote location using the filemgr-client. I am also able to
>> >query the information remotely using, for example, the query_tool in this
>> >way:
>> >
>> >$ ./query_tool --url http://192.168.0.10:9000 --lucene -query
>> >'CAS.ProductName:blah.txt'
>> >
>> >978ca28e-23b0-11e4-87fb-4f1c29029486
>> >
>> >What component would I use for searching and downloading the actual
>> >product from the remote file manager? Is the filemgr-client or
>> >query_tool capable of doing this?
>> >
>> >Are there any tutorials you would recommend?
>> >
>> >Thanks
>> >Etienne
>> >
>> >________________________________________
>> >From: Mattmann, Chris A (3980)
>> >[chris.a.mattmann@jpl.nasa.gov]
>> >Sent: Wednesday, August 13, 2014 6:04 PM
>> >To: Etienne Koen; Thomas Bennett
>> >Cc: cschollar@ska.ac.za; dev@oodt.apache.org; Mattmann, Chris A (3980)
>> >Subject: Re: Remote data transfer
>> >
>> >Thanks guys.
>> >
>> >Etienne, I hope you don't mind but I've copied dev@oodt.apache.org
>> >on this email. That way you can tap into the entire Apache OODT
>> >community for help.
>> >
>> >The "URI has an authority component" error usually indicates
>> >that you have referenced some environment variable in your config
>> >(e.g., filemgr.properties in the etc directory) but that variable
>> >isn't defined. E.g., maybe you have a *.policy.dirs property set
>> >to file://[SOME_UNDEFINED_VARIABLE]/path/dir/ and
>> >SOME_UNDEFINED_VARIABLE is undefined.
>> >
>> >Can you check that to see if that's the root cause of this issue?
>> >
>> >Cheers,
>> >Chris
>> >
>> >------------------------
>> >Chris Mattmann
>> >chris.mattmann@gmail.com
>> >
>> >
>> >
>> >
>> >-----Original Message-----
>> >From: Etienne Koen <et...@scs-space.com>
>> >Date: Wednesday, August 13, 2014 1:42 AM
>> >To: Thomas Bennett <th...@ska.ac.za>
>> >Cc: "cschollar@ska.ac.za" <cs...@ska.ac.za>, Chris Mattmann
>> ><ch...@gmail.com>
>> >Subject: RE: Remote data transfer
>> >
>> >>Hi Tom,
>> >>
>> >>I get the following error when using the argument:
>> >>
>> >>ERROR: Failed to ingest product 'blah.txt' : URI has an authority
>> >>component
>> >>
>> >>Here both the server and client were using port 9000
>> >>
>> >>I get this when both the server and client are running on the same
>> >>port
>> >>
>> >>When communicating on different ports I get:
>> >>
>> >><-- some I/O / HTTP exceptions -->
>> >>...
>> >>...
>> >>
>> >>ERROR: Failed to ingest product 'blah.txt' : Connection refused
>> >>
>> >>Server:9000 and Client:431
>> >>
>> >>Do you know what any of this means?
>> >>
>> >>Cheers
>> >>Etienne
>> >>
>> >>________________________________________
>> >>From: Thomas Bennett [thomas@ska.ac.za]
>> >>Sent: Wednesday, August 13, 2014 10:02 AM
>> >>To: Etienne Koen
>> >>Cc: cschollar@ska.ac.za; chris.mattmann@gmail.com
>> >>Subject: Re: Remote data transfer
>> >>
>> >>Hey Etienne,
>> >>
>> >>I've been out of the office the last week but I'm back now.
>> >>
>> >>./filemgr-client --url http://localhost:9000 --operation
>> >>--ingestProduct --productName blah.txt --productStructure Flat
>> >>--productTypeName GenericFile --metadataFile file:///tmp/blah.txt.met
>> >>--refs file:///tmp/blah.txt
>> >>
>> >>How would this line be modified to achieve what I want to do? I see
>>there
>> >>is also an argument --clientTransfer --dataTransfer but I am not sure
>> >>what java class to use for this?
>> >>
>> >>You will need to specify the filemgr remotely ie: --url
>> >>http://192.168.0.1 - are you doing this?
>> >>
>> >>I've done remote file transfer before I'll see if I can remember how
>>to
>> >>do it.
>> >>
>> >>Can I log into the CHPC with the usual credentials?
>> >>
>> >>Cheers,
>> >>Tom
>> >>--
>> >>Thomas Bennett
>> >>
>> >>SKA South Africa
>> >>Science Processing Team
>> >>
>> >>Office: +27 21 5067341
>> >>Mobile: +27 79 5237105
>> >>
>> >>________________________________
>> >>Disclaimer: This E-mail message, including any attachments, is
>>intended
>> >>only for the person or entity to which it is addressed, and may
>>contain
>> >>confidential information. Each page attached hereto must also be read
>>in
>> >>conjunction with this disclaimer.
>> >>If you are not the intended recipient you are hereby notified that any
>> >>disclosure, copying, distribution or reliance upon the contents of
>>this
>> >>e-mail is strictly prohibited. E.&O.E.
>> >>
>> >
>> >
>> >Disclaimer: This E-mail message, including any attachments, is intended
>> >only for the  person or entity to which it is addressed, and may
>>contain
>> >confidential  information. Each page attached hereto must also be read
>>in
>> >conjunction with this disclaimer.
>> >If you are not the intended recipient you are hereby notified that any
>> >disclosure, copying, distribution or reliance upon the contents of this
>> >e-mail is strictly prohibited.    E.&O.E.
>> >
>> >
>> >
>> >
>> >--
>> >Thomas Bennett
>> >
>> >SKA South Africa
>> >Science Processing Team
>> >
>> >Office: +27 21 5067341
>> >Mobile: +27 79 5237105
>> >
>> >________________________________
>> >Disclaimer: This E-mail message, including any attachments, is intended
>> >only for the person or entity to which it is addressed, and may contain
>> >confidential information. Each page attached hereto must also be read
>>in
>> >conjunction with this disclaimer.
>> >If you are not the intended recipient you are hereby notified that any
>> >disclosure, copying, distribution or reliance upon the contents of this
>> >e-mail is strictly prohibited. E.&O.E.
>> >
>>
>>


Disclaimer: This E-mail message, including any attachments, is intended only for the  person or entity to which it is addressed, and may contain confidential  information. Each page attached hereto must also be read in conjunction with this disclaimer.
If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or reliance upon the contents of this e-mail is strictly prohibited.    E.&O.E.




--
Thomas Bennett

SKA South Africa
Science Processing Team

Office: +27 21 5067341
Mobile: +27 79 5237105

________________________________
Disclaimer: This E-mail message, including any attachments, is intended only for the person or entity to which it is addressed, and may contain confidential information. Each page attached hereto must also be read in conjunction with this disclaimer.
If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or reliance upon the contents of this e-mail is strictly prohibited. E.&O.E.


RE: Remote data transfer

Posted by Etienne Koen <et...@scs-space.com>.
Hi Chris and Tom,

We are still busy at SKA refining the first test plan for data delivery and archive. A few requirements came out of yesterday's meeting that will be added to the tests, and I need to explore them using OODT as well:

- parallel transfer
- transfer with/without a checksum calculation

Would it be possible to point me again to some documentation/tutorial which describes the use of both of these capabilities?

btw, I got the pushpull component of OODT running on the CHPC with some elementary tests! :-)

I will forward the documentation to you once it is completed.

Regards
Etienne
________________________________________
From: Mattmann, Chris A (3980) [chris.a.mattmann@jpl.nasa.gov]
Sent: Monday, August 18, 2014 11:27 PM
To: dev@oodt.apache.org
Cc: Etienne Koen; Thomas Bennett; cschollar@ska.ac.za
Subject: Re: Remote data transfer

Hi Tom,

Great question!

By default, the supported protocols are all of those here in the cas-protocol
module of Apache OODT:

http://svn.apache.org/repos/asf/oodt/trunk/protocol/


* ftp
* http(s)
* imaps
* sftp

Note that there is an Amazon S3 "data transfer" module in
the File Manager, but not explicitly in Push Pull. It would
hopefully not be too difficult (and a welcome patch!) to
incorporate this functionality into the cas-protocol layer.

There are also these specific Push Pull plugins:

https://cwiki.apache.org/confluence/display/OODT/OODT+Push+Pull+Plugins


Note the Push Pull plugins in the wiki page above leverage LGPL libraries
and I wasn't able to find a replacement for them. We aren't officially
"recommending" them as Apache OODT PMC members, but they are useful
FTP plugins if you can't get the existing protocol-ftp plugin to work.
However, you use them knowingly, by explicitly downloading these plugins
and building them into your OODT Push Pull installation.

I would love it if someone were to find ALv2-compatible versions of the
above plugins so we could manage them in our code base, but that hasn't
been done yet.



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Thomas Bennett <lm...@gmail.com>
Reply-To: "dev@oodt.apache.org" <de...@oodt.apache.org>
Date: Monday, August 18, 2014 11:39 AM
To: "dev@oodt.apache.org" <de...@oodt.apache.org>
Cc: Etienne Koen <et...@scs-space.com>, Thomas Bennett
<th...@ska.ac.za>, "cschollar@ska.ac.za" <cs...@ska.ac.za>
Subject: Re: Remote data transfer

>Thanks Chris.
>
>Just to add to the conversation - what protocols are currently supported?
>
>I've seen scp, FTP and http. Also Amazon S3?
>
>On Monday, August 18, 2014, Mattmann, Chris A (3980) <
>chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> Hi Etienne,
>>
>> Thanks. The Push Pull system is a way to pull down remote or ancillary
>> files usually *ahead* of file manager ingestion, since the crawler
>> really doesn't have a protocol layer to mitigate remote content.
>> The typical use case if you use Push Pull is:
>>
>> 1. Model remote/ancillary files on other sites
>> 2. Download them with push pull into a "staging area"
>> 3. Crawl and ingest with crawler, as if the content were
>> local to start out with.
>>
>> There is a Push Pull users guide here, it's a bit old but should
>> explain it:
>>
>>
>> http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/
>>
>>
>> Cheers,
>> Chris
>>
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Chief Architect
>> Instrument Software and Science Data Systems Section (398)
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 168-519, Mailstop: 168-527
>> Email: chris.a.mattmann@nasa.gov
>> WWW:  http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Associate Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: Etienne Koen <etiennek@scs-space.com>
>> Date: Monday, August 18, 2014 2:36 AM
>> To: Thomas Bennett <thomas@ska.ac.za>
>> Cc: Chris Mattmann <Chris.A.Mattmann@jpl.nasa.gov>,
>> "cschollar@ska.ac.za" <cschollar@ska.ac.za>,
>> "dev@oodt.apache.org" <dev@oodt.apache.org>
>> Subject: RE: Remote data transfer
>>
>> >Hi Tomas and all,
>> >
>> >I came across the push/pull tutorial at
>> >
>> >https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide
>> >
>> >Would this guide be more appropriate to download files that have  been
>> >archived by the file manager and represent a typical user scenario?
>> >
>> >Regards
>> >Etienne
>> >________________________________________
>> >From: Thomas Bennett [thomas@ska.ac.za]
>> >Sent: Friday, August 15, 2014 9:54 AM
>> >To: Etienne Koen
>> >Cc: Mattmann, Chris A (3980); cschollar@ska.ac.za; dev@oodt.apache.org
>> >Subject: Re: Remote data transfer
>> >
>> >Hi Etienne,
>> >
>> >There are various methods you can use to download the data.
>> >
>> >See this page:
>> >
>> >https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager
>> >
>> >Recently there is some great work that has been done on using a REST API
>> >- this exists on svn trunk. I don't think it has been released yet.
>> >
>> >https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API
>> >
>> >To use these components you will need to deploy tomcat or jetty.
>> >
>> >Shout if you need some help.
>> >
>> >Cheers,
>> >Tom
>> >
>> >
>> >
>> >
>> >On Thu, Aug 14, 2014 at 4:31 PM, Etienne Koen
>> ><etiennek@scs-space.com> wrote:
>> >Hi Chris and Tom,
>> >
>> >As I mentioned in my previous email, I have managed to ingest
>> >a file to a remote location using the filemgr-client. I am also able to
>> >query the information remotely using, for example, the query_tool in this
>> >way:
>> >
>> >$ ./query_tool --url http://192.168.0.10:9000 --lucene -query
>> >'CAS.ProductName:blah.txt'
>> >
>> >978ca28e-23b0-11e4-87fb-4f1c29029486
>> >
>> >What component would I use for searching and downloading the actual
>> >product from the remote file manager? Is the filemgr-client or
>> >query_tool capable of doing this?
>> >
>> >Are there any tutorials you would recommend?
>> >
>> >Thanks
>> >Etienne
>> >
>> >________________________________________
>> >From: Mattmann, Chris A (3980)
>> >[chris.a.mattmann@jpl.nasa.gov]
>> >Sent: Wednesday, August 13, 2014 6:04 PM
>> >To: Etienne Koen; Thomas Bennett
>> >Cc: cschollar@ska.ac.za; dev@oodt.apache.org; Mattmann, Chris A (3980)
>> >Subject: Re: Remote data transfer
>> >
>> >Thanks guys.
>> >
>> >Etienne, I hope you don't mind but I've copied dev@oodt.apache.org
>> >on this email. That way you can tap into the entire Apache OODT
>> >community for help.
>> >
>> >The "URI has an authority component" error usually indicates
>> >that you have referenced some environment variable in your config
>> >(e.g., filemgr.properties in the etc directory) but that variable
>> >isn't defined. E.g., maybe you have a *.policy.dirs property set
>> >to file://[SOME_UNDEFINED_VARIABLE]/path/dir/ and
>> >SOME_UNDEFINED_VARIABLE is undefined.
>> >
>> >Can you check that to see if that's the root cause of this issue?
>> >
>> >Cheers,
>> >Chris
>> >
>> >------------------------
>> >Chris Mattmann
>> >chris.mattmann@gmail.com
>> >
>> >
>> >
>> >
>> >-----Original Message-----
>> >From: Etienne Koen <etiennek@scs-space.com>
>> >Date: Wednesday, August 13, 2014 1:42 AM
>> >To: Thomas Bennett <thomas@ska.ac.za>
>> >Cc: "cschollar@ska.ac.za" <cschollar@ska.ac.za>, Chris Mattmann
>> ><chris.mattmann@gmail.com>
>> >Subject: RE: Remote data transfer
>> >
>> >>Hi Tom,
>> >>
>> >>I get the following error when using the argument:
>> >>
>> >>ERROR: Failed to ingest product 'blah.txt' : URI has an authority
>> >>component
>> >>
>> >>Here both the server and client were using port 9000
>> >>
>> >>I get this when both the server and client are running on the same
>> >>port
>> >>
>> >>When communicating on different ports I get:
>> >>
>> >><-- some I/O / HTTP exceptions -->
>> >>...
>> >>...
>> >>
>> >>ERROR: Failed to ingest product 'blah.txt' : Connection refused
>> >>
>> >>Server:9000 and Client:431
>> >>
>> >>Do you know what any of this means?
>> >>
>> >>Cheers
>> >>Etienne
>> >>
>> >>________________________________________
>> >>From: Thomas Bennett [thomas@ska.ac.za]
>> >>Sent: Wednesday, August 13, 2014 10:02 AM
>> >>To: Etienne Koen
>> >>Cc: cschollar@ska.ac.za; chris.mattmann@gmail.com
>> >>Subject: Re: Remote data transfer
>> >>
>> >>Hey Etienne,
>> >>
>> >>I've been out of the office the last week but I'm back now.
>> >>
>> >>./filemgr-client --url http://localhost:9000 --operation
>> >>--ingestProduct --productName blah.txt --productStructure Flat
>> >>--productTypeName GenericFile --metadataFile file:///tmp/blah.txt.met
>> >>--refs file:///tmp/blah.txt
>> >>
>> >>How would this line be modified to achieve what I want to do? I see
>>there
>> >>is also an argument --clientTransfer --dataTransfer but I am not sure
>> >>what java class to use for this?
>> >>
>> >>You will need to specify the filemgr remotely ie: --url
>> >>http://192.168.0.1 - are you doing this?
>> >>
>> >>I've done remote file transfer before I'll see if I can remember how
>>to
>> >>do it.
>> >>
>> >>Can I log into the CHPC with the usual credentials?
>> >>
>> >>Cheers,
>> >>Tom
>> >>--
>> >>Thomas Bennett
>> >>
>> >>SKA South Africa
>> >>Science Processing Team
>> >>
>> >>Office: +27 21 5067341
>> >>Mobile: +27 79 5237105
>> >>
>> >>________________________________
>> >>Disclaimer: This E-mail message, including any attachments, is
>>intended
>> >>only for the person or entity to which it is addressed, and may
>>contain
>> >>confidential information. Each page attached hereto must also be read
>>in
>> >>conjunction with this disclaimer.
>> >>If you are not the intended recipient you are hereby notified that any
>> >>disclosure, copying, distribution or reliance upon the contents of
>>this
>> >>e-mail is strictly prohibited. E.&O.E.
>> >>
>> >
>> >
>> >Disclaimer: This E-mail message, including any attachments, is intended
>> >only for the  person or entity to which it is addressed, and may
>>contain
>> >confidential  information. Each page attached hereto must also be read
>>in
>> >conjunction with this disclaimer.
>> >If you are not the intended recipient you are hereby notified that any
>> >disclosure, copying, distribution or reliance upon the contents of this
>> >e-mail is strictly prohibited.    E.&O.E.
>> >
>> >
>> >
>> >
>> >--
>> >Thomas Bennett
>> >
>> >SKA South Africa
>> >Science Processing Team
>> >
>> >Office: +27 21 5067341
>> >Mobile: +27 79 5237105
>> >
>> >________________________________
>> >Disclaimer: This E-mail message, including any attachments, is intended
>> >only for the person or entity to which it is addressed, and may contain
>> >confidential information. Each page attached hereto must also be read
>>in
>> >conjunction with this disclaimer.
>> >If you are not the intended recipient you are hereby notified that any
>> >disclosure, copying, distribution or reliance upon the contents of this
>> >e-mail is strictly prohibited. E.&O.E.
>> >
>>
>>


Disclaimer: This E-mail message, including any attachments, is intended only for the  person or entity to which it is addressed, and may contain confidential  information. Each page attached hereto must also be read in conjunction with this disclaimer.
If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or reliance upon the contents of this e-mail is strictly prohibited.    E.&O.E.


Re: Remote data transfer

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
Hi Tom,

Great question!

By default, the supported protocols are all of those here in the cas-protocol
module of Apache OODT:

http://svn.apache.org/repos/asf/oodt/trunk/protocol/


* ftp
* http(s)
* imaps
* sftp

Note that there is an Amazon S3 "data transfer" module in
the File Manager, but not explicitly in Push Pull. It would
hopefully not be too difficult (and a welcome patch!) to
incorporate this functionality into the cas-protocol layer.

There are also these specific Push Pull plugins:

https://cwiki.apache.org/confluence/display/OODT/OODT+Push+Pull+Plugins


Note that the Push Pull plugins on the wiki page above leverage LGPL
libraries, and I wasn't able to find a replacement for them. We aren't
officially "recommending" them as Apache OODT PMC members, but they are
useful FTP plugins if you can't get the existing protocol-ftp plugin to work.
If you use them, however, you do so knowingly, by explicitly downloading
these plugins and building them into your OODT Push Pull installation.

I would love it if someone were to find ALv2-compatible versions of the
above plugins so we could manage them in our code base, but that hasn't
been done yet.



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Thomas Bennett <lm...@gmail.com>
Reply-To: "dev@oodt.apache.org" <de...@oodt.apache.org>
Date: Monday, August 18, 2014 11:39 AM
To: "dev@oodt.apache.org" <de...@oodt.apache.org>
Cc: Etienne Koen <et...@scs-space.com>, Thomas Bennett
<th...@ska.ac.za>, "cschollar@ska.ac.za" <cs...@ska.ac.za>
Subject: Re: Remote data transfer

>Thanks Chris.
>
>Just to add to the conversation - what protocols are currently supported?
>
>I've seen scp, FTP and http. Also Amazon S3?
>


Re: Remote data transfer

Posted by Thomas Bennett <lm...@gmail.com>.
Thanks Chris.

Just to add to the conversation - what protocols are currently supported?

I've seen scp, FTP and http. Also Amazon S3?

On Monday, August 18, 2014, Mattmann, Chris A (3980) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> Hi Etienne,
>
> Thanks. The Push Pull system is a way to pull down remote or ancillary
> files usually *ahead* of file manager ingestion, since the crawler
> really doesn't have a protocol layer to mitigate remote content.
> The typical use case if you use Push Pull is:
>
> 1. Model remote/ancillary files on other sites
> 2. Download them with push pull into a "staging area"
> 3. Crawl and ingest with crawler, as if the content were
> local to start out with.
>
> There is a Push Pull users guide here, it's a bit old but should
> explain it:
>
> http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/
>
>
> Cheers,
> Chris
>
>

Re: Remote data transfer

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
Hi Etienne,

Thanks. The Push Pull system is a way to pull down remote or ancillary
files usually *ahead* of file manager ingestion, since the crawler
really doesn't have a protocol layer to mitigate remote content.
The typical use case if you use Push Pull is:

1. Model remote/ancillary files on other sites
2. Download them with push pull into a "staging area"
3. Crawl and ingest with crawler, as if the content were
local to start out with.
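
To make the three steps concrete, here is a minimal Python sketch. It is
not the Push Pull or crawler API: download_to_staging and crawl_and_ingest
are hypothetical names, a local directory stands in for the remote site,
and shutil.copy stands in for a real protocol download.

```python
import shutil
import tempfile
from pathlib import Path


def download_to_staging(remote_files, staging_dir):
    """Step 2: copy each modeled 'remote' file into the staging area.

    shutil.copy is a stand-in for a real protocol download (ftp, sftp, ...).
    """
    staging_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy(src, staging_dir)) for src in remote_files]


def crawl_and_ingest(staging_dir, ingest):
    """Step 3: walk the staging area and pass each file to an ingest callback,
    as if the content had been local to start out with."""
    return [ingest(p) for p in sorted(staging_dir.iterdir()) if p.is_file()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        remote = Path(tmp) / "remote"
        remote.mkdir()
        (remote / "blah.txt").write_text("hello")

        # Step 1: model the remote files (here, just a list of paths).
        modeled = [remote / "blah.txt"]

        download_to_staging(modeled, Path(tmp) / "staging")
        ingested = crawl_and_ingest(Path(tmp) / "staging", lambda p: p.name)
        print(ingested)  # ['blah.txt']
```

The point of the split is that the ingest step only ever sees local
files, so it needs no knowledge of the protocol used to fetch them.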

There is a Push Pull user's guide here; it's a bit old but should
explain it:

http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/


Cheers,
Chris








-----Original Message-----
From: Etienne Koen <et...@scs-space.com>
Date: Monday, August 18, 2014 2:36 AM
To: Thomas Bennett <th...@ska.ac.za>
Cc: Chris Mattmann <Ch...@jpl.nasa.gov>, "cschollar@ska.ac.za"
<cs...@ska.ac.za>, "dev@oodt.apache.org" <de...@oodt.apache.org>
Subject: RE: Remote data transfer

>Hi Tomas and all,
>
>I came across the push/pull tutorial on
>https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide
>.
>
>Would this guide be more appropriate to download files that have  been
>archived by the file manager and represent a typical user scenario?
>
>Regards
>Etienne
>________________________________________
>From: Thomas Bennett [thomas@ska.ac.za]
>Sent: Friday, August 15, 2014 9:54 AM
>To: Etienne Koen
>Cc: Mattmann, Chris A (3980); cschollar@ska.ac.za; dev@oodt.apache.org
>Subject: Re: Remote data transfer
>
>Hi Etienne,
>
>There are various methods you can use to download the data.
>
>See this page:
>https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager
>
>Recently there is some great work that has been done on using a REST API
>- this exists on svn trunk. I don't think it has been released yet.
>
>https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API
>
>To use these components you will need to deploy tomcat or jetty.
>
>Shout if you need some help.
>
>Cheers,
>Tom
>
>
>
>
>On Thu, Aug 14, 2014 at 4:31 PM, Etienne Koen
><et...@scs-space.com> wrote:
>Hi Chris and Tom,
>
>As I have mentioned before in my previous email, I have managed to ingest
>a file to a remote location using the filemgr-client. I am also able to
>query the information remotely using for example the query_tool in this
>way:
>
>$ ./query_tool --url http://192.168.0.10:9000 --lucene -query
>'CAS.ProductName:blah.txt'
>
>978ca28e-23b0-11e4-87fb-4f1c29029486
>
>What component would I use for searching and downloading the actual
>product from the remote file manager? Is the filemgr-client or query_tool
>capable of doing this?
>
>Are there any tutorials you would recommend?
>
>Thanks
>Etienne
>
>________________________________________
>From: Mattmann, Chris A (3980)
>[chris.a.mattmann@jpl.nasa.gov]
>Sent: Wednesday, August 13, 2014 6:04 PM
>To: Etienne Koen; Thomas Bennett
>Cc: cschollar@ska.ac.za;
>dev@oodt.apache.org; Mattmann, Chris A (3980)
>Subject: Re: Remote data transfer
>
>Thanks guys.
>
>Etienne, I hope you don't mind but I've copied dev@oodt.apache.org
>on this email. That way you can tap into the entire Apache OODT
>community for help.
>
>The URI has authority component is usually an error indicating
>that you have referenced some environment variable in your config
>(e.g., filemgr.properties in the etc directory) but that variable
>isn't defined. E.g., maybe you have a *.policy.dirs property set
>to file://[SOME_UNDEFINED_VARIABLE]/path/dir/ and SOME_UNDEFINED_VARIABLE
>is undefined.
>
>Can you check that to see if that's the root cause of this issue?
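
For what it's worth, the effect is easy to see with any URI parser. The
sketch below uses Python's urllib.parse purely as an illustration (it is
not OODT code, and authority_of is a name made up for this example):
whatever sits between file:// and the next slash is read as the
authority. Java's java.io.File(URI) constructor rejects such URIs with
exactly this "URI has an authority component" message; whether that is
the code path hit here is an assumption.

```python
from urllib.parse import urlparse


def authority_of(uri: str) -> str:
    """Return the authority (host) part of a URI, or '' if there is none."""
    return urlparse(uri).netloc


# Three slashes: the authority is empty and the path starts at /tmp.
print(repr(authority_of("file:///tmp/blah.txt")))  # ''

# Two slashes followed by text: that text is parsed as the authority.
print(repr(authority_of("file://SOME_UNDEFINED_VARIABLE/path/dir/")))  # 'SOME_UNDEFINED_VARIABLE'
```

So a file URI is only safe to turn into a local path when what follows
file:// begins with a slash, i.e. an absolute path.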
>
>Cheers,
>Chris
>
>------------------------
>Chris Mattmann
>chris.mattmann@gmail.com
>
>
>
>
>-----Original Message-----
>From: Etienne Koen <et...@scs-space.com>
>Date: Wednesday, August 13, 2014 1:42 AM
>To: Thomas Bennett <th...@ska.ac.za>
>Cc: "cschollar@ska.ac.za" <cs...@ska.ac.za>, Chris Mattmann
><ch...@gmail.com>
>Subject: RE: Remote data transfer
>
>>Hi Tom,
>>
>>I get the following error when using the argument:
>>
>>ERROR: Failed to ingest product 'blah.txt' : URI has an authority
>>component
>>
>>Here both the server and client were using port 9000
>>
>>I get this when both the server and client are running on the same port
>>
>>When communicating on different ports I get:
>>
>><-- some I/O / HTTP exceptions -->
>>...
>>...
>>
>>ERROR: Failed to ingest product 'blah.txt' : Connection refused
>>
>>Server:9000 and Client:431
>>
>>Do you know what any of this mean?
>>
>>Cheers
>>Etienne
>>
>>________________________________________
>>From: Thomas Bennett [thomas@ska.ac.za]
>>Sent: Wednesday, August 13, 2014 10:02 AM
>>To: Etienne Koen
>>Cc: cschollar@ska.ac.za; chris.mattmann@gmail.com
>>Subject: Re: Remote data transfer
>>
>>Hey Etienne,
>>
>>I've been out of the office the last week but I'm back now.
>>
>>./filemgr-client --url http://localhost:9000 --operation --ingestProduct
>>--productName blah.txt --productStructure Flat --productTypeName
>>GenericFile --metadataFile file:///tmp/blah.txt.met --refs
>>file:///tmp/blah.txt
>>
>>How would this line be modified to achieve what I want to do? I see there
>>is also an argument --clientTransfer --dataTransfer but I am not sure
>>what java class to use for this?
>>
>>You will need to specify the filemgr remotely ie: --url
>>http://192.168.0.1 - are you doing this?
>>
>>I've done remote file transfer before I'll see if I can remember how to
>>do it.
>>
>>Can I log into the CHPC with the usual credentials?
>>
>>Cheers,
>>Tom
>>--
>>Thomas Bennett
>>
>>SKA South Africa
>>Science Processing Team
>>
>>Office: +27 21 5067341
>>Mobile: +27 79 5237105
>>
>
>
>
>
>
>--
>Thomas Bennett
>
>SKA South Africa
>Science Processing Team
>
>Office: +27 21 5067341
>Mobile: +27 79 5237105
>


RE: Remote data transfer

Posted by Etienne Koen <et...@scs-space.com>.
Hi Thomas and all,

I came across the push/pull tutorial at:
https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide

Would this guide be more appropriate for downloading files that have been archived by the file manager, and does it represent a typical user scenario?

Regards
Etienne

Re: Remote data transfer

Posted by Thomas Bennett <th...@ska.ac.za>.
Hi Etienne,

There are various methods you can use to download the data.

See this page:
https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager

Some great work has also been done recently on a REST API. It exists on
svn trunk, but I don't think it has been released yet.

https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API

To use these components you will need to deploy tomcat or jetty.
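
As a rough illustration of what fetching a product over HTTP could look like
once such a web application is deployed, here is a minimal Python sketch. The
base URL, the "reference" resource name and the "productId"/"refIndex"
parameter names are assumptions made for illustration and should be checked
against the REST API wiki page above before use.

```python
import shutil
from urllib.parse import urlencode
from urllib.request import urlopen


def reference_url(base_url, product_id, ref_index=0):
    """Build a download URL for one file reference of a product.

    ASSUMPTION: the resource name ("reference") and the query parameter
    names are illustrative; confirm them against the deployed service.
    """
    query = urlencode({"productId": product_id, "refIndex": ref_index})
    return f"{base_url.rstrip('/')}/reference?{query}"


def download(url, dest_path):
    """Stream the HTTP response body to a local file and return its path."""
    with urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

Calling download(reference_url(base, product_id), "/tmp/blah.txt") would then
save the first reference of that product, where base is whatever URL your
Tomcat or Jetty deployment exposes.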

Shout if you need some help.

Cheers,
Tom







-- 
Thomas Bennett

SKA South Africa
Science Processing Team

Office: +27 21 5067341
Mobile: +27 79 5237105

Re: Remote data transfer

Posted by "Verma, Rishi (398J)" <Ri...@jpl.nasa.gov>.
Hi Etienne, all,

Another option is to try OODT Web Grid. This web application needs to run on your remote server, but once installed there, it streams products in response to an HTTP request. Sean Kelly wrote some great documentation on how to use this service [1].

So the scenario could be: (1) run the OODT File Manager on your local machine, but archive the products remotely; (2) set up Web Grid on the remote server, and store the HTTP URL (also called 'OFSN' in OODT terminology) as a metadata value for a given product within your File Manager. When users query your File Manager catalog, they can then download the file directly from the remote server, because the catalog entry holds a link to the URL on the remote machine.
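
Step (2) could be sketched as follows: a small Python helper that writes the remote URL into a metadata file to pass along at ingest time. This is only a sketch under assumptions. The key name "RemoteURL" and the example URL are hypothetical, the keyval layout is the metadata file format as I understand File Manager ingest to expect it, and the key would probably also need to be declared in the element policy for your product type.

```python
import xml.etree.ElementTree as ET

# Namespace used by CAS metadata files (please verify against your install).
CAS_NS = "http://oodt.jpl.nasa.gov/1.0/cas"


def build_met_xml(metadata):
    """Serialize a dict of metadata keys and values as a CAS-style .met document."""
    ET.register_namespace("cas", CAS_NS)
    root = ET.Element(f"{{{CAS_NS}}}metadata")
    for key, value in metadata.items():
        keyval = ET.SubElement(root, "keyval")
        ET.SubElement(keyval, "key").text = key
        ET.SubElement(keyval, "val").text = value
    return ET.tostring(root, encoding="unicode")


def write_met_file(path, metadata):
    """Write the metadata document to disk and return the path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(build_met_xml(metadata))
    return path
```

For example, write_met_file("/tmp/blah.txt.met", {"RemoteURL": "http://example.org/grid/blah.txt"}) would produce a file that carries the link alongside the product's other metadata.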

Hope that helps!

Thanks,
Rishi

[1] http://oodt.apache.org/components/maven/grid/slides.pdf







Re: Remote data transfer

Posted by Lewis John Mcgibbney <le...@gmail.com>.
I wonder if you have seen this one?
https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager
hth
Lewis





-- 
*Lewis*

RE: Remote data transfer

Posted by Etienne Koen <et...@scs-space.com>.
Hi Chris and Tom,

As I mentioned in my previous email, I have managed to ingest a file to a remote location using the filemgr-client. I am also able to query the product information remotely, for example with the query_tool:

$ ./query_tool --url http://192.168.0.10:9000 --lucene -query 'CAS.ProductName:blah.txt'

978ca28e-23b0-11e4-87fb-4f1c29029486
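
The same query can be scripted. The following is a minimal Python sketch that only assembles the command line shown above and splits the tool's output into product IDs. It assumes query_tool sits in the current directory and prints one product ID per line, as the output above suggests; the helper names are made up for illustration.

```python
import subprocess


def build_query_command(filemgr_url, field, value, tool="./query_tool"):
    """Assemble the query_tool command line used in this message."""
    # Flags are copied verbatim from the command shown above.
    return [tool, "--url", filemgr_url, "--lucene", "-query", f"{field}:{value}"]


def parse_product_ids(output):
    """ASSUMPTION: the tool prints one product ID per non-empty line."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def find_product_ids(filemgr_url, field, value):
    """Run query_tool and return the product IDs it printed."""
    cmd = build_query_command(filemgr_url, field, value)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_product_ids(result.stdout)
```

Calling find_product_ids("http://192.168.0.10:9000", "CAS.ProductName", "blah.txt") would then return the matching IDs as a list, ready to hand to whatever component does the download.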

What component would I use for searching and downloading the actual product from the remote file manager? Is the filemgr-client or query_tool capable of doing this?

Are there any tutorials you would recommend?

Thanks
Etienne

________________________________________
From: Mattmann, Chris A (3980) [chris.a.mattmann@jpl.nasa.gov]
Sent: Wednesday, August 13, 2014 6:04 PM
To: Etienne Koen; Thomas Bennett
Cc: cschollar@ska.ac.za; dev@oodt.apache.org; Mattmann, Chris A (3980)
Subject: Re: Remote data transfer

Thanks guys.

Etienne, I hope you don't mind but I've copied dev@oodt.apache.org

on this email. That way you can tap into the entire Apache OODT
community for help.

The URI has authority component is usually an error indicating
that you have referenced some environment variable in your config
(e.g., filemgr.properties in the etc directory) but that variable
isn't defined. E.g., maybe you have a *.policy.dirs property set
to file://[SOME_UNDEFINED_VARIABLE]/path/dir/ and SOME_UNDEFINED_VARIABLE
is undefined.

Can you check that to see if that's the root cause of this issue?

Cheers,
Chris

------------------------
Chris Mattmann
chris.mattmann@gmail.com




-----Original Message-----
From: Etienne Koen <et...@scs-space.com>
Date: Wednesday, August 13, 2014 1:42 AM
To: Thomas Bennett <th...@ska.ac.za>
Cc: "cschollar@ska.ac.za" <cs...@ska.ac.za>, Chris Mattmann
<ch...@gmail.com>
Subject: RE: Remote data transfer

>Hi Tom,
>
>I get the following error when using the argument:
>
>ERROR: Failed to ingest product 'blah.txt' : URI has an authority
>component
>
>Here both the server and client were using port 9000
>
>I get this when both the server and client are running on the same port
>
>When communicating on different ports I get:
>
><-- some I/O / HTTP exceptions -->
>...
>...
>
>ERROR: Failed to ingest product 'blah.txt' : Connection refused
>
>Server:9000 and Client:431
>
>Do you know what any of this mean?
>
>Cheers
>Etienne
>
>________________________________________
>From: Thomas Bennett [thomas@ska.ac.za]
>Sent: Wednesday, August 13, 2014 10:02 AM
>To: Etienne Koen
>Cc: cschollar@ska.ac.za; chris.mattmann@gmail.com
>Subject: Re: Remote data transfer
>
>Hey Etienne,
>
>I've been out of the office the last week but I'm back now.
>
>./filemgr-client --url http://localhost:9000 --operation --ingestProduct
>--productName blah.txt --productStructure Flat --productTypeName
>GenericFile --metadataFile file:///tmp/blah.txt.met --refs
>file:///tmp/blah.txt
>
>How would this line be modified to achieve what I want to do? I see there
>is also an argument --clientTransfer --dataTransfer but I am not sure
>what java class to use for this?
>
>You will need to specify the filemgr remotely ie: --url
>http://192.168.0.1 - are you doing this?
>
>I've done remote file transfer before I'll see if I can remember how to
>do it.
>
>Can I log into the CHPC with the usual credentials?
>
>Cheers,
>Tom
>--
>Thomas Bennett
>
>SKA South Africa
>Science Processing Team
>
>Office: +27 21 5067341
>Mobile: +27 79 5237105
>
>________________________________
>Disclaimer: This E-mail message, including any attachments, is intended
>only for the person or entity to which it is addressed, and may contain
>confidential information. Each page attached hereto must also be read in
>conjunction with this disclaimer.
>If you are not the intended recipient you are hereby notified that any
>disclosure, copying, distribution or reliance upon the contents of this
>e-mail is strictly prohibited. E.&O.E.

Re: Remote data transfer

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
Etienne,

That is great to hear!

I am going to review and answer your prior emails but glad you got
this going. Please feel free to contribute any updates to the wiki
that you think make sense:

https://cwiki.apache.org/confluence/display/OODT/Home


We would definitely welcome them!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Etienne Koen <et...@scs-space.com>
Date: Thursday, August 14, 2014 12:38 AM
To: Chris Mattmann <Ch...@jpl.nasa.gov>, Thomas Bennett
<th...@ska.ac.za>
Cc: "cschollar@ska.ac.za" <cs...@ska.ac.za>, "dev@oodt.apache.org"
<de...@oodt.apache.org>
Subject: RE: Remote data transfer

>Hi Chris,
>
>Thanks for your help! No problem, please feel free to make our
>conversation available to the community!
>
>I had success with ingesting a file to a remote server. Here is the
>scenario I had:
>
>A file manager running on 192.168.0.10 and a client, running on
>192.168.0.11, wanting to ingest a file and archive it on the file manager
>node (192.168.0.10). I used the tutorial at
>https://cwiki.apache.org/confluence/display/OODT/OODT+Filemgr+User+Guide
>(Thanks Tom!) as an outline to configure the necessary parameters,
>e.g. the archive repository.
>
>Here is a list of the variables I declared:
>
>export PROD_NAME=blah.txt
>export PROD_REF=/root/source/blah.txt
>export PROD_MET=/root/source/blah.txt.met
>
>I used them with the command:
>
>./filemgr-client --url http://192.168.0.10:9000 --clientTransfer
>--dataTransfer 
>org.apache.oodt.cas.filemgr.datatransfer.RemoteDataTransferFactory
>--operation --ingestProduct --productName $PROD_NAME --productStructure
>Flat --productTypeName GenericFile --metadataFile file://$PROD_MET --refs
>file://$PROD_REF
>
>This gave a notification on the client side (192.168.0.11) that the file
>was ingested successfully. The file appeared on the file manager node
>(192.168.0.10) under the directory I specified :-)
>
>However, on the file manager node, I got a warning:
>
>WARNING: No Metadata specified for product [blah.txt] for required field
>[DataVersion]: Attempting to continue processing metadata
>
>I guess this is due to not having a metadata extractor running?
>
>I am a bit of a newbie when it comes to cluster software, security, etc.
>So here is what I did before running the file manager to make
>sure the environment was configured to allow communication:
>
>- I flushed the firewall settings and added a rule to allow communication
>
>$ iptables -F
>$ iptables -I INPUT -j ACCEPT
>
>I then checked to see if all ports allowed communication:
>
>$ iptables -L
>
>Cheers
>Etienne


RE: Remote data transfer

Posted by Etienne Koen <et...@scs-space.com>.
Hi Chris,

Thanks for your help! No problem, please feel free to make our conversation available to the community!

I had success with ingesting a file to a remote server. Here is the scenario I had:

A file manager running on 192.168.0.10 and a client, running on 192.168.0.11, wanting to ingest a file and archive it on the file manager node (192.168.0.10). I used the tutorial at https://cwiki.apache.org/confluence/display/OODT/OODT+Filemgr+User+Guide (Thanks Tom!) as an outline to configure the necessary parameters, e.g. the archive repository.

Here is a list of the variables I declared:

export PROD_NAME=blah.txt
export PROD_REF=/root/source/blah.txt
export PROD_MET=/root/source/blah.txt.met

I used them with the command:

./filemgr-client --url http://192.168.0.10:9000 --clientTransfer --dataTransfer org.apache.oodt.cas.filemgr.datatransfer.RemoteDataTransferFactory --operation --ingestProduct --productName $PROD_NAME --productStructure Flat --productTypeName GenericFile --metadataFile file://$PROD_MET --refs file://$PROD_REF

This gave a notification on the client side (192.168.0.11) that the file was ingested successfully. The file appeared on the file manager node (192.168.0.10) under the directory I specified :-)
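
A note on the file:// arguments in the command above: file://$PROD_REF only comes out as file:///root/source/blah.txt because the variable holds an absolute path. With a relative value such as source/blah.txt the result would be file://source/blah.txt, which Java parses with "source" as the URI authority, and java.io.File rejects such URIs with "URI has an authority component". A minimal sketch of a guard, assuming a hypothetical helper named to_file_uri (it is not part of OODT):

```shell
# to_file_uri: hypothetical helper, not part of OODT.
# Prints file://<path> only when <path> is non-empty and absolute.
to_file_uri() {
  case "$1" in
    /*) printf 'file://%s\n' "$1" ;;
    *)  echo "refusing to build a file URI from '$1': need an absolute path" >&2
        return 1 ;;
  esac
}

# Values taken from the variables declared earlier in this message.
PROD_REF=/root/source/blah.txt
PROD_MET=/root/source/blah.txt.met

REF_URI=$(to_file_uri "$PROD_REF") || exit 1
MET_URI=$(to_file_uri "$PROD_MET") || exit 1
echo "$REF_URI"
echo "$MET_URI"
```

The resulting $MET_URI and $REF_URI could then be passed to --metadataFile and --refs in place of the hand-built file:// strings.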

However, on the file manager node, I got a warning:

WARNING: No Metadata specified for product [blah.txt] for required field [DataVersion]: Attempting to continue processing metadata

I guess this is due to not having a metadata extractor running?
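
One thing that might be worth trying, as a guess rather than a confirmed fix: the warning says the DataVersion field is missing from the metadata supplied for the product, so adding it to the .met file by hand may be enough. The sketch below assumes the key/value layout used for blah.txt.met in the File Manager user guide; the file location and the value 1.0 are placeholders, not something from this thread.

```shell
# Hypothetical location; adjust to wherever the .met file lives.
MET_FILE=${MET_FILE:-./blah.txt.met}

# Write a small metadata file that also carries DataVersion.
# The value 1.0 is a placeholder chosen for illustration.
cat > "$MET_FILE" <<'EOF'
<cas:metadata xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
  <keyval>
    <key>Filename</key>
    <val>blah.txt</val>
  </keyval>
  <keyval>
    <key>FileLocation</key>
    <val>/root/source</val>
  </keyval>
  <keyval>
    <key>DataVersion</key>
    <val>1.0</val>
  </keyval>
</cas:metadata>
EOF

echo "wrote $MET_FILE"
```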

I am a bit of a newbie when it comes to cluster software, security, etc. So here is what I did before running the file manager to make sure the environment was configured to allow communication:

- I flushed the firewall settings and added a rule to allow communication:

$ iptables -F
$ iptables -I INPUT -j ACCEPT

I then checked to see if all ports allowed communication:

$ iptables -L
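
Note that flushing the rules and inserting a blanket ACCEPT lets in traffic on every port. A narrower alternative is sketched below, assuming the file manager listens on TCP port 9000 and the only client is 192.168.0.11, as in the scenario above. The function name build_filemgr_rule is made up for illustration; the script only prints the rule so it can be reviewed before being applied.

```shell
# Values from the scenario above; adjust for other setups.
FILEMGR_PORT=9000
CLIENT_IP=192.168.0.11

# build_filemgr_rule: hypothetical helper that prints the iptables arguments
# for accepting TCP traffic from one client address on one port.
build_filemgr_rule() {
  printf '%s\n' "-I INPUT -p tcp -s $1 --dport $2 -j ACCEPT"
}

rule=$(build_filemgr_rule "$CLIENT_IP" "$FILEMGR_PORT")

# Dry run: show the command instead of executing it.
echo "iptables $rule"
```

Once the printed line looks right, it can be run as root on the file manager node in place of the blanket ACCEPT rule.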

Cheers
Etienne