Posted to dev@oodt.apache.org by Eva Schlauch <es...@mpifr-bonn.mpg.de> on 2015/05/20 13:51:35 UTC

meerkat oodt

Dear developers,

I tried a new start. I am now at oodt 0.8.1 and have installed with 
maven the filemgr. Needed to get java (I have never worked with java 
before) running correctly and update maven to a newer one. Anyhow, the 
filemgr just ingested a file (yay!) and I am trying to figure out what 
the next steps should be.

So my oodt consists of a filemgr so far. I think I will need a workflow 
manager as well. Also, a gui to monitor the filemgr and workflowmgr 
would be nice (ops ui?). In the radix version I got the ops ui but it 
did not show anything since the rest was not running. Now, how to 
proceed...

I want to:

- ingest files connected to the characterization of 64 s-band meerkat 
receivers
- run postprocessing like visualization, extraction of important 
parameters etc.
- let people view and download files and their respective postprocessing 
products via a web interface
- provide metadata like notes that people would take alongside measurements
- get a grip on hardware versions by also versioning the ingested 
products connected to hardware component

so in general, I want to build a catalog for our s-band receivers at 
meerkat that should help the engineers to ingest and later identify 
components and their characteristics.

it may come in handy, if we can later run some statistics on the data, I 
don't know if we'll be using python or r, but that should not matter 
now. So, for example: that amplifier run produced stable LNAs, so let's 
stick to it, or else do another iteration. Something along these lines.

So:

- how do I define file types? Do I need to do that or does GenericFile 
suffice?
- how do I attach specific metadata?
- how to write and attach tasks?
- how can other people connect to the machine (interface for 
users/server client model?)
- how can I visualize the products in my catalog?
- do I need to write my own web interface or can I build on something?
- and so on...

Enough for now, best, Eva

Re: meerkat oodt

Posted by Lewis John Mcgibbney <le...@gmail.com>.

Hi Eva,
Great to have you on the list, and wow, you have a lot of questions.
Hopefully between us we can help you out here!
Let me get the ball rolling... my responses are inline.

On Wed, May 20, 2015 at 4:51 AM, Eva Schlauch <es...@mpifr-bonn.mpg.de>
wrote:

> Dear developers,
>
> I tried a new start. I am now at oodt 0.8.1 and have installed with maven
> the filemgr. Needed to get java (I have never worked with java before)
> running correctly and update maven to a newer one. Anyhow, the filemgr just
> ingested a file (yay!) and I am trying to figure out what the next steps
> should be.
>

Congratulations.

>
> So my oodt consists of a filemgr so far. I think I will need a workflow
> manager as well. Also, a gui to monitor the filemgr and workflowmgr would
> be nice (ops ui?). In the radix version I got the ops ui but it did not
> show anything since the rest was not running. Now, how to proceed...
>

I don't know if you saw our getting started guides. We link to the Vagrant
Powered OODT tutorial from the Getting Started tab on the website.
https://cwiki.apache.org/confluence/display/OODT/Vagrant+Powered+OODT
This is aimed at being a real quick way to get up and running with the OODT
stack. It essentially creates the environment and starts up the RADiX
stack:
https://cwiki.apache.org/confluence/display/OODT/RADiX+Powered+By+OODT
If you have any issues then give us a shout!

>
> I want to:
>
> - ingest files connected to the characterization of 64 s-band meerkat
> receivers
>

Once you've familiarized yourself with the RADiX install and stack as above
(and ironed out any further questions here) then typically this could be
done by staging those files somewhere and simply monitoring the staging
resource with the Push-Pull components
https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide
https://cwiki.apache.org/confluence/display/OODT/OODT+Push+Pull+Plugins

> - run postprocessing like visualization, extraction of important
> parameters etc.
>

Please see
https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example
as an example (note this may be slightly dated however it is an excellent
place to get started).

> - let people view and download files and their respective postprocessing
> products via a web interface
>

   - Quick Start for PCS OPSUI
   <https://cwiki.apache.org/confluence/display/OODT/Quick+Start+for+PCS+OPSUI>
   - Running multiple Apache OODT OPSUI
   <https://cwiki.apache.org/confluence/display/OODT/Running+multiple+Apache+OODT+OPSUI>
   - Skinning the OPSUI web application
   <https://cwiki.apache.org/confluence/display/OODT/Skinning+the+OPSUI+web+application>

> - provide metadata like notes that people would take alongside measurements
>

One of the core OODT data structures provides the concept of Metadata
[0]. Metadata is a Map of String keys mapped to Object values, so each key
can map to potentially many values, but can also map to null or to a
single value. This data structure will adequately accommodate your
measurement notes. You would just need to set up the appropriate
(possibly custom) Extractor. An example of how to do this can be found here:
https://cwiki.apache.org/confluence/display/OODT/MetExtractors+for+Crawler

[0]
http://svn.apache.org/repos/asf/oodt/trunk/metadata/src/main/java/org/apache/oodt/cas/metadata/Metadata.java
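
To make the multi-valued idea concrete, here is a minimal sketch in plain
Python. It is not the OODT Metadata class or a real OODT extractor. It only
mimics the "one key, many values" shape, and it assumes a hypothetical notes
format of one "Key: value" pair per line. The function name and the key names
(ReceiverId, Operator, Note) are made up for illustration.

```python
from collections import defaultdict


def extract_notes(lines):
    """Collect 'Key: value' note lines into a multi-valued metadata map."""
    metadata = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines, comment lines, and lines without a key separator.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        # Repeated keys accumulate, so one key can hold many values.
        metadata[key.strip()].append(value.strip())
    return dict(metadata)


# Hypothetical notes taken alongside one measurement.
notes = [
    "# notes for one measurement run",
    "ReceiverId: 017",
    "Operator: eva",
    "Note: LNA bias adjusted before run",
    "Note: repeated sweep after warm-up",
]

met = extract_notes(notes)
for key, values in met.items():
    print(key, values)
```

A real extractor would hand keys and values like these to the File Manager at
ingest time. The wiki page linked above describes how that part is wired up.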

> - get a grip on hardware versions by also versioning the ingested products
> connected to hardware component
>

I have no clue, sorry. Is someone else able to help out here?

>
> so in general, I want to build a catalog for our s-band receivers at
> meerkat that should help the engineers to ingest and later identify
> components and their characteristics.
>

This is a typical usecase for OODT and I think you'll be happy once you get
your system put in place.

>
> it may come in handy, if we can later run some statistics on the data, I
> don't know if we'll be using python or r, but that should not matter now.

I wouldn't have thought so. If you decide to use Solr as your Catalog
datastore, Solr has a bunch of APIs in different languages. You would be
able to pull some nice stats out of it.
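
As a rough sketch of what that could look like from Python: Solr's
StatsComponent returns min, max, mean and so on for a numeric field when a
request carries stats=true and stats.field=<field>. The snippet below only
builds such a request URL and does not contact any server. The core name
(oodt-fm) and the field names (GainDb, ReceiverId) are assumptions for
illustration, not names OODT defines for you.

```python
from urllib.parse import urlencode


def stats_query_url(base_url, field, query="*:*"):
    """Build a Solr select URL that asks for statistics on one field."""
    params = {
        "q": query,            # which documents to include
        "rows": 0,             # we only want the stats, not the documents
        "stats": "true",       # switch on the StatsComponent
        "stats.field": field,  # numeric field to summarize
        "wt": "json",          # response format
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical core and field names; adjust to your own catalog schema.
url = stats_query_url(
    "http://localhost:8983/solr/oodt-fm", "GainDb", "ReceiverId:017"
)
print(url)
```

The resulting URL can be fetched with any HTTP client, and the JSON response
carries the statistics under its "stats" section.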

> So, for example: that amplifier run produced stable LNAs, so let's stick to
> it, or else do another iteration. Something along these lines.
>
> So:
>
> - how do I define file types? Do I need to do that or does GenericFile
> suffice?
> - how do I attach specific metadata?
> - how to write and attach tasks?
> - how can other people connect to the machine (interface for users/server
> client model?)
> - how can I visualize the products in my catalog?
> - do I need to write my own web interface or can I build on something?
> - and so on...
>
> Enough for now, best, Eva
>

You've bolted on a good few questions at the end. Let's address the earlier
ones first, then we can move on. How does that sound?
Thanks

-- 
*Lewis*