Posted to dev@tika.apache.org by kevin slote <ks...@gmail.com> on 2014/06/27 16:55:37 UTC

Expected output

Hello everyone.  I have a question about the expected output from Tika.  I
am working on integrating my Python application with tika-server.  One of
the test files in my unit tests, test.he5, produces the metadata shown
below.  I call Tika by simply sending this file to
http://localhost:9998/meta while tika-server-1.5 is running.
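
For reference, the call looks roughly like the sketch below. It is only a
sketch: it assumes the file is sent with an HTTP PUT and a generic
Content-Type of application/octet-stream, and that the response can be
decoded as UTF-8. The helper names build_meta_request and fetch_metadata
are made up for this example.

```python
import urllib.request

# Endpoint taken from the message above.
TIKA_META_URL = "http://localhost:9998/meta"


def build_meta_request(payload: bytes, url: str = TIKA_META_URL) -> urllib.request.Request:
    """Build (but do not send) a PUT request carrying the raw file bytes.

    Assumption: the server accepts PUT with a generic binary content type.
    """
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


def fetch_metadata(path: str) -> str:
    """Send the file at `path` to the server and return the response body as text.

    Assumption: the response body is UTF-8 encoded.
    """
    with open(path, "rb") as handle:
        request = build_meta_request(handle.read())
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, fetch_metadata("test.he5") would return the text
pasted further down.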

Should I expect CSV-formatted data that occasionally contains long strings
of text with many line breaks?
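
If the answer is yes, here is a minimal sketch of how I would expect to
read it from Python. It assumes the output is ordinary CSV in which each
row is a metadata name followed by one or more values, quoted fields may
span several lines, and a literal quote is written as "". The function
name parse_tika_meta is made up, and the sample string is a shortened
excerpt of the output below.

```python
import csv
import io


def parse_tika_meta(csv_text: str) -> dict:
    """Map each metadata name to the list of values found on its row."""
    metadata = {}
    # newline="" keeps line breaks inside quoted fields exactly as received.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    for row in reader:
        if not row:
            continue  # skip blank lines between records
        name, values = row[0], row[1:]
        metadata.setdefault(name, []).extend(values)
    return metadata


# Shortened excerpt: one single-value row, one multi-value row, and one
# quoted field that spans several lines and contains doubled quotes.
sample = (
    '"InstrumentName","MLS Aura"\n'
    '"VerticalCoordinate","Pressure","Pressure","Pressure"\n'
    '"PCF1","#\n# filename:\n# toolkit version ""TK_VERSION_STRING"".\n"\n'
)

parsed = parse_tika_meta(sample)
print(parsed["InstrumentName"])
print(parsed["VerticalCoordinate"])
print(repr(parsed["PCF1"][0]))
```

Under those assumptions the whole PCF text comes back as a single value
for PCF1, line breaks included, rather than as hundreds of separate rows.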


"StartUTC","2009-05-02T00:00:00.000000Z"
"InstrumentName","MLS Aura"
"LastMAF","6114076"
"ProcessLevel","L2"
"GranuleYear","2009"
"OrbitNumber","25509"
"GranuleDayOfYear","122"
"VerticalCoordinate","Pressure","Pressure","Pressure"
"HostName"," "
"EndUTC","2009-05-02T23:59:59.999999Z"
"GranuleDay","2"
"cdm_data_type","PROFILE"
"PCF1","#
# filename:
# PCF.relB0
#
# description:
#   Process Control File (PCF)
#
# notes:
#
# This file supports the Release B version of the toolkit.
#       It is intended for use with toolkit version ""TK_VERSION_STRING"".
#
#       The logical IDs 10000-10999 (inclusive) are reserved for internal
#       Toolkit/ECS usage, DO NOT add logical IDs with these values.
#
# Please treat this file as a master template and make copies of it
# for your own testing. Note that the Toolkit installation script
#   sets PGS_PC_INFO_FILE to point to this master file by default.
#       Remember to reset the environment variable PGS_PC_INFO_FILE to
#       point to the instance of your PCF.
#
#       The toolkit will not interpret environment variables specified
#       in this file (e.g. ~/database/$OSTYPE/TD is not a valid reference).
#       The '~' character, however, when appearing in a reference WILL be
#       replaced with the value of the environment variable PGSHOME.
#
#       The PCF file delivered with the toolkit should be taken as a
#       template.  User entries should be added as necessary to this
#       template.  Existing entries may (in some cases should) be altered
#       but generally should not be commented out or deleted.  A few
#       entries may not be needed by all users and can in some cases
#       be commented out or deleted.  Such entries should be clearly
#       identified in the comment(s) preceding the entry/entries.
#
#       Entries preceded by the comment: (DO NOT REMOVE THIS ENTRY)
#       are deemed especially critical and should not be removed for
#       any reason (although the values of the various fields of such an
#       entry may be configurable).
#
# -----------------------------------------------------------------------
?   SYSTEM RUNTIME PARAMETERS
# -----------------------------------------------------------------------
#########################################################################
#
# This section contains unique identifiers used to track instances of
# a PGE run, versions of science software, etc.  This section must
# contain exactly two entries.  These values will be inserted by
# ECS just before a PGE is executed.  At the SCF the values may be set
# to anything but these values are not normally user definable and user
# values will be ignored/overwritten at the DAAC.
#
#########################################################################
#
# Production Run ID - unique production instance identifier
# (DO NOT REMOVE THIS ENTRY)
# -----------------------------------------------------------------------
1
# -----------------------------------------------------------------------
# Software ID - unique software configuration identifier
# (DO NOT REMOVE THIS ENTRY)
# -----------------------------------------------------------------------
1
#
?   PRODUCT INPUT FILES
#########################################################################
#
# This section is intended for standard product inputs, i.e., major
# input files such as Level 0 data files.
#
# Each logical ID may have several file instances, as given by the
# version number in the last field.
#
#########################################################################
#
# Next non-comment line is the default location for PRODUCT INPUT FILES
# WARNING! DO NOT MODIFY THIS LINE unless you have relocated these
# data set files to the location specified by the new setting.
!  /workops/jobs/science/1241373300.02916
#
#-----------------
# Test input files
#-----------------
900|job.PCF|||||1
901|l2cf.0223|/science||||1
20000|emls-signals.dat|/science||||1
20001|MLS-Aura_L2Cal-AAAP_v2-0-0_0000d000.txt|/science/l2cal||||1
20002|MLS-Aura_L2Cal-Filters_v1-5-0_0000d000.txt|/science/l2cal||||1
20003|MLS-Aura_L2Cal-DACSFilters_v1-5-1_0000d000.txt|/science/l2cal||||1
20004|MLS-Aura_L2Cal-PFG_v2-0-4_0000d000.txt|/science/l2cal||||1
20005|PFAData_R1A_v2-0-5.h5|/science/l2cal||||1
20006|PFAData_R1B_v2-0-5.h5|/science/l2cal||||1
20007|PFAData_R2_v2-0-5.h5|/science/l2cal||||1
20008|PFAData_R3_v2-0-5.h5|/science/l2cal||||1
20009|PFAData_R4_v2-0-6.h5|/science/l2cal||||1
20010|PFAData_R5H_v2-0-5.h5|/science/l2cal||||1
20011|PFAData_R5V_v2-0-5.h5|/science/l2cal||||1
20012|PFAData_DACS_v2-0-5.h5|/science/l2cal||||1
20035|MLS-Aura_L2Cal-L2PC-band1-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20036|MLS-Aura_L2Cal-L2PC-band2-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20037|MLS-Aura_L2Cal-L2PC-band3-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20038|MLS-Aura_L2Cal-L2PC-band4-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20039|MLS-Aura_L2Cal-L2PC-band5-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20040|MLS-Aura_L2Cal-L2PC-band6-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20041|MLS-Aura_L2Cal-L2PC-band7-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20042|MLS-Aura_L2Cal-L2PC-band8-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20043|MLS-Aura_L2Cal-L2PC-band9-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20044|MLS-Aura_L2Cal-L2PC-band10-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20045|MLS-Aura_L2Cal-L2PC-band11-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20046|MLS-Aura_L2Cal-L2PC-band12-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20047|MLS-Aura_L2Cal-L2PC-band13-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20048|MLS-Aura_L2Cal-L2PC-band14-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20049|MLS-Aura_L2Cal-L2PC-band17-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20050|MLS-Aura_L2Cal-L2PC-band20-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20051|MLS-Aura_L2Cal-L2PC-band23-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20052|MLS-Aura_L2Cal-L2PC-band24-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20053|MLS-Aura_L2Cal-L2PC-band25-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20054|MLS-Aura_L2Cal-L2PC-band27-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20055|MLS-Aura_L2Cal-L2PC-band28-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20056|MLS-Aura_L2Cal-L2PC-band29-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20057|MLS-Aura_L2Cal-L2PC-band30-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20058|MLS-Aura_L2Cal-L2PC-band31-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20059|MLS-Aura_L2Cal-L2PC-band32-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20060|MLS-Aura_L2Cal-L2PC-band33-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20061|MLS-Aura_L2Cal-L2PC-band34-LATSCALARHIRES_v1-7-0-l2pc039_m05.h5|/science/l2cal||||1
20062|MLS-Aura_L2Cal-L2PC-band15-SZASCALARHIRES_v1-7-0-l2pc039_0000d000.h5|/science/l2cal||||1
20063|MLS-Aura_L2Cal-L2PC-band16-SZASCALARHIRES_v1-7-0-l2pc039_0000d000.h5|/science/l2cal||||1
20064|MLS-Aura_L2Cal-L2PC-band18-SZASCALARHIRES_v1-7-0-l2pc039_0000d000.h5|/science/l2cal||||1
20065|MLS-Aura_L2Cal-L2PC-band19-SZASCALARHIRES_v1-7-0-l2pc039_0000d000.h5|/science/l2cal||||1
20066|MLS-Aura_L2Cal-L2PC-band22-LATPOLARHIRES_v1-7-0-l2pc039_0000d000.h5|/science/l2cal||||1
22200|MLS-Aura_L2Cal-Climatologies_v1-7-3_0000d000.txt|/science/l2cal||||1
21110|MLS-Aura_L1BOA_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916|||LGID:ML1OA:1:1283707|1
21050|MLS-Aura_L1BRADD_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916|||LGID:ML1RADD:1:1283706|1
21052|MLS-Aura_L1BRADT_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916|||LGID:ML1RADT:1:1283708|1
21051|MLS-Aura_L1BRADG_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916|||LGID:ML1RADG:1:1283705|1
22000|DAS.ops.asm.tavg3d_dyn_v.GEOS520.20090502_0000.V01.hdf|/workops/jobs/science/1241373300.02916|||LGID:D5OTVDYN:1:1283358|1
22001|DAS.ops.asm.tavg3d_dyn_v.GEOS520.20090502_0600.V01.hdf|/workops/jobs/science/1241373300.02916|||LGID:D5OTVDYN:1:1283424|1
22002|DAS.ops.asm.tavg3d_dyn_v.GEOS520.20090502_1200.V01.hdf|/workops/jobs/science/1241373300.02916|||LGID:D5OTVDYN:1:1283648|1
22003|DAS.ops.asm.tavg3d_dyn_v.GEOS520.20090502_1800.V01.hdf|/workops/jobs/science/1241373300.02916|||LGID:D5OTVDYN:1:1283651|1
22004|DAS.ops.asm.tavg3d_dyn_v.GEOS520.20090503_0000.V01.hdf|/workops/jobs/science/1241373300.02916|||LGID:D5OTVDYN:1:1283666|1
22005|DAS.ops.asm.tavg3d_prs_v.GEOS520.20090502_0000.V01.hdf|/workops/jobs/science/1241373300.02916|||LGID:D5OTVPRS:1:1283357|1
22006|DAS.ops.asm.tavg3d_prs_v.GEOS520.20090502_0600.V01.hdf|/workops/jobs/science/1241373300.02916|||LGID:D5OTVPRS:1:1283425|1
22007|DAS.ops.asm.tavg3d_prs_v.GEOS520.20090502_1200.V01.hdf|/workops/jobs/science/1241373300.02916|||LGID:D5OTVPRS:1:1283647|1
22008|DAS.ops.asm.tavg3d_prs_v.GEOS520.20090502_1800.V01.hdf|/workops/jobs/science/1241373300.02916|||LGID:D5OTVPRS:1:1283649|1
22009|DAS.ops.asm.tavg3d_prs_v.GEOS520.20090503_0000.V01.hdf|/workops/jobs/science/1241373300.02916|||LGID:D5OTVPRS:1:1283667|1
#
# -----------------------------------------------------------------------
# These are actual ancillary data set files - supplied by ECS or the
# user.  The following are supplied for purposes of tests and as a useful
# set of ancillary data.  These entries may be removed IF the AA tools
# are not being used.
# -----------------------------------------------------------------------
10780|usatile12|AA_DATA_INSTALL_DIR|||10751|12
10780|usatile11|AA_DATA_INSTALL_DIR|||10750|11
10780|usatile10|AA_DATA_INSTALL_DIR|||10749|10
10780|usatile9|AA_DATA_INSTALL_DIR|||10748|9
10780|usatile8|AA_DATA_INSTALL_DIR|||10747|8
10780|usatile7|AA_DATA_INSTALL_DIR|||10746|7
10780|usatile6|AA_DATA_INSTALL_DIR|||10745|6
10780|usatile5|AA_DATA_INSTALL_DIR|||10744|5
10780|usatile4|AA_DATA_INSTALL_DIR|||10743|4
10780|usatile3|AA_DATA_INSTALL_DIR|||10742|3
10780|usatile2|AA_DATA_INSTALL_DIR|||10741|2
10780|usatile1|AA_DATA_INSTALL_DIR|||10740|1
10951|mowe13a.img|AA_DATA_INSTALL_DIR||||1
10952|owe13a.img|AA_DATA_INSTALL_DIR||||1
10953|owe14d.img|AA_DATA_INSTALL_DIR||||1
10954|owe14dr.img|AA_DATA_INSTALL_DIR||||1
10955|etop05.dat|AA_DATA_INSTALL_DIR||||1
10956|fnocazm.img|AA_DATA_INSTALL_DIR||||1
10957|fnococm.img|AA_DATA_INSTALL_DIR||||1
10958|fnocpt.img|AA_DATA_INSTALL_DIR||||1
10959|fnocrdg.img|AA_DATA_INSTALL_DIR||||1
10960|fnocst.img|AA_DATA_INSTALL_DIR||||1
10961|fnocurb.img|AA_DATA_INSTALL_DIR||||1
10962|fnocwat.img|AA_DATA_INSTALL_DIR||||1
10963|fnocmax.imgs|AA_DATA_INSTALL_DIR||||1
10964|fnocmin.imgs|AA_DATA_INSTALL_DIR||||1
10965|fnocmod.imgs|AA_DATA_INSTALL_DIR||||1
10966|srzarea.img|AA_DATA_INSTALL_DIR||||1
10967|srzcode.img|AA_DATA_INSTALL_DIR||||1
10968|srzphas.img|AA_DATA_INSTALL_DIR||||1
10969|srzslop.img|AA_DATA_INSTALL_DIR||||1
10970|srzsoil.img|AA_DATA_INSTALL_DIR||||1
10971|srztext.img|AA_DATA_INSTALL_DIR||||1
10972|nmcRucPotPres.datrepack|AA_DATA_INSTALL_DIR||||1
10973|tbase.bin|AA_DATA_INSTALL_DIR|||10915|1
10974|tbase.br|AA_DATA_INSTALL_DIR|||10919|4
10974|tbase.bl|AA_DATA_INSTALL_DIR|||10918|3
10974|tbase.tr|AA_DATA_INSTALL_DIR|||10917|2
10974|tbase.tl|AA_DATA_INSTALL_DIR|||10916|1
10975|geoid.dat|AA_DATA_INSTALL_DIR||||1
#
# -----------------------------------------------------------------------
# The following are for the PGS_GCT tool only.  The IDs are #defined in
# the PGS_GCT.h file.  These entries are essential for the State Plane
# Projection but can otherwise be deleted or commented out.
# -----------------------------------------------------------------------
10210|nad27sp|~/database/common/GCT||||1
10201|nad83sp|~/database/common/GCT||||1
# -----------------------------------------------------------------------
# The following are for the PGS_AA_DCW tool only.
# The IDs are #defined in the PGS_AA_DCW.h file.
# These entries may be deleted or commented out IF the AA tools are not
# being used.
# -----------------------------------------------------------------------
10990|eurnasia/|AA_DATA_INSTALL_DIR||||1
10991|noamer/|AA_DATA_INSTALL_DIR||||1
10992|soamafr/|AA_DATA_INSTALL_DIR||||1
10993|sasaus/|AA_DATA_INSTALL_DIR||||1
#
# -----------------------------------------------------------------------
# file for Constant & Unit Conversion (CUC) tools
# IMPORTANT NOTE: THIS FILE WILL BE SUPPLIED AFTER TK4 DELIVERY!
# -----------------------------------------------------------------------
10999|PGS_CUC_maths_parameters|~/database/common/CUC||||1
#
#
#------------------------------------------------------------------------
# Metadata Configuration File (MCF) is a template to be filled in by the
# Instrument teams.  MCFWrite.temp is a scratch file used to dump the MCF
# prior to writing to the hdf file. GetAttr.temp is similarly used to
# dump metadata from the hdf attributes and is used by PGS_MET_GetPCAttr.
# (DO NOT REMOVE THESE ENTRIES)
#------------------------------------------------------------------------
10250|MCF|||||1
4000|PH.0001.MCF|/science/mcf||||1
4001|ML2T.0223.MCF|/science/mcf||||1
4002|ML2GPH.0223.MCF|/science/mcf||||1
4003|ML2CO.0223.MCF|/science/mcf||||1
4004|ML2O3.0223.MCF|/science/mcf||||1
4005|ML2OH.0223.MCF|/science/mcf||||1
4006|ML2BRO.0223.MCF|/science/mcf||||1
4007|ML2CLO.0223.MCF|/science/mcf||||1
4008|ML2H2O.0223.MCF|/science/mcf||||1
4009|ML2HCL.0223.MCF|/science/mcf||||1
4010|ML2HCN.0223.MCF|/science/mcf||||1
4011|ML2HO2.0223.MCF|/science/mcf||||1
4012|ML2IWC.0223.MCF|/science/mcf||||1
4013|ML2N2O.0223.MCF|/science/mcf||||1
4014|ML2OTH.0223.MCF|/science/mcf||||1
4015|ML2RHI.0223.MCF|/science/mcf||||1
4016|ML2SO2.0223.MCF|/science/mcf||||1
4017|ML2HNO3.0223.MCF|/science/mcf||||1
4018|ML2HOCL.0223.MCF|/science/mcf||||1
4019|ML2CH3CN.0223.MCF|/science/mcf||||1
4022|ML2DGG.0223.MCF|/science/mcf||||1
4023|ML2DGM.0223.MCF|/science/mcf||||1
#
#------------------------------------------------------------------------
# Datasets for PGS_DEM tools.
# A dataset of a given resolution is accessed via a single logical ID,
# therefore all physical files comprising a data set must be accessed
# via the same logical ID.  Use file versions to allow for multiple
# physical files within a single data set.
# Data files of 30 arc-sec resolution will be accessed via the
# logical ID 10650.
# Data files of 3 arc-sec resolution will be accessed via the
# logical ID 10653.
# NOTE: The file names in each entry must also appear in the attribute
#       column of the entry (this is a requirement of the metadata tools).
#       The entries given below are ""template"" entries and should be
#       replaced with actual file name/location data before attempting
#       to use the DEM tools.
#------------------------------------------------------------------------
#
10650|DEM30ARC_NAME.hdf|DEM_LOCATION|||DEM30ARC_NAME.hdf|1
10653|DEM3ARC_NAME.hdf|DEM_LOCATION|||DEM3ARC_NAME.hdf|1
#
?   PRODUCT OUTPUT FILES
#########################################################################
#
# This section is intended for standard product outputs, i.e., HDF-EOS
# files generated by this PGE.
#
# Each logical ID may have several file instances, as given by the
# version number in the last field.
#
#########################################################################
#
# Next line is the default location for PRODUCT OUTPUT FILES
!  /workops/jobs/science/1241373300.02916
#
30000|MLS-Aura_L2GP-BrO_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30001|MLS-Aura_L2GP-CH3CN_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30620|MLS-Aura_L2AUX-Cloud_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30002|MLS-Aura_L2GP-ClO_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30003|MLS-Aura_L2GP-CO_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30590|MLS-Aura_L2GP-DGG_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30621|MLS-Aura_L2AUX-DGM_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30632|MLS-Aura_L2FWM-DAC_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30630|MLS-Aura_L2FWM-GHz_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30631|MLS-Aura_L2FWM-THz_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30014|MLS-Aura_L2GP-GPH_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30004|MLS-Aura_L2GP-H2O_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30005|MLS-Aura_L2GP-HCl_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30006|MLS-Aura_L2GP-HCN_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30007|MLS-Aura_L2GP-HNO3_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30008|MLS-Aura_L2GP-HO2_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30009|MLS-Aura_L2GP-HOCl_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30016|MLS-Aura_L2GP-IWC_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30010|MLS-Aura_L2GP-N2O_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30011|MLS-Aura_L2GP-O3_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30012|MLS-Aura_L2GP-OH_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30015|MLS-Aura_L2GP-RHI_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30017|MLS-Aura_L2GP-SO2_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30660|MLS-Aura_L2Staging-Full_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30013|MLS-Aura_L2GP-Temperature_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
10100|MLS-Aura_L2LOGSTATUS_v02-23-c01_2009d122.txt|/workops/jobs/science/1241373300.02916||||1
10102|MLS-Aura_L2LOGUSER_v02-23-c01_2009d122.txt|/workops/jobs/science/1241373300.02916||||1
10101|MLS-Aura_L2LOG_v02-23-c01_2009d122.txt|/workops/jobs/science/1241373300.02916||||1
#
30570|MLS-Aura_L2GP-DGG1_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30571|MLS-Aura_L2GP-DGG2_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30572|MLS-Aura_L2GP-DGG3_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30573|MLS-Aura_L2GP-DGG4_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30574|MLS-Aura_L2GP-DGG5_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30575|MLS-Aura_L2GP-DGG6_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30576|MLS-Aura_L2GP-DGG7_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30577|MLS-Aura_L2GP-DGG8_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30578|MLS-Aura_L2GP-DGG9_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30579|MLS-Aura_L2GP-DGG10_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30580|MLS-Aura_L2GP-DGG11_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30581|MLS-Aura_L2GP-DGG12_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30582|MLS-Aura_L2GP-DGG13_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30583|MLS-Aura_L2GP-DGG14_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30584|MLS-Aura_L2GP-DGG15_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30585|MLS-Aura_L2GP-DGG16_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30586|MLS-Aura_L2GP-DGG17_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30587|MLS-Aura_L2GP-DGG18_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30588|MLS-Aura_L2GP-DGG19_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30589|MLS-Aura_L2GP-DGG20_v02-23-c01_2009d122.he5|/workops/jobs/science/1241373300.02916||||1
30600|MLS-Aura_L2AUX-DGM1_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30601|MLS-Aura_L2AUX-DGM2_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30602|MLS-Aura_L2AUX-DGM3_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30603|MLS-Aura_L2AUX-DGM4_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30604|MLS-Aura_L2AUX-DGM5_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30605|MLS-Aura_L2AUX-DGM6_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30606|MLS-Aura_L2AUX-DGM7_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30607|MLS-Aura_L2AUX-DGM8_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30608|MLS-Aura_L2AUX-DGM9_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30609|MLS-Aura_L2AUX-DGM10_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30610|MLS-Aura_L2AUX-DGM11_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30611|MLS-Aura_L2AUX-DGM12_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30612|MLS-Aura_L2AUX-DGM13_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30613|MLS-Aura_L2AUX-DGM14_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30614|MLS-Aura_L2AUX-DGM15_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30615|MLS-Aura_L2AUX-DGM16_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30616|MLS-Aura_L2AUX-DGM17_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30617|MLS-Aura_L2AUX-DGM18_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30618|MLS-Aura_L2AUX-DGM19_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
30619|MLS-Aura_L2AUX-DGM20_v02-23-c01_2009d122.h5|/workops/jobs/science/1241373300.02916||||1
#
10252|GetAttr.temp|||||1
10254|MCFWrite.temp|||||1
#
#------------------------------------------------------------------------
# This file is created when PGS_MET_Write is used with an intention
# to write an ASCII representation of the MCF in memory. The user is
# allowed to change the name and path if required.
#
# NOTE: THIS IS OBSOLETE, THIS ENTRY IS ONLY HERE FOR BACKWARD
#       COMPATIBILITY WITH PREVIOUS VERSIONS OF THE TOOLKIT.
#       THE LOGICAL ID 10255 SHOULD BE MOVED DOWN TO THE RUNTIME
#       PARAMETERS SECTION OF THIS FILE AND GIVEN A VALUE OF:
#       <logical_id>:<version_number> WHERE THOSE VALUES REFLECT THE
#       ACTUAL VALUES FOR THE NON-HDF OUTPUT PRODUCT FOR WHICH THE
#       ASCII METADATA IS BEING WRITTEN.  e.g.:
#       10255|reference output product|100:2
#
#------------------------------------------------------------------------
10255|asciidump|||||1
# -----------------------------------------------------------------------
#
?   SUPPORT INPUT FILES
#########################################################################
#
# This section is intended for minor input files, e.g., calibration
# files.
#
# Each logical ID may have several file instances, as given by the
# version number in the last field.
#
#########################################################################
#
# Next line is the default location for SUPPORT INPUT FILES
!  /workops/jobs/science/1241373300.02916
#
#
# -----------------------------------------------------------------------
# This ID is #defined in PGS_AA_Tools.h
# This file contains the IDs for all support and format files (listed
# below).  This entry may be deleted or commented out if the AA tools are
# not being used.
# -----------------------------------------------------------------------
10900|indexFile|~/database/common/AA||||1
#
# -----------------------------------------------------------------------
# These are support files for the data set files - to be created by user
# (not necessarily a one-to-one relationship).
# The IDs must correspond to the logical IDs in the index file (above).
# These entries may be deleted or commented out if the AA tools are not
# being used.
# -----------------------------------------------------------------------
10901|mowe13aSupport|~/database/common/AA||||1
10902|owe13aSupport|~/database/common/AA||||1
10903|owe14Support|~/database/common/AA||||1
10904|etop05Support|~/database/common/AA||||1
10905|fnoc1Support|~/database/common/AA||||1
10906|fnoc2Support|~/database/common/AA||||1
10907|zobler1Support|~/database/common/AA||||1
10908|zobler2Support|~/database/common/AA||||1
10909|nmcRucSupport|~/database/common/AA||||1
10915|tbaseSupport|~/database/common/AA||||1
10916|tbase1Support|~/database/common/AA||||1
10917|tbase2Support|~/database/common/AA||||1
10918|tbase3Support|~/database/common/AA||||1
10919|tbase4Support|~/database/common/AA||||1
10740|usatile1Support|~/database/common/AA||||1
10741|usatile2Support|~/database/common/AA||||1
10742|usatile3Support|~/database/common/AA||||1
10743|usatile4Support|~/database/common/AA||||1
10744|usatile5Support|~/database/common/AA||||1
10745|usatile6Support|~/database/common/AA||||1
10746|usatile7Support|~/database/common/AA||||1
10747|usatile8Support|~/database/common/AA||||1
10748|usatile9Support|~/database/common/AA||||1
10749|usatile10Support|~/database/common/AA||||1
10750|usatile11Support|~/database/common/AA||||1
10751|usatile12Support|~/database/common/AA||||1
10948|geoidSupport|~/database/common/AA||||1
#
# -----------------------------------------------------------------------
# The following are format files for each data set file (not necessarily
# a one-to-one relationship).  # The IDs must correspond to the logical
# IDs in the index file (10900, above).
# These entries may be deleted or commented out if the AA tools are not
# being used.
# -----------------------------------------------------------------------
10920|mowe13a.bfm|~/database/common/AA||||1
10921|owe13a.bfm|~/database/common/AA||||1
10922|owe14d.bfm|~/database/common/AA||||1
10923|owe14dr.bfm|~/database/common/AA||||1
10924|etop05.bfm|~/database/common/AA||||1
10925|fnocAzm.bfm|~/database/common/AA||||1
10926|fnocOcm.bfm|~/database/common/AA||||1
10927|fnocPt.bfm|~/database/common/AA||||1
10928|fnocRdg.bfm|~/database/common/AA||||1
10929|fnocSt.bfm|~/database/common/AA||||1
10930|fnocUrb.bfm|~/database/common/AA||||1
10931|fnocWat.bfm|~/database/common/AA||||1
10932|fnocMax.bfm|~/database/common/AA||||1
10933|fnocMin.bfm|~/database/common/AA||||1
10934|fnocMod.bfm|~/database/common/AA||||1
10935|srzArea.bfm|~/database/common/AA||||1
10936|srzCode.bfm|~/database/common/AA||||1
10937|srzPhas.bfm|~/database/common/AA||||1
10938|srzSlop.bfm|~/database/common/AA||||1
10939|srzSoil.bfm|~/database/common/AA||||1
10940|srzText.bfm|~/database/common/AA||||1
10941|nmcRucSigPotPres.bfm|~/database/common/AA||||1
10942|tbase.bfm|~/database/common/AA||||1
10943|tbase1.bfm|~/database/common/AA||||1
10944|tbase2.bfm|~/database/common/AA||||1
10945|tbase3.bfm|~/database/common/AA||||1
10946|tbase4.bfm|~/database/common/AA||||1
10700|usatile1.bfm|~/database/common/AA||||1
10701|usatile2.bfm|~/database/common/AA||||1
10702|usatile3.bfm|~/database/common/AA||||1
10703|usatile4.bfm|~/database/common/AA||||1
10704|usatile5.bfm|~/database/common/AA||||1
10705|usatile6.bfm|~/database/common/AA||||1
10706|usatile7.bfm|~/database/common/AA||||1
10707|usatile8.bfm|~/database/common/AA||||1
10708|usatile9.bfm|~/database/common/AA||||1
10709|usatile10.bfm|~/database/common/AA||||1
10710|usatile11.bfm|~/database/common/AA||||1
10711|usatile12.bfm|~/database/common/AA||||1
10947|geoid.bfm|~/database/common/AA||||1
#
#
# -----------------------------------------------------------------------
# leap seconds (TAI-UTC) file (DO NOT REMOVE THIS ENTRY)
# -----------------------------------------------------------------------
10301|leapsec.dat|~/database/common/TD||||1
#
# -----------------------------------------------------------------------
# polar motion and UTC-UT1 file (DO NOT REMOVE THIS ENTRY)
# -----------------------------------------------------------------------
10401|utcpole.dat|~/database/common/CSC||||1
#
# -----------------------------------------------------------------------
# earth model tags file (DO NOT REMOVE THIS ENTRY)
# -----------------------------------------------------------------------
10402|earthfigure.dat|~/database/common/CSC||||1
#
# -----------------------------------------------------------------------
# JPL planetary ephemeris file (binary form) (DO NOT REMOVE THIS ENTRY)
# -----------------------------------------------------------------------
10601|de200.eos|~/database/linux/CBP||||1
#
# -----------------------------------------------------------------------
# spacecraft tag definition file (DO NOT REMOVE THIS ENTRY)
# -----------------------------------------------------------------------
10801|sc_tags.dat|~/database/common/EPH||||1
#
# -----------------------------------------------------------------------
# units conversion definition file (DO NOT REMOVE THIS ENTRY)
# -----------------------------------------------------------------------
10302|udunits.dat|~/database/common/CUC||||1
#
#
?   SUPPORT OUTPUT FILES
#########################################################################
#
# This section is intended for minor output files, e.g., log files.
#
# Each logical ID may have several file instances, as given by the
# version number in the last field.
#
#########################################################################
#
# Next line is default location for SUPPORT OUTPUT FILES
!  /workops/jobs/science/1241373300.02916
#
#
# -----------------------------------------------------------------------
# These files support the SMF log functionality. Each run will cause
# status information to be written to 1 or more of the Log files. To
# simulate DAAC operations, remove the 3 Logfiles between test runs.
# Remember: all executables within a PGE will contribute status data to
# the same batch of log files. (DO NOT REMOVE THESE ENTRIES)
# -----------------------------------------------------------------------
10103|TmpStatus|||||1
10151|TmpReport|||||1
10105|TmpUser|||||1
10110|MailFile|||||1
#
# -----------------------------------------------------------------------
# ASCII file which stores pointers to runtime SMF files in lieu of
# loading them to shared memory, which is a TK5 enhancement.
# (DO NOT REMOVE THIS ENTRY)
# -----------------------------------------------------------------------
10111|ShmMem|||||1
#
#
?   USER DEFINED RUNTIME PARAMETERS
#########################################################################
#
# This section is intended for parameters used as PGE input.
#
# Note: these parameters may NOT be changed dynamically.
#
#########################################################################
#
101|ProductMetadataFile|100:1
2001|PGEVersion|V02-23
2002|Cycle|01
#
# -----------------------------------------------------------------------
# This entries define the the processing start and end times (CCSDS) and
# are used by pgs_pc_getconfigdata to obtain the processing range.
# -----------------------------------------------------------------------
2003|Processing CCSDS Start Time|2009-05-02T00:00:00.000000Z
2004|Processing CCSDS End Time|2009-05-02T23:59:59.999999Z
#
#These parameters were once needed to override the hard-wired associations
#between the species part (""keys"") of l2gp file names and their mcfs (""hash"")
2005|L2 Species Name Keys|temperature,gph,h2o,hno3,o3,hcl,clo,co,n2o,oh,rhi,so2,ho2,bro,hocl,hcn,iwc,ch3cn
2006|L2 Species Name Hash|t,gph,h2o,hno3,o3,hcl,clo,co,n2o,oh,rhi,so2,ho2,bro,hocl,hcn,iwc,ch3cn
#2007|L2 Runtime Switches|pro apr0
#
#
# -----------------------------------------------------------------------
# These parameters are required to support the PGS_SMF_Send...() tools.
# If the first parameter (TransmitFlag) is disabled, then none of the
# other parameters need to be set. By default, this functionality has been
# disabled. To enable, set TransmitFlag to 1 and supply the other 3
# parameters with local information. (DO NOT REMOVE THESE ENTRIES)
# -----------------------------------------------------------------------
10109|TransmitFlag; 1=transmit,0=disable|0
10106|RemoteHost|sandcrab
10107|RemotePath|/usr/kwan/test/PC/data
10108|EmailAddresses|kwan@eos.hitc.com
#
# -----------------------------------------------------------------------
# The following runtime parameters define various logging options.
# Parameters described as lists should be space (i.e. ' ') separated.
# The logical IDs 10117, 10118, 10119 listed below are for OPTIONAL
# control of SMF logging.  Any of these logical IDs which is unused by a
# PGE may be safely commented out (e.g. if logging is not disabled for
# any status level, then the line beginning 10117 may be commented out).
# -----------------------------------------------------------------------
10114|Logging Control; 0=disable logging, 1=enable logging|1
10115|Trace Control; 0=no trace, 1=error trace, 2=full trace|0
10116|Process ID logging; 0=don't log PID, 1=log PID|0
10117|Disabled status level list (e.g. W S F)|
10118|Disabled seed list|
10119|Disabled status code list|76809 26112
#
# -----------------------------------------------------------------------
# Toolkit version for which this PCF was intended.
# DO NOT REMOVE THIS VERSION ENTRY!
# -----------------------------------------------------------------------
10220|Toolkit version string|SCF  TK5.2.14
#
# -----------------------------------------------------------------------
# The following parameters define the ADEOS-II TMDF values (all values
# are assumed to be floating point types).  The ground reference time
# should be in TAI93 format (SI seconds since 12 AM UTC 1993-01-01).
# These formats are only prototypes and are subject to change when
# the ADEOS-II TMDF values are clearly defined.  PGEs that do not access
# ADEOS-II L0 data files do not require these parameters.  In this case
# they may be safely commented out, otherwise appropriate values should
# be supplied.
# -----------------------------------------------------------------------
10120|ADEOS-II s/c reference time|
10121|ADEOS-II ground reference time|
10122|ADEOS-II s/c clock period|
#
# -----------------------------------------------------------------------
# The following parameter defines the TRMM UTCF value (the value is
# assumed to be a floating point type).  PGEs that do not access TRMM
# data of any sort do not require this parameter.  In this case it may be
# safely commented out, otherwise an appropriate value should be
# supplied.
# -----------------------------------------------------------------------
10123|TRMM UTCF value|
#
# -----------------------------------------------------------------------
# The following parameter defines the Epoch date to be used for the
# interpretation (conversion) of NASA PB5C times (the Epoch date should
# be specified here in CCSDS ASCII format--A or B) (reserved for future
# use--this quantity is not referenced in TK 5.2).  This entry may be
# safely commented out or deleted.
# -----------------------------------------------------------------------
10124|NASA PB5C time Epoch date (ASCII UTC)|
#
# -----------------------------------------------------------------------
# The following parameter is a ""mask"" for the ephemeris data quality
# flag.  The value should be specified as an unsigned integer
# specifying those bits of the ephemeris data quality flag that
# should be considered fatal (i.e. the ephemeris data associated
# with the quality flag should be REJECTED/IGNORED).
# -----------------------------------------------------------------------
10507|ephemeris data quality flag mask|65536
#
# -----------------------------------------------------------------------
# The following parameter is a ""mask"" for the attitude data quality
# flag.  The value should be specified as an unsigned integer
# specifying those bits of the attitude data quality flag that
# should be considered fatal (i.e. the attitude data associated
# with the quality flag should be REJECTED/IGNORED).
# -----------------------------------------------------------------------
10508|attitude data quality flag mask|65536
#
# -----------------------------------------------------------------------
# ECS DPS trigger for PGE debug runs
#
# NOTICE TO PGE DEVELOPERS: PGEs which have a debug mode
# need to examine this parameter to evaluate activation rule
# (DO NOT REMOVE THIS ENTRY)
# -----------------------------------------------------------------------
10911|ECS DEBUG; 0=normal, 1=debug|0
#
# -----------------------------------------------------------------------
# This entry defines the IP address of the processing host and is used
# by the Toolkit when generating unique Intermediate and Temporary file
# names.  The Toolkit no longer relies on the PGS_HOST_PATH environment
# variable to otain this information. (DO NOT REMOVE THIS ENTRY)
# -----------------------------------------------------------------------
10099|Local IP Address of 'ether'|128.149.224.148
#
?   INTERMEDIATE INPUT
#########################################################################
#
# This section is intended for intermediate input files, i.e., files
# which are output by an earlier PGE but which are not standard
# products.
#
# Each logical ID may have only one file instance.
# Last field on the line is ignored.
#
#########################################################################
#
# Next line is default location for INTERMEDIATE INPUT FILES
!  /workops/jobs/science/1241373300.02916
#
#
?   INTERMEDIATE OUTPUT
#########################################################################
#
# This section is intended for intermediate output files, i.e., files
# which are to be input to later PGEs, but which are not standard
# products.
#
# Each logical ID may have only one file instance.
# Last field on the line is ignored.
#
#########################################################################
#
# Next line is default location for INTERMEDIATE OUTPUT FILES
!  /workops/jobs/science/1241373300.02916
#
#
?   TEMPORARY I/O
#########################################################################
#
# This section is intended for temporary files, i.e., files
# which are generated during a PGE run and deleted at PGE termination.
#
# Entries in this section are generated internally by the Toolkit.
# DO NOT MAKE MANUAL ENTRIES IN THIS SECTION.
#
#########################################################################
#
# Next line is default location for TEMPORARY FILES
!  /workops/jobs/science/1241373300.02916
#
#
?   END

"
"TAI93At0zOfGranule","5.15376007E8"
"FirstMAF","6110569"
"OrbitPeriod","5932.937352001667"
"GranuleMonth","5"
"HDFEOSVersion","HDFEOS_5.1.10"
"Pressure","1000.0","-999.99","-999.99"
"Content-Type","application/x-hdf"
"PGEVersion","V02-23"
"MiscNotes","No gmao or ncep files--falling back to climatology"

Re: Expected output

Posted by Tyler Palsulich <tp...@gmail.com>.
Hi Kevin,

The output seems correct to me. My guess is that this particular metadata
field is simply very long. I'm not familiar with the he5 format, though, so
there is a chance that the field shouldn't be in the metadata at all, but
that seems unlikely.
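
As a rough illustration of why the long field need not be a problem on the
Python side: the output quotes every field and doubles any embedded quote
character, which is the convention Python's standard csv module reads by
default. The sketch below is only a sketch. The sample string is a shortened
stand-in for the real response, and parse_meta_csv is a hypothetical helper
name, not part of Tika or of any client library.

```python
import csv
import io

# Shortened stand-in for the tika-server metadata output quoted in this thread.
sample = (
    '"InstrumentName","MLS Aura"\n'
    '"VerticalCoordinate","Pressure","Pressure","Pressure"\n'
    '"PCF1","#\n# filename:\n# It is intended for use with ""TK_VERSION_STRING"".\n"\n'
    '"GranuleYear","2009"\n'
)


def parse_meta_csv(text):
    """Map each metadata key to the list of its values (hypothetical helper)."""
    metadata = {}
    # newline="" leaves line endings untouched so the csv module can keep
    # line breaks that occur inside quoted fields.
    for row in csv.reader(io.StringIO(text, newline="")):
        if row:  # skip blank lines
            metadata[row[0]] = row[1:]
    return metadata


meta = parse_meta_csv(sample)
print(meta["InstrumentName"])
print(meta["VerticalCoordinate"])
print(meta["PCF1"][0].count("\n"), "line breaks inside the PCF1 value")
```

Keeping a list per key means multi-valued rows such as VerticalCoordinate are
not flattened, and the multi-line PCF1 text arrives as one string with its
doubled quotes already collapsed.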

You can check out http://wiki.apache.org/tika/TikaJAXRS for some
information on tika-server.
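
If the contents of the PCF1 value are needed, it looks like a small
pipe-delimited file in its own right. The sketch below assumes, purely from
the sample in this thread, that lines starting with '#' are comments, lines
starting with '!' give a default location, lines starting with '?' open a
section, and everything else is an 'id|name|value' style entry. The function
name parse_pcf_entries is hypothetical, and this is not the Toolkit's own
parser.

```python
def parse_pcf_entries(pcf_text):
    """Group pipe-delimited PCF entries by the '?' section they appear under.

    The line conventions here are inferred from the sample output only.
    """
    sections = {}
    current = None
    for raw_line in pcf_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, '#' comments and '!' default-location lines.
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        if line.startswith("?"):
            current = line[1:].strip()
            sections.setdefault(current, [])
            continue
        if "|" in line and current is not None:
            sections[current].append(line.split("|"))
    return sections


# A few lines copied from the PCF1 value in the original message.
pcf_sample = """\
?   USER DEFINED RUNTIME PARAMETERS
#
# This section is intended for parameters used as PGE input.
#
101|ProductMetadataFile|100:1
2001|PGEVersion|V02-23
10119|Disabled status code list|76809 26112
?   END
"""

entries = parse_pcf_entries(pcf_sample)
for fields in entries["USER DEFINED RUNTIME PARAMETERS"]:
    print(fields)
```

Each entry is kept as a plain list of fields because the number of fields
differs between sections in the sample (for example 10103|TmpStatus|||||1
versus 2001|PGEVersion|V02-23).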

Tyler

On Fri, Jun 27, 2014 at 7:55 AM, kevin slote <ks...@gmail.com> wrote:

> Hello everyone.  I have a question about the expected output for tika.  I
> am working on integrating my python application with tika-server.  One of
> the test files for unit test produces this for the metadata.  The test file
> is test.he5,
> and the way I call tika is just to send this file to
> http://localhost:9998/meta while tika-serve-1.5 is running.
>
> Should I expect csv formatted data that occasionally has long strings of
> text with many line breaks?
>