Posted to solr-user@lucene.apache.org by "Anantharaman, Srinatha (Contractor)" <Sr...@comcast.com> on 2017/04/20 15:02:13 UTC

Issues with ingesting to Solr using Flume

Hi all,

I am trying to ingest data into Solr 6.3 using Flume 1.5 on the Hortonworks 2.5 platform. I am facing the issue below while sinking the data:

19 Apr 2017 19:54:26,943 ERROR [lifecycleSupervisor-1-3] (org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run:253)  - Unable to start SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@130344d7 counterGroup:{ name:null counters:{} } } - Exception follows.
org.kitesdk.morphline.api.MorphlineCompilationException: No command builder registered for name: detectMimeType near: {
    # /etc/flume/conf/morphline.conf: 48
    "detectMimeType" : {
        # /etc/flume/conf/morphline.conf: 50
        "includeDefaultMimeTypes" : true
    }
}

The morphline config file is as follows:


morphlines : [
  {
    id : morphline1

    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
#    importCommands : ["com.cloudera.**", "org.kitesdk.**"]

    commands :
    [

      { detectMimeType { includeDefaultMimeTypes : true } }

      {

        solrCell {

          solrLocator : ${solrLocator}

          captureAttr : true

          lowernames : true

          capture : [_attachment_body, _attachment_mimetype, basename, content, content_encoding, content_type, file, meta, text]

          parsers : [ # { parser : org.apache.tika.parser.txt.TXTParser }

                    # { parser : org.apache.tika.parser.AutoDetectParser }
                      #{ parser : org.apache.tika.parser.asm.ClassParser }
                      #{ parser : org.gagravarr.tika.FlacParser }
                      #{ parser : org.apache.tika.parser.executable.ExecutableParser }
                      #{ parser : org.apache.tika.parser.font.TrueTypeParser }
                      #{ parser : org.apache.tika.parser.xml.XMLParser }
                      #{ parser : org.apache.tika.parser.html.HtmlParser }
                      #{ parser : org.apache.tika.parser.image.TiffParser }
                      # { parser : org.apache.tika.parser.mail.RFC822Parser }
                      #{ parser : org.apache.tika.parser.mbox.MboxParser, additionalSupportedMimeTypes : [message/x-emlx] }
                      #{ parser : org.apache.tika.parser.microsoft.OfficeParser }
                      #{ parser : org.apache.tika.parser.hdf.HDFParser }
                      #{ parser : org.apache.tika.parser.odf.OpenDocumentParser }
                      #{ parser : org.apache.tika.parser.pdf.PDFParser }
                      #{ parser : org.apache.tika.parser.rtf.RTFParser }
                      { parser : org.apache.tika.parser.txt.TXTParser }
                      #{ parser : org.apache.tika.parser.chm.ChmParser }
                    ]

         fmap : { content : text }
         }

      }
      { generateUUID { field : id } }

      { sanitizeUnknownSolrFields { solrLocator : ${solrLocator} } }


      { logDebug { format : "output record: {}", args : ["@{}"] } }

      { loadSolr: { solrLocator : ${solrLocator} } }

    ]

  }

]


I have copied all the required jar files to the Flume classpath. Kindly let me know the solution for this issue.
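
A "No command builder registered" error is usually worth checking against the jars that are actually on the classpath. The following is a minimal sketch in Python that lists which jars in a directory contain a given class file. The directory `/usr/hdp/current/flume-server/lib` and the class path `org/kitesdk/morphline/tika/DetectMimeTypeBuilder.class` are assumptions, not taken from this thread, and the helper name `find_jars_with_class` is hypothetical; adjust all of them to the actual installation.

```python
import zipfile
from pathlib import Path


def find_jars_with_class(jar_dir, class_suffix):
    """Return the names of jars under jar_dir that contain an entry ending with class_suffix."""
    matches = []
    for jar in sorted(Path(jar_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if any(name.endswith(class_suffix) for name in zf.namelist()):
                    matches.append(jar.name)
        except (zipfile.BadZipFile, OSError):
            # Skip corrupt or unreadable jars instead of aborting the scan.
            continue
    return matches


if __name__ == "__main__":
    # Both values are assumptions; they do not come from the original message.
    flume_lib = "/usr/hdp/current/flume-server/lib"
    builder = "org/kitesdk/morphline/tika/DetectMimeTypeBuilder.class"
    found = find_jars_with_class(flume_lib, builder)
    print(found if found else "no jar provides " + builder)
```

An empty result would suggest that the jar providing the command is not in the scanned directory, though it says nothing about whether Flume actually loads that directory at startup.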

Regards,
~Sri


Re: Issues with ingesting to Solr using Flume

Posted by Shawn Heisey <ap...@elyograg.org>.
On 4/20/2017 9:02 AM, Anantharaman, Srinatha (Contractor) wrote:
> Hi all,
>
> I am trying to ingest data into Solr 6.3 using Flume 1.5 on the Hortonworks 2.5 platform. I am facing the issue below while sinking the data:
>
> 19 Apr 2017 19:54:26,943 ERROR [lifecycleSupervisor-1-3] (org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run:253)  - Unable to start SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@130344d7 counterGroup:{ name:null counters:{} } } - Exception follows.
> org.kitesdk.morphline.api.MorphlineCompilationException: No command builder registered for name: detectMimeType near: {
>     # /etc/flume/conf/morphline.conf: 48
>     "detectMimeType" : {
>         # /etc/flume/conf/morphline.conf: 50
>         "includeDefaultMimeTypes" : true
>     }
> }

I know nothing at all about Flume, but Solr is not mentioned anywhere in
that message.  My recommendation is to ask for help with this problem
using a Flume resource.  If Solr is doing something wrong, they should be
able to help you find evidence showing that.  At that point, you can come
back to this thread with that evidence.

Are there any ERROR or WARN messages in the Solr logs?
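
As a rough way to answer that question, the sketch below (Python) pulls the ERROR and WARN lines out of a log file. It assumes the level appears as a standalone word on each log line, which may not hold for every logging configuration. The helper name `problem_lines` is hypothetical, and the log file location is not given in this thread, so it is passed on the command line.

```python
import re
import sys

# Match the level as a whole word so that, for example, "WARNING" is not counted.
LEVEL_PATTERN = re.compile(r"\b(ERROR|WARN)\b")


def problem_lines(lines):
    """Return (line_number, level, text) for each line carrying an ERROR or WARN token."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = LEVEL_PATTERN.search(line)
        if match:
            hits.append((number, match.group(1), line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Each argument is a path to a log file, e.g. the solr.log of the node being indexed to.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, level, text in problem_lines(handle):
                print(f"{path}:{number}: {text}")
```

Matching on the level token alone keeps the sketch independent of the exact timestamp and thread layout of the log lines.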

Thanks,
Shawn