Posted to user@nutch.apache.org by KRIS MUSSHORN <mu...@comcast.net> on 2016/09/06 17:29:17 UTC
indexing metatags with Nutch 1.12
https://wiki.apache.org/nutch/IndexMetatags
As soon as I switch to nutch-site_v2, Nutch throws protocol-not-found errors during the crawl:
2016-09-06 12:23:53,102 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=442, fetchQueues.getQueueCount=1
2016-09-06 12:23:53,576 INFO fetcher.FetcherThread - fetching https://snip/inside/events/events_summary/documents/Harford_Co_Sheriff_Special_Brief.pdf (queue crawl delay=500ms)
2016-09-06 12:23:53,576 INFO fetcher.FetcherThread - fetch of https://snip/inside/events/events_summary/documents/Harford_Co_Sheriff_Special_Brief.pdf failed with: org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=https
    at org.apache.nutch.protocol.ProtocolFactory.getProtocol(ProtocolFactory.java:84)
    at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:257)
How can I fix this?
Kris