Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2012/05/04 04:20:48 UTC
[jira] [Updated] (SOLR-3434) CSVRequestHandler does not trim header when using header=true&trim=true
[ https://issues.apache.org/jira/browse/SOLR-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hoss Man updated SOLR-3434:
---------------------------
Issue Type: Improvement (was: Bug)
Summary: CSVRequestHandler does not trim header when using header=true&trim=true (was: CSVRequestHandler does not parse header properly)
edited summary & trimmed down description to reduce verbosity
{panel:title=original issue description with formatting fixes}
The documentation says:
header
true if the first line of the CSV input contains field or column names. The default is header=true. If the fieldnames parameter is absent, these field names will be used when adding documents to the index.
My command:
{noformat}
/usr/bin/curl --proxy "" 'http://localhost:8983/solr/update/csv?commit=true&debug=true&separator=|&escape=\&trim=true&header=true&overwrite=true' --data-binary @/tmp/file_with_header.txt -H 'Content-type:text/plain; charset=utf-8'
{noformat}
My data file (/tmp/file_with_header.txt) :
{noformat}
|busdate |book_id |jq_idn |name_id
|--------|-----------|-------------|-----------
|20120420| 15600| 2070469502| 12787
|20120420| 64400| 2070469503| 12787
|20120420| 100000| 2070469501| 12787
|20120420| 60000| 2070469504| 12787
|20120420| 60000| 2070538002| 12787
|20120420| 206501| 2070538003| 12787
|20120420| 199418| 2070538004| 12787
|20120420| 7000| 2070538005| 12787
{noformat}
schema.xml: (tried different variations)
{noformat}
<field name="jq_idn" type="string" indexed="true" stored="true" required="false" />
<uniqueKey>jq_idn</uniqueKey>
{noformat}
Stack trace:
{noformat}
SEVERE: org.apache.solr.common.SolrException: Document is missing mandatory uniqueKey field: jq_idn
at org.apache.solr.update.UpdateHandler.getIndexedId(UpdateHandler.java:118)
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:229)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115)
at org.apache.solr.handler.CSVLoader.doAdd(CSVRequestHandler.java:416)
at org.apache.solr.handler.SingleThreadedCSVLoader.addDoc(CSVRequestHandler.java:431)
at org.apache.solr.handler.CSVLoader.load(CSVRequestHandler.java:393)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:244)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
{noformat}
{panel}
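To illustrate why the uniqueKey lookup fails, here is a minimal sketch in Python. It is not Solr's CSVLoader code: it only mimics splitting each line on the separator, and it assumes (per the issue summary) that trim=true is applied to values but not to header names. The helper names split_line and build_doc are hypothetical, and skipping the empty column produced by the leading separator is an assumption made for the sketch.

```python
# Illustrative only: a simplified model of separator splitting, not Solr's implementation.
HEADER = "|busdate |book_id |jq_idn |name_id"
ROW = "|20120420| 15600| 2070469502| 12787"
UNIQUE_KEY = "jq_idn"


def split_line(line, separator="|", trim=False):
    """Split one line on the separator, optionally trimming each cell."""
    cells = line.split(separator)
    return [c.strip() for c in cells] if trim else cells


def build_doc(header, row, trim_header):
    """Pair field names with values; values are always trimmed, as with trim=true."""
    names = split_line(header, trim=trim_header)
    values = split_line(row, trim=True)
    # Assumption: columns with an empty name (from the leading separator) are skipped.
    return {n: v for n, v in zip(names, values) if n.strip()}


untrimmed = build_doc(HEADER, ROW, trim_header=False)
trimmed = build_doc(HEADER, ROW, trim_header=True)

print(UNIQUE_KEY in untrimmed)  # False: the field is named "jq_idn " (trailing space)
print(UNIQUE_KEY in trimmed)    # True
```

With untrimmed header names the document carries a field called "jq_idn " rather than "jq_idn", which is consistent with the "missing mandatory uniqueKey field" error in the stack trace above. The row of dashes in the sample file would presumably also be read as a data row unless it is removed or skipped.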
> CSVRequestHandler does not trim header when using header=true&trim=true
> -----------------------------------------------------------------------
>
> Key: SOLR-3434
> URL: https://issues.apache.org/jira/browse/SOLR-3434
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 3.6
> Environment: Linux
> Reporter: david babits
> Labels: CSV, header, separator