Posted to solr-user@lucene.apache.org by Nutan <nu...@gmail.com> on 2013/09/15 13:50:08 UTC

searching within documents

This is my schema.xml:
<schema name="documents">
<fields> 
<field name="id" type="string" indexed="true" stored="true" required="true"
multiValued="false"/>
<field name="author" type="string" indexed="true" stored="true"
multiValued="true"/>
<field name="comments" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="keywords" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="contents" type="string" indexed="true" stored="true"
multiValued="false"/>
<field name="title" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="revision_number" type="string" indexed="true" stored="true"
multiValued="false"/>

<field name="_version_" type="long" indexed="true" stored="true"
multiValued="false"/>
<dynamicField name="ignored_*" type="string" indexed="false" stored="true"
multiValued="true"/>
</fields> 
<types>
<fieldType name="string" class="solr.StrField" />
<fieldType name="integer" class="solr.IntField" />
<fieldType name="long" class="solr.LongField" />

<fieldType name="text" class="solr.TextField" >
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
</analyzer>
</fieldType>
<fieldtype name="ignored" stored="false" indexed="false" multiValued="true"
class="solr.StrField" />
</types>
<uniqueKey>id</uniqueKey>
</schema>

This is the document whose contents I want to search:
<doc>
<str name="id">8</str>
<arr name="author">
<str>nutan shinde</str>
</arr>
<str name="comments">best book for solr</str>
<str name="keywords">solr,lucene,apache tika</str>
<str name="contents">
solr,lucene is used for search based service.Google works uses web
crawler.Lucene can implement web crawler
</str>
<str name="title">solr enterprise search server</str>
<str name="revision_number">00123467889767</str>
</doc>

But when I fire this query:

http://localhost:8080/solr/select?q=co…

I get 0 documents found. The document was indexed successfully.



--
View this message in context: http://lucene.472066.n3.nabble.com/searching-within-documents-tp4090173.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: searching within documents

Posted by Nutan <nu...@gmail.com>.

Okay thanks,
I will surely read that page.
Thanks a lot.


On Wed, Sep 25, 2013 at 4:21 PM, Erick Erickson [via Lucene] <
ml-node+s472066n4091910h83@n3.nabble.com> wrote:

> [...]

Re: searching within documents

Posted by Erick Erickson <er...@gmail.com>.

Because your "text" field type is completely broken.

For instance, at indexing time
> lowercasing before using WordDelimiterFilterFactory
means that one of the purposes of WDFF, breaking
tokens up on upper/lower case transitions, can't happen.
Which you apparently intend since you have
splitOnCaseChange="1"

> you apply stemming at index time but not query time
(hence not finding q=acted)

For your query,
> you don't lowercase the input (hence contents:Sushant
not getting hits).

Please spend some time with the admin/analysis page
to understand the transformations at index time and
query time, that'll clarify a lot.
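
To make the point about matching analysis concrete, here is a minimal
sketch of a "text" type that runs the same chain at index and query time.
It is an illustration under some assumptions, not a drop-in fix: it uses a
single <analyzer> element (Solr applies an analyzer without a type attribute
to both indexing and querying), it moves WordDelimiterFilterFactory ahead of
LowerCaseFilterFactory so that splitOnCaseChange can still see case
transitions, and it keeps only one of the two stemmers from the posted schema.

```xml
<!-- Sketch only: one analyzer, used for both indexing and querying -->
<fieldType name="text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Split on delimiters and case changes while the original case is still present -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            splitOnCaseChange="1"/>
    <!-- Lowercase afterwards, so that Sushant and sushant produce the same term -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Stem on both sides, so that indexed and queried word forms agree -->
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>
```

Whether this filter set suits your data is something to check on the
admin/analysis page, and documents have to be re-indexed after the change.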

Best
Erick

On Tue, Sep 24, 2013 at 6:49 AM, Nutan <nu...@gmail.com> wrote:
> [...]

RE: searching within documents

Posted by Nutan <nu...@gmail.com>.

Why do some words return results while others do not?

For example,
1)
q=contents:Sushant

numfound is 0

q=contents:sushant

gives output

2)
q=contents:acted

numfound 0

q=contents:well

gives output

This is the document:
<result name="response" numFound="1" start="0">
  <doc>
    <str name="id">13</str>
    <arr name="author">
      <str>chetan</str>
    </arr>
    <str name="comments">worst book</str>
    <str name="keywords">solr,lucene</str>
    <str name="contents">Sushant acted well in kaipoche.</str>
    <str name="title">3 mistakes</str>
    <str name="revision_number">0012345654334</str></doc>
</result>
</response>

Please do reply. Help will be appreciated.
Thanks in advance.




RE: searching within documents

Posted by Nutan <nu...@gmail.com>.

Okay thanks.




RE: searching within documents

Posted by "Gupta, Abhinav" <ab...@ptc.com>.

You do not always need to re-index when you change schema.xml.
For example, if you only change the query analyzer (say, its tokenizer), you don't need to re-index.

But in the case below, I suppose your schema changes affect index-time analysis, so you do need to re-index.
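
As a rough sketch of what re-indexing can look like with the XML update format used in this thread: the messages below are an assumed workflow, not something taken from the poster's setup. Each top-level element is a separate message posted to the update handler, and the <add> body reuses the document with id 7 from this thread as an example.

```xml
<!-- Sketch only: each top-level element below is a separate message
     POSTed to the update handler (for example /solr/update). -->

<!-- 1. Optionally remove everything indexed with the old analysis chain -->
<delete><query>*:*</query></delete>

<!-- 2. Re-send every document so it is analyzed with the new schema -->
<add>
  <doc>
    <field name="id">7</field>
    <field name="author">nutan</field>
    <field name="comments">best book</field>
    <field name="keywords">solr,lucene</field>
    <field name="contents">solr,lucene is used for search based service.</field>
    <field name="title">solr cookbook 3.1</field>
    <field name="revision_number">0012345654334</field>
  </doc>
</add>

<!-- 3. Commit so the re-indexed documents become searchable -->
<commit/>
```

Re-sending a document with the same id replaces the old copy, so the delete step is only there to make sure nothing stale is left behind. The schema change itself takes effect only after the core is reloaded or the servlet container is restarted.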

The order of the documents depends entirely on the relevance (score) of each document.
Hope it helps.

Thanks,
Abhinav

-----Original Message-----
From: Nutan [mailto:nutanshinde1992@gmail.com] 
Sent: 24 September 2013 14:34
To: solr-user@lucene.apache.org
Subject: Re: searching within documents

[...]

Re: searching within documents

Posted by Gora Mohanty <go...@mimirtech.com>.

On 24 September 2013 14:34, Nutan <nu...@gmail.com> wrote:
> First I indexed documents using "indexing xml files to solr(sending doc to
> solr using xml file)"
> Then I made changes to schema.xml ie. I added analyzer and tokenizer.
> I then indexed some new documents using same procedure,now my searching with
> spaces works only for newly indexed files and not the initial files.
> Is it true that, after making changes to schema.xml re-indexing is
> necessary??
[...]

Yes, it is required.

Regards,
Gora

Re: searching within documents

Posted by Nutan <nu...@gmail.com>.

First I indexed documents by sending them to Solr as XML files ("indexing
xml files to solr").
Then I made changes to schema.xml, i.e. I added an analyzer and a tokenizer.
I then indexed some new documents using the same procedure. Now searching with
spaces works only for the newly indexed files and not for the initial ones.
Is it true that re-indexing is necessary after making changes to
schema.xml?

Is it the case that searching for some words works and for others it may not?
For example, when I query:
q=contents:used

output:numfound=0

and for
q=contents:for
 
output:
 "response":{"numFound":2,"start":0,"docs":[
      {
        "id":"7",
        "author":["nutan"],
        "comments":"best book",
        "keywords":"solr,lucene",
        "contents":"solr,lucene is used for search based service.",
        "title":"solr cookbook 3.1",
        "revision_number":"0012345654334"},
      {
        "id":"8",
        "author":["nutan shinde"],
        "comments":"best book for solr",
        "keywords":"solr,lucene,apache tika",
        "contents":"solr,lucene is used for search based service.Google works uses web crawler.Lucene can implelment web crawler",
        "title":"solr enterprise search server",
        "revision_number":"00123467889767"}]
  }}

My schema.xml is:

<schema  name="documents">
<fields> 
<field name="id" type="string" indexed="true" stored="true" required="true"
multiValued="false"/>
<field name="author" type="string" indexed="true" stored="true"
multiValued="true"/>
<field name="comments" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="keywords" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="contents" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="title" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="revision_number" type="string" indexed="true" stored="true"
multiValued="false"/>

<field name="_version_" type="long" indexed="true" stored="true"
multiValued="false"/>
<dynamicField name="ignored_*" type="string" indexed="false" stored="true"
multiValued="true"/>
<copyfield source="id" dest="text" />
<copyfield source="author" dest="text" />
</fields> 
<types>
<fieldType name="integer" class="solr.IntField" />
<fieldType name="long" class="solr.LongField" />

<fieldType name="string" class="solr.StrField"  />  
<fieldType name="text" class="solr.TextField" >
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.EnglishMinimalStemFilterFactory" />
<filter class="solr.SnowballPorterFilterFactory" /> 
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
splitOnCaseChange="1"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
</analyzer>
</fieldType>
<fieldtype name="ignored" stored="false" indexed="false" multiValued="true"
class="solr.StrField" />
</types>
<uniqueKey>id</uniqueKey>
</schema>

Also, for each of these queries:
contents:for
contents:search

the order in which the documents are returned changes. What is the reason for this?
How are the documents retrieved? Does it depend on the number of indexes?





Re: searching within documents

Posted by Nutan Shinde <nu...@gmail.com>.

And this works:
localhost:8080/solr/select?q=title:solr
It returns the required document,
but
localhost:8080/solr/select?q=contents:solr
gives numFound as 0.

This is the newly edited schema.xml:
<schema name="documents">
<fields>
<field name="id" type="string" indexed="true" stored="true" required="true"
multiValued="false"/>
<field name="author" type="string" indexed="true" stored="true"
multiValued="true"/>
<field name="comments" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="keywords" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="contents" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="title" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="revision_number" type="string" indexed="true" stored="true"
multiValued="false"/>

<field name="_version_" type="long" indexed="true" stored="true"
multiValued="false"/>
<dynamicField name="ignored_*" type="string" indexed="false" stored="true"
multiValued="true"/>
<copyfield source="*" dest="text" />
</fields>
<types>
<fieldType name="string" class="solr.StrField" />
<fieldType name="integer" class="solr.IntField" />
<fieldType name="long" class="solr.LongField" />
<fieldType name="text" class="solr.TextField" >
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
</analyzer>
</fieldType>
<fieldtype name="ignored" stored="false" indexed="false" multiValued="true"
class="solr.StrField" />
</types>
<uniqueKey>id</uniqueKey>
</schema>


On Sat, Sep 21, 2013 at 7:58 PM, Nutan <nu...@gmail.com> wrote:

> [...]

Re: searching within documents

Posted by Nutan <nu...@gmail.com>.

I have been trying to resolve the problem of searching within documents. It wasn't
working, so I thought of installing Solr on another system. I followed the same
process (install Tomcat, create the solr-home folder, add solr.xml), after which I get
the Solr admin homepage. I then followed the Solr cookbook for the extracting handler,
but I get this error:
update/extract/ not found on this server.
Now I am stuck on both systems, with two different errors on
different machines.
Coming back to this error: I want to search within documents. These are the
contents of schema.xml:
<schema name="documents">
<fields>
<field name="id" type="string" indexed="true" stored="true" required="true"
multiValued="false"/>
<field name="author" type="string" indexed="true" stored="true"
multiValued="true"/>
<field name="comments" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="keywords" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="contents" type="string" indexed="true" stored="true"
multiValued="false"/>
<field name="title" type="text" indexed="true" stored="true"
multiValued="false"/>
<field name="revision_number" type="string" indexed="true" stored="true"
multiValued="false"/>

<field name="_version_" type="long" indexed="true" stored="true"
multiValued="false"/>
<dynamicField name="ignored_*" type="string" indexed="false" stored="true"
multiValued="true"/>
</fields>
<types>
<fieldType name="string" class="solr.StrField" />
<fieldType name="integer" class="solr.IntField" />
<fieldType name="long" class="solr.LongField" />

<fieldType name="text" class="solr.TextField" >
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
</analyzer>
</fieldType>
<fieldtype name="ignored" stored="false" indexed="false" multiValued="true"
class="solr.StrField" />
</types>
<uniqueKey>id</uniqueKey>
</schema>

In my solrconfig.xml I have defined the standard handler for select as:
<requestHandler name="standard" class="solr.StandardRequestHandler"
default="true">
<lst name="defaults">
       <int name="rows">20</int>
       <str name="fl">*</str>
     </lst>
</requestHandler>

This is the example document that I want to search (this is the output of a *:*
query):
<doc>
<str name="id">8</str>
<arr name="author">
<str>nutan shinde</str>
</arr>
<str name="comments">best book for solr</str>
<str name="keywords">solr,lucene,apache tika</str>
<str name="contents">
solr,lucene is used for search based service.Google works uses web
crawler.Lucene can implement web crawler
</str>
<str name="title">solr enterprise search server</str>
<str name="revision_number">00123467889767</str>
</doc>

I indexed this record by posting an XML file.
I have no idea about copy fields, so please help me.
My Tomcat is working normally.




Re: searching within documents

Posted by Erick Erickson <er...@gmail.com>.

Stop. Back up. Start from the beginning.

Reiterating what Gora was saying, you're
jumping into the middle of your problem.
Start at the beginning.

Have you been able to just do the example
without changing anything else? I.e. are you
sure you have a working installation under
Tomcat? Don't try to skip this step, many Tomcat
issues (and what you've outlined in your other
two posts) are rooted in not having Solr under
Tomcat configured correctly.

Here's what I'd do
1> just use the embedded Jetty to a> set up
a Solr and b> index PDF files. That
eliminates Tomcat config issues and starts
to get you familiar with Solr.

2> when <1> works, install a stock Solr under
Tomcat and get the example to work. Don't change
your schema. Don't try to index PDF files. Don't
do _anything_ else until you have the example
working.

3> When <1> and <2> work, move on to PDF
 files under Tomcat, changing one thing at a time.

Note there are several Wiki pages that give you
detailed info on setting up Solr under Tomcat. Have
you looked at those?

You might review:
http://wiki.apache.org/solr/UsingMailingLists
In particular, consider what it would be like for
_you_ to try to diagnose a problem in a system
you know a lot about with the level of information
you're providing to us.

Best
Erick

On Sun, Sep 15, 2013 at 8:15 AM, Nutan <nu...@gmail.com> wrote:

> [...]

Re: searching within documents

Posted by Nutan <nu...@gmail.com>.

That is my whole schema.xml file.
I did not define a default search field or any copy fields. I am new to Solr and have never read
that they are needed for full-text search. Can you please send me a link on
configuring Solr to search within documents? I did follow the Solr cookbook 4.
Thanks a lot.


On Sun, Sep 15, 2013 at 5:39 PM, Gora Mohanty-3 [via Lucene] <
ml-node+s472066n4090175h98@n3.nabble.com> wrote:

> [...]

Re: searching within documents

Posted by Gora Mohanty <go...@mimirtech.com>.

On 15 September 2013 17:20, Nutan <nu...@gmail.com> wrote:
>
> this is my schema.xml :

You do not provide nearly enough information for people to
be able to help you.

Is that the entirety of your schema.xml? If so, it is missing
various important bits such as a <defaultSearchField> and
<copyField> directives needed to make a full-text search
work, as you seem to want.
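
For illustration, a catch-all field fed by <copyField> might look like
the sketch below. The field name "text" and the choice of source fields
are assumptions made for this example and are not taken from the posted
schema. Note also that in the Solr 4.x line <defaultSearchField> is, as
far as I know, deprecated in favor of a df default on the request
handler, so the sketch shows that variant.

```xml
<!-- schema.xml (sketch): a catch-all field, searched but not stored.
     With the layout used in this thread it belongs inside <fields>. -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>

<!-- schema.xml (sketch): copy the fields that should be searchable together.
     Element names are case-sensitive: copyField, not copyfield. -->
<copyField source="title" dest="text"/>
<copyField source="comments" dest="text"/>
<copyField source="keywords" dest="text"/>
<copyField source="contents" dest="text"/>

<!-- solrconfig.xml (sketch): use "text" when a query names no field -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">text</str>
  </lst>
</requestHandler>
```

With something like this in place, a query such as q=solr is run against
the "text" field, while q=contents:solr still targets one specific field.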

> this is the document i want to search through contents:
> <doc>
> <str name="id">8</str>
> <arr name="author">
> <str>nutan shinde</str>
> </arr>
> <str name="comments">best book for solr</str>
> <str name="keywords">solr,lucene,apache tika</str>
> <str name="contents">
> solr,lucene is used for search based service.Google works uses web
> crawler.Lucene can implement web crawler
> </str>
> <str name="title">solr enterprise search server</str>
> <str name="revision_number">00123467889767</str>
> </doc>

How are you indexing this document into Solr, and how do you
know that it actually got indexed?

> But when i fire this query :
>
> http://localhost:8080/solr/select?q=co…

Please provide the entire search URL, without which we are
forced to try and guess at what you are trying to do.

Regards,
Gora