Posted to dev@archiva.apache.org by James William Dumay <ja...@atlassian.com> on 2008/11/14 01:40:34 UTC

Search improvements for 1.2

Hey guys,
As mentioned on IRC we all agreed that our search feature is a little  
suboptimal.

I would like to propose the following improvements:
* Search should be more like mvnrepository.com (showing codebase  
growth etc).
* We should figure out a way of using upstream repository indexes to
improve search results.
* Advanced search needs a good rethink - we should probably use a
filter approach so that you could combine a bytecode: filter with
free text search in the same query.
* UI improvements so that the user experience feels more intuitive.
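
A rough sketch of how the filter approach above might split a query, in Python. Only the bytecode: prefix comes from this thread; the other filter names and the parse_query helper are made up for illustration, and none of this is Archiva's actual parser.

```python
# Hypothetical filter names; the thread itself only mentions "bytecode:".
KNOWN_FILTERS = {"bytecode", "groupId", "artifactId", "version"}


def parse_query(query):
    """Split a query string into (filters, free_text_terms).

    A token of the form name:value counts as a filter only when name is
    a known filter; every other token is treated as free text.
    """
    filters = {}
    free_text = []
    for token in query.split():
        name, sep, value = token.partition(":")
        if sep and value and name in KNOWN_FILTERS:
            filters.setdefault(name, []).append(value)
        else:
            free_text.append(token)
    return filters, free_text


if __name__ == "__main__":
    # One bytecode filter plus two free text terms.
    print(parse_query("bytecode:StringUtils commons lang"))
```

Tokens with an unknown prefix fall through to free text, so something like a pasted URL does not get mistaken for a filter.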

Discuss!

James

Re: Search improvements for 1.2

Posted by Dan Tran <da...@gmail.com>.

oops sorry about that :-)

On Thu, Nov 13, 2008 at 8:10 PM, Wendy Smoak <ws...@gmail.com> wrote:
> On Thu, Nov 13, 2008 at 8:43 PM, Dan Tran <da...@gmail.com> wrote:
>
>> We need automatic database upgrade at startup.
>
> I agree, but how does that relate to improving search?
>
> We have this open if you have any ideas:
> http://jira.codehaus.org/browse/MRM-1001
>
> --
> Wendy
>

Re: Search improvements for 1.2

Posted by Wendy Smoak <ws...@gmail.com>.

On Thu, Nov 13, 2008 at 8:43 PM, Dan Tran <da...@gmail.com> wrote:

> We need automatic database upgrade at startup.

I agree, but how does that relate to improving search?

We have this open if you have any ideas:
http://jira.codehaus.org/browse/MRM-1001

-- 
Wendy

Re: Search improvements for 1.2

Posted by Dan Tran <da...@gmail.com>.

We need automatic database upgrade at startup.

-D

On Thu, Nov 13, 2008 at 5:07 PM, Maria Odea Ching <oc...@apache.org> wrote:
> +1 from me too :)
>
> It has been a bit painful having separate indices for any-text and bytecode
> searches. While we're on the verge of improving the indexing and search, we
> should also consider how we would structure the index for easy integration
> with IDEs (e.g. integration with m2eclipse, q4e/IAM, etc.).
>
> Thanks,
> Deng
>
> On Fri, Nov 14, 2008 at 8:59 AM, Brett Porter <br...@apache.org> wrote:
>
>> +1
>>
>> I'd like the basic search to be more basic (ie, search all the fields at
>> once), and the index to be consolidated. This is more inline with how it was
>> built in 0.9. Then on top of that, weighting results appropriately to make
>> it easier to find. Continuing to support Lucene search syntax is a good idea
>> for those that want power from the quick search.
>>
>> Then the advanced search should be the flexible, descriptive way to search
>> on specific fields via the UI without knowing the Lucene syntax. I think
>> find artifact can be folded into that page now.
>>
>> I agree with pulling upstream repos indexes. I'd rather we do that via
>> Archiva web services rather than taking whole index files so we can do an
>> actual "diff" to apply efficiently. I don't know if Lucene has some native
>> support for that already but we could do it via timestamped records. We
>> should mark the source in the record so that searches can clarify they don't
>> already reside locally.
>>
>> The above ties into the metadata proposal too - with plugins and metadata
>> we can pull remote info on artifacts to consolidate information without
>> having to sync all the artifacts themselves. So Lucene indexing is just one
>> place that would use that, but it would be used by other plugins, reporting,
>> etc.
>>
>> Cheers,
>> Brett
>>
>>
>> On 14/11/2008, at 8:40 AM, James William Dumay wrote:
>>
>>  Hey guys,
>>> As mentioned on IRC we all agreed that our search feature is a little
>>> suboptimal.
>>>
>>> I would like to propose the following improvements:
>>> * Search should be more like mvnrepository.com (showing codebase growth
>>> etc).
>>> * We should figure out a way of using up stream repository indexes to
>>> improve search results.
>>> * Advanced search needs a good rethink - we should probably use a filter
>>> approach so that you could do bytecode: search results that include free
>>> text search.
>>> * UI improvements so that the user experience feels more intuitive.
>>>
>>> Discuss!
>>>
>>> James
>>>
>>
>> --
>> Brett Porter
>> brett@apache.org
>> http://blogs.exist.com/bporter/
>>
>>
>

Re: Search improvements for 1.2

Posted by Maria Odea Ching <oc...@apache.org>.

+1 from me too :)

It has been a bit painful having separate indices for any-text and bytecode
searches. While we're on the verge of improving the indexing and search, we
should also consider how we would structure the index for easy integration
with IDEs (e.g. integration with m2eclipse, q4e/IAM, etc.).

Thanks,
Deng

On Fri, Nov 14, 2008 at 8:59 AM, Brett Porter <br...@apache.org> wrote:

> +1
>
> I'd like the basic search to be more basic (ie, search all the fields at
> once), and the index to be consolidated. This is more inline with how it was
> built in 0.9. Then on top of that, weighting results appropriately to make
> it easier to find. Continuing to support Lucene search syntax is a good idea
> for those that want power from the quick search.
>
> Then the advanced search should be the flexible, descriptive way to search
> on specific fields via the UI without knowing the Lucene syntax. I think
> find artifact can be folded into that page now.
>
> I agree with pulling upstream repos indexes. I'd rather we do that via
> Archiva web services rather than taking whole index files so we can do an
> actual "diff" to apply efficiently. I don't know if Lucene has some native
> support for that already but we could do it via timestamped records. We
> should mark the source in the record so that searches can clarify they don't
> already reside locally.
>
> The above ties into the metadata proposal too - with plugins and metadata
> we can pull remote info on artifacts to consolidate information without
> having to sync all the artifacts themselves. So Lucene indexing is just one
> place that would use that, but it would be used by other plugins, reporting,
> etc.
>
> Cheers,
> Brett
>
>
> On 14/11/2008, at 8:40 AM, James William Dumay wrote:
>
>  Hey guys,
>> As mentioned on IRC we all agreed that our search feature is a little
>> suboptimal.
>>
>> I would like to propose the following improvements:
>> * Search should be more like mvnrepository.com (showing codebase growth
>> etc).
>> * We should figure out a way of using up stream repository indexes to
>> improve search results.
>> * Advanced search needs a good rethink - we should probably use a filter
>> approach so that you could do bytecode: search results that include free
>> text search.
>> * UI improvements so that the user experience feels more intuitive.
>>
>> Discuss!
>>
>> James
>>
>
> --
> Brett Porter
> brett@apache.org
> http://blogs.exist.com/bporter/
>
>

Re: Search improvements for 1.2

Posted by Brett Porter <br...@apache.org>.

+1

I'd like the basic search to be more basic (i.e., search all the fields
at once), and the index to be consolidated. This is more in line with
how it was built in 0.9. Then on top of that, weighting results
appropriately to make the right artifact easier to find. Continuing to
support Lucene search syntax is a good idea for those that want power
from the quick search.
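
To make the "search all the fields at once, with weighting" idea concrete, here is a minimal sketch that expands plain quick-search text into a Lucene-style query string using the field:term^boost syntax. The field names and boost values are assumptions picked for illustration, not the actual Archiva index schema. Note that this sketch escapes special characters, so it treats the input as plain text; passing power-user Lucene syntax through untouched would need a separate code path.

```python
# Hypothetical field names and boosts, for illustration only.
FIELD_BOOSTS = {"artifactId": 4, "groupId": 2, "className": 1}

# Characters with special meaning in the Lucene classic query syntax.
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')


def escape_term(term):
    """Backslash-escape Lucene special characters in a single term."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in term)


def build_quick_search(text):
    """Expand each term across all fields, then AND the terms together."""
    clauses = []
    for term in text.split():
        safe = escape_term(term)
        per_field = " OR ".join(
            f"{field}:{safe}^{boost}" for field, boost in FIELD_BOOSTS.items()
        )
        clauses.append(f"({per_field})")
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(build_quick_search("commons lang"))
```

A hit in a higher-boosted field (artifactId here) would then score above the same term found only in a lower-boosted one.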

Then the advanced search should be the flexible, descriptive way to  
search on specific fields via the UI without knowing the Lucene  
syntax. I think find artifact can be folded into that page now.

I agree with pulling upstream repos' indexes. I'd rather we do that via
Archiva web services rather than taking whole index files so we can do
an actual "diff" to apply efficiently. I don't know if Lucene has some
native support for that already but we could do it via timestamped
records. We should mark the source in the record so that search results
can make clear which artifacts don't already reside locally.
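
A minimal sketch of the timestamped-records idea, assuming each index record carries a key, a last-modified timestamp and a source id. The IndexRecord type, the "local" marker and the sync function are all hypothetical names for illustration; in practice the upstream service would presumably do the timestamp filtering itself rather than send every record.

```python
from dataclasses import dataclass

LOCAL = "local"  # hypothetical marker for artifacts that reside locally


@dataclass(frozen=True)
class IndexRecord:
    key: str       # e.g. "groupId:artifactId:version"
    modified: int  # last-modified timestamp, seconds since the epoch
    source: str    # id of the repository the record came from


def sync(local_index, upstream_records, last_sync, source_id):
    """Apply only records changed after last_sync; return the new sync time.

    local_index is a dict keyed by record key. Records that already
    reside locally are left untouched; everything else is stored with
    source_id so searches can tell where it came from.
    """
    newest = last_sync
    for rec in upstream_records:
        if rec.modified <= last_sync:
            continue  # unchanged since the previous sync, skip it
        newest = max(newest, rec.modified)
        existing = local_index.get(rec.key)
        if existing is not None and existing.source == LOCAL:
            continue  # keep the local record
        local_index[rec.key] = IndexRecord(rec.key, rec.modified, source_id)
    return newest
```

The returned timestamp is what the next sync would pass in as last_sync, which is what makes the transfer a diff rather than a full copy.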

The above ties into the metadata proposal too - with plugins and  
metadata we can pull remote info on artifacts to consolidate  
information without having to sync all the artifacts themselves. So  
Lucene indexing is just one place that would use that, but it would be  
used by other plugins, reporting, etc.

Cheers,
Brett

On 14/11/2008, at 8:40 AM, James William Dumay wrote:

> Hey guys,
> As mentioned on IRC we all agreed that our search feature is a  
> little suboptimal.
>
> I would like to propose the following improvements:
> * Search should be more like mvnrepository.com (showing codebase  
> growth etc).
> * We should figure out a way of using up stream repository indexes  
> to improve search results.
> * Advanced search needs a good rethink - we should probably use a  
> filter approach so that you could do bytecode: search results that  
> include free text search.
> * UI improvements so that the user experience feels more intuitive.
>
> Discuss!
>
> James

--
Brett Porter
brett@apache.org
http://blogs.exist.com/bporter/

Re: Search improvements for 1.2

Posted by James William Dumay <ja...@atlassian.com>.

On Mon, 2008-12-08 at 19:54 -0500, Wendy Smoak wrote:
> I notice the label says "Version(s)" but each one is on a separate
> line.  Should that just be "Version" or is the intent to group all
> versions for the same groupId/artifactId as one search result?  -Wendy

Yeah, the intent is to collapse it all into a single result - I'll be
making that part of the current set of changes.
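
To illustrate the collapsing described above, a small sketch that groups raw hits by groupId/artifactId and gathers their versions into one result. The tuple layout and the collapse_hits name are assumptions for the example, not the actual change set.

```python
def collapse_hits(hits):
    """Collapse (group_id, artifact_id, version) hits into one result each.

    Results keep the order in which each groupId/artifactId first
    appeared; versions are de-duplicated but otherwise keep hit order.
    """
    grouped = {}
    for group_id, artifact_id, version in hits:
        entry = grouped.setdefault(
            (group_id, artifact_id),
            {"groupId": group_id, "artifactId": artifact_id, "versions": []},
        )
        if version not in entry["versions"]:
            entry["versions"].append(version)
    return list(grouped.values())


if __name__ == "__main__":
    hits = [
        ("commons-lang", "commons-lang", "2.3"),
        ("commons-lang", "commons-lang", "2.4"),
        ("commons-io", "commons-io", "1.4"),
        ("commons-lang", "commons-lang", "2.3"),  # same artifact, second repo
    ]
    for result in collapse_hits(hits):
        print(result)
```

With this shape the "Version(s)" label fits, since each result row carries the whole version list.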

James

Re: Search improvements for 1.2

Posted by Wendy Smoak <ws...@gmail.com>.

I notice the label says "Version(s)" but each one is on a separate
line.  Should that just be "Version" or is the intent to group all
versions for the same groupId/artifactId as one search result?  -Wendy

On Sun, Dec 7, 2008 at 8:06 PM, James William Dumay <ja...@atlassian.com> wrote:
> Also, you can see the fruits of our Fridays labour here:
>
> Without search improvements:
> http://archiva.exist.com/quickSearch.action?q=commons+lang
>
> With search improvements:
> https://maven.atlassian.com/quickSearch.action?q=commons+lang
>
> Cheers
> James

Re: Search improvements for 1.2

Posted by James William Dumay <ja...@atlassian.com>.

Also, you can see the fruits of our Friday's labour here:

Without search improvements:
http://archiva.exist.com/quickSearch.action?q=commons+lang

With search improvements:
https://maven.atlassian.com/quickSearch.action?q=commons+lang

Cheers
James



On Fri, 2008-12-05 at 17:39 +1100, Brett Porter wrote:
> I visited James today and we came up with this list of things to fix a  
> bit more urgently (I'll be putting them in JIRA later).
> 
> Already changed:
> * remove dependencies/filecontent from the quick search - must not  
> search for dependencies on commons-lang
> * change the default search to AND instead of OR
> 
> Still to look at:
>      - [ ] improve the search results page
>          - [ ] remove metadata files
>          - [ ] merge versions in search results
>          - [ ] for snapshots, just show SNAPSHOT, not timestamps
>          - [ ] show hits in the results (this may not be possible or  
> needed with better results, however)
> 
>      - [ ] existing JIRA complaints
>          - [ ] MRM-732 (tokenizing)
>          - [ ] MRM-495 (weighting)
>          - [ ] MRM-609 (windows bug - may be fixed)
>          - [ ] MRM-933 (hit count, pagination completely busted)
> 
>      - [ ] advanced search
>          - [ ] improve appearance and flexibility, maybe change to  
> "add term" buttons on the default search
>          - [ ] class/package search is still flaky
>              - [ ] might be the analyzer rules, etc. for splitting on  
> '.'
> 
> - [ ] browse improvements
>      - [ ] artifact version list should show basic shared project  
> information rather than having to drill into one version
>      - [ ] snapshot should go to a page that shows a list of versions
>            (go to latest, but list previous snapshots)
> 
> Thoughts?
> 
> Cheers,
> Brett
> 
> On 14/11/2008, at 11:40 AM, James William Dumay wrote:
> 
> > Hey guys,
> > As mentioned on IRC we all agreed that our search feature is a  
> > little suboptimal.
> >
> > I would like to propose the following improvements:
> > * Search should be more like mvnrepository.com (showing codebase  
> > growth etc).
> > * We should figure out a way of using up stream repository indexes  
> > to improve search results.
> > * Advanced search needs a good rethink - we should probably use a  
> > filter approach so that you could do bytecode: search results that  
> > include free text search.
> > * UI improvements so that the user experience feels more intuitive.
> >
> > Discuss!
> >
> > James
> 
> --
> Brett Porter
> brett@apache.org
> http://blogs.exist.com/bporter/
>

Re: Search improvements for 1.2

Posted by Maria Odea Ching <oc...@apache.org>.

Hi All,

I've picked up what James has started in the archiva-nexus-indexer branch
[1].

So far, we're now able to generate an index in the Nexus format. A search
implementation which uses the Nexus indexer for searching is still in the
works. Aside from this, there are still a number of search-related
issues/tasks that need to be done, as James mentioned in his previous mail
(see below), before we're good to go for a 1.2-M2 release.

I'll try to stage a running instance by next week so you can see the
changes being done in the branch :)

Thanks,
Deng

[1] http://svn.apache.org/repos/asf/archiva/branches/archiva-nexus-indexer/

On Tue, Dec 16, 2008 at 10:23 AM, James William Dumay
<ja...@atlassian.com> wrote:

>
> On 15/12/2008, at 4:54 PM, James William Dumay wrote:
>
>  Once it is in trunk (as Deng suggested) I would like to remove the need
>> for separate indices to make sure we truly show no duplicate hits.
>>
>
> Changes have been committed to trunk and we still have quite a bit todo.
>
> See MRM-1037 for the tasks necessary to complete.
>
> M2 could be released once the subtask MRM-1045 is completed.
>
> Cheers
> James
>

-- 
Maria Odea Ching
Software Engineer | Exist Global | 687-4091 | Skype: maria.odea.ching |
www.exist.com | Innovation Delivered

Re: Search improvements for 1.2

Posted by James William Dumay <ja...@atlassian.com>.

On 15/12/2008, at 4:54 PM, James William Dumay wrote:

> Once it is in trunk (as Deng suggested) I would like to remove the  
> need
> for separate indices to make sure we truly show no duplicate hits.

Changes have been committed to trunk and we still have quite a bit to do.

See MRM-1037 for the tasks necessary to complete.

M2 could be released once the subtask MRM-1045 is completed.

Cheers
James

Re: Search improvements for 1.2

Posted by James William Dumay <ja...@atlassian.com>.

On Mon, 2008-12-15 at 16:19 +1100, Brett Porter wrote:
> Looks good to me too. Is it trunk ready already?

Almost - I need to clean up a few unit tests before it goes into trunk.
I'll probably merge this in tonight.

Once it is in trunk (as Deng suggested) I would like to remove the need
for separate indices to make sure we truly show no duplicate hits. 

Cheers
James

Re: Search improvements for 1.2

Posted by Brett Porter <br...@apache.org>.

Looks good to me too. Is it trunk ready already?

On 15/12/2008, at 9:08 AM, Maria Odea Ching wrote:

> Looking good James :) I especially like the new version(s) link..
>
> Thanks for all the work you're putting on this! :)
>
> -Deng
>
> On Sat, Dec 13, 2008 at 2:17 PM, James William Dumay <james@atlassian.com 
> >wrote:
>
>> Hey guys,
>> I thought the versions could do with a come back:
>>
>>
>> http://skitch.com/jdumay/6cqs/archiva-search-w-no-duplicates-and-full-version-list
>>
>> What do you think? :)
>>
>> James
>>
>>
>>
>> On 13/12/2008, at 4:46 PM, James William Dumay wrote:
>>
>> Hey guys,
>>> I've done some more work this weekend on the search results.
>>>
>>>
>>> http://skitch.com/jdumay/6cc2/archiva-search-results-w-no-duplicates-repository-id-or-versions-displayed
>>>
>>> So far:
>>> * There are no duplicate entries.
>>> * I've removed the link to the repository the result was from  
>>> (otherwise
>>> we might have two copies of "commons-lang" appearing in the  
>>> results but from
>>> different repositories. IMO, this information can be found on the  
>>> artifact
>>> info page).
>>> * Removed the versions available.
>>> * Clicking on the artifact link now takes you to browse the  
>>> versions.
>>>
>>> Thoughts?
>>>
>>> James
>>>
>>>
>>> On 05/12/2008, at 5:39 PM, Brett Porter wrote:
>>>
>>> I visited James today and we came up with this list of things to  
>>> fix a
>>>> bit more urgently (I'll be putting them in JIRA later).
>>>>
>>>> Already changed:
>>>> * remove dependencies/filecontent from the quick search - must  
>>>> not search
>>>> for dependencies on commons-lang
>>>> * change the default search to AND instead of OR
>>>>
>>>> Still to look at:
>>>> - [ ] improve the search results page
>>>>     - [ ] remove metadata files
>>>>     - [ ] merge versions in search results
>>>>     - [ ] for snapshots, just show SNAPSHOT, not timestamps
>>>>     - [ ] show hits in the results (this may not be possible or  
>>>> needed
>>>> with better results, however)
>>>>
>>>> - [ ] existing JIRA complaints
>>>>     - [ ] MRM-732 (tokenizing)
>>>>     - [ ] MRM-495 (weighting)
>>>>     - [ ] MRM-609 (windows bug - may be fixed)
>>>>     - [ ] MRM-933 (hit count, pagination completely busted)
>>>>
>>>> - [ ] advanced search
>>>>     - [ ] improve appearance and flexibility, maybe change to  
>>>> "add term"
>>>> buttons on the default search
>>>>     - [ ] class/package search is still flaky
>>>>         - [ ] might be the analyzer rules, etc. for splitting on  
>>>> '.'
>>>>
>>>> - [ ] browse improvements
>>>> - [ ] artifact version list should show basic shared project  
>>>> information
>>>> rather than having to drill into one version
>>>> - [ ] snapshot should go to a page that shows a list of versions
>>>>       (go to latest, but list previous snapshots)
>>>>
>>>> Thoughts?
>>>>
>>>> Cheers,
>>>> Brett
>>>>
>>>> On 14/11/2008, at 11:40 AM, James William Dumay wrote:
>>>>
>>>> Hey guys,
>>>>> As mentioned on IRC we all agreed that our search feature is a  
>>>>> little
>>>>> suboptimal.
>>>>>
>>>>> I would like to propose the following improvements:
>>>>> * Search should be more like mvnrepository.com (showing codebase  
>>>>> growth
>>>>> etc).
>>>>> * We should figure out a way of using up stream repository  
>>>>> indexes to
>>>>> improve search results.
>>>>> * Advanced search needs a good rethink - we should probably use  
>>>>> a filter
>>>>> approach so that you could do bytecode: search results that  
>>>>> include free
>>>>> text search.
>>>>> * UI improvements so that the user experience feels more  
>>>>> intuitive.
>>>>>
>>>>> Discuss!
>>>>>
>>>>> James
>>>>>
>>>>
>>>> --
>>>> Brett Porter
>>>> brett@apache.org
>>>> http://blogs.exist.com/bporter/
>>>>
>>>>
>>>
>>
>
>
> -- 
> Maria Odea Ching
> Software Engineer | Exist Global | 687-4091 | Skype:  
> maria.odea.ching |
> www.exist.com | Innovation Delivered

--
Brett Porter
brett@apache.org
http://blogs.exist.com/bporter/

Re: Search improvements for 1.2

Posted by Maria Odea Ching <oc...@apache.org>.

Looking good, James :) I especially like the new version(s) link.

Thanks for all the work you're putting into this! :)

-Deng

On Sat, Dec 13, 2008 at 2:17 PM, James William Dumay <ja...@atlassian.com> wrote:

> Hey guys,
> I thought the versions could do with a come back:
>
>
> http://skitch.com/jdumay/6cqs/archiva-search-w-no-duplicates-and-full-version-list
>
> What do you think? :)
>
> James
>
>
>
> On 13/12/2008, at 4:46 PM, James William Dumay wrote:
>
>  Hey guys,
>> I've done some more work this weekend on the search results.
>>
>>
>> http://skitch.com/jdumay/6cc2/archiva-search-results-w-no-duplicates-repository-id-or-versions-displayed
>>
>> So far:
>> * There are no duplicate entries.
>> * I've removed the link to the repository the result was from (otherwise
>> we might have two copies of "commons-lang" appearing in the results but from
>> different repositories. IMO, this information can be found on the artifact
>> info page).
>> * Removed the versions available.
>> * Clicking on the artifact link now takes you to browse the versions.
>>
>> Thoughts?
>>
>> James
>>
>>
>> On 05/12/2008, at 5:39 PM, Brett Porter wrote:
>>
>>  I visited James today and we came up with this list of things to fix a
>>> bit more urgently (I'll be putting them in JIRA later).
>>>
>>> Already changed:
>>> * remove dependencies/filecontent from the quick search - must not search
>>> for dependencies on commons-lang
>>> * change the default search to AND instead of OR
>>>
>>> Still to look at:
>>>  - [ ] improve the search results page
>>>      - [ ] remove metadata files
>>>      - [ ] merge versions in search results
>>>      - [ ] for snapshots, just show SNAPSHOT, not timestamps
>>>      - [ ] show hits in the results (this may not be possible or needed
>>> with better results, however)
>>>
>>>  - [ ] existing JIRA complaints
>>>      - [ ] MRM-732 (tokenizing)
>>>      - [ ] MRM-495 (weighting)
>>>      - [ ] MRM-609 (windows bug - may be fixed)
>>>      - [ ] MRM-933 (hit count, pagination completely busted)
>>>
>>>  - [ ] advanced search
>>>      - [ ] improve appearance and flexibility, maybe change to "add term"
>>> buttons on the default search
>>>      - [ ] class/package search is still flaky
>>>          - [ ] might be the analyzer rules, etc. for splitting on '.'
>>>
>>> - [ ] browse improvements
>>>  - [ ] artifact version list should show basic shared project information
>>> rather than having to drill into one version
>>>  - [ ] snapshot should go to a page that shows a list of versions
>>>        (go to latest, but list previous snapshots)
>>>
>>> Thoughts?
>>>
>>> Cheers,
>>> Brett
>>>
>>> On 14/11/2008, at 11:40 AM, James William Dumay wrote:
>>>
>>>  Hey guys,
>>>> As mentioned on IRC we all agreed that our search feature is a little
>>>> suboptimal.
>>>>
>>>> I would like to propose the following improvements:
>>>> * Search should be more like mvnrepository.com (showing codebase growth
>>>> etc).
>>>> * We should figure out a way of using up stream repository indexes to
>>>> improve search results.
>>>> * Advanced search needs a good rethink - we should probably use a filter
>>>> approach so that you could do bytecode: search results that include free
>>>> text search.
>>>> * UI improvements so that the user experience feels more intuitive.
>>>>
>>>> Discuss!
>>>>
>>>> James
>>>>
>>>
>>> --
>>> Brett Porter
>>> brett@apache.org
>>> http://blogs.exist.com/bporter/
>>>
>>>
>>
>


-- 
Maria Odea Ching
Software Engineer | Exist Global | 687-4091 | Skype: maria.odea.ching |
www.exist.com | Innovation Delivered

Re: Search improvements for 1.2

Posted by James William Dumay <ja...@atlassian.com>.

Hey guys,
I thought the versions could do with a comeback:

http://skitch.com/jdumay/6cqs/archiva-search-w-no-duplicates-and-full-version-list

What do you think? :)

James


On 13/12/2008, at 4:46 PM, James William Dumay wrote:

> Hey guys,
> I've done some more work this weekend on the search results.
>
> http://skitch.com/jdumay/6cc2/archiva-search-results-w-no-duplicates-repository-id-or-versions-displayed
>
> So far:
> * There are no duplicate entries.
> * I've removed the link to the repository the result was from  
> (otherwise we might have two copies of "commons-lang" appearing in  
> the results but from different repositories. IMO, this information  
> can be found on the artifact info page).
> * Removed the versions available.
> * Clicking on the artifact link now takes you to browse the versions.
>
> Thoughts?
>
> James
>
>
> On 05/12/2008, at 5:39 PM, Brett Porter wrote:
>
>> I visited James today and we came up with this list of things to  
>> fix a bit more urgently (I'll be putting them in JIRA later).
>>
>> Already changed:
>> * remove dependencies/filecontent from the quick search - must not  
>> search for dependencies on commons-lang
>> * change the default search to AND instead of OR
>>
>> Still to look at:
>>   - [ ] improve the search results page
>>       - [ ] remove metadata files
>>       - [ ] merge versions in search results
>>       - [ ] for snapshots, just show SNAPSHOT, not timestamps
>>       - [ ] show hits in the results (this may not be possible or  
>> needed with better results, however)
>>
>>   - [ ] existing JIRA complaints
>>       - [ ] MRM-732 (tokenizing)
>>       - [ ] MRM-495 (weighting)
>>       - [ ] MRM-609 (windows bug - may be fixed)
>>       - [ ] MRM-933 (hit count, pagination completely busted)
>>
>>   - [ ] advanced search
>>       - [ ] improve appearance and flexibility, maybe change to  
>> "add term" buttons on the default search
>>       - [ ] class/package search is still flaky
>>           - [ ] might be the analyzer rules, etc. for splitting on  
>> '.'
>>
>> - [ ] browse improvements
>>   - [ ] artifact version list should show basic shared project  
>> information rather than having to drill into one version
>>   - [ ] snapshot should go to a page that shows a list of versions
>>         (go to latest, but list previous snapshots)
>>
>> Thoughts?
>>
>> Cheers,
>> Brett
>>
>> On 14/11/2008, at 11:40 AM, James William Dumay wrote:
>>
>>> Hey guys,
>>> As mentioned on IRC we all agreed that our search feature is a  
>>> little suboptimal.
>>>
>>> I would like to propose the following improvements:
>>> * Search should be more like mvnrepository.com (showing codebase  
>>> growth etc).
>>> * We should figure out a way of using up stream repository indexes  
>>> to improve search results.
>>> * Advanced search needs a good rethink - we should probably use a  
>>> filter approach so that you could do bytecode: search results that  
>>> include free text search.
>>> * UI improvements so that the user experience feels more intuitive.
>>>
>>> Discuss!
>>>
>>> James
>>
>> --
>> Brett Porter
>> brett@apache.org
>> http://blogs.exist.com/bporter/
>>
>

Re: Search improvements for 1.2

Posted by James William Dumay <ja...@atlassian.com>.

Hey guys,
I've done some more work this weekend on the search results.

http://skitch.com/jdumay/6cc2/archiva-search-results-w-no-duplicates-repository-id-or-versions-displayed

So far:
* There are no duplicate entries.
* I've removed the link to the repository the result was from  
(otherwise we might have two copies of "commons-lang" appearing in the  
results but from different repositories. IMO, this information can be  
found on the artifact info page).
* Removed the versions available.
* Clicking on the artifact link now takes you to browse the versions.

Thoughts?

James


On 05/12/2008, at 5:39 PM, Brett Porter wrote:

> I visited James today and we came up with this list of things to fix  
> a bit more urgently (I'll be putting them in JIRA later).
>
> Already changed:
> * remove dependencies/filecontent from the quick search - must not  
> search for dependencies on commons-lang
> * change the default search to AND instead of OR
>
> Still to look at:
>    - [ ] improve the search results page
>        - [ ] remove metadata files
>        - [ ] merge versions in search results
>        - [ ] for snapshots, just show SNAPSHOT, not timestamps
>        - [ ] show hits in the results (this may not be possible or  
> needed with better results, however)
>
>    - [ ] existing JIRA complaints
>        - [ ] MRM-732 (tokenizing)
>        - [ ] MRM-495 (weighting)
>        - [ ] MRM-609 (windows bug - may be fixed)
>        - [ ] MRM-933 (hit count, pagination completely busted)
>
>    - [ ] advanced search
>        - [ ] improve appearance and flexibility, maybe change to  
> "add term" buttons on the default search
>        - [ ] class/package search is still flaky
>            - [ ] might be the analyzer rules, etc. for splitting on  
> '.'
>
> - [ ] browse improvements
>    - [ ] artifact version list should show basic shared project  
> information rather than having to drill into one version
>    - [ ] snapshot should go to a page that shows a list of versions
>          (go to latest, but list previous snapshots)
>
> Thoughts?
>
> Cheers,
> Brett
>
> On 14/11/2008, at 11:40 AM, James William Dumay wrote:
>
>> Hey guys,
>> As mentioned on IRC we all agreed that our search feature is a  
>> little suboptimal.
>>
>> I would like to propose the following improvements:
>> * Search should be more like mvnrepository.com (showing codebase  
>> growth etc).
>> * We should figure out a way of using up stream repository indexes  
>> to improve search results.
>> * Advanced search needs a good rethink - we should probably use a  
>> filter approach so that you could do bytecode: search results that  
>> include free text search.
>> * UI improvements so that the user experience feels more intuitive.
>>
>> Discuss!
>>
>> James
>
> --
> Brett Porter
> brett@apache.org
> http://blogs.exist.com/bporter/
>

Re: Search improvements for 1.2

Posted by Brett Porter <br...@apache.org>.

I visited James today and we came up with this list of things to fix a  
bit more urgently (I'll be putting them in JIRA later).

Already changed:
* remove dependencies/filecontent from the quick search - a search for
commons-lang must not match artifacts that merely depend on it
* change the default search to AND instead of OR
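
For anyone wondering what the AND/OR switch means for results, here is a toy sketch over an in-memory set of documents. The artifact names, token sets and the search helper are invented for the example; it only shows the matching semantics, not how the Lucene query parser is configured.

```python
def search(docs, query, default_operator="AND"):
    """Return names of docs matching the query under the given operator.

    docs maps a name to its set of lowercase tokens. With AND every
    query term must be present; with OR a single term is enough.
    """
    if default_operator not in ("AND", "OR"):
        raise ValueError("default_operator must be 'AND' or 'OR'")
    terms = query.lower().split()
    if not terms:
        return []
    combine = all if default_operator == "AND" else any
    return [
        name
        for name, tokens in docs.items()
        if combine(term in tokens for term in terms)
    ]


if __name__ == "__main__":
    docs = {
        "commons-lang": {"commons", "lang"},
        "commons-io": {"commons", "io"},
        "lang-utils": {"lang", "utils"},
    }
    print(search(docs, "commons lang", "AND"))
    print(search(docs, "commons lang", "OR"))
```

With OR, every artifact containing either word comes back, which is why a two-word quick search used to return so much noise.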

Still to look at:
- [ ] improve the search results page
    - [ ] remove metadata files
    - [ ] merge versions in search results
    - [ ] for snapshots, just show SNAPSHOT, not timestamps
    - [ ] show hits in the results (this may not be possible or
          needed with better results, however)

- [ ] existing JIRA complaints
    - [ ] MRM-732 (tokenizing)
    - [ ] MRM-495 (weighting)
    - [ ] MRM-609 (windows bug - may be fixed)
    - [ ] MRM-933 (hit count, pagination completely busted)

- [ ] advanced search
    - [ ] improve appearance and flexibility, maybe change to
          "add term" buttons on the default search
    - [ ] class/package search is still flaky
        - [ ] might be the analyzer rules, etc. for splitting on '.'

- [ ] browse improvements
    - [ ] artifact version list should show basic shared project
          information rather than having to drill into one version
    - [ ] snapshot should go to a page that shows a list of versions
          (go to latest, but list previous snapshots)
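
Two of the items above are easy to sketch. The first helper maps a Maven timestamped snapshot version (base-yyyyMMdd.HHmmss-buildNumber) to its plain SNAPSHOT form for display. The second shows one possible way to tokenize a fully qualified class name on '.', keeping the whole name as an extra token; that is a guess at what the analyzer could do, not a description of the current rules. Both function names are hypothetical.

```python
import re

# Maven timestamped snapshot: <base>-<yyyyMMdd.HHmmss>-<buildNumber>
_TIMESTAMPED = re.compile(r"^(?P<base>.+)-\d{8}\.\d{6}-\d+$")


def display_version(version):
    """Show timestamped snapshot versions as <base>-SNAPSHOT."""
    match = _TIMESTAMPED.match(version)
    return match.group("base") + "-SNAPSHOT" if match else version


def class_name_tokens(fqcn):
    """Split a class name on '.' and also keep the full name as a token."""
    parts = [part for part in fqcn.split(".") if part]
    return parts + [fqcn] if len(parts) > 1 else parts


if __name__ == "__main__":
    print(display_version("1.2-20081205.063915-3"))
    print(class_name_tokens("org.apache.commons.lang.StringUtils"))
```

Keeping the full name as a token means a search for the exact class still matches, while a search for just the simple name or a package segment matches too.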

Thoughts?

Cheers,
Brett

On 14/11/2008, at 11:40 AM, James William Dumay wrote:

> Hey guys,
> As mentioned on IRC we all agreed that our search feature is a  
> little suboptimal.
>
> I would like to propose the following improvements:
> * Search should be more like mvnrepository.com (showing codebase  
> growth etc).
> * We should figure out a way of using up stream repository indexes  
> to improve search results.
> * Advanced search needs a good rethink - we should probably use a  
> filter approach so that you could do bytecode: search results that  
> include free text search.
> * UI improvements so that the user experience feels more intuitive.
>
> Discuss!
>
> James

--
Brett Porter
brett@apache.org
http://blogs.exist.com/bporter/