Posted to dev@archiva.apache.org by Brett Porter <br...@apache.org> on 2008/12/05 07:39:25 UTC

Re: Search improvements for 1.2

I visited James today and we came up with this list of things to fix a  
bit more urgently (I'll be putting them in JIRA later).

Already changed:
* remove dependencies/filecontent from the quick search - a search for
commons-lang must not return artifacts that merely depend on it
* change the default search to AND instead of OR
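
To illustrate the AND-versus-OR change, here is a minimal sketch in Python. It is not Archiva's code: the `matches` helper, the whitespace tokenising and the two sample documents are all assumptions made up for the example.

```python
def matches(document_text, query, default_operator="AND"):
    """Return True if the document satisfies the query terms.

    Hypothetical helper: terms are split on whitespace and compared
    case-insensitively against the words in the document.
    """
    words = set(document_text.lower().split())
    terms = query.lower().split()
    if not terms:
        return False
    hits = [term in words for term in terms]
    # AND requires every term to be present, OR only needs one of them.
    return all(hits) if default_operator == "AND" else any(hits)


# Made-up index contents, keyed by artifact name.
docs = {
    "commons-lang": "commons lang utilities",
    "commons-io": "commons io utilities",
}

and_results = [name for name, text in docs.items()
               if matches(text, "commons lang", "AND")]
or_results = [name for name, text in docs.items()
              if matches(text, "commons lang", "OR")]
```

With OR as the default, the query "commons lang" also picks up commons-io because it shares the word "commons"; with AND only commons-lang is left.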

Still to look at:
- [ ] improve the search results page
    - [ ] remove metadata files
    - [ ] merge versions in search results
    - [ ] for snapshots, just show SNAPSHOT, not timestamps
    - [ ] show hits in the results (this may not be possible or
          needed with better results, however)

- [ ] existing JIRA complaints
    - [ ] MRM-732 (tokenizing)
    - [ ] MRM-495 (weighting)
    - [ ] MRM-609 (windows bug - may be fixed)
    - [ ] MRM-933 (hit count, pagination completely busted)

- [ ] advanced search
    - [ ] improve appearance and flexibility, maybe change to
          "add term" buttons on the default search
    - [ ] class/package search is still flaky
        - [ ] might be the analyzer rules, etc. for splitting on '.'

- [ ] browse improvements
    - [ ] artifact version list should show basic shared project
          information rather than having to drill into one version
    - [ ] snapshot should go to a page that shows a list of versions
          (go to latest, but list previous snapshots)
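
On the "splitting on '.'" point for class/package search, one possible approach is to index both the full class name and each dot-separated segment. The sketch below is in Python and is only an assumption about how this could work; `tokenize_class_name` is a hypothetical helper, not one of the existing analyzer rules.

```python
def tokenize_class_name(fqcn):
    """Split a fully qualified class name into lowercase search tokens.

    Hypothetical helper: keeps the whole name as one token and adds
    every non-empty segment between the dots.
    """
    segments = [segment for segment in fqcn.split(".") if segment]
    tokens = [fqcn.lower()]
    tokens.extend(segment.lower() for segment in segments)
    return tokens


tokens = tokenize_class_name("org.apache.commons.lang.StringUtils")
```

Indexed this way, a search for "stringutils", "lang" or the full name would each find the class, at the cost of a somewhat larger index.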

Thoughts?

Cheers,
Brett

On 14/11/2008, at 11:40 AM, James William Dumay wrote:

> Hey guys,
> As mentioned on IRC we all agreed that our search feature is a  
> little suboptimal.
>
> I would like to propose the following improvements:
> * Search should be more like mvnrepository.com (showing codebase  
> growth etc).
> * We should figure out a way of using upstream repository indexes
> to improve search results.
> * Advanced search needs a good rethink - we should probably use a  
> filter approach so that you could do bytecode: search results that  
> include free text search.
> * UI improvements so that the user experience feels more intuitive.
>
> Discuss!
>
> James

--
Brett Porter
brett@apache.org
http://blogs.exist.com/bporter/


Re: Search improvements for 1.2

Posted by James William Dumay <ja...@atlassian.com>.
On Mon, 2008-12-08 at 19:54 -0500, Wendy Smoak wrote:
> I notice the label says "Version(s)" but each one is on a separate
> line.  Should that just be "Version" or is the intent to group all
> versions for the same groupId/artifactId as one search result?  -Wendy

Yeah, the intent is to collapse it all into a single result - I'll be
making that part of the current set of changes.
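
A rough sketch of that collapsing, in Python rather than the actual Archiva code. The `merge_hits` and `display_version` names, the tuple layout and the sample hits are assumptions; the regular expression assumes Maven's timestamped snapshot form (yyyyMMdd.HHmmss-buildNumber) and folds it into plain SNAPSHOT, as in the earlier list.

```python
import re

# Assumed Maven timestamped-snapshot suffix, e.g. "-20081205.073925-1".
TIMESTAMP_SUFFIX = re.compile(r"-\d{8}\.\d{6}-\d+$")


def display_version(version):
    """Show timestamped snapshots as plain -SNAPSHOT."""
    return TIMESTAMP_SUFFIX.sub("-SNAPSHOT", version)


def merge_hits(hits):
    """Group (groupId, artifactId, version) hits into one result each."""
    merged = {}
    for group_id, artifact_id, version in hits:
        versions = merged.setdefault((group_id, artifact_id), [])
        shown = display_version(version)
        if shown not in versions:
            versions.append(shown)
    return merged


# Made-up hits for illustration only.
hits = [
    ("commons-lang", "commons-lang", "2.3"),
    ("commons-lang", "commons-lang", "2.4"),
    ("org.example", "demo", "1.0-20081205.073925-1"),
    ("org.example", "demo", "1.0-20081206.101500-2"),
]
results = merge_hits(hits)
```

Each groupId/artifactId pair ends up as one entry with its list of versions, and the two timestamped builds show up once as 1.0-SNAPSHOT.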

James


Re: Search improvements for 1.2

Posted by Wendy Smoak <ws...@gmail.com>.
I notice the label says "Version(s)" but each one is on a separate
line.  Should that just be "Version" or is the intent to group all
versions for the same groupId/artifactId as one search result?  -Wendy

On Sun, Dec 7, 2008 at 8:06 PM, James William Dumay <ja...@atlassian.com> wrote:
> Also, you can see the fruits of our Friday's labour here:
>
> Without search improvements:
> http://archiva.exist.com/quickSearch.action?q=commons+lang
>
> With search improvements:
> https://maven.atlassian.com/quickSearch.action?q=commons+lang
>
> Cheers
> James

Re: Search improvements for 1.2

Posted by James William Dumay <ja...@atlassian.com>.
Also, you can see the fruits of our Friday's labour here:

Without search improvements:
http://archiva.exist.com/quickSearch.action?q=commons+lang

With search improvements:
https://maven.atlassian.com/quickSearch.action?q=commons+lang

Cheers
James




Re: Search improvements for 1.2

Posted by Maria Odea Ching <oc...@apache.org>.
Hi All,

I've picked up what James has started in the archiva-nexus-indexer
branch [1].

So far, we're now able to generate an index in the Nexus format. A
search implementation which uses the Nexus indexer for searching is
still in the works. Aside from this, there are still a number of
search-related issues/tasks that need to be done, as James mentioned in
his previous mail (see below), before we're good to go for a 1.2-M2
release. I'll try to stage a running instance by next week so you can
see the changes being done in the branch :)

Thanks,
Deng

[1] http://svn.apache.org/repos/asf/archiva/branches/archiva-nexus-indexer/

On Tue, Dec 16, 2008 at 10:23 AM, James William Dumay
<ja...@atlassian.com> wrote:

>
> On 15/12/2008, at 4:54 PM, James William Dumay wrote:
>
>> Once it is in trunk (as Deng suggested) I would like to remove the need
>> for separate indices to make sure we truly show no duplicate hits.
>>
>
> Changes have been committed to trunk and we still have quite a bit to do.
>
> See MRM-1037 for the tasks necessary to complete.
>
> M2 could be released once the subtask MRM-1045 is completed.
>
> Cheers
> James
>



-- 
Maria Odea Ching
Software Engineer | Exist Global | 687-4091 | Skype: maria.odea.ching |
www.exist.com | Innovation Delivered

Re: Search improvements for 1.2

Posted by James William Dumay <ja...@atlassian.com>.
On 15/12/2008, at 4:54 PM, James William Dumay wrote:

> Once it is in trunk (as Deng suggested) I would like to remove the need
> for separate indices to make sure we truly show no duplicate hits.

Changes have been committed to trunk and we still have quite a bit to do.

See MRM-1037 for the tasks necessary to complete.

M2 could be released once the subtask MRM-1045 is completed.

Cheers
James

Re: Search improvements for 1.2

Posted by James William Dumay <ja...@atlassian.com>.
On Mon, 2008-12-15 at 16:19 +1100, Brett Porter wrote:
> Looks good to me too. Is it trunk ready already?

Almost - I need to clean up a few unit tests before it goes into trunk.
I'll probably merge this in tonight.

Once it is in trunk (as Deng suggested) I would like to remove the need
for separate indices to make sure we truly show no duplicate hits. 

Cheers
James


Re: Search improvements for 1.2

Posted by Brett Porter <br...@apache.org>.
Looks good to me too. Is it trunk ready already?

On 15/12/2008, at 9:08 AM, Maria Odea Ching wrote:

> Looking good James :) I especially like the new version(s) link.
>
> Thanks for all the work you're putting into this! :)
>
> -Deng
>
> On Sat, Dec 13, 2008 at 2:17 PM, James William Dumay <james@atlassian.com> wrote:
>
>> Hey guys,
>> I thought the versions could do with a comeback:
>>
>>
>> http://skitch.com/jdumay/6cqs/archiva-search-w-no-duplicates-and-full-version-list
>>
>> What do you think? :)
>>
>> James
>>
>>
>>
>> On 13/12/2008, at 4:46 PM, James William Dumay wrote:
>>
>>> Hey guys,
>>> I've done some more work this weekend on the search results.
>>>
>>>
>>> http://skitch.com/jdumay/6cc2/archiva-search-results-w-no-duplicates-repository-id-or-versions-displayed
>>>
>>> So far:
>>> * There are no duplicate entries.
>>> * I've removed the link to the repository the result was from (otherwise
>>> we might have two copies of "commons-lang" appearing in the results but
>>> from different repositories. IMO, this information can be found on the
>>> artifact info page).
>>> * Removed the versions available.
>>> * Clicking on the artifact link now takes you to browse the versions.
>>>
>>> Thoughts?
>>>
>>> James
>>>
>>
>
>
> -- 
> Maria Odea Ching
> Software Engineer | Exist Global | 687-4091 | Skype:  
> maria.odea.ching |
> www.exist.com | Innovation Delivered

--
Brett Porter
brett@apache.org
http://blogs.exist.com/bporter/


Re: Search improvements for 1.2

Posted by Maria Odea Ching <oc...@apache.org>.
Looking good James :) I especially like the new version(s) link.

Thanks for all the work you're putting into this! :)

-Deng

On Sat, Dec 13, 2008 at 2:17 PM, James William Dumay <ja...@atlassian.com> wrote:

> Hey guys,
> I thought the versions could do with a comeback:
>
>
> http://skitch.com/jdumay/6cqs/archiva-search-w-no-duplicates-and-full-version-list
>
> What do you think? :)
>
> James
>
>
>
> On 13/12/2008, at 4:46 PM, James William Dumay wrote:
>
>> Hey guys,
>> I've done some more work this weekend on the search results.
>>
>>
>> http://skitch.com/jdumay/6cc2/archiva-search-results-w-no-duplicates-repository-id-or-versions-displayed
>>
>> So far:
>> * There are no duplicate entries.
>> * I've removed the link to the repository the result was from (otherwise
>> we might have two copies of "commons-lang" appearing in the results but from
>> different repositories. IMO, this information can be found on the artifact
>> info page).
>> * Removed the versions available.
>> * Clicking on the artifact link now takes you to browse the versions.
>>
>> Thoughts?
>>
>> James
>>
>


-- 
Maria Odea Ching
Software Engineer | Exist Global | 687-4091 | Skype: maria.odea.ching |
www.exist.com | Innovation Delivered

Re: Search improvements for 1.2

Posted by James William Dumay <ja...@atlassian.com>.
Hey guys,
I thought the versions could do with a comeback:

http://skitch.com/jdumay/6cqs/archiva-search-w-no-duplicates-and-full-version-list

What do you think? :)

James


On 13/12/2008, at 4:46 PM, James William Dumay wrote:

> Hey guys,
> I've done some more work this weekend on the search results.
>
> http://skitch.com/jdumay/6cc2/archiva-search-results-w-no-duplicates-repository-id-or-versions-displayed
>
> So far:
> * There are no duplicate entries.
> * I've removed the link to the repository the result was from  
> (otherwise we might have two copies of "commons-lang" appearing in  
> the results but from different repositories. IMO, this information  
> can be found on the artifact info page).
> * Removed the versions available.
> * Clicking on the artifact link now takes you to browse the versions.
>
> Thoughts?
>
> James


Re: Search improvements for 1.2

Posted by James William Dumay <ja...@atlassian.com>.
Hey guys,
I've done some more work this weekend on the search results.

http://skitch.com/jdumay/6cc2/archiva-search-results-w-no-duplicates-repository-id-or-versions-displayed

So far:
* There are no duplicate entries.
* I've removed the link to the repository the result was from  
(otherwise we might have two copies of "commons-lang" appearing in the  
results but from different repositories. IMO, this information can be  
found on the artifact info page).
* Removed the versions available.
* Clicking on the artifact link now takes you to browse the versions.
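
For anyone curious what "no duplicate entries" amounts to, here is a minimal Python sketch, assuming each raw hit carries a repository id alongside the groupId and artifactId. The function name and the sample data are hypothetical and this is not the code that was committed.

```python
def dedupe_across_repositories(hits):
    """Keep one (groupId, artifactId) per artifact, whatever the repository."""
    seen = set()
    unique = []
    for repository_id, group_id, artifact_id in hits:
        key = (group_id, artifact_id)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Made-up hits: the same artifact found in two repositories.
raw_hits = [
    ("internal", "commons-lang", "commons-lang"),
    ("central", "commons-lang", "commons-lang"),
    ("central", "commons-io", "commons-io"),
]
unique_hits = dedupe_across_repositories(raw_hits)
```

The repository id is dropped from the result on purpose, which matches removing the repository link from the results page.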

Thoughts?

James

