Posted to dev@archiva.apache.org by Joakim Erdfelt <jo...@erdfelt.com> on 2007/04/10 22:25:07 UTC
State of the Archiva (April 2007)
:: PRESENT ::
Work is continuing on the archiva-jpox-database branch.
Many improvements already exist in that branch.
The core / base / database changes have settled down; work in the webapp
now continues, taking advantage of the changes made in archiva-base.
:: FUTURE ::
BRANCH to TRUNK merge.
In roughly 2 weeks' time, the branch will come up for a vote to be merged
with trunk. When this gets approval, the new trunk will undergo a complete
review against the existing jiras to determine whether they still apply or
can be closed as fixed.
This is a good time to update the documentation to reflect the current
archiva UI and configuration process.
RELEASES.
Once the critical jiras have been closed, the initial release of Archiva
1.0-alpha-1 should be cut.
After this has occurred, progress will continue on 1.0 following the
outline in http://docs.codehaus.org/display/MAVENUSER/Archiva+Roadmap
When we have a feature complete 1.0 (as per the roadmap) we'll start
the 1.0-M1 release.
When we have 2 solid weeks on a Milestone release without major bug
reports we'll start the vote for 1.0 final.
:: CHANGES IN BRANCH ::
First: let me give you a brief tour of the directories.
archiva/branches/archiva-jpox-database-refactor/
|-- archiva-base/
|   |-- archiva-common/
|   |-- archiva-configuration/
|   |-- archiva-consumers/ (NEW)
|   |   |-- archiva-consumer-api/ (NEW)
|   |   |-- archiva-core-consumers/ (NEW)
|   |   |-- archiva-database-consumers/ (NEW)
|   |   |-- archiva-lucene-consumers/ (NEW)
|   |   `-- archiva-signature-consumers/ (NEW)
|   |-- archiva-converter/
|   |-- archiva-indexer/
|   |-- archiva-model/ (NEW)
|   |-- archiva-proxy/
|   |-- archiva-repository-layer/
|   |-- archiva-scheduled/ (NEW)
|   `-- archiva-xml-tools/ (NEW)
|-- archiva-cli/
|-- archiva-database/ (NEW)
|-- archiva-reporting/
|   `-- archiva-report-manager/ (Was archiva-reports-standard)
|-- archiva-site/
|-- archiva-web/
|   |-- archiva-applet/
|   |-- archiva-security/
|   |-- archiva-standalone/
|   |   |-- archiva-plexus-application/
|   |   `-- archiva-plexus-runtime/
|   |-- archiva-webapp/
|   `-- archiva-webapp-test/
|-- design/
|   |-- logos/
|   `-- white-site/
`-- maven-meeper/
MAJOR CHANGES:
modules refactored out of existence:
* archiva-discoverer
The classes here have been simplified and merged with
archiva-repository-layer.
* archiva-core
The classes in here have been moved to archiva-repository-layer,
archiva-common, archiva-model, and archiva-consumer-api
The archiva-repository-layer module is now the nexus for all things
that work against the repository.
The role of the archiva.xml configuration file has been changed from being
the canonical source for 'configured' repositories, to being a bootstrap
for configured repositories stored and maintained in the database, and
the list of active consumers to use in the various stages of content
consumption. (more on that later)
The use of maven-artifact and maven-project has been removed as the
assumptions present in each (everything is for the purposes of a build)
are inappropriate for archiva and jpox. The new inbuilt replacements
are more resilient to missing referenced data.
Terminology:
I had to establish a new set of terminology to describe bits in
the database.
Name      | Group ID | Artifact ID | Version | Classifier | Type |
----------+----------+-------------+---------+------------+------+
Project   | yes      | yes         |         |            |      |
Versioned | yes      | yes         | yes     |            |      |
Artifact  | yes      | yes         | yes     | yes        | yes  |
These terms (Project, Versioned, Artifact) describe the hierarchy that
is present in the repository.
1 Project can contain multiple versions, and each version can contain
multiple artifacts.
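The hierarchy above can be sketched as three nested reference types. This is only an illustration: the class names (ProjectRef, VersionedRef, ArtifactRef) are hypothetical and are not the classes in archiva-model, and treating an absent classifier as an empty string is an assumption.

```java
import java.util.Objects;

// Hypothetical illustration of the three terminology levels.
// These names are NOT taken from the archiva-model module.
final class ProjectRef {
    final String groupId;
    final String artifactId;

    ProjectRef(String groupId, String artifactId) {
        this.groupId = Objects.requireNonNull(groupId);
        this.artifactId = Objects.requireNonNull(artifactId);
    }

    // Project + version = Versioned
    VersionedRef version(String version) {
        return new VersionedRef(this, version);
    }

    @Override
    public String toString() {
        return groupId + ":" + artifactId;
    }
}

final class VersionedRef {
    final ProjectRef project;
    final String version;

    VersionedRef(ProjectRef project, String version) {
        this.project = Objects.requireNonNull(project);
        this.version = Objects.requireNonNull(version);
    }

    // Versioned + classifier + type = Artifact
    ArtifactRef artifact(String classifier, String type) {
        return new ArtifactRef(this, classifier, type);
    }

    @Override
    public String toString() {
        return project + ":" + version;
    }
}

final class ArtifactRef {
    final VersionedRef versioned;
    final String classifier; // assumption: empty string when there is none
    final String type;

    ArtifactRef(VersionedRef versioned, String classifier, String type) {
        this.versioned = Objects.requireNonNull(versioned);
        this.classifier = classifier == null ? "" : classifier;
        this.type = Objects.requireNonNull(type);
    }

    @Override
    public String toString() {
        return versioned + ":" + classifier + ":" + type;
    }
}
```

Each level only adds fields to the level above it, which mirrors the "yes" columns in the table.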
CONTENT SCANNING:
The scanning of content from the repository occurs in 2 major stages.
Major Stage 1: Scan of repository filesystem.
Artifacts Stage:
a) Find the new artifacts and put them into the database as
unprocessed.
b) Find the maven-metadata.xml files and put them into the database.
c) Validate checksums (and report issues).
d) Create missing checksums.
Content Stage:
a) Index content (lucene)
Bad Content Stage:
a) Auto remove known bad content.
b) Auto rename known common filename issues.
c) Flag remaining unknown content as bad (in report).
Major Stage 2: Scan of artifacts from database.
Unprocessed Artifacts Stage:
a) Find pom artifacts and load project model into database.
b) Index artifact details (lucene)
c) Validate repository metadata.
d) Index archiva table of contents (lucene)
e) Update bytecode information in artifact-java-details.
f) Index public methods (lucene)
Processed Artifacts Stage:
a) Artifact not present in filesystem, remove artifact from db.
b) Artifact of type 'pom' not present in filesystem, remove project
model from db
c) Artifact not present in filesystem, remove from lucene index.
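As one concrete example from the list above, steps c) and d) of the Artifacts Stage (validate checksums, create missing checksums) might look roughly like the following. This is a sketch under stated assumptions, not the code in the branch: the ChecksumCheck class and its Result enum are hypothetical, only SHA-1 is handled, and the checksum is assumed to live in a ".sha1" sidecar file next to the artifact.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; not part of the Archiva code base.
final class ChecksumCheck {
    enum Result { VALID, MISMATCH, CREATED }

    // Hex-encoded SHA-1 of the given bytes.
    static String sha1Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(data)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Validates the ".sha1" sidecar file, or creates it when it is missing.
    static Result verifyOrCreate(Path artifact)
            throws IOException, NoSuchAlgorithmException {
        String actual = sha1Hex(Files.readAllBytes(artifact));
        Path sidecar = artifact.resolveSibling(artifact.getFileName() + ".sha1");
        if (!Files.exists(sidecar)) {
            Files.write(sidecar, actual.getBytes(StandardCharsets.US_ASCII));
            return Result.CREATED;
        }
        String content = new String(
                Files.readAllBytes(sidecar), StandardCharsets.US_ASCII).trim();
        // Assumption: a sidecar may hold "hash  filename"; compare the first token only.
        String recorded = content.split("\\s+")[0];
        return recorded.equalsIgnoreCase(actual) ? Result.VALID : Result.MISMATCH;
    }
}
```

A MISMATCH result is what would feed the "report issues" part of step c).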
The benefit of these stages is that content found on the filesystem can
be made available to users via the browse interface relatively quickly.
(It takes about 6 minutes to scan all of ibiblio this way.)
If a user happens to request a browse of a versioned project that has
yet to undergo Major Stage 2, a 'Just in Time' scan of that specific
project is done.
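The interplay between the two major stages and the 'Just in Time' scan can be sketched as a small state tracker. Everything here is hypothetical (the TwoStageScan class, its method names, and the use of "groupId:artifactId:version" strings as keys); the real implementation keeps this state in the database.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Archiva code base.
final class TwoStageScan {
    // Recorded by Major Stage 1 but not yet handled by Major Stage 2.
    private final Set<String> unprocessed = new LinkedHashSet<>();
    private final Set<String> processed = new LinkedHashSet<>();

    // Major Stage 1: the filesystem scan records new entries as unprocessed.
    void recordFromFilesystem(String versionedKey) {
        if (!processed.contains(versionedKey)) {
            unprocessed.add(versionedKey);
        }
    }

    // Major Stage 2: the database scan handles everything still unprocessed.
    void runDatabaseScan() {
        for (String key : new ArrayList<>(unprocessed)) {
            process(key);
        }
    }

    // Browse request: runs Stage 2 for this one entry if it has not happened yet.
    // Returns true when a just-in-time scan was needed.
    boolean browse(String versionedKey) {
        boolean justInTime = unprocessed.contains(versionedKey);
        if (justInTime) {
            process(versionedKey);
        }
        return justInTime;
    }

    boolean isProcessed(String versionedKey) {
        return processed.contains(versionedKey);
    }

    int pendingCount() {
        return unprocessed.size();
    }

    private void process(String key) {
        // Placeholder for the Unprocessed Artifacts Stage consumers.
        unprocessed.remove(key);
        processed.add(key);
    }
}
```

The point of the split is that Stage 1 stays cheap, while the expensive work is deferred to Stage 2 or done on demand for the single project a user asks for.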
The repository scan has been changed to include all content "**/*" and
specifically exclude known ignorable content. For each discovered file,
a determination is made as to whether it falls into the Artifact list,
the Content list, or neither of those two lists.
The archiva.xml contains the lists of patterns for ...
a) Artifacts
b) Indexable Content
c) Auto-Remove
d) Ignored
For the latest lists, in code, see: http://tinyurl.com/2hbzoc
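A minimal sketch of classifying a discovered file against the four pattern lists, using the JDK's glob matcher. The FileClassifier class, the precedence order (ignored first, then artifacts, indexable content, auto-remove), and the example patterns in the usage note are all assumptions for illustration; the real patterns live in archiva.xml and in the code linked above.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical classifier; not the Archiva implementation.
final class FileClassifier {
    // Declaration order doubles as precedence order (an assumption of this sketch).
    enum Kind { IGNORED, ARTIFACT, INDEXABLE_CONTENT, AUTO_REMOVE, UNKNOWN }

    private final Map<Kind, List<PathMatcher>> matchers = new EnumMap<>(Kind.class);

    // Registers glob patterns (for example "**/*.jar") for one list.
    void addPatterns(Kind kind, String... globs) {
        List<PathMatcher> list = matchers.computeIfAbsent(kind, k -> new ArrayList<>());
        for (String glob : globs) {
            list.add(FileSystems.getDefault().getPathMatcher("glob:" + glob));
        }
    }

    // Returns the first list whose patterns match, or UNKNOWN if none do.
    Kind classify(String relativePath) {
        Path path = Paths.get(relativePath);
        for (Map.Entry<Kind, List<PathMatcher>> entry : matchers.entrySet()) {
            for (PathMatcher matcher : entry.getValue()) {
                if (matcher.matches(path)) {
                    return entry.getKey();
                }
            }
        }
        return Kind.UNKNOWN;
    }
}
```

With made-up patterns such as "**/*.jar" for artifacts and "**/.svn/**" for ignored content, a path like org/example/demo/1.0/demo-1.0.jar lands in the Artifact list, and anything matching no list comes back as UNKNOWN, which corresponds to step c) of the Bad Content Stage.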
CONSUMER API:
This is a fundamental part of how archiva knows what to do with the
content it is tracking.
We have 2 major consumer API interfaces.
RepositoryContentConsumer - http://tinyurl.com/28roxn
This consumer interface is used for those consumers that want to operate
on the raw files in the repository filesystem.
ArchivaArtifactConsumer - http://tinyurl.com/2s2blk
This consumer interface is used for those consumers that want to operate
on artifacts. Consumers operating in Major Stage 2 (the scan of artifacts
from the database, outlined above) should use this interface.
This allows for very simple content scanning and manipulation in
archiva.
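To illustrate the split between the two consumer roles, here is a deliberately simplified sketch. The interfaces below (RawFileConsumer, DatabaseArtifactConsumer) and the ConsumerRunner class are hypothetical stand-ins; they do not reproduce the method signatures of the real RepositoryContentConsumer and ArchivaArtifactConsumer interfaces linked above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a consumer of raw repository files (Major Stage 1).
interface RawFileConsumer {
    void processFile(String relativePath);
}

// Hypothetical stand-in for a consumer of artifacts from the database (Major Stage 2).
interface DatabaseArtifactConsumer {
    void processArtifact(String artifactKey);
}

// Hypothetical dispatcher that hands each item to every registered consumer.
final class ConsumerRunner {
    private final List<RawFileConsumer> fileConsumers = new ArrayList<>();
    private final List<DatabaseArtifactConsumer> artifactConsumers = new ArrayList<>();

    void addFileConsumer(RawFileConsumer consumer) {
        fileConsumers.add(consumer);
    }

    void addArtifactConsumer(DatabaseArtifactConsumer consumer) {
        artifactConsumers.add(consumer);
    }

    // Filesystem scan: every discovered path goes to every file consumer.
    void scanFilesystem(List<String> relativePaths) {
        for (String path : relativePaths) {
            for (RawFileConsumer consumer : fileConsumers) {
                consumer.processFile(path);
            }
        }
    }

    // Database scan: every artifact goes to every artifact consumer.
    void scanDatabase(List<String> artifactKeys) {
        for (String key : artifactKeys) {
            for (DatabaseArtifactConsumer consumer : artifactConsumers) {
                consumer.processArtifact(key);
            }
        }
    }
}
```

Keeping the scan loop separate from the consumers is what lets the list of active consumers be configured (in archiva.xml) without touching the scanning code.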
- Joakim Erdfelt
Re: State of the Archiva (April 2007)
Posted by Brett Porter <br...@apache.org>.
Mostly sounds good.
Firstly, all this stuff needs to become some sort of code
documentation. I'm regularly hearing feedback that it's hard to find
a way in to this stuff.
Comments inline...
On 11/04/2007, at 6:25 AM, Joakim Erdfelt wrote:
> The role of archiva.xml configuration file has been changed from being
> the canonical source for 'configured' repositories, to being a bootstrap
> for configured repositories stored and maintained in the database, and
> the list of active consumers to use in the various stages of content
> consumption. (more on that later)
I'd like to hear more on that, and the reason why. It sounds very
confusing.
>
> The use of maven-artifact and maven-project has been removed as the
> assumptions present in each (everything is for the purposes of a build)
> are inappropriate for archiva and jpox. The new inbuilt replacements
> are more resilient to missing referenced data.
That seems more of a flaw in those libraries than anything.
maven-artifact should not be build specific.
Maybe we need to look at using the reasons as impetus for change in
maven 2.1? Duplicating that code seems like a long term maintenance
risk.
Was it also removed from the proxying? I haven't had a chance to look
yet, but that sounds like a lot of duplicated functionality.
>
> The benefit of these stages is that it allows the content to be found on
> the filesystem and be made available to the users via the browse interface
> relatively quickly. (Takes about 6 minutes to scan all of ibiblio this way)
From scratch? Presumably when it's unchanged it's much faster?
> c) Auto-Remove
what are these?
Thanks for putting this together Joakim. Looking forward to kicking
the tires. Also, I'll have a closer look at the code - I'll hold off
on that until it's close to being ready for trunk though.
Cheers,
Brett