| ------ |
| Understanding Repository Configuration of Apache Archiva |
| ------ |
| Maria Odea Ching |
| Olivier Lamy |
| ------ |
| 2013-02-07 |
| ------ |
| |
| ~~ Licensed to the Apache Software Foundation (ASF) under one |
| ~~ or more contributor license agreements. See the NOTICE file |
| ~~ distributed with this work for additional information |
| ~~ regarding copyright ownership. The ASF licenses this file |
| ~~ to you under the Apache License, Version 2.0 (the |
| ~~ "License"); you may not use this file except in compliance |
| ~~ with the License. You may obtain a copy of the License at |
| ~~ |
| ~~ http://www.apache.org/licenses/LICENSE-2.0 |
| ~~ |
| ~~ Unless required by applicable law or agreed to in writing, |
| ~~ software distributed under the License is distributed on an |
| ~~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| ~~ KIND, either express or implied. See the License for the |
| ~~ specific language governing permissions and limitations |
| ~~ under the License. |
| |
| ~~ NOTE: For help with the syntax of this file, see: |
| ~~ http://maven.apache.org/guides/mini/guide-apt-format.html |
| |
| Understanding Repository Configuration of Apache Archiva |
| |
| ~~TODO: revise more as suggested by Jeff in the dev list |
| |
| Archiva has two types of repository configuration: managed repository and |
| remote repository. |
| |
| * Managed Repository |
| |
| A managed repository is a repository that resides on the server where Archiva |
| is running. It can serve as a proxy repository, an internal deployment |
| repository, or a local mirror repository. |
| |
| Managed repository fields: |
| |
| * <<identifier>> - the id of the repository. This must be unique. |
| |
| * <<name>> - the name of the repository. |
| |
| * <<directory>> - the location of the repository. If the path specified does not |
| exist, Archiva will create the missing directories. |
| |
| * <<index directory>> - the location of the index files generated by Archiva. If |
| no location is specified, the index directory (named <<<.indexer>>>) is |
| created at the root of the repository directory. This directory contains the |
| packaged (bundled) index, which is consumed by index clients such as M2Eclipse. |
| |
| * <<type>> - the repository layout (Maven 2 or Maven 1). |
| |
| * <<cron>> - the |
| {{{http://quartz-scheduler.org/api/2.1.5/org/quartz/CronTrigger.html}cron schedule}} when |
| repository scanning will be executed. |
| |
| * <<repository purge by days older>> - the first option for repository purge. |
| Archiva checks the age of each artifact and deletes it if it is older than |
| the number of days set in this field, while still respecting the retention |
| count. To disable the purge by number of days old and have Archiva purge by |
| retention count only, set this field to 0. The maximum number of days which |
| can be set here is 1000. See the Repository Purge section below for more |
| details. |
| ~~ above was:the retention count (see #7) of course no idea what is was linkeed to |
| |
| * <<repository purge by retention count>> - the second option for repository |
| purge. When running the repository purge, Archiva will retain only the |
| number of artifacts set for this field for a specific snapshot version. See |
| the Repository Purge section below for more details. |
| |
| * <<releases included>> - specifies whether there are released artifacts in the |
| repository. |
| |
| * <<block re-deployment of released artifacts>> - specifies whether released |
| artifacts that already exist in the repository can be overwritten. |
| Note that this only takes effect for non-snapshot deployments. |
| |
| * <<snapshots included>> - specifies whether there are snapshot artifacts in the |
| repository. |
| |
| * <<scannable>> - specifies whether the repository can be scanned, meaning it is |
| a local repository which can be indexed, browsed, purged, etc. |
| |
| * <<delete released snapshots>> - specifies whether repository purge removes |
| snapshot artifacts that already have a released version in the |
| repository. |
| |
| * << Skip Packed Index creation >> - skips creation of the compressed index used by IDEs. |
| |
| [] |
| |
| [../images/managed-repositories.png] Managed Repositories |
| |
| Each repository has its own HTTP(S)/WebDAV URL, which allows the user to browse |
| and access the repository over HTTP(S)/WebDAV. The URL has the following format: |
| |
| +----+ |
| http://[URL TO ARCHIVA]/repository/[REPOSITORY ID] (e.g. http://localhost:8080/repository/releases). |
| +----+ |
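| |
| As a small illustration of the format above, the following Python sketch builds |
| a repository URL from its two parts. The function name <<<repository_url>>> and |
| its parameter names are hypothetical; only the URL layout comes from this page. |

```python
from urllib.parse import quote


def repository_url(archiva_base_url: str, repository_id: str) -> str:
    """Build a repository URL following the documented layout.

    Both parameter names are illustrative; only the layout
    http://[URL TO ARCHIVA]/repository/[REPOSITORY ID] comes from the text.
    """
    # Drop a trailing slash so the result never contains "//repository".
    base = archiva_base_url.rstrip("/")
    # Percent-encode the id so it stays a single path segment.
    return f"{base}/repository/{quote(repository_id, safe='')}"


print(repository_url("http://localhost:8080", "releases"))
```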
| |
| A POM snippet is also available for each repository. The |
| \<distributionManagement\> section can be copied and pasted into a project's |
| pom.xml to specify that the project will be deployed to that managed repository. |
| The \<repositories\> section, on the other hand, can be copied and pasted into a |
| project's pom.xml or into Maven's settings.xml to tell Maven to get artifacts |
| from the managed repository when building the project. |
| |
| * Remote Repository |
| |
| A remote repository is a repository hosted on another server. Remote |
| repositories are usually the ones being proxied. See Proxy Connectors for how |
| to proxy a repository. |
| |
| Remote repository fields: |
| |
| * <<identifier>> - the id of the remote repository. |
| |
| * <<name>> - the name of the remote repository. |
| |
| * <<url>> - the URL of the remote repository. It is also possible to use a |
| 'file://' URL to proxy a local repository. Be aware that if this local |
| repository is an Archiva managed repository with proxy connectors of its own, |
| those connectors will not be triggered. |
| |
| * <<username>> - the username (if authentication is needed) to be used to access |
| the repository. |
| |
| * <<password>> - the password (if authentication is needed) to be used to access |
| the repository. |
| |
| * <<type>> - the layout (Maven 2 or Maven 1) of the remote repository. |
| |
| * <<Activate download remote index>> - enables downloading the remote index so |
| that artifacts available remotely are included in search results. |
| |
| * <<Remote index url, can be relative to url>> - path of the remote index |
| directory. |
| |
| * <<Cron expression>> - cron expression for downloading the remote index |
| (default: weekly on Sunday). |
| |
| * <<Directory index storage>> - path where the index directory is stored. The |
| default is $\{appserver.base\}/data/remotes/$\{repositoryId\}/.indexer |
| |
| * <<Download Remote Index Timeout in seconds>> - read timeout for downloading |
| remote index files (default: 300). |
| |
| * <<Network Proxy to Use for download Remote Index>> - proxy to use for |
| downloading remote index files. |
| |
| * <<Download Remote Index on Startup>> - downloads the remote index when Archiva starts. |
| |
| * <<Additional url parameters>> - key/value pairs to add to the URL when querying the remote repository. |
| |
| * <<Additional Http Headers>> - key/value pairs to add as HTTP headers when querying the remote repository. |
| |
| [] |
| |
| [../images/remote-repositories.png] Remote Repositories |
| |
| You can also trigger an immediate download of remote index files. |
| |
| ** Maven Index from Remote repositories |
| |
| <<Since 1.4-M4>>: |
| If you have enabled remote index download, the index files (in Maven Indexer format) are available at the path |
| http://[URL TO ARCHIVA]/repository/id/.index, where id is the repository identifier (an IDE can consume these files). |
| |
| * Scanning a Repository |
| |
| A repository scan can run on a schedule, or it can be started explicitly by |
| clicking the 'Scan Repository Now' button in the repositories page. By |
| default, Archiva only processes artifacts that are new with respect to the |
| last run of the repository scanner: if an artifact's last modified date is |
| newer than the last repository scan, the artifact is processed; otherwise it |
| is skipped. You can override this behavior and force Archiva to process all |
| artifacts regardless of their age by ticking the 'Process All Artifacts' |
| checkbox in the repositories page and clicking the 'Scan Repository Now' |
| button. |
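| |
| The selection rule described above can be sketched in Python as follows. This is |
| only an illustration of the rule, not Archiva's scanner code; the function name, |
| the dictionary layout and the sample data are all assumptions. |

```python
from datetime import datetime


def artifacts_to_process(last_modified, last_scan, process_all=False):
    """Return the artifact paths a scan would hand to the consumers.

    last_modified: dict mapping artifact path -> last-modified datetime
    last_scan:     datetime of the previous repository scan
    process_all:   mirrors the 'Process All Artifacts' checkbox
    """
    if process_all:
        # Every artifact is processed, regardless of its age.
        return sorted(last_modified)
    # Default: only artifacts modified after the previous scan.
    return sorted(path for path, modified in last_modified.items()
                  if modified > last_scan)


# Hypothetical data for illustration only.
files = {
    "artifact-x-1.0.jar": datetime(2013, 2, 1, 9, 0),
    "artifact-x-1.1.jar": datetime(2013, 2, 6, 17, 30),
}
previous_scan = datetime(2013, 2, 5, 0, 0)

print(artifacts_to_process(files, previous_scan))
print(artifacts_to_process(files, previous_scan, process_all=True))
```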
| |
| [../images/repositories.png] Repositories |
| |
| Every artifact found by the repository scanner is processed by a set of |
| consumers. Examples of this processing are indexing, repository purge and |
| database update. Details about consumers are available in |
| the {{{./consumers.html} Consumers}} page. |
| |
| * Repository Purge |
| |
| Repository purge is the process of cleaning up the repository of old |
| snapshots. When deploying a snapshot to a repository, Maven deploys the |
| project/artifact with a timestamped version. Daily or nightly builds of the |
| project therefore tend to bloat the repository, and with large artifacts disk |
| space quickly becomes a problem. That's where Archiva's repository purge |
| feature comes in. Given a criterion to use, either number of days old or |
| retention count, it cleans up the repository by removing old snapshots. |
| |
| Please note that the number of days old criterion is activated by default |
| (set to 100 days). To deactivate it and use the retention count criterion |
| instead, you must set the Repository Purge By Days Older field to 0. Also note |
| that when the number of days old criterion is activated, the retention count is |
| still respected (see the Repository Purge By Number of Days Older item below |
| for more details), but not the other way around. |
| |
| Let's take a look at different behaviours for repository purge using the |
| following scenario: |
| |
| +----+ |
| Artifacts in the repository: |
| |
| ../artifact-x/2.0-SNAPSHOT/artifact-x-20061118.060401-2.jar |
| ../artifact-x/2.0-SNAPSHOT/artifact-x-20061118.060401-2.pom |
| ../artifact-x/2.0-SNAPSHOT/artifact-x-20070113.034619-3.jar |
| ../artifact-x/2.0-SNAPSHOT/artifact-x-20070113.034619-3.pom |
| ../artifact-x/2.0-SNAPSHOT/artifact-x-20070203.028902-4.jar |
| ../artifact-x/2.0-SNAPSHOT/artifact-x-20070203.028902-4.pom |
| +----+ |
| |
| [[1]] Repository Purge By Number of Days Older |
| |
| With this criterion, Archiva checks how old an artifact is and deletes it if |
| it is older than the value set in the repository purge by days older field, |
| while still respecting the retention count. |
| |
| If repository purge by days older is set to 100 days (with the repository purge |
| by retention count field set to 1) and the current date is, say, 03-01-2007 |
| (March 1, 2007), then given the scenario above the following artifacts will be |
| retained: artifact-x-20070113.034619-3.jar, artifact-x-20070113.034619-3.pom, |
| artifact-x-20070203.028902-4.jar and artifact-x-20070203.028902-4.pom. The |
| version timestamps show that these 4 artifacts are not more than 100 days old |
| on the current date, so they are all retained, while the 20061118 builds are |
| 103 days old and are deleted. In this case the retention count has no effect, |
| since the age of the artifact takes priority. |
| |
| Now, if repository purge by days older is set to 30 days (with the repository |
| purge by retention count field still set to 1) and the current date is still |
| 03-01-2007, then given the same scenario only the following artifacts will be |
| retained: artifact-x-20070203.028902-4.jar and |
| artifact-x-20070203.028902-4.pom. The retained artifacts are not older than the |
| number of days set in the repository purge by days older field, and the |
| retention count is still met. |
| |
| Now, let's set repository purge by days older to 10 days (with the repository |
| purge by retention count field still set to 1) and keep the current date at |
| 03-01-2007. Given the same repository contents, the following artifacts will |
| still be retained: artifact-x-20070203.028902-4.jar and |
| artifact-x-20070203.028902-4.pom. The version timestamps show that these |
| artifacts are older than the repository purge by days older value of 10 days. |
| Why are they still retained? Recall that repository purge by retention count is |
| set to 1. This ensures that there is always 1 timestamped version of an |
| artifact retained in every unique snapshot version directory. |
| |
| [[2]] Repository Purge By Retention Count |
| |
| If the repository purge by retention count field is set to 2, then only the |
| artifacts artifact-x-20070113.034619-3.jar, artifact-x-20070113.034619-3.pom, |
| artifact-x-20070203.028902-4.jar and artifact-x-20070203.028902-4.pom will be |
| retained in the repository. The oldest snapshots are deleted, keeping only as |
| many snapshots as the retention count specifies (regardless of how old or new |
| the artifacts are). |
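| |
| Both criteria can be replayed on the scenario above with a short Python sketch. |
| It models the rules as described on this page and is not Archiva's actual purge |
| code. The function name <<<purge>>> and its parameters are hypothetical, a strict |
| "older than" comparison is assumed, and only the date part of each timestamp is |
| used to compute the age. |

```python
from datetime import date, datetime


def purge(builds, today, days_older, retention_count):
    """Return (retained, deleted) timestamped builds of one snapshot version.

    builds:          timestamps such as "20070203.028902-4"
    days_older:      0 disables the age check (purge by retention count only)
    retention_count: number of newest builds that are always kept
    """
    # Assumption: only the yyyyMMdd part is parsed; time of day is ignored.
    def built_on(build):
        return datetime.strptime(build[:8], "%Y%m%d").date()

    ordered = sorted(builds, key=built_on)  # oldest first
    retained = list(ordered)
    deleted = []
    for build in ordered:
        if len(retained) <= retention_count:
            break  # the retention count is always respected
        too_old = (today - built_on(build)).days > days_older
        if days_older == 0 or too_old:
            retained.remove(build)
            deleted.append(build)
    return retained, deleted


builds = ["20061118.060401-2", "20070113.034619-3", "20070203.028902-4"]
today = date(2007, 3, 1)  # 03-01-2007 in the examples above

for days, count in [(100, 1), (30, 1), (10, 1), (0, 2)]:
    retained, _ = purge(builds, today, days, count)
    print(f"days older={days}, retention count={count}: retained {retained}")
```

| Each timestamp stands for both the .jar and the .pom of that build, so the four |
| runs correspond to the three days-older examples and the retention count example. |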
| |
| ** Deleting Released Snapshots |
| |
| You can also configure Archiva to clean up snapshot artifacts that have |
| already been released. This is done by ticking the Delete Released Snapshots |
| checkbox in the Repository Configuration form. |
| |
| Once this feature is enabled, whenever Archiva encounters a snapshot artifact |
| during repository scanning, it checks <<all>> configured repositories for a |
| released version of that snapshot. If it finds one, it deletes the entire |
| snapshot version directory. |
| |
| It should be noted that this feature is entirely separate from the repository |
| purge by number of days older and by retention count. |
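| |
| The check described in this section can be sketched in Python as follows. This |
| is a simplified model rather than Archiva's implementation: the function names, |
| the in-memory representation of repositories and the sample contents are all |
| assumptions made for illustration. |

```python
def released_version(snapshot_version):
    """Map e.g. '2.0-SNAPSHOT' to '2.0'; return None for non-snapshots."""
    suffix = "-SNAPSHOT"
    if snapshot_version.endswith(suffix):
        return snapshot_version[:-len(suffix)]
    return None


def should_delete_snapshot_dir(artifact_id, snapshot_version, repositories):
    """True if any configured repository holds the released version.

    repositories: dict mapping repository id -> set of (artifact id, version)
    """
    release = released_version(snapshot_version)
    if release is None:
        return False
    # All repositories are checked, not only the one being scanned.
    return any((artifact_id, release) in contents
               for contents in repositories.values())


# Hypothetical repository contents for illustration only.
repos = {
    "snapshots": {("artifact-x", "2.0-SNAPSHOT"), ("artifact-y", "1.1-SNAPSHOT")},
    "releases": {("artifact-x", "2.0")},
}
print(should_delete_snapshot_dir("artifact-x", "2.0-SNAPSHOT", repos))
print(should_delete_snapshot_dir("artifact-y", "1.1-SNAPSHOT", repos))
```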