println it

Software blog about tools, builds and making it all work

Artifactory Power Pack – is it really powerful ?

We have been using Artifactory in Thomson Reuters (ClearForest) for more than a year now.

My first Maven repository manager was Nexus – we were using it in my previous workplace. When I came to Thomson Reuters and started working on new CM infrastructures – I decided to switch to Artifactory, though. Mostly due to its richness of features and for supporting an efficient checksum-based storage model for binaries, which was the biggest difference for me between the two products.

We started with version 1.3 and then went through all major upgrades: 2.0 and 2.1 (the latter was a big update: new searches, artifacts metadata, add-ons, and move/copy operations on artifacts). We’re now running version 2.2.1 with Add-ons Power Pack.

To tell you the truth – we really love it: it has worked perfectly through the whole period (apart from a few cases, and even then support and solutions were provided on the same day!).

Over time, I’ve also learned to appreciate how Artifactory is moving away from being a Maven-only repository manager and becoming a general-purpose storage manager for any kind of binaries and build system. Ivy and Gradle support (with a new collaborative relationship just announced) is already available and it’s a really good start. Integration with Hudson and TeamCity (the latter will be available soon) also comes in very handy. In fact, one can store any kind of binaries in Artifactory, not related to either Maven or Java in any way!

In short, Artifactory knows how (and is willing) to cooperate with all the major players in today’s arena, and that’s a really impressive achievement for something that started as a free-time Maven repo manager project. Well done, guys!

But my particular interest in recent months has been its Power Pack offering.
We run Hudson pretty intensively as our CI server, so when the JFrog-ers announced they had special “Hudson support” – they certainly had my attention!

Now that we have it installed, I’d like to ask: does it matter? I mean, is it really that powerful? The short answer is yes, it does and yes, it is. There’s no doubt about it – see below.

To start with – Power Pack offers different things and, for all our appreciation of Artifactory, we’re not using all of them. There’s simply no need for us to right now.
But what we do use is really saving us time (and, therefore, money) on a daily basis:

  • Hudson integration
  • Properties
  • Smart Searches
  • Watches

Now, let’s take each of them apart.

 

Hudson integration

This one is definitely the best and draws the most attention, for obvious reasons. The idea is both simple and ingenious – let’s ask the CI server (Hudson/TeamCity/Bamboo) to push all build environment data to Artifactory! After all, when a build job is running – it has all the environment information one can think of: OS type and version, JVM version, modules built, their dependencies and versions … Until today all this information was buried somewhere in Hudson logs and, eventually, deleted (I mean, we do need to clean up our build logs sometimes, don’t we?).

Not any more – Hudson integration establishes a bi-directional link between Artifactory and Hudson for each job run. Finally, those two start talking to each other!

How it works:

  • Hudson Artifactory plugin is installed
  • Hudson is configured and Artifactory server is added
  • Hudson job is configured to run "mvn clean install" rather than "mvn clean deploy"
       That’s right, we’re not using maven-deploy-plugin any more
  • Hudson job is configured to deploy to the Artifactory server (specified previously)
       and to one of the repos available – a nice drop-down list lets you choose it:

        Hudson Job Config

When (and if!) the job finishes successfully – all artifacts archived during the build will be deployed by Hudson to Artifactory in one go (<archivingDisabled> must be “false” in the job’s “config.xml”, but that’s the default value):

Hudson Deploy 

It’s not truly atomic (if the process fails in the middle for some reason – my guess is nothing would be un-deployed) but it’s still much better than what Maven does by default: deploying each artifact the moment it is ready (so if the build process fails in the middle – some newer artifacts will have been deployed already while the rest stay at the previous version).
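To make the difference concrete, here is a minimal sketch of the two deployment strategies. It is a toy model only: repositories are plain dictionaries, and the module names and versions are invented for illustration, not taken from our builds or from the plugin's code.

```python
def deploy_as_you_go(modules, repo):
    """Maven-style: each module is deployed as soon as it has been built."""
    for name, version, builds_ok in modules:
        if not builds_ok:
            return False          # build stops, earlier deployments stay
        repo[name] = version
    return True


def deploy_at_end(modules, repo):
    """Hudson-style: nothing is deployed unless the whole build succeeds."""
    staged = {}
    for name, version, builds_ok in modules:
        if not builds_ok:
            return False          # the repository was never touched
        staged[name] = version
    repo.update(staged)
    return True


# Hypothetical three-module build in which the second module fails.
build = [("core", "2.0", True), ("web", "2.0", False), ("dist", "2.0", True)]

maven_repo = {"core": "1.0", "web": "1.0", "dist": "1.0"}
hudson_repo = dict(maven_repo)

deploy_as_you_go(build, maven_repo)   # leaves "core" at 2.0, the rest at 1.0
deploy_at_end(build, hudson_repo)     # leaves everything at 1.0
```

In the first case the repository ends up with a mix of versions, which is exactly the situation described above.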

As you see, in this sense – Hudson’s way of deploying to Artifactory is much better as it only starts when the build has finished successfully. On top of that, for some weird reason Maven’s traditional deploy has let me down recently with:

Error installing metadata: Error updating group repository metadata
The requested operation cannot be performed on a file with a user-mapped section open

Seems to be some corruption issue that I couldn’t solve.
But "Ok" – I thought to myself – "One more reason not to use ‘mvn deploy’"

Now, does it scale?

After all, Hudson needs to deploy all created artifacts – what if there are too many of them?
In our case, it scales pretty well and there’s no problem whatsoever – our biggest job is publishing 170+ artifacts this way and it works just fine.

Ok, so what else does it actually do?

A lot. First of all, you now have a link to Artifactory in Hudson’s job: 

Hudson Artifactory link

Once we follow it to Artifactory – we get to a page where all build environmental data is stored: 

 General Build Info

So we have a "Properties" section here with JVM and OS versions recorded (though I wish there were some more), a link back to the Hudson job (I told ya those two started talking to each other!) and, most importantly, "Published Modules"

 Published Modules

For each published module – all its dependencies are recorded as well, in case we ever need to go back in time and figure out which dependencies we need to re-create the module:

Published Modules Dependencies

Following "Show In Tree" link we come to the usual artifact’s location in one of our repos where there’s now a new "Builds" tab: 
 
Builds

As you see – we now have exact, bi-directional and traceable information about all jobs that ever deployed our artifacts.

And, like I said, I think it’s a lot, since we now know for each artifact how it ended up in Artifactory, by whom and when. We know which job created it and we know what else was published by this job. For me it’s like the difference between my dad’s old garage (where you can find everything but nobody has any idea how things got there) and my mom’s kitchen (where every little thing has an origin and an owner).

Don’t you love it already?

 

Properties

Historically, Maven doesn’t add much information to an artifact when it’s deployed. Nor does it offer any way to do so. Of course, a certain amount of metadata is added to each *.jar created (like its original POM) and each artifact has the traditional <groupId>:<artifactId>:<version>:<classifier> coordinates (which is a huge improvement over Ant, if we really want to look back for a moment).

But, unfortunately, it only goes so far.
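What coordinates do give us is a location in the repository, and not much more. Here is a minimal sketch of that mapping, assuming the standard Maven 2 repository layout. The function name is made up for illustration, and it ignores details such as timestamped snapshot file names.

```python
def repo_path(coordinates, extension="jar"):
    """Map 'groupId:artifactId:version[:classifier]' to a repository path."""
    parts = coordinates.split(":")
    if len(parts) not in (3, 4):
        raise ValueError("expected groupId:artifactId:version[:classifier]")
    group_id, artifact_id, version = parts[:3]
    classifier = parts[3] if len(parts) == 4 else ""

    file_name = f"{artifact_id}-{version}"
    if classifier:
        file_name += f"-{classifier}"

    # Dots in the groupId become directory separators.
    return "/".join([group_id.replace(".", "/"), artifact_id, version,
                     f"{file_name}.{extension}"])


print(repo_path("com.clearforest:ProductsPage:8.0-SNAPSHOT", "war"))
```

The path says where the artifact lives but nothing about its QA status, the build that produced it or whether it is a final product.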

What if we want to mark or tag or label (pick your favorite name) an artifact?
A group of artifacts?

How about setting a "product=true" property on those artifacts that are final products (and not intermediate jars)? We may talk a lot about artifacts and things but after all – people need working products, right? Those having a "qa.status=passed" label on them. Or at least "qa.status=ok", maybe.

As we can "label" e-mails in Gmail (surprisingly, some people don’t – I think they don’t know what they’re missing) or "tag" Delicious links – I would love to do the same with artifacts!

Some of them I would like to label manually, like QA steps in product lifecycle:
"qa.status = New => Accepted => Rejected => Passed => Graduated (?)"

Other properties I would like to be set automatically, when artifact is deployed: "build.number=35", "product=true" (if artifact is a ZIP file), "jvm=1.6" and the like.
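As a rough illustration of the "automatic" half, the rule set above could be written down like this. It is a sketch of the intent only: the function name is hypothetical and nothing here is an Artifactory API.

```python
import os


def automatic_properties(file_name, build_number, jvm_version):
    """Derive the properties to attach to an artifact when it is deployed."""
    extension = os.path.splitext(file_name)[1].lower()
    return {
        "build.number": str(build_number),
        # Only ZIP files are treated as final products in this example.
        "product": "true" if extension == ".zip" else "false",
        "jvm": jvm_version,
    }


print(automatic_properties("ProductsPage-8.0.zip", 35, "1.6"))
```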

Not surprisingly, there are two ways to set properties in Artifactory:

  • Manual
  • Automatic

The manual process is demonstrated here and, basically, it goes like this:

  • Define a property set: qa.status, qa.version, qa.anything
  • Choose the kind of value allowed for each property: any value, single-select, multi-select
  • Update your repo definition to make this property set available for it
       Watch out! If you miss this step – it will not work (happened twice to me)
  • For any artifact or folder in the tree – go to the new "Properties" tab and add a property

I agree, it’s more similar to Outlook “categories” (than to Gmail labels) and is a little bit involved but .. ok, that’s how it works for now. Maybe it’ll improve.

Anyway, being a software developer for life – I’m naturally more interested in things happening automatically. So how do I set a property on artifact during the build process?

I want to specify a POM <property> that will become an Artifactory property!

The answer is matrix-params:

<distributionManagement>
    <repository>
        <id>qa-releases</id>
        <url>http://srv/artifactory/qa-rel;buildNumber=${number};rev=${rev}</url>
    </repository>
</distributionManagement>

As you see, it is simply a pair of arguments added to the deployment repo definition. Their values are usually taken from regular Maven properties and can be updated by any POM.
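The URL format itself is easy to reproduce. The following is a minimal sketch of how such a URL is put together and taken apart again. Both helper names are hypothetical, the values 35 and 1234 are sample data, and no URL-encoding is done, so it assumes property names and values contain no ";" or "=".

```python
def with_matrix_params(repo_url, properties):
    """Append ';name=value' pairs to a deployment repository URL."""
    params = "".join(f";{name}={value}" for name, value in properties.items())
    return repo_url.rstrip("/") + params


def split_matrix_params(url):
    """Return (repo_url, properties) for a URL built by with_matrix_params."""
    repo_url, *pairs = url.split(";")
    return repo_url, dict(pair.split("=", 1) for pair in pairs)


url = with_matrix_params("http://srv/artifactory/qa-rel",
                         {"buildNumber": "35", "rev": "1234"})
print(url)
print(split_matrix_params(url))
```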
For example, to implement our “Products vs rest of artifacts” vision – all we need to do is to add a "product=${product}" matrix param:

<distributionManagement>
    <repository>
        <id>qa-releases</id>
        <url>http://srv/artifactory/qa-rel;product=${product}</url>
    </repository>
</distributionManagement>

Some top-level <parent> POM will have it set to "false":

<properties>
    <product>false</product>
</properties>

.. but those POMs packaging a final product will have it set to “true”:

<properties>
    <product>true</product>
</properties>

Can it be any simpler than that?!
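For those who like to see the mechanics, here is a simplified model of what happens to "${product}" in the two kinds of POMs. It only imitates the idea (child properties override the parent's, then "${...}" references are substituted). It is not Maven's actual interpolation code, and the function names are invented.

```python
import re


def effective_properties(parent, child):
    """Properties declared in a child POM override those of its parent."""
    merged = dict(parent)
    merged.update(child)
    return merged


def interpolate(text, properties):
    """Replace ${name} with its value; unknown names are left untouched."""
    return re.sub(r"\$\{([^}]+)\}",
                  lambda match: properties.get(match.group(1), match.group(0)),
                  text)


parent_pom = {"product": "false"}
url = "http://srv/artifactory/qa-rel;product=${product}"

# An intermediate jar inherits the default, a product POM overrides it.
library_url = interpolate(url, effective_properties(parent_pom, {}))
product_url = interpolate(url, effective_properties(parent_pom, {"product": "true"}))
print(library_url)
print(product_url)
```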

Here’s a blog post demonstrating the same technique for the purpose of artifacts staging and promotion through tagging.

Today, there’s one problem here, though – if we use Hudson integration and switch to "Hudson deploy" (see above) – this <distributionManagement> tag isn’t worth a lot, is it?

And there’s no way to set up any matrix params from the Hudson job configuration, where the deployment repo is specified. The workaround is simple, though – one just needs to edit the job’s “config.xml” (.hudson/jobs/JobName/config.xml) file manually and then either restart Hudson or use “Manage Hudson” => "Reload Configuration from Disk":

<publishers>
    <org.jfrog.hudson.ArtifactoryRedeployPublisher>
        <details>
            <artifactoryName>http://srv/artifactory</artifactoryName>
            <repositoryKey>qa-rel;product=${product}</repositoryKey>
        </details>
        <deployArtifacts>true</deployArtifacts>
        <username>..</username>
        <scrambledPassword>..</scrambledPassword>
    </org.jfrog.hudson.ArtifactoryRedeployPublisher>
</publishers>

A bug has been opened for this, so I’m sure it’ll be fixed soon.
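If there are many jobs to patch, the manual edit can be scripted. Below is a minimal sketch using Python's standard XML library on an in-memory sample. The element names come from the snippet above, but the <project> root element and the helper name are my assumptions, so check them against a real "config.xml" (and back it up) before trying anything like this.

```python
import xml.etree.ElementTree as ET

# Assumed, trimmed-down job configuration; a real file contains much more.
CONFIG_XML = """<project>
  <publishers>
    <org.jfrog.hudson.ArtifactoryRedeployPublisher>
      <details>
        <artifactoryName>http://srv/artifactory</artifactoryName>
        <repositoryKey>qa-rel</repositoryKey>
      </details>
      <deployArtifacts>true</deployArtifacts>
    </org.jfrog.hudson.ArtifactoryRedeployPublisher>
  </publishers>
</project>"""


def add_matrix_params(config_xml, params):
    """Return the config with ';name=value' pairs appended to repositoryKey."""
    root = ET.fromstring(config_xml)
    for key in root.iter("repositoryKey"):
        repo = key.text.strip().split(";")[0]   # drop params from a previous run
        key.text = repo + "".join(f";{n}={v}" for n, v in params.items())
    return ET.tostring(root, encoding="unicode")


patched = add_matrix_params(CONFIG_XML, {"product": "${product}"})
print(patched)
```

Hudson still has to reload its configuration afterwards, exactly as with the manual edit.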

Ok, so we have our properties (tags, labels) set – now what? How do we use them?
That’s exactly what the next section is about …

 

Smart Searches

In Gmail, searching for labeled mails is a matter of typing "g+l+label" (btw, I much preferred the “Labs” version over the “graduated” one). In Artifactory it’s a little more involved (again) but that’s because Artifactory searches are much more capable.

I believe options provided today would satisfy the most demanding (and esoteric) "querist":

  • Quick Search
  • Class Search
  • GAVC Search
  • Property Search
  • POM/XML Search

The first three are pretty obvious and very helpful indeed. I use GAVC Search most of the time, and a Class Search occasionally. But it still amazes me how fast Artifactory scans through its indices to locate all instances of, say, Scanner class:
 Scanner Search

It’s the new Property Search we’re after today: it lets us compose a query from a number of properties.

Like searching for all artifacts where "qa.version=1.0.1" and "qa.status=In QA".
Or, simply put, “What’s being checked today for the upcoming ‘1.0.1’ release?”

QA Search
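Conceptually, such a query is an AND over property values. Here is a minimal in-memory sketch of that idea. The artifact paths and property values are invented sample data, and this is not how Artifactory implements its search.

```python
def property_search(artifacts, criteria):
    """Return the paths of artifacts whose properties match every criterion."""
    return [path for path, props in artifacts.items()
            if all(props.get(name) == value for name, value in criteria.items())]


# Hypothetical repository content: path -> properties.
artifacts = {
    "qa-rel/app/1.0.1/app-1.0.1.zip": {"qa.version": "1.0.1", "qa.status": "In QA"},
    "qa-rel/lib/1.0.1/lib-1.0.1.jar": {"qa.version": "1.0.1", "qa.status": "Passed"},
    "qa-rel/app/1.0.0/app-1.0.0.zip": {"qa.version": "1.0.0", "qa.status": "In QA"},
}

in_qa = property_search(artifacts, {"qa.version": "1.0.1", "qa.status": "In QA"})
print(in_qa)
```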

It doesn’t matter how properties were set (either manually or automatically) – we can search for all of them! For example, "build.name" and "build.number" are sent by Hudson automatically so we can search by "build.number" as well: 
 Build Number Search

Search results can also be added to or subtracted from each other – this is useful when they need to be either expanded or filtered with additional queries. They can also be saved for later use to perform a single operation on all results, where the options are “Move”, “Copy” and “Delete”.

The blog post I’ve mentioned already shows exactly that – how artifacts can be

  1. Searched for
  2. Promoted to another repository with "Move" / "Copy" operation
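Adding and subtracting results is really just set arithmetic, followed by one bulk operation. The sketch below models it with Python sets and dictionaries. The file names, repositories and the promote helper are all made up for illustration.

```python
def promote(paths, source, target, keep_source=False):
    """Copy (keep_source=True) or move the given paths between two repos."""
    for path in paths:
        target[path] = source[path] if keep_source else source.pop(path)


# Hypothetical staging repository: file name -> content.
staging = {"app-1.0.1.zip": b"...", "lib-1.0.1.jar": b"...", "docs-1.0.1.zip": b"..."}
releases = {}

by_version = {"app-1.0.1.zip", "docs-1.0.1.zip"}   # results of a first search
by_build = {"lib-1.0.1.jar"}                       # results of a second search
rejected = {"docs-1.0.1.zip"}                      # results we want to exclude

saved = (by_version | by_build) - rejected         # "add", then "subtract"
promote(sorted(saved), staging, releases)          # a single "Move" on all results

print(sorted(releases), sorted(staging))
```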

As you see, using Property Search anyone can find what they’re looking for (assuming properties were set in the first place, of course): be it a QA person looking for the latest binaries to download and test, or a Dev manager looking for the binaries being QA-ed now.

The last advanced search is the POM/XML Search, which lets you search through all POMs (in all or specific repos) with XPath queries. I used it yesterday, trying to find out which POMs were using some specific plugin. Normally, I just run a textual search on “pom.xml” files through the whole “trunk” and it takes .. well, quite a while, of course. With Artifactory – it can be done smarter and faster:

XML Search

As you see, "/project/build/plugins/plugin/artifactId" search does the job.
And, of course, it takes less time than an old-school Total Commander textual search
(I don’t even need to measure it – it’s seconds vs minutes!)
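To see what that path selects, here is a small sketch that runs the equivalent query locally on a single, invented POM with Python's standard library. Two caveats: ElementTree paths are relative to the root element, so the leading "/project" is dropped, and the sample POM has no XML namespace, whereas real POMs usually declare one and would need namespace-aware paths.

```python
import xml.etree.ElementTree as ET

# Invented, namespace-free sample POM.
POM = """<project>
  <artifactId>ProductsPage</artifactId>
  <build>
    <plugins>
      <plugin><artifactId>maven-war-plugin</artifactId></plugin>
      <plugin><artifactId>maven-antrun-plugin</artifactId></plugin>
    </plugins>
  </build>
</project>"""


def plugins_used(pom_xml):
    """Return the artifactId of every plugin declared under build/plugins."""
    root = ET.fromstring(pom_xml)          # root is the <project> element
    return [node.text for node in root.findall("build/plugins/plugin/artifactId")]


print(plugins_used(POM))
```

Note that the project's own artifactId is not matched, only the ones under build/plugins/plugin.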

So far I haven’t encountered a case where Artifactory searches were not sufficient.
They’re always smart (though I would call it "capable") enough.

 

Watches

I suppose this one is the easiest to describe and, in fact, there’s probably no need to describe it at all. One can set up “watches” to be notified by e-mail when a certain repository, folder or artifact has a “create” or “delete” operation performed on it:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The following events have recently occurred in Artifactory on items you are watching:

Sun Jan 10 06:30:21 IST 2010 [user-name/XXX.XXX.XXX.XXX] [CREATED] libs-snapshots-local:com/clearforest/ProductsPage/8.0-SNAPSHOT/ProductsPage-8.0-SNAPSHOT.pom
Sun Jan 10 06:30:18 IST 2010 [user-name/XXX.XXX.XXX.XXX] [CREATED] libs-snapshots-local:com/clearforest/ProductsPage/8.0-SNAPSHOT/ProductsPage-8.0-SNAPSHOT.war
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
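With a lot of these e-mails piling up, it may be handy to filter them by script. Here is a minimal sketch that pulls the fields out of one event line. The regular expression is inferred from the sample above only, so treat the format as an assumption, not a documented contract.

```python
import re

# Pattern inferred from the sample notification; not an official format.
EVENT = re.compile(
    r"^(?P<when>.+?) \[(?P<user>[^/\]]+)/(?P<ip>[^\]]+)\] "
    r"\[(?P<action>[A-Z]+)\] (?P<repo>[^:]+):(?P<path>.+)$"
)


def parse_event(line):
    """Return the fields of a watch notification line, or None if no match."""
    match = EVENT.match(line.strip())
    return match.groupdict() if match else None


sample = ("Sun Jan 10 06:30:21 IST 2010 [user-name/XXX.XXX.XXX.XXX] [CREATED] "
          "libs-snapshots-local:com/clearforest/ProductsPage/8.0-SNAPSHOT/"
          "ProductsPage-8.0-SNAPSHOT.pom")
event = parse_event(sample)
print(event["action"], event["repo"], event["path"])
```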

Ironically, about a year ago – I was practically dying to get this kind of notification: somehow, trunk POMs were overwritten by older versions and I suspected someone from the dev team was running “mvn deploy” on outdated sources (but this was not the case, actually).

So although I get quite a lot of e-mails from my “watches” – I just keep them in case anything like that ever happens again. And of course, it becomes even more necessary when certain repos are used for special purposes, like in a staging and promotion scenario.

 

Conclusions

I think “Power Pack” is a critical add-on to what Artifactory offers.

Being able to integrate it with Hudson, set custom properties and search for them is what makes it a nice, organized and watched storage rather than a kitchen sink of everything that happened to be downloaded from somewhere on the Internet.

Whether you have it or don’t – Artifactory surely delivers! But the questions are:

  • How aware are you of what’s happening?
  • How easy is it for you to dig through the mess and find what you need?

Maybe it’s just me, but I just love knowing what’s going on and when. It keeps me in control of things and not the other way around, which I believe is a good thing in general.

Happy building!
