Adobe XMP with a Hierarchical Subject Array in PHP

This morning I had a bit of a challenge parsing Adobe XMP information for images on Underwater Focus. The Adobe XMP is too complex for SimpleXML, and anyway, I only needed a few values. One of them, the Lightroom hierarchicalSubject keywords, is the reason I’m sharing some of the code I wrote.

Using regular expressions to get at single values is quick and easy, but I wanted to create arrays for `rdf:li` values, and split each `lr:hierarchicalSubject` keyword into an additional second-dimension array.
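
As a minimal sketch of that list handling in isolation, here is the `rdf:li` extraction and the pipe split applied to a hand-written fragment. The sample `$bag` string and its keyword values are made up for illustration and do not come from a real image file:

```php
<?php
// Hypothetical contents of an lr:hierarchicalSubject bag, for illustration only.
$bag = '<rdf:li>Nature|Animal|Fish</rdf:li> <rdf:li>Underwater|Salt Water</rdf:li>';

// Collect the text of every rdf:li element into a flat array.
preg_match_all( '/<rdf:li>([^<]*)<\/rdf:li>/is', $bag, $match );
$items = $match[1];

// Split each item on the pipe character to build the second dimension.
$hierarchs = array();
foreach ( $items as $item ) {
	$hierarchs[] = explode( '|', $item );
}

print_r( $hierarchs );
```

Each element of `$hierarchs` is then one keyword path, ordered from the top-level keyword down to the most specific one.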

<?php
function get_xmp_info( $xml ) {
	$xmp = array();
	foreach ( array(
		'email'         => '<Iptc4xmpCore:CreatorContactInfo[^>]+?CiEmailWork="([^"]*)"',
		'created'       => '<rdf:Description[^>]+?xmp:CreateDate="([^"]*)"',
		'modified'      => '<rdf:Description[^>]+?xmp:ModifyDate="([^"]*)"',
		'state'         => '<rdf:Description[^>]+?photoshop:State="([^"]*)"',
		'country'       => '<rdf:Description[^>]+?photoshop:Country="([^"]*)"',
		'owner'         => '<rdf:Description[^>]+?aux:OwnerName="([^"]*)"',
		'creators'      => '<dc:creator>\s*<rdf:Seq>\s*(.*?)\s*<\/rdf:Seq>\s*<\/dc:creator>',
		'keywords'      => '<dc:subject>\s*<rdf:Bag>\s*(.*?)\s*<\/rdf:Bag>\s*<\/dc:subject>',
		'hierarchs'     => '<lr:hierarchicalSubject>\s*<rdf:Bag>\s*(.*?)\s*<\/rdf:Bag>\s*<\/lr:hierarchicalSubject>',
	) as $key => $regex ) { 

		// get a single text string (empty if the pattern is not found)
		$xmp[$key] = preg_match( "/$regex/is", $xml, $match ) ? $match[1] : '';

		// if the string contains a list, re-assign the variable as an array of the list elements
		$xmp[$key] = preg_match_all( "/<rdf:li>([^<]*)<\/rdf:li>/is", $xmp[$key], $match ) ? $match[1] : $xmp[$key];

		// hierarchical keywords need to be split into a second dimension
		// (skipped when no list was found, since $xmp[$key] is then an empty string)
		if ( $key === 'hierarchs' && is_array( $xmp[$key] ) ) {
			foreach ( $xmp[$key] as $li => $val ) $xmp[$key][$li] = explode( '|', $val );
			unset ( $li, $val );
		}
	}
	return $xmp;
}
?>

A print_r() of the returned $xmp variable looks like this:

Array
(
    [email] => jsm@underwaterfocus.com
    [created] => 2007-05-06T10:04:33
    [modified] => 2012-12-28T22:11:08-05:00
    [state] => Bonaire, Caribbean Netherlands
    [country] => Netherlands
    [owner] => unknown
    [creators] => Array
        (
            [0] => Jean-Sebastien Morisset
        )

    [keywords] => Array
        (
            [0] => Andrea I
            [1] => Animal
            [2] => Atlantic
            [3] => Bonaire
            [4] => Bony
            [5] => Caribbean
            [6] => Caribbean Netherlands
            [7] => Dive Sites
            [8] => Drum
            [9] => Equetus punctatus
            [10] => Fish
            [11] => Macro
            [12] => Nature
            [13] => North of Town Pier
            [14] => Oceans and Islands
            [15] => Odd-Shaped Swimmer
            [16] => Salt Water
            [17] => Sciaenidae
            [18] => Scuba Diving
            [19] => Spotted Drum
            [20] => Underwater
            [21] => Vertebrate
        )

    [hierarchs] => Array
        (
            [0] => Array
                (
                    [0] => Macro
                )

            [1] => Array
                (
                    [0] => Nature
                    [1] => Animal
                    [2] => Vertebrate
                    [3] => Fish
                    [4] => Bony
                    [5] => Odd-Shaped Swimmer
                    [6] => Drum
                    [7] => Spotted Drum
                )

            [2] => Array
                (
                    [0] => Oceans and Islands
                    [1] => Atlantic
                    [2] => Caribbean
                    [3] => Caribbean Netherlands
                    [4] => Bonaire
                )

            [3] => Array
                (
                    [0] => Oceans and Islands
                    [1] => Atlantic
                    [2] => Caribbean
                    [3] => Caribbean Netherlands
                    [4] => Bonaire
                    [5] => Dive Sites
                    [6] => North of Town Pier
                    [7] => Andrea I
                )

            [4] => Array
                (
                    [0] => Scuba Diving
                )

            [5] => Array
                (
                    [0] => Underwater
                )

            [6] => Array
                (
                    [0] => Underwater
                    [1] => Salt Water
                )

        )

)

And here’s how I used the $xmp['hierarchs'] two-dimensional array to print a list of keywords, one hierarchical keyword set per line:

foreach ( $xmp['hierarchs'] as $kwset ) {

        echo '<div class="keyword">';

        foreach ( $kwset as $num => $kw ) {

                if ( $num > 0 ) echo ' . ';

                echo '<a href="/s/', urlencode( strtolower( $kw ) ), '" rel="tag">', $kw, '</a>';
        }

        unset ( $num, $kw );

        echo '</div>', "\n";
}
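
The loop above echoes each keyword exactly as it appears in the XMP. If keywords might contain characters that are special in HTML or URLs, a variant along these lines could be used instead. This is only a sketch: the `keyword_set_html()` helper is a hypothetical name, and it assumes the keyword text may still carry XML entities such as `&amp;`:

```php
<?php
// Hypothetical helper: build the markup for one hierarchical keyword set.
function keyword_set_html( array $kwset ) {
	$links = array();
	foreach ( $kwset as $kw ) {
		// decode XML entities (e.g. &amp;) before building the URL slug
		$slug = urlencode( strtolower( html_entity_decode( $kw, ENT_QUOTES, 'UTF-8' ) ) );
		// escape the label for HTML without double-encoding existing entities
		$label = htmlspecialchars( $kw, ENT_QUOTES, 'UTF-8', false );
		$links[] = '<a href="/s/' . $slug . '" rel="tag">' . $label . '</a>';
	}
	return '<div class="keyword">' . implode( ' . ', $links ) . '</div>';
}

echo keyword_set_html( array( 'Underwater', 'Salt Water' ) ), "\n";
```

Returning a string rather than echoing directly also makes the markup easier to reuse in a template.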

You can see a finished hierarchical keyword example on photograph pages from Underwater Focus.

Update: This PHP code has been improved over time and is the basis for the Adobe XMP for WP WordPress plugin. The plugin reads image files progressively (small chunks at a time) to extract the embedded XMP metadata, instead of reading the whole file into memory (as many other image management plugins do). The extracted XMP data is also cached on disk to improve performance and is refreshed only when the original image is modified. You can use the plugin in one of two ways: calling a method from the $adobeXMP global class object in your template(s), or using an [xmp] shortcode in your Posts or Pages.
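
The chunked-reading idea can be sketched roughly as follows. This is not the plugin's actual code: the `read_xmp_chunked()` name and the 8 KB default chunk size are assumptions for illustration, and the sketch assumes the XMP packet is enclosed in the usual `<x:xmpmeta` ... `</x:xmpmeta>` wrapper:

```php
<?php
// Hypothetical helper, not the plugin's implementation: scan a file in small
// chunks and return the XMP packet, or an empty string if none is found.
function read_xmp_chunked( $path, $chunk_size = 8192 ) {
	$start_tag = '<x:xmpmeta';
	$end_tag   = '</x:xmpmeta>';
	$buffer    = '';
	$started   = false;

	if ( ! is_readable( $path ) ) return '';
	$fh = fopen( $path, 'rb' );
	if ( ! $fh ) return '';

	while ( ! feof( $fh ) ) {
		$chunk = fread( $fh, $chunk_size );
		if ( $chunk === false ) break;
		$buffer .= $chunk;

		if ( ! $started ) {
			$pos = strpos( $buffer, $start_tag );
			if ( $pos === false ) {
				// keep only a short tail, in case the start tag straddles two chunks
				$buffer = (string) substr( $buffer, - ( strlen( $start_tag ) - 1 ) );
				continue;
			}
			// discard everything before the start tag
			$buffer  = substr( $buffer, $pos );
			$started = true;
		}

		// stop reading as soon as the closing tag has arrived
		$end = strpos( $buffer, $end_tag );
		if ( $end !== false ) {
			fclose( $fh );
			return substr( $buffer, 0, $end + strlen( $end_tag ) );
		}
	}
	fclose( $fh );
	return '';
}
```

Before the packet starts, only a few bytes are held in memory at a time, and reading stops at the closing tag, so the image data that follows the packet is never loaded. The returned string can be passed straight to get_xmp_info().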
