Perfect, thanks. So using your URLs as a model I am now getting a .cpd file that shows the image locations in XML format if opened in a text editor. The problem before was that I was pulling them down with `~/utils/getfile/collection/alias/id/pointer/filename/name` as specified at, which worked for downloading the .jpg files, but didn't work for the .cpd. 

Thanks again, 

-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Kyle Banerjee
Sent: Tuesday, March 11, 2014 1:00 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] .cpd file format head scratcher

I haven't messed with this stuff for a while so my knowledge is a bit rusty.
But the basic idea is that if you're messing with cpd files, what you need to do to extract the actual files is to find the object identifiers and then download them the same way you would a non compound object.

If the pointer is to another cpd file, actually getting to the images will require retrieving that cpd file too, so finding the object identifiers is critical. If you're getting an "invalid file" message, that's an indication from CONTENTdm that you didn't actually get the file you needed (i.e. the cpd file) and it just sent a short message that really means it didn't like the format of your request.
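As a rough illustration of finding the object identifiers, here is a minimal Python sketch that reads a cpd file as XML and lists its components. The element names (`cpd`, `page`, `pagetitle`, `pagefile`, `pageptr`) and the sample values are assumptions made for illustration, not something confirmed in this thread, so check them against a cpd file pulled from your own system.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample cpd content. The element names and values below are
# assumptions for illustration; inspect a real cpd file before relying on them.
SAMPLE_CPD = """<cpd>
  <type>Document</type>
  <page>
    <pagetitle>Front cover</pagetitle>
    <pagefile>101.jpg</pagefile>
    <pageptr>100</pageptr>
  </page>
  <page>
    <pagetitle>Back cover</pagetitle>
    <pagefile>103.jpg</pagefile>
    <pageptr>102</pageptr>
  </page>
</cpd>"""


def list_components(cpd_xml):
    """Return (pointer, filename, title) tuples for every <page> element."""
    root = ET.fromstring(cpd_xml)
    components = []
    # iter() walks the whole tree, so <page> elements nested deeper are found too.
    for page in root.iter("page"):
        pointer = page.findtext("pageptr", default="").strip()
        filename = page.findtext("pagefile", default="").strip()
        title = page.findtext("pagetitle", default="").strip()
        components.append((pointer, filename, title))
    return components


for pointer, filename, title in list_components(SAMPLE_CPD):
    print(pointer, filename, title)
```

Each pointer found this way is what you would then request from the server, the same way you would request a non-compound object.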

For example, our institution happens to have CDM. You can see one of the digitized books at

To pull the cpd file, the syntax
what I need.

Inspection reveals the identifiers for the components so if I want the front cover,
just fine. If you're wondering what the extraneous parameters are for, it's to keep the system from delivering a partial image. If your cpd files point to other cpd files, you'll have to traverse them to find the actual images.
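The traversal of cpd files that point to other cpd files could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the element names (`page`, `pagefile`, `pageptr`) are assumed rather than confirmed here, `collect_images` and `fetch_cpd` are hypothetical names, and a small in-memory dictionary stands in for the server so that no request syntax has to be guessed.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the server: maps a pointer to the cpd XML that a
# request for that pointer would return. All values are made up.
FAKE_SERVER = {
    "10": "<cpd>"
          "<page><pagefile>11.cpd</pagefile><pageptr>11</pageptr></page>"
          "<page><pagefile>14.jpg</pagefile><pageptr>14</pageptr></page>"
          "</cpd>",
    "11": "<cpd>"
          "<page><pagefile>12.jpg</pagefile><pageptr>12</pageptr></page>"
          "<page><pagefile>13.jpg</pagefile><pageptr>13</pageptr></page>"
          "</cpd>",
}


def collect_images(pointer, fetch_cpd, seen=None):
    """Follow nested cpd files and return (pointer, filename) for each image.

    fetch_cpd is any callable that takes a pointer and returns cpd XML text.
    """
    seen = set() if seen is None else seen
    if pointer in seen:  # guard against cpd files that point at each other
        return []
    seen.add(pointer)
    images = []
    root = ET.fromstring(fetch_cpd(pointer))
    for page in root.iter("page"):
        child_ptr = page.findtext("pageptr", default="").strip()
        child_file = page.findtext("pagefile", default="").strip()
        if child_file.lower().endswith(".cpd"):
            # Another compound object: descend into it.
            images.extend(collect_images(child_ptr, fetch_cpd, seen))
        else:
            images.append((child_ptr, child_file))
    return images


print(collect_images("10", FAKE_SERVER.__getitem__))
```

In practice `fetch_cpd` would be replaced with whatever request syntax your particular CDM system accepts for cpd files.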

I've experimented with several CDM systems in the past and have found that the syntax that works on one doesn't necessarily work on another, so good luck.


On Tue, Mar 11, 2014 at 9:18 AM, Andrew Gordon <[log in to unmask]> wrote:

> Thanks for the quick and helpful responses everyone, though Kyle and 
> Rachel, I think it being CONTENTdm is what it comes down to. I should 
> have mentioned that I am trying to extract these images from CONTENTdm.
> So I am still a little confused, though, about how the .cpd file in 
> CONTENTdm works. In the export I am noticing that the `cdmfile` 
> variously points to .cpd files in some records and .jpg files in other 
> records. It does not seem to correspond to whether there is a front 
> and back to the image or not (I think they all have front and back in 
> same object though I may be wrong). I am relatively green with 
> CONTENTdm, so this is a learning moment for me.
> If this is the case, how do I go about extracting the image file 
> information from the .cpd files? If they are XML, I am not sure how to 
> read their contents as opening up with a text file or browser shows 
> 'invalid file'.
> Thanks,
> d
> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf 
> Of Kyle Banerjee
> Sent: Tuesday, March 11, 2014 11:42 AM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] .cpd file format head scratcher
> Just out of curiosity, are these files in a DAM or did you get them 
> elsewhere? The reason I ask is that you appear to have a bunch of 
> pharmaceutical cards in a CONTENTdm system at
> If you're trying to extract the image files from CDM, the cpd file is 
> an XML file defining a compound object with object identifiers, 
> filenames, and descriptions. You can iterate through all the images 
> separately
> kyle
> On Tue, Mar 11, 2014 at 8:15 AM, Andrew Gordon <[log in to unmask]> wrote:
> > Hey All,
> >
> > For a set of digitized pharmaceutical cards, I am coming up against 
> > an image file format that seems to be locked in time. It's 
> > supposedly a Compressed PhotoDefiner (?) lossless (.cpd) file ( 
> > Though when I try to load up the 
> > software, I can't get it to take on any of our Windows machines 
> > (running 8 and 7). Don't have a Mac on hand so don't know if that 
> > works or not, currently.
> >
> > In my experience, though, I've always been able to find some rogue 
> > third party file converter (or imagemagick) to be helpful in these 
> > scenarios but this format is just not something that appears to 
> > have been accounted for.
> > Additionally, it's one of those file formats that seem to only pop 
> > up on randomly generated answer sites with questionable downloads in a 
> > Google search, such as 
> >
> >
> > Just wanted to see if anyone has come across this format and whether 
> > there might be any tools to convert it.
> >
> > Thanks,
> > Drew
> >
> >
> >
> > ________________________________________
> > Andrew Gordon, MSI
> > Systems Librarian
> > Center for the History of Medicine and Public Health New York 
> > Academy of Medicine
> > 1216 Fifth Avenue
> > New York, NY, 10029
> > 212.822.7324
> >
> >