Author Topic: Some IPTC/XMP related questions in developing CSV workflow  (Read 9801 times)

3design

  • Jr. Member
  • **
  • Posts: 29
Some IPTC/XMP related questions in developing CSV workflow
« on: November 10, 2012, 02:55:05 PM »
Hi,
I've been trying to develop a workflow to pull image data (descriptions, keywords, etc) from text files, into a csv, which I could then apply to all the images by inserting the tags directly from the csv. I have a few questions on structure and procedure. Hopefully someone here can help.

I'm using the following command to pull pre-existing data from images in subfolders into a temp csv file:
exiftool -csv -f -iptc:all . -r *.tif > out.csv
but it's not restricting its output to only the TIF files and I'm not sure what I'm doing wrong. Something is wrong in my syntax.

Next, I've output separate IPTC and XMP CSVs so I could poke around at the available tags. I found that *all* of the XMP fields
Code:
SourceFile,XMPToolkit,CreatorWorkEmail,CreatorWorkURL,CaptionWriter,Instructions,TransmissionReference,Marked,WebStatement,UsageTerms,Creator,Title,Rights,Rating
also exist in the IPTC fields, but that the list of IPTC fields also contains another ~50-75 tags which don't exist in XMP.

Code:
SourceFile,ApplicationRecordVersion,Artist,BitDepth,BitsPerSample,By-line,CaptionWriter,CodedCharacterSet,ColorComponents,ColorType,Compression,Contact,CopyrightNotice,Creator,CreatorWorkEmail,CreatorWorkURL,Credit,CurrentIPTCDigest,DateCreated,DateTimeCreated,DateTimeOriginal,Directory,EditStatus,EncodingProcess,ExifByteOrder,ExifToolVersion,FileAccessDate,FileModifyDate,FileName,FilePermissions,FileSize,FileType,Filter,Headline,ImageHeight,ImageSize,ImageWidth,Instructions,Interlace,JFIFVersion,Marked,MIMEType,ObjectName,OriginalTransmissionReference,OriginatingProgram,PhotometricInterpretation,PixelsPerUnitX,PixelsPerUnitY,PixelUnits,PlanarConfiguration,Prefs,ProgramVersion,Rating,ReleaseDate,ReleaseTime,ResolutionUnit,Rights,RowsPerStrip,SamplesPerPixel,Software,Source,SpecialInstructions,StripByteCounts,StripOffsets,TimeCreated,Title,TransmissionReference,UsageTerms,WebStatement,Writer-Editor,XMPToolkit,XResolution,YCbCrSubSampling,YResolution
Does this mean that if I fill my CSV with all the IPTC fields that also exist as XMP fields, I can write both IPTC and XMP at the same time and they would be synchronized? Or do I need to maintain a separate CSV for IPTC and for XMP?

It seems to me, since I'm just beginning this process, that it's safe to output just the IPTC fields, then edit the CSV to populate all the fields I need, and then use the one IPTC CSV to populate both IPTC and XMP on the way back in... but I'm not sure how I would go about writing IPTC and XMP in one shot...

I'm using OpenOffice Calc to manipulate my CSVs. When OO exports the CSV, the fields are comma-delim but also wrapped in double quotes. Are those quotes going to be inserted into the metadata fields as quotes or will they be stripped and only the content between the quotes will be written to the fields?

Sorry for the long first post... I found exiftool last night and lost half a night of sleep not being able to stop reading about it. :) It seems like it will be immensely helpful with this workflow. I have approx 20k images to tag, and the text file > CSV > exiftool method might just save me a few weeks of work!

Phil Harvey

  • ExifTool Author
  • Administrator
  • ExifTool Freak
  • *****
  • Posts: 14937
    • ExifTool Home Page
Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #1 on: November 11, 2012, 07:32:45 AM »
Quote
exiftool -csv -f -iptc:all . -r *.tif > out.csv
but it's not restricting its output to only the TIF files

The *.tif gives you all ".tif" files in the current directory only, and -r . gives you all files in the current directory and sub-directories.  Drop the *.tif and add -ext tif to do what you want.
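
For anyone scripting this export, here is a minimal Python sketch of the corrected command expressed as an argument list. The helper name `export_args` and its parameters are made up for illustration; only the ExifTool options themselves (-csv, -f, -iptc:all, -r, -ext) come from this thread.

```python
def export_args(root=".", ext="tif", group="iptc"):
    """Build an argv list for a recursive CSV export limited to one extension.

    Hypothetical helper: the options mirror the command discussed in the thread,
    with the shell wildcard dropped in favour of -ext.
    """
    return ["exiftool", "-csv", "-f", f"-{group}:all", "-r", "-ext", ext, root]


# Show the equivalent command line (redirect stdout to out.csv when running it).
print(" ".join(export_args()))
```

Passing such a list to `subprocess.run` with stdout redirected to a file would bypass shell wildcard expansion entirely, so file selection is left to -r and -ext.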

Quote
Next, I've output separate IPTC and XMP CSVs so I could poke around at the available tags. I found that *all* of the XMP fields
Code:
SourceFile,XMPToolkit,CreatorWorkEmail,CreatorWorkURL,CaptionWriter,Instructions,TransmissionReference,Marked,WebStatement,UsageTerms,Creator,Title,Rights,Rating
also exist in the IPTC fields, but that the list of IPTC fields also contains another ~50-75 tags which don't exist in XMP.

I think you are confused about the difference between IPTC and XMP here.  The new IPTCCore and IPTCExt tags actually use XMP already.  ExifTool calls this XMP.  What ExifTool calls IPTC is the old IPTC-IIM format information.  Use -G with ExifTool to see where the tags are really stored.  If you really want to maintain synchronization with the old IPTC (IIM), then the "xmp2iptc.args" and "iptc2xmp.args" files in the full distribution may be useful to you.

Quote
I'm using OpenOffice Calc to manipulate my CSVs. When OO exports the CSV, the fields are comma-delim but also wrapped in double quotes. Are those quotes going to be inserted into the metadata fields as quotes or will they be stripped and only the content between the quotes will be written to the fields?

If the fields are properly quoted, then the quotes will be stripped.
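
As a rough illustration of what "properly quoted" means, the sketch below parses a quoted CSV with Python's standard csv module. This is Python's parser, not ExifTool's, so treat it as an analogy only; the sample row and the helper name `parse_csv` are invented.

```python
import csv
import io


def parse_csv(text):
    """Parse CSV text into a list of rows using standard CSV quoting rules."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Every field is wrapped in double quotes, as a spreadsheet export might produce.
# A literal quote inside a field is written as two double quotes.
sample = '"SourceFile","Headline"\r\n"./a.tif","Sunset, ""golden hour"""\r\n'
rows = parse_csv(sample)

for row in rows:
    print(row)
```

The wrapping quotes disappear, the comma inside the quoted field stays part of the value, and the doubled quotes collapse to single ones.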

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

3design

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #2 on: November 11, 2012, 12:00:27 PM »
Quote
The *.tif gives you all ".tif" files in the current directory only, and -r . gives you all files in the current directory and sub-directories.  Drop the *.tif and add -ext tif to do what you want.

Hi Phil,
Thanks for clarifying the syntax. I'll give that a try this afternoon and then read up some more on the command line syntax so I can learn what I was doing wrong the first time.

Quote
I think you are confused about the difference between IPTC and XMP here.  The new IPTCCore and IPTCExt tags actually use XMP already.  ExifTool calls this XMP.  What ExifTool calls IPTC is the old IPTC-IIM format information.  Use -G with ExifTool to see where the tags are really stored.  If you really want to maintain synchronization with the old IPTC (IIM), then the "xmp2iptc.args" and "iptc2xmp.args" files in the full distribution may be useful to you.

No doubt I'm confused, but learning. :) A few questions:

When I output metadata using -iptc:all > out.csv and the resulting CSV shows me approx 75 tags, those are the old IPTC-IIM tags only?

When I output metadata using -xmp:all > out.csv the resulting CSV shows me ~15 tags. Am I correct that those 15 tags do not represent the total amount of possible XMP tags, but only the tags that are present in at least 1 image from the set of images I read? If I want to add XMP tags which are not already filled in the files, it's simply a matter of adding a column to the CSV file with the correct IPTC or XMP field name? Does that column need to be in a specific order, or can I simply add it to the end of the columns?

This is the result I got when running this command: exiftool -csv -G -f -r . -ext tif > output.csv

Code:
SourceFile,Composite:DateTimeCreated,Composite:DateTimeOriginal,Composite:ImageSize,EXIF:Artist,EXIF:BitsPerSample,EXIF:Compression,
EXIF:ImageHeight,EXIF:ImageWidth,EXIF:PhotometricInterpretation,EXIF:PlanarConfiguration,EXIF:ResolutionUnit,EXIF:RowsPerStrip,EXIF:SamplesPerPixel,
EXIF:Software,EXIF:StripByteCounts,EXIF:StripOffsets,EXIF:XResolution,EXIF:YResolution,ExifTool:ExifToolVersion,File:CurrentIPTCDigest,File:Directory,
File:ExifByteOrder,File:FileAccessDate,File:FileModifyDate,File:FileName,File:FilePermissions,File:FileSize,File:FileType,File:MIMEType,IPTC:ApplicationRecordVersion,
IPTC:By-line,IPTC:CodedCharacterSet,IPTC:Contact,IPTC:CopyrightNotice,IPTC:Credit,IPTC:DateCreated,IPTC:EditStatus,IPTC:Headline,IPTC:ObjectName,
IPTC:OriginalTransmissionReference,IPTC:OriginatingProgram,IPTC:Prefs,IPTC:ProgramVersion,IPTC:ReleaseDate,IPTC:ReleaseTime,IPTC:Source,
IPTC:SpecialInstructions,IPTC:TimeCreated,IPTC:Writer-Editor,XMP:CaptionWriter,XMP:Creator,XMP:CreatorWorkEmail,XMP:CreatorWorkURL,XMP:Instructions,
XMP:Marked,XMP:Rating,XMP:Rights,XMP:Title,XMP:TransmissionReference,XMP:UsageTerms,XMP:WebStatement,XMP:XMPToolkit

So my existing images contain a mixture of tag types. Up to now I was using a combination of xnView and Photo Mechanic to apply tags. Photo Mechanic applies "IPTC/XMP" but now I'm unclear as to whether it's the old IPTC or the new.

Am I correct that if I simply maintain this same column structure / field headers in the CSV, I'll be safe in writing these tags to the images when using the CSV as the source?

Is there some list which shows a translation of XMP tags to IPTC tags? For example, the IPTC ObjectName tag doesn't exist in XMP.. Is there an easy way to find equivalent tag names?

Phil Harvey

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #3 on: November 11, 2012, 03:37:19 PM »
Quote
When I output metadata using -iptc:all > out.csv and the resulting CSV shows me approx 75 tags, those are the old IPTC-IIM tags only?

Yes.  Also add -a to be sure you get them all (see FAQ 3).

Quote
When I output metadata using -xmp:all > out.csv the resulting CSV shows me ~15 tags. Am I correct that those 15 tags do not represent the total amount of possible XMP tags, but only the tags that are present in at least 1 image from the set of images I read?

Yes.  If you want to see all XMP tags, do this:

exiftool -list -xmp:all

You will get a list of 884 tag names. (I know, more than you bargained for.)

Quote
If I want to add XMP tags which are not already filled in the files, it's simply a matter of adding a column to the CSV file with the correct IPTC or XMP field name? Does that column need to be in a specific order, or can I simply add it to the end of the columns?

Any order will do.
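
Since column order does not matter, appending a new tag column to an exported CSV can be scripted. A minimal sketch, assuming the CSV fits in memory; `add_column` is a hypothetical helper and the sample data is invented. How empty cells are treated on import is worth checking in the ExifTool documentation before relying on this.

```python
import csv
import io


def add_column(csv_text, name, default=""):
    """Return csv_text with a column `name` appended if it is not already present."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    fields = list(reader.fieldnames or [])
    if name not in fields:
        fields.append(name)  # position is arbitrary; the end is simplest
    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=fields, restval=default, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # rows lacking the new key are filled with `default`
    return out.getvalue()


exported = "SourceFile,IPTC:Headline\n./a.tif,Old\n"
print(add_column(exported, "XMP:Headline"))
```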

Quote
Am I correct that if I simply maintain this same column structure / field headers in the CSV, I'll be safe in writing these tags to the images when using the CSV as the source?

It depends on what you mean by "safe".  Existing information will be overwritten.  But then ExifTool creates a "_original" backup for you, so in that sense you're always safe.

Quote
Is there some list which shows a translation of XMP tags to IPTC tags? For example, the IPTC ObjectName tag doesn't exist in XMP.. Is there an easy way to find equivalent tag names?

The IPTCCore specification lists this in detail (I think).  Or you could look at what I am doing in iptc2xmp.args and xmp2iptc.args.

- Phil

3design

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #4 on: November 11, 2012, 04:33:14 PM »
Hi Phil,

Thanks for the reply. I'm getting a somewhat better grasp on how this all works, but am having some difficulty with writing a test folder\subfolder from a csv I exported. I did a basic export using:

e:\img_output\INCOMING\TEST\exiftool -csv -G --filename --directory -charset type=UTF8 -r . -ext tif > testoutput.csv

I used the --filename and --directory switches after reading this on the documentation page (http://www.sno.phy.queensu.ca/~phil/exiftool/exiftool_pod.html):
Quote
-args (-argFormat)
    Output information in the form of exiftool arguments, suitable for use with the -@ option when writing. May be combined with the -G option to include group names. This feature may be used to effectively copy tags between images, but allows the metadata to be altered by editing the intermediate file (out.args in this example):
        exiftool -args -G1 --filename --directory src.jpg > out.args
        exiftool -@ out.args dst.jpg
    Note: Be careful when copying information with this technique since it is easy to write tags which are normally considered "unsafe". For instance, the FileName and Directory tags are excluded in the example above to avoid renaming and moving the destination file. Also note that the second command above will produce warning messages for any tags which are not writable.

So that gave me a CSV file. I opened it up in a text editor and made some random changes, adding alphabetical and numerical sequences in a bunch of fields, just as a test. I then saved the CSV  and ran the following command:

E:\img_output\INCOMING\TEST>exiftool -csv=testoutput.csv -r e:\img_output\INCOMING\TEST

and I get a long string of:

No SourceFile 'e:/img_output/INCOMING/TEST/test_filename.tif' in imported CSV database

I'm not sure what the error is. I'm just generating a CSV, making a couple edits, and trying to write it straight back to the files. I read a few other threads where people had similar problems, so I tried including the full path, not including it, etc., and always get the same error. Is it normal that the forward slashes and backslashes seem to be inverted? Do I even need to exclude FileName and Directory with --filename and --directory as I'm doing?

Phil Harvey

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #5 on: November 11, 2012, 07:06:56 PM »
The -csv option is exactly reversible:

exiftool -csv -r . > out.csv

is the inverse of

exiftool -csv=out.csv -r .

If you are in the same directory and specify your file names in the same way then it should work.
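
One way to spot a mismatch before importing is to compare the SourceFile column with the paths you intend to pass. The sketch below assumes an exact string comparison, which is what the "No SourceFile" message suggests but remains an assumption about how ExifTool matches entries; `unmatched_files` is a hypothetical helper and the paths are examples.

```python
import csv
import io


def unmatched_files(csv_text, requested):
    """Return the requested paths that have no identical SourceFile entry."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    known = {row["SourceFile"] for row in reader}
    return [path for path in requested if path not in known]


# Exported from inside the folder with "-r .", so SourceFile values are relative.
exported = "SourceFile,IPTC:Headline\n./sub/test_filename.tif,Hello\n"

# Importing with an absolute directory produces differently spelled paths.
print(unmatched_files(exported, [
    "./sub/test_filename.tif",
    "e:/img_output/INCOMING/TEST/sub/test_filename.tif",
]))
```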

- Phil

3design

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #6 on: November 11, 2012, 10:20:41 PM »
Thanks Phil, I modified the command line as you suggested and it properly wrote my 7 test files in 3 subdirectories.

Is it possible to specify a filename string using wildcards when reading to csv and writing from csv? For example, if I want to restrict the read/write to a subset of the tif files. Something like this: "constant_string* 01*.tif"
Notice the double wildcard and the space within the name.

Also, the cataloging that I'm working on requires a lot of custom csv fields for my own internal use (i.e. fields which are not necessarily image related in the obvious sense, but pertain more so to categories, merchant ID#'s, prices, related images, etc etc). I'd like to be able to keep these custom fields in the same csv file, using my own tags (i.e. internal_cat, internal_merchID, internal_price, etc etc) but I don't want to run the risk of having those fields written to my images. They're just for my own administrative use in the CSV file. I'll admit once more to my lack of understanding of IPTC/XMP tagging in general and ask, is it possible for completely custom fields to be written to images? How do I avoid doing so, if I'm writing the entire CSV to my directories of images?

Phil Harvey

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #7 on: November 12, 2012, 07:10:19 AM »
Quote
Is it possible to specify a filename string using wildcards when reading to csv and writing from csv? For example, if I want to restrict the read/write to a subset of the tif files. Something like this: "constant_string* 01*.tif"
Notice the double wildcard and the space within the name.

The shell globbing isn't powerful enough to handle this sort of thing in a directory hierarchy.  You can use ExifTool's -if option to do this, but there is a performance penalty because it will still read the metadata from all files:

exiftool -if '$filename =~ /^constant_string.*01.*\.tif$/' ...

Quote
Also, the cataloging that I'm working on requires a lot of custom csv fields for my own internal use (i.e. fields which are not necessarily image related in the obvious sense, but pertain more so to categories, merchant ID#'s, prices, related images, etc etc). I'd like to be able to keep these custom fields in the same csv file, using my own tags (i.e. internal_cat, internal_merchID, internal_price, etc etc) but I don't want to run the risk of having those fields written to my images. They're just for my own administrative use in the CSV file. I'll admit once more to my lack of understanding of IPTC/XMP tagging in general and ask, is it possible for completely custom fields to be written to images? How do I avoid doing so, if I'm writing the entire CSV to my directories of images?

To write custom tags, you first need to define them as user-defined tags.  But if you get unlucky, your tags could have the same name as existing writable tags.  To avoid this, just use a bogus group name for each of these tags (i.e. "MyGroup:MyTag").  Then you can guarantee that they won't be written.
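
The bogus-group idea can be applied to a CSV header mechanically. A minimal sketch, assuming the internal column names are known in advance; `protect_columns` is a hypothetical helper, and "MyGroup" is simply the placeholder group name used above.

```python
def protect_columns(header, private, group="MyGroup"):
    """Prefix internal-use column names with a group name that matches no real group."""
    return [f"{group}:{name}" if name in private else name for name in header]


header = ["SourceFile", "IPTC:Headline", "internal_cat", "internal_price"]
private = {"internal_cat", "internal_price"}

print(protect_columns(header, private))
```

Columns that already carry a real group name are left alone, so only the bookkeeping fields are affected.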

- Phil

3design

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #8 on: November 12, 2012, 09:36:37 AM »
Quote
The shell globbing isn't powerful enough to handle this sort of thing in a directory hierarchy.  You can use ExifTool's -if option to do this, but there is a performance penalty because it will still read the metadata from all files:
exiftool -if '$filename =~ /^constant_string.*01.*\.tif$/' ...

Is this only able to be run from the perl library version? I'm currently using the windows standalone and I must have tried 2 dozen variants of the command. I get "File not found: ~"

This is a more exact example of a filename structure I want to isolate:
constant1 constant2 - category sub category ABC123.tif

In this case, the terms "constant1 constant2" will be constant across hundreds of directories.. everything from the hyphen onwards will be variable except for the TIF extension.

This is the command I'm executing:
exiftool -if '$filename =~ /^constant1.constant2.*\.tif$/' -csv -G --filename --directory -r . -ext tif > test.csv

I've tried single quotes, double quotes, including only the first constant, removing the extension... everything I could think of.

Phil Harvey

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #9 on: November 12, 2012, 10:38:21 AM »
Quote
Is this only able to be run from the perl library version? I'm currently using the windows standalone and I must have tried 2 dozen variants of the command. I get "File not found: ~"

Sorry.  In Windows you must use double quotes, not single.  You say you've tried them, but it really should work like this in Windows:

exiftool  -if "$filename =~ /^constant_string.*01.*\.tif$/" ...

- Phil

3design

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #10 on: November 12, 2012, 01:47:54 PM »
Hmm.. this definitely isn't working. I've lost count of how many variations I've tried now.

Also, as a test, I shortened the filename to just AAA.tif and it still doesn't find it, so I think it has to be something with that syntax in general.


Phil Harvey

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #11 on: November 12, 2012, 03:03:25 PM »
I'll see if I can try this in Windows if I get a chance.

- Phil

3design

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #12 on: November 12, 2012, 03:53:41 PM »
Thanks! I also tried it via the perl version under strawberry perl and it's the same result.
(Also encountered an error in the second step of the 'make' procedure when attempting to install the Image::ExifTool package)
This is the error:
D:\utility\EXIFtool>make
to undefined at D:/utility/StrawberryPerl/strawberry-perl-5.14.2.1-64bit-portable/perl/lib/ExtUtils/Install.pm line 1208
make: *** [pm_to_blib] Error 2

A quick question re custom tags and config file... Theoretically, if my tags in the CSV were some completely crazy string (i.e. 4398ew650e43e6re0:gibberish_image_customtag) then I wouldn't necessarily even need a config file, right? In other words, if the group name is something completely unique, like a trademarked name or a phonetic spelling from another language (just as examples), and the field name likewise, then the chances of that tag actually existing in any group are for all intents and purposes zero, and it would not get written anyway. So in that situation, is a config file unnecessary?

Phil Harvey

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #13 on: November 12, 2012, 06:08:27 PM »
Re the "make" problem:  You can run exiftool directly.  It doesn't need to be built.

You only need the config file if you want to write custom tags.  Since you don't want to write them, you don't need a config file.  I just mentioned the config file earlier to try to make this point.

- Phil

Phil Harvey

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #14 on: November 15, 2012, 09:40:57 AM »
Quote
I'll see if I can try this in Windows if I get a chance.

Yes, it doesn't work in Windows.  When I add -v I see that the problem is "Search pattern not terminated".

By trial and error, it seems that doubling the "$" fixes this:

exiftool  -if "$filename =~ /^constant_string.*01.*\.tif$$/" ...

I tried this and it is also a problem on the Mac.  Ah, right.  ExifTool translates the "$/" to a newline, which messes up the search expression.  But "$$" is translated to a single "$".  Sorry for not realizing this sooner.
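
When the command is generated by a script, the doubling can be applied automatically. A minimal sketch, relying only on the behaviour described above ("$/" is translated, "$$" becomes a single "$"); `if_filename_arg` is a hypothetical helper, the sample filenames are invented, and a regex that ends in an escaped dollar (`\$`) is not handled.

```python
import re


def if_filename_arg(perl_regex):
    """Wrap a regex in an ExifTool -if condition on $filename.

    A trailing "$" anchor is doubled so it is not followed directly by "/".
    """
    if perl_regex.endswith("$") and not perl_regex.endswith("$$"):
        perl_regex += "$"
    return f"$filename =~ /{perl_regex}/"


pattern = r"^constant_string.*01.*\.tif$"
condition = if_filename_arg(pattern)

# As an argv list (no shell involved), so shell quoting of "$" is not an issue here.
argv = ["exiftool", "-if", condition, "-csv", "-r", "-ext", "tif", "."]

# Sanity-check the pattern itself with Python's re, which behaves like Perl
# for an expression this simple.
matches = re.search(pattern, "constant_string A 01B.tif") is not None

print(condition)
print(matches)
```

Checking the expression against a few real filenames this way is cheaper than waiting for ExifTool to scan a large directory tree.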

- Phil

3design

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #15 on: November 15, 2012, 01:44:03 PM »
Well that's something I didn't try. Thanks for following up on this; hopefully others will find it useful as well.

3design

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #16 on: November 20, 2012, 02:05:55 PM »
Finally had a chance to sit down and almost finalize my workflow.. I have one additional question which I haven't been able to solve so far.. I'm using the following command to extract a specific list of tags, from a specific list of files, into a csv:

exiftool -if "$filename =~ /^CONSTANT.*STRING.*\.tif$$/" -csv -f -a -s -G -charset UTF8 -Composite:DateTimeCreated -Composite:DateTimeOriginal -Composite:ImageSize -EXIF:Artist -EXIF:BitsPerSample -r . -ext tif > MASTER_TagList.csv

Even with the -f and -G switches, the group name is still not included for empty tags. So, for example, the resulting list of field headers for this command is:

SourceFile,DateTimeCreated,DateTimeOriginal,Composite:ImageSize,Artist,EXIF:BitsPerSample

Is there a way to also force group names for tags without content? IOW, in the above list, to force the field header EXIF:Artist instead of just Artist...

EDIT:
Another problematic situation which I think might be related.. When running the same command above, on brand new images which haven't yet been tagged:

exiftool -if "$filename =~ /^CONSTANT.*STRING.*\.tif$$/" -csv -f -a -s -G -charset UTF8 -IPTC:Headline -XMP:Headline -r . -ext tif > MASTER_TagList.csv

The resulting csv contains fields:
SourceFile,Headline

So in this instance there are two problems.. The IPTC: and XMP: group names aren't being added to the field headers, because those fields are empty in the file. AND, no matter what I've tried, the field is only being created once since the description is the same, even though I'm including the -a switch. Ideally, I'd like to see the following output in the CSV:
SourceFile,IPTC:Headline,XMP:Headline

Even if both headline tags are empty in the images themselves.
« Last Edit: November 20, 2012, 02:55:58 PM by 3design »

Phil Harvey

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #17 on: November 20, 2012, 07:58:56 PM »
Sorry, but what you are asking is not possible in general.  ExifTool doesn't know the group names until it reads the file.  The groups are dynamic, and even though you specify a group when you request the tag, this is no guarantee that the group you print will be the same.  For example:

exiftool -exif:artist -G1 -f FILE

will likely return "IFD0:Artist", or maybe "ExifIFD:Artist" since some software erroneously stores Artist in the EXIF IFD.  If the Artist tag doesn't exist, ExifTool doesn't know the group, so it can't print it.

- Phil

3design

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #18 on: November 20, 2012, 08:38:52 PM »
I see.. and using a command such as

exiftool -if "$filename =~ /^CONSTANT.*STRING.*\.tif$$/" -csv -f -a -s -G -charset UTF8 -1IPTC:Headline -2XMP:Headline -r . -ext tif > MASTER_TagList.csv

wouldn't work? I recall reading someplace that tag groups could be specified in that way. But I see how both tags being empty (as yet untagged) in the first place is causing the problem.

I had convinced myself that I'd be able to pull a consistent set of CSV field headers from all the files I'm working with, and it's looking like that isn't going to work, so it's a setback in creating a master csv to store everything in. Ideally I'd like to be able to just append new csv files into an existing csv that contains the master set of fields, as I read additional directories, without having to hunt down which columns exist and which don't and which need to be shifted and which need to be removed and which need to be added or renamed. I'm thinking there might not be a way to do that.

Phil Harvey

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #19 on: November 21, 2012, 07:07:58 AM »
You mean -0IPTC:Headline -0XMP:Headline.  Technically, yes, in this case ExifTool does have enough information to know the -G0 group names (aside from the case, which would have to be taken from the command line).  However, this is a very special case and would require dedicated code (and try to explain this one in the documentation!).  Also, I'm not sure if it makes sense to allow ExifTool to output arbitrary group names (ie. -0SomeNon-ExistentGroup:Headline).

Instead, why not create a (small) dummy file with all of the information you want, then parse this along with the files you are interested in.  Then the output is consistent and all you have to do is delete one row.
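
A different route to a consistent column set, not the dummy-file approach itself, is to post-process each exported CSV against a fixed master header list. This is only a sketch: `conform_to_master` is a hypothetical helper, the master list and sample rows are invented, and it assumes the CSV fits in memory.

```python
import csv
import io


def conform_to_master(csv_text, master_fields):
    """Rewrite csv_text with exactly the master columns, in master order.

    Returns (new_csv_text, extras), where extras lists the columns that were
    present in the input but are not in the master list.
    """
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    extras = [f for f in (reader.fieldnames or []) if f not in master_fields]
    out = io.StringIO(newline="")
    writer = csv.DictWriter(
        out,
        fieldnames=master_fields,
        restval="",             # missing master columns become empty cells
        extrasaction="ignore",  # unexpected columns are dropped, but reported
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue(), extras


master = ["SourceFile", "IPTC:Headline", "XMP:Headline"]
new_batch = "SourceFile,Headline,XMP:Headline\n./new.tif,-,Title\n"

conformed, extras = conform_to_master(new_batch, master)
print(conformed)
print(extras)
```

Reporting the extra columns rather than silently discarding them makes it easy to decide whether a newly seen tag should be added to the master list.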

- Phil

3design

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #20 on: November 21, 2012, 04:11:50 PM »
Actually, that's essentially what I've done.. I created a template image in the top folder of any directory structure I read in. However, when reading-in different folder structures with images created over many years, additional tags are constantly popping up that don't exist in the template. It's a headache when a master csv file contains columns from A - EQ and you think you grabbed all the tags, then you read in a new folder and the resulting columns go from A - ES.. now you have an extra 2 columns in the new data.. what are they? Where are they? Now you have to hunt all 150 or so columns and find which ones are missing, then create them in the master csv so columns don't get skewed if you copy/paste the new data into the existing file.

I've also tried running that template image file against an explicit command line containing all the tags I want to grab.. the command line is basically a huge paragraph of tags

exiftool -csv -f -G -charset UTF8 [MEGA LIST OF TAGS HERE] -r . -ext tif > MASTER_OUTPUT.csv

But if it attempts to read tags (i.e. IPTC:Headline and XMP:Headline) from the template *and* from brand-new untagged files, then instead of just:

IPTC:Headline,XMP:Headline

the resulting field headers in the csv become:

Headline,IPTC:Headline,XMP:Headline

because while it pulls the IPTC: and XMP: field headers from the template image, it also combines those same two nonexistent Headlines into one 'Headline' field for all the untagged files. That kind of situation just throws a wrench into streamlining the import of new data because additional columns in the new data keep popping up to skew the total # of columns.

I suppose I can just read all ~20,000 files in one run, so all the possible tags are aggregated at once, but the problem is new images are created on an almost daily basis, so the potential for "rogue" additional columns is always there. What I was hoping for was a way in which to read, for example, IPTC:Headline and XMP:Headline from an image and create field headers entitled IPTC:Headline and XMP:Headline in the csv (whether those tags are in the image or not) but if they're not in the image, *do not* create the plain old "Headline" field in the csv. I think that would be a step closer to creating a consistent output each time.

(sorry for the long post, trying to brainstorm a solution to this as I run through it)

Phil Harvey

Re: Some IPTC/XMP related questions in developing CSV workflow
« Reply #21 on: November 21, 2012, 07:09:30 PM »
Quote
But if it attempts to read tags (i.e. IPTC:Headline and XMP:Headline) from the template *and* from brand-new untagged files, then instead of just:

IPTC:Headline,XMP:Headline

the resulting field headers in the csv become:

Headline,IPTC:Headline,XMP:Headline

This won't happen without the -f option.

- Phil