Author Topic: Bulk processing  (Read 229 times)

sgbotsford

  • Newbie
  • *
  • Posts: 6
Bulk processing
« on: March 04, 2019, 09:57:30 PM »
A search so far hasn't revealed the best way to attack this:

Periodically I run "update image files from stored keyword database".

This writes my keywords out to the individual images.

However, I've changed the format of my keywords, after finding out that Apple Aperture doesn't store parent keywords.

Ideally I want this to be file-driven, so that it runs with a single invocation of Exiftool, or a small number of them.

I want to do a global replace of keywords:

Existing Keyword || List of keywords to replace it with.
White Spruce || Spruce, White | Picea glauca | Conifer | Native Tree
Inferno Sugar Maple || Maple, Sugar "Inferno" | Acer sacharnum | Shade Tree | Edible (sap)

At present what I think I would have to do is run Exiftool once to pull all the existing keywords out of the images, sort them, and probably stuff them in a spreadsheet temporarily, then sort in various ways while making the additions.
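As a rough sketch of that first inventory step: assuming the keywords have already been dumped with something like `exiftool -json -Subject -r DIR > keywords.json`, a few lines of Python can tally every distinct keyword before it goes into a spreadsheet. The function name `count_keywords` and the choice of the Subject tag are assumptions for illustration only.

```python
import json
from collections import Counter


def count_keywords(exiftool_json, tag="Subject"):
    """Count how often each keyword occurs across all files in exiftool -json output."""
    counts = Counter()
    for record in json.loads(exiftool_json):
        value = record.get(tag, [])
        # A list-type tag holding a single item is emitted as a bare scalar,
        # so normalise everything to a list first.
        if not isinstance(value, list):
            value = [value]
        counts.update(str(v) for v in value)
    return counts


if __name__ == "__main__":
    # Hypothetical sample standing in for the contents of keywords.json
    sample = """[
      {"SourceFile": "a.jpg", "Subject": ["White Spruce", "Conifer"]},
      {"SourceFile": "b.jpg", "Subject": "White Spruce"},
      {"SourceFile": "c.jpg"}
    ]"""
    for keyword, n in sorted(count_keywords(sample).items()):
        print(f"{n}\t{keyword}")
```

The tab-separated output can be pasted straight into a spreadsheet to build the replacement list.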

Then write a Perl script that runs Exiftool twice per image: once to pull the existing keywords out of the file, after which the script builds the new list of words from the old list, and a second time to write out the new words.  At two calls per image file and 30,000 images, this will take a while.

Is there a better way to approach this?

StarGeek

  • Global Moderator
  • ExifTool Freak
  • *****
  • Posts: 2569
Re: Bulk processing
« Reply #1 on: March 04, 2019, 11:24:58 PM »
I'm guessing that you have a lot of keywords?

Running twice per file would be really inefficient.  I can think of a couple ways to do a bulk replacement using a specialized config file, but it would require a list of the keywords and replacements ahead of time.

Unless someone comes along with a better idea, I'll see what I can do to make a replacement keyword config into which you could just copy/paste a big list of keywords and their replacements.  Can I assume the list you would make would be in the format above?  I would suggest not having spaces around the separators, though.  For example:
White Spruce||Spruce, White|Picea glauca|Conifer|Native Tree
Inferno Sugar Maple||Maple, Sugar "Inferno"|Acer sacharnum|Shade Tree|Edible (sap)
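For anyone who wants to sanity-check a list in this format before using it, here is a minimal Python sketch of how such lines could be parsed. It is not the code from any config file; `parse_replacements` is a hypothetical helper, and it assumes no keyword itself contains a `|` character. Keys are lowercased so that lookups can be case-insensitive, and stray spaces around the separators are tolerated.

```python
def parse_replacements(lines):
    """Parse 'Existing||New1|New2' lines into {lowercased existing: [new, ...]}."""
    table = {}
    for raw in lines:
        line = raw.strip()
        if not line or "||" not in line:
            continue  # skip blank or malformed lines
        old, _, new = line.partition("||")
        old = old.strip()
        # Split the right-hand side on single pipes and drop empty pieces
        replacements = [k.strip() for k in new.split("|") if k.strip()]
        if old and replacements:
            table[old.lower()] = replacements
    return table


if __name__ == "__main__":
    example = [
        "White Spruce||Spruce, White|Picea glauca|Conifer|Native Tree",
        'Inferno Sugar Maple||Maple, Sugar "Inferno"|Acer sacharnum|Shade Tree|Edible (sap)',
    ]
    for old, new in parse_replacements(example).items():
        print(old, "->", new)
```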
Troubleshooting hints:
* When posting, include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).
* Double all percent signs (%) in a Windows batch file.
* If your GPS coords are negative, make sure and set the GpsLatitudeRef and GpsLongitudeRef tags correctly.

StarGeek

  • Global Moderator
  • ExifTool Freak
  • *****
  • Posts: 2569
Re: Bulk processing
« Reply #2 on: March 05, 2019, 02:37:03 PM »
Here's a config file that will replace keywords with the given list of new keywords.  It is currently set to read the Subject tag for the keywords, though this can be changed if needed.

You would take your list of keywords and replacements in the format
ExistingKeyword||ReplacementKeyword1|ReplacementKeyword2|ReplacementKeyword3
and place it at the end of the file after the __DATA__ line.  If this format isn't convenient, let me know what is and I can fix the config.

The replacement of ExistingKeyword is case-insensitive, so White Spruce, white spruce, WHITE SPRUCE, and WhItE SpRuCe would all be replaced.

If no replacements are made for a file, then ReplacementKeywords is undefined.  When replacements are made, any existing keywords that are not in the list are passed through unchanged, so you can simply do something like
exiftool -config ReplacementKeywords.config -if "$ReplacementKeywords" "-Subject<ReplacementKeywords" DIR

Example output:
C:\>exiftool -config ReplacementKeywords.config -Subject -ReplacementKeywords y:\!temp\Test4.jpg
Subject                         : White Spruce, Other Stuff, Inferno Sugar Maple, Conifer
Replacement Keywords            : Spruce, White, Picea glauca, Conifer, Native Tree, Other Stuff, Maple, Sugar "Inferno", Acer sacharnum, Shade Tree, Edible (sap)

In this example, there are two keywords for replacement, one keyword that will not be replaced, and one keyword that will end up being a duplicate.  White Spruce and Inferno Sugar Maple will be replaced by the appropriate lists.  Other Stuff will be passed through.  Conifer from the original list duplicates the Conifer that comes from the White Spruce replacements, but it appears only once in the final list because duplicates are removed.  Also note that "Spruce, White" is a single keyword.  You can double-check this using the -sep option if desired.
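To make the described behavior concrete, here is a small Python sketch that mimics it on the example above. This is an illustration of the rules as described in this post (case-insensitive match, pass-through, duplicate removal, undefined when nothing matched), not the config file's actual code. The names `TABLE` and `replace_keywords` are made up, and the duplicate check here is case-sensitive, which is an assumption.

```python
# Hypothetical replacement table, keyed by the lowercased existing keyword
TABLE = {
    "white spruce": ["Spruce, White", "Picea glauca", "Conifer", "Native Tree"],
    "inferno sugar maple": ['Maple, Sugar "Inferno"', "Acer sacharnum",
                            "Shade Tree", "Edible (sap)"],
}


def replace_keywords(keywords, table):
    """Return the new keyword list, or None if no keyword matched the table."""
    result, seen, changed = [], set(), False
    for kw in keywords:
        replacements = table.get(kw.lower())  # case-insensitive lookup
        if replacements is None:
            candidates = [kw]  # not in the table: pass through unchanged
        else:
            candidates = replacements
            changed = True
        for c in candidates:
            if c not in seen:  # drop duplicates, keeping first-seen order
                seen.add(c)
                result.append(c)
    return result if changed else None


if __name__ == "__main__":
    subject = ["White Spruce", "Other Stuff", "Inferno Sugar Maple", "Conifer"]
    print(replace_keywords(subject, TABLE))
    print(replace_keywords(["Other Stuff"], TABLE))
```

The `None` result for a file with no matching keywords plays the same role as the undefined ReplacementKeywords tag, which is what lets the `-if` condition skip files that need no change.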

I have not tested this extensively, so do some testing of your own first to make sure it does what you want.