Author Topic: Arg file whose path and contents contain non-ANSI characters?  (Read 3803 times)

Phil Harvey

  • ExifTool Author
  • Administrator
  • ExifTool Freak
  • *****
  • Posts: 13735
    • ExifTool Home Page
Re: Arg file whose path and contents contain non-ANSI characters?
« Reply #15 on: July 02, 2017, 08:16:22 PM »
Hi John,

Thanks for your work on this.  I'll read this in detail when I have some time to absorb/implement your suggestions, and I'll post back here if I have any questions.  It may take me a few days.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

johnrellis

  • Full Member
  • ***
  • Posts: 34
Re: Arg file whose path and contents contain non-ANSI characters?
« Reply #16 on: July 03, 2017, 03:03:14 PM »
I've got a working solution now and personally don't need anything better. But while learning about the issue, I came across the following approach (which you may already be aware of) that might simplify use of ExifTool on Windows:

1. Use the Perl Win32 module to import and call the Windows API function GetCommandLineW() to get the UTF-16-encoded command line.

2. Call CommandLineToArgvW() to parse the command line according to Windows conventions, yielding "argc" and "argv".

3. Use Perl encoding functions or the Windows API WideCharToMultiByte() to convert the UTF-16 strings to UTF-8 for internal use inside ExifTool.

With this approach, no matter what the current console code page is, ExifTool would receive the correct Unicode characters from the command line.  Doing "chcp 65001" to set the code page to UTF-8 would provide the most flexibility, of course.
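The three steps above can be sketched roughly as follows. This is only an illustration in Python with ctypes, not ExifTool code and not the Perl implementation being proposed. The helper names windows_argv() and utf16le_to_utf8() are made up for the example, and on non-Windows platforms the sketch simply falls back to sys.argv so that it still runs.

```python
import ctypes
import sys


def utf16le_to_utf8(raw):
    """Step 3: re-encode a UTF-16LE byte string as UTF-8 bytes."""
    return raw.decode("utf-16-le").encode("utf-8")


def windows_argv():
    """Steps 1-2: fetch the wide command line and parse it the Windows way.

    Hypothetical helper for illustration. Off Windows it returns a copy
    of sys.argv, since the Win32 calls are not available there.
    """
    if sys.platform != "win32":
        return list(sys.argv)

    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    shell32 = ctypes.WinDLL("shell32", use_last_error=True)

    # Step 1: GetCommandLineW() returns the UTF-16 command line.
    kernel32.GetCommandLineW.argtypes = []
    kernel32.GetCommandLineW.restype = wintypes.LPCWSTR

    # Step 2: CommandLineToArgvW() splits it using Windows quoting rules.
    shell32.CommandLineToArgvW.argtypes = [
        wintypes.LPCWSTR,
        ctypes.POINTER(ctypes.c_int),
    ]
    shell32.CommandLineToArgvW.restype = ctypes.POINTER(wintypes.LPWSTR)

    kernel32.LocalFree.argtypes = [ctypes.c_void_p]
    kernel32.LocalFree.restype = ctypes.c_void_p

    argc = ctypes.c_int(0)
    argv = shell32.CommandLineToArgvW(
        kernel32.GetCommandLineW(), ctypes.byref(argc)
    )
    if not argv:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        # ctypes converts each LPWSTR to a Python str (Unicode).
        return [argv[i] for i in range(argc.value)]
    finally:
        # The argv block is allocated by the API and must be freed.
        kernel32.LocalFree(ctypes.cast(argv, ctypes.c_void_p))


if __name__ == "__main__":
    for arg in windows_argv():
        # Step 3: the UTF-8 form that would be used internally.
        print(arg.encode("utf-8"))
```

Note that the list returned on Windows covers the whole process command line, so it starts with the interpreter and script name rather than only the script's arguments. A Perl version would presumably need something like Win32::API to make the same two calls, which is the extra dependency under discussion.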

The existing Win32::CommandLine module does something similar, but its documentation doesn't make clear how it handles Unicode, and it substitutes its own conventions for quoting, globbing, and parsing rather than using the Windows conventions.  (Though I prefer Unix traditions when I'm on Unix, I think I'd prefer Windows conventions while on Windows, especially if I'm using other command-line tools from a batch file.)


Phil Harvey

Re: Arg file whose path and contents contain non-ANSI characters?
« Reply #17 on: July 03, 2017, 09:38:10 PM »
Hi John,

Thanks for this info.  I was really hoping not to have to add another system-specific dependency to ExifTool, but it would be the most complete solution.  I have already patched enough routines with Windows-specific stuff due to this Unicode problem.  I begrudge Microsoft for turning my nice code into a mess of patches, and would like to avoid adding more if possible.

- Phil

johnrellis

Re: Arg file whose path and contents contain non-ANSI characters?
« Reply #18 on: July 03, 2017, 10:07:12 PM »
Yup, totally sympathize.

Phil Harvey

Re: Arg file whose path and contents contain non-ANSI characters?
« Reply #19 on: July 06, 2017, 12:01:00 PM »
I still plan to update the documentation with your recommendations, but things have been busy for me this week and I'm away next week, so it may be a few weeks before I get to this.  I'll post back here when I have done so.

- Phil

johnrellis

Re: Arg file whose path and contents contain non-ANSI characters?
« Reply #20 on: July 06, 2017, 12:17:50 PM »
No worries, and no worries if in the end you don't find the suggested edits helpful.