tinyapps.org / blog


Fixing Japanese characters in vCards with bash

After importing an Outlook Express address book (WAB) into Windows 7 Contacts and exporting the contacts as vCards (VCF), the name fields that contained kanji or kana showed only question marks:
  BEGIN:VCARD
  VERSION:2.1
  N:;????
  FN:????
  EMAIL;PREF;INTERNET:yamada@example.com
  REV:20140221T212743Z
  END:VCARD
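
To see which exported files were affected, something like the following could be used. This is only a sketch: find_mangled_vcards is a made-up helper name, and it assumes a mangled card is one whose FN line holds nothing but question marks, as in the sample above.

```shell
#!/bin/bash

# List the vCards in a directory (default: the current one) whose FN field
# consists solely of question marks.
# find_mangled_vcards is a hypothetical helper, not part of any existing tool.
find_mangled_vcards() {
  local dir="${1:-.}"
  # -l prints only the names of matching files.
  # [?]+ matches one or more literal question marks;
  # \r? tolerates the DOS line endings found in the exported files.
  grep -lE $'^FN:[?]+\r?$' "$dir"/*.vcf
}

# Example: find_mangled_vcards ~/Contacts
```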

However, since the filenames were correct (e.g., 山田太郎.vcf), the name fields were restored with a little bash magic:

#!/bin/bash

# change the line endings from DOS to Unix:
gsed -i $'s/\r$//' *.vcf

# append "FN;CHARSET=UTF-8:" followed by the filename as the last line of each file,
# then move that last line ($) to just after line 2, making it the third line:
for x in *.vcf
do
  echo "FN;CHARSET=UTF-8:$x" >> "$x"
  ed -s "$x" <<< $'$m2\nw'
done

# remove lines beginning with "N" or "FN:" as well as the characters ".vcf"
gsed -i '/^N/d; /^FN:/d; s/\.vcf//g' *.vcf

Now the vCards were ready for import into OS X's Contacts:

  BEGIN:VCARD
  VERSION:2.1
  FN;CHARSET=UTF-8:山田太郎
  EMAIL;PREF;INTERNET:yamada@example.com
  REV:20140221T212743Z
  END:VCARD
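
For anyone without gsed or ed at hand, roughly the same repair can be sketched as a single awk pass that writes a fixed copy and leaves the original file alone. This is not the script used above, just a variant under a few assumptions: fix_vcard_name is a made-up helper name, the new FN line is placed right after the VERSION line rather than at a fixed line number, and only lines starting with "N:", "N;", "FN:" or "FN;" are dropped (the original script removes every line beginning with "N").

```shell
#!/bin/bash

# Print a repaired copy of one vCard to stdout; the input file is not modified.
# fix_vcard_name is a hypothetical helper, not part of the original script.
fix_vcard_name() {
  local file="$1"
  local name
  name="$(basename "$file" .vcf)"   # e.g. 山田太郎.vcf -> 山田太郎

  # The name is passed through the environment so awk does not
  # interpret any backslashes it might contain.
  VCARD_NAME="$name" awk '
    { sub(/\r$/, "") }                      # DOS -> Unix line endings
    /^N[:;]/ || /^FN[:;]/ { next }          # drop the mangled name fields
    { print }
    /^VERSION:/ { print "FN;CHARSET=UTF-8:" ENVIRON["VCARD_NAME"] }
  ' "$file"
}

# Example: write repaired copies into a separate directory
#   mkdir -p fixed
#   for x in *.vcf; do fix_vcard_name "$x" > "fixed/$x"; done
```

Writing to a separate directory keeps the untouched exports around in case the result is not what OS X's Contacts expects.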

UPDATE: Just discovered that exporting from Windows 7 Contacts to CSV preserves the Japanese names correctly. The CSV file can then be converted to vCard with the free (but closed-source and anonymous) CSV to vCard. For what it's worth:

You might think, "Why not just export to CSV, delete the Windows Contacts, reimport the CSV file, and then export to vCard?" That doesn't work either; the resultant vCards still display question marks instead of Japanese characters. Apparently the Windows Contacts vCard export function does not handle Unicode properly. This is true even for contacts originally created in Windows Contacts, not only for those imported from WAB or other formats.

/nix | Feb 21, 2014

