Detect the character encoding of a file

The aforementioned Perl module Unicode::Japanese includes ujguess, which attempts to detect the character encoding of a given file. The Unix program file is often suggested on forums and the like for this purpose, but it is built to identify file types, and its guess at the encoding cannot be relied on for Japanese text. Here's an illustration of the difference, using a Shift JIS-encoded file:

$ file foo
foo: UTF-8 Unicode text, with no line terminators
$ ujguess foo
sjis

and an EUC-encoded one:

$ file bar
bar: ISO-8859 text, with CRLF line terminators
$ ujguess bar
euc

/nix | Jan 03, 2010
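If Perl or Unicode::Japanese isn't at hand, the same idea can be roughly approximated by trial decoding. The following is only a sketch in Python, not how ujguess works internally: the function name guess_encoding and the candidate list are my own assumptions, and it simply reports the first encoding that decodes the bytes without error.

```python
import sys

# Candidate encodings, tried in order (an assumed list, adjust to taste).
# EUC-JP is tried before Shift JIS because EUC-JP bytes often also form
# valid (but wrong) Shift JIS, while the reverse is rarely true.
CANDIDATES = ("ascii", "utf-8", "euc_jp", "shift_jis")


def guess_encoding(data: bytes):
    """Return the first candidate that decodes `data` cleanly, else None."""
    for name in CANDIDATES:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


if __name__ == "__main__":
    # Usage: python guess.py foo bar
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(f"{path}: {guess_encoding(f.read())}")
```

This is a heuristic, so short or unusual files can still be misidentified. ISO-2022-JP text in particular is pure 7-bit and would be reported as ascii here.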