Batch crop images with GIMP and ImageMagick #
/nix | Feb 21, 2010
Delete multiple pages in a DjVu document with djvm #
djvm (a command-line tool bundled with DjVuLibre) does not accept multiple pages or page ranges for the delete argument. Here are a few workarounds for deleting all pages in a range, even pages, or odd pages:
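As a minimal sketch of the range case, assuming djvm's single-page delete form (djvm -d doc.djvu pagenum): the helper below deletes one page at a time, counting down from the last page so that the remaining page numbers do not shift mid-loop. The function name and the DJVM override variable are illustrative, not part of DjVuLibre.

```shell
# Delete pages FIRST..LAST (inclusive) from a DjVu document, one djvm call per page.
# Pages are removed from the highest number down so earlier numbers stay valid.
# DJVM is a hypothetical override for the command to run (defaults to djvm).
delete_page_range() {
  doc=$1
  first=$2
  page=$3
  while [ "$page" -ge "$first" ]; do
    ${DJVM:-djvm} -d "$doc" "$page" || return 1
    page=$((page - 1))
  done
}

# Example (commented out): remove pages 10 through 20 of book.djvu
# delete_page_range book.djvu 10 20
```

Setting DJVM=echo first gives a dry run that only prints the commands. Work on a copy of the document, since djvm modifies the file in place.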
/nix | Feb 15, 2010
Linux Mint 8 (Helena) - Add Firefox shortcut / launcher to Desktop #
This should really be simpler...
/nix | Feb 06, 2010
HTML Tidy: Batch processing files #
HTML Tidy does not natively support wildcards in filenames (e.g., *.html), but batch processing in bash is possible with a simple for loop:
for f in *.html; do tidy -m -i "$f"; done
-m = modify original input files
-i = indent element content
(For a complete list of arguments, see the man page.)
/nix | Jan 24, 2010
Detect the character encoding of a file #
The aforementioned Perl module Unicode::Japanese includes ujguess, which attempts to detect the character encoding of a given file. The Unix program file is often suggested on forums and the like for this purpose, but it is mainly a file-type detector, and its encoding guess is unreliable for Japanese text. Here's an illustration of the difference, using a Shift JIS-encoded file:
$ file foo
foo: UTF-8 Unicode text, with no line terminators
$ ujguess foo
sjis
and an EUC-encoded one:
$ file bar
bar: ISO-8859 text, with CRLF line terminators
$ ujguess bar
euc
/nix | Jan 03, 2010
Convert numbers and spaces from full-width (double-byte) to half-width (single-byte) #
Within filenames (using Bash, Perl, and Unicode-Japanese-0.47):
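A rough alternative that avoids the Perl module, assuming UTF-8 filenames and that only the full-width digits ０-９ and the ideographic space need converting, might use plain sed substitutions. The function names below are made up for illustration, and this is not the Unicode::Japanese approach named above.

```shell
# Print the argument with full-width digits and full-width spaces
# replaced by their ASCII (half-width) equivalents.
to_halfwidth() {
  printf '%s\n' "$1" | sed \
    -e 's/０/0/g' -e 's/１/1/g' -e 's/２/2/g' -e 's/３/3/g' -e 's/４/4/g' \
    -e 's/５/5/g' -e 's/６/6/g' -e 's/７/7/g' -e 's/８/8/g' -e 's/９/9/g' \
    -e 's/　/ /g'
}

# Rename every entry directly inside directory $1 whose name changes
# under to_halfwidth. Existing targets are never overwritten.
rename_halfwidth() {
  for old in "$1"/*; do
    [ -e "$old" ] || continue
    new=$1/$(to_halfwidth "${old##*/}")
    [ "$old" = "$new" ] && continue
    [ -e "$new" ] && continue
    mv -- "$old" "$new"
  done
}

# Example (commented out): rename_halfwidth /path/to/files
```

Full-width letters and punctuation are left untouched; they would need further substitutions or a proper conversion library.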
/nix | Jan 02, 2010
Batch replace text in PDF files #
Simple text replacements in simple PDF documents can be made with changepagestring.pl, part of CAM-PDF-1.52, which, by the way, includes many other cool tools like:
$ perl -MCPAN -e shell
If this is the first time you've run CPAN, it will ask you a series of questions - the default answers worked fine for me. When the cpan> prompt appears, install the CAM::PDF module:
cpan> install CAM::PDF
Now let's see if our PDF allows modification:
$ pdfinfo.pl pcasm-book.pdf
File: pcasm-book.pdf
File Size: 1071411 bytes
Pages: 195
Author: Paul A. Carter
CreationDate: D:20050320210800
Creator: LaTeX with hyperref package
Keywords: 80x86 assembly programming
Producer: pdfTeX-1.10b
Subject: 80x86 Assembly Language Programming
Title: PC Assembly Language
Page Size: variable
Optimized: no
PDF version: 1.4
Security
Passwd: none
Print: yes
Modify: yes
Copy: yes
Add: yes
As it does, let's batch replace the word "Borland" with the word "Inprise" and name the new file output.pdf:
$ changepagestring.pl -o pcasm-book.pdf Borland Inprise output.pdf
That seems to have worked, but there are still instances of "Borland" in the file - why were they not changed? The following script by Adam314 will output the entire file, including the hidden PDF formatting codes:
#!/usr/bin/perl
use warnings;
use strict;
use CAM::PDF;
my $infile = '/path/pcasm-book.pdf';
# open the PDF
my $doc = CAM::PDF->new($infile) || die "$CAM::PDF::errstr\n";
# print the raw content stream of every page
for my $page (1..$doc->numPages) {
    my $content = $doc->getPageContent($page);
    print $content;
}
Sure enough, the intact string "Borland" only shows up twice. Where are all the others? Why, split up by hideous formatting codes (the numbers are kerning adjustments placed between runs of glyphs), as in these examples:
Borl)1(and)1('s
Borlan)1(d's)-2
Borlan)1(d)-497
Borl)1(and)-241
In the post linked above, Adam314 offers advice for replacing instances like these with a regex. At this point I grew rather weary, though, especially as text replacements were wont to cut off or run into other words. Still, for simple text replacements in simple PDF documents, changepagestring.pl may come in handy.
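To see what such a regex has to cope with, here is a small sed sketch that joins text runs separated by a small kerning number, so that a dumped page can at least be searched for whole words. It assumes that one- or two-digit adjustments are intra-word kerning and that larger ones (such as -497) stand in for word spaces. That is a heuristic, not something the PDF format guarantees, and the function name is made up for illustration.

```shell
# Read page content on stdin and remove ")N(" or ")-N(" joins where N has
# at most two digits, merging the neighbouring text runs into one string.
# Larger adjustments are kept, since they usually act as word spacing.
strip_kerning() {
  sed -e 's/)-\{0,1\}[0-9]\{1,2\}(//g'
}

# Example (commented out), fed from the page-dump script:
# perl dump.pl | strip_kerning | grep -c Borland
```

This is only suitable for inspecting or searching the dump. Writing the stripped stream back into the PDF would discard the kerning and shift the text slightly.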
/nix | Dec 13, 2009
Blosxom alternatives, microblogs, etc. #
/nix | Dec 03, 2009
Have ls return human readable formats (KB, MB, GB, etc) #
Like du and df, ls supports the -h switch for using unit suffixes (Byte, Kilobyte, Megabyte, Gigabyte, etc), turning this:
$ ls -l
505223 aida16en.zip
10273 atomicwebserver.zip
1359260 camstudio20.zip
into this:
$ ls -lh
494K aida16en.zip
11K atomicwebserver.zip
1.3M camstudio20.zip
/nix | Nov 16, 2009
Create multiple empty files #
The following examples create three empty files of 1MB each:
Unix (bs=1m is the BSD/OS X spelling; GNU dd on Linux expects bs=1M):
$ for i in {1..3}; do dd if=/dev/zero of=/path/$i bs=1m count=1; done
Windows:
C:\>for /L %x in (1,1,3) do fsutil file createnew %x 1048576
Notes:
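The block-size suffix can be avoided altogether by giving bs in bytes, which both GNU and BSD dd accept. A minimal sketch, with a made-up function name and the target directory, file count, and size passed as arguments:

```shell
# Create COUNT zero-filled files named 1..COUNT inside DIR, each BYTES bytes long.
make_empty_files() {
  dir=$1
  count=$2
  bytes=$3
  i=1
  while [ "$i" -le "$count" ]; do
    # One block of BYTES bytes; dd's transfer statistics are discarded.
    dd if=/dev/zero of="$dir/$i" bs="$bytes" count=1 2>/dev/null || return 1
    i=$((i + 1))
  done
}

# Example (commented out): three 1 MB files in /path
# make_empty_files /path 3 1048576
```

Strictly speaking these files are zero-filled rather than empty, the same as the ones fsutil creates on Windows.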
/nix | Nov 13, 2009