Download webpage to .webarchive in Terminal

Webarchiver "allows you to create Safari .webarchive files from the command line":

webarchiver -url https://tinyapps.org -output tinyapps.webarchive

With a bash function, we can automate creating the filename from the page's title tag and include the URL in the "Where from" metadata:

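function dl() {
  local ADDRESS="$1"
  # Take the text of the first <title> tag; tail drops the 7-character "<title>" prefix
  local TITLE
  TITLE=$(curl -s "$ADDRESS" | grep -o "<title>[^<]*" -m 1 | tail -c+8)
  # Adjust the webarchiver path to match your installation
  /Applications/network/webarchiver -url "$ADDRESS" -output "$TITLE.webarchive"
  # Record the source URL in the file's "Where from" metadata
  xattr -w "com.apple.metadata:kMDItemWhereFroms" "$ADDRESS" "$TITLE.webarchive"
}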

Add the above to your .bash_profile, reload with source ~/.bash_profile, and use like so:

$ dl https://tinyapps.org/docs/nvme-sanitize.html
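Page titles often contain characters that make awkward filenames; a "/" in particular would be read as a directory separator in the -output path. Here is a minimal sketch of a cleanup step, assuming a hypothetical helper named safe_name (it is not part of the function above, and the "untitled" fallback is an arbitrary choice):

```bash
# Hypothetical helper: turn a page title into a string that is safer to use
# as a filename. The name and the "untitled" fallback are illustrative only.
safe_name() {
  local name="$1"
  # "/" separates path components, and ":" is displayed as "/" in Finder,
  # so replace both with a hyphen.
  name="${name//\//-}"
  name="${name//:/-}"
  # Collapse runs of whitespace (spaces, tabs, newlines) into single spaces.
  name="$(printf '%s' "$name" | tr -s '[:space:]' ' ')"
  # Trim the single leading/trailing space that may remain after collapsing.
  name="${name# }"
  name="${name% }"
  # Fall back to a fixed name when the title was empty or missing.
  printf '%s\n' "${name:-untitled}"
}
```

Inside dl, the title could then be passed through it before the output path is built, for example TITLE=$(safe_name "$TITLE").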

Title tags can be tricky to parse correctly; here are some other approaches:

as well as another version wherein you manually supply the title/filename:

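function dl() {
  local ADDRESS="$1"
  local FILENAME="$2"
  # Adjust the webarchiver path to match your installation
  /Applications/network/webarchiver -url "$ADDRESS" -output "$FILENAME.webarchive"
  # Record the source URL in the file's "Where from" metadata
  xattr -w "com.apple.metadata:kMDItemWhereFroms" "$ADDRESS" "$FILENAME.webarchive"
}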

calling like so:

$ dl https://tinyapps.org/docs/nvme-sanitize.html "NVMe Sanitize"

Install webarchiver 0.9 via Homebrew (brew install webarchiver) or MacPorts (sudo port install webarchiver), or build it from source with Xcode.

Thanks to kenorb for his simple title regex; I only had to add -m 1 after running across a page containing multiple title tags (which apparently isn't that rare, in spite of the spec).
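To see what -m 1 changes, here is a small self-contained check that feeds made-up HTML to the same pipeline instead of fetching a page with curl. The sample markup and the html variable are illustrative and not taken from any real site:

```bash
# Made-up sample page with two title tags on separate lines (illustration only).
html='<html><head>
<title>First Title</title>
</head><body>
<svg><title>Second Title</title></svg>
</body></html>'

# Without -m 1, grep -o prints one match per title tag. tail then strips the
# "<title>" prefix from the first match only, so the second tag leaks through.
printf '%s\n' "$html" | grep -o "<title>[^<]*" | tail -c +8

# With -m 1, grep stops reading after the first matching line,
# so only the first title is left.
printf '%s\n' "$html" | grep -o -m 1 "<title>[^<]*" | tail -c +8
```

Note that -m 1 counts matching lines, not matches, so two title tags on the same line (as in minified HTML) would still both be printed. Appending | head -n 1 to the pipeline is one way to guard against that.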

/mac | Oct 21, 2019

