Wget with Firefox Cookies

I recently needed to scrape information from a website that requires a login. The authentication and session information was stored in several cookies, which Wget can use if they are available in a plain text file. I used Firefox to log in and set the cookies, but Firefox saves its cookies in an SQLite database file, which must be exported before Wget can use them. A quick Google search turned up a few possible methods using sqlite3, which I’ve adapted here for use with Wget. I’ve also added some example code to extract hrefs and print them out, along with the webpage URL. The script is called with the target URL as its only command-line argument.
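For readers who want to see the idea before downloading anything, here is a minimal sketch of the export step. It is not the script itself. The function name export_firefox_cookies is made up for illustration, and the sketch assumes the sqlite3 command-line tool is installed and that the Firefox database has a moz_cookies table with host, path, isSecure, expiry, name and value columns, which may differ between Firefox versions.

```shell
#!/bin/sh
# Sketch: export Firefox cookies to a Netscape-format cookies.txt for Wget.
# Assumptions: the sqlite3 CLI is available, and moz_cookies has the columns
# host, path, isSecure, expiry, name, value (check your Firefox version).

# export_firefox_cookies is a hypothetical helper name.
export_firefox_cookies() {
    ff_db=$1        # path to Firefox's cookies.sqlite
    cookie_txt=$2   # destination plain text cookie file

    # Work on a copy, because a running Firefox may hold a lock on the database.
    ff_tmp=$(mktemp) || return 1
    cp "$ff_db" "$ff_tmp" || { rm -f "$ff_tmp"; return 1; }

    {
        echo '# Netscape HTTP Cookie File'
        # Seven tab-separated fields per cookie: domain, subdomain flag,
        # path, secure flag, expiry, name, value.
        sqlite3 -separator "$(printf '\t')" "$ff_tmp" "
            SELECT host,
                   CASE WHEN substr(host, 1, 1) = '.' THEN 'TRUE' ELSE 'FALSE' END,
                   path,
                   CASE WHEN isSecure = 0 THEN 'FALSE' ELSE 'TRUE' END,
                   expiry, name, value
            FROM moz_cookies;"
    } > "$cookie_txt"
    ff_status=$?

    rm -f "$ff_tmp"
    return $ff_status
}
```

With that function defined, something like export_firefox_cookies "$cookie_file" cookies.txt followed by wget --load-cookies cookies.txt --user-agent "$user_agent" "$1" should fetch the page with the Firefox session. Cookies that Firefox has not yet written to the main database file may be missing from the copy.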

Download the wget-as-firefox.sh script.

Unless you use Mac OS X, you’ll probably want to update the $cookie_file path and the $user_agent value. ;-)
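If you would rather not hard-code the path, here is a rough sketch of how the cookie file could be located instead. The function name find_cookie_file is hypothetical. The sketch assumes the usual profile locations on Mac OS X and Linux, so packaged installs (Snap or Flatpak, for example) may keep their profiles elsewhere.

```shell
#!/bin/sh
# Sketch: locate Firefox's cookies.sqlite in the usual profile directory.
# find_cookie_file is a hypothetical helper name; the paths are the common
# defaults and are an assumption about your install.

find_cookie_file() {
    case $(uname -s) in
        Darwin) ff_profiles="$HOME/Library/Application Support/Firefox/Profiles" ;;
        *)      ff_profiles="$HOME/.mozilla/firefox" ;;   # assumed Linux layout
    esac

    # Print the first cookies.sqlite found in any profile directory.
    for ff_candidate in "$ff_profiles"/*/cookies.sqlite; do
        if [ -f "$ff_candidate" ]; then
            printf '%s\n' "$ff_candidate"
            return 0
        fi
    done
    return 1
}
```

The result can then be assigned with cookie_file=$(find_cookie_file). If you have more than one Firefox profile, the first match may not be the one you logged in with, so check the path it prints.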

The sed expression prints nothing by default. It adds a newline after each greater-than character, so that no line can contain more than one href. Then, for any line that contains an href, it replaces the whole line with the href’s value and prints the result.
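The same steps can be sketched as a small pipeline. This is an illustration rather than the exact expression from the script: the function name extract_hrefs is made up, the work is split across two sed calls so that the second one sees the newly split lines, and it only handles lowercase href attributes with double-quoted values.

```shell
#!/bin/sh
# Sketch: print the value of every double-quoted href read from stdin.
# extract_hrefs is a hypothetical helper name.

extract_hrefs() {
    # Step 1: break the line after every '>' so that each tag ends its own line.
    # The backslash followed by a real newline is the portable way to put a
    # newline in a sed replacement.
    sed -e 's/>/>\
/g' |
    # Step 2: -n suppresses the default output; the trailing p prints only the
    # lines where the substitution matched, reduced to the href value.
    sed -n -e 's/.*href="\([^"]*\)".*/\1/p'
}
```

For example, piping a page through it with wget -q -O - "$1" | extract_hrefs prints one link per line. Real-world HTML is messy, so treat this as a quick hack and not as a parser.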
