2014-03-18

Wget, Cookies and Firefox

Have you ever wanted to automatically (mass-)download data from a website that requires a login, e.g. a wiki or a social network? If the website stores a session cookie on your computer, it may be possible to automate the download with Wget.

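Wget accepts a cookie file via the --load-cookies option. A call might look like the following (-p also fetches page requisites such as images and stylesheets, -k converts the links for local viewing):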
wget --keep-session-cookies --load-cookies=cookies.txt -p -k https://someurl.org/protected/site_01.htm
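
For the mass-download case, a minimal sketch could generate a list of URLs and hand it to Wget via -i. It assumes, purely for illustration, that the protected pages are numbered site_01.htm, site_02.htm and so on; page_count, build_urls, download_all and urls.txt are hypothetical names, not part of the original setup.

```shell
#!/bin/sh
# Sketch only: assumes the protected pages follow a site_NN.htm numbering scheme.
base_url="https://someurl.org/protected"   # placeholder host from the example above
page_count=3                               # hypothetical number of pages

# Print one URL per line: site_01.htm, site_02.htm, ...
build_urls() {
  i=1
  while [ "$i" -le "$page_count" ]; do
    printf '%s/site_%02d.htm\n' "$base_url" "$i"
    i=$((i + 1))
  done
}

build_urls > urls.txt

# Fetch every URL in urls.txt using the cookies from cookies.txt.
# Defined but not called here, so nothing is downloaded until you run it.
download_all() {
  wget --keep-session-cookies --load-cookies=cookies.txt -p -k -i urls.txt
}
```

Calling download_all afterwards starts the actual download. Keeping the URL list in a separate file makes it easy to inspect or edit before anything is fetched.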

An example cookie file might look as follows (separate the fields with tabs, not spaces!). The fifth field is the expiration time as a Unix timestamp:
# HTTP cookie file.
someurl.org  TRUE  /  FALSE  1391671828  someurlUserID  42
someurl.org  TRUE  /  FALSE  1391671828  someurlUserName  Peter
someurl.org  TRUE  /  FALSE  1391671828  someurlToken  d3d3fdsere
someurl.org  TRUE  /  FALSE  0  someurl_session  g8furfv99dmp1
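
Since editors tend to turn tabs into spaces silently, it can be safer to write the file with printf. The following is a minimal sketch: the cookie names and values are the placeholders from the example above, add_cookie is a hypothetical helper, and the expiration time of one day from now is an arbitrary assumption.

```shell
#!/bin/sh
# Sketch only: writes a cookie file with real tab separators.
cookie_file="cookies.txt"                  # assumed file name
expiry=$(( $(date +%s) + 86400 ))          # assumption: valid for one more day

# Append one cookie line. Arguments: name, value.
# Fields: domain, subdomain flag, path, secure flag, expiry, name, value.
add_cookie() {
  printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    "someurl.org" "TRUE" "/" "FALSE" "$expiry" "$1" "$2" >> "$cookie_file"
}

printf '# HTTP cookie file.\n' > "$cookie_file"
add_cookie someurlUserID 42
add_cookie someurlToken d3d3fdsere

# Sanity check: count non-comment lines that do not have exactly 7 tab-separated fields.
bad_lines=$(awk -F '\t' '!/^#/ && NF != 7 { n++ } END { print n + 0 }' "$cookie_file")
echo "malformed lines: $bad_lines"
```

A line that was accidentally typed with spaces would show up in the malformed count, which is usually quicker than debugging why Wget does not send the cookie.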

After logging in to the website in question, you can conveniently look up the necessary cookies in Firefox, for example in the Storage tab of the developer tools.

The date command (GNU version) can convert the expiration time that Firefox shows for a cookie to the Unix timestamp used in Wget cookie files, e.g. by running:
date -d "Wed 12 Mar 2014 01:31:42 PM CET" +%s
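
The conversion also works in the other direction, which helps to check whether a timestamp in an existing cookie file has already expired. A small sketch, assuming GNU date (the -d option behaves differently on BSD and macOS); to_epoch and from_epoch are hypothetical helper names:

```shell
#!/bin/sh
# Sketch only: relies on GNU date for the -d option.

# Convert a human-readable date string to a Unix timestamp.
to_epoch() {
  date -d "$1" +%s
}

# Convert a Unix timestamp back to a readable UTC date.
from_epoch() {
  date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}

to_epoch "Wed 12 Mar 2014 01:31:42 PM CET"
from_epoch 1391671828    # → 2014-02-06 07:30:28 UTC
```

Comparing a stored timestamp with the output of date +%s shows at a glance whether a cookie is still valid.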