How to access and download your Google Web History with wget
Google Web History has been recording every search I have made on Google since about 2005. Six years of search queries and results is a phenomenal amount of data, and it would be nice to get hold of it all to see what I could make of it. Fortunately Google make the data available as an RSS feed, although it’s not particularly well documented.
(caution - many ‘ifs’ coming up next)
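If you’re logged into your Google account, the RSS feed can be accessed at the URL below, where NUM is the number of results to return and START is the offset of the first one: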
https://www.google.com/history/?q=&output=rss&num=NUM&start=START
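To make the two placeholders concrete, here is a small sketch that fills them in. The helper name history_url is my own invention for illustration, and reading NUM as "results per request" and START as "offset of the first result" is an assumption based on how the feed behaves in the examples in this post, not on any official documentation.

```shell
# Build a Web History feed URL for a given page size and offset.
# history_url is a hypothetical helper, not part of any Google tool.
# $1 = number of results to request, $2 = offset of the first result (assumed meaning).
history_url() {
  printf 'https://www.google.com/history/?q=&output=rss&num=%s&start=%s\n' "$1" "$2"
}

# The 1000 most recent searches:
history_url 1000 0
# prints https://www.google.com/history/?q=&output=rss&num=1000&start=0
```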
If you’re using a *nix-based operating system (Linux, Mac OS X etc.) you can then use wget on the command line to get the data. The example below retrieves the 1000 most recent searches in your history:
wget --user=GOOGLE_USERNAME \
--password=PASSWORD --no-check-certificate \
"https://www.google.com/history/?q=&output=rss&num=1000&start=0"
If you’ve enabled 2-factor authentication on your Google account you’ll need to add an app-specific password for wget so it can access your account. The password in the example above should be this app-specific password, not your main account password. If you haven’t enabled 2-factor authentication then you might be able to use your normal account password, but I haven’t tested this.
A simple bash script will then allow you to download the entire search history:
for START in $(seq 0 1000 50000)  # offsets 0, 1000, 2000, ... 50000
do
  wget --user=GOOGLE_USERNAME \
    --password=WGET_APP_SPECIFIC_PASSWORD --no-check-certificate \
    "https://www.google.com/history/?output=rss&num=1000&start=$START"
done
You may need to adjust the numbers in the first line. I had to go up to 50000 to get my entire search history back to 2005; you may need fewer calls if your history is shorter, or more if it’s longer.
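If you’d rather not guess the upper limit, one option is to keep requesting pages until one comes back empty. The sketch below is untested against the live feed and rests on two assumptions: that each search appears as a standard RSS <item> element, and that a request past the end of your history returns a feed with no items. The function names count_items and fetch_all and the history-N.xml file names are made up for this example.

```shell
#!/bin/sh
# Count the <item> elements in a saved feed file (hypothetical helper).
# grep exits non-zero when nothing matches; that is not an error here.
count_items() {
  { grep -o '<item[ >]' "$1" || true; } | wc -l | tr -d ' '
}

# Download pages of 1000 results until a page contains no items.
# Assumes an out-of-range START yields an empty feed rather than an error.
fetch_all() {
  page_size=1000
  start=0
  while :
  do
    out="history-$start.xml"
    wget --user=GOOGLE_USERNAME \
      --password=WGET_APP_SPECIFIC_PASSWORD --no-check-certificate \
      -O "$out" \
      "https://www.google.com/history/?output=rss&num=$page_size&start=$start" || break
    if [ "$(count_items "$out")" -eq 0 ]; then
      rm -f "$out"   # discard the empty final page
      break
    fi
    start=$((start + page_size))
  done
}
```

Nothing is downloaded until you call fetch_all yourself, for example by adding that call as the last line of the script. Each page is saved to its own file, so an interrupted run leaves the earlier pages intact.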