import re
import urllib2
from os.path import basename
from urlparse import urlsplit, urljoin

url = "http://www.yahoo.com"
urlContent = urllib2.urlopen(url).read()
# find the src attribute of every HTML <img> tag
imgUrls = re.findall('img .*?src="(.*?)"', urlContent)
# download all images
for imgUrl in imgUrls:
    imgUrl = urljoin(url, imgUrl)  # resolve relative URLs against the page
    imgData = urllib2.urlopen(imgUrl).read()
    fileName = basename(urlsplit(imgUrl).path)
    output = open(fileName, 'wb')
    output.write(imgData)
    output.close()
As you can see, the code above is incredibly simple. All it does is connect to the URL specified in the 'url' variable, search for HTML '<img>' tags using a standard regular expression, and download whatever file each tag's 'src' attribute points to. Earthshaking code? No. Useful? Yes!
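To see what the regular expression actually captures, here is a minimal, self-contained sketch run against a hypothetical snippet of HTML (no network access required):

```python
import re

# a made-up <img> tag for illustration
sample = '<img class="logo" src="http://example.org/logo.gif" alt="">'

# the same pattern used above: lazily skip attributes until src=",
# then capture everything up to the closing quote
found = re.findall('img .*?src="(.*?)"', sample)
print(found)  # ['http://example.org/logo.gif']
```

Note that regex-based HTML parsing like this is fragile (it will miss single-quoted or unquoted attributes, for example); a real HTML parser is more robust, but for a quick recipe the regex keeps things simple.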
The code could be made better if you added the ability to also parse the URLs that appear on the page and follow those to continue parsing. That way, you could point the program at a single URL and have it automatically explore the associated URLs and grab images from those sites too. Still, even as it is, it's pretty useful.
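A first step toward that crawler would be a helper that pulls the 'href' of every '<a>' tag, the same way the image URLs are found, and resolves relative links against the page URL. This is only a sketch (the function name and sample HTML are made up for illustration), again runnable without any network access:

```python
import re
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin      # Python 2, as in the recipe above

def extract_links(pageUrl, htmlText):
    # pull the href of every <a> tag with the same regex style as imgUrls
    hrefs = re.findall('<a .*?href="(.*?)"', htmlText)
    # resolve relative links against the page URL so each can be fetched
    return [urljoin(pageUrl, href) for href in hrefs]

# hypothetical page content for demonstration
sampleHtml = '<a href="/news">News</a> <a href="http://example.org/pics">Pics</a>'
links = extract_links("http://www.yahoo.com", sampleHtml)
print(links)  # ['http://www.yahoo.com/news', 'http://example.org/pics']
```

Feeding each of these links back through the image-grabbing loop (with a visited-set to avoid revisiting pages) would give you the basic recursive crawler described above.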
** Thanks to the folks at ActiveState for this Python recipe!