Problem HTTP error 403 in Python 3 Web Scraping

This is probably because of mod_security or some similar server security feature which blocks known spider/bot user agents (urllib uses something like python urllib/3.3.0, it’s easily detected). Try setting a known browser user agent with:

from urllib.request import Request, urlopen

req = Request('http://www.cmegroup.com/trading/products/#sortField=oi&sortAsc=false&venues=3&page=1&cleared=1&group=1', headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()

This works for me.

By the way, in your code you are missing the () after .read in the urlopen line, but I think that it’s a typo.

TIP: since this is exercise, choose a different, non restrictive site. Maybe they are blocking urllib for some reason…

Leave a Comment