urllib - Read For Learn

‘module’ has no attribute ‘urlencode’

urllib has been split up in Python 3. The urllib.urlencode() function is now urllib.parse.urlencode(), and the urllib.urlopen() function is now urllib.request.urlopen().
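A minimal sketch of the Python 3 location of urlencode. The query parameters are made up for illustration; urlopen is imported only to show where it now lives and is not called.

```python
from urllib.parse import urlencode
from urllib.request import urlopen  # Python 3 home of the old urllib.urlopen()

# Hypothetical query parameters, chosen only for illustration.
params = {"q": "python urllib", "page": 2}
query = urlencode(params)
print(query)  # q=python+urllib&page=2
```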

Making a POST call instead of GET using urllib2

This may have been answered before: Python URLLib / URLLib2 POST. Your server is likely performing a 302 redirect from http://myserver/post_service to http://myserver/post_service/. When the 302 redirect is performed, the request changes from POST to GET (see Issue 1401). Try changing url to http://myserver/post_service/.
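As a rough sketch using the Python 3 equivalents of urllib2 (an assumption, since the question is about urllib2), passing data to a Request makes it a POST, and the trailing slash on the URL avoids the redirect described above. The form field is hypothetical, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL from the answer above, with the trailing slash so no redirect is needed.
url = "http://myserver/post_service/"

# Hypothetical form field; the body must be bytes in Python 3.
payload = urlencode({"name": "value"}).encode("ascii")

# Supplying data switches the request method from GET to POST.
request = Request(url, data=payload)
print(request.get_method())  # POST

# To actually send it: urllib.request.urlopen(request)
```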

Download Returned Zip file from URL

Most people recommend using requests if it is available, and the requests documentation recommends streaming the response to a file in chunks when downloading and saving raw data from a URL. Since the question asks about downloading and saving the zip file, I haven’t gone into details regarding reading the zip file. See one of the many answers below for possibilities. If …
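Where requests is not available, the same idea can be sketched with the standard library alone. This swaps in urllib.request.urlopen and shutil.copyfileobj in place of requests; the function name download_file is hypothetical.

```python
import shutil
from urllib.request import urlopen


def download_file(url, dest_path):
    """Stream the body at `url` into `dest_path` in chunks.

    The response is copied piece by piece, so the whole zip file
    never has to fit in memory at once.
    """
    with urlopen(url) as response, open(dest_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return dest_path
```

The saved file can then be opened with the zipfile module if its contents are needed.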

can we use XPath with BeautifulSoup?

Nope, BeautifulSoup, by itself, does not support XPath expressions. An alternative library, lxml, does support XPath 1.0. It has a BeautifulSoup-compatible mode where it’ll try to parse broken HTML the way Soup does. However, the default lxml HTML parser does just as good a job of parsing broken HTML, and I believe it is faster. …

Python: urllib.error.HTTPError: HTTP Error 404: Not Found

Apparently the default number of questions displayed per page is 50, so the range defined in the loop runs past the number of pages available at 50 questions per page. The range should be adjusted to stay within the total number of pages with 50 questions each. This code will catch the …
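The original code is cut off here, so the following is only a sketch of one way to guard such a loop: stop at the first 404 instead of crashing. The function name fetch_pages, the url_template parameter, and the injectable opener are all assumptions for illustration.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch_pages(url_template, max_pages, opener=urlopen):
    """Fetch pages 1..max_pages, stopping early at the first 404.

    `url_template` must contain a `{page}` placeholder. `opener` defaults
    to urlopen and can be replaced, for example in tests.
    """
    pages = []
    for page in range(1, max_pages + 1):
        try:
            with opener(url_template.format(page=page)) as response:
                pages.append(response.read())
        except HTTPError as err:
            if err.code == 404:
                break  # ran past the last available page
            raise  # any other HTTP error is still a real failure
    return pages
```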

Python 3.5.1 urllib has no attribute request

You have to import the submodule explicitly with import urllib.request. The reason is that with packages like this, you sometimes need to explicitly import the piece you want. That way, the urllib package doesn’t have to load everything up just because you wanted one small part.
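A minimal illustration of the explicit submodule import:

```python
# Importing the submodule explicitly guarantees that urllib.request is loaded;
# a bare `import urllib` does not promise that.
import urllib.request

print(callable(urllib.request.urlopen))  # True
```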

urllib and “SSL: CERTIFICATE_VERIFY_FAILED” Error

If you just want to bypass verification, you can create a new SSLContext. By default newly created contexts use CERT_NONE. Be careful with this, as stated in section 17.3.7.2.1: When calling the SSLContext constructor directly, CERT_NONE is the default. Since it does not authenticate the other peer, it can be insecure, especially in client mode where most of time you …
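As a hedged sketch, here is one way to build a context that skips verification. It swaps in ssl.create_default_context() with the checks turned off explicitly, rather than relying on constructor defaults, because on recent Python versions a context created with ssl.PROTOCOL_TLS_CLIENT verifies certificates by default. This is only suitable for testing against hosts you trust.

```python
import ssl

# Start from the default (verifying) context and switch the checks off.
context = ssl.create_default_context()
context.check_hostname = False       # must be disabled before setting CERT_NONE
context.verify_mode = ssl.CERT_NONE  # do not authenticate the server certificate

# The context would then be passed along, for example:
# urllib.request.urlopen(url, context=context)
```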

UnicodeEncodeError: ‘charmap’ codec can’t encode characters

I fixed it by adding .encode("utf-8") to soup. That means that print(soup) becomes print(soup.encode("utf-8")).
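The effect can be sketched with a plain string standing in for the soup object (an assumption made to keep the example dependency-free). Encoding first means print receives bytes, so the console codec never has to encode the non-ASCII characters.

```python
# A string with characters that a 'charmap' console codec may not support.
text = "caf\u00e9 \u2603"

# Encoding to UTF-8 yields bytes, which print shows as an escaped literal.
encoded = text.encode("utf-8")
print(encoded)  # b'caf\xc3\xa9 \xe2\x98\x83'
```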

urllib2.HTTPError: HTTP Error 403: Forbidden

By adding a few more headers, I was able to get the data. Actually, it works with just one additional header.
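The header from the original answer is not shown here. As an assumption only, the sketch below sets a browser-like User-Agent, a common candidate when a server rejects the default Python client. It uses the Python 3 urllib.request in place of urllib2, and the URL is hypothetical.

```python
from urllib.request import Request

url = "http://example.com/data"  # hypothetical URL

# Assumption: the server rejects Python's default User-Agent.
headers = {"User-Agent": "Mozilla/5.0"}

request = Request(url, headers=headers)
# Request normalizes header names with str.capitalize(), hence "User-agent".
print(request.get_header("User-agent"))  # Mozilla/5.0

# To send it: urllib.request.urlopen(request)
```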

AttributeError: ‘module’ object has no attribute ‘urlretrieve’

As you’re using Python 3, urllib is no longer a single module. It has been split into several modules, and the equivalent of urlretrieve is urllib.request.urlretrieve. It behaves exactly the same way as it did in Python 2.x, so it’ll work just fine. Basically: urlretrieve saves the file to a temporary file and returns a tuple (filename, …
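A small sketch of urllib.request.urlretrieve. To keep it runnable without network access, it retrieves a local file through a file:// URL; the file names are made up, and an explicit target path is passed instead of relying on a temporary file.

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlretrieve

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical source file, served through a file:// URL.
    source = Path(tmp) / "source.txt"
    source.write_bytes(b"hello")

    target = os.path.join(tmp, "copy.txt")
    # With an explicit filename, the first element of the tuple is that filename.
    filename, headers = urlretrieve(source.as_uri(), target)
    copied = Path(filename).read_bytes()

print(copied)  # b'hello'
```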
