The simple answer, imo, is that if you trust your source to be well-formed, go with the lxml solution. Otherwise, BeautifulSoup all the way.
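As a rough illustration of that rule of thumb (a sketch only; the sample markup and variable names are made up, and it assumes both `lxml` and `beautifulsoup4` are installed), the strict parser in `lxml.etree` handles well-formed input directly but rejects broken markup, whereas BeautifulSoup will still build a tree from it:

```python
from lxml import etree
from bs4 import BeautifulSoup

# Hypothetical sample inputs, not taken from the question.
WELL_FORMED = "<ul><li>one</li><li>two</li></ul>"
MALFORMED = "<ul><li>one<li>two</ul>"  # <li> tags are never closed

# Trusted, well-formed source: lxml's strict parser reads it directly.
root = etree.fromstring(WELL_FORMED)
items_lxml = [li.text for li in root.iter("li")]

# The same strict parser refuses the malformed version.
try:
    etree.fromstring(MALFORMED)
    strict_ok = True
except etree.XMLSyntaxError:
    strict_ok = False

# BeautifulSoup does not raise on the broken markup; the exact tree
# shape depends on which underlying parser is chosen.
soup = BeautifulSoup(MALFORMED, "html.parser")
li_count = len(soup.find_all("li"))

print(items_lxml, strict_ok, li_count)
```

Note that `lxml.html` is more forgiving than `lxml.etree`, so this deliberately shows the strictest case.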
Edit:
This answer is three years old now; it’s worth noting, as Jonathan Vanasco does in the comments, that BeautifulSoup4 now supports using lxml as the internal parser, so you can use the advanced features and interface of BeautifulSoup without most of the performance hit, if you wish (although I still reach straight for lxml myself; perhaps it’s just force of habit :)).
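For what it's worth, selecting lxml as the parser is just the second argument to the `BeautifulSoup` constructor. A minimal sketch, assuming `beautifulsoup4` is installed (the sample HTML and the `make_soup` helper are hypothetical; the fallback to the standard-library parser is only there so the snippet still runs without lxml):

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical sample document.
HTML = "<html><body><h1>Title</h1><p><a href='/next'>next</a></p></body></html>"


def make_soup(markup):
    """Build a soup with lxml as the parser, falling back to html.parser."""
    try:
        return BeautifulSoup(markup, "lxml")
    except FeatureNotFound:
        # Raised by bs4 when the requested parser is not installed.
        return BeautifulSoup(markup, "html.parser")


soup = make_soup(HTML)

# The BeautifulSoup interface is the same whichever parser is underneath.
heading = soup.find("h1").get_text()
links = [a["href"] for a in soup.find_all("a")]

print(heading, links)
```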