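# -*- coding: utf-8 -*-
import re

# The third group of the pattern expects an HTML numeric entity before </dt>;
# &#160; (non-breaking space) is assumed here, since the page rendering shows only a space.
htmlString = '</dd><dt> Fine, thank you.&#160;</dt><dd> Molt bé, gràcies. (<i>mohl behh, GRAH-syuhs</i>)'
SearchStr = '(\<\/dd\>\<dt\>)+ ([\w+\,\.\s]+)([\&\#\d\;]+)(\<\/dt\>\<dd\>)+ ([\w\,\s\w\s\w\?\!\.]+) (\(\<i\>)([\w\s\,\-]+)(\<\/i\>\))'

# Python 2: decode both byte strings to unicode so that \w can match é and à under re.U
Result = re.search(SearchStr.decode('utf-8'), htmlString.decode('utf-8'), re.I | re.U)
print Result.groups()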
It works that way. The text contains non-ASCII characters (é, à), so the match usually fails on plain byte strings. In Python 2 you've got to decode both the pattern and the input into Unicode and use the re.U (Unicode) flag.
I’m a beginner too and I faced that issue a couple of times myself.
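For anyone on Python 3, here is a rough sketch of the same idea. There, str is already Unicode and \w matches accented letters by default, so no decode step is needed. The names `html_string`, `PATTERN`, `result` and `ascii_result` are mine, the pattern is the one above with the unneeded backslashes dropped, and the `&#160;` entity in the sample string is an assumption (the pattern's third group expects a numeric entity at that spot). The re.ASCII run is only there to show the kind of failure you get when \w is limited to ASCII.

```python
import re

# Sample input; the &#160; entity is assumed, see the note above.
html_string = (
    "</dd><dt> Fine, thank you.&#160;</dt><dd> Molt bé, gràcies. "
    "(<i>mohl behh, GRAH-syuhs</i>)"
)

# Same eight groups as the original pattern, written as a raw string.
PATTERN = (
    r"(</dd><dt>)+ ([\w+,.\s]+)([&#\d;]+)(</dt><dd>)+ "
    r"([\w,\s?!.]+) (\(<i>)([\w\s,-]+)(</i>\))"
)

# Python 3 str patterns are Unicode-aware by default, so é and à match \w.
result = re.search(PATTERN, html_string, re.I)
groups = result.groups() if result else ()
print(groups)

# With re.ASCII, \w only covers [a-zA-Z0-9_], so the Catalan phrase breaks the match.
ascii_result = re.search(PATTERN, html_string, re.I | re.ASCII)
print(ascii_result)
```

If the second search returns None while the first one matches, that is the same symptom as the undecoded Python 2 version: the accented letters are not treated as word characters.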