You should use the & / | operators and be careful about operator precedence (== has lower precedence than bitwise AND and OR), so wrap each comparison in parentheses:

    df1 = sqlContext.createDataFrame(
        [(1, "a", 2.0), (2, "b", 3.0), (3, "c", 3.0)],
        ("x1", "x2", "x3"))

    df2 = sqlContext.createDataFrame(
        [(1, "f", -1.0), (2, "b", 0.0)],
        ("x1", "x2", "x3"))

    df = df1.join(df2, (df1.x1 == df2.x1) & (df1.x2 == df2.x2))
    df.show()

    ## +---+---+---+---+---+---+
    ## | x1| x2| x3| x1| x2| x3|
    ## +---+---+---+---+---+---+
    ## |  2|  b|3.0|  2|  b|0.0|
    ## +---+---+---+---+---+---+
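
To see why the parentheses matter, here is a small sketch in plain Python (no Spark needed) showing how the interpreter groups the expression with and without them. The names `a`, `b`, `c`, `d` and the integer values are placeholders chosen for illustration; they are not part of the example above.

```python
import ast

# Without parentheses, & binds tighter than ==, so Python reads
#   a == b & c == d
# as the chained comparison  a == (b & c) == d.
unparenthesized = ast.parse("a == b & c == d", mode="eval").body

# With parentheses, each comparison is evaluated first and the two
# results are then combined with &.
parenthesized = ast.parse("(a == b) & (c == d)", mode="eval").body

print(type(unparenthesized).__name__)  # Compare
print(type(parenthesized).__name__)    # BinOp

# The same effect with plain integers: 1 & 2 is 0, so the first
# expression becomes 1 == 0 == 2, which is False.
without_parens = 1 == 1 & 2 == 2
with_parens = (1 == 1) & (2 == 2)

print(without_parens)  # False
print(with_parens)     # True
```

With column expressions the unparenthesized form is not a valid join condition either, since the chained comparison has to turn an intermediate column expression into a Python boolean, so in practice you should get an error rather than a silently wrong join.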