Solving TypeError: unhashable type: 'numpy.ndarray' in Python


What does unhashable type mean in Python?

An unhashable type is one whose objects do not provide a fixed hash value. Such objects cannot be used as keys in a dictionary or as elements of a set.

Dictionaries and sets are hash-based data structures: they rely on a hash function to store and retrieve elements efficiently.

A hashable object in Python has a hash value that never changes during its lifetime and can be compared with other objects. Hashable objects can be used as dictionary keys and set elements. Examples include integers, strings and tuples (as long as every element of the tuple is itself hashable).
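
As a quick illustration, the built-in hash() function returns an integer for hashable objects and raises a TypeError for unhashable ones. The sketch below is not part of the original walkthrough, and the helper name is_hashable is made up for this example.

```python
def is_hashable(obj):
    """Hypothetical helper: report whether hash() accepts obj."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# Immutable built-ins are hashable; mutable containers are not.
results = {
    'int': is_hashable(42),
    'str': is_hashable('apple'),
    'tuple': is_hashable((1, 2)),
    'list': is_hashable([1, 2]),
    'dict': is_hashable({'a': 1}),
    'set': is_hashable({1, 2}),
}
print(results)
```

The first three entries come out True and the last three False, which is why lists, dictionaries and sets cannot serve as dictionary keys either.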

Why Are numpy.ndarray Objects Unhashable?

In simple terms, numpy.ndarray objects are mutable, which means their contents can change after creation. A hash computed from the contents would stop being valid after such a change, so NumPy does not define a hash for arrays. That makes them unsuitable as dictionary keys or set elements.
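
A minimal sketch of both points, assuming only that NumPy is installed: the array is changed in place, and calling hash() on it raises the same TypeError discussed in this article.

```python
import numpy as np

arr = np.array([1, 2, 3])
arr[0] = 99  # the array is modified in place, so it is mutable

try:
    hash(arr)
    error_message = None
except TypeError as exc:
    # NumPy arrays do not support hashing
    error_message = str(exc)

print(arr)
print(error_message)
```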

Examples of the 'Unhashable Type numpy.ndarray' Error in Pandas

Example 1: Finding unique values in a DataFrame column

In the code below we try to find the unique values in a DataFrame column using the unique() function of the pandas library. The column mixes NumPy arrays with strings and integers, so the call raises an error.
Python code

import pandas as pd
import numpy as np

# Create the data for the DataFrame
data = {
    'Numbers': [1, 2, 3, 4, 5],
    'Strings': ['apple', 'banana', 'cherry', 'date', 'elderberry'],
    'Floats': [1.1, 2.2, 3.3, 4.4, 5.5],
    'Booleans': [True, False, True, False, True],
    'Arrays': [np.array([1, 2]), 'hello', 'test', 1, np.array([9, 10])]
}

# Create the DataFrame
df = pd.DataFrame(data)
df.head()

Output

   Numbers     Strings  Floats  Booleans   Arrays
0        1       apple     1.1      True   [1, 2]
1        2      banana     2.2     False    hello
2        3      cherry     3.3      True     test
3        4        date     4.4     False        1
4        5  elderberry     5.5      True  [9, 10]

Requesting the unique values of the 'Arrays' column of the DataFrame (df) raises the error:

df['Arrays'].unique()

We get TypeError: unhashable type: 'numpy.ndarray' because pandas implements unique() with a hash table, so every value in the column must be hashable. NumPy arrays are mutable and therefore unhashable.

Output

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[7], line 1
----> 1 df['Arrays'].unique()

Python\Python310\site-packages\pandas\core\algorithms.py:428, in unique_with_mask(values, mask)
    426 table = hashtable(len(values))
    427 if mask is None:
--> 428     uniques = table.unique(values)
    429     uniques = _reconstruct_data(uniques, original.dtype, original)
    430     return uniques

File pandas\_libs\hashtable_class_helper.pxi:7247, in pandas._libs.hashtable.PyObjectHashTable.unique()

File pandas\_libs\hashtable_class_helper.pxi:7194, in pandas._libs.hashtable.PyObjectHashTable._unique()

TypeError: unhashable type: 'numpy.ndarray'

Solution: To solve the error, convert the values in the 'Arrays' column to strings and then apply unique() to that column.

This works because strings are immutable, which makes them hashable. Keep in mind that after the conversion the integer 1 and the string '1' are treated as the same value.

df['Arrays'].apply(lambda x: str(x)).unique()

Output

array(['[1 2]', 'hello', 'test', '1', '[ 9 10]'], dtype=object)
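
If the original types need to survive, one alternative is to convert only the array values to tuples and leave everything else untouched. This is a sketch rather than the article's method: the helper to_hashable is a hypothetical name, it assumes the arrays are one-dimensional, and the sample column repeats [1, 2] so that there is a duplicate to remove.

```python
import numpy as np
import pandas as pd


def to_hashable(value):
    """Hypothetical helper: turn a 1-D NumPy array into a tuple, keep other values."""
    if isinstance(value, np.ndarray):
        return tuple(value.tolist())
    return value


df = pd.DataFrame({
    'Arrays': [np.array([1, 2]), 'hello', 'test', 1, np.array([9, 10]), np.array([1, 2])]
})

# Tuples are hashable, so unique() can build its hash table
unique_values = df['Arrays'].apply(to_hashable).unique()
print(unique_values)
```

Unlike the string conversion, the integer 1 stays an integer here and the arrays come back as tuples of plain Python numbers.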

Example 2: Trying to create a dictionary with a NumPy array as a key.

In the code below we use a NumPy array as a dictionary key and assign it the value 'This line will give error'. The assignment raises an error because dictionary keys need to be hashable and NumPy arrays are not.

Python Code

import numpy as np
my_dict = {}
key = np.array(['line_1', 'line_2', 'inline_3'])
my_dict[key]='This line will give error'

output

--------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[23], line 4
      2 my_dict = {}
      3 key = np.array(['line_1', 'line_2', 'inline_3'])
----> 4 my_dict[key]='This line will give error'

TypeError: unhashable type: 'numpy.ndarray'

To solve the error, convert the NumPy array to a tuple. Tuples are immutable and, as long as their elements are hashable, hashable themselves.

Python Code

import numpy as np
my_dict = {}

# Convert the NumPy array to tuple
key = tuple(np.array(['line_1', 'line_2', 'inline_3']))

# Using the tuple as the key in the dictionary
my_dict[key] = 'This line will not give error'

print(my_dict)

output

{('line_1', 'line_2', 'inline_3'): 'This line will not give error'}
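
A related sketch, assuming one-dimensional arrays: wrapping the conversion in a small helper lets you look a value up later with a different array that has the same contents. The name array_key is hypothetical. It uses .tolist() first so that the tuple holds plain Python values rather than NumPy scalars.

```python
import numpy as np


def array_key(arr):
    """Hypothetical helper: build a dictionary key from a 1-D array."""
    return tuple(np.asarray(arr).tolist())


cache = {}
cache[array_key(np.array([1, 2, 3]))] = 'first'

# A separate array with equal contents maps to the same key
result = cache[array_key(np.array([1, 2, 3]))]
print(result)
```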

Example 3: Trying to add a NumPy array directly to a set.

The code below tries to add a NumPy array to a set as an element and raises an error. It assumes that numpy has already been imported as np, as in Example 2.

Python Code

my_set = set()
my_array = np.array(['man', 'woman', 'animal'])
my_set.add(my_array)

output

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[28], line 3
      1 my_set = set()
      2 my_array = np.array(['man', 'woman', 'animal'])
----> 3 my_set.add(my_array)

TypeError: unhashable type: 'numpy.ndarray'

Solution: Sets in Python require their elements to be hashable, and NumPy arrays are not. Convert the array to an immutable, hashable object first. One option is frozenset(). Note that a frozenset discards the order of the elements and any duplicates, so use tuple() instead when those matter.

Python Code

import numpy as np
my_set = set()
my_array = np.array(['man', 'woman', 'animal'])

# Convert the array to a frozenset
my_set.add(frozenset(my_array))
print(my_set)

Output

{frozenset({'animal', 'woman', 'man'})}

The order in which the elements of the frozenset are printed may differ between runs.
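
The difference between the two conversions can be sketched as follows. The arrays are made up for the illustration: they hold the same distinct values but differ in order and in one duplicate.

```python
import numpy as np

a = np.array([1, 2, 2, 3])
b = np.array([3, 2, 1])

# frozenset ignores order and duplicates, so the two arrays look identical
same_as_frozenset = frozenset(a.tolist()) == frozenset(b.tolist())

# tuple keeps both, so the two arrays stay distinct
same_as_tuple = tuple(a.tolist()) == tuple(b.tolist())

print(same_as_frozenset, same_as_tuple)
```

Choose frozenset when only membership matters and tuple when the exact sequence matters.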

How to Fix the 'Unhashable Type numpy.ndarray' Error

Converting numpy.ndarray to Hashable Types (Tuples, Frozensets and Strings)

Convert the numpy.ndarray to a tuple before using it in a context that requires hashability. The code below converts every value in column 'B' of the DataFrame data to a tuple. It assumes that data is a DataFrame whose column 'B' holds NumPy arrays, like the one created in the next section.
Python code
data['B'] = data['B'].apply(tuple)
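
A self-contained sketch of that conversion follows. The sample DataFrame is made up for the illustration and repeats [1, 2] in column 'B' so that hash-based operations have something to collapse.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [np.array([1, 2]), np.array([3, 4]), np.array([1, 2])],
})

# Convert each array to a tuple so the column holds hashable values
data['B'] = data['B'].apply(tuple)

# Hash-based operations now work on the column
n_unique = data['B'].nunique()
deduplicated = data['B'].drop_duplicates()

print(n_unique)
print(deduplicated)
```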

Using .tolist() or .apply() on NumPy Arrays in pandas DataFrame and Series Columns

You can convert each numpy.ndarray in a column to a list using .tolist() or by applying list(). This gives plain Python objects that are easy to display or serialize. Note, however, that lists are mutable and therefore still unhashable. If the column has to be hashed, for example by unique(), wrap the result in tuple().

Python code

import pandas as pd
import numpy as np

data = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [np.array([1, 2]), np.array([3, 4]), np.array([5, 6])]
})

The first method calls .tolist() on each value in the column, which also converts the elements to plain Python numbers:

data['B'] = data['B'].apply(lambda x: x.tolist())
print(data)

Output

   A       B
0  1  [1, 2]
1  2  [3, 4]
2  3  [5, 6]

For the second method we create the DataFrame again:

data = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [np.array([1, 2]), np.array([3, 4]), np.array([5, 6])]
})

The code below converts each NumPy array to a list by passing list to apply(). The elements of these lists remain NumPy scalars:

data['B'] = data['B'].apply(list)
print(data)

Output

   A       B
0  1  [1, 2]
1  2  [3, 4]
2  3  [5, 6]
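
The sketch below checks the hashability of both forms for a single array. It is an illustration with made-up variable names, not part of the original walkthrough.

```python
import numpy as np

arr = np.array([1, 2])
as_list = arr.tolist()

# A list is mutable, so hash() rejects it just like the array
try:
    hash(as_list)
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# Wrapping the list in a tuple gives a hashable value
as_tuple = tuple(as_list)
tuple_is_hashable = isinstance(hash(as_tuple), int)

print(list_is_hashable, tuple_is_hashable)
```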

Causes of the 'Unhashable Type numpy.ndarray' Error

  • Using a numpy.ndarray as a dictionary key or as an element of a set.
  • Passing a numpy.ndarray to a function that accepts only hashable values.
  • Performing hash-based operations such as unique, union or intersection on DataFrame columns that contain numpy.ndarray values.

Best Practices to Avoid the 'Unhashable Type' Error

  • When an operation requires hashable values, convert NumPy arrays to tuples first. Lists do not help here because they are unhashable as well.
  • Make sure the columns you pass to pandas functions such as set_index() and groupby() hold hashable values like tuples or strings rather than arrays.
  • Prefer inherently hashable types such as tuples, strings and frozensets for keys. Lists, dictionaries and sets are not hashable.
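
For arrays with more than one dimension, tuple() alone is not enough, because the result would be a tuple of arrays. One possible approach, sketched here under the assumption that the array holds plain numbers or strings, is a small recursive helper. The name make_hashable is hypothetical.

```python
import numpy as np


def make_hashable(value):
    """Hypothetical helper: convert arrays and nested lists to nested tuples."""
    if isinstance(value, np.ndarray):
        value = value.tolist()
    if isinstance(value, list):
        return tuple(make_hashable(item) for item in value)
    return value


matrix = np.array([[1, 2], [3, 4]])
key = make_hashable(matrix)

# The nested tuple can be used as a dictionary key
lookup = {key: 'matrix'}
print(key)
```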

Conclusion

When you get TypeError: unhashable type: 'numpy.ndarray', convert the NumPy arrays to a hashable type such as a tuple, a string or a frozenset, depending on your needs. Most of the time this solves the error. Lists and sets will not work because they are unhashable too.

Remember to use hashable types in any index operations.

