A Stupid Mistake I Made When Sorting a pandas DataFrame

One day, when I tried to sort a DataFrame by a column, an amazing mistake happened!

Let me reproduce this stupid thing here. First, take an example DataFrame:

   a  b  c
0  9  4  6
1  2  7  5
2  5 -3  8
3  1  2  3
Then I sort it by column a and print it:

import pandas as pd

frame = pd.DataFrame({"a":[9,2,5,1],"b":[4,7,-3,2],"c":[6,5,8,3]})
frame.sort_values('a')
print(frame)

What do you think the result will be? What I expected to get was this:

   a  b  c
3  1  2  3
1  2  7  5
2  5 -3  8
0  9  4  6

However, what I actually got was:

   a  b  c
0  9  4  6
1  2  7  5
2  5 -3  8
3  1  2  3

I was really confused, so I tried every argument of the function

DataFrame.sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')

I found that the result was as expected only when inplace was set to True. But the usage examples for this function that I had found on Google did not mention this parameter.

Therefore, I looked up what inplace does, and I found that the inplace parameter is a general pandas convention and not specific to sort_values. You can see it in several other methods, such as DataFrame.fillna and DataFrame.replace. With the default inplace=False, the method leaves the existing DataFrame alone and returns a new one. When inplace is set to True, it modifies the existing DataFrame and returns None, so you need not (and should not) assign the result to a new variable.
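To convince myself, here is a small sketch of the two behaviours side by side, using the same example data. The variable names result and returned are just my own choices for illustration.

```python
import pandas as pd

frame = pd.DataFrame({"a": [9, 2, 5, 1], "b": [4, 7, -3, 2], "c": [6, 5, 8, 3]})

# Default (inplace=False): a new sorted DataFrame comes back, frame is untouched.
result = frame.sort_values("a")
original_order = frame["a"].tolist()
print(original_order)        # column a of frame, still in the original order
print(result["a"].tolist())  # column a of the returned copy, sorted

# inplace=True: frame itself is reordered and the call returns None.
returned = frame.sort_values("a", inplace=True)
print(returned)
print(frame["a"].tolist())
```

So the sorted data always ends up in exactly one place: either in the return value or in the original DataFrame, depending on inplace.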

Ohhhh… Then I found out where the mistake really was. In my previous code, frame was never modified: without inplace=True, sort_values returns a sorted copy, and I threw that copy away and printed the original. So I modified the code as follows:

import pandas as pd

frame = pd.DataFrame({"a":[9,2,5,1],"b":[4,7,-3,2],"c":[6,5,8,3]})
# inplace=True sorts frame itself; the call returns None, so do not assign its result
frame.sort_values('a', inplace=True)
print(frame)

The problem is solved! How stupid I was!
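For what it's worth, I believe the same result can be had without inplace at all, by keeping the DataFrame that sort_values returns. A minimal sketch, where the name sorted_frame is my own choice:

```python
import pandas as pd

frame = pd.DataFrame({"a": [9, 2, 5, 1], "b": [4, 7, -3, 2], "c": [6, 5, 8, 3]})

# Keep the sorted copy that sort_values returns; frame stays in its original order.
sorted_frame = frame.sort_values("a")
print(sorted_frame)
```

Here the original frame is left as it was, which is handy if the unsorted order is still needed later.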
