Doubt in Conditional subsetting in Pandas

pandas
data_science
python

#1

Code:

df.loc[(df['SMS'] <19.44) | (df['CALL'] <16.21) | (df['INTERNET'] <194.14) , 'REGION']=1
df.loc[(df['SMS'] >=19.44) | (df['CALL'] >=16.21) | (df['INTERNET'] >=194.14) , 'REGION']=2
df.loc[(df['SMS'] <=185.39) | (df['CALL'] <=179.32) | (df['INTERNET'] <=1617.15) , 'REGION']=2
df.loc[(df['SMS'] >185.39) | (df['CALL'] >179.32) | (df['INTERNET'] >1617.15) , 'REGION']=3
df.head()

output:
ID        SMS           CALL          INTERNET      POI       REGION
1.0       2.664773      0.733324      62.899977     0         2
1.0       1.925665      0.492015      54.051207     0         2
1.0       1.479944      0.220799      45.901909     0         2

Here the values of REGION should be 1, not 2.

But when I run only the line below, it gives me the value 1:

df.loc[(df['SMS'] <19.44) | (df['CALL'] <16.21) | (df['INTERNET'] <194.14) , 'REGION']=1

df.head()

#2

Hi,

The REGION values are 2 because of the 3rd line of code (criterion). For the rows shown, the condition on each line evaluates as follows:

df.loc[(df['SMS'] <19.44) | (df['CALL'] <16.21) | (df['INTERNET'] <194.14) , 'REGION']=1 => TRUE
df.loc[(df['SMS'] >=19.44) | (df['CALL'] >=16.21) | (df['INTERNET'] >=194.14) , 'REGION']=2 => FALSE
df.loc[(df['SMS'] <=185.39) | (df['CALL'] <=179.32) | (df['INTERNET'] <=1617.15) , 'REGION']=2 => TRUE
df.loc[(df['SMS'] >185.39) | (df['CALL'] >179.32) | (df['INTERNET'] >1617.15) , 'REGION']=3 => FALSE

You can combine the 2nd and 3rd lines of code into a single range condition to match your objective, for example:

((df['SMS'] >=19.44) & (df['SMS'] <=185.39))
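To make the suggestion above concrete, here is a minimal sketch of one way to write non-overlapping conditions. It assumes the intent is: region 1 if any measure is below its lower threshold, region 2 only if every measure lies inside its band, and region 3 if any measure is above its upper threshold. That reading of the objective is a guess, and the last two rows of the sample frame are made up for illustration.

```python
import pandas as pd

# First three rows are from the question; the last two are hypothetical.
df = pd.DataFrame({
    "SMS": [2.664773, 1.925665, 1.479944, 50.0, 200.0],
    "CALL": [0.733324, 0.492015, 0.220799, 40.0, 190.0],
    "INTERNET": [62.899977, 54.051207, 45.901909, 500.0, 2000.0],
})

# Any measure below its lower threshold (same condition as the 1st line).
low = (df["SMS"] < 19.44) | (df["CALL"] < 16.21) | (df["INTERNET"] < 194.14)

# Every measure inside its band; Series.between is inclusive on both ends,
# so this is (x >= lower) & (x <= upper) for each column.
mid = (
    df["SMS"].between(19.44, 185.39)
    & df["CALL"].between(16.21, 179.32)
    & df["INTERNET"].between(194.14, 1617.15)
)

# Any measure above its upper threshold (same condition as the 4th line).
high = (df["SMS"] > 185.39) | (df["CALL"] > 179.32) | (df["INTERNET"] > 1617.15)

df.loc[low, "REGION"] = 1
df.loc[mid, "REGION"] = 2   # cannot overlap with low or high
df.loc[high, "REGION"] = 3  # assumption: wins if a row is both low and high

print(df)
```

Because `mid` uses `&`, it can no longer overwrite rows that already matched `low`. A row that is both `low` and `high` still gets whichever assignment runs last, so the order of those two lines remains a design choice.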


#3

Any reason as to why the previous code was not working?


#4

As I mentioned earlier, the criteria on both the 1st and 3rd lines of code return TRUE. So when the 1st line is executed, REGION is assigned 1, and when the 3rd line is executed, REGION is reassigned to 2.

Why do both lines return TRUE?
Let's take this row of data:
1.0 2.664773 0.733324 62.899977 0 2

Here, the SMS value is 2.66, which is < 19.44 (1st line) and also <= 185.39 (3rd line). Since the conditions are joined with | (OR), one matching column is enough for the whole condition to be TRUE.
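A quick way to check this yourself is to evaluate the four conditions on that one row without assigning anything. This is only a sketch; the dictionary labels are made up for illustration.

```python
import pandas as pd

# The single row discussed above.
row = pd.DataFrame({"SMS": [2.664773], "CALL": [0.733324], "INTERNET": [62.899977]})

# The four conditions from the original code, in execution order.
masks = {
    "line 1 (REGION=1)": (row["SMS"] < 19.44) | (row["CALL"] < 16.21) | (row["INTERNET"] < 194.14),
    "line 2 (REGION=2)": (row["SMS"] >= 19.44) | (row["CALL"] >= 16.21) | (row["INTERNET"] >= 194.14),
    "line 3 (REGION=2)": (row["SMS"] <= 185.39) | (row["CALL"] <= 179.32) | (row["INTERNET"] <= 1617.15),
    "line 4 (REGION=3)": (row["SMS"] > 185.39) | (row["CALL"] > 179.32) | (row["INTERNET"] > 1617.15),
}

# Reduce each one-element boolean Series to a plain bool.
matches = {label: bool(mask.iloc[0]) for label, mask in masks.items()}

for label, matched in matches.items():
    print(label, "=>", matched)
```

Lines 1 and 3 both match, and since line 3 runs later, its assignment of 2 is the one that remains.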


#5