Use of pandas.crosstab() in Python to create stacked histogram?




I was learning about histograms when I came across the following code for a stacked histogram:

temp3 = pd.crosstab([df.Pclass, df.Sex], df.Survived.astype(bool))
temp3.plot(kind='bar', stacked=True, color=['red','blue'], grid=False)

where df is a pandas DataFrame and ‘Pclass’, ‘Survived’ and ‘Sex’ are three categorical columns in it.

Variable description:
Survived - could be 0 or 1
Pclass - passenger travelling class, could be 1, 2 or 3
Sex - male or female

I could not understand the use of pd.crosstab(). Is there any other way to create stacked histograms?

Please help



pd.crosstab() computes a cross-tabulation of two (or more) factors, that is, a table counting how often each combination of their values occurs.

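For example, suppose I have the following data in two arrays ‘a’ and ‘b’:

import numpy as np
import pandas as pd

a = np.array(['foo', 'foo', 'foo', 'foo', 'bar', 'bar',
              'bar', 'bar', 'foo', 'foo', 'foo'])
b = np.array(['one', 'one', 'one', 'two', 'one', 'one',
              'one', 'two', 'two', 'two', 'one'])

# Count how often each (a, b) pair of values occurs
temp1 = pd.crosstab(a, b, rownames=['a'], colnames=['b'])

Here the arrays are passed first, followed by the names to use for the row and column labels.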

temp1 then contains:

b    one  two
a
bar    3    1
foo    4    3

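So, in your question, Pclass and Sex are passed together as a list in the first argument, which makes every (Pclass, Sex) combination a row of the table. The second argument is Survived (converted to bool), whose values become the columns.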

Its cross-table will look like this (counts omitted; pandas sorts the labels, so False comes before True and female before male):

Survived        False  True
Pclass  Sex
1       female
1       male
2       female
2       male
3       female
3       male
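To see the table with actual numbers, here is a minimal sketch. The DataFrame below is a tiny made-up sample that I am assuming for illustration, not the real dataset from the question; only the column names are taken from it.

```python
import pandas as pd

# Hypothetical toy data (NOT the real dataset), same column names as in the question
df = pd.DataFrame({
    'Pclass':   [1, 1, 1, 2, 2, 3, 3, 3],
    'Sex':      ['male', 'female', 'female', 'male',
                 'female', 'male', 'male', 'female'],
    'Survived': [0, 1, 1, 0, 1, 0, 1, 0],
})

# Rows: every (Pclass, Sex) combination; columns: Survived as False/True
temp3 = pd.crosstab([df.Pclass, df.Sex], df.Survived.astype(bool))
print(temp3)
```

Each cell is simply the number of rows of df that have that Pclass, Sex and Survived value.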

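and thus the stacked chart will be plotted with each [Pclass, Sex] combination on the horizontal axis, and the frequency in each bar split into two colours according to whether Survived is False or True. Strictly speaking, this is a stacked bar chart of counts rather than a histogram.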
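As for another way to do it: one option (a sketch, not the only approach) is to build the same table of counts with groupby(), size() and unstack() instead of pd.crosstab(). The toy data is again made up for illustration.

```python
import pandas as pd

# Hypothetical toy data (NOT the real dataset), same column names as in the question
df = pd.DataFrame({
    'Pclass':   [1, 1, 1, 2, 2, 3, 3, 3],
    'Sex':      ['male', 'female', 'female', 'male',
                 'female', 'male', 'male', 'female'],
    'Survived': [0, 1, 1, 0, 1, 0, 1, 0],
})

# Count rows per (Pclass, Sex, Survived), then move Survived into the columns;
# combinations that never occur are filled with 0
counts = (
    df.groupby(['Pclass', 'Sex', 'Survived'])
      .size()
      .unstack('Survived', fill_value=0)
)
print(counts)
```

Calling counts.plot(kind='bar', stacked=True) on this table (matplotlib needs to be installed) should give the same kind of stacked bars as the crosstab version. Here the columns are labelled 0 and 1 because Survived was not converted to bool.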
Hope this helps