Only 1 vertical line in histogram

visualization
python2
histogram

#1

When I plot a histogram, only one vertical bar is displayed and no distribution is visible.
Is this caused by a few outliers that are far larger than the rest of the values?

Is there a solution other than dropping those rows? I don’t want to log-transform the variable either.

Thanks


#2

We can make neither head nor tail of your question without seeing the data and the code (we don’t know what you’re using to make the plot). Please post a sample of the data that you’re attempting to plot, and the code if applicable.


#3

Data:
Distance has been calculated from the longitude and latitude coordinates for every row.
I can describe the data:
Max distance (km): 1240
Average distance (km): 3

Code:
import matplotlib.pyplot as plt

# train is the DataFrame that holds the distance_km column
plt.figure(figsize=(12, 3))
plt.title('Frequency dist of distance covered in km')
plt.xlabel('Distance covered in km')
plt.ylabel('Counts')
plt.hist(train.distance_km, bins=80)
plt.show()

Is it because of a few extremely large outliers, or is there another reason?


#4

Hi, check the first and third quartiles (Q1 and Q3). If they are close to the mean, a single tall bar is expected, because almost all of the data lies in the same narrow range.
If that is not the case, try increasing the number of bins and see whether the distribution becomes visible.
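
The quartile check above can be sketched with NumPy alone. The data here is synthetic (an assumption, since the real train.distance_km was not posted): about 10,000 short trips averaging 3 km plus three extreme values up to 1240 km. With 80 equal-width bins spanning 0 to 1240 km, each bin is roughly 15.5 km wide, so nearly every row falls into the first bin.

```python
from __future__ import print_function
import numpy as np

# Synthetic stand-in for train.distance_km (assumption: the real data is not shown).
rng = np.random.RandomState(0)
distance_km = np.concatenate([rng.exponential(scale=3.0, size=10000),
                              [900.0, 1100.0, 1240.0]])

# Quartiles next to the maximum: a small Q3 and a huge max point to a few outliers.
q1, median, q3 = np.percentile(distance_km, [25, 50, 75])
print("Q1={:.2f} median={:.2f} Q3={:.2f} max={:.2f}".format(
    q1, median, q3, distance_km.max()))

# Same binning that plt.hist(..., bins=80) would use: 80 equal-width bins over [min, max].
counts, edges = np.histogram(distance_km, bins=80)
bin_width = edges[1] - edges[0]
share_first_bin = counts[0] / float(counts.sum())
print("bin width: {:.1f} km, share of rows in first bin: {:.3f}".format(
    bin_width, share_first_bin))
```

If Q3 is smaller than the bin width, as it is for this synthetic data, the single bar is a binning effect rather than a real feature of the distribution.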


#5

@ASHISH_17,

I suspect it’s a right-skewed distribution. A log transformation might give you the histogram you are looking for. Try this code:
import numpy as np
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 3))
plt.title('Frequency dist of distance covered in km (log scale)')
plt.xlabel('log(1 + distance covered in km)')
plt.ylabel('Counts')
# log1p(x) computes log(1 + x), so zero distances are handled safely
plt.hist(np.log1p(train.distance_km), bins=10)
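
If transforming the variable is not acceptable, two other options may be worth trying. Neither drops rows from the data or changes the stored values. This is only a sketch on synthetic data (an assumption standing in for train.distance_km), and the 99th-percentile cutoff and the 40 log-spaced bins are arbitrary illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for train.distance_km (assumption: the real data is not shown).
rng = np.random.RandomState(0)
distance_km = np.concatenate([rng.exponential(scale=3.0, size=10000),
                              [900.0, 1100.0, 1240.0]])

fig, (ax_zoom, ax_log) = plt.subplots(1, 2, figsize=(12, 3))

# Option 1: bin only the bulk of the data. `range` limits the binned interval;
# values above the 99th percentile stay in the data but are not drawn.
p99 = np.percentile(distance_km, 99)
zoom_counts, _, _ = ax_zoom.hist(distance_km, bins=80, range=(0, p99))
ax_zoom.set_title('Distance up to the 99th percentile')
ax_zoom.set_xlabel('Distance covered in km')
ax_zoom.set_ylabel('Counts')

# Option 2: log-spaced bin edges on a log-scaled x axis. The values stay in km;
# only the axis is logarithmic. Zero distances cannot appear on a log axis,
# so they are left out of this panel.
positive = distance_km[distance_km > 0]
log_edges = np.logspace(np.log10(positive.min()), np.log10(positive.max()), 41)
log_counts, _, _ = ax_log.hist(positive, bins=log_edges)
ax_log.set_xscale('log')
ax_log.set_title('Log-spaced bins, all positive distances')
ax_log.set_xlabel('Distance covered in km')
ax_log.set_ylabel('Counts')

fig.tight_layout()
```

Outside a notebook, call plt.show() at the end to display the figure.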

Hope this helps.

Regards,
Aayush Agrawal