python - How can I account for identical data points in a scatter plot?
I'm working with some data that has several identical data points. I would like to visualize the data in a scatter plot, but scatter plotting doesn't do a good job of showing the duplicates.
If I change the alpha value, the identical data points become darker, which is nice, but not ideal.
Is there a way to map the color of a dot to how many times it occurs in the data set? What about size? How can I assign the size of a dot to how many times it occurs in the data set?
As was pointed out, whether this makes sense depends a bit on your dataset. If you have reasonably discrete points and exact matches make sense, you can do something like this:
    import numpy as np
    import matplotlib.pyplot as plt

    # Test x and y values; use your own data here
    test_x = [2, 3, 4, 1, 2, 4, 2]
    test_y = [1, 2, 1, 3, 1, 1, 1]

    # Generate a list of unique points
    points = list(set(zip(test_x, test_y)))
    # Generate a list of point counts, one per unique point
    count = [len([x for x, y in zip(test_x, test_y) if x == p[0] and y == p[1]])
             for p in points]

    # Now for the plotting:
    plot_x = [i[0] for i in points]
    plot_y = [i[1] for i in points]
    count = np.array(count)
    # 'spectral_r' was renamed 'nipy_spectral_r' in newer matplotlib releases
    plt.scatter(plot_x, plot_y, c=count, s=100 * count**0.5, cmap='nipy_spectral_r')
    plt.colorbar()
    plt.show()
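The counting step above rescans the whole data set once per unique point. As a possible alternative, here is a minimal sketch that tallies the points in a single pass with `collections.Counter`. The helper name `count_points` is made up for this illustration, and sorting the points is only there to make the output order reproducible.

```python
from collections import Counter


def count_points(xs, ys):
    """Return the unique (x, y) points and how often each one occurs."""
    tally = Counter(zip(xs, ys))   # one pass over the data
    points = sorted(tally)         # sorted only for a reproducible order
    plot_x = [p[0] for p in points]
    plot_y = [p[1] for p in points]
    counts = [tally[p] for p in points]
    return plot_x, plot_y, counts


test_x = [2, 3, 4, 1, 2, 4, 2]
test_y = [1, 2, 1, 3, 1, 1, 1]
plot_x, plot_y, counts = count_points(test_x, test_y)
print(plot_x, plot_y, counts)
```

The three returned lists can be passed to `plt.scatter` in place of `plot_x`, `plot_y` and `count`. Like the original, this only groups exact matches, so floating-point data may need rounding first.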
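Notice: You will need to adjust the scale factor (the value 100 in the s argument) according to the density of your points. I also used the square root of the count so that heavily duplicated points don't grow too large. Keep in mind that matplotlib treats s as the marker area in points squared, so with the square root the area grows with the square root of the count; drop the **0.5 if you want the area to be proportional to the counts.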
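Since `scatter` interprets `s` as the marker area in points squared, the exponent applied to the count decides what actually scales with it. The sketch below makes that choice explicit. The helper `marker_sizes` and its `scaling` names are assumptions for illustration, not part of matplotlib.

```python
import numpy as np


def marker_sizes(counts, base=100.0, scaling="area"):
    """Turn counts into values for the `s` argument of plt.scatter.

    scaling="area":     marker area is proportional to the count
    scaling="diameter": marker diameter is proportional to the count
    scaling="sqrt":     the damped scaling used in the answer (count**0.5)
    """
    exponents = {"area": 1.0, "diameter": 2.0, "sqrt": 0.5}
    counts = np.asarray(counts, dtype=float)
    return base * counts ** exponents[scaling]


print(marker_sizes([1, 4]))                      # area doubles when count doubles
print(marker_sizes([1, 4], scaling="diameter"))  # diameter doubles when count doubles
print(marker_sizes([1, 4], scaling="sqrt"))      # gentler growth for heavy duplicates
```

Which one reads best depends on the data: area scaling is the usual choice for honest visual comparison, while the square root keeps a few very frequent points from swamping the plot.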
Also note: If you have very dense points, it might be more appropriate to use a different kind of plot. Histograms, for example (I like hexbin for 2D data), are a decent alternative in these cases.
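For the dense case, a hexbin plot might look roughly like the sketch below. The data is randomly generated purely for illustration, and the wrapper `plot_hexbin` along with the `gridsize` value are assumptions you would tune for your own data.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_hexbin(x, y, gridsize=20):
    """Draw a hexagonal 2D histogram of (x, y); return the figure and the bins."""
    fig, ax = plt.subplots()
    # mincnt=1 leaves empty hexagons blank instead of coloring them as zero
    hb = ax.hexbin(x, y, gridsize=gridsize, mincnt=1, cmap="viridis")
    fig.colorbar(hb, ax=ax, label="points per bin")
    return fig, hb


# Made-up dense data: 10,000 points, rounded so that many coincide exactly
rng = np.random.default_rng(0)
x = np.round(rng.normal(size=10_000), 1)
y = np.round(rng.normal(size=10_000), 1)
fig, hb = plot_hexbin(x, y)
```

Call `plt.show()` afterwards to display the figure. Unlike the scatter approach, hexbin counts nearby points together as well as exact duplicates, which is usually what you want once the data gets dense.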