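When working with geographic location data, the spatial relationship between points is a central concern. A clustering algorithm can show these relationships directly in a plot:
1. Define a function for the distance between two geographic locations: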
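from math import radians, sin, cos, asin, sqrt

def get_distance(array_1, array_2):
    # Each argument is a [longitude, latitude] pair in degrees
    lon_a = array_1[0]
    lat_a = array_1[1]
    lon_b = array_2[0]
    lat_b = array_2[1]
    radlat1 = radians(lat_a)
    radlat2 = radians(lat_b)
    a = radlat1 - radlat2                # latitude difference in radians
    b = radians(lon_a) - radians(lon_b)  # longitude difference in radians
    # Haversine formula: central angle between the two points
    s = 2 * asin(sqrt(pow(sin(a/2),2) + cos(radlat1) * cos(radlat2)*pow(sin(b/2),2)))
    earth_radius = 6378137  # Earth's equatorial radius in meters
    s = s * earth_radius    # great-circle distance in meters
    # print(s)  # debug output; very noisy once DBSCAN calls this for every pair
    return s

array_1 and array_2 are sub-arrays of the full array, each holding [longitude, latitude].
The function returns the distance, in meters, between any two points of the full array.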
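As a quick sanity check of the distance formula, the following self-contained sketch re-implements the same haversine calculation under a hypothetical name, haversine_m, and evaluates it on hand-picked points (sample coordinates chosen for illustration, not taken from the cell234 table). One degree of longitude along the equator should come out to roughly 111.3 km with the radius used here, and a point compared with itself should give zero.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6378137  # same equatorial radius as in get_distance, in meters

def haversine_m(p, q):
    """Great-circle distance in meters between two [longitude, latitude] pairs in degrees."""
    lon_a, lat_a = p
    lon_b, lat_b = q
    d_lat = radians(lat_a) - radians(lat_b)
    d_lon = radians(lon_a) - radians(lon_b)
    h = sin(d_lat / 2) ** 2 + cos(radians(lat_a)) * cos(radians(lat_b)) * sin(d_lon / 2) ** 2
    return 2 * asin(sqrt(h)) * EARTH_RADIUS_M

# Illustrative points: one degree of longitude apart on the equator
one_degree = haversine_m([0.0, 0.0], [1.0, 0.0])
# The same point twice should give a distance of zero
same_point = haversine_m([116.4, 39.9], [116.4, 39.9])
print(round(one_degree), same_point)
```

Because the result is in meters, any eps value passed to DBSCAN together with this metric is also interpreted in meters.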
2. Connect to the database and extract the longitude and latitude of each site:
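import pymssql

conn = pymssql.connect(host='localhost', user='sa', password='123456', database='net',charset='utf8')
cursor=conn.cursor()
# Column names: 平均经度 = average longitude, 平均纬度 = average latitude, 4G基站名称 = 4G base station name
# Earlier query that also selected the station names; its result was never fetched, so it is disabled
# cursor.execute('SELECT "2G基站名称","3G基站名称","4G基站名称","平均经度","平均纬度" FROM cell234')
# Select the average longitude and latitude of rows that have a 4G base station name
cursor.execute('SELECT "平均经度","平均纬度" FROM cell234 where "4G基站名称" is not null')
result = cursor.fetchall()  # list of (longitude, latitude) tuples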
3. Build new array structures from the query result: lonlatchang is used to plot the points, and lonlat is passed as the input data to the DBSCAN clustering algorithm:
import numpy

lonlat=[]
for x in result:
    lonlattemp=[]
    lonlattemp.append(x[0])  # longitude
    lonlattemp.append(x[1])  # latitude
    lonlat.append(lonlattemp)
lonlatchang=numpy.array(lonlat)
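The conversion loop assumes that every fetched row holds two usable numbers. As a hedged sketch, the helper below builds the same list-of-pairs structure while skipping rows with a missing coordinate and converting values such as Decimal to float. The name rows_to_lonlat is hypothetical and the sample rows are made up for illustration; whether this step is needed depends on the column types in cell234, which are not stated here.

```python
from decimal import Decimal

def rows_to_lonlat(rows):
    """Convert (longitude, latitude) rows into a list of [float, float] pairs."""
    pairs = []
    for lon, lat in rows:
        if lon is None or lat is None:
            continue  # skip rows with a missing coordinate
        pairs.append([float(lon), float(lat)])
    return pairs

# Made-up rows standing in for the output of cursor.fetchall()
sample_rows = [(Decimal("116.3970"), Decimal("39.9080")), (None, 39.9), (121.4737, 31.2304)]
lonlat_clean = rows_to_lonlat(sample_rows)
print(lonlat_clean)
```

The resulting list can be passed to numpy.array in the same way as lonlat.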
4. Plug in the parameters and plot the result:
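from sklearn.cluster import DBSCAN
import matplotlib.pyplot as plt

# eps is in meters because get_distance returns meters
dbscan = DBSCAN(eps=1000, min_samples=2,algorithm='ball_tree', metric=get_distance)
# fit_predict fits the model and returns one cluster label per point (-1 marks noise),
# so a separate dbscan.fit(lonlat) call beforehand is not needed
y_pred = dbscan.fit_predict(lonlat)
# print(dbscan.labels_)
# print('yanse',y_pred)
# Plot longitude against latitude, colored by cluster label
plt.scatter(lonlatchang[:,0],lonlatchang[:,1], marker='*',c=y_pred)
plt.show()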
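To make the two DBSCAN parameters concrete: with eps=1000 and min_samples=2, a point counts as a core point when at least two points, itself included, lie within 1000 m of it. The sketch below illustrates only that neighbor-counting rule on three made-up points. It is not scikit-learn's implementation, and the names dist_m and is_core are hypothetical.

```python
from math import radians, sin, cos, asin, sqrt

def dist_m(p, q, earth_radius=6378137):
    """Haversine distance in meters between [longitude, latitude] pairs in degrees."""
    d_lat = radians(p[1]) - radians(q[1])
    d_lon = radians(p[0]) - radians(q[0])
    h = sin(d_lat / 2) ** 2 + cos(radians(p[1])) * cos(radians(q[1])) * sin(d_lon / 2) ** 2
    return 2 * asin(sqrt(h)) * earth_radius

def is_core(points, eps, min_samples):
    """For each point, report whether at least min_samples points (itself included) lie within eps."""
    flags = []
    for p in points:
        neighbors = sum(1 for q in points if dist_m(p, q) <= eps)
        flags.append(neighbors >= min_samples)
    return flags

# Made-up points: two a few dozen meters apart, and one far away from both
points = [[116.3970, 39.9080], [116.3975, 39.9083], [121.4737, 31.2304]]
core_flags = is_core(points, eps=1000, min_samples=2)
print(core_flags)
```

In the clustering result, points that are neither core points nor within eps of a core point receive the label -1 (noise) in y_pred.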