User:Klaus/Nearest Geohash Calculator

From Geo Hashing
Jump to: navigation, search

Usage[edit]

This 2 scripts allow you to calculate all geohashes since 1990-01-01 and their distance to given coordinates. You can then sort them easily by distance.

$ ./download.sh 
$ ./near_geohashes.py 48.53 9.05 | grep normal | sort -n | head -n5
0.5322372070778396 km near normal hash, hash(1997-08-08-8182.20), fractions: 0.525606, 0.052020
0.8428963498045385 km near normal hash, hash(2000-07-01-10393.09), fractions: 0.530534, 0.057562
1.118616780287884 km near normal hash, hash(1993-01-10-3269.00), fractions: 0.520030, 0.047935
1.1472513155453212 km near normal hash, hash(2012-05-08-13035.85), fractions: 0.537845, 0.056814
1.1546716239289085 km near normal hash, hash(1991-10-16-3040.25), fractions: 0.538614, 0.055956

So actually the nearest hash to 48.53,9.06 ever since 1990-01-01 was on 2000-07-01 and only ~532 metres away!

$ ./download.sh 
$ ./near_geohashes.py 48.53 9.05 | grep global | sort -n | head -n5 
42.43702289083002 km near global hash, hash(2006-08-21-11333.76), coordinates: 48.506565, 9.430944
180.5498687319234 km near global hash, hash(1999-05-28-10701.28), coordinates: 49.296345, 7.614187
436.771324971548 km near global hash, hash(2000-08-24-11130.55), coordinates: 47.605377, 12.871693
530.3023722367725 km near global hash, hash(2004-05-13-10011.52), coordinates: 53.266972, 8.150993
557.1621456075098 km near global hash, hash(1999-06-19-10847.64), coordinates: 53.602194, 9.219057

And the nearest global hash to 48.53,9.06 since 1990-01-01 was on 2006-08-21 and ~42km away...

The date 1990-01-01 is arbitrarily chosen, because Yahoo gives me only DowJones opening values back until around 1985.

download.sh[edit]

This bash script downloads the list of DowJones values from Yahoo:

#!/bin/bash

year=$(date +%Y)
mon=$(date +%m)
dat=$(date +%d)

# download data from yahoo, then:
# remove first line ("Date,Open,High,Low,Close,Volume,Adj Close")
rm -f dowjones.csv
wget -o /dev/null -O - "https://ichart.yahoo.com/table.csv?s=%5EDJI&a=09&b=1&c=1928&d=$mon&e=$dat&f=$year&g=d&ignore=.csv"  | tail -n+2 | cut -d ',' -f 1-2 > dowjones.csv

near_geohashes.py[edit]

This bash script calculates all geohashes since 1990-01-01 from the downloaded data:

#!/usr/bin/python
# vim: tabstop=8 expandtab shiftwidth=4 softtabstop=4


from math import radians, cos, sin, asin, sqrt
from sys import argv, exit
from datetime import datetime, date, timedelta
from time import strptime
from bisect import bisect_left
from hashlib import md5


dj = {} # this will contain the date -> dowjones values!
lat,lon = 0,0
latPrefix,lonPrefix = 0,0

# https://stackoverflow.com/questions/4913349/haversine-formula-in-python-bearing-and-distance-between-two-gps-points/4913653#4913653
def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians 
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a)) 
    r = 6371 # Radius of earth in kilometers. Use 3956 for miles
    return c * r

def loadDowJones(inputfile):
    global dj
    with open(inputfile) as f:
        lines = [l.strip('\n') for l in f.readlines()]
        for l in lines:
            dat, dowjones = l.split(',')
            # convert the date string to a "date" object
            dat = datetime.strptime(dat, '%Y-%m-%d').date()
            # round the DowJones
            dowjones = round(float(dowjones), 2)
            dj[dat] = dowjones

def daterange(start_date, end_date):
    for n in range(int ((end_date - start_date).days)):
        yield start_date + timedelta(n)

def nearest(fractionLat, fractionLon):
    minimalDistance = float("inf") # infinity is larger than everything else!
    # calculate the minimum distance to all of the 9 adjacent graticules geohashes
    for i in range(-1,2):     # i is in -1 0 1
        for j in range(-1,2): # j is in -1 0 1
            minimalDistance = min(minimalDistance, haversine(lat, lon, latPrefix + fractionLat + i, lonPrefix + fractionLon + j))
    return minimalDistance

def calculateFractions():
    for d in (daterange(date(1990,1,1), date.today())):
        significantDJdate = d
        # if we are east of 30W and the date is after or equal to
        # 2008-05-27 when the 30W rule was introduced
        # -> use the previous day!
        if lon > -30 and d >= date(2008,5,27):
            significantDJdate -= timedelta(days=1)
        # go back, until we have a DJ value!
        while significantDJdate not in dj:
            significantDJdate -= timedelta(days=1)
        s = "%s-%.2f" % (d, dj[significantDJdate])
        md5Output = md5(s.encode('ascii')).digest()
        fractionA = int.from_bytes(md5Output[0:8],  byteorder='big', signed=False) / (256**8)
        fractionB = int.from_bytes(md5Output[8:16], byteorder='big', signed=False) / (256**8)
        distance = nearest(fractionA, fractionB)
        print("%s km near normal hash, hash(%s), fractions: %f, %f" % (distance, s, fractionA, fractionB))

        # now calculate the global hash
        # for the globalhash, we always use the previous day!
        significantDJdate = d - timedelta(days=1)
        # go back, until we have a DJ value!
        while significantDJdate not in dj:
            significantDJdate -= timedelta(days=1)
        s = "%s-%.2f" % (d, dj[significantDJdate])
        md5Output = md5(s.encode('ascii')).digest()
        globalLat = int.from_bytes(md5Output[0:8],  byteorder='big', signed=False) / (256**8) * 180 - 90
        globalLon = int.from_bytes(md5Output[8:16], byteorder='big', signed=False) / (256**8) * 360 - 180
        distance = haversine(lat, lon, globalLat, globalLon)
        print("%s km near global hash, hash(%s), coordinates: %f, %f" % (distance, s, globalLat, globalLon))
 
if __name__ == "__main__":
    if len(argv) != 3:
        print("usage: %s <lat> <lon>" % argv[0])
        exit(1)
    # print warning...
    #print("WARNING: this script will work only for east-of-W30 locations like Europe!")
    lat, lon = float(argv[1]), float(argv[2])
    latPrefix = int(lat)
    lonPrefix = int(lon)
    loadDowJones("dowjones.csv")
    calculateFractions() 


30W compliance[edit]

WARNING: Be careful with this script! Although I tried to make it 30W-compliant, I cannot guarantee it will output the correct coordinates, because I didn't test it very thoroughly...

At least, it does get the same values as 30W Rule page on the wiki:

$ ./near_geohashes.py 68 -30 | grep 2008-05-2.*
55.255531860961014 km near, hash(2008-05-20-13026.04), fractions: 0.630991, 0.618946
23.136853502336027 km near, hash(2008-05-21-12824.94), fractions: 0.179468, 0.861536
26.670377452155773 km near, hash(2008-05-22-12597.69), fractions: 0.972874, 0.238697
49.31177452443187 km near, hash(2008-05-23-12620.90), fractions: 0.400247, 0.722772
51.7625179127548 km near, hash(2008-05-24-12620.90), fractions: 0.126648, 0.547533
21.094839705051594 km near, hash(2008-05-25-12620.90), fractions: 0.941775, 0.182874
53.79151860822181 km near, hash(2008-05-26-12620.90), fractions: 0.673128, 0.607308
23.137520322442228 km near, hash(2008-05-27-12479.63), fractions: 0.209678, 0.101442
38.27005026419556 km near, hash(2008-05-28-12542.90), fractions: 0.687451, 0.212208
44.91787517770784 km near, hash(2008-05-29-12593.87), fractions: 0.464702, 0.034124
$ ./near_geohashes.py 68 -29 | grep 2008-05-2.*
55.4838760545182 km near, hash(2008-05-20-13026.04), fractions: 0.630991, 0.618946
23.265428063696074 km near, hash(2008-05-21-12824.94), fractions: 0.179468, 0.861536
26.67292242217709 km near, hash(2008-05-22-12597.69), fractions: 0.972874, 0.238697
49.61217710794762 km near, hash(2008-05-23-12620.90), fractions: 0.400247, 0.722772
51.791303059416734 km near, hash(2008-05-24-12620.90), fractions: 0.126648, 0.547533
21.109668916397265 km near, hash(2008-05-25-12620.90), fractions: 0.941775, 0.182874
53.97565438600757 km near, hash(2008-05-26-12620.90), fractions: 0.673128, 0.607308
48.57177027251728 km near, hash(2008-05-27-12620.90), fractions: 0.125367, 0.577111
30.83845687039247 km near, hash(2008-05-28-12479.63), fractions: 0.710441, 0.112732
39.48856495153396 km near, hash(2008-05-29-12542.90), fractions: 0.278327, 0.741142