Clustering weather stations by historical temperature data
I have very limited knowledge of machine learning. I'm looking for a clustering algorithm that groups data points based on their historical data. For example: there are n weather stations (say 200), and I have 5 years of hourly temperature data for each of them. The data looks like this:
timestamp, station_1, station_2, ...
1900-01-01 00:00:00, 80, 60, 81, ...
1900-01-01 01:00:00, 82, 59, 83
I'm looking for a clustering algorithm that groups weather stations so that, within a cluster, the station temperatures are very close to each other. For example, 80 and 81 are close, while 80 and 60 are not.
It would also be great if the algorithm could tell me how close each data point is to its cluster center.
There is no free lunch
Don't expect to find an algorithm that exactly does what you need.
Customize algorithms to fit your problem. That is the whole story behind the data science buzz: the need to experiment and customize instead of hoping for a turnkey solution.
You have a very specific idea of what you need. You will have to put this idea into code and plug it into some algorithm. For example, consider complete linkage clustering with the maximum norm. It is probably what you described above, but I don't think it will be useful.
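To make that suggestion concrete, here is a minimal sketch of complete linkage clustering with the maximum norm, in plain Python. The helper names (`chebyshev`, `complete_linkage`, `distance_to_center`), the threshold of 2 degrees, and the third station name are assumptions for illustration. The sample values are the two rows from the question. It assumes every station has a value for every timestamp, and it is a naive implementation meant for reading, not for 200 stations with 5 years of hourly data.

```python
from itertools import combinations


def chebyshev(a, b):
    """Maximum norm: the largest absolute difference at any timestamp."""
    return max(abs(x - y) for x, y in zip(a, b))


def complete_linkage(series, threshold):
    """Greedy complete linkage clustering with a distance cutoff.

    series: dict mapping station name -> list of temperatures (equal lengths).
    Two clusters are merged only if even their two most dissimilar members
    are within `threshold` of each other at every timestamp.
    """
    clusters = [[name] for name in series]

    def linkage_distance(c1, c2):
        # Complete linkage: worst-case distance between members.
        return max(chebyshev(series[a], series[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest complete-linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage_distance(clusters[p[0]], clusters[p[1]]),
        )
        if linkage_distance(clusters[i], clusters[j]) > threshold:
            break  # no remaining pair is close enough to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because j > i
    return [sorted(c) for c in clusters]


def distance_to_center(series, cluster):
    """Maximum-norm distance of each station to the cluster's mean series."""
    length = len(series[cluster[0]])
    center = [sum(series[s][t] for s in cluster) / len(cluster) for t in range(length)]
    return {s: chebyshev(series[s], center) for s in cluster}


# The two sample rows from the question, one list per station.
series = {
    "station_1": [80, 82],
    "station_2": [60, 59],
    "station_3": [81, 83],
}

clusters = complete_linkage(series, threshold=2)
for cluster in clusters:
    print(cluster, distance_to_center(series, cluster))
```

With this cutoff, two stations land in the same cluster only if they never differ by more than 2 degrees at any hour, so a single hour with a large gap is enough to keep them apart. For real data sizes, SciPy's `scipy.cluster.hierarchy.linkage` with `method="complete"` and `metric="chebyshev"`, followed by `fcluster` with `criterion="distance"`, does the same kind of clustering.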