Saturday, 30 January 2010

GHCN World Map

Today I wrote a C# program to preprocess and load GHCN data into a number of classes.

I have also written a Worldmap class that plots a list of selected stations. For example, the image above was generated with this line of code:

map.Plot(ghcn.Stations, Brushes.Red);

Using C# and LINQ it's easy to write any number of queries that select subsets of GHCN stations. For example, the following query selects all stations in Australia:

map.Plot( (from s in ghcn.Stations
           where s.Country == "AUSTRALIA"
           select s).ToList(), Brushes.Red );

Or to select all stations above 1000m elevation:

map.Plot( (from s in ghcn.Stations
           where s.Elevation > 1000
           select s).ToList(), Brushes.Blue );

Here's a particularly crazy query: all stations below 100 m elevation, north of 60N or south of 60S, whose station name begins with 'A'. I won't bother graphing it, as only 16 stations match this very specific query.

(from s in ghcn.Stations
 where Math.Abs(s.Latitude) > 60 && s.Elevation < 100 && s.Name.StartsWith("A")
 select s)

The great thing about LINQ is that any property I add to the Station class automatically becomes available to query. Once I integrate the station temperature data into the Station class, it will be possible to query on that data too: for example, all stations whose temperature record has a trend greater than a specified amount, all stations whose records cover the period 1920-1960, or any combination of these. LINQ can also order the result set, e.g. by elevation, by length of temperature record, or by trend.
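To give a rough idea of what such a query might look like, here is a minimal sketch. It assumes a cut-down Station class with three hypothetical properties (FirstYear, LastYear and TrendPerDecade); none of these exist in the real class yet, and the names are placeholders only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, cut-down Station class. FirstYear, LastYear and
// TrendPerDecade are assumed names, not part of the real class yet.
public class Station
{
    public string Name { get; set; }
    public double Elevation { get; set; }      // metres
    public int FirstYear { get; set; }         // first year with data
    public int LastYear { get; set; }          // last year with data
    public double TrendPerDecade { get; set; } // degrees C per decade
}

public static class StationQueries
{
    // Stations whose record spans all of 1920-1960 and whose trend
    // exceeds minTrend, ordered with the longest record first.
    public static List<Station> LongRecordsWithTrend(
        IEnumerable<Station> stations, double minTrend)
    {
        return (from s in stations
                where s.FirstYear <= 1920 && s.LastYear >= 1960
                      && s.TrendPerDecade > minTrend
                orderby (s.LastYear - s.FirstYear) descending
                select s).ToList();
    }
}
```

The result could then be passed straight to map.Plot in the same way as the earlier queries.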

For now I am working on loading the raw and adjusted temperature data. As a preprocessing step I have split the v2.mean and v2.mean_adj files so that each station has its own file containing its temperature records. This way an individual station's temperature data can be loaded from disk as needed, rather than loading everything up front, which takes time and a lot of memory.
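The splitting step can be sketched roughly as follows. This is not the actual preprocessing code, just a minimal illustration; it assumes that the first 11 characters of each v2.mean record are the station ID, and the GhcnSplitter class, its method names and the ".txt" file naming are all made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class GhcnSplitter
{
    // Assumption: the first 11 characters of each record are the station ID.
    const int StationIdLength = 11;

    // Groups the records by station ID and writes one file per station
    // into outputDir. Returns the number of station files written.
    public static int Split(IEnumerable<string> lines, string outputDir)
    {
        Directory.CreateDirectory(outputDir);

        var groups = lines
            .Where(l => l.Length >= StationIdLength)
            .GroupBy(l => l.Substring(0, StationIdLength));

        int count = 0;
        foreach (var g in groups)
        {
            string path = Path.Combine(outputDir, g.Key + ".txt");
            File.WriteAllLines(path, g.ToArray());
            count++;
        }
        return count;
    }

    // Loads a single station's records from disk on demand.
    public static string[] LoadStation(string outputDir, string stationId)
    {
        return File.ReadAllLines(Path.Combine(outputDir, stationId + ".txt"));
    }
}
```

A call such as GhcnSplitter.Split(File.ReadAllLines("v2.mean"), "stations") would do the one-off split. Note that the grouping holds the whole file in memory once, which is acceptable for a preprocessing step but is exactly what the per-station files avoid afterwards.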

Once the temperature data can be loaded and queried, I need to write a number of utility methods, such as one to calculate the distance between two stations and several for calculating trends. I haven't yet figured out how to handle the trend calculations, as I am not familiar with statistical analysis or with how to combine multiple temperature records. I will just have to cross those bridges when I come to them.
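As a possible starting point for those utility methods, here is a minimal sketch. It uses two standard techniques that I have not settled on: the haversine formula for great-circle distance (assuming a spherical Earth with a mean radius of 6371 km) and an ordinary least-squares slope as the simplest kind of trend. The StationMath class and its method names are hypothetical, and the sketch says nothing about how to combine multiple records.

```csharp
using System;

public static class StationMath
{
    const double EarthRadiusKm = 6371.0; // mean Earth radius (assumption: spherical Earth)

    static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }

    // Great-circle distance in km between two points given in
    // decimal degrees, using the haversine formula.
    public static double DistanceKm(double lat1, double lon1, double lat2, double lon2)
    {
        double dLat = ToRadians(lat2 - lat1);
        double dLon = ToRadians(lon2 - lon1);
        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                 + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
                 * Math.Sin(dLon / 2) * Math.Sin(dLon / 2);
        // Clamp to 1 to guard against rounding error before Asin.
        return 2 * EarthRadiusKm * Math.Asin(Math.Min(1.0, Math.Sqrt(a)));
    }

    // Ordinary least-squares slope of values against years,
    // i.e. the change in value per year.
    public static double TrendPerYear(double[] years, double[] values)
    {
        if (years.Length != values.Length || years.Length < 2)
            throw new ArgumentException("Need two equal-length arrays with at least 2 points.");

        int n = years.Length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++)
        {
            meanX += years[i];
            meanY += values[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++)
        {
            sxy += (years[i] - meanX) * (values[i] - meanY);
            sxx += (years[i] - meanX) * (years[i] - meanX);
        }
        return sxy / sxx;
    }
}
```

Missing months and years would have to be filtered out before calling TrendPerYear, since the slope is only meaningful over the points actually supplied.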

I also need the ability to graph temperature records.
