Friday, February 8, 2013

Shrinking Well Known Text (WKT) Geometries By Rounding Decimals


It's been a good month, full of fun stuff, but nothing that REALLY seemed blogworthy. But today, someone suggested an idea.

Background: When communicating a geometry from the server to the browser, we often use Well Known Text (WKT). It's not overly verbose, and it's simple and fast to generate.

Issue: WKT geometries can still be quite large. Simplifying is a great step if you can afford the introduced inaccuracy, but still, can we make them smaller?

Solution: Round the decimals.

If you are using an SRS with meters or feet as units, then you can probably afford for your geometries to use whole numbers instead of decimals. After all, what's the rounding error? A difference of 0.999 meters isn't much, considering the accuracy of handheld or automotive GPS units.

Check out this sample WKT, the first 6 vertices of the polygon of Salinas County, California.

-- ST_ASTEXT()
 POLYGON((-13578594.747402 4380792.43128354,-13578592.2992804 4380792.29270892,-13578569.3701029 4380825.00366193,-13578580.720485 4380880.31712633,-13578605.4359713 4380926.761986
-- ST_ASTEXT_ROUNDED()
 POLYGON((-13578594 4380792,-13578592 4380792,-13578569 4380825,-13578580 4380880,-13578605 4380926,-13578607 4380932

Overall, the reduction is just under 50%, from 598 KB to 309 KB. The savings stack with ST_SIMPLIFY() too. If you can afford to cut corners (a little geospatial pun there, ha ha), you can simplify first and then round, cutting even the simplified payload roughly in half.


Neat. How do I do it?


I created a handy function, ST_ASTEXT_ROUNDED(), which complements ST_ASTEXT(). It works with PostGIS 1 and PostGIS 2.

Go git it!
https://github.com/gregallensworth/PostGIS

This is really a function wrapper around a super simple regular expression which trims off the decimals. It doesn't actually round the numbers; it crops them. But then again, with a maximum error of 0.999 meters per coordinate, does it really matter?
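
To show what that cropping does to a WKT string, here is a minimal sketch in Python. It is not the code from the repository: the function name crop_wkt_decimals and the exact pattern are assumptions made for illustration, and it assumes the coordinates are written as plain decimals (no exponent notation like 1.5e-07).

```python
import re

# Matches a decimal point and the digits after it, e.g. ".747402".
# Assumption: coordinates are plain decimals, never exponent notation.
FRACTION = re.compile(r"\.\d+")


def crop_wkt_decimals(wkt: str) -> str:
    """Drop the fractional part of every number in a WKT string.

    This truncates toward zero; it does not round to the nearest integer.
    """
    return FRACTION.sub("", wkt)


# First two vertices of the sample polygon shown earlier in the post.
sample = "POLYGON((-13578594.747402 4380792.43128354,-13578592.2992804 4380792.29270892))"
print(crop_wkt_decimals(sample))
# POLYGON((-13578594 4380792,-13578592 4380792))
```

Note that -13578594.747402 becomes -13578594, not -13578595, which is the crop-versus-round difference mentioned above.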


When Not To Do This


A key assumption behind this rounding trick is that dropping the decimals makes no real-world difference to the accuracy of your data. For feet or meters, the difference between 4380792.43128354 and 4380792 is less than an arm's length.

If you're using geographic coordinates (WGS84, lat & lon) then it's different. A degree is about 69 miles (60 nautical miles) at the equator, so rounding a longitude from -122.3 to -122 moves a point by about 21 miles at the equator and still around 10 miles at 60 degrees latitude. Don't do that.
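
For a rough feel of how big that error is, here is a back-of-the-envelope sketch in Python. It assumes a spherical Earth and about 69.17 statute miles per degree of longitude at the equator, both simplifications; the function name longitude_error_miles is made up for this example.

```python
import math

# Approximate length of one degree of longitude at the equator, in statute
# miles (assumption: spherical Earth, equatorial circumference / 360).
MILES_PER_DEGREE_AT_EQUATOR = 69.17


def longitude_error_miles(delta_degrees: float, latitude_degrees: float) -> float:
    """Approximate east-west distance spanned by delta_degrees of longitude."""
    shrink = math.cos(math.radians(latitude_degrees))  # meridians converge toward the poles
    return abs(delta_degrees) * MILES_PER_DEGREE_AT_EQUATOR * shrink


# Error from cropping -122.3 to -122 (0.3 degrees) at a few latitudes.
for lat in (0, 37, 60):
    print(lat, round(longitude_error_miles(0.3, lat), 1))
# 0 20.8
# 37 16.6
# 60 10.4
```

Compare that with the sub-meter error in a meter-based SRS and it's clear why the trick only makes sense for projected coordinates.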