Thursday 8 November 2018

What is the ideal data type to use when storing latitude / longitudes in a MySQL database?

Bearing in mind that I'll be performing calculations on lat / long pairs, what datatype is best suited for use with a MySQL database?

Answers

It basically depends on the precision you need for your locations. With DOUBLE you get a resolution of about 3.5 nm, DECIMAL(8,6)/(9,6) goes down to 16 cm, and FLOAT gives about 1.7 m.
This very interesting table has a more complete list (the Bytes column counts both the latitude and the longitude): http://mysql.rjweb.org/doc.php/latlng
Datatype               Bytes  Resolution
---------------------  -----  ---------------------------------
Deg*100 (SMALLINT)     4      1570 m    1.0 mi  Cities
DECIMAL(4,2)/(5,2)     5      1570 m    1.0 mi  Cities
SMALLINT scaled        4       682 m    0.4 mi  Cities
Deg*10000 (MEDIUMINT)  6        16 m     52 ft  Houses/Businesses
DECIMAL(6,4)/(7,4)     7        16 m     52 ft  Houses/Businesses
MEDIUMINT scaled       6       2.7 m    8.8 ft
FLOAT                  8       1.7 m    5.6 ft
DECIMAL(8,6)/(9,6)     9        16cm    1/2 ft  Friends in a mall
Deg*10000000 (INT)     8        16mm    5/8 in  Marbles
DOUBLE                16       3.5nm     ...    Fleas on a dog
Hope this helps.
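
The DECIMAL rows in the table can be roughly reproduced from the number of decimal places stored. The sketch below is an illustration, not taken from the linked page: it assumes an equatorial circumference of about 40,075 km, and it assumes the quoted resolution is the diagonal of one latitude step and one longitude step. The function names are made up for this example.

```python
import math

# Assumed equatorial circumference of the Earth, in metres.
EARTH_CIRCUMFERENCE_M = 40_075_017


def step_metres(decimal_places):
    """Ground distance, at the equator, of one unit in the last stored decimal place."""
    metres_per_degree = EARTH_CIRCUMFERENCE_M / 360
    return metres_per_degree * 10 ** -decimal_places


def diagonal_metres(decimal_places):
    """Worst-case offset when latitude and longitude are each off by one step."""
    return math.sqrt(2) * step_metres(decimal_places)


for places in (2, 4, 6):
    print(places, diagonal_metres(places))
```

Under these assumptions, 2, 4 and 6 decimal places come out near 1570 m, 16 m and 16 cm, in line with the DECIMAL rows above.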



When I did this for a navigation database built from ARINC 424 data, I did a fair amount of testing. Looking back at the code, I used DECIMAL(18,12) (actually NUMERIC(18,12), because it was Firebird).
Floats and doubles aren't as precise and may result in rounding errors, which may be a very bad thing. I can't remember whether I found any real data that had problems, but I'm fairly certain that the inability to store values accurately in a float or a double could cause problems.
The point is that when using degrees or radians we know the range of the values, and the fractional part needs the most digits.
The MySQL Spatial Extensions are a good alternative because they follow the OpenGIS Geometry Model. I didn't use them because I needed to keep my database portable.
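
The rounding concern with floats described above can be seen outside the database as well. This is a minimal sketch in Python, assuming a 4-byte IEEE 754 single like the one behind a FLOAT column; the helper names and the sample longitude are hypothetical.

```python
import struct
from decimal import Decimal


def as_float32(value):
    """Round-trip a value through a 4-byte IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def float32_error_degrees(value):
    """Absolute difference, in degrees, introduced by the 4-byte round trip."""
    return abs(as_float32(value) - value)


# Hypothetical longitude with seven decimal places.
lng = -122.4194155

print(float32_error_degrees(lng))   # non-zero: the 4-byte value is only close to lng
print(Decimal("-122.4194155"))      # a decimal type keeps every digit as written
```

Between 64 and 128 degrees a 4-byte float moves in steps of 2^-17 degrees (about 0.0000076), so the stored value can be off by a few millionths of a degree, while a decimal type holds the digits exactly as entered.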



Depends on the precision that you require.
Datatype               Bytes  Resolution
---------------------  -----  --------------------------------
Deg*100 (SMALLINT)         4  1570 m    1.0 mi  Cities
DECIMAL(4,2)/(5,2)         5  1570 m    1.0 mi  Cities
SMALLINT scaled            4   682 m    0.4 mi  Cities
Deg*10000 (MEDIUMINT)      6    16 m     52 ft  Houses/Businesses
DECIMAL(6,4)/(7,4)         7    16 m     52 ft  Houses/Businesses
MEDIUMINT scaled           6   2.7 m    8.8 ft
FLOAT                      8   1.7 m    5.6 ft
DECIMAL(8,6)/(9,6)         9   16 cm    1/2 ft  Friends in a mall
Deg*10000000 (INT)         8   16 mm    5/8 in  Marbles
DOUBLE                    16  3.5 nm     ...    Fleas on a dog
To summarise:
  • The most precise option available is DOUBLE.
  • The most commonly used type is DECIMAL(8,6)/(9,6).
As of MySQL 5.7, consider using Spatial Data Types (SDT), specifically POINT for storing a single coordinate. Prior to 5.7, SDT does not support indexes (with exception of 5.6 when table type is MyISAM).
Note:
  • When using the POINT class, the order of the arguments for storing coordinates must be POINT(latitude, longitude).
  • There is a special syntax for creating a spatial index.
  • The biggest benefit of using SDT is that you have access to the Spatial Analysis Functions, e.g. calculating the distance between two points (ST_Distance) and determining whether a point is contained within an area (ST_Contains).
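
For readers who keep latitude and longitude in plain numeric columns instead of POINT, a distance calculation like the one mentioned above has to be done by hand. A minimal sketch of a great-circle (haversine) distance follows. It assumes a spherical Earth with a radius of 6,371 km, the function name is made up for this example, and it is not meant to mirror how MySQL implements its spatial functions.

```python
import math

# Assumed mean Earth radius in metres (a round figure for a spherical model).
EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# One degree of longitude along the equator, roughly 111 km on this model.
print(haversine_m(0.0, 0.0, 0.0, 1.0))
```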



No need to look far: according to Google Maps, the best choice is FLOAT(10,6) for both lat and lng.



In a completely different and simpler perspective:
  • if you are relying on Google for showing your maps, markers, polygons, whatever, then let the calculations be done by Google!
  • you save resources on your server and you simply store the latitude and longitude together as a single string (VARCHAR), e.g. "-0000.0000001,-0000.000000000000001" (35 characters; if a number has more than 7 decimal digits, it gets rounded);
  • if Google returns more than 7 decimal digits per number, you can get that data stored in your string anyway, just in case you want to detect some fleas or microbes in the future;
  • you can use their distance matrix or their geometry library for calculating distances or detecting points in certain areas with calls as simple as this: google.maps.geometry.poly.containsLocation(latLng, bermudaTrianglePolygon)
  • there are plenty of "server-side" APIs you can use that use the Google Maps API.
This way you don't need to worry about indexing numbers and all the other problems associated with data types that may screw up your coordinates.
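
As an illustration of the single-string approach described above, here is a minimal sketch of packing a pair into one comma-separated value and reading it back. The helper names and the default of 7 decimal places are assumptions made for this example; nothing here comes from a Google API.

```python
def encode_pair(lat, lng, places=7):
    """Format a latitude/longitude pair as one comma-separated string."""
    return f"{lat:.{places}f},{lng:.{places}f}"


def decode_pair(text):
    """Split the stored string back into two floats."""
    lat_text, lng_text = text.split(",")
    return float(lat_text), float(lng_text)


stored = encode_pair(25.774, -80.19)   # hypothetical marker position
print(stored)
print(decode_pair(stored))
```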



While it isn't optimal for all operations, if you are making map tiles or working with large numbers of markers (dots) with only one projection (e.g. Mercator, as Google Maps and many other slippy map frameworks expect), I have found what I call a "Vast Coordinate System" to be really, really handy. Basically, you store x and y pixel coordinates at some very high zoom level; I use zoom level 23. This has several benefits:
  • You do the expensive lat/lng to Mercator pixel transformation once instead of every time you handle the point.
  • Getting the tile coordinate from a record given a zoom level takes one right shift.
  • Getting the pixel coordinate from a record takes one right shift and one bitwise AND.
  • The shifts are so lightweight that it is practical to do them in SQL, which means you can do a DISTINCT to return only one record per pixel location. That cuts down on the number of records returned by the backend, which means less processing on the front end.
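
The shift arithmetic above can be sketched as follows. This is an illustration under stated assumptions rather than the answerer's actual code: it assumes the usual Web Mercator formulas, 256-pixel tiles, and zoom level 23 as the storage zoom; all names are hypothetical. Clamping at the poles and at longitude 180 is left out, so it is only meant for latitudes within about ±85 degrees.

```python
import math

BASE_ZOOM = 23   # zoom level at which pixel coordinates are stored
TILE_BITS = 8    # assumes 256-pixel tiles (2 ** 8)


def to_vast(lat, lng):
    """Convert degrees to integer Web Mercator pixel coordinates at BASE_ZOOM."""
    world = 1 << (BASE_ZOOM + TILE_BITS)          # world width in pixels
    x = int((lng + 180.0) / 360.0 * world)
    sin_lat = math.sin(math.radians(lat))
    y = int((0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * world)
    return x, y


def tile_index(v, zoom):
    """Tile column or row at `zoom` (zoom <= BASE_ZOOM): one right shift."""
    return v >> (BASE_ZOOM - zoom + TILE_BITS)


def pixel_in_tile(v, zoom):
    """Pixel offset inside the tile at `zoom`: one right shift and one AND."""
    return (v >> (BASE_ZOOM - zoom)) & ((1 << TILE_BITS) - 1)


x, y = to_vast(51.5, -0.12)   # hypothetical point
print(tile_index(x, 10), tile_index(y, 10))
print(pixel_in_tile(x, 10), pixel_in_tile(y, 10))
```

Because both helpers are plain integer shifts and masks, the same expressions can be written with the `>>` and `&` operators in SQL.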

MySQL does all floating-point calculations in double precision, so use type DOUBLE. Using FLOAT will lead to unpredictably rounded values in most situations.



FLOAT should give you all of the precision you need, and be better for comparison functions than storing each co-ordinate as a string or the like.
If your MySQL version is earlier than 5.0.3, however, you may need to watch out for certain floating-point comparison errors.
Prior to MySQL 5.0.3, DECIMAL columns store values with exact precision because they are represented as strings, but calculations on DECIMAL values are done using floating-point operations. As of 5.0.3, MySQL performs DECIMAL operations with a precision of 64 decimal digits, which should solve most common inaccuracy problems when it comes to DECIMAL columns.
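
The kind of comparison error at issue is not specific to MySQL. As a rough analogy only, the sketch below uses Python's binary floats and its `decimal` module to show why exact decimal arithmetic makes equality checks behave as expected; it says nothing about MySQL's internal implementation.

```python
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 have no exact representation.
float_sum = 0.1 + 0.2

# Exact decimal arithmetic on the same values.
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(float_sum == 0.3)               # False
print(decimal_sum == Decimal("0.3"))  # True
```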
