{"id":2169,"date":"2017-10-10T16:30:00","date_gmt":"2017-10-10T14:30:00","guid":{"rendered":"https:\/\/www.sqlinthewild.co.za\/?p=2169"},"modified":"2017-10-09T23:39:30","modified_gmt":"2017-10-09T21:39:30","slug":"hunting-for-the-true-location-with-machine-learning","status":"publish","type":"post","link":"https:\/\/www.sqlinthewild.co.za\/index.php\/2017\/10\/10\/hunting-for-the-true-location-with-machine-learning\/","title":{"rendered":"Hunting for the True Location, with Machine Learning"},"content":{"rendered":"<p>Some context first.<\/p>\n<p>My company puts on a year end function every year. It\u2019s at some resort or other, and the important thing for this post is that we\u2019re not told the location in advance. We find out when we get there (by bus).<\/p>\n<p>What we are told, about a month ahead of the event, is approximate distances from 3-4 locations. These are the bus pickup sites. The locations are:<\/p>\n<ul>\n<li>Head Office<\/li>\n<li>Near Clearwater Mall<\/li>\n<li>Fourways<\/li>\n<li>Centurion<\/li>\n<\/ul>\n<p>The distances given aren\u2019t correct. As a result, there are usually several attempts by various people to figure out in advance where the year end function will be.<\/p>\n<p>I thought I\u2019d join in this year, using some machine learning on those distances.<\/p>\n<p>Now, I should mention that this is a very poor use for ML, mainly because of a lack of data. I should have hundreds of data points for a decent prediction. I have 2 or 3 data points for each of 4 locations. Still, it\u2019s what I have to work with.<\/p>\n<p>First, the starting data. 
The distances for this year are:<\/p>\n<ul>\n<li>Clearwater Mall: 63 KM<\/li>\n<li>Centurion: 56 KM<\/li>\n<li>Fourways: 43 KM<\/li>\n<li>HQ: 20 KM<\/li>\n<li>Cape Town: 1447 KM<\/li>\n<\/ul>\n<p>I\u2019m going to ignore Cape Town for training, as it only had a distance previously specified in 2015, and so I only have one piece of data.<\/p>\n<p>Plotting this on a map makes it clear that the distances have been \u2018massaged\u2019. (For ease of plotting, I\u2019m plotting straight-line distances, \u2018as the crow flies\u2019, not driving distances; I\u2019ll use driving distances for the training.)<\/p>\n<p><a href=\"https:\/\/www.sqlinthewild.co.za\/wp-content\/uploads\/2017\/10\/yef_720.png\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;\" title=\"yef_720\" src=\"https:\/\/www.sqlinthewild.co.za\/wp-content\/uploads\/2017\/10\/yef_720_thumb.png\" alt=\"yef_720\" width=\"451\" height=\"484\" border=\"0\" \/><\/a><\/p>\n<p>Let\u2019s look at previous years.<\/p>\n<p><strong>2016<\/strong><\/p>\n<p>Actual location: Seasons Sport and Spa (pin on the map below)<\/p>\n<p>Actual distances calculated with Google Maps, driving distance, shortest route.<\/p>\n<ul>\n<li>Clearwater: Given distance &#8211; 80KM. Actual distance &#8211; 67KM<\/li>\n<li>Centurion: Given distance &#8211; 88KM. Actual distance &#8211; 47KM<\/li>\n<li>Fourways: Given distance &#8211; 68KM. Actual distance &#8211; 52KM<\/li>\n<li>HQ: Given distance &#8211; 115KM. 
Actual distance &#8211; 75KM<\/li>\n<\/ul>\n<p><a href=\"https:\/\/www.sqlinthewild.co.za\/wp-content\/uploads\/2017\/10\/YEF2016.png\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border: 0px;\" title=\"YEF2016\" src=\"https:\/\/www.sqlinthewild.co.za\/wp-content\/uploads\/2017\/10\/YEF2016_thumb.png\" alt=\"YEF2016\" width=\"485\" height=\"484\" border=\"0\" \/><\/a><\/p>\n<p><strong>2015<\/strong><\/p>\n<p>Actual location: Vaal River Country Lodge.<\/p>\n<p>Actual distances calculated with Google Maps, driving distance, shortest route.<\/p>\n<ul>\n<li>Clearwater Mall: Given distance &#8211;\u00a0 51KM. Actual distance \u2013 79KM<\/li>\n<li>Centurion: Given distance &#8211; 110KM. Actual distance &#8211; 118KM<\/li>\n<li>Fourways: Given distance &#8211; 89KM. Actual distance &#8211; 97KM<\/li>\n<\/ul>\n<p><a href=\"https:\/\/www.sqlinthewild.co.za\/wp-content\/uploads\/2017\/10\/YEF2015.png\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;\" title=\"YEF2015\" src=\"https:\/\/www.sqlinthewild.co.za\/wp-content\/uploads\/2017\/10\/YEF2015_thumb.png\" alt=\"YEF2015\" width=\"484\" height=\"484\" border=\"0\" \/><\/a><\/p>\n<p><strong>2014<\/strong><\/p>\n<p>Actual location: Askari Game Lodge.<\/p>\n<p>Actual distances calculated with Google Maps, driving distance, shortest route.<\/p>\n<ul>\n<li>Clearwater Mall: Given distance &#8211;\u00a0 52KM. Actual distance \u2013 79KM<\/li>\n<li>HQ: Given distance &#8211; 90KM. 
Actual distance \u2013 118KM<\/li>\n<\/ul>\n<p><a href=\"https:\/\/www.sqlinthewild.co.za\/wp-content\/uploads\/2017\/10\/YEF2014.png\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border: 0px;\" title=\"YEF2014\" src=\"https:\/\/www.sqlinthewild.co.za\/wp-content\/uploads\/2017\/10\/YEF2014_thumb.png\" alt=\"YEF2014\" width=\"488\" height=\"484\" border=\"0\" \/><\/a><\/p>\n<p>With that, I have the following training data:<\/p>\n<table border=\"1\" width=\"400\" cellspacing=\"0\" cellpadding=\"2\">\n<tbody>\n<tr>\n<td valign=\"top\" width=\"133\">Location<\/td>\n<td valign=\"top\" width=\"133\">Given Distance<\/td>\n<td valign=\"top\" width=\"133\">Error in Distance (Given &#8211; Actual)<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"133\">Clearwater Mall<\/td>\n<td valign=\"top\" width=\"133\">80<\/td>\n<td valign=\"top\" width=\"133\">13<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"133\">Clearwater Mall<\/td>\n<td valign=\"top\" width=\"133\">51<\/td>\n<td valign=\"top\" width=\"133\">-28<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"133\">Clearwater Mall<\/td>\n<td valign=\"top\" width=\"133\">52<\/td>\n<td valign=\"top\" width=\"133\">-27<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"133\">Centurion<\/td>\n<td valign=\"top\" width=\"133\">88<\/td>\n<td valign=\"top\" width=\"133\">41<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"133\">Centurion<\/td>\n<td valign=\"top\" width=\"133\">110<\/td>\n<td valign=\"top\" width=\"133\">-8<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"133\">Fourways<\/td>\n<td valign=\"top\" width=\"133\">68<\/td>\n<td valign=\"top\" width=\"133\">-6<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"133\">Fourways<\/td>\n<td valign=\"top\" width=\"133\">89<\/td>\n<td valign=\"top\" width=\"133\">-8<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"133\">HQ<\/td>\n<td valign=\"top\" width=\"133\">115<\/td>\n<td valign=\"top\" 
width=\"133\">40<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\" width=\"133\">HQ<\/td>\n<td valign=\"top\" width=\"133\">90<\/td>\n<td valign=\"top\" width=\"133\">-28<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Now to stick those into a linear regression and see if I can predict the error on this year\u2019s measurements.<\/p>\n<p>I need to mention that with so little data, the accuracy of the linear regression is going to be very low. I\u2019m as likely to get the correct results from linear regression as I am to get correct results from rolling a couple of d20s.<\/p>\n<p>That said, onwards to untrustworthy results.<\/p>\n<p>Once the starting values are loaded into R, creating a simple model is as easy as<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">m &lt;- lm(Error ~ Location + Distance, data=YEF)<\/pre>\n<p>Then load up this year\u2019s values into another data frame, and predict.<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">predict(m, YEFPredict)<\/pre>\n<p>The errors come out as:<\/p>\n<ul>\n<li>Clearwater Mall: -12<\/li>\n<li>Centurion: -18<\/li>\n<li>Fourways: -35<\/li>\n<li>HQ: -60<\/li>\n<\/ul>\n<p>Giving final estimated distances (Given &#8211; Error) as<\/p>\n<ul>\n<li>Clearwater Mall: 75KM<\/li>\n<li>Centurion: 74KM<\/li>\n<li>Fourways: 78KM<\/li>\n<li>HQ: 80KM<\/li>\n<\/ul>\n<p><a href=\"https:\/\/www.sqlinthewild.co.za\/wp-content\/uploads\/2017\/10\/YEF2017.png\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border: 0px;\" title=\"YEF2017\" src=\"https:\/\/www.sqlinthewild.co.za\/wp-content\/uploads\/2017\/10\/YEF2017_thumb.png\" alt=\"YEF2017\" width=\"448\" height=\"484\" border=\"0\" \/><\/a><\/p>\n<p>Maybe I should have stuck to using dice.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Some context first. My company puts on a year end function every year. 
It\u2019s at some resort or other, and the important thing for this post is that we\u2019re not told the location in advance. We find out when we&#8230; <a class=\"read-more-button\" href=\"https:\/\/www.sqlinthewild.co.za\/index.php\/2017\/10\/10\/hunting-for-the-true-location-with-machine-learning\/\">(Read more)<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"New blog post: Hunting for the True Location, with Machine Learning","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[31,32],"tags":[],"class_list":["post-2169","post","type-post","status-publish","format-standard","hentry","category-machine-learning","category-r"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p7h6n-yZ","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/posts\/2169","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/comments?post=2169"}],"version-history":[{"count":2,"href":"https:\/\/www.sqlinthewild.co.
za\/index.php\/wp-json\/wp\/v2\/posts\/2169\/revisions"}],"predecessor-version":[{"id":2171,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/posts\/2169\/revisions\/2171"}],"wp:attachment":[{"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/media?parent=2169"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/categories?post=2169"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/tags?post=2169"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}