Group Abstract

Message Boards

WOLFRAM COMMUNITY

32.9K Views

18 Replies

27 Total Likes

View groups...

Follow this post

Share this post:

GROUPS:

[WSC19] Use Machine Learning to Predict if Topography is Land or Ocean

Khang Minh Nguyen

Khang Minh Nguyen, VNU-HCMC High School for the Gifted

Posted 6 years ago

Goal of the Project Analyze and study algorithms for predicting if topography is land or ocean by using simple image processing to find the differences between on-land relief plots and underwater relief plots because there is a difference between simple image processing and machine learning on images What is Topography? Topography is a broad term used to describe the detailed study of the earth's surface. This includes changes in the surface such as mountains and valleys as well as features such as rivers and roads. It can also include the surface of other planets, the moon, asteroids, and meteors. Topography is closely linked to the practice of surveying, which is the practice of determining and recording the position of points in relation to one another. For example, this is a topography of a random place. Collecting Data Mathematica provides lots of data of various places over the world, so I take advantages of this to scrap data of on-land and underwater features for my topographies training set. The training data set includes: 200 On-land features such as capital cities, cities, buildings, mountains. buildings = EntityList["Building"] buildings = Take[buildings, -50] mountains = MountainData[All] mountains = Take[mountains, 50] cities = EntityList["City"] // Flatten cities = Take[cities, -50] 200 Under-water features such as trenches, basins, seamounts, etc. underseafeatures = DeleteMissing[ EntityValue[EntityClass["Ocean", "Seas"], "UnderseaFeatures"]] // Flatten underseafeatureslist = Take[underseafeatures, -200] Image Processing I've done various approaches onto this Image Processing step but only one is suitable. First Proposal First, I was trying to calculate the slope of each topography by the following formula: However, it will not work effectively if there are on-land topography and underwater one having the same slope. Consequently, the machine will not be able to differentiate which is land and which is ocean. Second Proposal Therefore, I moved on to my second proposal from my mentor Harshal, which is using Discrete Fourier Transformation to convert an image into a matrix. As I can see, the transformedimages using DFT do not help much because I can not distinguish the differences between images although the right one is land and the left one is ocean As I can see, the transformed images using DFT do not help much because I can not distinguish the differences between images although the right one is land and the left one is ocean. Third Proposal Finally, my final solution is applying a high-frequency filter on the images. High frequency filtered output was easy to distinguish since surface above water is generally rough when compared to the surface underwater. Initially, start with 2 relief plots, one is the plot of Mount Everest and one is the plot of Mariana Trench. From this point, I can clearly see that the left one is Everest and the right one is Mariana Trench. The reason I can differentiate those two images because the image on the right side is rough, which means it has a high probability to be land and the image on the left side is smooth, which implies that it has a high probability to be ocean. So, let's try to find out a way to programmatically figure out these differences! capitalcitieselevationPlots = ReliefPlot[ Reverse[GeoElevationData[#, GeoRange -> Quantity[4, "Kilometers"], GeoProjection -> Automatic]], ImageSize -> Medium, ColorFunction -> GrayLevel, Frame -> False, PlotRangePadding -> None] & /@ capitalcities buildingselevationPlots = ReliefPlot[ Reverse[GeoElevationData[#, GeoRange -> Quantity[4, "Kilometers"], GeoProjection -> Automatic]], ImageSize -> Medium, ColorFunction -> GrayLevel, Frame -> False, PlotRangePadding -> None] & /@ buildings mountainselevationPlots = ReliefPlot[ Reverse[GeoElevationData[#, GeoRange -> Quantity[4, "Kilometers"], GeoProjection -> Automatic]], ImageSize -> Medium, ColorFunction -> GrayLevel, Frame -> False, PlotRangePadding -> None] & /@ mountains citieselevationPlots = ReliefPlot[ Reverse[GeoElevationData[#, GeoRange -> Quantity[4, "Kilometers"], GeoProjection -> Automatic]], ImageSize -> Medium, ColorFunction -> GrayLevel, Frame -> False, PlotRangePadding -> None] & /@ cities underseafeatureselevationPlots = ReliefPlot[ Reverse[GeoElevationData[#, GeoRange -> Quantity[4, "Kilometers"], GeoProjection -> Automatic]], ImageSize -> Medium, ColorFunction -> GrayLevel, Frame -> False, PlotRangePadding -> None] & /@ underseafeatureslist Those lines of code above transfer all the locations we got into the relief plots. In the next step, I am going to apply a high-frequency filter on the images. To illustrate, for each relief plot, I turned in into black and white and also made the black and white version blurred then found the differences between those two and thus dilate it. This is how I did it in code. finalcapitalcitieslist = Dilation[ImageDifference[Blur[#, 20], #], 10] & /@ capitalcitieselevationPlots finalbuildingslist = Dilation[ImageDifference[Blur[#, 20], #], 10] & /@ buildingselevationPlots finalmountainslist = Dilation[ImageDifference[Blur[#, 20], #], 10] & /@ mountainselevationPlots finalcitieslist = Dilation[ImageDifference[Blur[#, 20], #], 10] & /@ citieselevationPlots finalunderseafeatureslist = Dilation[ImageDifference[Blur[#, 20], #], 10] & /@ underseafeatureselevationPlots This returns: These are two images are the images of Mount Everest and Mariana Trench after applying the high-frequency filter. As mentioned above, the left one is Mount Everest and the right one is Mariana Trench. Henceforth, the differences are lots of "bubbles" in the left image compare to the right one where the "bubbles" represent the abrupt changes in the initial topography. Creating Training Set I gathered all the separated training sets (capital cities, cities, buildings, mountains, undersea features) into a larger training set. finalonlandlist = Join[finalcapitalcitieslist, finalcitieslist, finalbuildingslist, finalmountainslist] finalunderseafeatureslist = finalunderseafeatureslist finaltraininglist = Join[finalonlandlist, finalunderseafeatureslist] Generating Classify Function Creating the classify function base on the training set and also the result of each element in the set and the machine got the best fit line for my model by applying Logistic Regression method. landExamplesAndClasses = (# -> "Land" &) /@ finalonlandlist; seaExamplesAndClasses = (# -> "Sea" &) /@ finalunderseafeatureslist; predict = Classify[Union[landExamplesAndClasses, seaExamplesAndClasses]] Creating the Final Function to Return Probabilities I created the final function which returns the probabilities of both land and sea as the input is a topography of an arbitrary area. predictor[img_] := predict[Dilation[ImageDifference[Blur[img, 20], img], 10], "Probabilities"] // Normal test = ReliefPlot[topography, Frame -> False, PlotRangePadding -> None] predictor[test] An Application: "Generating Temperature Map and Smooth Plot" In this extra challenge, I randomly chose a place and generate the heat map of it and the smooth map as well. Generating Temperature Map What I have done was followed by this list: Get the coordinates of the current location. From current location, move to its right 15 times for 8000 meters in each step to get a list of new coordinates. From each coordinate in the list above, move downwards 15 times for 8000 meters in each step to get a list of new coordinates. Rearrange the list of coordinates vertical instead of horizontal. Join the lists of coordinates above all together into a big list. Make a list of plots for each coordinate in the list above. For each plot, apply the Final Function the get the Probability of Land and Water. Collect all the Land Probability of each plot and Arrange them into a 16x16-sized matrix. From the 16x16-sized matrix, convert each number into Thermometer Colors ranging from - 0,5 to 0.5. vitri = GeoPosition[Here] vitringang = Table[GeoDestination[GeoPosition[vitri], {dist, 90}], {dist, 0, 120000, 8000}] vitridoc = Table[GeoDestination[GeoPosition[#], {dist, 180}], {dist, 8000, 120000, 8000}] & /@ vitringang vitridocdachinhsua = Table[vitridoc[[All, n]], {n, 1, 15}] tatcavitri = Join[vitridocdachinhsua, vitringang] // Flatten bando = ReliefPlot[ Reverse[GeoElevationData[{#, {(# /. GeoPosition[x_] -> x)[[1]] - 0.05, (# /. GeoPosition[x_] -> x)[[2]] + 0.05}}, GeoRange -> Quantity[4, "Kilometers"], GeoProjection -> Automatic, GeoCenter -> Automatic]], ImageSize -> Medium, ColorFunction -> GrayLevel, Frame -> False, PlotRangePadding -> None] & /@ (GeoPosition /@ tatcavitri) topo1 = predictor[#] & /@ bando // Flatten bandomoi = Partition[ Values[topo1][[#]] & /@ Flatten[Position[Keys[topo1], "Land"]], 16] MatrixPlot[bandomoi - .5, ColorFunction -> ColorData["ThermometerColors"]] And it returns: Let's compare it to the actual map: The temperature map approximately matches the actual map. However, at some points, for instance, in the sea, the color of the pixel is more red than blue. I tried to figure out the reason and found out that it might be rough because it became a water body only recently approximately couple of tens of thousands of years ago. Generating Smooth Plot To do this, I just applied quite same process as generating the Temperature Map above. point1 = GeoPosition[{11.111457, 107.636774}] point2 = GeoPosition[{11.111457, 109.803591}] point3 = GeoPosition[{8.991341, 107.636774}] point4 = GeoPosition[{8.991341, 109.803591}] rong = UnitConvert[GeoDistance[point1, point2], \!\(\* NamespaceBox["LinguisticAssistant", DynamicModuleBox[{Typeset`query$$ = "meters", Typeset`boxes$$ = TemplateBox[{ InterpretationBox[" ", 1], "\"m\"", "meters", "\"Meters\""}, "Quantity", SyntaxForm -> Mod], Typeset`allassumptions$$ = {{ "type" -> "Clash", "word" -> "meters", "template" -> "Assuming \"${word}\" is ${desc1}. Use as \ ${desc2} instead", "count" -> "2", "Values" -> {{ "name" -> "Unit", "desc" -> "a unit", "input" -> "C.meters-_Unit-"}, { "name" -> "Word", "desc" -> "a word", "input" -> "C.meters-_Word-"}}}}, Typeset`assumptions$$ = {}, Typeset`open$$ = {1, 2}, Typeset`querystate$$ = { "Online" -> True, "Allowed" -> True, "mparse.jsp" -> 0.3739998`7.0244163634533505, "Messages" -> {}}}, DynamicBox[ToBoxes[ AlphaIntegration`LinguisticAssistantBoxes["", 4, Automatic, Dynamic[Typeset`query$$], Dynamic[Typeset`boxes$$], Dynamic[Typeset`allassumptions$$], Dynamic[Typeset`assumptions$$], Dynamic[Typeset`open$$], Dynamic[Typeset`querystate$$]], StandardForm], ImageSizeCache->{80., {8., 18.}}, TrackedSymbols:>{ Typeset`query$$, Typeset`boxes$$, Typeset`allassumptions$$, Typeset`assumptions$$, Typeset`open$$, Typeset`querystate$$}], DynamicModuleValues:>{}, UndoTrackedVariables:>{Typeset`open$$}], BaseStyle->{"Deploy"}, DeleteWithContents->True, Editable->False, SelectWithContents->True]\)] // QuantityMagnitude dai = UnitConvert[GeoDistance[point1, point3], \!\(\* NamespaceBox["LinguisticAssistant", DynamicModuleBox[{Typeset`query$$ = "meters", Typeset`boxes$$ = TemplateBox[{ InterpretationBox[" ", 1], "\"m\"", "meters", "\"Meters\""}, "Quantity", SyntaxForm -> Mod], Typeset`allassumptions$$ = {{ "type" -> "Clash", "word" -> "meters", "template" -> "Assuming \"${word}\" is ${desc1}. Use as \ ${desc2} instead", "count" -> "2", "Values" -> {{ "name" -> "Unit", "desc" -> "a unit", "input" -> "C.meters-_Unit-"}, { "name" -> "Word", "desc" -> "a word", "input" -> "C.meters-_Word-"}}}}, Typeset`assumptions$$ = {}, Typeset`open$$ = {1, 2}, Typeset`querystate$$ = { "Online" -> True, "Allowed" -> True, "mparse.jsp" -> 0.3739998`7.0244163634533505, "Messages" -> {}}}, DynamicBox[ToBoxes[ AlphaIntegration`LinguisticAssistantBoxes["", 4, Automatic, Dynamic[Typeset`query$$], Dynamic[Typeset`boxes$$], Dynamic[Typeset`allassumptions$$], Dynamic[Typeset`assumptions$$], Dynamic[Typeset`open$$], Dynamic[Typeset`querystate$$]], StandardForm], ImageSizeCache->{80., {8., 18.}}, TrackedSymbols:>{ Typeset`query$$, Typeset`boxes$$, Typeset`allassumptions$$, Typeset`assumptions$$, Typeset`open$$, Typeset`querystate$$}], DynamicModuleValues:>{}, UndoTrackedVariables:>{Typeset`open$$}], BaseStyle->{"Deploy"}, DeleteWithContents->True, Editable->False, SelectWithContents->True]\)] // QuantityMagnitude soluongbuocnhayngang = rong/8000 + 1 // Floor soluongbuocnhaydoc = (dai - 8000)/8000 + 1 // Floor vitringang1 = Table[GeoDestination[GeoPosition[point1], {dist, 90}], {dist, 0, rong, 8000}] vitridoc1 = Table[GeoDestination[GeoPosition[#], {dist, 180}], {dist, 8000, dai, 8000}] & /@ vitringang1 vitridoc1dachinhsua = Table[vitridoc1[[All, n]], {n, 1, soluongbuocnhaydoc}] vitridoc1dachinhsua = Table[vitridoc1[[All, n]], {n, 1, soluongbuocnhaydoc}] tatcavitri1 = Join[vitridoc1dachinhsua, vitringang1] // Flatten bando1 = ReliefPlot[ Reverse[GeoElevationData[{#, {(# /. GeoPosition[x_] -> x)[[1]] - 0.05, (# /. GeoPosition[x_] -> x)[[2]] + 0.05}}, GeoRange -> Quantity[4, "Kilometers"], GeoProjection -> Automatic, GeoCenter -> Automatic]], ImageSize -> Medium, ColorFunction -> GrayLevel, Frame -> False, PlotRangePadding -> None] & /@ (GeoPosition /@ tatcavitri1) topo11 = predictor[#] & /@ bando1 // Flatten bandomoi1 = Partition[ Values[topo11][[#]] & /@ Flatten[Position[Keys[topo11], "Land"]], soluongbuocnhayngang] // Flatten kethopvitrichiso = MapThread[Rule, {tatcavitri1, bandomoi1}] GeoSmoothHistogram[kethopvitrichiso, ColorFunction -> ColorData["ThermometerColors"]] And here it is, this is the Smooth Map of a part of Vietnam which contains sea and land. Red regions in the heat map show land and blue show ocean and as visible Wrapping-up To summarize, in the first step, I collected the topographies from both on-land and underwater. Then in the second one, the most complicated step, I had tried to use Discrete Fourier Transform (DFT) for a topography image into the matrix, but the problem was that all Fourier images were very similar. Then I tried applying a high-frequency filter on the images. High frequency filtered output was easy to distinguish since surface above water is generally rough when compared to the surface underwater. After the Image Processing Step, I created the two set of training sets, one is on-land training set and another one is underwater features training set. Hence, I went straight to build the classify function and the Final Function to Return Probabilities of "Land" and "Sea" as well. Finally, for an extra challenge also a nice application in other words for my project, I generated a Temperature Map of a part of Boston and tried to compare it to the actual map and the generated heat map is similar to the real map. Future Works As we can see, from an arbitrary relief plot of a location, it is possible to turn the relief one into the thermometer surface where blue represents "Sea" and red represents "Land" by using the probabilities that the classifier returned. The first thing that needs to be done is to think of other crucial parameters that can be used, like water bodies around buildings, etc. to increase the accuracy. After increasing the accuracy and interesting thing to do would be to analyze the topography of other celestial bodies (like Mars) and then predict regions where liquids flowed on the surface. Are these the images of wet Mars in the past and current Mars? Who knows. Special Thanks I would like to give my special thanks to my mentor Mr. Harshal Gajjar for guiding me and also Mr. Mads Bahrami for giving me more challenges! Final Project GitHub link

POSTED BY: Khang Minh Nguyen

18 Replies

Sort By:

Mohammad Bahrami

Mohammad Bahrami, Wolfram Research Inc

Posted 6 years ago

Great job Khang! Your predictor is a suitable function for Wolfram Function Repository; and other users can use it in any Wolfram Language computation. What do you think about submitting your function into WFR?

POSTED BY: Mohammad Bahrami

Khang Minh Nguyen

Khang Minh Nguyen, VNU-HCMC High School for the Gifted

Posted 6 years ago

Thank you so much, Mr. Mads! I totally agree with publishing my function for WFR, I'll submit it soon!

POSTED BY: Khang Minh Nguyen

Nam Tran

Nam Tran, University of Miami

Posted 6 years ago

Hello Minh Khang, I was not aware that there was a fellow PTNK at Wolfram Summer 2019. I just want to reach out and say hello. Have a safe flight home, in case you are traveling back to Asia. Best, Nam