

[WSG20] Daily Study Group: Biodiversity Explorations with Machine Learning

Posted 4 months ago
30 Replies
12 Total Likes


In this study group you'll practice applying machine learning techniques using the Wolfram Language and data from the natural world. Topics include functions to access biodiversity data, examples of classification, text analysis in social media, audio processing of bird sounds and deploying a trained neural network image classifier to your mobile phone. Sessions run daily, Monday through Friday. Join any session 15 minutes early for help getting started. I will guide each session by sharing lessons, polling the group to review key concepts, introducing practice problems and answering questions. A certificate of program completion will be awarded to participants who attend online sessions and pass a quiz. Sign up:


Looking forward to it, @Jofre Espigule-Pons!

Hi Jamie, I wasn't able to attend today. Will you be posting the recording soon? Sincerely, Jay Morreale Member p-brane LLC

Hi @Jay Morreale, the recording link was included in today's email reminder. The link is:

Hello, can you please post the notebook you showed for the "Tick" example? I want to try to correlate tick population growth with weather, such as rainfall. I am interested in the central WI area.

Thanks Andrew

The following sections are the ones related to tick species. If you are interested in getting tick observations from central Wisconsin only, I recommend using ResourceFunction["INaturalistSearch"] (see the next last example).

You showed the histogram data, but during the class you increased the resolution for the month of May. I don't recall how you did that. Did you change "Month" to "Day" in the Histogram plot?

Thanks Andrew

Yes, if you change "Month" to "Day" you will get daily observations on DateHistogram.
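As a rough illustration of that bin change, here is a minimal sketch. The `dates` list is hypothetical placeholder data, not the tick observations shown in the session, and `may` is just an illustrative variable name:

```wolfram
(* Hypothetical observation dates, for illustration only *)
dates = DateObject /@ {{2020, 5, 1}, {2020, 5, 1}, {2020, 5, 3},
    {2020, 5, 17}, {2020, 6, 2}, {2020, 6, 20}};

(* One bin per month *)
DateHistogram[dates, "Month"]

(* Keep only the May observations, then use one bin per day *)
may = Select[dates, DateValue[#, "Month"] == 5 &];
DateHistogram[may, "Day"]
```

Filtering to a single month before switching to daily bins keeps the plot readable when the full date range spans several years.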

You are probably already aware of the WeatherData function. You can use it to get rainfall data:

  DateListPlot[
   WeatherData[
    Entity["AdministrativeDivision", {"Wisconsin", "UnitedStates"}],
    "TotalPrecipitation", {{2018, 1, 1}, {2022, 12, 1}, "Month"}],
   Joined -> True]

Thank you very much, this is very helpful!! Andrew Skipor

Thanks very much, Andrew

Posted 4 months ago

At the end of the Day 2 challenge example Jofre provided, the syntax trained["TrainedNet"][image] would not execute, but if I used c[image] it works as expected. Is there some code missing (i.e., I didn't see trained[] defined)?

Also, is there a shortcut for ResourceFunction["INaturalistSearch"] that will create the iconized form on the fly, like Ctrl + Enter for entities?

Thanks for pointing to this syntax error. I edited the code. Concerning the shortcut for ResourceFunction, I'm not aware of any trick. It would be nice to have one though.

Note: In the "Predict" section of today's session there was an issue for those running version 12.1 or earlier. You will need to change the code slightly. Use the following:

pm = PredictorMeasurements[p, testing -> "Rings"]

Instead of using the 12.2 version code:

pm = PredictorMeasurements[p, testing]

Thank you, Neil, for catching this issue.

Thanks, Andrew

I'm trying to expand on the anole example, but apparently INaturalistSearch is limited to a maximum of 200 observations, even though 3046 observations for the brown anole were counted for Houston. I'm unable to use the Page and MaxItems options to automate the extraction (although I can brute force it by manually extracting 200 observations at a time into separate variables and then using Join). I also tried to extract 3 different species using the Or ("|") operator, but that did not work. My goal is to visually show the displacement of the green anole by the brown anole over the last 20 years in Houston.

In Jofre's stink bug example, he used a SemanticImport of a *.csv file, but I'm not sure of its source.

In my meandering around the web, I became aware of another invasive lizard that's beginning to get noticed in Texas, the Argentine tegu. They're the size of a dog and grow to about 4 feet long. Oh, and BTW, they're voracious omnivores!

@James Kralik, I recommend using GBIFImport instead, especially when the number of observations is very large. You can download the occurrence data directly from the GBIF site:

Once you have downloaded the dataset for this species, you can then import the local CSV/Text file containing the occurrences using ResourceFunction["GBIFImport"]

I was finally able to create animations similar to that in Jofre's stink bug example for the brown and green anole. Both animations are based on GBIF data. When the distributions are compared for a given year, it appears both populations are increasing and that green anoles are apparently not being displaced by the brown anoles. What's being depicted, however, is probably the increasing frequency of observations being reported, rather than any change in the true populations of these two species year-by-year. I can report, however, that I have not seen any green anoles around my house or on my walks this year. Perhaps because I'm not looking in the right places; e.g., the brown anoles have taken over the ground and the green anoles have moved into other niches, such as tree canopies or less urbanized areas like parks.



Nice animations! You are right: for a comparative analysis, one would need to take into account the rise in the number of observers over recent years. Or maybe you could plot the relative abundance of these two species over time. This article also points out the change in habitat for the green anole that you mentioned:
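To make the relative abundance idea concrete, here is a minimal sketch. The yearly counts are hypothetical placeholders, not GBIF results, and `brown`, `green` and `share` are illustrative names. In practice the counts would come from tallying the imported occurrences by year:

```wolfram
(* Hypothetical yearly observation counts, for illustration only *)
brown = <|2018 -> 120, 2019 -> 180, 2020 -> 300|>;
green = <|2018 -> 80, 2019 -> 120, 2020 -> 200|>;

(* Brown anole share of all anole observations in each year *)
share = Merge[{brown, green}, First[#]/Total[#] &];

(* Plot the share as {year, fraction} pairs *)
ListLinePlot[KeyValueMap[List, share], PlotRange -> {0, 1},
 AxesLabel -> {"Year", "Brown anole share"}]
```

Because both raw counts grow with the number of observers, the ratio is less sensitive to reporting effort than either count alone, although it still assumes the two species are reported at similar rates.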

Today (Day 5), we are going to train the "Wolfram ImageIdentify Net V1" neural network with additional images from "INaturalistSearch".

I would like to do the same with "YOLO V2 Trained on MS-COCO Data" and/or "YOLO V3 Trained on Open Images Data". The input should be an image, and the outputs should be the label, probability and bounding box.

How can such a network be trained? I have not found any documentation. The final objective is to identify all mushrooms in the following image.

Hi @Juergen Kanz, there was a presentation on this particular topic at the Wolfram Tech Conference 2020 that you might want to watch.

It isn't an easy task yet. Probably, the most laborious task will be to manually tag the labels and add bounding boxes on your training dataset. Hopefully in the near future there will be more tools for automating this task.

Hi Jofre,

Yes, this presentation goes in the right direction. Is it possible to get access to the notebook?

Thanks in advance.

Posted 4 months ago

Jofre - Great presentation yesterday on audio processing. I couldn't help but reflect on how the results were somewhat inconclusive (i.e., the end of slide 3) despite both the overall process and the coding appearing quite sound. With that said, is it possible the bird you recorded wasn't actually an owl?

Mourning doves can sound like owls and would be much more common inhabitants of the forest of lamp poles populating the UICI stadium parking lots. The initial WL entity search did return "pigeon, dove" as its AudioIdentify result.

A search of iNaturalist showed the Rock Dove could serve as another likely suspect for A/B test use.

Just a thought. Again, excellent presentation. Thank you.

Oh, that would be embarrassing! I'm not an owl expert, and you are right that mourning doves can sound like owls. Thanks for pointing out the Audubon article. If you want to check my observation and audio recording on iNaturalist, you can do it here:

Since there were other iNaturalists identifying the species as great horned owl, I did not perform an exhaustive search. I don't remember if I really saw the animal or just heard the sound. :)

I noticed that the anole downloads from GBIF are labeled as *.csv files, but they don't open properly in Excel (which is easier for me to use for filtering the data). They appear to be either space- or tab-delimited files. I tried to import them in Excel several different ways, but nothing worked. The files do import fine in Mathematica, however, using SemanticImport.

I assume the process Jofre used to load the iOS mushroom app was for his iPhone only, and did not place the app in the Apple app store - Correct?

In the quiz, none of the possible answers for Question 5 will work because of a syntax error. I'm pretty sure I've seen this same question on previous quizzes with the same error.

UPDATE: I just discovered that my earlier post about the syntax error in Question 5 of the quiz is wrong. Some operators (those that contain multiple characters) need to be entered using key combinations between Escape ... Escape. Apologies for any confusion.

Posted 4 months ago

Hi James,

The problem is caused by GBIF giving the file a .csv extension. It is not comma separated; it is tab separated. Change the file extension from .csv to .tsv and Excel will open it correctly.
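On the Wolfram Language side, one way to sidestep the misleading extension is to name the format explicitly when importing. The sketch below is only an illustration: the file name and its two sample rows are hypothetical stand-ins for a real GBIF download.

```wolfram
(* Write a small tab-separated sample to a temporary file
   that carries a misleading .csv extension (hypothetical data) *)
file = FileNameJoin[{$TemporaryDirectory, "occurrence_sample.csv"}];
Export[file,
  "species\tyear\nAnolis sagrei\t2019\nAnolis carolinensis\t2020",
  "Text"];

(* Force tab-separated parsing regardless of the extension *)
rows = Import[file, "TSV"];
```

Passing "TSV" as the second argument tells Import to ignore the extension, so the file does not need to be renamed before it is read.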

In response to the query about using neural networks for medical image analysis, here is a list of past posts from the community on medical-related machine learning projects that might be useful:
