18C Computer Vision Artificial Knowledge Links

Waarneming.nl: the model is revised at most three times per year.
Model 1, Naturalis: 2-20 photos per species
Model 2, Naturalis: 40 photos per species

Model 2-4, iNat: 40 photographers per species
Model 5, iNat: 1000 photos per species
…reference observations are used (5000 or 40)

https://groups.google.com/forum/#!topic/inaturalist/K9nJOC0Cjss

https://www.inaturalist.org/pages/help#cv-taxa
https://www.inaturalist.org/pages/help#computer-vision
https://www.inaturalist.org/pages/help#cv-select
https://www.inaturalist.org/blog/31806-a-new-vision-model
FWIW, there's also discussion and some additional charts at
https://forum.inaturalist.org/t/psst-new-vision-model-released/10854/11
https://www.inaturalist.org/pages/identification_quality_experiment
https://www.inaturalist.org/journal/loarie/10016-identification-quality-experiment-update
https://www.inaturalist.org/journal/loarie/9260-identification-quality-experiment-update
…about a rare species, but the system might still recommend one based on nearby observations
https://forum.inaturalist.org/t/identification-quality-on-inaturalist/7507
https://github.com/kueda/inaturalist-identification-quality-experiment/blob/master/identification-quality-experiment.ipynb
"nearby" means near in space and time
The model became more accurate on sedges and grasses
the vision model does not itself incorporate non-image data other than taxon IDs
b/c = because
https://www.inaturalist.org/blog/25510-vision-model-updates ("taxon and region comparisons" 20190614)
https://distill.pub/2020/circuits/zoom-in/ ("connections between neurons")
https://www.inaturalist.org/projects/flora-of-russia/journal/31726
https://www.inaturalist.org/posts/31726-
https://forum.inaturalist.org/t/provide-relevant-geographic-data-confidence-level-accuracy-scores-with-ai-suggestions/9226/2
https://forum.inaturalist.org/t/range-covered-by-the-seen-nearby-feature/2849/5

https://forum.inaturalist.org/t/use-computer-vision-to-annotate-observations/3331
https://forum.inaturalist.org/t/what-image-s-are-used-for-training-computer-vision/3307/6

Interesting article about the artificial intelligence behind iNaturalist: https://news.mongabay.com/wildtech/2017/08/smartphone-app-combines-computer-vision-and-crowdsourcing-to-explore-earths-biodiversity-one-photo-at-a-time/

= = = Nov2020
https://forum.inaturalist.org/t/better-use-of-location-in-computer-vision-suggestions/915/32
https://github.com/inaturalist/iNaturalistAPI/blob/main/lib/controllers/v1/computervision_controller.js#L368

Common ancestor for the top 3 raw results

https://github.com/inaturalist/iNaturalistAPI/blob/main/lib/controllers/v1/computervision_controller.js#L368

https://www.inaturalist.org/journal/ahospers/28858-funding-infrastructure-costs-images-on-amazon

I'm looking for a way of finding observations without coordinates. Many of these have Location Notes, so it is basically the missing longitude or latitude that I am looking for.
I am not interested in those with Latitude = 0 or Longitude = 0 (see https://www.inaturalist.org/projects/null, which is very inappropriately named: I am looking for NULLs, but this project identifies zeros instead. Nulls have no data (value unassigned, empty, or missing), whereas 0 is a specific datum, zero, like any other value, and not a "null").
At present, for this user, filtering on verifiable=false gives me more or less what I want, but it conflates these with observations failing any Data Quality criterion, not just missing coordinates.
https://www.inaturalist.org/observations?place_id=any&subview=grid&user_id=ahospers&verifiable=false
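A minimal sketch of the same query against the public iNaturalist API v1, assuming the same verifiable=false filter as in the URL above; like the website filter, it also returns records failing other data-quality criteria, so missing coordinates are filtered out client-side via the (assumed) geojson field:

import requests

params = {"user_id": "ahospers", "verifiable": "false", "per_page": 200}
resp = requests.get("https://api.inaturalist.org/v1/observations", params=params)
resp.raise_for_status()
# keep only records with no coordinates at all (geojson is empty when location is missing)
no_coords = [o for o in resp.json()["results"] if not o.get("geojson")]
print(len(no_coords), "observations without coordinates")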

I added a very basic search to atlases in response to Jane’s feature request: https://www.inaturalist.org/atlases. So now, if you wanted to see all ‘marked’, ‘active’ atlases of taxa in the Liliaceae, you’d use https://www.inaturalist.org/atlases?utf8=✓&filters[taxon_name]=Lilies&filters[taxon_id]=47328&filters[is_active]=True&filters[is_marked]=True

The out-of-range flag is vestigial; we don’t display it anywhere anymore (except in the old filter menu that’s still on https://www.inaturalist.org/observations/loarie). It worked directly on the taxon range, rather than using atlases.

Posted by ahospers, 20 March 2020

Comments


Coolia 2020(3)

In recent years there has been an explosion in the availability of smartphone apps that can help with mushroom identification in the field. A number of approaches are available, ranging from apps that identify mushrooms automatically using artificial intelligence (AI) and automated image recognition, through those that require the user to work through traditional dichotomous or multi-access keys, to those that may only offer a range of images without a clear system for identifying any species of interest.

Bachelor thesis: Magic Mushroom App - recognising edible mushrooms with deep learning (Python)
https://www.ntb.ch/fileadmin/NTB_Institute/ICE/projekte/MagicMushroom/JUNG_R._WAGNER_D._MagicMushroom_App-Pilzklassifikation_mit_CNNs.pdf

Deep Shrooms: classifying mushroom images
https://tuomonieminen.github.io/deep-shrooms/
https://github.com/TuomoNieminen/deep-shrooms (Python)
https://teekoivi.users.cs.helsinki.fi/
https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
https://www.youtube.com/watch?v=f6Bf3gl4hWY

ShroomNet: an artificial neural network for identifying mushroom species
https://www.obermeier.ch/wp-content/uploads/2018/12/ShroomNET_small.pdf !!

Artificial Intelligence for plant identification on smartphones and tablets
https://bsbi.org/wp-content/uploads/dlm_uploads/BSBI-News-144-pp34-40-plant-id-apps-final.pdf

https://web.plant.id/

Tuomas Nieminen: Deep Learning in Quantifying Vascular Burden from Brain Images
https://www.semanticscholar.org/paper/TUOMAS-NIEMINEN-DEEP-LEARNING-IN-QUANTIFYING-BURDEN-Eskola/aea24dc5822ac9f5af4801f9aaf9ab864cf23aea

Obsidentify
https://play.google.com/store/apps/details?id=org.observation.obsidentify

Svampeatlas (Danish fungal atlas)
https://play.google.com/store/apps/details?id=com.noque.svampeatlas

German
https://play.google.com/store/apps/details?id=com.nastylion.pilz

iNaturalist Seek
https://play.google.com/store/apps/details?id=org.inaturalist.seek

Google Lens
https://play.google.com/store/apps/details?id=com.google.ar.lens


Posted by ahospers 12 months ago

No, of course not. Since there is a fairly long time between new versions of the recognition software, it will work roughly as follows, with the process being run through partly or in full each time. You start with the complete validated photo database. On that, a convolutional neural network is 'trained' (I would gladly explain exactly how this works down to the last detail, but that seems rather outside the scope of this forum).

The most important thing, in any case, is to find a balance between 'overfitting' and having no recognition at all. That is: prevent that only photos very closely resembling the photos in the database are recognized, while still making sure that species are recognized in a 'generalized' way, so that new photos of that species are recognized too.

When the complete database is used as training material and the complete database is also used as test material, it can happen that a very complex model scores extremely high. Great, 99% recognition!! Not quite. This is often a consequence of 'overfitting'. When you hand the model a new photo it has not seen before, the model often has no idea what to do with it, because it has learned to recognize the specific photos from the database rather than patterns/'species'.

All kinds of tricks have been devised to prevent this, the most commonly used option being 'K-fold cross-validation', where during training a different part of the data is held out at each iteration of the process (often around 20% of the data, depending on the size of the database), and the model is then tested/scored on that held-out part at each iteration (again, this could be explained in detail, but this does not really seem the right place).
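As an aside, a minimal Python sketch of K-fold cross-validation with scikit-learn; the classifier and the random features/labels are placeholders, not the actual iNaturalist or ObsIdentify pipeline:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold

# stand-in data: 200 "image feature vectors" with 5 made-up species labels
X = np.random.rand(200, 64)
y = np.random.randint(0, 5, size=200)

scores = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    # each fold holds out ~20% of the data and scores the model on it
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))

print("per-fold accuracy:", [round(s, 2) for s in scores])
print("mean accuracy:", round(float(np.mean(scores)), 2))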

The final percentage of correct predictions in that K-fold cross-validation of the last iteration often already gives a fairly good indication of the quality of the model, but for quality one usually looks at the 'confusion matrix', which makes clear what goes right and what goes wrong (the values and percentages of correctly predicted photos that come out of it could in principle be called 'test values'). The confusion matrix also shows which species an incorrectly predicted species was predicted as, and so on. Those who had statistics in their education will no doubt recognize the terms 'false positive' and 'false negative'. I can imagine that in this case the aim is to keep the number of false positives as low as possible: better no prediction than an incorrect prediction.
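A small illustration of such a confusion matrix, using made-up labels and predictions (the species names are placeholders):

from sklearn.metrics import confusion_matrix

y_true = ["A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "B", "B", "C"]

# rows = true species, columns = predicted species
cm = confusion_matrix(y_true, y_pred, labels=["A", "B", "C"])
print(cm)
# The off-diagonal cell (row A, column B) is a photo of species A predicted as B:
# a false negative for A and, at the same time, a false positive for B.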

The post you refer to here is therefore only meant to fish out things that are overlooked during development. There, the focus is usually on the broad outlines and on optimizing the model in general (possibly with preprocessing, and above all neural network design: layers, number of neurons, learning rate, and a whole lot of other parameters). What a public test phase like the one in that post tries to achieve is to find out whether some 'overfitting' has crept into the model after all (there are always pitfalls, whatever method is chosen to prevent it), and to catch any recurring errors for particular species or species groups before the whole thing is put into use...

Posted by ahospers 10 months ago

Specific to the scenario you raise, the CV suggestion rules already adjust the “raw” list of CV matches to “insert” other sister species seen nearby. From this post, it seems that the suggestion algorithm currently (see the sketch after the list below):
https://forum.inaturalist.org/t/better-use-of-location-in-computer-vision-suggestions/915/32
finds the common ancestor for the top 3 raw results,
searches for additional taxa descending from that ancestor that have been observed within 100 km of the observation’s location, and
inserts those taxa into the list of raw results based on the frequency of nearby observations.
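A rough, illustrative Python sketch of that insertion logic; the ancestor chains and nearby-observation counts are hypothetical stand-ins, not the actual computervision_controller.js implementation linked above:

def common_ancestor(taxa, ancestors):
    # ancestors maps each taxon to its root-first ancestor chain, ending in the taxon itself
    chains = [ancestors[t] for t in taxa]
    shared = None
    for level in zip(*chains):
        if len(set(level)) == 1:
            shared = level[0]
        else:
            break
    return shared

def with_nearby_insertions(raw_results, ancestors, nearby_counts, top_n=3):
    # raw_results: list of (taxon, vision_score) from the model
    # nearby_counts: taxon -> number of observations within ~100 km of the observation
    anc = common_ancestor([t for t, _ in raw_results[:top_n]], ancestors)
    already = {t for t, _ in raw_results}
    extra = [(t, n) for t, n in nearby_counts.items()
             if t not in already and anc in ancestors[t]]
    # insert nearby descendants of the common ancestor, ranked by how often
    # they have been observed nearby
    extra.sort(key=lambda tn: tn[1], reverse=True)
    return raw_results + [(t, None) for t, _ in extra]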
My guess is that this “insertion” process may be failing for Trirhabda observations because the raw CV results do not contain 3 closely related species. There are currently 2,905 putative Trirhabda observations. Of these, 1,195 are identified just as being genus Trirhabda. iNat recognizes 26 total species in the genus. Of these, there are 7 species that have no observations at present.

There are 2-3 Trirhabda species I would expect to be covered by the CV model. The first is Trirhabda bacharidis (currently with 665 observations), which had about 335 verifiable observations when the most recent training dataset was collected on 29 September 2019. The CV model should also be aware of Trirhabda flaviolimbata, which had about 410 verifiable observations by the cut-off date. The third possible species is Trirhabda canadensis, which had about 120 verifiable observations by 29 September 2019. However, it’s possible that fewer than 50 of these had a community ID, which would have excluded the species.

So, when someone uploads an observation, there’s a maximum of 2 or 3 Trirhabda species that could be returned in the result set. For the insertion process to search for other species under Trirhabda, the raw result set would need to ID all those Trirhabda species as the top 3 results. Failing that, the insertion process could kick in at the Family level, if the top 3 results are all in Chrysomelidae, but that spans a huge number of genera and species, so I doubt this would result in additional Trirhabda species being inserted.

So in summary it could be that suggestions for Trirhabda will improve quite a bit once there are 4 or 5 species covered by CV.

But your scenario does suggest that it’s worth looking for any logic tweaks that would better handle Trirhabda observations without degrading suggestions for other scenarios.

Back on your broader proposal, I see benefits for the prioritization you suggest, but this order does cause me concern:

https://forum.inaturalist.org/t/better-use-of-location-in-computer-vision-suggestions/915/32

Posted by ahospers 7 months ago

For a lot of taxa I work with, the species-level suggestions are comprehensive and accurate, even within genera of 5 - 20 plant species. I’m concerned that making the genus-level suggestion more prominent than a high-confidence species ID will result in lots of observations with genus-level initial IDs where in fact CV did a fine job of finding the right species. That creates a lot more work for identifiers.

I would support prioritizing the genus just for those observations where the algorithm can identify factors that call into question the reliability of a good visual match. These might be (a rough sketch follows the list):

Many related species that are not in scope for CV.
High rate of previous misidentifications.
Low CV coverage rate for this geography.
Some variable factor reflecting how amenable each iconic taxon is to image-based ID (e.g. it’s realistic to identify many flowering plants to species or even subspecies level from photographs, but for many arthropods a genus- or family-level ID is the best that can reasonably be expected).
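A hypothetical sketch of that idea; the thresholds and input fields are invented for illustration and are not from iNaturalist’s actual suggestion code:

def suggestion_level(species_score, related_species_not_in_cv, past_misid_rate, regional_cv_coverage):
    # fall back to a genus-level suggestion only when a factor undermines
    # the reliability of an otherwise confident species match
    risky = (related_species_not_in_cv > 3
             or past_misid_rate > 0.2
             or regional_cv_coverage < 0.5)
    return "genus" if risky or species_score < 0.9 else "species"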

The figure you linked to is just a simplified cartoon to provide a general example of what’s happening. It’s actually the top 10 results. If you want to dig into the code, here’s the “common ancestor” calculation. That said, I suspect your explanation for what went wrong with that obs is probably correct. https://github.com/inaturalist/iNaturalistAPI/blob/main/lib/controllers/v1/computervision_controller.js#L368

https://www.tandfonline.com/doi/full/10.1080/00219266.2020.1739114?scroll=top&needAccess=true

From the abstract: “Overall, iNaturalist provided greater taxonomic resolution with proper scientific and common names and additional natural history information for many organisms, piquing student interest. Incorporating this smartphone technology may increase identification of local biodiversity and student engagement in the biological sciences.”

Posted by ahospers 7 months ago

We're currently training a new model based on an export in September that had ~18 million images of 35k+ taxa. It's running with the same setup that we've used on previous models, but with a lot more data, so it will probably take ~210 days and be done some time next Spring. We're simultaneously experimenting with an updated system (TensorFlow 2, Xception vs Inception) that seems to be much faster, e.g. it seems like it might do the same job in 40-60 days, so if it seems like the new system performs about the same as the old one in terms of accuracy, inference speed, etc., we might just switch over to that and have a new model deployed in January or February 2021.
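For what it’s worth, a minimal transfer-learning sketch in TensorFlow 2 / Keras with an Xception backbone, as mentioned above; the image size, class count, and frozen-backbone setup are illustrative assumptions, not iNaturalist’s actual training configuration:

import tensorflow as tf

NUM_CLASSES = 35000  # order of magnitude from the post (~35k taxa); placeholder

# Xception backbone pre-trained on ImageNet, without its classification head
base = tf.keras.applications.Xception(
    include_top=False, weights="imagenet", input_shape=(299, 299, 3))
base.trainable = False  # freeze the backbone for an initial warm-up phase

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=...)  # datasets not shown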

FWIW, COVID has kind of put a hitch in our goal of training 2 models a year. We actually ordered some new hardware right before our local shelter in place orders were issued, and we didn't feel the benefit of the new hardware outweighed the COVID risk of spending extended time inside at the office to assemble everything and get it running. Uncertainty about when it would be safe to do so was part of why we didn't start training a new model in the spring (that and the general insanity of the pandemic), but eventually we realized things weren't likely to get much better any time soon so we just started a new training job on the old system.

The Academy is actually open to the public again now

Posted by ahospers 5 months ago
