Determining biodiversity

A trained machine learning model generates embeddings from images to determine biodiversity scores, addressing limitations in existing methods by assessing arthropod diversity comprehensively, enhancing ecosystem health assessments.

WO2026125225A1 · PCT designated stage · Publication Date: 2026-06-18 · BAYER AG

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
BAYER AG
Filing Date
2025-12-08
Publication Date
2026-06-18

AI Technical Summary

Technical Problem

Existing methods for determining biodiversity, particularly arthropod diversity, are limited to species for which a machine learning model has been trained, restricting the ability to assess biodiversity comprehensively.

Method used

A computer-implemented method using a trained machine learning model to generate embeddings from images of an area, determining a biodiversity score based on the variability of these embeddings, and updating the score over time, allowing for a comprehensive assessment of arthropod diversity.

Benefits of technology

Enables the determination of biodiversity scores that reflect the variability of arthropod species, providing a more comprehensive understanding of ecosystem health and resilience, supporting sustainable agriculture and ecosystem services.

✦ Generated by Eureka AI based on patent content.


Abstract

Systems, methods, and computer programs disclosed herein relate to the determination of biodiversity in an area.

Description

[0001] BYC240213 Foreign Countries

[0002] Determining Biodiversity

[0003] FIELD OF THE DISCLOSURE

[0004] Systems, methods, and computer programs disclosed herein relate to the determination of biodiversity in an area.

[0005] BACKGROUND

[0006] Biodiversity, short for biological diversity, refers to the variety of life in the world or in a particular habitat or ecosystem. It encompasses the diversity of living organisms, including plants, animals, bacteria, fungi, and the genetic information they contain. Biodiversity is usually considered at three levels, genetic diversity, species diversity, and ecosystem diversity. Genetic diversity refers to the variety of genes within a particular species. It includes the differences in genes among the individuals of any one species. Species diversity refers to the variety of species within a habitat or a region. It considers the number of different species and the balance or evenness of the population sizes of all species present. Ecosystem diversity looks at the variety of ecosystems in a given area. An ecosystem is a community of living organisms in conjunction with the nonliving components of their environment, interacting as a system.

[0007] Biodiversity is critical because it contributes to ecosystem services that support life on Earth, including air and water purification, climate regulation, pollination of plants, soil fertility, and the cycling of nutrients. It is also essential for human well-being, providing resources for food, fiber, medicine, and cultural and recreational activities.

[0008] Arthropods are a diverse group of invertebrate animals that belong to the phylum Arthropoda. Arthropods play essential roles in ecosystems as pollinators, as decomposers, and as a part of the food web.

[0009] A high variety of arthropod species in an area plays a crucial role in supporting and enhancing agricultural productivity and sustainability. Arthropods, including insects, spiders, and crustaceans, are integral components of most terrestrial ecosystems, including agricultural landscapes. Their importance in agriculture can be attributed to several factors.

[0010] Many arthropods, especially bees and butterflies, are vital pollinators for a wide range of crops. Pollination is essential for the production of fruits, vegetables, and seeds. Without these pollinators, many crops would have reduced yields or fail to produce.

[0011] A diverse arthropod community includes numerous predatory and parasitic species that can naturally control populations of pest insects. For example, lady beetles, lacewings, and parasitic wasps prey on aphids, caterpillars, and other pests that can damage crops. This biological control reduces the need for chemical pesticides, which can be harmful to the environment and human health.

[0012] Some arthropods, such as certain beetles and ants, contribute to soil health by breaking down organic matter, aerating the soil, and facilitating nutrient cycling. Healthy soils are fundamental for productive agriculture, supporting plant growth and water regulation.

[0013] A diverse arthropod community contributes to overall biodiversity and ecosystem resilience. High biodiversity can make agricultural systems more resilient to disturbances such as diseases, pests, and climate change by providing a range of responses to these challenges.

[0014] The presence of a diverse arthropod population supports sustainable agriculture practices by reducing the reliance on chemical inputs, enhancing natural ecosystem services, and contributing to the balance between pest and predator populations. By providing essential ecosystem services like pollination and pest control, arthropods can significantly increase crop yields and quality, leading to economic benefits for farmers and the agricultural sector.

[0015] It would therefore be desirable to be able to determine the biodiversity in an area. Among other things, this can help to evaluate the success of measures to increase biodiversity in the area, and / or the possible negative impact of certain phenomena, such as deforestation, and / or monocultural agriculture.

[0016] S. Schneider et al. propose determining a diversity index using a trained machine learning model (S. Schneider et al.: Bulk arthropod abundance, biomass and diversity estimation using deep learning for computer vision, Methods Ecol Evol, 2022, 13: 346-357). The machine learning model is trained to classify images of arthropods into one of a number of classes. Diversity D is defined by a formula (not reproduced in this text) in which n is the number of individuals detected per class and N is the total number of individuals detected. With this approach, it is only possible to determine arthropod diversity based on arthropods for which the machine learning model has been trained, and which the trained machine learning model is able to detect.

[0017] SUMMARY

[0018] The present disclosure addresses this and further / other aspects.

[0019] In a first aspect, the present disclosure provides a computer-implemented method, the method comprising:
- providing at least part of a trained machine learning model, wherein the trained machine learning model has been trained using training data to capture features of an image of an arthropod, generate an embedding based on the features, and perform a task based on the embedding,
- over a period of time, receiving a plurality of images of an area,
- for each image of the plurality of images:
  o inputting the image into the trained machine learning model,
  o determining or receiving one or more embeddings from the trained machine learning model, each of the one or more embeddings representing an arthropod depicted in the inputted image,
- determining a biodiversity score based on the embeddings, wherein the biodiversity score is a measure of variability of the embeddings,
- storing and / or outputting the biodiversity score and / or transmitting the biodiversity score to a separate computer system.
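The step of determining a biodiversity score as a measure of variability of the embeddings can be illustrated with a minimal sketch. The two measures below (total variance, i.e. the trace of the covariance matrix, and mean pairwise Euclidean distance) are assumptions chosen for illustration; the text above does not prescribe a particular variability measure, and the function names are hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import fmean


def total_variance(embeddings):
    """Sum of the per-dimension population variances of the embeddings.

    Assumed variability measure: equals the trace of the covariance matrix.
    Returns 0.0 when fewer than two embeddings are available.
    """
    if len(embeddings) < 2:
        return 0.0
    n = len(embeddings)
    dims = len(embeddings[0])
    centroid = [fmean(e[d] for e in embeddings) for d in range(dims)]
    squared = sum((e[d] - centroid[d]) ** 2 for e in embeddings for d in range(dims))
    return squared / n


def mean_pairwise_distance(embeddings):
    """Average Euclidean distance over all pairs of embeddings (alternative measure)."""
    if len(embeddings) < 2:
        return 0.0
    return fmean(dist(a, b) for a, b in combinations(embeddings, 2))


# Toy 2-dimensional embeddings; real embeddings would have hundreds of dimensions.
similar_arthropods = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
varied_arthropods = [[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]]
print(total_variance(similar_arthropods) < total_variance(varied_arthropods))
```

Both measures grow when the embeddings spread out in embedding space, and neither requires that the depicted arthropods belong to a class seen during training.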

[0020] In another aspect, the present disclosure provides a computer system comprising: a processing unit; and a memory storing a computer program configured to perform, when executed by the processing unit, an operation, the operation comprising:
- providing at least part of a trained machine learning model, wherein the trained machine learning model has been trained using training data to capture features of an image of an arthropod, generate an embedding based on the features, and perform a task based on the embedding,
- over a period of time, receiving a plurality of images of an area,
- for each image of the plurality of images:
  o inputting the image into the trained machine learning model,
  o determining or receiving one or more embeddings from the trained machine learning model, each of the one or more embeddings representing an arthropod depicted in the inputted image,
- determining a biodiversity score based on the embeddings, wherein the biodiversity score is a measure of variability of the embeddings,
- storing and / or outputting the biodiversity score and / or transmitting the biodiversity score to a separate computer system.

[0021] In another aspect, the present disclosure provides a non-transitory computer readable storage medium having stored thereon a computer program that, when executed by a processing unit of a computer system, causes the computer system to execute the following steps:
- providing at least part of a trained machine learning model, wherein the trained machine learning model has been trained using training data to capture features of an image of an arthropod, generate an embedding based on the features, and perform a task based on the embedding,
- over a period of time, receiving a plurality of images of an area,
- for each image of the plurality of images:
  o inputting the image into the trained machine learning model,
  o determining or receiving one or more embeddings from the trained machine learning model, each of the one or more embeddings representing an arthropod depicted in the inputted image,
- determining a biodiversity score based on the embeddings, wherein the biodiversity score is a measure of variability of the embeddings,
- storing and / or outputting the biodiversity score and / or transmitting the biodiversity score to a separate computer system.

[0022] BRIEF DESCRIPTION OF THE DRAWINGS

[0023] Fig. 1 shows schematically an example of the training of the machine learning model of the present disclosure.

[0024] Fig. 2 schematically shows an example of how an embedding is generated based on an image using the trained machine learning model of the present disclosure.

[0025] Fig. 3 shows an example of how a biodiversity score is determined.

[0026] Fig. 4 shows an embodiment of the computer-implemented method of the present disclosure in the form of a flow chart.

[0027] Fig. 5 shows another embodiment of the computer-implemented method of the present disclosure in the form of a flow chart.

[0028] Fig. 6 shows another embodiment of the computer-implemented method of the present disclosure in the form of a flow chart.

[0029] Fig. 7 illustrates a computer system according to some example implementations of the present disclosure in more detail.

DETAILED DESCRIPTION

[0030] Various example embodiments will be more particularly elucidated below without distinguishing between the aspects of the disclosure (method, computer system, computer-readable storage medium). On the contrary, the following elucidations are intended to apply analogously to all the aspects of the disclosure, irrespective of in which context (method, computer system, computer-readable storage medium) they occur.

[0031] If steps are stated in an order in the present description or in the claims, this does not necessarily mean that the disclosure is restricted to the stated order. On the contrary, it is conceivable that the steps can also be executed in a different order or else in parallel to one another, unless, for example one step builds upon another step, this requiring that the building step be executed subsequently (this being, however, clear in the individual case). The stated orders may thus be exemplary embodiments of the present disclosure.

[0032] As used herein, the articles “a” and “an” are intended to include one or more items and may be used interchangeably with “one or more” and “at least one”. As used in the specification and the claims, the singular form of “a”, “an”, and “the” include plural referents, unless the context clearly dictates otherwise. Where only one item is intended, the term “one” or similar language is used. Also, as used herein, the terms “has”, “have”, “having”, or the like are intended to be open-ended terms. Further, the phrase “based on” is intended to mean “based at least partially on” unless explicitly stated otherwise.

[0033] Some implementations of the present disclosure will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all implementations of the disclosure are shown. Indeed, various implementations of the disclosure may be embodied in many different forms and should not be construed as limited to the implementations set forth herein; rather, these example implementations are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

[0034] The terms used in this disclosure have the meaning that these terms have in the prior art, in particular in the prior art cited in this disclosure, unless otherwise indicated.

[0035] The present disclosure provides means for determining a biodiversity score for an area.

[0036] In an embodiment of the present disclosure, the biodiversity score is a measure of the variability of species occurring in the area (species diversity).

[0037] In another embodiment of the present disclosure, the biodiversity score is a measure of the variability of arthropod species occurring in the area.

[0038] Arthropods are a diverse group of invertebrate animals that belong to the phylum Arthropoda. Arthropods are classified into several major groups (subphyla and classes), including insects and arachnids.

[0039] In an embodiment of the present disclosure, the term arthropods refers exclusively to insects and arachnids.

[0040] In another embodiment of the present disclosure, the term arthropods refers exclusively to insects.

[0041] In another embodiment of the present disclosure, the term arthropods refers exclusively to adult insects.

[0042] In another embodiment of the present disclosure, the term arthropods refers exclusively to insects in the form of caterpillars.

[0043] In another embodiment of the present disclosure, the term arthropods refers exclusively to flying insects. Flying insects are those that can fly (insects capable of flight) and not necessarily insects that are flying at the time the (reference) image is taken.

[0044] In another embodiment of the present disclosure, the term arthropods refers exclusively to arachnids.

[0045] In another embodiment of the present disclosure, the term arthropods refers exclusively to mites.

Biodiversity is determined based on images.

[0046] An “image” is a representation or reproduction of the physical characteristics of an object, entity, or event, which can be perceived visually. This representation is created by capturing, converting, or interpreting physical signals, which may include but are not limited to electromagnetic waves (such as light), sound waves, or other forms of energy. The term image includes photographs, video recordings, infrared images, ultrasound images, and similar representations.

[0047] An image can be represented in various forms, including two-dimensional (2D) arrays of pixels in grayscale or colour, three-dimensional (3D) models or renderings, and multi-dimensional data capturing different spectral, temporal, or depth dimensions. The content of an image may be perceived by humans or processed by machines for the purpose of analysis, interpretation, and / or decision-making.

[0048] An image in the sense of the present disclosure is usually a digital image or can be converted into a digital image. “Digital” means that the image can be processed by a computer system.

[0049] To determine a biodiversity score, a plurality of images is taken (or generated or captured) of an area over a period of time.

[0050] The term “plurality of images” usually refers to more than 10, e.g. more than 50 or even more than 100. It is also possible that a biodiversity score is determined on the basis of more than 1000 images.

[0051] The period of time can be less than an hour or an hour or several hours or a day or several days (e.g., 2, 3, 4, 5 or 6 or more than 6) or a week or several weeks (e.g., 2, 3, 4, 5 or 6 or more than 6) or more.

[0052] The area is an area in which the species of which images are to be generated may be present.

[0053] The size of the area usually depends on the size of the species from which images are to be generated. The size of the area is usually greater than the species, for example more than 10 times as large or more than 20 times as large or more than 100 times as large.

[0054] The size of the area also usually depends on the imaging device used to create the images. If the imaging device is a camera, for example, the size of the area usually depends on the size of the image sensor, the distance between the image sensor and the area, and the lenses used.
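The dependence of the imaged area on sensor size, distance and lens can be made concrete with the pinhole-camera approximation, in which the covered extent along one axis is roughly the sensor extent multiplied by the distance and divided by the focal length. The sketch below is only an illustration under that approximation (reasonable when the distance is much larger than the focal length); the function name and the camera values are hypothetical and not taken from the disclosure.

```python
def covered_extent_mm(sensor_extent_mm, focal_length_mm, distance_mm):
    """Approximate extent of the imaged area along one sensor axis (pinhole model)."""
    if focal_length_mm <= 0 or distance_mm <= 0:
        raise ValueError("focal length and distance must be positive")
    return sensor_extent_mm * distance_mm / focal_length_mm


# Hypothetical camera: 6.0 mm x 4.5 mm sensor, 4.0 mm lens, mounted 150 mm above the surface.
width_mm = covered_extent_mm(6.0, 4.0, 150.0)
height_mm = covered_extent_mm(4.5, 4.0, 150.0)
print(width_mm, height_mm)  # 225.0 168.75
```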

[0055] In an embodiment of the present disclosure, the area is flat and / or comprises a flat surface.

[0056] The area and / or the surface may be triangular, quadrangular (e.g., rectangular or square), pentagonal, hexagonal, or generally n-angular, where n is an integer greater than two. The area and / or the surface may also be round or elliptical or have some other shape.

[0057] In an embodiment of the present disclosure, the area and / or the surface is flat and rectangular, although the corners may be rounded, and extends perpendicular to the direction of gravity.

[0058] In an embodiment of the present disclosure, the area and / or surface has an extension in the range of 100 mm x 200 mm to 250 mm x 250 mm.

[0059] In an embodiment of the present disclosure, the area and / or surface has an extension in the range of 100 mm x 200 mm to 200 mm x 250 mm.

[0060] In an embodiment of the present disclosure, the area and / or surface has an extension in the range of 100 mm x 160 mm to 130 mm x 190 mm.

[0061] In an embodiment of the present disclosure, the area and / or surface has an extension in the range of 160 mm x 210 mm to 180 mm x 230 mm.

[0062] The area is usually outdoors, but it can also be in a foil tunnel (polytunnel) or a greenhouse, for example.

[0063] In an embodiment of the present disclosure, the area is located in or near a field for growing crops. “Near” can mean that the area is located at a distance from a field boundary that is no greater than, for example, 1 meter or 5 meters or 10 meters or 20 meters or 50 meters or 100 meters or 500 meters. A “field” is understood to mean a spatially delimitable region of the Earth's surface on which plants grow. For example, a field may be at least partly utilized agriculturally in that crop plants are planted, supplied with nutrients and harvested. A field may be or comprise a silviculturally utilized region of the Earth's surface (for example a forest). Gardens, parks or the like in which vegetation exists solely for human pleasure are covered by the term “field”. The term “field” shall also comprise the terms “orchard” and / or “plantation”.

[0064] In another embodiment of the present disclosure, the area is in a greenhouse.

[0065] In another embodiment of the present disclosure, the area is in a foil tunnel.

[0066] In another embodiment of the present disclosure, the area is in an urban region.

[0067] In another embodiment of the present disclosure, the area is in a non-urban region, such as a nature reserve, park, forest and / or other biome.

[0068] The area is usually chosen to represent a larger area (region). In other words, to determine biodiversity in a larger area (region), an area may be selected in the larger area (region) to represent the larger area (region).

[0069] It is also possible to generate images of several areas. For example, it is possible that several imaging devices are distributed in a larger area (region) and each imaging device is used to generate images of an area within the larger area (region).

[0070] It is possible to generate images of the area at defined times or at defined time intervals or when defined events occur. If a species, e.g. an arthropod, is present in the area at the time the image is taken, the arthropod will be depicted in the image. It is therefore possible that images taken of the area do not show species (e.g. arthropods).

[0071] In an embodiment of the present disclosure, the images are intended to depict the variability of species (e.g. arthropods) or the variability of a certain species in the area. It is therefore possible that a large number of images may need to be generated over a long period of time in order to determine a meaningful biodiversity score. In other words, the number of images and / or the length of the period of time over which images are generated is determined based on the frequency with which species enter the area.

[0072] In an embodiment of the present disclosure, a biodiversity score is progressively determined and / or updated on the basis of the images generated so far. If this biodiversity score does not change even when further images are added, or does not change within a predefined range, it can be assumed that the number of images and / or the length of the period of time were sufficient.

[0073] For example, a new biodiversity score can be determined whenever a new image is generated and / or received. Likewise, it is possible to determine an updated biodiversity score when a predefined number of images have been generated and / or received, e.g. 5 or 10 or 20 or any other number. The number may be fixed or may depend on how much the biodiversity score changes when new images are added: if it changes a lot, the number of images can be higher than if it changes less.
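The progressive update of the biodiversity score and the check whether it still changes can be sketched as follows. This is a minimal illustration, assuming the score is the total variance of the embeddings (updated incrementally with Welford's algorithm) and assuming a relative tolerance over a sliding window as the "predefined range"; the class name, function name, window size and tolerance are hypothetical.

```python
class RunningTotalVariance:
    """Incrementally updated total variance of embeddings (Welford's algorithm)."""

    def __init__(self, dims):
        self.count = 0
        self.mean = [0.0] * dims
        self.m2 = [0.0] * dims  # per-dimension sum of squared deviations from the mean

    def add(self, embedding):
        """Fold one new embedding into the running statistics."""
        self.count += 1
        for d, x in enumerate(embedding):
            delta = x - self.mean[d]
            self.mean[d] += delta / self.count
            self.m2[d] += delta * (x - self.mean[d])

    @property
    def score(self):
        """Current score: sum of per-dimension population variances."""
        return sum(self.m2) / self.count if self.count > 1 else 0.0


def score_has_stabilized(history, window=5, tolerance=0.05):
    """True if the last `window` scores vary by at most `tolerance` relative to their mean."""
    if len(history) < window:
        return False
    recent = history[-window:]
    mean = sum(recent) / window
    if mean == 0:
        return all(s == 0 for s in recent)
    return (max(recent) - min(recent)) <= tolerance * abs(mean)


# Usage: update the score after every newly received embedding and record it.
tracker = RunningTotalVariance(dims=2)
history = []
for embedding in [[0.0, 0.0], [2.0, 0.0], [1.0, 1.0], [1.0, -1.0]]:
    tracker.add(embedding)
    history.append(tracker.score)
print(history[-1], score_has_stabilized(history, window=3))
```

An incremental update avoids recomputing the score over all stored embeddings each time a new image arrives, which matters when images accumulate over days or weeks.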

[0074] It is also possible to determine the biodiversity score at defined points in time (e.g. once an hour or once a day), at defined time intervals (e.g. every 6 hours) and / or when defined events occur.

[0075] It is possible that attractants are used to lure species into the area. Such attractants may increase the frequency with which species enter the area. However, it should be kept in mind that such an attractant usually only attracts certain species, so that these may be depicted in images of the area more frequently than species that are not attracted by the attractant. However, it is possible that it is desirable to attract certain species, e.g. to determine the variability within these species.

[0076] The area may be designed in a color (e.g. yellow or red) that attracts specific arthropods. In addition to or instead of a color, other means of attracting arthropods may be present. For example, the use of a pheromone or food or an odorant that simulates a food source is conceivable. The use of a source of electromagnetic radiation in the infrared, visible and / or ultraviolet range to attract (specific) arthropods is also conceivable. The use of sounds that imitate, for example, males and / or females ready to mate is also conceivable. The use of special patterns that imitate a plant, for example, is also conceivable.

[0077] The area may include means for immobilizing species (e.g. arthropods). This may be, for example, a surface coated with an adhesive (e.g. a glue). This may be a container filled with a liquid (e.g. water). An agent (e.g. a surfactant) may be added to the liquid to reduce the surface tension.

[0078] The area can be part of a trap for certain species (e.g. arthropods). In an embodiment of the present disclosure, the trap comprises a container filled with a liquid as described for example in WO2020 / 058175A1, WO2020 / 058170A1, WO2021 / 213824A1 or WO2022 / 243150A1. In another embodiment of the present disclosure, the trap comprises a surface provided with an adhesive, as described, for example, in WO2023 / 043871A1, WO2018 / 131853A1 or WO2004 / 095919A2. In another embodiment of the present disclosure, the trap comprises a tent-like frame defining an interior space into which arthropods can enter. Such traps are also known as delta traps (see, for example, WO2018 / 078638A1).

[0079] However, if possible, means for immobilizing species should be avoided, as they can harm the species.

[0080] If means of immobilization are used, it should be noted that the same immobilized species may be imaged in two consecutive images. To ensure that the immobilized species is only considered once, it is possible to only consider image areas in which changes have occurred for consecutive images. Such changes can be detected, for example, by subtracting the consecutive images from each other.
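Detecting changed image areas by subtracting consecutive images can be sketched as below. The example assumes grayscale images of equal size given as nested lists of pixel intensities and a hypothetical intensity threshold to suppress sensor noise; a practical system would additionally have to handle lighting changes and camera movement, which are not addressed here.

```python
def changed_mask(previous, current, threshold=30):
    """Per-pixel mask that is True where the absolute intensity difference exceeds the threshold."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


def bounding_box(mask):
    """Smallest (top, left, bottom, right) box containing all changed pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, changed in enumerate(row) if changed]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# Toy 4 x 4 frames: a new object appears in rows 1-2, column 2 of the second frame.
frame_a = [[0] * 4 for _ in range(4)]
frame_b = [row[:] for row in frame_a]
frame_b[1][2] = 200
frame_b[2][2] = 180

print(bounding_box(changed_mask(frame_a, frame_b)))  # (1, 2, 2, 2)
```

Only the region inside the returned box would then be passed to the trained machine learning model, so that an arthropod immobilized in an earlier image does not contribute a second embedding.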

[0081] In general, images can be analyzed to see if they show a species (e.g. an arthropod) at all. Images that do not show a species (e.g. an arthropod) can be discarded.

[0082] The biodiversity score is determined using a trained machine learning model.

[0083] Machine learning is a subset of artificial intelligence that involves the development of algorithms and statistical models that enable computers to perform tasks without explicit programming. In other words, machine learning involves teaching computers to learn and make decisions or predictions based on data.

[0084] The term “machine learning model”, as used herein, may be understood as a computer implemented data processing architecture. The machine learning model can receive input data and provide output data based on that input data and on parameters of the machine learning model (model parameters). The machine learning model can learn a relation between input data and output data through training. In training, parameters of the machine learning model may be adjusted in order to provide a desired output for a given input.

[0085] The process of training a machine learning model involves providing a machine learning algorithm (that is the learning algorithm) with training data to learn from. The term “trained machine learning model” refers to the model artifact that is created by the training process. The training data usually contains the correct answer, which is referred to as the target. The learning algorithm finds patterns in the training data that map input data to the target, and it outputs a trained machine learning model that captures these patterns.

[0086] In the training process, input data are inputted into the machine learning model and the machine learning model generates an output. The output may be compared with the (known) target. Parameters of the machine learning model may be modified in order to reduce the deviations between the output and the (known) target to a (defined) minimum.

[0087] In general, a loss function can be used for training, where the loss function can quantify the deviations between the output and the target. The aim of the training process can be to modify (adjust) parameters of the machine learning model in order to reduce the loss to a (defined) minimum. The loss function is usually minimized using an optimization method, e.g. a gradient descent method.

The machine learning model of the present disclosure is configured and was trained using training data to capture features of an image of an arthropod, generate an embedding based on the features, and perform a task based on the embedding.

[0088] Such an “embedding”, sometimes also referred to as “feature vector”, is a numerical representation of an object. The numerical representation is usually an array of numbers, such as a vector, a matrix, a tensor or any other arrangement of numbers. In other words, the term feature vector is not limited to vectors, but can also include other arrangements of numbers.

[0089] The embedding generated by the trained machine learning model is a numerical representation of an arthropod depicted in the image.

[0090] In an embodiment of the present disclosure, the embedding is a fixed-size unidimensional array of real numbers.

[0091] In an embodiment of the present disclosure, the embedding has a number of dimensions greater than 100, for example greater than 200 or greater than 500 or greater than 700.

[0092] The embedding might be composed of various measurements and / or characteristics extracted from an image and / or a part thereof that collectively represent a species (e.g. an arthropod) within the image.

[0093] The machine learning model of the present disclosure may be configured and trained to analyse the image and extract features such as edges, corners, and / or specific textures into an embedding.

[0094] This is explained using a CNN as an example; however, it should be noted that the machine learning model of the present disclosure can also be or comprise a different model than a CNN.

[0095] Convolutional neural networks (CNNs) are a class of deep neural networks that are particularly powerful for tasks related to image processing, such as object detection. They can automatically and adaptively learn spatial hierarchies of features from input images. CNNs comprise layers that perform operations on the input image. The primary layers include convolutional layers, pooling layers, and fully connected layers. The convolutional layers apply filters (also known as kernels) to the input image to create feature maps. These filters are designed to detect specific features such as edges, textures, or more complex patterns as the network goes deeper. Pooling layers reduce the dimensionality of the feature maps, making the detection of features invariant to scale and orientation and reducing the computational load. Toward the end of the network, fully connected layers aggregate the data from the feature maps. In the context of generating embeddings (feature vectors), the output from one of the last fully connected layers may serve as an embedding (feature vector).
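How an embedding can be read from a fully connected layer that precedes the classification output is sketched below with a deliberately tiny forward pass in plain Python. Everything in it is an assumption for illustration: the layer sizes, the random (untrained) weights, and the names `forward` and `random_params` are hypothetical, and a real model would be far larger and would be built with a deep learning framework.

```python
import random


def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """Keep the maximum of each non-overlapping 2 x 2 block."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]


def dense(vector, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, vector)) + b for row, b in zip(weights, biases)]


def forward(image, params):
    """Return (embedding, logits); the embedding is the output of the last hidden dense layer."""
    features = []
    for kernel in params["kernels"]:
        pooled = max_pool_2x2(relu(conv2d_valid(image, kernel)))
        features.extend(v for row in pooled for v in row)
    embedding = [max(0.0, v) for v in dense(features, params["w_embed"], params["b_embed"])]
    logits = dense(embedding, params["w_class"], params["b_class"])
    return embedding, logits


def random_params(image_size=8, n_kernels=2, embed_dim=4, n_classes=3, seed=0):
    """Untrained placeholder weights; a real model would obtain these by training."""
    rng = random.Random(seed)
    pooled = (image_size - 3 + 1) // 2
    n_features = n_kernels * pooled * pooled

    def rand(n):
        return [rng.uniform(-1.0, 1.0) for _ in range(n)]

    return {
        "kernels": [[rand(3) for _ in range(3)] for _ in range(n_kernels)],
        "w_embed": [rand(n_features) for _ in range(embed_dim)],
        "b_embed": [0.0] * embed_dim,
        "w_class": [rand(embed_dim) for _ in range(n_classes)],
        "b_class": [0.0] * n_classes,
    }


image = [[((r * c) % 7) / 6 for c in range(8)] for r in range(8)]
embedding, logits = forward(image, random_params())
print(len(embedding), len(logits))  # 4 3
```

For determining a biodiversity score, only the `embedding` output is needed; the classification logits are what the model is trained on, but they can be ignored at inference time.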

[0096] The task to be performed by the machine learning model during training may be or comprise a classification. In other words, the machine learning model may have been trained to classify an arthropod into one of at least two classes based on an image of the arthropod. That is, the machine learning model of the present disclosure may be or may comprise a classification model. A “classification model” is a type of machine learning model that is used to separate data into specific categories or classes. The goal of a classification model is to predict the category or class of unseen or new data based on the learning from past observations.

[0097] The machine learning model of the present disclosure may be or may comprise a multiclass classification model. A “multiclass classification model” is a type of machine learning model that is used when there are more than two classes to predict. In other words, the machine learning model of the present disclosure may be configured and may have been trained to assign an image to one of a number of classes. The number of classes is greater than 3. For example, the number of classes can be 10 or more than 10, or 100 or more than 100.

[0098] A class may indicate to which class, order, family, genus, and / or species in the sense of biological taxonomy a certain species belongs. A class may be a combination of different biological taxonomy levels. A class may be defined as a set of species, or a union of two or more families, for example.

Such a multiclass classification model is usually trained on training data, wherein the training data comprise a multitude of reference images as input data. Each reference image of the multitude of reference images depicts an arthropod. It is possible that one or more images depict more than one arthropod.

[0099] The term “multitude of reference images” means more than 10, e.g. more than 100 or even more than 1000.

[0100] The term “reference image” is used in this disclosure to distinguish images used to train the machine learning model from images used to determine the biodiversity score. The term “reference image” is used to avoid lack of clarity objections in the examination proceedings of the present patent application. The term “reference” has no other restrictive meaning.

[0101] For training a multiclass classification model, the training data further comprise, for each reference image of the multitude of reference images, an information about a class to which the arthropod depicted in the reference image belongs as target data (class information).

[0102] The training of a multiclass classification model usually involves the following steps: for each reference image of the multitude of reference images: inputting the reference image depicting an arthropod into the machine learning model, receiving an output from the machine learning model, wherein the output indicates the class to which the machine learning model has assigned the arthropod depicted in the reference image, determining a deviation between the output and the target data, reducing the deviation by modifying parameters of the machine learning model.

[0103] The training steps can be carried out until a stop criterion is reached. Such a stop criterion can be for example: a predefined maximum number of training steps / cycles / epochs has been performed, deviations between output and target data can no longer be reduced by modifying the model parameters, a predefined minimum of the loss function is reached, and / or an extreme value (e.g., maximum or minimum) of another performance value is reached.
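
As a non-limiting illustration, the training steps and stop criteria described above can be sketched as follows. The sketch assumes a toy linear classifier operating on short feature vectors instead of a neural network operating on images; the function names (softmax, predict, train), the learning rate, and the thresholds are hypothetical and chosen only for this example.

```python
import math

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, x):
    """Return class probabilities for one input vector."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return softmax(logits)

def train(samples, labels, num_classes, lr=0.5, max_epochs=500,
          target_loss=0.05, min_improvement=1e-6):
    """Toy training loop (assumption: linear model, plain gradient descent)."""
    weights = [[0.0] * len(samples[0]) for _ in range(num_classes)]
    previous_loss = float("inf")
    for epoch in range(1, max_epochs + 1):
        epoch_loss = 0.0
        for x, target in zip(samples, labels):
            probs = predict(weights, x)               # output of the model
            epoch_loss += -math.log(probs[target])    # deviation from the target data
            for c in range(num_classes):              # reduce the deviation
                grad = probs[c] - (1.0 if c == target else 0.0)
                for j, value in enumerate(x):
                    weights[c][j] -= lr * grad * value
        epoch_loss /= len(samples)
        # Stop criteria: loss threshold, no further improvement, epoch limit.
        if epoch_loss <= target_loss:
            return weights, epoch, "loss threshold reached"
        if previous_loss - epoch_loss < min_improvement:
            return weights, epoch, "no further improvement"
        previous_loss = epoch_loss
    return weights, max_epochs, "maximum number of epochs reached"

# Hypothetical stand-ins for reference images (features) and class information.
samples = [[1.0, 0.0, 1.0], [0.9, 0.1, 1.0], [0.0, 1.0, 1.0], [0.1, 0.9, 1.0]]
labels = [0, 0, 1, 1]
weights, epochs, reason = train(samples, labels, num_classes=2)
```

The returned reason indicates which of the stop criteria ended the training in this toy setting.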

[0104] The multiclass classification model classifies an arthropod depicted in a reference image based on an embedding, the embedding representing the arthropod.

[0105] Usually, reference images of different species are used to train the machine learning model. The more different the species look / appear in the reference images, the more features the model learns to differentiate between the different species.

[0106] In an embodiment of the present disclosure, the machine learning model is trained using reference images of various arthropods.

[0107] It should be noted that the machine learning model of the present disclosure does not have to be trained based on reference images of a particular species in order to determine a biodiversity score on the basis of images showing this particular species. In other words, if the trained machine learning model is fed images of one or more certain species to determine a biodiversity value, the model does not necessarily have to be trained on the basis of reference images of these one or more certain species; it can have been trained on the basis of reference images of other species. In other words, even though the machine learning model was trained on images showing spiders, the trained machine learning model can be used to determine a biodiversity score based on images showing insects and vice versa.

[0108] The task to be performed by the machine learning model during training may be or comprise a segmentation. In other words, the machine learning model may be or comprise a segmentation model.

[0109] Segmentation refers to the process of partitioning an image into multiple segments or regions, making it easier to analyze and understand the image content. The aim is to identify and / or delineate objects (e.g. arthropods) or regions of interest within an image, which can be particularly useful for tasks such as object recognition, scene understanding, and image analysis.

[0110] Segmentation usually involves classifying each pixel in the image into predefined categories. For example, in an image containing various objects, each pixel might be labelled as belonging to a specific object class (e.g., “background”, “arthropod”, “plant”).

[0111] The segmentation model may have been trained on training data, wherein the training data usually comprises a multitude of reference images (input data), each reference image showing one or more arthropods, and, for each reference image, an information about which pixels of the reference image represent an arthropod (target data).

[0112] The training of a segmentation model usually involves the following steps: for each reference image of the multitude of reference images: inputting the reference image depicting one or more arthropods into the machine learning model, receiving an output from the machine learning model, wherein the output indicates which pixels of the reference image the machine learning model has assigned to an arthropod, determining a deviation between the output and the target data, reducing the deviation by modifying parameters of the machine learning model.

[0113] The training steps can be carried out until a stop criterion is reached. Such a stop criterion can be for example: a predefined maximum number of training steps / cycles / epochs has been performed, deviations between output and target data can no longer be reduced by modifying the model parameters, a predefined minimum of the loss function is reached, and / or an extreme value (e.g., maximum or minimum) of another performance value is reached.

[0114] The segmentation model may be or comprise a Convolutional Neural Network (CNN). The segmentation model may be or comprise a U-Net, Mask R-CNN, or DeepLab, for example. These models are designed to capture spatial hierarchies and features in images effectively.

[0115] Segmentation by classifying individual pixels is also referred to as semantic segmentation. The machine learning model of the present disclosure may also be trained to perform instance segmentation. Instance segmentation involves identifying and delineating each individual object instance within an image. Unlike semantic segmentation, which classifies each pixel into a category, instance segmentation assigns a unique label to each object instance. This means that if there are multiple instances of the same object class in an image, each instance will have its own distinct segmentation mask.
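
As a non-limiting illustration of the difference between the two output formats, the following sketch derives per-instance labels from a binary semantic mask by connected-component labelling. This is not how a trained instance segmentation model works (touching arthropods would be merged, for example); it is only an assumption-laden way to show that a semantic mask carries one label per category whereas an instance mask carries one label per object. The function name is hypothetical.

```python
from collections import deque

def instances_from_semantic_mask(mask):
    """Give each 4-connected foreground region its own positive label."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 1 and labels[r][c] == 0:
                next_label += 1
                labels[r][c] = next_label
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels, next_label

# Semantic mask: 1 = "arthropod", 0 = "background".
semantic_mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
instance_labels, count = instances_from_semantic_mask(semantic_mask)
```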

[0116] To train a machine learning model for segmenting arthropods in images using instance segmentation, training data is required that comprise a multitude of reference images (input data), each reference image showing one or more arthropods, and, for each reference image, annotations that mark individual instances of arthropods, e.g., by means of a bounding box or a mask, as target data.

[0117] Machine learning models suitable for instance segmentation include, for example, Mask R-CNN, YOLACT, and / or DeepLab.

[0118] The task to be performed by the machine learning model during training may be or comprise a localization. In other words, the machine learning model may be or comprise a localization model.

[0119] The term “localize” (also referred to as “locate”) refers to finding the position of an object (e.g., an arthropod) in an image. Object localization answers the question of “where” an object can be found in an image.

[0120] Localizing can be done, for example, by specifying a bounding box around an object (e.g., an arthropod) depicted in an image. A “bounding box” may be a rectangular box defined by the coordinates of its corners that encapsulates an object of interest within an image. The bounding box is characterized by its position, typically specified by the coordinates of the top-left corner (x1, y1) and the bottom-right corner (x2, y2), or alternatively by the center coordinates (cx, cy) along with its width (w) and height (h). The bounding box serves as a spatial representation that delineates the extent of the object, enabling its identification and / or analysis.
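
As a non-limiting illustration, the two bounding box representations mentioned above (corner coordinates, and center coordinates with width and height) can be converted into each other as sketched below. The function names are hypothetical, and the sketch assumes an image coordinate system in which x grows to the right and y grows downwards.

```python
def corners_to_center(x1, y1, x2, y2):
    """(top-left, bottom-right) corners -> (cx, cy, w, h)."""
    w = x2 - x1
    h = y2 - y1
    return (x1 + w / 2, y1 + h / 2, w, h)

def center_to_corners(cx, cy, w, h):
    """(cx, cy, w, h) -> (x1, y1, x2, y2)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

box_center = corners_to_center(10, 20, 50, 80)
box_corners = center_to_corners(*box_center)
```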

[0121] Bounding boxes are often used to mark objects in images. It should be noted that the bounding box does not necessarily have to be rectangular; other geometric shapes are also suitable for marking objects, e.g. circles, ellipses, hexagons or other shapes. In this respect, the term “bounding box” is to be interpreted broadly and is not limited to rectangular boxes.

[0122] To train such a localization model, the training data usually comprises a multitude of reference images (input data), each reference image showing an arthropod, and, for each reference image, coordinates of a bounding box that surrounds the arthropod in the reference image.

[0123] The machine learning model may be a bounding box regression model. This approach involves training a model to directly predict the coordinates of the bounding boxes based on the input images. The model may be a neural network that takes an image and outputs the coordinates of the bounding box. The loss function may be designed to minimize the difference between the predicted bounding box coordinates and the ground truth coordinates.

[0124] The machine learning model may be a Region Proposal Network (RPN). While RPNs are often part of more complex object detection frameworks like Faster R-CNN, they can be adapted to focus solely on generating region proposals (bounding boxes). The RPN generates a set of bounding box proposals based on the features extracted from the input image, and it can be trained to optimize the location of these boxes.

[0125] Another approach may be to use keypoint detection models that identify specific points of interest on the arthropod (e.g., corners or edges). By determining the positions of these keypoints, a bounding box may be constructed around the detected arthropod.
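
As a non-limiting illustration, a bounding box may be constructed from detected keypoints by taking the extreme coordinates, optionally enlarged by a margin. The function name and the margin parameter are assumptions made for this sketch.

```python
def bounding_box_from_keypoints(keypoints, margin=0.0):
    """Smallest axis-aligned box (x1, y1, x2, y2) enclosing all (x, y) keypoints."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)

# Hypothetical keypoints, e.g. head, abdomen tip, and one leg tip.
keypoints = [(12, 30), (40, 22), (25, 55)]
tight_box = bounding_box_from_keypoints(keypoints)
padded_box = bounding_box_from_keypoints(keypoints, margin=2)
```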

[0126] The task to be performed by the machine learning model during training may be an object recognition (e.g., arthropod recognition). The task to be performed by the machine learning model during training may be a combined localization and classification. In other words, the machine learning model may have been trained to locate and classify species (e.g., arthropods) in images.

[0127] The term “classify” refers to the assignment of an object to a defined class and / or category. Object classification answers the question of “what” is in the image.

[0128] The term “object recognition” means the joint localization and classification of an object in an image.

[0129] In the present disclosure, the term “object” refers to a species, e.g. an arthropod. Training data for training an object recognition model for arthropods usually comprise a multitude of reference images (input data), each reference image showing one or more arthropods, and, for each reference image, an information about the position of the one or more arthropods in the reference image (e.g., in the form of bounding box coordinates) as target data.

[0130] The training of the object recognition model may comprise the following steps: for each reference image of the plurality of reference images: inputting the reference image into the machine learning model, receiving an output from the machine learning model, the output indicating the location and class of the one or more arthropods depicted in the reference image, comparing the output with the target data, i.e. comparing a predicted location of the arthropod in the reference image with the real location of the arthropod in the reference image, and comparing a predicted class of the located arthropod with the real class of the located arthropod, determining a deviation between the output and the target data, i.e. determining a first deviation between the predicted location and the real location and a second deviation between the predicted class and the real class, reducing the deviation(s) by modifying parameters of the machine learning model.

[0131] The deviation(s) can be determined using a loss function.

[0132] Training a model for object recognition in images may involve predicting the coordinates of a bounding box that accurately encases the object of interest. This task requires a loss function that can effectively quantify the difference between the predicted bounding box coordinates and the ground truth (i.e. the real bounding box coordinates). Suitable localization loss functions are: Mean Squared Error (MSE), Intersection over Union (IoU), smooth L1 loss, and / or Huber loss or combinations thereof.
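
As a non-limiting illustration, two of the localization losses named above can be sketched as follows for boxes given as (x1, y1, x2, y2). The function names and the beta parameter of the smooth L1 loss are assumptions of this sketch; an IoU-based loss is commonly taken as one minus the IoU.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def smooth_l1(pred, target, beta=1.0):
    """Quadratic for small errors, linear for large ones, averaged over coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)

predicted = (0.0, 0.0, 2.0, 2.0)
ground_truth = (1.0, 1.0, 3.0, 3.0)
iou_loss = 1.0 - iou(predicted, ground_truth)
l1_loss = smooth_l1([0.0, 0.0, 0.0, 0.0], [0.5, 2.0, 0.0, 0.0])
```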

[0133] Examples of classification losses are Binary Cross Entropy, and Negative Log-likelihood.

[0134] The total loss may be a sum of the localization loss and classification loss. The losses can be weighted. The weights can vary during training.
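
As a non-limiting illustration, a weighted total loss with a weight that varies during training can be sketched as follows. The negative log-likelihood is used as the classification loss; the linear weight schedule and all names and default values are hypothetical.

```python
import math

def negative_log_likelihood(class_probs, true_class):
    """Classification loss: negative log of the probability of the real class."""
    return -math.log(class_probs[true_class])

def total_loss(loc_loss, cls_loss, w_loc=1.0, w_cls=1.0):
    """Weighted sum of localization loss and classification loss."""
    return w_loc * loc_loss + w_cls * cls_loss

def loc_weight_schedule(epoch, start=2.0, end=1.0, num_epochs=10):
    """Hypothetical schedule: the localization weight moves linearly from start to end."""
    frac = min(epoch, num_epochs) / num_epochs
    return start + frac * (end - start)

cls_loss = negative_log_likelihood([0.7, 0.2, 0.1], true_class=0)
loss_epoch_0 = total_loss(0.4, cls_loss, w_loc=loc_weight_schedule(0))
loss_epoch_10 = total_loss(0.4, cls_loss, w_loc=loc_weight_schedule(10))
```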

[0135] The training steps can be carried out until a stop criterion is reached. Such a stop criterion can be for example: a predefined maximum number of training steps / cycles / epochs has been performed, deviations between output and target data can no longer be reduced by modifying the model parameters, a predefined minimum of the loss function is reached, and / or an extreme value (e.g., maximum or minimum) of another performance value is reached.

[0136] As described, the machine learning model may perform localization and / or classification of one or more species (e.g. arthropods) based on a (reference) image. It is possible that the machine learning model uses further data for localization and / or classification, for example information about the (reference) image (where the (reference) image was taken, which area the (reference) image shows, when the (reference) image was taken (e.g. time, season), at which magnification the (reference) image shows one or more species) and further / other data.

[0137] If such additional training data was used when training the machine learning model, such data can also be entered into the trained model when using the trained model to determine a biodiversity score.

[0138] Various models for object localization and / or classification based on images are disclosed in the state of the art.

[0139] For example, the machine learning model may be or comprise an R-CNN model (R-CNN: Regions with Convolutional Neural Network, see, e.g., R. Girshick et al.: Rich feature hierarchies for accurate object detection and semantic segmentation, arXiv:1311.2524v5).

[0140] The idea behind R-CNN is to take a two-step approach to object detection: first, a number of candidate regions or bounding boxes is identified in an image where there is a high likelihood of finding an object. This is about pinpointing where something of interest might be. This step is crucial because it significantly reduces the number of locations the model needs to evaluate, compared to looking at every possible location in the image. For each proposed region, a CNN is applied to extract features from the region and then classify the object within the proposed region. Additionally, it refines the bounding box coordinates to more accurately encompass the object.

[0141] The machine learning model may be or comprise a Fast R-CNN or a Faster R-CNN model (see, e.g. R. Girshick: Fast R-CNN, arXiv:1504.08083v2; S. Ren et al.: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, arXiv:1506.01497v3). The original R-CNN has inspired a series of improvements and iterations, leading to faster and more efficient models like Fast R-CNN and Faster R-CNN. These models have addressed many of the limitations of the original R-CNN, especially regarding speed and training complexity.

[0142] The machine learning model may be or comprise a YOLO model (YOLO: You Only Look Once, see, e.g. J. Redmon et al.: You Only Look Once: Unified, Real-Time Object Detection, arXiv:1506.02640v5). Unlike the R-CNN family of models, which first propose regions and then classify them, YOLO frames object detection as a single regression problem, directly moving from image pixels to bounding box coordinates and class probabilities. This is achieved by dividing the image into a grid (e.g., a 13x13 grid), and for each grid cell, predicting bounding boxes and probabilities for those boxes. Each bounding box prediction includes: the coordinates of the center of the box (relative to the grid cell location), the width and height of the box (relative to the whole image), a confidence score indicating the likelihood that the box contains an object, and class probabilities indicating the likelihood of each class being present in the box.
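
As a non-limiting illustration, a single grid cell prediction of the kind described above can be decoded into absolute image coordinates as sketched below. The layout of the prediction vector (tx, ty, tw, th, confidence, followed by class probabilities) and the function name are assumptions of this sketch; actual YOLO versions differ in their parameterization (e.g. anchor boxes).

```python
def decode_cell_prediction(row, col, pred, grid_size, image_w, image_h):
    """Decode one cell prediction into an absolute box, class index, and score."""
    tx, ty, tw, th, confidence = pred[:5]
    class_probs = pred[5:]
    cell_w = image_w / grid_size
    cell_h = image_h / grid_size
    cx = (col + tx) * cell_w      # center is given relative to the grid cell
    cy = (row + ty) * cell_h
    w = tw * image_w              # size is given relative to the whole image
    h = th * image_h
    best = max(range(len(class_probs)), key=class_probs.__getitem__)
    return {
        "box": (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2),
        "class": best,
        "score": confidence * class_probs[best],
    }

# Hypothetical prediction of the cell in row 6, column 6 of a 13x13 grid.
prediction = [0.5, 0.5, 0.25, 0.5, 0.9, 0.1, 0.8, 0.1]
detection = decode_cell_prediction(6, 6, prediction, grid_size=13,
                                   image_w=416, image_h=416)
```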

[0143] The machine learning model may be or comprise an SSD (Single Shot MultiBox Detector, see, e.g. W. Liu et al.: SSD: Single Shot MultiBox Detector, arXiv:1512.02325v5). SSD addresses some of the limitations of earlier detection systems, like YOLO, by providing a method that is both fast and capable of detecting objects at multiple scales. SSD simplifies the object detection workflow by eliminating the need for a separate proposal generation step, which is a characteristic of two-stage detectors like those in the R-CNN family. Instead, SSD performs detection in a single shot, hence the name, by predicting both the bounding box locations and class probabilities directly from the image in one go.

[0144] The machine learning model may be or comprise a DETR model (DETR: Detection Transformer, see, e.g. N. Carion et al.: End-to-End Object Detection with Transformers, arXiv:2005.12872v3). DETR simplifies the object detection pipeline by eliminating the need for many hand-designed components that are common in traditional object detection systems, such as non-maximum suppression and anchor generation. Instead, DETR leverages the Transformer architecture, originally designed for natural language processing tasks, to directly predict object bounding boxes and their corresponding class labels from an image in a more end-to-end manner. DETR treats object detection as a direct set prediction problem. It uses a transformer, combined with a CNN, to process the input image. The transformer architecture is adept at handling sequences of data and capturing long-range dependencies, which, in the context of DETR, allows it to consider the entire image and the relationships between different objects within the image to make accurate predictions.

[0145] The DETR model can be trained end-to-end. The CNN of the DETR model can be pre-trained.

[0146] The term “pre-trained” refers to a machine learning model that has undergone an initial training phase on a large and diverse dataset before being fine-tuned and / or applied to specific tasks. Pre-training involves learning the underlying patterns, structures, and representations of data, which can then be leveraged for various downstream tasks with minimal additional training. For example, the CNN may be a ResNet-50 (see, e.g. DOI: 10.33395/sinkron.v8i2.12378) pre-trained on ImageNet (see, e.g. DOI: 10.1109/CVPR.2009.5206848).

[0147] The models mentioned here are merely examples. There are many other models that can be used for object localization and / or classification and thus as the machine learning model of the present disclosure.

[0148] The task to be performed by the machine learning model of the present disclosure during training may be or comprise a reconstruction. In other words, the machine learning model may be configured and trained to capture features in an image of an arthropod, generate an embedding based on the features, and reconstruct the image of the arthropod based on the features.

[0149] Such a reconstruction model may be or comprise an autoencoder. An autoencoder is a type of artificial neural network used for unsupervised learning, primarily for the purpose of dimensionality reduction and feature learning. It consists of two main components: the encoder and the decoder.

[0150] The encoder takes the input data (e.g., an image) and compresses it into a lower-dimensional representation (the embedding or feature vector). The encoder learns to capture the essential features of the input while discarding less important information. The encoder may contain a series of convolutional layers (for image data) followed by activation functions (like ReLU).

[0151] The decoder takes the compressed representation (the embedding or feature vector) from the encoder and attempts to reconstruct the original input data. The goal of the autoencoder is to minimize the difference between the original input and the reconstructed output. The decoder may contain transposed convolutional layers or upsampling layers to map the compressed representation back to the original image dimensions.

[0152] Mean Squared Error (MSE) may be used as a loss function to quantify deviations between the original and reconstructed image.

[0153] It is possible to introduce noise to the input images (e.g., Gaussian noise, salt-and-pepper noise) and train the autoencoder to reconstruct the original clean images. This helps the model learn robust features.

[0154] It is possible to randomly occlude or mask certain parts of the input images (e.g., using black squares or random patches) and train the autoencoder to reconstruct the full image from the incomplete input. This encourages the model to learn contextual information.

[0155] It is possible to apply transformations such as rotation, scaling, flipping, and / or cropping to generate variations of the training images. This helps the model become invariant to such transformations.

[0156] It is possible to randomly change the brightness, contrast, saturation, and / or hue of the input images to enhance the model’s robustness to lighting variations.

[0157] It is possible to apply small elastic deformations to the input images, which can help the model generalize better to variations in shapes and structures.
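
As a non-limiting illustration, some of the input modifications described above (noise, masking, flipping) can be sketched for a grayscale image stored as a nested list with pixel values between 0 and 1. The function names, the noise level, and the patch size are hypothetical; the corrupted image would serve as input and the clean image as reconstruction target.

```python
import random

def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise and clip the result to [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in image]

def mask_random_patch(image, patch, rng):
    """Set a randomly placed square patch to 0 (black)."""
    rows, cols = len(image), len(image[0])
    top = rng.randrange(rows - patch + 1)
    left = rng.randrange(cols - patch + 1)
    out = [row[:] for row in image]
    for r in range(top, top + patch):
        for c in range(left, left + patch):
            out[r][c] = 0.0
    return out

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

rng = random.Random(0)  # fixed seed so that the sketch is reproducible
clean = [[(r + c) / 6 for c in range(4)] for r in range(4)]
corrupted = mask_random_patch(add_gaussian_noise(clean, 0.05, rng), 2, rng)
training_pair = (corrupted, clean)  # (input, reconstruction target)
```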

[0158] The reconstruction model may be or comprise or be based on the U-Net, for example.

[0159] The task to be performed by the machine learning model during training may be or comprise image super-resolution. Image super-resolution is the process of enhancing the resolution of an image, generating a high-resolution (HR) image from a low-resolution (LR) input image.

[0160] The training data for training an image super-resolution model usually comprise a multitude of low-resolution reference images (input data), each low-resolution reference image showing one or more arthropods, and a multitude of corresponding high-resolution reference images.

[0161] The terms “low-resolution” and “high-resolution” are defined relative to each other. “Low-resolution” means that the low-resolution image has a lower resolution than the high-resolution image, and “high-resolution” means that the high-resolution image has a higher resolution than the low-resolution image.

[0162] The term “corresponding” means that the high-resolution image corresponding to a low-resolution image shows the same arthropods as the low-resolution image, but in higher resolution.

[0163] Usually, the low-resolution image is generated based on the corresponding high-resolution image, e.g. by reducing the resolution.
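
As a non-limiting illustration, a low-resolution image can be generated from a high-resolution image by averaging non-overlapping blocks of pixels. Block averaging is only one of several possible ways of reducing the resolution; the function name is hypothetical, and the sketch assumes image dimensions that are divisible by the reduction factor.

```python
def downsample(image, factor):
    """Reduce the resolution by averaging factor x factor blocks of pixels."""
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be divisible by the factor")
    return [
        [
            sum(image[r * factor + dr][c * factor + dc]
                for dr in range(factor) for dc in range(factor)) / factor ** 2
            for c in range(cols // factor)
        ]
        for r in range(rows // factor)
    ]

high_res = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 6, 6],
    [4, 4, 8, 8],
]
low_res = downsample(high_res, 2)
training_pair = (low_res, high_res)  # (input, target)
```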

[0164] Data augmentation techniques may be applied to the low-resolution images (e.g., rotations, flips, noise addition) to increase the diversity of the training data and improve model robustness.

[0165] The machine learning model for performing an image super-resolution task may be or may comprise a Super-Resolution Convolutional Neural Network (SRCNN), a Very Deep Super Resolution (VDSR) network, an Efficient Sub-Pixel Convolutional Neural Network (ESPCN), or a Generative Adversarial Network (GAN) such as SRGAN, for example.

[0166] The task to be performed by the machine learning model during training may be or comprise image colorization. Image colorization is the process of adding color to grayscale images. This task may involve predicting the appropriate colors for different regions of an image based on the visual content and context.

[0167] The training data for training an image colorization model usually comprise a multitude of grayscale reference images (input data), each grayscale reference image showing one or more arthropods, and a multitude of corresponding color reference images. The term “corresponding” means that the color image corresponding to a grayscale image shows the same arthropods as the grayscale image, but in color.

[0168] Usually, the grayscale image is generated based on the corresponding color image, e.g. by grayscale conversion.
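
As a non-limiting illustration, a grayscale image can be generated from a color image by a weighted sum of the red, green, and blue channels. The sketch assumes the widely used ITU-R BT.601 luma weights (0.299, 0.587, 0.114); other conversions are possible, and the function name is hypothetical.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels to rows of luminance values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]

color = [
    [(255, 255, 255), (255, 0, 0)],
    [(0, 255, 0), (0, 0, 0)],
]
gray = to_grayscale(color)
training_pair = (gray, color)  # (input, target)
```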

[0169] The machine learning model for performing an image colorization task may be or may comprise a Convolutional Neural Network (CNN), a U-Net, and / or a Generative Adversarial Network (GAN) such as Pix2Pix or CycleGAN, for example.

[0170] It is possible that the machine learning model of the present disclosure may be trained to perform a different task or an additional task or a combination of tasks. To select a suitable task, it is only necessary that the task has something to do with visible characteristics of arthropods or is based on them.

[0171] Once the machine learning model has been trained, it can be used to determine a biodiversity score.

[0172] It should be noted that the final parts (e.g. deep layers) of the model that are solely used to localize and / or classify species during training are not required for determining a biodiversity score. These parts, generally referred to as classification and / or localization heads, may be removed. The model can be reduced to the parts that are responsible and required for generating an embedding.

[0173] This also applies to models that perform reconstruction, image super-resolution, image colorization, and / or other tasks based on an autoencoder: to determine the biodiversity value, the decoder of the autoencoder usually is not required, only the encoder, which generates an embedding based on the input image.
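
As a non-limiting illustration, the reduction of a trained model to the part that generates embeddings can be sketched as follows. The classes Encoder and Classifier are hypothetical toy stand-ins (a single linear layer with ReLU, and a linear classification head) and do not represent any particular network architecture.

```python
class Encoder:
    """Toy encoder: linear projection followed by ReLU."""
    def __init__(self, weights):
        self.weights = weights

    def __call__(self, pixels):
        return [max(0.0, sum(w * p for w, p in zip(row, pixels)))
                for row in self.weights]

class Classifier:
    """Toy model as used during training: encoder plus classification head."""
    def __init__(self, encoder, head_weights):
        self.encoder = encoder
        self.head_weights = head_weights

    def __call__(self, pixels):
        embedding = self.encoder(pixels)
        scores = [sum(w * e for w, e in zip(row, embedding))
                  for row in self.head_weights]
        return scores.index(max(scores))

trained = Classifier(
    Encoder([[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, -0.5]]),
    head_weights=[[1.0, 0.0], [0.0, 1.0]],
)
embed = trained.encoder  # the head is not needed for the biodiversity score
images = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
embeddings = [embed(image) for image in images]
```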

[0174] To determine the biodiversity score, a plurality of images is received.

[0175] The term “receiving” is to be interpreted broadly and includes any means capable of providing an image (or any other data) for further processing by the computer system of the present disclosure.

[0176] The term “receiving” may mean transmitting data from another computer system to the computer system of the present disclosure. The term “receiving” may mean transmitting data from an imaging device such as a camera to the computer system of the present disclosure. The term “receiving” may mean uploading of data into the computer system of the present disclosure, e.g. by a user. The term “receiving” may mean retrieving data from a data storage; the data storage may be a component of and / or connected to the computer system of the present disclosure, for example via a network.

[0177] Each received image is fed to the trained machine learning model. The trained machine learning model generates an embedding for each arthropod depicted in the image.

[0178] The embeddings of all images can be collected, e.g. stored in a data storage.

[0179] A biodiversity score can be determined on the basis of the (collected) embeddings.

[0180] The biodiversity score can be a measure of the variability of the available embeddings.

[0181] If the embeddings are vectors in an n-dimensional space (wherein n can be greater than 100, e.g. greater than 500 or greater than 700), then each vector defines a point in that space and the biodiversity score can be a measure of the distribution of the points in that space.

[0182] Quantifying the variability of vectors in a vector space can be accomplished through several statistical and mathematical techniques.

[0183] The biodiversity score can be or be based on the variance of the vectors. Variance measures how far a set of numbers (or vectors) are spread out from the mean number (or mean vector). In a vector space, the variance of vectors can be determined by finding the average of the squared differences from the mean vector. The greater the variance, the higher the biodiversity score. The variance can be calculated as follows: Calculate the mean vector; the mean vector is the element-wise sum of all vectors divided by the number of vectors. Calculate the deviation vectors; element-wise subtract the mean vector from each vector. Square each deviation; this gives the squared deviations. Calculate the mean of these squared deviations across all vectors and elements; this gives the variance as a single scalar.
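
As a non-limiting illustration, the calculation steps described above can be sketched as follows, with embeddings given as lists of numbers of equal length. The function name is hypothetical.

```python
def embedding_variance(embeddings):
    """Mean squared deviation from the mean vector, over all vectors and elements."""
    n = len(embeddings)
    dim = len(embeddings[0])
    # Mean vector: element-wise sum of all vectors divided by their number.
    mean = [sum(v[j] for v in embeddings) / n for j in range(dim)]
    # Squared element-wise deviations from the mean vector.
    squared = [(v[j] - mean[j]) ** 2 for v in embeddings for j in range(dim)]
    # Mean across all vectors and elements: a single scalar.
    return sum(squared) / len(squared)

uniform_area = [[1.0, 1.0], [1.0, 1.0]]   # identical embeddings
diverse_area = [[0.0, 0.0], [2.0, 2.0]]   # embeddings that differ
score_uniform = embedding_variance(uniform_area)
score_diverse = embedding_variance(diverse_area)
```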

[0184] The biodiversity score can be or be based on the standard deviation. The standard deviation is the square root of the variance. It is a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean, while a high standard deviation indicates that the values are spread out over a wider range. The greater the standard deviation, the higher the biodiversity score.

[0185] The biodiversity score can be or be based on eigenvalues and / or eigenvectors. Such eigenvalues and eigenvectors are used in Principal Component Analysis (PCA), a technique often used to reduce the dimensionality of data. The eigenvalues represent the variances of the variables in a new coordinate system, and the eigenvectors define this new coordinate system. The larger the eigenvalue, the more variability there is in the data along the corresponding eigenvector.
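
As a non-limiting illustration, the largest eigenvalue of the covariance matrix of the embeddings (i.e. the variance along the first principal component) can be approximated by power iteration, and the sum of all eigenvalues equals the trace of the covariance matrix. The function names, the start vector, and the number of iterations are assumptions of this sketch; the power iteration fails if the start vector happens to be orthogonal to the leading eigenvector.

```python
import math

def covariance_matrix(embeddings):
    """Sample covariance matrix (divisor n - 1) of the embeddings."""
    n, dim = len(embeddings), len(embeddings[0])
    mean = [sum(v[j] for v in embeddings) / n for j in range(dim)]
    centered = [[v[j] - mean[j] for j in range(dim)] for v in embeddings]
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(dim)] for i in range(dim)]

def largest_eigenvalue(matrix, iterations=200):
    """Power iteration for a symmetric positive semi-definite matrix."""
    dim = len(matrix)
    vec = [float(i + 1) for i in range(dim)]  # arbitrary, non-symmetric start
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(dim))
                   for i in range(dim)]
        norm = math.sqrt(sum(p * p for p in product))
        if norm == 0.0:
            return 0.0
        vec = [p / norm for p in product]
        eigenvalue = norm  # converges to the largest eigenvalue
    return eigenvalue

embeddings = [[0.0, 0.0], [2.0, 0.0], [4.0, 0.0], [2.0, 1.0], [2.0, -1.0]]
cov = covariance_matrix(embeddings)
first_component_variance = largest_eigenvalue(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))  # sum of eigenvalues
```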

[0186] The biodiversity score can be based on Singular Value Decomposition (SVD): This is a method used in linear algebra to factorize a matrix into singular values and singular vectors. The singular values obtained from SVD represent the magnitude or energy of the corresponding vectors. This can be used to quantify variability.

[0187] The biodiversity score can be or be based on the entropy of the embeddings. Entropy is a concept used in physics and information theory to quantify the amount of uncertainty or randomness in a set of data. The higher the entropy, the higher the uncertainty or randomness, and therefore, the higher the biodiversity score.

[0188] In information-theoretic statistics, the expected uncertainty associated with a draw from a random variable X with a continuous distribution is measured by the differential entropy $H(f) = -\int f(x) \log f(x)\, dx$, wherein f is the probability density function of X. Usually, the probability density function is not known. However, the entropy H(f) may be estimated based on a set of observations $x_1, x_2, \ldots, x_n$ drawn from f. Two approaches for such an estimation are described in: N. Ebrahimi et al.: Two measures of sample entropy, Statistics & Probability Letters 20 (1994) 225-234. These approaches (or other approaches) can be used to determine the entropy of the available embeddings. The set of observations corresponds to the available embeddings.
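
As a non-limiting illustration, the entropy of a set of scalar observations can be estimated from the spacings between sorted observations. The sketch below uses a spacing-based estimator of the kind introduced by Vasicek, which is not one of the two estimators of the cited publication and is shown here only as an example of an "other approach". It is univariate, so it would have to be applied to a scalar quantity derived from each embedding (e.g. one coordinate or one projection); this choice, the function name, and the default window size m are assumptions. The sketch further assumes that the observations are sufficiently distinct so that no spacing is zero.

```python
import math

def spacing_entropy(values, m=None):
    """Spacing-based estimate of the differential entropy of scalar samples."""
    xs = sorted(values)
    n = len(xs)
    if m is None:
        m = max(1, round(math.sqrt(n)))  # heuristic window size (assumption)
    if not 1 <= m < n / 2:
        raise ValueError("window size m must satisfy 1 <= m < n / 2")
    total = 0.0
    for i in range(n):
        upper = xs[min(i + m, n - 1)]  # clamp indices at the sample boundaries
        lower = xs[max(i - m, 0)]
        total += math.log(n / (2 * m) * (upper - lower))
    return total / n

# Hypothetical scalar projections of embeddings from two areas.
narrow = [0.1, 0.4, 0.5, 0.9, 1.3, 1.7, 2.0, 2.6, 3.1, 3.3]
wide = [10 * v for v in narrow]
entropy_narrow = spacing_entropy(narrow, m=2)
entropy_wide = spacing_entropy(wide, m=2)
```

In this sketch, spreading the observations by a factor of 10 raises the estimate by log(10), consistent with a higher score for more widely distributed embeddings.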

[0189] The biodiversity score may be outputted (e.g. displayed on a monitor and / or printed using a printer), stored in a data storage and / or transmitted to a separate computer system.

[0190] The biodiversity score can be used to estimate and / or quantify the biodiversity in an area or region.

[0191] The biodiversity score can be used to estimate and / or quantify the biodiversity in a field for growing crops and / or in the vicinity of such a field, in a greenhouse, in a foil tunnel, in an orchard, in a plantation, and / or in another area or region.

[0192] The biodiversity score can be used to compare the biodiversity of two areas; that area that has a higher biodiversity score usually has a higher biodiversity.

[0193] The biodiversity score can be used to provide recommendations for action. For example, if the biodiversity score is below a predefined threshold or falls below such a threshold, measures can be taken to increase biodiversity. Such a threshold can be defined by an expert (such as a biologist) or a government authority (e.g. a regulatory authority). In agriculture, for example, flower strips can be created that attract insects and thus increase biodiversity. Certain species, i.e. beneficial insects, may be actively released.

[0194] The biodiversity score can be used to determine the impact of a measure on biodiversity. A biodiversity score can be determined before the measure and after the measure. If the biodiversity score after the measure is lower than before the measure, the measure has a negative impact on biodiversity; there might be a need to take countermeasures. If the biodiversity score after the measure is greater than before the measure, the measure has a positive impact on biodiversity; the measure may be considered a success.
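
As a non-limiting illustration, the comparison of biodiversity scores before and after a measure, optionally combined with a predefined threshold, can be sketched as follows. The function name, the returned fields, and the example values are hypothetical.

```python
def assess_measure(score_before, score_after, threshold=None):
    """Classify the impact of a measure and flag whether action may be needed."""
    if score_after > score_before:
        impact = "positive"
    elif score_after < score_before:
        impact = "negative"
    else:
        impact = "neutral"
    below_threshold = threshold is not None and score_after < threshold
    return {
        "impact": impact,
        "action_recommended": impact == "negative" or below_threshold,
    }

after_spraying = assess_measure(0.8, 0.5)
after_flower_strip = assess_measure(0.5, 0.8, threshold=0.6)
```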

[0195] Even if the biodiversity score was determined only on the basis of images showing a limited number of arthropods or insects or certain insects, it can still be a good estimate of biodiversity in general, since species attract other species. In other words, even if the biodiversity score was determined only on the basis of images showing a limited number of arthropods or insects or certain insects, it can still be representative of other species.

[0196] Further embodiments of the present disclosure are disclosed below. These embodiments are not necessarily objects that are covered by patent protection. As is known to those skilled in the art of patent protection, the scope of protection of a patent is defined by the patent claims. The description and drawings are to be used to interpret the patent claims. The embodiments described below are part of the description and not of the patent claims. The following embodiments are intended to provide the reader with information on how various features described in this disclosure can be combined with each other. They are therefore part of the present technical teaching and should not be confused with the objects of the patent claims.

[0197] Further embodiments of the present disclosure are:

[0198] Embodiment 1: A computer-implemented method, the method comprising: providing at least part of a trained machine learning model, wherein the trained machine learning model has been trained using training data to capture features of an image of an arthropod, generate an embedding based on the features, and perform a task based on the embedding, over a period of time, receiving a plurality of images of an area, for each image of the plurality of images: o inputting the image into the trained machine learning model, o determining or receiving one or more embeddings from the trained machine learning model, each of the one or more embeddings representing an arthropod depicted in the inputted image, determining a biodiversity score based on the embeddings, wherein the biodiversity score is a measure of variability of the embeddings, storing and / or outputting the biodiversity score and / or transmitting the biodiversity score to a separate computer system.

[0199] Embodiment 2: The method according to embodiment 1, wherein the task is or comprises a segmentation task.

[0200] Embodiment 3: The method according to embodiment 1 or 2, wherein the task is or comprises a semantic segmentation task.

[0201] Embodiment 4: The method according to any one of the embodiments 1 to 3, wherein the training data comprised (i) a multitude of reference images as input data, each reference image of the multitude of reference images showing one or more arthropods, and (ii) for each reference image, an information about which pixels of the reference image represent an arthropod as target data.

Embodiment 5: The method according to embodiment 4, wherein training of the machine learning model comprised: for each reference image of the multitude of reference images: inputting the reference image into the machine learning model, receiving an output from the machine learning model, wherein the output indicates which pixels of the reference image the machine learning model has assigned to an arthropod, determining a deviation between the output and the target data, reducing the deviation by modifying parameters of the machine learning model.

[0202] Embodiment 6: The method according to any one of the embodiments 1 to 5, wherein the task is or comprises an instance segmentation task.

[0203] Embodiment 7: The method according to embodiment 6, wherein the training data comprised (i) a multitude of reference images as input data, each reference image of the multitude of reference images showing one or more arthropods, and (ii) for each reference image, annotations that mark individual instances of arthropods in the reference image as target data.

[0204] Embodiment 8: The method according to embodiment 7, wherein training of the machine learning model comprised: for each reference image of the multitude of reference images: inputting the reference image into the machine learning model, receiving an output from the machine learning model, wherein the output indicates predicted individual instances of arthropods, determining a deviation between the output and the target data, reducing the deviation by modifying parameters of the machine learning model.

[0205] Embodiment 9: The method according to any one of the embodiments 1 to 8, wherein the task is or comprises a classification task.

[0206] Embodiment 10: The method according to any one of the embodiments 1 to 9, wherein the task is or comprises a multiclass classification task.

[0207] Embodiment 11: The method according to any one of the embodiments 9 or 10, wherein the training data comprised (i) a multitude of reference images as input data, each reference image of the multitude of reference images showing an arthropod, and (ii) a class label, the class label indicating to which class the arthropod depicted in the reference image belongs as target data.

[0208] Embodiment 12: The method according to embodiment 11, wherein training of the machine learning model comprised: for each reference image of the multitude of reference images: inputting the reference image into the machine learning model, receiving an output from the machine learning model, wherein the output indicates the class to which the machine learning model has assigned the arthropod depicted in the reference image, determining a deviation between the output and the target data, reducing the deviation by modifying parameters of the machine learning model.

[0209] Embodiment 13: The method according to any one of the embodiments 1 to 12, wherein the task is or comprises a localization task.

Embodiment 14: The method according to embodiment 13, wherein the training data comprised (i) a multitude of reference images as input data, each reference image showing one or more arthropods, and (ii) for each reference image, coordinates of a bounding box that surrounds the arthropod in the reference image as target data.

[0210] Embodiment 15: The method according to embodiment 14, wherein training of the machine learning model comprised: for each reference image of the multitude of reference images: inputting the reference image into the machine learning model, receiving an output from the machine learning model, wherein the output comprises predicted coordinates of the one or more arthropods depicted in the reference image, determining a deviation between the output and the target data, reducing the deviation by modifying parameters of the machine learning model.

[0211] Embodiment 16: The method according to any one of the embodiments 1 to 15, wherein the task is or comprises an object recognition task.

[0212] Embodiment 17: The method according to embodiment 16, wherein the object is an arthropod.

[0213] Embodiment 18: The method according to any one of the embodiments 16 or 17, wherein the object recognition task comprises a localization task and a classification task.

[0214] Embodiment 19: The method according to embodiment 18, wherein the training data comprised (i) a multitude of reference images as input data, each reference image showing one or more arthropods, and (ii) for each reference image, an information about the position of the one or more arthropods in the reference image and an information about a class the one or more arthropods belong to as target data.

[0215] Embodiment 20: The method according to embodiment 19, wherein training of the machine learning model comprised: for each reference image of the multitude of reference images: inputting the reference image into the machine learning model, receiving an output from the machine learning model, the output indicating a predicted location and a predicted class of the one or more arthropods depicted in the reference image, comparing the predicted location of the arthropod in the reference image with the real location of the arthropod in the reference image, and comparing the predicted class of the located arthropod with the real class of the located arthropod, determining a deviation between the output and the target data, reducing the deviation(s) by modifying parameters of the machine learning model.

[0216] Embodiment 21: The method according to any one of the embodiments 1 to 20, wherein the task is or comprises a reconstruction task.

[0217] Embodiment 22: The method according to embodiment 21, wherein the training data comprised a multitude of reference images, each reference image showing one or more arthropods.

[0218] Embodiment 23: The method according to embodiment 22, wherein training of the machine learning model comprised: for each reference image of the multitude of reference images: inputting the reference image into the machine learning model, receiving an output from the machine learning model, the output comprising a predicted reference image, determining a deviation between the reference image and the predicted reference image, reducing the deviation(s) by modifying parameters of the machine learning model.

Embodiment 24: The method according to embodiment 21, wherein the training data comprised (i) a multitude of reference images as target data, each reference image showing one or more arthropods, and (ii) a multitude of modified reference images as input data.

[0219] Embodiment 25: The method according to embodiment 24, wherein each modified reference image was generated from a reference image by introducing noise to the reference image, and / or occluding or masking one or more parts of the reference image, and / or applying one or more transformations to the reference image, and / or changing brightness, contrast, saturation, and / or hue of the reference image, and / or applying one or more elastic deformations to the reference image.

[0220] Embodiment 26: The method according to embodiment 24 or 25, wherein training of the machine learning model comprised: for each modified reference image of the multitude of modified reference images: inputting the modified reference image into the machine learning model, receiving an output from the machine learning model, the output comprising a reconstructed reference image, determining a deviation between the reference image and the reconstructed reference image, reducing the deviation(s) by modifying parameters of the machine learning model.
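The generation of modified reference images named in embodiment 25 can be sketched as follows. This is a minimal illustration, not the implementation of the disclosure: it assumes a grayscale image stored as a list of rows with pixel values in [0, 1], and all function names and parameters are hypothetical.

```python
import random


def add_noise(image, sigma, rng):
    """Add Gaussian noise to every pixel and clip the result to [0, 1]."""
    return [
        [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
        for row in image
    ]


def mask_patch(image, top, left, height, width, fill=0.0):
    """Occlude a rectangular patch; the input image is left unchanged."""
    out = [row[:] for row in image]
    for r in range(top, min(top + height, len(out))):
        for c in range(left, min(left + width, len(out[r]))):
            out[r][c] = fill
    return out


def adjust_brightness(image, factor):
    """Scale pixel intensities by a factor and clip to [0, 1]."""
    return [[min(1.0, max(0.0, p * factor)) for p in row] for row in image]
```

During training, the modified image would serve as input data and the unmodified reference image as target data, as in embodiment 24.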

[0221] Embodiment 27: The method according to any one of the embodiments 1 to 26, wherein the task is or comprises an image super-resolution task.

[0222] Embodiment 28: The method according to embodiment 27, wherein the training data comprised (i) a multitude of low-resolution reference images as input data, each low-resolution reference image showing one or more arthropods, and (ii) for each low-resolution reference image, a corresponding high-resolution reference image as target data.

[0223] Embodiment 28: The method according to embodiment 27, wherein each low-resolution reference image was generated from the corresponding high-resolution reference image by reducing the resolution of the high-resolution reference image.

[0224] Embodiment 29: The method according to embodiment 28 or 29, wherein training of the machine learning model comprised: for each low-resolution reference image of the multitude of low-resolution reference images: inputting the low-resolution reference image into the machine learning model, receiving an output from the machine learning model, the output comprising a predicted high-resolution reference image, determining a deviation between the high-resolution reference image and the predicted high-resolution reference image, reducing the deviation(s) by modifying parameters of the machine learning model.
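Reducing the resolution of a high-resolution reference image to obtain the low-resolution input can be sketched as follows. Block averaging is only one possible way of reducing resolution and is an assumption here; the image layout (a list of rows of floats) and the function name are hypothetical.

```python
def downsample(image, factor):
    """Reduce resolution by averaging non-overlapping factor x factor blocks."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the factor")
    out = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            block = [
                image[r + i][c + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out
```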

[0225] Embodiment 30: The method according to any one of the embodiments 1 to 29, wherein the task is or comprises a colorization task.

[0226] Embodiment 31: The method according to embodiment 30, wherein the training data comprised (i) a multitude of grayscale reference images as input data, each grayscale reference image showing one or more arthropods, and (ii) for each grayscale reference image, a corresponding color reference image as target data.

[0227] Embodiment 32: The method according to embodiment 31, wherein each grayscale reference image was generated from the corresponding color reference image by grayscale conversion of the color reference image.

[0228] Embodiment 33: The method according to embodiment 31 or 32, wherein training of the machine learning model comprised: for each grayscale reference image of the multitude of grayscale reference images: inputting the grayscale reference image into the machine learning model, receiving an output from the machine learning model, the output comprising a predicted color reference image, determining a deviation between the color reference image and the predicted color reference image, reducing the deviation(s) by modifying parameters of the machine learning model.
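The grayscale conversion used to derive the input data for the colorization task can be sketched as follows. The disclosure does not specify a conversion; the weighted sum with the ITU-R BT.601 luma coefficients is an assumption, as are the image layout (rows of (R, G, B) tuples in [0, 1]) and the function name.

```python
def to_grayscale(image):
    """Convert rows of (R, G, B) tuples to rows of luma values."""
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in image
    ]
```

The color reference image then serves as target data and the converted image as input data, as in embodiment 31.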

[0229] Embodiment 34: The method according to any one of embodiments 1 to 33, wherein the machine learning model comprises a convolutional neural network, wherein the convolutional neural network is configured and trained to extract features from the inputted image and generate an embedding based on the extracted features.

[0230] Embodiment 35: A computer-implemented method, the method comprising: providing a trained machine learning model, wherein the trained machine learning model was trained on training data to classify arthropods in images, wherein the training data comprised (i) a multitude of reference images as input data, each reference image depicting an arthropod, and (ii) for each reference image of the multitude of reference images an information about a class to which the arthropod depicted in the image belongs as target data, over a period of time, receiving a plurality of images of an area, for each image of the plurality of images: o inputting the image into the trained machine learning model, o determining or receiving one or more embeddings from the trained machine learning model, each of the one or more embeddings representing an arthropod depicted in the inputted image, determining a biodiversity score based on the embeddings, storing and / or outputting the biodiversity score and / or transmitting the biodiversity score to a separate computer system.

[0231] Embodiment 36: The method according to any one of embodiments 1 to 35, wherein the biodiversity score is a measure of the variability of arthropods occurring in the area.

[0232] Embodiment 37: The method according to any one of embodiments 1 to 36, wherein determining the biodiversity score based on the embeddings, comprises: collecting the embeddings of the plurality of images, determining the biodiversity score based on the collected embeddings.

[0233] Embodiment 38: The method according to any one of embodiments 1 to 37, wherein the biodiversity score is a measure of the variability of the embeddings.

[0234] Embodiment 39: The method according to any one of embodiments 1 to 38, wherein the biodiversity score is the variance of the embeddings or based thereon.

[0235] Embodiment 40: The method according to any one of embodiments 1 to 39, wherein the biodiversity score is the standard deviation of the embeddings or based thereon.

[0236] Embodiment 41: The method according to any one of embodiments 1 to 40, wherein the biodiversity score is the entropy of the embeddings or based thereon.

[0237] Embodiment 42: The method according to any one of embodiments 1 to 41, wherein each embedding is a numerical representation of an arthropod depicted in the image.

[0238] Embodiment 43: The method according to any one of embodiments 1 to 42, wherein each embedding is a vector, a matrix, a tensor or another arrangement of numbers.

Embodiment 44: The method according to any one of embodiments 1 to 43, wherein each embedding is generated by the trained machine learning model to classify an arthropod in an image.

[0239] Embodiment 45: The method according to any one of embodiments 1 to 44, wherein each embedding is generated by the trained machine learning model to classify and localize an arthropod in an image.

[0240] Embodiment 46: The method according to any one of embodiments 1 to 45, wherein each embedding is a vector, and the biodiversity score is the variance of the vectors or based thereon.

[0241] Embodiment 47: The method according to any one of embodiments 1 to 46, wherein each embedding is a vector, and the biodiversity score is the standard deviation of the vectors or based thereon.

[0242] Embodiment 48: The method according to any one of embodiments 1 to 47, wherein each embedding is a vector, and the biodiversity score is the differential entropy of the vectors or based thereon.

[0243] Embodiment 49: The method according to any one of embodiments 1 to 48, wherein each embedding is a compressed representation of an arthropod depicted in the image.

[0244] Embodiment 50: The method according to any one of embodiments 1 to 49, wherein each embedding is a fixed-size unidimensional array of real numbers.

[0245] Embodiment 51: The method according to any one of embodiments 1 to 50, wherein each embedding has a number of dimensions greater than 100.

[0246] Embodiment 52: The method according to any one of embodiments 1 to 51, wherein each embedding has a number of dimensions greater than 500.

[0247] Embodiment 53: The method according to any one of embodiments 1 to 52, wherein each embedding has a number of dimensions greater than 700.

[0248] Embodiment 54: The method according to any one of embodiments 1 to 53, wherein the period of time is several days or several weeks or more.

[0249] Embodiment 55: The method according to any one of embodiments 1 to 54, wherein the area is an area within a field, an area within an orchard, an area within a plantation or an area within a foil tunnel, or the area is an area located near a field, an area located near an orchard, or an area located near a plantation.

[0250] Embodiment 56: The method according to any one of embodiments 1 to 55, wherein the area is a part of a trap for arthropods.

[0251] Embodiment 57: The method according to any one of embodiments 1 to 56, wherein the area is flat or comprises a flat surface.

[0252] Embodiment 58: The method according to any one of embodiments 1 to 57, wherein the area and / or surface has an extension in the range of 100 mm x 200 mm to 250 mm x 250 mm.

[0253] Embodiment 59: The method according to any one of embodiments 1 to 58, wherein the biodiversity score is progressively determined and / or updated.

[0254] Embodiment 60: The method according to any one of embodiments 1 to 59, wherein a new biodiversity score is determined whenever a new image is received.

[0255] Embodiment 61: The method according to any one of embodiments 1 to 60, wherein a new biodiversity score is determined when a predefined number of images has been received.
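Progressive updating of the score, as in embodiments 59 to 61, can be sketched as follows. This is an illustration under stated assumptions: the score is taken to be the mean squared distance of the embeddings to their centroid, and it is updated with Welford's online algorithm so that earlier embeddings need not be stored. The class name `RunningScore` is hypothetical.

```python
class RunningScore:
    """Incrementally tracks the total variance of embedding vectors."""

    def __init__(self, dim):
        self.n = 0                 # number of embeddings seen so far
        self.mean = [0.0] * dim    # running centroid
        self.m2 = 0.0              # running sum of squared deviations

    def update(self, embedding):
        """Fold one embedding into the statistics and return the new score."""
        self.n += 1
        delta = [x - m for x, m in zip(embedding, self.mean)]
        self.mean = [m + d / self.n for m, d in zip(self.mean, delta)]
        delta2 = [x - m for x, m in zip(embedding, self.mean)]
        self.m2 += sum(a * b for a, b in zip(delta, delta2))
        return self.score()

    def score(self):
        """Mean squared distance to the centroid (0.0 before any update)."""
        return self.m2 / self.n if self.n else 0.0
```

Calling `update` once per embedding gives a new score whenever a new image is received; calling `score` only after a predefined number of updates corresponds to embodiment 61.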

[0256] Embodiment 62: The method according to any one of embodiments 1 to 61, wherein the machine learning model is configured to generate an embedding for each arthropod depicted in an image and / or reference image.

[0257] Embodiment 63: The method according to any one of embodiments 1 to 62, wherein the machine learning model is or comprises a classification model during training.

Embodiment 64: The method according to embodiment 63, wherein the classification model is configured to assign a class to each arthropod depicted in an image and / or reference image based on the embedding.

[0258] Embodiment 65: The method according to any one of embodiments 11 to 64, wherein the class indicates a class, order, family, genus, and / or species in the sense of biological taxonomy.

[0259] Embodiment 66: The method according to any one of embodiments 1 to 65, wherein each embedding is a numerical representation of an arthropod in form of a vector.

[0260] Embodiment 67: The method according to any one of embodiments 1 to 66, wherein the trained machine learning model is configured and was trained to extract features of an image and / or reference image and generate an embedding based on the extracted features.

[0261] Embodiment 68: The method according to any one of embodiments 1 to 67, wherein the trained machine learning model is or comprises a convolutional neural network.

[0262] Embodiment 69: The method according to any one of embodiments 1 to 68, wherein the trained machine learning model is or comprises a transformer.

[0263] Embodiment 70: The method according to any one of embodiments 1 to 69, wherein the trained machine learning model is or comprises an encoder of an autoencoder.

[0264] Embodiment 71: The method according to any one of embodiments 1 to 70, wherein training the machine learning model comprised: for each reference image of the multitude of reference images: inputting the reference image into the machine learning model, receiving a predicted class as an output from the machine learning model, comparing the predicted class with the target data, determining a deviation between the predicted class and the target data, reducing the deviation by modifying model parameters of the machine learning model.

[0265] Embodiment 72: The method according to any one of embodiments 1 to 71, wherein the trained machine learning model is configured and was trained to locate and classify arthropods in images and / or reference images.

[0266] Embodiment 73: The method according to any one of embodiments 1 to 72, wherein the training data comprised, (i) a multitude of reference images as input data, each reference image depicting one or more arthropods, (ii) for each arthropod depicted in a reference image an information about a location of the arthropod in the reference image as first target data, and (iii) for each arthropod depicted in a reference image an information about a class to which the arthropod depicted in the reference image belongs as second target data.

[0267] Embodiment 74: The method according to embodiment 73, wherein training the machine learning model comprised: for each reference image of the multitude of reference images: inputting the reference image into the machine learning model, for each arthropod depicted in the reference image: receiving a predicted location of the arthropod in the reference image and a predicted class of the arthropod in the reference image as an output from the machine learning model, for each arthropod depicted in the reference image: determining a first deviation between the predicted location and the first target data and a second deviation between the predicted class and the second target data, reducing the deviations by modifying parameters of the machine learning model.

[0268] Embodiment 75: A computer system comprising: a processing unit; and a memory storing a computer program configured to cause, when executed by the processing unit, the computer system to perform the method according to any one of embodiments 1 to 74.

[0269] Embodiment 76: A non-transitory computer readable storage medium having stored thereon a computer program that, when executed by a processing unit of a computer system, causes the computer system to perform the method according to any one of embodiments 1 to 74.

[0270] Fig. 1 shows schematically an example of the training of the machine learning model of the present disclosure.

[0271] The training is done with training data. The training data comprises (i) a multitude of reference images as input data, each reference image depicting an arthropod, and (ii) for each reference image of the multitude of reference images an information about a class to which the arthropod depicted in the image belongs as target data.

[0272] In the example shown in Fig. 1, only one training data set TD is shown for the sake of clarity.

[0273] The training data set TD includes a reference image RI showing an arthropod A1.

[0274] The training data set TD further includes information RC that indicates what type of arthropod it is. This information therefore indicates the real class.

[0275] The training data set TD further includes information RL that indicates where the arthropod is located in the reference image. This information therefore indicates the real location.

[0276] The reference image RI is used as input data; real class RC and real location RL are used as target data.

[0277] The reference image RI is inputted into the machine learning model MLM. The machine learning model MLM is configured to locate arthropods in a (reference) image and classify each located arthropod based on the (reference) image and model parameters MP (and optionally further input data).

[0278] In the example shown in Fig. 1, the machine learning model MLM comprises different layers. The layers labelled FE are used to extract features from the reference image RI and to generate an embedding that represents the arthropod A1 in the reference image RI. The layers labelled L are used to localize the arthropod A1 in the reference image RI, and the layers labelled C are used to classify the arthropod A1.

[0279] The machine learning model MLM is configured to predict and output a predicted location PL and to predict and output a predicted class PC.

[0280] The predicted location PL is compared with the real location RL. Deviations between the predicted location PL and the real location RL are quantified using a localization loss function LL.

[0281] The predicted class PC is compared with the real class RC. Deviations between the predicted class PC and the real class RC are quantified using a classification loss function CL.

[0282] Deviations (i) between the predicted location PL and the real location RL, and (ii) between the predicted class PC and the real class RC are reduced by adjusting model parameters MP of the machine learning model MLM.
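How the two deviations might be quantified and combined into a single quantity to be reduced can be sketched as follows. The disclosure does not name specific loss functions; cross-entropy for the class, mean absolute error for the bounding-box coordinates, the weighting factor `weight`, and the box layout (x_min, y_min, x_max, y_max) are all assumptions made for illustration.

```python
import math


def classification_loss(class_probs, real_class):
    """Cross-entropy of the predicted class probabilities for the real class."""
    return -math.log(max(class_probs[real_class], 1e-12))


def localization_loss(pred_box, real_box):
    """Mean absolute error between predicted and real box coordinates."""
    return sum(abs(p - r) for p, r in zip(pred_box, real_box)) / len(real_box)


def total_loss(class_probs, real_class, pred_box, real_box, weight=1.0):
    """Weighted sum of both deviations; weight is a hypothetical parameter."""
    return (
        classification_loss(class_probs, real_class)
        + weight * localization_loss(pred_box, real_box)
    )
```

Adjusting the model parameters MP so that this sum decreases reduces both deviations at once.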

[0283] The described steps are repeated for a multitude of reference images. The training steps can be carried out until a stop criterion is reached. Such a stop criterion can be for example: a predefined maximum number of training steps / cycles / epochs has been performed, deviations between output and target data can no longer be reduced by modifying the model parameters, a predefined minimum of a loss function is reached, and / or an extreme value (e.g., maximum or minimum) of another performance value is reached.
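The stop criteria listed above can be sketched as follows. The example trains a deliberately trivial stand-in model (a single weight fitted by gradient descent) rather than the machine learning model MLM; the function names, the default values, and the `patience` and `min_delta` parameters used to detect that the deviation can no longer be reduced are assumptions.

```python
def should_stop(epoch, losses, max_epochs=100, min_loss=1e-4,
                patience=5, min_delta=1e-6):
    """Return the reason for stopping, or None if training should continue."""
    if epoch >= max_epochs:
        return "max_epochs"            # predefined maximum number of epochs
    if losses and losses[-1] <= min_loss:
        return "min_loss"              # predefined minimum of the loss reached
    if len(losses) > patience:
        earlier_best = min(losses[:-patience])
        recent_best = min(losses[-patience:])
        if earlier_best - recent_best < min_delta:
            return "plateau"           # deviation no longer being reduced
    return None


def train(xs, ys, lr=0.1, **stop_kwargs):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w, losses, epoch = 0.0, [], 0
    while True:
        reason = should_stop(epoch, losses, **stop_kwargs)
        if reason:
            return w, reason, epoch
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
        losses.append(sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))
        epoch += 1
```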

[0284] Fig. 2 schematically shows an example of how an embedding is generated based on an image using the trained machine learning model of the present disclosure. The trained machine learning model MLM‘ can be trained as described in relation to Fig. 1. It should be noted that the trained machine learning model MLM‘ only includes the layers FE for extracting features from an image and for generating an embedding; the layers L for localizing an arthropod in an image and the layers C for classifying the localized arthropod are not needed; they were only needed to train the machine learning model to generate embeddings that represent an arthropod in a way that allows it to be assigned to a class.

[0285] An image I is fed to the trained machine learning model MLM‘. The image I shows an arthropod A2. The trained machine learning model MLM‘ generates an embedding E that represents the arthropod A2 based on the image I, model parameters MP and optionally further input data. The embedding E can be used together with further embeddings of further images, which were generated together with the image I in a time period in an area, to determine a biodiversity score for the area.

[0286] Fig. 3 shows an example of how a biodiversity score is determined. The biodiversity score BDS is determined with the help of a trained machine learning model MLM‘. This trained machine learning model MLM‘ can be the model already shown in Fig. 2.

[0287] A plurality of images I1, I2, ..., IN, each showing one or more arthropods, are fed into the model one after the other.

[0288] For each arthropod depicted in an image, the trained machine learning model MLM‘ generates an embedding. In the example shown in Fig. 3, the trained machine learning model MLM‘ generates an embedding E1 for the arthropod A1 depicted in image I1, an embedding E2 for the arthropod A2 depicted in image I2, an embedding E3 for the arthropod A3 depicted in image I2, and an embedding EN for the arthropod AN depicted in image IN.

[0289] A biodiversity score BSD is then determined based on the embeddings E1, E2, ..., EN. As described, this biodiversity score BSD can be the variance or the standard deviation of the embeddings E1, E2, ..., EN, or it can be based on it, or it can be some other value that quantifies the variability of the embeddings E1, E2, ..., EN, such as the differential entropy.
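The variability measures named above can be sketched as follows. This is a minimal illustration, not the method of the disclosure. It assumes that each embedding is a list of floats of equal length, reads the variance of the embeddings as the mean squared distance to their centroid (the trace of the covariance matrix), and estimates the differential entropy under the assumption of a Gaussian distribution with diagonal covariance. The function names and the regularizer `eps` are hypothetical.

```python
import math


def centroid(embeddings):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(embeddings)
    dim = len(embeddings[0])
    return [sum(e[k] for e in embeddings) / n for k in range(dim)]


def total_variance(embeddings):
    """Mean squared Euclidean distance of the embeddings to their centroid."""
    c = centroid(embeddings)
    squared = [sum((x - m) ** 2 for x, m in zip(e, c)) for e in embeddings]
    return sum(squared) / len(embeddings)


def spread(embeddings):
    """Square root of the total variance (a standard-deviation-like score)."""
    return math.sqrt(total_variance(embeddings))


def gaussian_entropy(embeddings, eps=1e-6):
    """Differential entropy in nats of a diagonal Gaussian fitted to the data.

    eps is added to each per-dimension variance so the logarithm stays
    finite when a dimension does not vary.
    """
    n = len(embeddings)
    c = centroid(embeddings)
    h = 0.0
    for k in range(len(c)):
        var_k = sum((e[k] - c[k]) ** 2 for e in embeddings) / n + eps
        h += 0.5 * math.log(2.0 * math.pi * math.e * var_k)
    return h
```

All three values grow as the embeddings spread out in the embedding space, which is the property the score relies on; a full-covariance estimate would be an alternative to the diagonal assumption made here.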

[0290] The biodiversity score BSD can be outputted (e.g. displayed on a monitor and / or printed using a printer), stored in a data storage and / or transmitted to a separate computer system.

[0291] Fig. 4 shows an embodiment of the computer-implemented method of the present disclosure in the form of a flow chart.

[0292] The method (100) comprises:

[0293] (110) providing a trained machine learning model, wherein the trained machine learning model was trained on training data to classify arthropods in images, wherein the training data comprised (i) a multitude of reference images as input data, each reference image depicting an arthropod, and (ii) for each reference image of the multitude of reference images an information about a class to which the arthropod depicted in the image belongs as target data,

[0294] (120) over a period of time, receiving a plurality of images of an area,

[0295] (130) for each image of the plurality of images:

[0296] (131) inputting the image into the trained machine learning model,

[0297] (132) determining or receiving one or more embeddings from the trained machine learning model, each of the one or more embeddings representing an arthropod depicted in the inputted image,

[0298] (140) determining a biodiversity score based on the embeddings,

[0299] (150) storing and / or outputting the biodiversity score and / or transmitting the biodiversity score to a separate computer system.
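Steps (120) to (150) can be sketched as follows. The trained machine learning model is represented by a caller-supplied function `embed` that returns one embedding per arthropod found in an image; this function, the `sink` callback used for outputting, and the choice of the mean squared distance to the centroid as the score are assumptions for illustration only.

```python
def collect_embeddings(images, embed):
    """Steps (130) to (132): gather the embeddings of all received images."""
    embeddings = []
    for image in images:
        embeddings.extend(embed(image))  # one or more embeddings per image
    return embeddings


def determine_score(embeddings):
    """Step (140): mean squared distance to the centroid (an assumption)."""
    if not embeddings:
        raise ValueError("no embeddings collected")
    n = len(embeddings)
    dim = len(embeddings[0])
    mean = [sum(e[k] for e in embeddings) / n for k in range(dim)]
    return sum(
        sum((x - m) ** 2 for x, m in zip(e, mean)) for e in embeddings
    ) / n


def run(images, embed, sink=print):
    """Steps (120) to (150): embed, score, then output via the sink."""
    score = determine_score(collect_embeddings(images, embed))
    sink(score)
    return score
```

In practice `embed` would wrap the feature extraction layers of the trained model, and `sink` could store the score or transmit it to a separate computer system.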

[0300] Fig. 5 shows another embodiment of the computer-implemented method of the present disclosure in the form of a flow chart. The method (200) comprises:

[0301] (210) providing a machine learning model, wherein the machine learning model is configured to assign an arthropod depicted in an image to one of a number of classes based on the image and model parameters,

[0302] (220) providing training data, wherein the training data comprises (i) a multitude of reference images as input data, each reference image depicting an arthropod, and (ii) for each reference image of the multitude of reference images an information about a class to which the arthropod depicted in the image belongs as target data,

[0303] (230) training the machine learning model, wherein the training comprises: for each reference image of the multitude of reference images:

[0304] (231) inputting the reference image into the machine learning model,

[0305] (232) receiving a predicted class as an output from the machine learning model,

[0306] (233) comparing the predicted class with the target data,

[0307] (234) determining a deviation between the predicted class and the target data,

[0308] (235) reducing the deviation by modifying model parameters of the machine learning model,

[0309] (240) over a period of time, receiving a plurality of images of an area,

[0310] (250) for each image of the plurality of images:

[0311] (251) inputting the image into the trained machine learning model,

[0312] (252) determining or receiving an embedding from the trained machine learning model, the embedding representing an arthropod depicted in the inputted image,

[0313] (260) determining a biodiversity score based on the embeddings,

[0314] (270) storing and / or outputting the biodiversity score and / or transmitting the biodiversity score to a separate computer system.

[0315] Fig. 6 shows another embodiment of the computer-implemented method of the present disclosure in the form of a flow chart.

[0316] The method (300) comprises:

[0317] (310) providing a machine learning model, wherein the machine learning model is configured to localize and classify one or more arthropods depicted in an image based on the image and model parameters,

[0318] (320) providing training data, wherein the training data comprises: (i) a multitude of reference images as input data, each reference image depicting one or more arthropods, (ii) for each arthropod depicted in a reference image, information about a location of the arthropod in the reference image as first target data, and (iii) for each arthropod depicted in a reference image, information about a class to which the arthropod depicted in the reference image belongs as second target data,

[0319] (330) training the machine learning model, wherein the training comprises: for each reference image of the multitude of reference images:

[0320] (331) inputting the reference image into the machine learning model,

[0321] (332) for each arthropod depicted in the reference image: receiving a predicted location of the arthropod in the reference image and a predicted class of the arthropod in the reference image as an output from the machine learning model,

(333) for each arthropod depicted in the reference image: determining a first deviation between the predicted location and the first target data and a second deviation between the predicted class and the second target data,

[0322] (334) reducing the deviations by modifying parameters of the machine learning model,

[0323] (340) over a period of time, receiving a plurality of images of an area,

[0324] (350) for each image of the plurality of images:

[0325] (351) inputting the image into the trained machine learning model,

[0326] (352) determining or receiving one or more embeddings from the trained machine learning model, each of the one or more embeddings representing an arthropod depicted in the inputted image,

[0327] (360) determining a biodiversity score based on the embeddings,

[0328] (370) storing and / or outputting the biodiversity score and / or transmitting the biodiversity score to a separate computer system.
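
Steps (332) to (334) involve two deviations per arthropod. One common way to quantify them, shown here purely as an assumption, is a squared-error term for the predicted location and a cross-entropy term for the predicted class, added with a weighting factor. The box format `(x, y, width, height)`, the weight `box_weight` and all class and function names are hypothetical; the present disclosure does not prescribe a particular deviation measure.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Detection:
    box: Sequence[float]          # assumed format: (x, y, width, height), normalized
    class_probs: Sequence[float]  # predicted probability per class


@dataclass
class Annotation:
    box: Sequence[float]          # first target data: location
    class_index: int              # second target data: class


def location_deviation(pred_box: Sequence[float], target_box: Sequence[float]) -> float:
    """First deviation: mean squared error between predicted and annotated box."""
    return sum((p - t) ** 2 for p, t in zip(pred_box, target_box)) / len(target_box)


def class_deviation(class_probs: Sequence[float], class_index: int) -> float:
    """Second deviation: cross-entropy of the predicted class distribution."""
    eps = 1e-12  # guards against log(0)
    return -math.log(max(class_probs[class_index], eps))


def total_deviation(preds: List[Detection], targets: List[Annotation],
                    box_weight: float = 1.0) -> float:
    """Sum both deviations over all arthropods of one reference image.

    Assumes predictions and annotations are already matched one-to-one.
    """
    if len(preds) != len(targets):
        raise ValueError("predictions and annotations must be matched one-to-one")
    return sum(
        box_weight * location_deviation(p.box, t.box)
        + class_deviation(p.class_probs, t.class_index)
        for p, t in zip(preds, targets)
    )


# Usage with hypothetical values: one arthropod, two classes.
target = Annotation(box=(0.1, 0.2, 0.3, 0.4), class_index=0)
exact = Detection(box=(0.1, 0.2, 0.3, 0.4), class_probs=(1.0, 0.0))
rough = Detection(box=(0.2, 0.2, 0.3, 0.4), class_probs=(0.5, 0.5))
loss_exact = total_deviation([exact], [target])
loss_rough = total_deviation([rough], [target])
```

In step (334), a training procedure would modify the model parameters so that `total_deviation` decreases; the weighting between the two terms is a design choice.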

[0329] The operations in accordance with the teachings herein may be performed by at least one computer system specially constructed for the desired purposes, or by a general-purpose computer specially configured for the desired purpose by at least one computer program stored in a typically non-transitory computer-readable storage medium.

[0330] A “computer system” is a system for electronic data processing that processes data by means of programmable calculation rules. Such a system usually comprises a “computer”, that unit which comprises a processor for carrying out logical operations, and also peripherals.

[0331] In computer technology, “peripherals” refer to all devices which are connected to the computer and serve for the control of the computer and / or as input and output devices. Examples include a monitor (screen), printer, scanner, mouse, keyboard, drives, camera, microphone, loudspeaker, etc. Internal ports and expansion cards are also considered to be peripherals in computer technology.

[0332] Computer systems of today are frequently divided into desktop PCs, portable PCs, laptops, notebooks, netbooks, tablet PCs and so-called handhelds (e.g., smartphones); all these systems can be utilized for carrying out the computer-implemented method of the present disclosure.

[0333] The term “non-transitory” is used herein to exclude transitory, propagating signals or waves, but to otherwise include any volatile or non-volatile computer memory technology suitable to the application.

[0334] The term “computer system” should be broadly construed to cover any kind of electronic device with data processing capabilities, including, by way of non-limiting example, personal computers, servers, embedded cores, computing systems, communication devices, processors (e.g., digital signal processors (DSPs), microcontrollers, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), etc.) and other electronic computing devices.

[0335] The term “process” as used above is intended to include any type of computation or manipulation or transformation of data represented as physical, e.g., electronic, phenomena which may occur or reside, e.g., within registers and / or memories of at least one computer system or processor. The term “processing unit” includes a single processor or a plurality of distributed or remote such units.

[0336] Fig. 7 illustrates a computer system (1) according to some example implementations of the present disclosure in more detail. The computer system may include one or more of each of a number of components such as, for example, a processing unit (20) connected to a memory (50) (e.g., storage device). The processing unit (20) may be composed of one or more processors alone or in combination with one or more memories. The processing unit (20) is generally any piece of computer hardware that is capable of processing information such as, for example, data, computer programs and / or other suitable electronic information. The processing unit (20) is composed of a collection of electronic circuits some of which may be packaged as an integrated circuit or multiple interconnected integrated circuits (an integrated circuit at times more commonly referred to as a “chip”). The processing unit (20) may be configured to execute computer programs (60), which may be stored onboard the processing unit or otherwise stored in the memory (50) of the same or another computer.

[0337] The processing unit (20) may be a number of processors, a multi-core processor or some other type of processor, depending on the particular implementation. Further, the processing unit (20) may be implemented using a number of heterogeneous processor systems in which a main processor is present with one or more secondary processors on a single chip. As another illustrative example, the processing unit (20) may be a symmetric multi-processor system containing multiple processors of the same type. In yet another example, the processing unit (20) may be embodied as or otherwise include one or more ASICs, FPGAs or the like. Thus, although the processing unit (20) may be capable of executing a computer program (60) to perform one or more functions, the processing unit (20) of various examples may be capable of performing one or more functions without the aid of a computer program (60). In either instance, the processing unit (20) may be appropriately programmed to perform functions or operations according to example implementations of the present disclosure.

[0338] The memory (50) is generally any piece of computer hardware that is capable of storing information such as, for example, data, images, computer programs (e.g., computer-readable program code (60)), machine learning models and / or other suitable information either on a temporary basis and / or a permanent basis. The memory may include volatile and / or non-volatile memory, and may be fixed or removable. Examples of suitable memory include random access memory (RAM), read-only memory (ROM), a hard drive, a flash memory, a thumb drive, a removable computer diskette, an optical disk, a magnetic tape or some combination of the above. Optical disks may include compact disk read-only memory (CD-ROM), compact disk read / write (CD-R/W), DVD, Blu-ray disk or the like. In various instances, the memory may be referred to as a computer-readable storage medium. The computer-readable storage medium is a non-transitory device capable of storing information, and is distinguishable from computer-readable transmission media such as electronic transitory signals capable of carrying information from one location to another. Computer-readable medium as described herein may generally refer to a computer-readable storage medium or computer-readable transmission medium.

[0339] In addition to the memory (50), the processing unit (20) may also be connected to one or more interfaces for displaying, transmitting and / or receiving information. The interfaces may include one or more communications interfaces and / or one or more user interfaces. The communications interface(s) may be configured to transmit and / or receive information, such as to and / or from other computer(s), network(s), database(s), camera(s) or the like. The communications interface(s) may be configured to transmit and / or receive information by physical (wired) and / or wireless communications links. The communications interface(s) may include interface(s) (41) to connect to a network, for example using technologies such as cellular telephone, Wi-Fi, satellite, cable, digital subscriber line (DSL), fiber optics and the like. In some examples, the communications interface(s) may include one or more short-range communications interfaces (42) configured to connect devices using short-range communications technologies such as NFC, RFID, Bluetooth, Bluetooth LE, ZigBee, infrared (e.g., IrDA) or the like.

[0340] The user interfaces may include a display (30). The display (30) may be configured to present or otherwise display information to a user, suitable examples of which include a liquid crystal display (LCD), light-emitting diode display (LED), plasma display panel (PDP) or the like. The user input interface(s) (11) may be wired or wireless, and may be configured to receive information from a user into the computer system (1), such as for processing, storage and / or display. Suitable examples of user input interfaces include a microphone, image or video capture device, keyboard or keypad, joystick, touch-sensitive surface (separate from or integrated into a touchscreen) or the like. In some examples, the user interfaces may include automatic identification and data capture (AIDC) technology (12) for machine-readable information. This may include barcode, radio frequency identification (RFID), magnetic stripes, optical character recognition (OCR), integrated circuit card (ICC), and the like. The user interfaces may further include one or more interfaces for communicating with peripherals such as printers, cameras and the like.

[0341] As indicated above, a computer program (60) may be stored in memory (50), and executed by processing unit (20) that is thereby programmed, to implement functions of the systems, subsystems, tools and their respective elements described herein. As will be appreciated, any suitable program code instructions may be loaded onto a computer or other programmable apparatus from a computer-readable storage medium to produce a particular machine, such that the particular machine becomes a means for implementing the functions specified herein. These program code instructions may also be stored in a computer-readable storage medium that can direct a computer, processing unit or other programmable apparatus to function in a particular manner to thereby generate a particular machine or particular article of manufacture. The instructions stored in the computer-readable storage medium may produce an article of manufacture, where the article of manufacture becomes a means for implementing functions described herein. The program code instructions may be retrieved from a computer-readable storage medium and loaded into a computer, processing unit or other programmable apparatus to configure the computer, processing unit or other programmable apparatus to execute operations to be performed on or by the computer, processing unit or other programmable apparatus.

[0342] Retrieval, loading and execution of the program code instructions may be performed sequentially such that one instruction is retrieved, loaded and executed at a time. In some example implementations, retrieval, loading and / or execution may be performed in parallel such that multiple instructions are retrieved, loaded, and / or executed together. Execution of the program code instructions may produce a computer-implemented process such that the instructions executed by the computer, processing circuitry or other programmable apparatus provide operations for implementing functions described herein.

[0343] Execution of instructions by a processing unit, or storage of instructions in a computer-readable storage medium, supports combinations of operations for performing the specified functions. In this manner, a computer system (1) may include a processing unit (20) and a computer-readable storage medium or memory (50) coupled to the processing unit, where the processing unit is configured to execute computer-readable program code (60) stored in the memory. It will also be understood that one or more functions, and combinations of functions, may be implemented by special purpose hardware-based computer systems and / or processing circuitry which perform the specified functions, or combinations of special purpose hardware and program code instructions.

Claims

1. A computer-implemented method comprising: providing at least part of a trained machine learning model, wherein the trained machine learning model has been trained using training data to capture features of an image of an arthropod, generate an embedding based on the features, and perform a task based on the embedding; over a period of time, receiving a plurality of images of an area; for each image of the plurality of images: inputting the image into the trained machine learning model, and determining or receiving one or more embeddings from the trained machine learning model, each of the one or more embeddings representing an arthropod depicted in the inputted image; determining a biodiversity score based on the embeddings, wherein the biodiversity score is a measure of variability of the embeddings; and storing and / or outputting the biodiversity score and / or transmitting the biodiversity score to a separate computer system.

2. The method according to claim 1, wherein the task is or comprises a localization task, a segmentation task, a classification task, an object recognition task, a reconstruction task, a colorization task, an image super-resolution task or a combination of one or more of said tasks.

3. The method according to claim 1 or 2, wherein each embedding is a numerical representation of an arthropod depicted in an inputted image.

4. The method according to any one of claims 1 to 3, wherein (i) the biodiversity score is the variance of the embeddings or based thereon, or (ii) the biodiversity score is the standard deviation of the embeddings or based thereon, or (iii) the biodiversity score is the differential entropy of the embeddings or based thereon.

5. The method according to any one of claims 1 to 4, wherein the period of time is several days or several weeks or more.

6. The method according to any one of claims 1 to 5, wherein the area is an area within a field, an area within an orchard, an area within a plantation or an area within a foil tunnel, or the area is an area located near a field, an area located near an orchard, or an area located near a plantation.

7. The method according to any one of claims 1 to 6, wherein the area is flat or comprises a flat surface, wherein the area and / or the surface has an extension in the range of 100 mm x 200 mm to 250 mm x 250 mm.

8. The method according to any one of claims 1 to 7, wherein the biodiversity score is progressively determined and / or updated.

9. The method according to any one of claims 1 to 8, wherein the machine learning model is or comprises a classification model during training, wherein the classification model is configured to assign a class to each arthropod depicted in an image and / or reference image based on the embedding.

10. The method according to claim 9, wherein the class indicates a class, order, family, genus, and / or species in the sense of biological taxonomy.

11. The method according to any one of claims 1 to 10, wherein the trained machine learning model is or comprises a convolutional neural network.

12. The method according to any one of claims 1 to 11, wherein training the machine learning model comprised: for each reference image of the multitude of reference images: inputting the reference image into the machine learning model, receiving a predicted class as an output from the machine learning model, comparing the predicted class with the target data, determining a deviation between the predicted class and the target data, reducing the deviation by modifying model parameters of the machine learning model.

13. The method according to any one of claims 1 to 12, wherein the trained machine learning model is configured and was trained to locate and classify arthropods in images and / or reference images, wherein the training data comprise (i) a multitude of reference images as input data, each reference image depicting one or more arthropods, (ii) for each arthropod depicted in a reference image, information about a location of the arthropod in the reference image as first target data, and (iii) for each arthropod depicted in a reference image, information about a class to which the arthropod depicted in the reference image belongs as second target data, wherein training the machine learning model comprised: for each reference image of the multitude of reference images: inputting the reference image into the machine learning model, for each arthropod depicted in the reference image: receiving a predicted location of the arthropod in the reference image and a predicted class of the arthropod in the reference image as an output from the machine learning model, for each arthropod depicted in the reference image: determining a first deviation between the predicted location and the first target data and a second deviation between the predicted class and the second target data, reducing the deviations by modifying parameters of the machine learning model.

14. A computer system comprising: a processing unit; and a memory storing a computer program configured to cause, when executed by the processing unit, the computer system to perform the method according to any one of claims 1 to 13.

15. A non-transitory computer readable storage medium having stored thereon a computer program that, when executed by a processing unit of a computer system, causes the computer system to perform the method according to any one of claims 1 to 13.
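
The score variants named in claim 4 and the progressive update of claim 8 can be illustrated with a short Python sketch. It rests on assumptions that are not taken from the claims: the "variance of the embeddings" is interpreted as the sum of the per-dimension sample variances, the differential entropy is computed as if the embedding dimensions were independent and Gaussian, and the running update uses Welford's algorithm. The class name `RunningEmbeddingStats` and the example embeddings are hypothetical.

```python
import math
from typing import List, Sequence


class RunningEmbeddingStats:
    """Per-dimension mean and variance, updated one embedding at a time (Welford)."""

    def __init__(self, dim: int) -> None:
        self.count = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations from the mean

    def update(self, embedding: Sequence[float]) -> None:
        """Fold one new embedding into the statistics (progressive update)."""
        self.count += 1
        for i, value in enumerate(embedding):
            delta = value - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (value - self.mean[i])

    def variances(self) -> List[float]:
        if self.count < 2:
            return [0.0] * len(self.mean)
        return [m2 / (self.count - 1) for m2 in self.m2]

    def variance_score(self) -> float:
        """Assumed variant (i): sum of the per-dimension sample variances."""
        return sum(self.variances())

    def std_score(self) -> float:
        """Assumed variant (ii): square root of the total variance."""
        return math.sqrt(self.variance_score())

    def entropy_score(self) -> float:
        """Assumed variant (iii): differential entropy in nats,
        treating the dimensions as independent Gaussians."""
        floor = 1e-12  # avoids log(0) when a dimension does not vary
        return sum(
            0.5 * math.log(2 * math.pi * math.e * max(v, floor))
            for v in self.variances()
        )


# Usage with hypothetical 2-dimensional embeddings arriving over time.
stats = RunningEmbeddingStats(dim=2)
for embedding in [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]:
    stats.update(embedding)
    current_score = stats.variance_score()  # score is available after every image
```

Because only the count, the running mean and the running sum of squared deviations are kept, the score can be updated whenever a new image arrives without storing earlier embeddings.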