Remote crop yield prediction system

A system using satellite sensors and machine learning models addresses the limitations of traditional crop yield estimation by accurately predicting crop yield and soil health, enhancing agricultural resource management and food security.

WO2026139854A1 · PCT designated stage · Publication Date: 2026-07-02 · SABIC AGRI NUTRIENTS CO

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
SABIC AGRI NUTRIENTS CO
Filing Date
2025-12-22
Publication Date
2026-07-02

AI Technical Summary

Technical Problem

Traditional crop yield estimation methods are time-consuming, labor-intensive, prone to human error, and fail to provide timely and accurate predictions, especially in the face of climate changes and varying food demand, leading to challenges in agricultural resource management and food security.

Method used

Multispectral and hyperspectral satellite sensors are combined with machine learning models, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs), that process weather and satellite data to provide accurate and timely crop yield prediction and soil health estimation without physical sampling.

Benefits of technology

Enables precise and timely crop yield estimation, reducing labor and time, improving agricultural resource management, and enhancing food security by providing reliable predictions adaptable to climate changes.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

Systems, methods, computer readable media, and devices that support predicting crop yield and estimating soil health. The method may include receiving weather data and satellite data corresponding to an area of land; generating an input dataset from the weather data and the satellite data; providing the input dataset to a trained machine learning model configured to determine crop health data of a crop in the area of land and / or soil content data of soil in the area of land from a portion of the input dataset corresponding to the satellite data; determining, via the trained machine learning model, a predicted crop yield of the area of land; and initiating, by the one or more processors, one or more recommended actions based on the predicted crop yield.

Description

DESCRIPTION

REMOTE CROP YIELD PREDICTION SYSTEM

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to and the benefit of Indian Application No. 202441103114, filed December 26, 2024, the contents of which are incorporated into the present application by reference in their entirety.

TECHNICAL FIELD

[0002] The present disclosure generally relates to crop yield prediction and more particularly to systems and machine learning models for remotely predicting crop yield.

BACKGROUND

[0003] The challenges in agricultural sustainability have become more intense in recent years with a sharp rise in the cost of food and energy, climate change, water scarcity, degradation of natural ecosystems and biodiversity, financial crises, and increases in population. With increases in demand for food and agricultural products, crop yield estimation is a crucial component of modern agriculture, providing valuable insights for decision making, resource allocation, and food security.

[0004] Traditional methods of crop yield estimation rely on direct observation, field surveys, and manual data collection. These traditional methods of data collection are often time-consuming and labor-intensive, and may introduce human error, subjectivity, or inaccuracies into data collected and crop yields predicted. Moreover, traditional field surveys and ground collection may not always provide timely crop yield information to decision makers. Farmers and agricultural industries can face several challenges when the crop yield is unknown in advance, including difficulty with agricultural resource management, increased financial risk, market volatility, and supply chain disruptions. Moreover, as climate changes over time, traditional yield estimation methods may not accurately account for changes in climate and weather patterns, or may become outdated. As such, a need exists for crop yield estimation systems and tools that can provide accurate and reliable results in a faster time and that can readily adapt to changes in climate and food demand.

300879086.1 - 1 -

SUMMARY

[0005] With recent advancements in artificial intelligence (AI), machine learning (ML), and satellite imagery, crop yields can now be predicted more accurately. Applicant has found that multispectral and / or hyperspectral sensors in satellites, together with advanced data processing techniques, may be used for the extraction of information such as crop health, weather data, and environmental variables in order to more accurately anticipate crop yield production. Weather and agrology data can be provided to a trained machine learning model to predict the crop yield at a farm level with high accuracy and to provide better understanding of soil chemistry and plant health with less of a need for physical sampling of soil and plants in an area of land. Machine learning models may also be adaptable to changes in input data, including changes over an extended period of time.

[0006] Aspects of the present disclosure provide systems, methods, and computer-readable storage media providing functionality for predicting crop yield and estimating soil health using satellite data and machine learning models.

[0007] In an aspect, a method for predicting crop yield and estimating soil health includes receiving, by one or more processors, weather data and satellite data corresponding to an area of land; generating, by the one or more processors, an input dataset from the weather data and the satellite data; providing, by the one or more processors, the input dataset to a trained machine learning model configured to determine from a portion of the input dataset corresponding to the satellite data one or both of: crop health data of a crop in the area of land, and soil content data of soil in the area of land; determining, by the one or more processors via the trained machine learning model, a predicted crop yield of the area of land based on the input dataset, the crop health data, the soil content data, and / or historical yield data; and initiating, by the one or more processors, one or more recommended actions based on the predicted crop yield.
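As a non-limiting illustration, the steps recited in paragraph [0007] can be sketched in Python. This is a minimal sketch under stated assumptions and not the disclosed implementation: the names `generate_input_dataset`, `stub_model`, `recommend_actions`, and `run_pipeline`, the fields of `Prediction`, the arithmetic inside the stub, and the 5.0 threshold are all hypothetical, and the stub merely stands in for the trained machine learning model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Prediction:
    crop_health: float      # hypothetical score derived from the satellite portion
    soil_content: float     # hypothetical soil value derived from the satellite portion
    predicted_yield: float  # hypothetical units (for example, tonnes per hectare)


def generate_input_dataset(weather: List[float], satellite: List[float]) -> Dict[str, List[float]]:
    """Combine weather data and satellite data into one input dataset."""
    if not weather or not satellite:
        raise ValueError("weather and satellite data must be non-empty")
    return {"weather": list(weather), "satellite": list(satellite)}


def stub_model(dataset: Dict[str, List[float]]) -> Prediction:
    """Placeholder for the trained model; the arithmetic is illustrative only."""
    sat = dataset["satellite"]
    wx = dataset["weather"]
    crop_health = sum(sat) / len(sat)  # uses only the satellite portion
    soil_content = min(sat)            # uses only the satellite portion
    predicted_yield = 10.0 * crop_health + 0.01 * (sum(wx) / len(wx))
    return Prediction(crop_health, soil_content, predicted_yield)


def recommend_actions(prediction: Prediction, low_yield_threshold: float = 5.0) -> List[str]:
    """Map a predicted yield to recommended actions (hypothetical rule)."""
    if prediction.predicted_yield < low_yield_threshold:
        return ["review irrigation and fertilizer plan"]
    return ["proceed with planned harvest logistics"]


def run_pipeline(weather: List[float], satellite: List[float],
                 model: Callable[[Dict[str, List[float]]], Prediction] = stub_model):
    """Receive data, build the input dataset, predict, then initiate actions."""
    dataset = generate_input_dataset(weather, satellite)
    prediction = model(dataset)
    return prediction, recommend_actions(prediction)
```

Any callable that accepts the input dataset and returns a `Prediction` could be passed as `model`, which keeps the data preparation and the action step independent of how the model itself is built.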

[0008] In an aspect, a system for predicting crop yield and estimating soil health includes a memory storing processor-readable code; and at least one processor coupled to the memory, the at least one processor configured to execute the processor-readable code to cause the at least one processor to perform operations including: receiving, by one or more processors, weather data and satellite data corresponding to an area of land; generating, by the one or more processors, an input dataset from the weather data and the satellite data; providing, by the one or more processors, the input dataset to a trained machine learning model configured to determine from a portion of the input dataset corresponding to the satellite data one or both of: crop health data of a crop in the area of land, and soil content data of soil in the area of land; determining, by the one or more processors via the trained machine learning model, a predicted crop yield of the area of land based on the input dataset, the crop health data, the soil content data, and / or historical yield data; and initiating, by the one or more processors, one or more recommended actions based on the predicted crop yield.

[0009] In an aspect, a computer readable medium may include processor-readable code which, when executed by one or more processors cause the one or more processors to perform operations, including: receiving, by one or more processors, weather data and satellite data corresponding to an area of land; generating, by the one or more processors, an input dataset from the weather data and the satellite data; providing, by the one or more processors, the input dataset to a trained machine learning model configured to determine from a portion of the input dataset corresponding to the satellite data one or both of: crop health data of a crop in the area of land, and soil content data of soil in the area of land; determining, by the one or more processors via the trained machine learning model, a predicted crop yield of the area of land based on the input dataset, the crop health data, the soil content data, and / or historical yield data; and initiating, by the one or more processors, one or more recommended actions based on the predicted crop yield.

[0010] In an aspect, a trained machine learning model includes a first convolutional neural network (CNN) model configured to operate on a weather data portion of the input dataset; a second CNN model configured to operate on a satellite data portion of the input dataset; and a recurrent neural network (RNN) model configured to receive as inputs, the output of the first CNN model and the output of the second CNN model. According to aspects, the RNN model is configured to provide the predicted crop yield.
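The arrangement of paragraph [0010] (two convolutional branches feeding a recurrent stage) can be illustrated with a toy forward pass in plain Python. This is a sketch under stated assumptions only: the kernels and weights are fixed, untrained, hypothetical values; the recurrent stage is a simple Elman-style cell with a scalar hidden state; both inputs are assumed to have the same number of time steps; and the function names `conv1d`, `rnn_predict`, and `predict_yield` are not taken from the disclosure.

```python
import math
from typing import List

Sequence = List[List[float]]  # shape: time steps x features


def conv1d(seq: Sequence, kernel: List[float], bias: float = 0.0) -> Sequence:
    """Slide a kernel over the time axis of each feature column, then apply ReLU.

    As is conventional for CNN layers, this computes a cross-correlation
    with 'valid' padding, so the output has len(seq) - len(kernel) + 1 steps.
    """
    k = len(kernel)
    out = []
    for t in range(len(seq) - k + 1):
        row = []
        for f in range(len(seq[0])):
            s = bias + sum(kernel[i] * seq[t + i][f] for i in range(k))
            row.append(max(0.0, s))
        out.append(row)
    return out


def rnn_predict(steps: Sequence, w_in: List[float], w_rec: float, w_out: float) -> float:
    """Elman-style recurrence with a scalar hidden state and a linear read-out."""
    h = 0.0
    for x in steps:
        h = math.tanh(sum(w * v for w, v in zip(w_in, x)) + w_rec * h)
    return w_out * h


def predict_yield(weather: Sequence, satellite: Sequence) -> float:
    if len(weather) != len(satellite):
        raise ValueError("both inputs must cover the same time steps")
    wx_features = conv1d(weather, kernel=[0.5, 0.5])        # first CNN branch
    sat_features = conv1d(satellite, kernel=[0.25, 0.75])   # second CNN branch
    # Concatenate the two branch outputs per time step before the RNN stage.
    merged = [w + s for w, s in zip(wx_features, sat_features)]
    w_in = [0.1] * len(merged[0])  # hypothetical, untrained input weights
    return rnn_predict(merged, w_in, w_rec=0.5, w_out=10.0)
```

In practice the weights would be learned from training data and each stage would typically be much larger; the sketch only shows how the outputs of the two branches are joined and consumed step by step by the recurrent stage.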

[0011] Using the aforementioned features, which are described in more detail below with reference to FIGs. 1-4, a system for predicting crop yield and estimating soil health using a trained machine learning model may be provided that enables several benefits and technical improvements. For example, more accurate and timely crop yield estimation may enable better planning of agricultural resources, and may allow for increasing crop yield at time of harvest. The disclosed systems and methods also provide functionality that supports operations to assess land capacity and create a rating in relation to the predicted yield of the current crop. Such operations may enable accurate estimates of crop yield without physically sampling the soil or the crops in an area of land. Such benefits may save time and labor from physically sampling soil and vegetation at the area of land. Moreover, the predicted crop yield may be determined by the trained machine learning model more quickly than by traditional physical sampling, which may prevent unnecessary crop loss due to damage during physical sampling or from delayed action. As such, the systems and methods disclosed herein may improve the accuracy and reliability of food security initiatives by determining food output to meet the nutritional needs of an expanding global population.

[0012] Non-limiting aspects include the following aspects. Aspect 1: A method for predicting crop yield and estimating soil health, the method comprising: receiving, by one or more processors, weather data and satellite data corresponding to an area of land; generating, by the one or more processors, an input dataset from the weather data and the satellite data; providing, by the one or more processors, the input dataset to a trained machine learning model configured to determine crop health data of a crop in the area of land and / or soil content data of soil in the area of land from a portion of the input dataset corresponding to the satellite data; determining, by the one or more processors via the trained machine learning model, a predicted crop yield of the area of land based on the input dataset, the crop health data, the soil content data, and / or historical yield data; and initiating, by the one or more processors, one or more recommended actions based on the predicted crop yield.

[0013] Aspect 2: The method of aspect 1, wherein the trained machine learning model is configured to predict crop yield based on a type of the crop.

[0014] Aspect 3: The method of any one of aspects 1 to 2 further comprising determining the soil content data without physically sampling the soil in the area of land.

[0015] Aspect 4: The method of any one of aspects 1 to 3, wherein the trained machine learning model comprises: a first convolutional neural network (CNN) model configured to operate on a weather data portion of the input dataset; a second CNN model configured to operate on a satellite data portion of the input dataset; and a recurrent neural network (RNN) model configured to receive as inputs, the output of the first CNN model and the output of the second CNN model, wherein the RNN model is configured to provide the predicted crop yield.

[0016] Aspect 5: The method of aspect 4, wherein the RNN model is configured to capture temporal dependencies in one or both of the weather data portion and the satellite data portion.

[0017] Aspect 6: The method of any one of aspects 1 to 5, wherein generating the input dataset comprises filtering the satellite data to remove atmospheric interference from the satellite data.

[0018] Aspect 7: The method of any one of aspects 1 to 6, wherein generating the input dataset comprises applying cloud masking to filter out a portion of the satellite data corresponding to images of the area of land when obscured by clouds.

[0019] Aspect 8: The method of any one of aspects 1 to 7, wherein the soil content data comprises data for a moisture level of soil in the area of land, data for a nutrient content of the soil in the area of land, data for the pH of the soil in the area of land, data for the roughness of the soil in the area of land, data for the density of the soil in the area of land, or a combination thereof.

[0020] Aspect 9: The method of any one of aspects 1 to 8, wherein the satellite data comprises spectral data obtained in the microwave band.

[0021] Aspect 10: The method of any one of aspects 1 to 9, wherein the satellite data comprises spectral data obtained in the centimeter wavelength.

[0022] Aspect 11: The method of aspect 10, wherein the satellite data comprises spectral data obtained in the 5.2 centimeter to 6 centimeter wavelength.

[0023] Aspect 12: The method of aspect 11, wherein the satellite data comprises spectral data obtained in the 5.4 centimeter to 5.8 centimeter wavelength, or alternatively wherein the satellite data comprises spectral data obtained in a wavelength of substantially 5.6 centimeters.

[0024] Aspect 13: The method of any one of aspects 1 to 12, wherein the area of land is larger than 1 acre.

[0025] Aspect 14: The method of any one of aspects 1 to 13, wherein the predicted crop yield provides two or more predicted yields for the crop at different dates.

[0026] Aspect 15: The method of any one of aspects 1 to 14, wherein the predicted crop yield comprises a predicted crop yield for a date at least 10 days before an expected harvest.

[0027] Aspect 16: A computer readable medium including processor-readable code which, when executed by one or more processors cause the one or more processors to perform operations including the method of any one of aspects 1 to 15.

[0028] Aspect 17: A system for predicting crop yield and estimating soil health, the system comprising: a memory storing processor-readable code; and at least one processor coupled to the memory, the at least one processor configured to execute the processor-readable code to cause the at least one processor to perform operations including the method of any one of aspects 1 to 15.

[0029] Aspect 18: The system of aspect 17, further comprising the computer readable medium of aspect 16.
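By way of illustration, the cloud masking recited in Aspect 7 can be sketched as follows. This is a minimal sketch under stated assumptions: each scene is represented as a hypothetical dictionary with `date`, `pixels`, and `cloud_mask` keys, the cloud mask is assumed to be supplied with the imagery, and the 0.5 minimum clear fraction is an arbitrary illustrative threshold. The atmospheric filtering of Aspect 6 is not shown.

```python
from typing import Dict, List, Optional


def clear_fraction(cloud_mask: List[bool]) -> float:
    """Fraction of pixels not flagged as cloud (True means cloudy)."""
    if not cloud_mask:
        raise ValueError("cloud mask must be non-empty")
    return 1.0 - sum(cloud_mask) / len(cloud_mask)


def apply_cloud_mask(pixels: List[float], cloud_mask: List[bool]) -> List[Optional[float]]:
    """Replace cloud-obscured pixel values with None so later steps can skip them."""
    if len(pixels) != len(cloud_mask):
        raise ValueError("pixels and cloud mask must have the same length")
    return [None if cloudy else value for value, cloudy in zip(pixels, cloud_mask)]


def filter_scenes(scenes: List[Dict], min_clear_fraction: float = 0.5) -> List[Dict]:
    """Drop scenes that are mostly cloud and mask the remaining cloudy pixels."""
    kept = []
    for scene in scenes:
        if clear_fraction(scene["cloud_mask"]) >= min_clear_fraction:
            kept.append({
                "date": scene["date"],
                "pixels": apply_cloud_mask(scene["pixels"], scene["cloud_mask"]),
            })
    return kept
```

Masking individual pixels rather than discarding whole scenes retains the clear portions of a partly cloudy image, while the scene-level threshold removes images that carry little usable information.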

[0030] As used herein, the term “coupled” means connected, although not necessarily directly, and not necessarily mechanically; two items that are “coupled” may be unitary with each other.

[0031] The terms “a” and “an” are defined as one or more unless this disclosure explicitly requires otherwise.

[0032] As used herein, including in the claims, the term “or,” when used in a list of two or more items, means that any one of the listed items may be employed by itself, or any combination of two or more of the listed items may be employed. For example, if a composition is described as containing components A, B, or C, the composition may contain A alone; B alone; C alone; A and B in combination; A and C in combination; B and C in combination; or A, B, and C in combination. Also, as used herein, including in the claims, “or” as used in a list of items prefaced by “at least one of” indicates a disjunctive list such that, for example, a list of “at least one of A, B, or C” means A or B or C or AB or AC or BC or ABC (that is A and B and C) or any of these in any combination thereof.

[0033] The term “substantially” is defined as largely, but not necessarily wholly, what is specified (and includes what is specified; for example, substantially 90 degrees includes 90 degrees and substantially parallel includes parallel), as understood by a person of ordinary skill in the art. In any disclosed implementations, the term “substantially” may be substituted with “within [a percentage] of” what is specified, where the percentage includes 0.1, 1, 5, or 10 percent.

[0034] The terms “comprise” (and any form of comprise, such as “comprises” and “comprising”), “have” (and any form of have, such as “has” and “having”), and “include” (and any form of include, such as “includes” and “including”) are open-ended linking verbs. As a result, an apparatus or system that “comprises,” “has,” or “includes” one or more elements possesses those one or more elements, but is not limited to possessing only those elements. Likewise, a method that “comprises,” “has,” or “includes,” one or more steps possesses those one or more steps, but is not limited to possessing only those one or more steps.

[0035] The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims. The novel features which are believed to be characteristic of the invention, both as to its organization and method of operation, together with further objects and advantages will be better understood from the following description when considered in connection with the accompanying figures. It is to be expressly understood, however, that each of the figures is provided for the purpose of illustration and description only and is not intended as a definition of the limits of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0036] For a more complete understanding of the disclosed methods and apparatuses, reference should be made to the embodiments illustrated in greater detail in the accompanying drawings.

[0037] FIG. 1 is a block diagram illustrating an example system for predicting crop yield and estimating soil health in accordance with aspects of the present disclosure.

[0038] FIG. 2 is a block diagram illustrating an example machine learning model for predicting crop yield and estimating soil health in accordance with aspects of the present disclosure.

[0039] FIG. 3 is a flow chart illustrating an example process for predicting crop yield and estimating soil health in accordance with aspects of the present disclosure.

[0040] FIG. 4 shows an example user interface for a system for predicting crop yield and estimating soil health in accordance with aspects of the present disclosure.

[0041] FIGs. 5a-5c show an example graphical representation of rainfall data for areas of land over a period of time in accordance with aspects of the present disclosure.

[0042] FIGs. 6a-6c show an example graphical representation of temperature data for areas of land over a period of time in accordance with aspects of the present disclosure.

[0043] FIGs. 7a-7c show an example graphical representation of relative humidity data for areas of land over a period of time in accordance with aspects of the present disclosure.

[0044] FIGs. 8a-8c show an example graphical representation of wind speed data for areas of land over a period of time in accordance with aspects of the present disclosure.

[0045] FIGs. 9a-9c show an example graphical representation of crop health data for areas of land over a period of time in accordance with aspects of the present disclosure.

[0046] FIGs. 10a-10c show an example graphical representation of soil moisture data for areas of land over a period of time in accordance with aspects of the present disclosure.

[0047] FIG. 11 shows an example graphical representation of soil moisture data for an area of land over a period of time in accordance with aspects of the present disclosure.

[0048] FIG. 12 shows an example graphical representation of soil content data for portions of an area of land in accordance with aspects of the present disclosure.

[0049] FIG. 13 shows example spatial maps of crop health at different crop growth stages for an area of land.

[0050] FIG. 14 shows example spatial maps of crop health at different crop growth stages for an area of land.

[0051] FIG. 15 shows example spatial maps of crop health at different crop growth stages for an area of land.

[0052] FIGs. 16a to 16c show example spatial maps of soil nutrient distribution across an area of land in accordance with aspects of the present disclosure.

[0053] FIGs. 17a to 17b show example spatial maps of a backscatter coefficient for microwave data captured in the centimeter band across an area of land in accordance with aspects of the present disclosure.

[0054] FIGs. 18a to 18d show example spatial maps of reflectance data captured in different spectral bands across an area of land in accordance with aspects of the present disclosure.

[0055] It should be understood that the drawings are not necessarily to scale and that the disclosed embodiments are sometimes illustrated diagrammatically and in partial views. In certain instances, details which are not necessary for an understanding of the disclosed methods and apparatuses or which render other details difficult to perceive may have been omitted. It should be understood, of course, that this disclosure is not limited to the particular embodiments illustrated herein.

DETAILED DESCRIPTION

[0056] With recent advancements in artificial intelligence (AI), machine learning, and satellite imagery, crop yields can now be predicted more accurately. Applicant has found that multispectral and / or hyperspectral sensors in satellites, together with advanced data processing techniques, may be used for the extraction of information such as crop health, weather data, and environmental variables in order to more accurately anticipate crop yield production. Weather and agrology data can be provided to a trained machine learning model to predict the crop yield at a farm level with high accuracy and to provide better understanding of soil chemistry and plant health with less of a need for physical sampling of soil and plants in an area of land. Machine learning models may also be adaptable to changes in input data, including changes over an extended period of time.

[0057] Aspects of the present disclosure provide systems, methods, and computer-readable storage media providing functionality for predicting crop yield and estimating soil health using satellite data and machine learning models. The figures are used herein to provide non-limiting examples of embodiments of the systems, methods, and computer-readable storage media disclosed herein.

[0058] Referring to FIG. 1, a block diagram illustrating an example system for predicting crop yield and estimating soil health in accordance with aspects of the present disclosure is shown as a system 100. As shown in FIG. 1, the system 100 includes a computing device 110. In an aspect, the functionality described with respect to the computing device 110 may be implemented via a cloud, as shown by cloud-based logic 152, rather than or in addition to via a server or other type of computing device. Additionally or alternatively, functionality described with respect to the computing device 110 may be implemented via multiple computing devices (e.g., computing devices 130) or across a network 150. The computing device 110 includes one or more processors 112, a memory 114, one or more machine learning (ML) models 122, an input formatting engine 120, one or more communication interfaces 124, and one or more input / output (I / O) devices 126. Each of the one or more processors 112 may be a central processing unit (CPU), a graphics processing unit (GPU), or other computing circuitry (e.g., a microcontroller, one or more application specific integrated circuits (ASICs), and the like) and each processor 112 may have one or more processing cores.

[0059] The memory 114 may include read only memory (ROM) devices, random access memory (RAM) devices, one or more hard disk drives (HDDs), flash memory devices, solid state drives (SSDs), network attached storage (NAS) devices, other devices configured to store data in a persistent or non-persistent state, or a combination of different memory devices. The memory 114 may store instructions 116 that, when executed by the one or more processors 112, cause the one or more processors 112 to perform operations, such as the operations described in connection with the computing device 110 with reference to FIGs. 1-4 and / or the Examples herein. The memory 114 may also store one or more databases 118 configured to store data. For example, database 118 may store data to be processed or preprocessed by input formatting engine 120, to be provided to the machine learning model 122 as an input, or that is received as an output of the machine learning model 122. Database 118 may include a structured database management system configured to handle large amounts of field and satellite data generated and / or referenced by agricultural systems.

[0060] The one or more communication interfaces 124 may be configured to communicatively couple the computing device 110 to the one or more networks 150 via wired or wireless communication links according to one or more communication protocols or standards (e.g., an Ethernet protocol, a transmission control protocol / internet protocol (TCP / IP), an Institute of Electrical and Electronics Engineers (IEEE) 802.11 protocol, an IEEE 802.16 protocol, a 3rd Generation (3G) communication standard, a 4th Generation (4G) / long term evolution (LTE) communication standard, a 5th Generation (5G) communication standard, and the like). The I / O devices 126 may include one or more display devices, a keyboard, a stylus, one or more touchscreens, a mouse, a trackpad, a camera, one or more speakers, haptic feedback devices, or other types of devices that enable a user to receive information from or provide information to the computing device 110.

[0061] Input formatting engine 120 may be configured to receive data from one or more data sources 140 over the network 150. Data received from the data sources 140 may include data from one or more satellites 142 (e.g., satellite data 144) or weather data (e.g., weather data 146). Satellite data 144 may be received, purchased, downloaded, or otherwise acquired from the data sources 140. For example, Sentinel-1 or Sentinel-2 satellite data may be downloaded and stored in memory 114 or database 118. Alternatively, satellite data 144 may be retrieved from or accessed over the network 150. Satellite data 144 may include data corresponding to an area of land of interest (e.g., an area of land on which crops are growing), or it may be general satellite data. Satellite data 144 may include multispectral data and / or hyperspectral data, including data obtained in multiple bands such as two or more of optical, infrared, thermal, microwave, radio, ultraviolet, x-ray, and the like. Satellite data 144 may further include data from one or more satellite-based indices, such as the Normalized Difference Vegetation Index (NDVI), the Enhanced Vegetation Index (EVI), the Soil-Adjusted Vegetation Index (SAVI), the Green Normalized Difference Vegetation Index (GNDVI), the chlorophyll vegetation index (CVI), the Radar Vegetation Index (RVI), and the like. Satellite data 144 may include soil content data, including soil moisture (SM) data, soil nutrient data (e.g., data regarding nitrogen, phosphorus, potassium, organic carbon, pH, and / or other data related to soil chemistry), land surface water index (LSWI) data, and land surface temperature (LST) data. With respect to soil chemistry data, accurate readings for the soil content from satellite data 144 may support precision application of fertilizers and / or water to particular areas of land (or particular sections of an area of land) to improve overall soil health.

In an example, spectral data may be used to identify soil content data for an area of land, including soil moisture, soil nutrients, or soil roughness. Crop health may also be determined from spectral data. For example, data obtained in the microwave band may be useful for capturing soil content data. In an aspect, the microwave band may be a centimeter band (e.g., microwave radiation having wavelengths of centimeters). In an aspect, the microwave band may be within the range of 5.2 to 6.0 cm. In an aspect, the microwave band may be within the range of 5.4 to 5.8 cm. In an aspect, the microwave band may have wavelengths of approximately or substantially 5.6 cm. Microwave data in these bands may be able to penetrate cloud cover while still capturing useful information about crop health and soil content, and so may be more reliable. An example of the resolution, wavelength, and application of various spectral data (e.g., in various Sentinel-1 bands) is summarized below in Table 1. Example data captured across portions of an area of land in the centimeter band is shown in FIG. 17a, which shows a spatial representation of a backscatter coefficient for microwave data captured in the centimeter band using vertical-vertical (VV) polarization, and in FIG. 17b, which shows a spatial representation of a backscatter coefficient for microwave data captured in the centimeter band using vertical-horizontal (VH) polarization. VV refers to vertical polarization for both transmission and reception in radar-based remote sensing. This band is included in Sentinel-1 radar data and may help in studying land cover, vegetation structure, and surface roughness. VH is a polarization mode used in radar imagery, and currently is available in datasets from missions like Sentinel-1 (e.g., a Synthetic Aperture Radar (SAR) mission). VH may represent radar signals transmitted in a vertical polarization (V) and received in a horizontal polarization (H).

Table 1. Wavelengths of Spectral Data and Applications in Crop Data Analysis.
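For reference, the centimeter wavelengths given above can be converted to frequencies using the relation f = c / λ. The short sketch below is illustrative only; the function name `wavelength_cm_to_ghz` is a hypothetical helper and is not part of the disclosure.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact, by definition of the metre


def wavelength_cm_to_ghz(wavelength_cm: float) -> float:
    """Convert a free-space wavelength in centimeters to a frequency in GHz."""
    if wavelength_cm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_cm / 100.0
    return SPEED_OF_LIGHT_M_PER_S / wavelength_m / 1e9


# Wavelengths named in the description, in centimeters.
for cm in (5.2, 5.4, 5.6, 5.8, 6.0):
    print(f"{cm:.1f} cm -> {wavelength_cm_to_ghz(cm):.2f} GHz")
```

Under this conversion, the 5.2 to 6.0 cm range corresponds to roughly 5.0 to 5.8 GHz, and a wavelength of 5.6 cm corresponds to about 5.35 GHz, which lies in the portion of the microwave spectrum commonly referred to as the C band.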

[0062] An example of the resolution, wavelength, and application of other various spectral data (e.g., in various Sentinel-2 bands) is summarized below in Table 2. Example data captured in several spectral bands is shown in FIGs. 18a-18d. FIG. 18a shows a spatial representation (e.g., a spatial map) of reflectance data captured in the near-infrared (NIR) band across portions of an area of land. The NIR band may refer to Band 8 of Sentinel-2. This band may capture reflectance data in the near-infrared spectrum, such as specifically at 842 nm, or at any of, or within any range between any of, 800, 810, 820, 825, 830, 835, 840, 845, 850, 855, 860, 865, 870, 875, 880, 885, 890, 895, and 900 nm. FIG. 18b shows a spatial representation of reflectance data captured in the red edge 4 band across portions of an area of land. This red edge 4 band is part of the red-edge spectral region, which may be sensitive to vegetation changes, including chlorophyll content and canopy structure. FIG. 18c shows a spatial representation of reflectance data captured in a shortwave infrared (SWIR) band across portions of an area of land. The SWIR 1 band illustrated in FIG. 18c is sensitive to vegetation properties, especially in terms of water content and plant health. The SWIR 1 band may be sensitive to the moisture content in soils, and is often used for mapping different geological and soil types. FIG. 18d shows a spatial representation of reflectance data captured in a second shortwave infrared (SWIR) band across portions of an area of land. The SWIR 2 band illustrated in FIG. 18d is part of the shortwave infrared (SWIR) spectrum, and may be useful for distinguishing between different land cover types, particularly in arid and semi-arid areas.

Table 2. Wavelengths of Spectral Data and Applications in Crop Data Analysis.
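Several of the satellite-based indices named in this description can be computed directly from band reflectance values such as those discussed above. The sketch below uses the commonly published definitions of NDVI, GNDVI, LSWI, and SAVI; it is an illustrative assumption that these standard forms apply here, since the disclosure does not recite the formulas. The function names, the default soil adjustment factor of 0.5, and the handling of a zero denominator are choices made for this example only.

```python
def normalized_difference(a: float, b: float) -> float:
    """Generic normalized difference (a - b) / (a + b); returns 0.0 if a + b is 0."""
    denominator = a + b
    if denominator == 0:
        return 0.0
    return (a - b) / denominator


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return normalized_difference(nir, red)


def gndvi(nir: float, green: float) -> float:
    """Green NDVI from NIR and green reflectance."""
    return normalized_difference(nir, green)


def lswi(nir: float, swir: float) -> float:
    """Land Surface Water Index from NIR and shortwave infrared reflectance."""
    return normalized_difference(nir, swir)


def savi(nir: float, red: float, soil_factor: float = 0.5) -> float:
    """Soil-Adjusted Vegetation Index with soil adjustment factor L (default 0.5)."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Hypothetical reflectance values for a single pixel (unitless, 0 to 1).
pixel = {"nir": 0.5, "red": 0.1, "green": 0.15, "swir": 0.3}
print(round(ndvi(pixel["nir"], pixel["red"]), 3))
print(round(savi(pixel["nir"], pixel["red"]), 3))
```

Applying these functions pixel by pixel across an image would yield spatial maps comparable in kind to the crop health and soil moisture maps described with reference to the figures.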

[0063] In an aspect, the soil content data obtained by satellite may also reduce or eliminate the need for physical laboratory sampling of soil in the plot of land, which may save both time and cost in determining soil health, and may provide a more accurate and holistic picture of the soil content of the area of land. In conjunction with providing vegetation indices and evaluating crop health, satellite data for soil content may be especially valuable for evaluating large areas of land or farms and predicting their crop yield, as their size may be prohibitively large to allow for representative physical sampling.

[0064] Returning to FIG. 1, weather data 146 may include rainfall data, windspeed data, relative humidity data, temperature data, and the like. Weather data 146 may include weather data obtained through local instruments, weather data obtained through one or more weather models, weather data obtained through the satellite 142, weather data from large weather databases, and the like. Weather data 146 may include data corresponding to an area of land of interest (e.g., an area of land on which crops are growing), local weather data, regional weather data, or it may be general weather data or information. Weather data 146 and satellite data 144 may include current data (e.g., live data, such as may be streamed directly from a data source), recent data (e.g., data from the last minute, hour, day, week, month, year, and so on), historical data, or a combination thereof.

[0065] Input formatting engine 120 may be configured to reduce large general satellite and weather datasets to more useful datasets by removing less relevant data from a dataset. For example, input formatting engine 120 may filter out data that does not correspond to an area of land of interest (e.g., based on coordinate data for the satellite data or weather data). As an additional or alternative example, input formatting engine 120 may filter out data that does not correspond to metrics relevant in predicting crop yield or evaluating soil content. Additionally or alternatively, input formatting engine 120 may be configured to retrieve only the portions of the general satellite and weather datasets that may be useful as inputs and / or training data for the machine learning model 122.
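
The two filtering examples in this paragraph (by area of interest and by relevant metric) can be illustrated with a minimal sketch. The `Observation` record, the bounding-box representation, and the metric names are hypothetical and chosen only for illustration; the disclosure does not specify a data layout for input formatting engine 120.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Hypothetical record for one satellite or weather data point."""
    lat: float
    lon: float
    metric: str
    value: float


def filter_observations(observations, bbox, relevant_metrics):
    """Keep observations inside the area of interest and with a relevant metric.

    bbox is (min_lat, min_lon, max_lat, max_lon); a rectangular area is an
    assumption made for this sketch.
    """
    min_lat, min_lon, max_lat, max_lon = bbox
    return [
        obs
        for obs in observations
        if min_lat <= obs.lat <= max_lat
        and min_lon <= obs.lon <= max_lon
        and obs.metric in relevant_metrics
    ]


observations = [
    Observation(24.70, 46.70, "soil_moisture", 0.21),  # inside, relevant
    Observation(24.70, 46.70, "sea_state", 1.50),      # inside, not relevant
    Observation(30.00, 50.00, "soil_moisture", 0.18),  # outside the area
]
kept = filter_observations(
    observations,
    bbox=(24.0, 46.0, 25.0, 47.0),
    relevant_metrics={"soil_moisture", "rainfall"},
)
```

Here only the first observation survives both filters; the same predicate could equally be applied at retrieval time so that only the useful portions of a large dataset are fetched.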

[0066] Input formatting engine 120 may be configured to prepare satellite data 144 and / or weather data 146 to be ingested or used by ML model 122. According to aspects, input formatting engine 120 may be configured to clean the data. For example, noise or aberrations from atmospheric distortions, microwave speckle, radiation backscatter, ground reflections, or other sources, may cause outliers in a dataset that should be removed so as not to skew the dataset. Input formatting engine 120 may be configured to remove outliers or inconsistencies from the dataset. One way this may be done is to remove data points from the dataset that exceed or fail to meet a threshold. Input formatting engine 120 may be configured to preprocess satellite data 144 to remove atmospheric distortions, speckle noise, and / or terrain distortions. For example, input formatting engine 120 may employ techniques such as the Savitzky-Golay filter for crop health data or an exponential filter for soil moisture data. Additionally or alternatively, input formatting engine 120 may be configured to normalize the dataset. Other methods for filtering and preparing datasets are also known in the art. An alternative or additional operation for preparing the dataset is to categorize the dataset based on crop growth stages of crops growing on an area of land, including germination, vegetative, reproductive and maturity phases. Crop growth stages may be determined based on the date that crops were sown and the type of the crop. According to aspects, the cleaned, filtered, and / or categorized data may be used to assess land capacity and create a rating in relation to the anticipated yield of the current crop. Input formatting engine 120 may be configured to extract features from the dataset into feature vectors or another format useful for ingestion by the machine learning model 122.
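
As a rough illustration of three of the cleaning operations named above (threshold-based outlier removal, an exponential filter, and normalization), the following sketch uses only the Python standard library. The function names, the threshold values, and the smoothing factor `alpha` are assumptions for illustration and do not come from the disclosure. A Savitzky-Golay filter is not reimplemented here; one is available as `scipy.signal.savgol_filter`.

```python
def remove_outliers(values, lower, upper):
    """Drop data points that fall outside the [lower, upper] thresholds."""
    return [v for v in values if lower <= v <= upper]


def exponential_filter(values, alpha):
    """Exponentially smooth a series: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    alpha close to 1 follows the raw data; alpha close to 0 smooths heavily.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for v in values:
        if not smoothed:
            smoothed.append(v)  # the first sample seeds the filter
        else:
            smoothed.append(alpha * v + (1.0 - alpha) * smoothed[-1])
    return smoothed


def min_max_normalize(values):
    """Rescale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical volumetric soil moisture readings with one spurious spike.
raw = [0.20, 0.25, 9.00, 0.30]
cleaned = remove_outliers(raw, lower=0.0, upper=1.0)
smoothed = exponential_filter(cleaned, alpha=0.5)
normalized = min_max_normalize(smoothed)
```

The order shown (remove outliers, then smooth, then normalize) is one reasonable choice; removing the spike first keeps it from contaminating the smoothed series.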

[0067] Machine learning model 122 may be configured to receive inputs of data, including but not limited to satellite data 144 and / or weather data 146, or features extracted from such data and to generate a predicted yield for a crop growing on the area of land. While illustrated as one machine learning model 122, an ordinary artisan would understand that multiple machine learning models may be included. For example, a machine learning model 122 may be provided for each area of land of a plurality of areas of land and configured to predict yield for the crop(s) in each area of land. Additionally or alternatively, multiple machine learning models 122 may be provided and / or configured for each type of crop in each area of land. Machine learning model 122 may also be configured to provide one or more recommended actions to improve crop health and / or initiate such an action in or for the area of land. A more detailed description of how machine learning model 122 may generate a predicted yield is provided below with respect to FIGs. 2-4.

[0068] Generally speaking, machine learning techniques train models to accurately make predictions on data fed into the models (e.g., what was said by a user in a given utterance; whether a noun is a person, place, or thing; what the weather will be like tomorrow, and so on). During a learning phase, the models are developed against a training dataset of inputs to optimize the models to correctly predict the output for a given input. Generally, the learning phase may be supervised, semi-supervised, or unsupervised, indicating a decreasing level to which the “correct” outputs are provided in correspondence to the training inputs. In a supervised learning phase, all of the outputs are provided to the model and the model is directed to develop a general rule or algorithm that maps the input to the output. In contrast, in an unsupervised learning phase, the desired output is not provided for the inputs so that the model may develop its own rules to discover relationships within the training dataset. In a semi-supervised learning phase, an incompletely labeled training set is provided, with some of the outputs known and some unknown for the training dataset.

[0069] Models may be run against a training dataset for several epochs (e.g., iterations), in which the training dataset is repeatedly fed into the model to refine its results. For example, in a supervised learning phase, a model is developed to predict the output for a given set of inputs and is evaluated over several epochs to more reliably provide the output that is specified as corresponding to the given input for the greatest number of inputs for the training dataset. In another example, for an unsupervised learning phase, a model is developed to cluster the dataset into n groups and is evaluated over several epochs as to how consistently it places a given input into a given group and how reliably it produces the n desired clusters across each epoch.

[0070] Once an epoch is run, the models are evaluated and the values of their variables are adjusted to attempt to better refine the model in an iterative fashion. The values may be adjusted in several ways depending on the machine learning technique used. For example, in a genetic or evolutionary algorithm, the values for the models that are most successful in predicting the desired outputs are used to develop values for models to use during the subsequent epoch, which may include random variation and / or mutation to provide additional data points. One of ordinary skill in the art will be familiar with several other machine learning algorithms that may be applied with the present disclosure, including linear regression, random forests, k-nearest neighbor clustering, decision tree regression, neural networks, deep neural networks, convolutional neural networks, recurrent neural networks, support vector machines (SVM), and so forth.

[0071] Each model develops a rule or algorithm over several epochs by varying the values of one or more variables affecting the inputs to more closely map to a desired result, but as the training dataset may be varied, and is preferably very large, perfect accuracy and precision may not be achievable. A number of epochs that make up a learning phase, therefore, may be set as a given number of trials or a fixed time / computing budget, or may be terminated before that number / budget is reached when the accuracy of a given model is high enough or low enough or an accuracy plateau has been reached. For example, if the training phase is designed to run n epochs and produce a model with at least 95 % accuracy, and such a model is produced before the nth epoch, the learning phase may end early and use the produced model satisfying the end-goal accuracy threshold. Similarly, if a given model is inaccurate enough to satisfy a random chance threshold (e.g., the model is only 55% accurate in determining true / false outputs for given inputs), the learning phase for that model may be terminated early, although other models in the learning phase may continue training. Similarly, when a given model continues to provide similar accuracy or vacillate in its results across multiple epochs — having reached a performance plateau — the learning phase for the given model may terminate before the epoch number / computing budget is reached.
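
The termination conditions described in this paragraph can be sketched as a single check run after each epoch. The parameter names and defaults below (`warmup`, `patience`, `min_delta`) are hypothetical; the disclosure gives only the 95 % and 55 % figures as examples and does not say how a plateau is detected.

```python
def stop_reason(accuracies, max_epochs, target=0.95, chance_floor=0.55,
                warmup=5, patience=3, min_delta=0.005):
    """Return why training should stop after the latest epoch, or None.

    accuracies: per-epoch accuracy values observed so far.
    target: end-goal accuracy threshold (early success).
    chance_floor: accuracy at or below which the model is treated as no
        better than chance, checked only after `warmup` epochs (assumption).
    patience / min_delta: a plateau is declared when the best accuracy of the
        last `patience` epochs improves on the earlier best by < min_delta.
    """
    if not accuracies:
        return None
    latest = accuracies[-1]
    if latest >= target:
        return "target_reached"
    if len(accuracies) >= warmup and latest <= chance_floor:
        return "near_chance"
    if len(accuracies) > patience:
        best_before = max(accuracies[:-patience])
        best_recent = max(accuracies[-patience:])
        if best_recent - best_before < min_delta:
            return "plateau"
    if len(accuracies) >= max_epochs:
        return "budget_exhausted"
    return None
```

For instance, a history of `[0.6, 0.8, 0.96]` stops with `"target_reached"` before the epoch budget is spent, while `[0.7, 0.8, 0.801, 0.802, 0.8]` stops with `"plateau"`.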

[0072] Once the learning phase is complete, the models are finalized and / or verified. In some example embodiments, models that are finalized are evaluated against testing criteria. In an example, a testing dataset that includes known outputs for its inputs is fed into the finalized models to determine an accuracy of the model in handling data that it has not been trained on.

[0073] Reference is now made to FIG. 2, in which a block diagram illustrating an example machine learning model for predicting crop yield and estimating soil health in accordance with aspects of the present disclosure is presented as machine learning model 200. In FIG. 2, machine learning model 200 is illustrated as a series of iterations in time of a recurrent neural network (RNN) model with convolutional neural network (CNN) models configured to provide inputs to the RNN model. Machine learning model 200 (also referred to herein as ML model 200), includes iteration 210 at time (t-k), iteration 230 at time (t-k+1), iteration 250 at time (t-1) and iteration 270 at time (t). Here, k signifies the length of time dependencies within the time series data. The iterations 210, 230, 250, and 270 of the machine learning model 200 illustrated in FIG. 2 are provided as examples of how a recurrent neural network may receive or operate on time series datasets. RNN models like ML model 200 may be conceptualized as a recurring iteration of the same ML architecture (e.g., a recurrent unit), rather than a series of layers. Unlike feedforward neural networks, which process data in a single pass, RNNs may process data across multiple time steps, making them well-adapted for modelling and processing time series data. Time series datasets, including weather data and satellite data, may be of particular use when evaluating soil content or when predicting crop yield at least because the time series datasets allow for observing changes in the data over time, which may enable a machine learning model to identify relationships between the data points and predict future outcomes more accurately. 
For example, observing changes in the soil content (e.g., nutrient levels or soil moisture) for an area of land may enable machine learning model 200 to more accurately predict crop yield or recommend actions to be performed or changes to be made to the soil content (e.g., applying additional or different nutrients, changing an irrigation schedule or water application method, planting seeds in the area of land, and so on) that may increase the crop yield.

[0074] Certain farm activities and practices are inherently challenging to quantify, as they do not directly correspond to measurable units or outputs. For example, crop rotation, which involves systematically alternating crops in a field to maintain soil health and reduce pests, is a qualitative practice that influences long-term productivity rather than immediate metrics. Similarly, the choice and implementation of irrigation methods, such as drip irrigation, sprinkler systems, or flood irrigation, affect water efficiency and crop growth but may not be easily captured numerically. Other such activities include soil preparation techniques, like ploughing and mulching, which improve soil structure and fertility; integrated pest management, which employs biological and cultural methods to control pests; and practices like intercropping or agroforestry, which enhance biodiversity and resilience. These activities contribute significantly to sustainable farming outcomes but require a more holistic approach to documentation and analysis to reflect their impact. Data reflecting or corresponding to farm activities such as these may be provided to ML model 200 as management data. Such data may add accuracy to the crop yield predictions.

[0075] The machine learning model 200 may include a first RNN 212. First RNN 212 may be configured to operate on time series data corresponding to time (t-k). First RNN 212 may receive as inputs management data 214, predicted yield data 216, and the output(s) of convolutional neural network (CNN) models configured to operate on weather data and satellite data. Management data 214 may include data (e.g. time series data) related to managing or operating in the area of land, including data related to the performance of crop rotation, soil preparation, irrigation, pest management, or other farming practices. Predicted yield data 216 may be obtained using a previous iteration of the model, may reflect an expected or an optimistic outcome for the crop, may reflect one or more threshold yield amounts (e.g., a minimum profitable yield, a lower bound on expected yield, an upper bound on expected yield), may include statistical estimations of predicted yield using a statistical model, may include data derived from industry knowledge, or a combination thereof.

[0076] The CNN models may be configured to identify or extract spatial features (e.g., corresponding to the area of land or individual portions of the area of land) from the weather data and satellite data. A first weather data CNN 222 may receive as input time series weather data 218 corresponding to time (t-k). The output of the first weather data CNN 222 may be provided to a fully connected (FC) layer 224, and the output of the FC layer 224 may be provided to the first RNN 212. A first satellite data CNN 226 may receive as input time series satellite data 220 corresponding to time (t-k). The output of the first satellite data CNN 226 may be provided to a fully connected (FC) layer 228. The output of the FC layer 228 may be provided to the first RNN 212. FC layers 224 and 228 may be configured to optimize classification scores generated by the respective CNNs 222 and 226. According to some aspects, the outputs of the FC layers 224 and 228 may be combined before being provided as input to the first RNN 212. Alternatively, the outputs of the FC layers 224 and 228 may be provided to the first RNN 212 without first being combined. The RNN model 212 may be configured to capture temporal dependencies in the data inputs and / or the outputs of the respective CNNs 222 and 226.

[0077] The output of the first RNN 212 may be provided to second RNN 232 in iteration 230. While illustrated here in a linear fashion for ease of visualization, second RNN 232 is not so much a different RNN from first RNN 212 as a second iteration of the RNN, receiving similar input data as first RNN 212, but corresponding to time (t-k+1). As such, an alternative way to understand providing the output of first RNN 212 to second RNN 232 is that a recurrent unit of the RNN maintains a hidden state, essentially a form of memory, which is updated at each time step based on the current input and the previous hidden state. This feedback loop allows the network to learn from past inputs and incorporate that knowledge into its current processing.
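
The hidden-state update described here can be sketched with a simple recurrent cell of the form h_t = tanh(W_x x_t + W_h h_(t-1) + b). This is a minimal sketch, assuming a plain tanh recurrence; the disclosure does not state which recurrent unit ML model 200 uses, and the weights and feature values below are arbitrary placeholders rather than trained parameters. The input at each step stands in for the concatenated outputs of the weather and satellite FC layers.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent update: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    h = []
    for i in range(len(b)):
        s = b[i]
        s += sum(w_x[i][j] * x[j] for j in range(len(x)))
        s += sum(w_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h.append(math.tanh(s))
    return h


def run_rnn(inputs, w_x, w_h, b):
    """Apply the same recurrent unit at every time step, carrying h forward."""
    h = [0.0] * len(b)  # initial hidden state
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# Placeholder per-step features: weather FC output followed by satellite FC output.
weather_features = [[1.0], [0.0]]
satellite_features = [[0.0], [0.0]]
inputs = [w + s for w, s in zip(weather_features, satellite_features)]

# One hidden unit; weights chosen only to make the recurrence visible.
states = run_rnn(inputs, w_x=[[1.0, 0.0]], w_h=[[0.5]], b=[0.0])
```

At the second step the input is all zeros, yet the hidden state remains nonzero because it carries information forward from the first step, which is the memory effect the paragraph describes.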

[0078] At iteration 230, second RNN 232 receives as input management data 234, predicted yield data 236, and the outputs of weather data CNN 242 and satellite data CNN 246, each operating on or including data corresponding to time (t-k+1). The management data 234 and the predicted yield data 236 may be similar to management data 214 and predicted yield data 216, respectively, but corresponding to time (t-k+1). As would be recognized by one of ordinary skill in the art, the operations of weather data CNN 242 (e.g., on weather data 238), including FC layer 244, are similar to those of weather data CNN 222 described above. Likewise, the operations of satellite data CNN 246 (e.g., on satellite data 240), including FC layer 248, are similar to those of satellite data CNN 226 described above. The output of second RNN 232 may be provided to third RNN 252 in iteration 250 corresponding to time (t-1). Alternatively, the output of second RNN 232 may be provided to an RNN in an intervening iteration (e.g., operating on data corresponding to a time between time (t-k+1) and time (t-1)). Any number of intervening iterations may be performed to the extent that there are distinct time steps in the several input datasets the ML model 200 is configured to operate on.

[0079] Iteration 250 and iteration 270 may operate similarly to iterations 210 and 230, but for data corresponding to times (t-1) and (t), respectively. Iteration 250 at time (t-1) may include RNN 252 with inputs including a previous RNN output or recurrent unit (e.g., the output of RNN 232), management data 254, predicted yield data 256, and the outputs of weather data CNN 262 and satellite data CNN 266. Weather data CNN 262 receives input of weather data 258 and satellite data CNN 266 receives input of satellite data 260. Weather data CNN 262 may include FC layer 264. Satellite data CNN 266 may include FC layer 268. Iteration 270 at time (t) may include RNN 272 with inputs including a previous RNN output or recurrent unit (e.g., the output of RNN 252), management data 274, predicted yield data 276, and the outputs of weather data CNN 282 and satellite data CNN 286. Weather data CNN 282 receives input of weather data 278 and satellite data CNN 286 receives input of satellite data 280. Weather data CNN 282 may include FC layer 284. Satellite data CNN 286 may include FC layer 288. Operations of iteration 250 and iteration 270 are not described in detail, as such operations would be readily understood by one of skill in the art in view of the above discussion of iterations 210 and 230.

[0080] In some implementations, at time (t), which may correspond to a current time step or a most recent time step in the datasets, a desired yield range Y'(t) 290 (e.g., a value, a range, or a dataset) may be provided as an additional input to RNN 272. Providing a desired yield range Y'(t) 290 at a later time relative to the time series data (e.g., at time (t)) may enable more accurate predictions of the actual crop yield. The desired yield range Y'(t) 290 may include or correspond to yield predictions obtained in earlier increments of time to the present time (t). For example, if the time (t) corresponds to a time of harvest, the desired yield range Y'(t) 290 may include a harvest yield predicted at a time of sowing, at 60 days after sowing, 90 days after sowing, some interval of time other than those already mentioned, or some combination thereof. When the ML model has completed its operation, a predicted crop yield may be generated as output 292. For example, predicted crop yields at each time step may be aggregated and / or averaged to obtain a final yield estimate. Output 292 may include an assessment of land capacity or a rating in relation to the predicted yield of the current crop. Output 292 may also include recommended action(s) to be performed based at least in part on the predicted crop yield. Recommended actions may include operations to the area of land that may cause the predicted crop yield to more closely resemble the desired yield range Y'(t) 290. For example, output 292 may include a recommendation to modify an irrigation schedule to an area of land, apply nutrients to the soil of the area of land, apply a pesticide to the area of land, or to plant seeds in the area of land. Recommended actions may be general to the area of land, or may be specifically determined for portions of the area of land. 
Output 292 may cause one or more processors to initiate the recommended action(s), including an action to modify or retrain the ML model 200, generate a signal or alert to modify how nutrients are provided to the area of land, display a report of the predicted crop yield at a display device, and so on.
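
The averaging of per-step predictions and the comparison against a desired yield range can be sketched as follows. This is an illustrative sketch only: the simple mean, the status labels, and the mapping from status to candidate actions are assumptions, since the disclosure does not specify how output 292 selects among the recommended actions it lists.

```python
from statistics import mean


def final_yield_estimate(step_predictions):
    """Average per-time-step yield predictions into one final estimate."""
    return mean(step_predictions)


def recommend(predicted_yield, desired_range):
    """Compare a predicted yield to a desired (low, high) range.

    Returns a status label and a list of candidate actions. The actions are
    taken from the examples in the text; which one applies in practice is
    not decided here.
    """
    low, high = desired_range
    if predicted_yield < low:
        return "below_range", [
            "modify irrigation schedule",
            "apply nutrients to the soil",
            "apply a pesticide",
        ]
    if predicted_yield > high:
        return "above_range", ["display report of predicted crop yield"]
    return "within_range", ["display report of predicted crop yield"]


# Hypothetical predictions (tonnes per hectare) made at three time steps.
estimate = final_yield_estimate([4.0, 5.0, 6.0])
status, actions = recommend(3.2, desired_range=(4.0, 5.0))
```

The same comparison could be run per portion of the area of land, so that different portions receive different recommended actions.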

[0081] The machine learning model 200 may be trained on real or simulated datasets to perform soil content evaluation and crop yield prediction. As an example, simulated time series datasets may be generated to train the model. Such simulated datasets may represent environmental conditions relevant to the growth of a particular type of crop. Simulated datasets may be generated based on optimal growing conditions for the crop, to the extent such conditions are known. Training the ML model 200 may include converting simulated datasets into a percentage loss in crop yield using fuzzy logic. For example, the percentage loss in crop yield may be used as a target variable to train the ML model 200. Alternatively or additionally, the machine learning model may be trained using historical data (e.g., historical satellite data, weather data, and / or crop yield data). Whether trained on actual or simulated datasets, ML model 200 may advantageously be trained for a specific type of crop (e.g., cereals such as maize, paddy, wheat, or oats; fruits; vegetables; and so on) so that growth patterns and crop yields specific to the type of crop may be more accurately predicted. Once trained, the ML model 200 may be applied to new datasets of weather and satellite data to predict one or more crop yields. Newly predicted crop yields may be provided as feedback to the model. Crop yield predictions may also be validated at intervals for crop yield analysis (e.g., at 60 days after sowing, 90 days after sowing, or at harvest). For example, the predicted yields may be validated by comparing them with actual yields obtained from field data, and calculating performance metrics such as root mean square error (RMSE), mean absolute error (MAE), and R² to evaluate model accuracy. These performance metrics may also be used to refine, modify and / or retrain the ML model 200. 
Validation may also account for the final yield of the area of land with or without losses occurring due to machinery or occurring during the threshing process.
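
The three validation metrics named in this paragraph follow standard definitions and can be computed as in the sketch below. The yield values in the example are made up for illustration and are not results from the disclosure.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between actual and predicted yields."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error between actual and predicted yields."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Illustrative field yields versus model predictions (same units).
actual_yields = [3.0, 4.0, 5.0]
predicted_yields = [2.0, 4.0, 6.0]
metrics = {
    "rmse": rmse(actual_yields, predicted_yields),
    "mae": mae(actual_yields, predicted_yields),
    "r2": r_squared(actual_yields, predicted_yields),
}
```

Lower RMSE and MAE and an R² closer to 1 indicate better agreement; computing the metrics separately at each validation interval (e.g., 60 days, 90 days, harvest) shows how accuracy evolves over the season.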

[0082] An advantage of the machine learning architecture described with respect to FIG. 2 of an RNN with inputs coming from the outputs of two CNNs is that this architecture may perform better than other architectures of machine learning models at estimating crop yield. For example, the combination of RNN and CNN models described herein may outperform other machine learning models in terms of RMSE, MAE, and / or R² for crop yield prediction accuracy. Another advantage of the architecture described is that it may be configured to account for or identify time dependencies and similar relationships within time series data (e.g., weather and satellite data), enabling more accurate predictions and recommended actions.

[0083] FIG. 3 is a flow chart illustrating an example process for predicting crop yield and estimating soil health in accordance with aspects of the present disclosure. The method described with respect to FIG. 3 may be performed, for example, by the one or more processors 112 of FIG. 1. At step 302, the one or more processors may receive weather data and satellite data corresponding to an area of land. The weather data and satellite data may include or correspond to the satellite data 144 and the weather data 146 described above. For example, the satellite data may include spectral data obtained in the microwave band, spectral data obtained in the centimeter wavelength, spectral data obtained in the 5.2 centimeter to 6 centimeter wavelength, spectral data obtained in the 5.4 centimeter to 5.8 centimeter wavelength, and / or spectral data obtained at a wavelength of substantially 5.6 cm. In some implementations, the area of land may be larger than 1 acre.

[0084] At step 304, the one or more processors may generate an input dataset from the weather data and the satellite data. Generating an input dataset may include operations corresponding to those described above with reference to input formatting engine 120. For example, generating the input dataset may include removing outliers and / or inconsistencies from the weather data and / or the satellite data. In some implementations, generating the input dataset may include filtering the satellite data to remove atmospheric interference from the satellite data. In some implementations, generating the input dataset may include applying cloud masking to filter out a portion of the satellite data corresponding to images of the area of land when obscured by clouds. Additionally or alternatively, generating the input dataset may include performing radiometric correction of the data. In some implementations, generating the input dataset may include categorizing the data (e.g., based on type of crop, crop sowing date, crop maturity, area of land location, climate data, or other factors). Generating the input dataset may also include formatting the data into a format capable of being provided to or received by a trained machine learning model (e.g., extracting features from the data into feature vectors).

[0085] At step 306, the input dataset may be provided to a trained machine learning model. The trained machine learning model may be configured to determine crop health data of a crop in the area of land and / or soil content data of soil in the area of land from a portion of the input dataset corresponding to the satellite data. In an aspect, the trained machine learning model may be configured to predict crop yield based on a type of the crop (e.g., based on whether the crop is maize, paddy, a vegetable crop, a fruit crop, or some other type of crop). The soil content data determined by the trained machine learning model may include data for a moisture level of soil in the area of land, data for a nutrient content of the soil in the area of land, data for the pH of the soil in the area of land, data for the roughness of the soil in the area of land, data for the density of the soil in the area of land, or a combination thereof. According to some aspects, the soil content data may be determined without physically sampling the soil in the area of land. For example, soil content data may be determined entirely from the satellite data. The crop health data may include data determined using one or more satellite indices such as Normalized Difference Vegetation Index (NDVI), vegetation growth data, vegetation size data, vegetation color data, vegetation growth pattern data, or other metrics used for determining the health of a crop. NDVI is a commonly used index to measure the health and vigor of vegetation, although other methods are also known in the art.
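
For reference, NDVI is conventionally computed per pixel from near-infrared and red reflectance as (NIR - Red) / (NIR + Red). The sketch below applies that standard formula to small reflectance grids; the grid values are invented for illustration, and the use of `None` for undefined pixels is a choice made for this sketch. With Sentinel-2 data, the NIR input would typically be Band 8 and the red input Band 4.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    Returns None when both reflectances are zero, where the index is undefined.
    """
    denominator = nir + red
    if denominator == 0:
        return None
    return (nir - red) / denominator


def ndvi_map(nir_grid, red_grid):
    """Apply ndvi() element-wise to two equally shaped 2D reflectance grids."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_grid, red_grid)
    ]


# Illustrative 2x2 reflectance grids (values in the range 0 to 1).
nir_grid = [[0.5, 0.3], [0.2, 0.0]]
red_grid = [[0.1, 0.3], [0.4, 0.0]]
index_map = ndvi_map(nir_grid, red_grid)
```

Values near 1 suggest dense, healthy vegetation, values near 0 suggest bare soil, and negative values are typical of water or cloud, which is why NDVI is often combined with cloud masking during input preparation.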

[0086] At step 308, a predicted crop yield may be determined by the trained machine learning model based on the input dataset, the crop health data, the soil content data, and / or historical yield data. The predicted crop yield may provide two or more predicted yields for the crop at different dates. For example, the predicted crop yield may provide a different predicted yield at 60 days from sowing, at 90 days from sowing, and / or at harvest. In another example, the predicted crop yield may include a predicted crop yield for a date at least 10 days before an expected harvest.

[0087] According to aspects of the present disclosure, the trained machine learning model may include a first convolutional neural network (CNN) model configured to operate on a weather data portion of the input dataset, a second CNN model configured to operate on a satellite data portion of the input dataset, and a recurrent neural network (RNN) model configured to receive as inputs, the output of the first CNN model and the output of the second CNN model. In some such implementations, the RNN model may be configured to provide the predicted crop yield. In some implementations, the outputs of the first CNN model and the second CNN model may be combined before being input to the RNN model.

[0088] At step 310 the method may include initiating one or more recommended actions based on the predicted crop yield. For example, the one or more recommended actions may include modifying, updating, or retraining the trained machine learning model; validating the predicted crop yield against an actual crop yield; generating a signal or alert indicating an instruction to modify how nutrients are provided to the area of land (e.g., apply a fertilizer); determining a specific fertilizer composition to improve the crop health and / or soil health; displaying the predicted crop yield at a display device; generating a report of the predicted crop yield; adjusting an irrigation schedule or method of water delivery; planting seeds in the area of land; determining a harvest schedule; and / or initiating one or more other actions as would be recognized by those of ordinary skill in the art. Such actions may be determined and / or performed generally for the area of land, or may be provided for a portion of the area of land, or individually for multiple portions of the area of land. Individual portions of the area of land may have the same or different recommended actions. The size of portions of the area of land may include or correspond to the resolution of the satellite data.

[0089] FIG. 4 illustrates an example user interface for a system for predicting crop yield and estimating soil health in accordance with aspects of the present disclosure as a user interface (UI) 400. UI 400 may allow a user to provide inputs to or receive information from a system for predicting crop yield and estimating soil health. For example, UI 400 may be configured to be displayed to a user on a display device. UI 400 may include one or more selectable elements for providing information to the UI 400. Selectable elements may include buttons, radio buttons, check boxes, sliders, drop-down menus, elements for uploading or downloading files, and other such elements as would be recognized by one of ordinary skill in the art. For example, UI 400 may include selectable elements 410 for providing data to the system; selectable elements 420 for providing crop data; selectable elements 430 for providing data regarding nutrients applied to the area of land; selectable elements 440 for providing data related to ranges of weather conditions; and an output portion 450, which may display a predicted yield or a recommended action, or some other output. UI 400 may enable a user to input data and observe how the crop yield changes with respect to changes in input data. For example, by changing one or more parameters, a user may be able to identify sensitivities in the crop yield to the parameters.

[0090] Selectable elements 410 may include capability to select, upload, download, or transfer data files to the system. For example, selectable elements 410 may include element 410a for rainfall data, element 410b for temperature data, element 410c for windspeed data, element 410d for relative humidity data, element 410e for Normalized Difference Vegetation Index (NDVI) data, element 410f for land surface water index (LSWI) data, element 410g for soil moisture (SM) data, and / or element 410h for Radar Vegetation Index (RVI) data. A user may provide data files to one or more of the selectable elements 410, or the system may be configured to search for relevant data files (e.g., on a database or over a network). Previously used data files may be retained or made available for subsequent use. The data files provided to selectable elements 410 may include raw data files (e.g., as received from data sources), or they may be provided in a more accessible file format (e.g., including PNG, CSV, and JSON). Data files provided to selectable elements 410 may include real data (e.g., obtained by sensors on one or more satellites), may include simulated data generated for a crop or an area of land, or both.

[0091] Selectable elements 420 may allow for input of crop data to the system. For example, at element 422 data of a sowing date for a crop or an area of land may be input. Element 422 may allow for a single date input, a range input, or a time from harvest input. As another example, element 424 may allow for inputting data of whether a crop is irrigated, rainfed, or otherwise provided with water. Element 426 may allow for inputting data of a type of crop. The type of crop may be generic, such as whether the crop is a cereal, fruit, or vegetable crop. Or the type of crop may be increasingly specific, such as a particular species of crop (e.g., maize), or even a particular strain of the crop species.

[0092] Selectable elements 430 may allow for input of nutrient data to the system. For example, elements 432, 434, and 436 may allow for inputs of amounts of nitrogen, phosphorus, and potassium, respectively, applied to the area of land. The inputs to selectable elements 430 are shown in FIG. 4 as an amount in kilograms per hectare, but could also be represented as a relative percentage or proportion of nutrients included in a fertilizer composition. Other nutrients or soil chemistry besides those shown in selectable elements 430 could also be included (e.g., calcium, organic carbon, pH, and so on). Additionally, inputs for dates and / or times that the nutrients were applied or are to be applied may be included.

[0093] Selectable elements 440 may allow for providing data related to ranges of weather conditions. For example, element 442 may allow for inputting a range of rainfall, element 444 may allow for inputting a range of temperatures, element 446 may allow for inputting a range of wind speeds, and element 448 may allow for inputting a range of relative humidity. In some implementations, typical ranges for weather data for an area of land may be used as limits for the range of the selectable elements 440. According to other implementations, optimal ranges of weather conditions may be known for a given crop in a given area of land. As such, the ranges allow for improved forecasting of crop yield due to changes in weather data.

[0094] UI 400 may provide the data entered through the selectable elements to the trained machine learning model and may display the predicted yield determined by the trained machine learning model. For example, output portion 450 may display a predicted yield after data has been input into one or more of the selectable elements 410, 420, 430, or 440. Output portion 450 is illustrated as representing a yield in units of tons per hectare, but one of ordinary skill would recognize that output portion 450 could be configured to output estimated yield as a percentage of loss, or as graphical data, or in any other suitable manner. Output portion 450 may also be configured to display an indication of soil content data, such as soil chemistry or soil moisture. Output portion 450 may also be configured to display an indication of crop health data. Output portion 450 may also be configured to display a recommended action as described herein, or some other output.

[0095] UI 400 may be configured to initiate modification, training, and / or updating of the trained machine learning model based on the data input to UI 400. In other embodiments, UI 400 may be configured not to change the underlying trained machine learning model but may only provide inputs to and / or receive outputs from the trained machine learning model.

[0096] Although the embodiments of the present disclosure and their advantages have been described in detail, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the disclosure as defined by the appended claims. Moreover, the scope of the present application is not intended to be limited to the particular embodiments of the process, machine, manufacture, composition of matter, means, methods and steps described in the specification. As one of ordinary skill in the art will readily appreciate from the present disclosure, processes, machines, manufacture, compositions of matter, means, methods, or steps, presently existing or later to be developed that perform substantially the same function or achieve substantially the same result as the corresponding embodiments described herein may be utilized according to the present disclosure. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or steps.

[0097] Those of skill in the art will understand that one or more blocks (or operations) described with reference to Figures 1-4 may be combined with one or more blocks (or operations) described with reference to another of the figures. For example, one or more blocks (or operations) of Figure 3 may be combined with one or more blocks (or operations) of Figures 1-2. As another example, one or more blocks or selectable elements (or operations) associated with Figure 4 may be combined with one or more blocks (or operations) associated with Figures 1-3.

[0098] Those of skill in the art would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure. Skilled artisans will also readily recognize that the order or combination of components, methods, or interactions that are described herein are merely examples and that the components, methods, or interactions of the various aspects of the present disclosure may be combined or performed in ways other than those illustrated and described herein.

[0099] The various illustrative logics, logical blocks, modules, circuits and algorithm processes described in connection with the implementations disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. The interchangeability of hardware and software has been described generally, in terms of functionality, and illustrated in the various illustrative components, blocks, modules, circuits, and processes described above. Whether such functionality is implemented in hardware or software depends upon the particular application and design constraints imposed on the overall system.

[0100] The hardware and data processing apparatus used to implement the various illustrative logics, logical blocks, modules and circuits described in connection with the aspects disclosed herein may be implemented or performed with a general-purpose single- or multi-chip processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor or any conventional processor, controller, microcontroller, or state machine. In some implementations, a processor may be implemented as a combination of computing devices, such as a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. In some implementations, particular processes and methods may be performed by circuitry that is specific to a given function.

[0101] In one or more aspects, the functions described may be implemented in hardware, digital electronic circuitry, computer software, firmware, including the structures disclosed in this specification and their structural equivalents, or in any combination thereof. Implementations of the subject matter described in this specification also may be implemented as one or more computer programs, that is, one or more modules of computer program instructions, encoded on a computer storage medium for execution by, or to control the operation of, data processing apparatus.

[0102] If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. The processes of a method or algorithm disclosed herein may be implemented in a processor-executable software module which may reside on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that may be enabled to transfer a computer program from one place to another. Storage media may be any available media that may be accessed by a computer. By way of example, and not limitation, such computer-readable media may include random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that may be used to store desired program code in the form of instructions or data structures and that may be accessed by a computer. Also, any connection may be properly termed a computer-readable medium. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media. Additionally, the operations of a method or algorithm may reside as one or any combination or set of codes and instructions on a machine-readable medium and computer-readable medium, which may be incorporated into a computer program product.

[0103] Various modifications to the implementations described in this disclosure may be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other implementations without departing from the spirit or scope of this disclosure. Thus, the claims are not intended to be limited to the implementations shown herein but are to be accorded the widest scope consistent with this disclosure, the principles and the novel features disclosed herein.

EXAMPLES

[0104] The present invention will be described in greater detail by way of specific examples. The following examples are offered for illustrative purposes only, and are not intended to limit the invention in any manner. Those of skill in the art will readily recognize a variety of noncritical parameters which can be changed or modified to yield essentially the same results.

Example 1
Data Preparation and Modelling

[0105] Data was obtained, analyzed, and prepared for input into a machine learning model in accordance with embodiments of the present disclosure. For example, an experiment involved gathering data corresponding to 3 areas of land. Two of the areas of land were plots of maize, and the third was a plot of a paddy crop. The first maize plot was 0.98 acres, the second maize plot was 0.79 acres, and the paddy plot was 1.97 acres. The experimental process included leveraging satellite data in conjunction with weather data (e.g., rainfall, wind speed, relative humidity, temperature data) to estimate yields at the farm level. An initial phase included comprehensive pre-processing of satellite data, incorporating atmospheric correction, cloud removal, and radiometric correction. Concurrently, weather data was cleaned to ensure accuracy.

[0106] Weather data and satellite data (e.g., satellite indices such as NDVI and SM) were gathered and analyzed for the three areas of land over a period of time (e.g., a growing season). This data was used to assess the land's capacity and to create a rating in relation to the anticipated yield of each crop.

[0107] The weather data included rainfall data, temperature data, windspeed data, and relative humidity data. To illustrate, FIGs. 5a, 5b, and 5c show rainfall data for a first maize plot, a second maize plot, and a paddy plot, respectively, over a growing season. It can be seen that all three farms had a similar rainfall trend, and heavier rainfall was observed in November. FIGs. 6a, 6b, and 6c show temperature data for the first maize plot, the second maize plot, and the paddy plot, respectively, over a growing season. These data show the minimum, maximum, and mean temperature for each day of the period. As expected, there was very slight variation in temperature over the period. FIGs. 7a, 7b, and 7c show relative humidity data for the first maize plot, the second maize plot, and the paddy plot, respectively, over a growing season. Relative humidity is one of the major factors that determine whether conditions are favorable for pests and / or disease, which may impact the crop and in turn affect the yield. FIGs. 8a, 8b, and 8c show wind speed data for the first maize plot, the second maize plot, and the paddy plot, respectively, over a growing season. It was observed that the wind speed varied from 5 km/h to 25 km/h.

[0108] Satellite data was also observed and analyzed. Time series satellite data for monitoring crop health and soil moisture was acquired from the date of sowing. The crop health was monitored using optical satellite data such as Sentinel 2 (e.g., using NDVI). FIGs. 9a, 9b, and 9c show crop health data for the first maize plot, the second maize plot, and the paddy plot, respectively, over a growing season. The soil moisture was estimated from active microwave satellite data such as Sentinel 1. FIGs. 10a, 10b, and 10c show soil moisture data for the first maize plot, the second maize plot, and the paddy plot, respectively, over a growing season. The satellite data was preprocessed to remove issues due to the atmosphere, speckle (in the case of microwave data), and terrain distortions. For example, the crop health data was filtered using a Savitzky-Golay filter and the microwave-based soil moisture data was filtered using an exponential filter. Although some noise and outliers were present in the satellite data due to ground conditions, they were removed using the filters.
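The two smoothing steps named above can be sketched as follows. This is a minimal illustration under stated assumptions, since the disclosure does not give window lengths, polynomial orders, or filter parameters. The Savitzky-Golay step uses the standard 5-point quadratic/cubic smoothing coefficients (-3, 12, 17, 12, -3) / 35 and leaves the two samples at each end unchanged (an assumed edge treatment). The exponential step is shown as simple first-order exponential smoothing with an assumed weight alpha, which may differ from the exponential filter actually used. The sample values are invented.

```python
from typing import List, Sequence

# Standard 5-point Savitzky-Golay smoothing coefficients (quadratic/cubic fit).
SG5_COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG5_NORM = 35.0


def savgol5(series: Sequence[float]) -> List[float]:
    """Smooth a series with a 5-point Savitzky-Golay window.

    Assumption: the first and last two samples are passed through unchanged.
    """
    out = list(series)
    for i in range(2, len(series) - 2):
        window = series[i - 2 : i + 3]
        out[i] = sum(c * v for c, v in zip(SG5_COEFFS, window)) / SG5_NORM
    return out


def exponential_filter(series: Sequence[float], alpha: float) -> List[float]:
    """First-order exponential smoothing: s[t] = alpha*x[t] + (1-alpha)*s[t-1].

    Assumption: alpha (0 < alpha <= 1) is a free tuning parameter.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        return []
    out = [series[0]]
    for value in series[1:]:
        out.append(alpha * value + (1.0 - alpha) * out[-1])
    return out


# Invented NDVI-like series with one outlier, and an invented soil moisture series.
ndvi_series = [0.2, 0.2, 0.2, 0.9, 0.2, 0.2, 0.2]
soil_moisture_series = [0.30, 0.32, 0.10, 0.31, 0.33]

ndvi_smoothed = savgol5(ndvi_series)
soil_moisture_smoothed = exponential_filter(soil_moisture_series, alpha=0.5)
```

A Savitzky-Golay window fits a low-order polynomial locally, so it tends to preserve the shape of a growth curve better than a plain moving average, while the exponential filter damps abrupt jumps by weighting each new observation against the running estimate.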

[0109] Satellite data was used to determine soil conditions in resolutions smaller than the boundaries of the respective areas of land. For example, FIG. 11 illustrates soil moisture data over time for portions of the first maize plot 1110 in graph 1100. Portions of the first maize plot 1110 which had soil moisture content in the yellow to red range corresponded to areas of land with high moisture stress. For example, portion 1120 had a soil moisture in the “very dry” range on May 15, 2023. Other portions of the first maize plot 1110 had high soil moisture at other times. For example, portion 1122 had soil moisture in the “excessive” range on June 8, 2023. For another example of satellite data collected, FIG. 12 illustrates soil chemistry data for portions of the first maize plot 1110 in graph 1200, including data of nitrogen content, phosphorus content, potassium content, organic carbon content, and pH of the soil. These data points were gathered solely by satellite data, without physical sampling of the land. An advantage of this kind of granular data gathering is that it allows for precise application of water and nutrients to the portions of the area of land as needed. A further advantage is that time-intensive and error-prone physical sampling can be reduced or eliminated by remotely sampling the area(s) of land.

[0110] Satellite data including crop health data was also gathered and analyzed. Crop health was determined using the Normalized Difference Vegetation Index (NDVI) for each area of land. FIG. 13 shows spatial maps of the crop health (e.g., NDVI) at different crop growth stages (e.g., at different times) for maize plot 1. FIG. 14 shows spatial maps of the crop health at different crop growth stages for maize plot 2. FIG. 15 shows spatial maps of the crop health at different crop growth stages for the paddy plot. For each of FIGs. 13 to 15, the data progresses in time as read from right to left, then top to bottom. For example, spatial map 1302 in FIG. 13 precedes spatial map 1304 in time, which in turn precedes spatial map 1306 in time. FIGs. 14 and 15 are read similarly. It was observed from the time series data for the three areas of land that crop health increased gradually from the date of sowing and almost reached a saturation point. This indicates crops approaching maturity.
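For reference, NDVI is conventionally computed per pixel as (NIR - Red) / (NIR + Red), where NIR and Red are the near-infrared and red reflectances. The sketch below applies that standard definition to a small grid. The reflectance values are invented, the zero-denominator handling is an assumption, and the disclosure does not state which sensor bands were used (for Sentinel-2, bands B8 and B4 are the customary choice).

```python
from typing import List

Grid = List[List[float]]


def ndvi(nir: float, red: float) -> float:
    """Standard NDVI for one pixel: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0.0:
        return 0.0  # assumption: treat a pixel with no signal as NDVI 0
    return (nir - red) / total


def ndvi_map(nir_band: Grid, red_band: Grid) -> Grid:
    """Compute NDVI pixel by pixel for two equally sized reflectance grids."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Invented 2 x 2 reflectance grids (values between 0 and 1).
nir_band = [[0.45, 0.50], [0.30, 0.12]]
red_band = [[0.05, 0.10], [0.10, 0.10]]

plot_ndvi = ndvi_map(nir_band, red_band)
for row in plot_ndvi:
    print(" ".join(f"{value:5.2f}" for value in row))
```

Dense, healthy vegetation reflects strongly in the near-infrared and absorbs red light, so NDVI rises toward 1 as the canopy develops, which is consistent with the gradual increase and saturation described above.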

[0111] In FIGs. 13 to 15, the spatial representations show the variation of NDVI values within the farms over time. Overall, all farms had very low NDVI at the sowing stage and the NDVI increased gradually. A similar pattern was also observed in the time series data. It can be seen in the maize plots of FIGs. 13 and 14 that the NDVI value at the edge of the farm was lower than in the other parts. This may be because the pixels at the edge of the farm do not fall completely inside the farm, so the data for those pixels may have been impacted by a nearby plot. In the paddy farm shown in FIG. 15, the crop health was not uniform throughout the farm, as can be seen in the north-western part of the spatial maps, where the health of the crop was very low compared to the other regions. In the middle and eastern part of the paddy farm, the crop health was generally better compared to other regions of the farm.

Example 2
Yield Prediction

[0112] Yield estimation was performed by a machine learning model according to the systems and processes described above. Farms 1 and 2 were maize plots, whereas farm 3 was a paddy plot. The crop yield was predicted by a trained machine learning model in accordance with aspects of the present disclosure. The machine learning model was trained to operate on time series satellite data, including crop health data and soil content data, and weather data, including rainfall, wind speed, relative humidity, and temperature. Crop yield was predicted for each of the three farms identified above at three different time periods based on the stage of the crop: 60 days after sowing (DAS), 90 DAS, and at the time of harvest. Predicting crop yield at various stages of crop growth can help in understanding the impact of various factors at the different stages of the crop.

[0113] The crop yield values were estimated accordingly using the satellite data and weather dataset, including both historical data and forecasted data (e.g., weather forecasts). Input data included rainfall, wind speed, relative humidity, and temperature along with the satellite-based crop health and soil moisture indices as discussed above with respect to Example 1. The estimated crop yield values were multiplied by the area of the plot to obtain production amounts. These estimated results are shown in Table 3 below. The estimated / predicted yield values from the model were provided prior to harvest, and the actual yield output (e.g., the ground value or real data) from each field was compared to the predictions generated by the yield model. Two scenarios were analyzed: one considering the final yield available for sale and the other accounting for potential losses due to machinery or during the threshing process. These comparisons are shown in the last two columns of Table 3.

Table 3: Model-Generated Yield Predictions / Outcomes
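The arithmetic behind a production amount and a prediction accuracy can be sketched as follows. This is a minimal illustration under stated assumptions: the disclosure does not state the units of the estimated yield values or how accuracy was calculated, so the sketch assumes a yield in tons per hectare (as in output portion 450), converts the plot areas from acres, and defines accuracy as 100 × (1 − |predicted − actual| / actual). The yield figures used are invented and are not taken from Table 3.

```python
HECTARES_PER_ACRE = 0.40468564224  # exact for the international acre


def production_tons(yield_t_per_ha: float, area_acres: float) -> float:
    """Production = yield per unit area x plot area (area converted to hectares)."""
    return yield_t_per_ha * area_acres * HECTARES_PER_ACRE


def accuracy_percent(predicted: float, actual: float) -> float:
    """Assumed accuracy definition: 100 * (1 - relative absolute error).

    Floored at 0 so a prediction that is off by more than 100% does not
    produce a negative accuracy.
    """
    if actual <= 0.0:
        raise ValueError("actual value must be positive")
    return max(0.0, 100.0 * (1.0 - abs(predicted - actual) / actual))


# Invented example: a 0.98 acre plot with a predicted yield of 5.0 t/ha
# and a measured yield of 5.2 t/ha.
predicted_production = production_tons(5.0, 0.98)
actual_production = production_tons(5.2, 0.98)
example_accuracy = accuracy_percent(predicted_production, actual_production)

print(f"predicted production: {predicted_production:.3f} t")
print(f"actual production:    {actual_production:.3f} t")
print(f"accuracy:             {example_accuracy:.2f} %")
```

Because both production figures are scaled by the same plot area, the accuracy computed on production equals the accuracy computed on the per-hectare yields under this definition.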

[0114] Accuracy assessments were conducted for both scenarios as shown in Table 4. Thus it is shown that the machine learning model was able to achieve better than 75% accuracy, and in some cases up to 99.47% accuracy, in predicting crop yield for the areas of land observed.

Table 4. Accuracy of Predicted Yield Compared with Ground Value

Example 3
Soil Nutrient Analysis

[0115] The soil nutrient data obtained from the satellite data was compared to laboratory data (e.g., physical sample data) of the individual experiment farms (maize plot 1, maize plot 2, and the paddy plot). High accuracy was achieved for nitrogen, phosphorus, potassium, organic carbon, and pH. These high accuracies were observed before sowing and after harvest of the crops. Notably, the prediction accuracy was lower for some nutrients when the analysis occurred 60 days after sowing, coinciding with the presence of a standing crop in the field.

[0116] Table 5 shows the accuracy of soil content data estimated from satellite data when compared to a ground truth value obtained by physically sampling and testing the soil of the first maize plot. The analysis was performed at different crop growth stages of the maize crop. Table 5 shows that high accuracy was recorded for nitrogen, phosphorus, potassium, organic carbon, and pH (89.4%, 92.3%, 87.3%, 90.0%, and 97.5%, respectively) before sowing. Similarly high accuracy for nitrogen, phosphorus, potassium, organic carbon, and pH (93.3%, 71.9%, 90.4%, 97.8%, and 96.6%, respectively) was recorded after harvest of the crop. Prediction accuracy was lower for some nutrients when the analysis occurred 60 days after sowing, coinciding with the presence of a standing crop in the field. An example spatial representation of soil nutrient distribution across the first maize plot is shown in FIG. 16a.

Table 5. Accuracy of soil nutrients from satellite data vs. physical sampling for maize plot 1

[0117] Table 6 shows the accuracy of soil content data estimated from satellite data when compared to a ground truth value obtained by physically sampling and testing the soil of the second maize plot. The analysis was performed at different crop growth stages of the maize crop. Results shown in Table 6 indicate that high accuracy was recorded for nitrogen, phosphorus, potassium, organic carbon, and pH (98.6%, 82.3%, 90.9%, 85.0%, and 89.0%, respectively) before sowing. Similarly high accuracy for nitrogen, phosphorus, potassium, organic carbon, and pH (93.7%, 91.3%, 95.4%, 89.2%, and 97.2%, respectively) was recorded after harvest of the maize crop. Prediction accuracy was lower for some nutrients, such as nitrogen, phosphorus, and potassium, when the analysis occurred 60 days after sowing, coinciding with the presence of a standing crop in the field. An example spatial representation of soil nutrient distribution across the second maize plot is shown in FIG. 16b.

Table 6. Accuracy of soil nutrients from satellite data vs. physical sampling for maize plot 2

[0118] Table 7 shows the accuracy of soil content data estimated from satellite data when compared to a ground truth value obtained by physically sampling and testing the soil of the paddy plot. The analysis was performed at different crop growth stages of the paddy crop. Results showed that high accuracy for nitrogen, phosphorus, potassium, organic carbon, and pH (95.2%, 94.8%, 92.1%, 92.7%, and 86.7%, respectively) was achieved before sowing. Similarly high accuracy for nitrogen, phosphorus, potassium, organic carbon, and pH (80.9%, 85.7%, 89.3%, 90.9%, and 99.1%, respectively) was recorded after harvest of the paddy crop. Prediction accuracy was lower for some nutrients, such as nitrogen, phosphorus, and potassium, when the analysis occurred 60 days after sowing, coinciding with the presence of a standing crop. An example spatial representation of soil nutrient distribution across the paddy plot is shown in FIG. 16c.

Table 7. Accuracy of soil nutrients from satellite data vs. physical sampling for paddy crop

[0119] All of the methods, systems, and computer-readable media disclosed and claimed herein can be performed and executed without undue experimentation in light of the present disclosure. While the systems and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. All such modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.

Claims

1. A method for predicting crop yield and estimating soil health, the method comprising:
receiving, by one or more processors, weather data and satellite data corresponding to an area of land;
generating, by the one or more processors, an input dataset from the weather data and the satellite data;
providing, by the one or more processors, the input dataset to a trained machine learning model configured to determine from a portion of the input dataset corresponding to the satellite data one or both of: crop health data of a crop in the area of land, and soil content data of soil in the area of land;
determining, by the one or more processors via the trained machine learning model, a predicted crop yield of the area of land based on the input dataset, the crop health data, the soil content data, and / or historical yield data; and
initiating, by the one or more processors, one or more recommended actions based on the predicted crop yield.

2. The method of claim 1, wherein the trained machine learning model is configured to predict crop yield based on a type of the crop.

3. The method of any one of claims 1 to 2 further comprising determining the soil content data without physically sampling the soil in the area of land.

4. The method of any one of claims 1 to 3, wherein the trained machine learning model comprises:
a first convolutional neural network (CNN) model configured to operate on a weather data portion of the input dataset;
a second CNN model configured to operate on a satellite data portion of the input dataset; and
a recurrent neural network (RNN) model configured to receive as inputs, the output of the first CNN model and the output of the second CNN model, wherein the RNN model is configured to provide the predicted crop yield.

5. The method of claim 4, wherein the RNN model is configured to capture temporal dependencies in one or both of the weather data portion and the satellite data portion.

6. The method of any one of claims 1 to 5, wherein generating the input dataset comprises filtering the satellite data to remove atmospheric interference from the satellite data.

7. The method of any one of claims 1 to 6, wherein generating the input dataset comprises applying cloud masking to filter out a portion of the satellite data corresponding to images of the area of land when obscured by clouds.

8. The method of any one of claims 1 to 7, wherein the soil content data comprises data for a moisture level of soil in the area of land, data for a nutrient content of the soil in the area of land, data for the pH of the soil in the area of land, data for the roughness of the soil in the area of land, data for the density of the soil in the area of land, or a combination thereof.

9. The method of any one of claims 1 to 8, wherein the satellite data comprises spectral data obtained in the microwave band.

10. The method of any one of claims 1 to 9, wherein the satellite data comprises spectral data obtained in the centimeter wavelength.

11. The method of claim 10, wherein the satellite data comprises spectral data obtained in the 5.2 centimeter to 6 centimeter wavelength.

12. The method of claim 11, wherein the satellite data comprises spectral data obtained in the 5.4 centimeter to 5.8 centimeter wavelength.

13. The method of any one of claims 1 to 12, wherein the area of land is larger than 1 acre.

14. The method of any one of claims 1 to 13, wherein the predicted crop yield provides two or more predicted yields for the crop at different dates.

15. The method of any one of claims 1 to 14, wherein the predicted crop yield comprises a predicted crop yield for a date at least 10 days before an expected harvest.