Machine Learning and Other Common Concepts

Text Transcript with Description of Visuals

Audio / Video
Dr. Alex Kirkpatrick: Hello, I’m Dr. Alex Kirkpatrick from the Center for Sustaining Agriculture and Natural Resources at Washington State University. In this video, we’ll look at some commonly used forms of artificial intelligence and try to decode some of the jargon that you might be hearing in relation to AI and agriculture. Let’s explore. Text on screen: Alex Kirkpatrick, PhD. Alex speaks in a studio and addresses the camera.
[Music] A tractor moves across a field. A combine harvester moves through a crop as a woman stands in a field typing on a tablet. A sprinkling system moves slowly across a greenhouse. A woman picks up a plant by the stem and holds it in front of a tablet. Digital, hexagon-shaped grids appear on a field as a tractor harvests crops.
Dr. Alex Kirkpatrick: When you think of AI, what do you think of? For many people, it might be ChatGPT. GPT and other generative AI models have ballooned in popularity and usage in recent years due to their ease of use and ability to seemingly create new, original content such as text, images, and other media on demand, hence generative AI. But how do these machines work? How do they simulate human creativity, and how much should we trust the information they provide? The logos for Western SARE (Sustainable Agriculture Research and Education), Washington State University, National Institute of Food and Agriculture, and Ag AID Institute appear on screen. Written underneath the logos is “Machine learning and other common concepts.” Alex speaks in the studio.
As of 2025, ChatGPT is doubtless the ambassador for all AI. GPT and other brands of chatbots are called large language models, or LLMs. GPT was trained on pretty much all of the human-generated text on the internet. When it gets an update, it’s trained on all the additional open-source texts since that last update. Publishers like Wiley and Sage might even sell access to private files stored on the internet. It’s generative: it creates new text by analyzing patterns in its vast dataset to predict what a response should be, what it should look like, what word goes where, what comes after it, and so on. It’s reactive to user inputs or prompts, but it has limited memory too. It retains the conversation you’re having with it while you’re having it, and it can access archived conversations to personalize its interactions with you. But once you close that chat and move to a different GPT or delete the archived conversations, it has no memory of what you’ve said before. You aren’t giving ChatGPT more to learn from in an immediate sense when you chat with it; that conversation would have to be incorporated into its next training update for it to have any impact on the model more broadly. It’s also an example of deep learning, which is a concept to be aware of independently of LLMs because it may have useful agricultural applications. So what is machine learning, basically? Text on screen: “Chatbots, machine learning and deep learning.” Written at the top of the next slide is “LLMs = Large Language Models.” There is a photo of a phone in a person’s hand, which displays a messaging service. Behind is a robot wearing headphones with a speech bubble above it. In the corner of the screen is a green cube on top of a dark green square. There are leaves growing out of the square and cube.

[Music in the background].

Text on screen: The first bullet point reads, “e.g. ChatGPT, Claude, et cetera.” The second bullet point reads, “Pre-trained on vast textual datasets.” The third bullet point reads, “Learns from patterns in text to generate the next text. Predictive.” The fourth bullet point reads, “Limited memory AI.” The fifth bullet point reads, “Machine Learning / Deep Learning.”
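The “limited memory” idea above can be sketched in a few lines of code. This is a minimal illustration, not something from the video: the generate() function is a hypothetical stand-in for a real LLM API call, and the point is that the chat’s only “memory” is the transcript that gets resent with every turn.

```python
# A minimal sketch of "limited memory" chat, assuming a hypothetical
# generate() function standing in for a real LLM API call. The model
# itself never changes; its only "memory" is the transcript we resend.

def generate(conversation):
    """Placeholder for a real LLM call; returns a canned reply here."""
    return f"(reply based on {len(conversation)} prior messages)"

conversation = []                      # the chat's only memory
for user_msg in ["Hello", "What did I just say?"]:
    conversation.append({"role": "user", "content": user_msg})
    reply = generate(conversation)     # the whole history is resent each turn
    conversation.append({"role": "assistant", "content": reply})
    print(user_msg, "->", reply)

conversation.clear()                   # closing the chat: all context is gone
```

Clearing the list is the equivalent of closing the chat: nothing about the underlying model has changed, which is why your conversation only affects the model if it is folded into a later training update.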
Very simply, machine learning is a sub-discipline of artificial intelligence: machines that predict, make decisions, and navigate their environment independently. Machine learning is focused on building algorithms that can learn patterns from datasets.
Consider this very simple pattern or sequence: a human doesn’t have to program a machine learning algorithm to put a duck next in this sequence. A human just has to program the machine to learn patterns from the dataset and make a prediction, perhaps even generate the probable next icon. It doesn’t need constant reprogramming or human intervention to approach every unique task such as this. Of course, the pattern might be a lot more complex in reality, and something other than a duck might actually be needed next. But machine learning is probabilistic, so you’re probably getting a duck next based on the available information. Machine learning algorithms can get things wrong, though.
The title of the next slide is “Artificial Intelligence.” An arrow points from “Artificial Intelligence to Machine Learning, ML.” A pattern appears underneath which shows the following: duck, trees, machine, duck, trees, machine, duck, trees, machine, question mark.
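To make the duck example concrete, here is a toy sketch (my illustration, not the video’s) of a program that learns the sequence’s transition pattern from the data itself rather than being explicitly programmed with the answer:

```python
from collections import Counter

# A toy sketch of pattern learning: count which icon follows each icon
# in the observed sequence, then predict the most probable next one.
sequence = ["duck", "trees", "machine"] * 3   # duck, trees, machine, repeated

transitions = {}
for current, nxt in zip(sequence, sequence[1:]):
    transitions.setdefault(current, Counter())[nxt] += 1

last = sequence[-1]                            # "machine"
prediction = transitions[last].most_common(1)[0][0]
print(prediction)                              # "duck" -- probable, not certain
```

Because the prediction is just the most frequent observed transition, it is probabilistic: with a messier dataset, the same code could confidently return something other than a duck.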
Now, deep learning is a subset of machine learning that relies on large neural networks. Very basically, the machine is built around artificial neurons, similar to how a biological brain works. For example, one layer or neuron of an LLM recognizes individual letters. Another might recognize common words. Yet another might recognize whole sentences. As inputs move through different layers or neurons, the networked algorithm gets a better sense of patterns and relationships in the text. An arrow points from the pattern sequence to “Deep Learning.” A green head with white lines inside the skull and a blue colored brain appear on either side of the words “Deep Learning.” Written underneath is “Neural Networks.”
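The layering can be sketched numerically. In this toy example (mine, not the video’s), each layer transforms the previous layer’s output, which is essentially what “deep” means; the random weights are stand-ins for values a real network would learn during training:

```python
import numpy as np

rng = np.random.default_rng(0)

# A minimal sketch of a deep network's layered structure: each layer
# transforms its input and passes the result on, so later layers can
# respond to higher-level patterns than earlier ones. The weights here
# are random stand-ins; a real network learns them during training.
def layer(x, n_out):
    w = rng.normal(size=(x.shape[0], n_out))
    return np.maximum(0, x @ w)        # ReLU: keep only positive signals

x = rng.normal(size=4)                 # raw input features
h1 = layer(x, 8)                       # e.g., the "letters" level
h2 = layer(h1, 8)                      # e.g., the "words" level
out = layer(h2, 3)                     # e.g., the "sentences" level
print(out.shape)                       # (3,)
```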

A video showing digital sequences, codes, and technological programs appears on screen.
This is what makes ChatGPT able to generate human-like linguistic responses. Of course, human-like means unpredictable. And while a human can likely explain how they came to a decision to write what they wrote in the way they wrote it, an LLM can’t. Its decisions lie within an inaccessible black box of dizzying connections and layers of artificial neurons. A woman stands in a room with piles of boxes and a railing of clothes. There is a laptop on the desk. A photo icon of a man appears on screen with a speech bubble next to him. It reads “Hello.” A photo of a robot appears on screen with a speech bubble next to it. Written in the speech bubble is, “Hello. Can I help you?” A photo of a man appears on screen with a speech bubble. Written inside the speech bubble is, “I am looking for something.” A picture of a robot appears on screen with a speech bubble next to it. Written in the speech bubble is, “What product are you searching for?” The man replies, “I am looking for a shirt.” The robot replies, “Our new arrival white shirt patterns are very popular at the moment! Click link or show me more.”

The screen transitions to various combinations of numbers that light up in blue and transform into rectangles as they rotate.
So, LLMs are probabilistic models that learn from datasets to produce best-fit results based on patterns they found within that dataset. This means they can sometimes be wrong, and somewhat mysterious. They doubtless present users with many opportunities, but there are real risks that should be acknowledged. LLMs afford users quick, intuitive responses to prompts and are useful for bouncing ideas off of, brainstorming, or getting a rapid response to a simple query with a pretty widely known answer. But because they’re probabilistic in nature, they aren’t really thinking about the answer at all. They’re recognizing patterns in your prompt, patterns in their training data, and offering a response that looks superficially like the kind of response that probably answers that kind of prompt. Most LLMs are commercial products, so they aim to please and generate responses they think you’ll like. All this means LLMs have a tendency to make things up, or hallucinate, as it’s known.

On the plus side, LLMs are trained on vast datasets, such as the whole internet. That’s a static memory that a human would never be able to maintain, potentially putting a universe of information at your fingertips. But because of how neural networks work, there’s no accountability, no way of tracing a route from prompt to generated response. You’ll never know what route the AI took to give you the answer it did. A limited short-term memory means you can chat intuitively back and forth, and, like a human communication partner, it will remember things from earlier in the conversation. But you may have noticed that longer conversations have a tendency to drift. Also, your conversation isn’t changing the AI in reality or being absorbed into the training data as you speak. The model needs retraining and an update for that.

LLMs tend to perform well at putting things in order and recognizing patterns in text, giving summaries, making lists, etc. This can all enhance your productivity. But they are machines, don’t forget. Any insight they might model is simply a result of probabilistic ordering of words or images; there’s no real abstract insight underlying their output. But this pattern recognition and large training set also make LLMs excellent translators between languages and dialects, which can be really useful.

ChatGPT and others tend to be very accessible and easy to use for a broad range of people. On the downside, any professor or teacher will tell you that many people rely too heavily on LLMs to write for them or find answers to questions. This can lead to a lot of errors if the humans in the loop aren’t verifying responses. An over-reliance can potentially lead to skill fade among users. And the more we use AI to generate media, the more we saturate training sets with AI content. Studies are finding that AI performs poorly when fed its own work to learn from, drifting further from optimal and appropriate responses. Alex speaks in the studio. The presentation slide is divided into two columns. The first one is “Opportunities” and the second column is “Limitations and risks.” Written underneath the first column is “Rapid answers and generation.” In the corner of the screen is a white box with plants, apples and pears inside.
Written underneath the second column is “Hallucinations and inaccuracies.” Written under the first column is “Vast neural network and training data.” Written under the second column is “Black box lack of transparency.” Written under the first column is “Short-term memory.” Written underneath the second column is “No long-term memory or consistency.” Written under the first column is “Pattern recognition expedites some tasks, e.g. making lists, summarizing a document, et cetera.” Written under the second column is “Lacks genuine understanding or abstract insight.” Written under the first column is “Good translation skills.” Written under the first column is “Easy to use and accessible.” Written under the second column is “Human over-reliance.” Written under the second column is “Performs poorly when fed AI-generated content.”
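The hallucination point follows directly from sampling. Here is a toy sketch (my own illustration; the candidate words and their probabilities are invented): the model holds only a probability for each possible continuation, so a fluent but false answer is always one unlucky draw away.

```python
import numpy as np

rng = np.random.default_rng(42)

# A toy sketch of probabilistic generation: the "model" only has a
# probability for each candidate next word; it samples one rather than
# reasoning about truth. The numbers below are invented for illustration.
candidates = ["Paris", "London", "Rome", "Atlantis"]
probs = np.array([0.70, 0.15, 0.10, 0.05])   # pattern-based, not fact-checked

next_word = rng.choice(candidates, p=probs)
print(next_word)   # usually "Paris", but occasionally a confident wrong answer
```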
Let’s move on from LLMs specifically and look at a different type of deep learning machine that might be applied to agriculture. Think of a tool, for example, that could help a farmer know the optimal time to irrigate a crop and to what extent. You’d want a network of algorithms or artificial neurons that could each focus on specific layers of information, such as existing environmental conditions monitored through on-site temperature gauges, humidity gauges, soil moisture monitors, and all the rest of it. You’d want a neuron to monitor broader regional weather patterns from sources like weather stations, perhaps even commercial broadcasters. You’d want a layer or neuron that considers current crop health and another that explores historical crop health and yield over time under different conditions. You’d want to feed that deep learning algorithm a massive dataset of historical data and patterns on all of these important aspects. During training, it learns things like how certain drought conditions affect different soil types under different pre-existing conditions, how different weather patterns could affect or have affected crop health and yield, and so on. By integrating that learning with observations of live and very recent conditions through networked sensors, the machine can generate suggestions in real time, such as “irrigate the field now using this amount of water.” You can see why AI proponents vaunt its potential to boost sustainability and precision. Land managers are likely to want to make the ultimate decision, however, and this takes timely and appropriate human-computer interaction. But feasibly, one day, you might build in a few more layers and connections to external robots and hardware that control the irrigation system. Perhaps you afford the algorithm the ability to navigate the real-world environment as well as the virtual, to actually open the valves and shut them, to decide which type of irrigation is most appropriate and which should be operationalized in any given scenario. A weak AI with limited memory that is reactive to media inputs. Full autonomy, perhaps. Alex speaks in the studio. Text appears next to Alex and reads, “Deep learning may help irrigation decision making.” Written next to the first bullet point is, “Neural network. Weather monitoring, soil moisture, crop health, historical yield, water supply and demand, live weather prediction.” Written next to the second bullet point is “Massive training dataset to learn from.” Written next to the third bullet point is “Generates predictions and decisions.” Written next to the fourth bullet point is “Communicates succinctly to decision makers.”
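As a rough sketch of the train-then-recommend loop just described (my illustration: the feature names, numbers, and model choice are all invented placeholders, and a real system would use far richer data and a deep network rather than this small off-the-shelf model):

```python
from sklearn.ensemble import RandomForestRegressor

# A minimal sketch of learning an irrigation recommendation from
# historical records. Features and values are invented placeholders:
# [soil moisture %, air temp C, humidity %, forecast rain mm].
X_history = [
    [12, 34, 20, 0],
    [28, 22, 60, 5],
    [18, 30, 35, 0],
    [35, 19, 70, 12],
]
y_irrigation_mm = [25, 5, 15, 0]       # how much water was actually needed

model = RandomForestRegressor(random_state=0).fit(X_history, y_irrigation_mm)

live_reading = [[15, 32, 25, 0]]       # current sensor-network snapshot
print(model.predict(live_reading))     # suggested irrigation depth in mm
```

The design point is the same at any scale: the algorithm learns the mapping from conditions to water need from historical records, then applies it to live sensor readings to suggest an action in real time.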
[Music] Text on screen, “Digital twins.”
Dr. Alex Kirkpatrick: Another related and promising use of machine learning in agriculture comes in the form of digital twins. A digital twin is an accurate virtual representation of a physical object in a digital environment. They can be used to remotely monitor the physical object’s performance around the clock, collect and analyze data, simulate real-world scenarios and their outcomes, make data-powered decisions, and perhaps provide corrective recommendations if necessary. Very basically, here’s how it might work: machine learning acquires and analyzes sensor data from a farm, historical data, etc., and creates a digital counterpart, or, in theory, an accurate representation of the farm that exists only in the virtual realm. Imagine being able to create a complete digital model of an entire agricultural ecosystem, including fields, facilities, crops, animals, workforce, and machinery. You can then play around with this computerized version of your physical system, subjecting it to stressors and modeling different scenarios to see how it might respond, essentially diverging the digital version from the physical reality. You could simulate different scenarios on your farm, like different climate impacts, different crops, different land uses, and essentially glimpse into the future without having to gamble and experiment with the physical farm itself. In that sense, the twin might then determine the management decisions made to influence the real physical system. Then around and around again: as the physical system changes, so does the digital twin, and as the digital twin changes under simulated conditions, so might the physical realm in turn. Alex speaks in the studio. The title of the presentation slide is “Digital twins in agriculture.” Written underneath is, “A virtual representation of a physical ag environment.” Written inside a green circle is “Physical system, e.g. farm or landscape.” Written inside a blue circle is “Digital twin.” Written inside a gray circle is “Test scenario, forecasting, modeling.” An arrow points from the physical system circle to the digital twin circle. Written under the arrow is “data acquisition” and written above the digital twin circle is “representative.” An arrow points from the digital twin circle to the test scenario circle. Written in between the two circles is “simulation.” An arrow appears underneath the test scenario circle and points to the digital twin circle. Written underneath the digital twin circle is “deterministic.” An arrow points from the digital twin circle to the physical system circle. Written between the two circles is “decision.”
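The sync-simulate-decide loop on the slide can be sketched in miniature (my illustration: the state variables and the crude moisture and growth rules are invented placeholders for what a real, data-driven twin would learn):

```python
from dataclasses import dataclass, replace

# A minimal sketch of the twin loop: sync state from the physical farm,
# then diverge a copy under a simulated stressor without touching the
# real system. The state variables and update rules are toy placeholders.
@dataclass(frozen=True)
class FieldTwin:
    soil_moisture: float      # %
    crop_biomass: float       # t/ha

    def step(self, rain_mm: float):
        moisture = min(100.0, self.soil_moisture + rain_mm - 2.0)  # daily drying
        growth = 0.05 if moisture > 20 else -0.02                  # drought stress
        return replace(self, soil_moisture=moisture,
                       crop_biomass=self.crop_biomass + growth)

twin = FieldTwin(soil_moisture=30.0, crop_biomass=4.0)   # synced from sensors

drought = twin
for _ in range(30):                     # simulate 30 rain-free days
    drought = drought.step(rain_mm=0.0)

print(drought)                          # a forecast, without risking the farm
```

Because the twin is immutable here, the synced state is never corrupted by an experiment: each simulated scenario diverges a copy, which mirrors how the digital version can wander into hypothetical futures while the physical record stays intact.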
Check out this video on industrial digital twins from NVIDIA, one of the world’s foremost AI and tech companies. And following that, there’s another short video about agricultural digital twins specifically. Alex speaks in the studio.
NVIDIA Speaker: The future of heavy industries starts as a digital twin. The AI agents helping robots, workers, and infrastructure navigate unpredictable events in complex industrial spaces will be built and evaluated first in sophisticated digital twins. This Omniverse digital twin of a 100,000-square-foot warehouse is operating as a simulation environment that integrates digital workers, AMRs (autonomous mobile robots) running the NVIDIA Isaac Perceptor stack, centralized activity maps of the entire warehouse from 100 simulated ceiling-mounted cameras using NVIDIA Metropolis, and AMR route planning with NVIDIA cuOpt. Software-in-the-loop testing of AI agents in this physically accurate, simulated environment enables us to evaluate and refine how the system adapts to real-world unpredictability. With generative AI-powered Metropolis vision foundation models, operators can even ask questions using natural language. The visual model understands nuanced activity and can offer immediate insights to improve operations. Text on screen, “Fusing Real Time AI with Digital Twins.” The NVIDIA logo appears in the corner of the screen. The screen splits and shows two videos of industrial machines working inside a warehouse. Text on screen, “NVIDIA 2024. Fusing real time AI with digital twins.” A computer screen shows two animated industrial robotic arms. A digital warehouse showing digital robotic arms assembling a car. Digital workers walk around the warehouse. Multiple top-down views of a digital warehouse, with large crates, machinery, boxes and workers. A digital worker sits with a digital computer and looks at various crate sizes in the warehouse. Two digital workers walk through a warehouse and pass two reach forklifts and shelves of boxes and wooden crates. Various digital workers, who are wearing hard hats and high-vis jackets, move around the simulation warehouse. Forklifts carry boxes and transport them across the warehouse. One of the digital workers stands by a warehouse stacking rack. The screen splits and on the left-hand side, it shows multiple images of various shelving racks, crates and boxes in the warehouse. On the right-hand side, it shows a top-down view of a digital warehouse with colored numbers on it ranging from 0 to 30. Written at the top is “Multi-camera Tracking. People, 30.” A top-down view of a digital warehouse. A green line appears and points from a forklift to a crate. A digital simulation of a warehouse shows multiple green lines on the floor with white dots on them overlapping. The green lines go in various directions on the warehouse floor. A digital forklift moves across the warehouse floor and follows a green line as it passes shelving racks. A red line appears in front of the green line and goes in a different direction. A digital chat box appears on screen and written at the top is “Agent Response.” A photo of a man appears and the speech bubble next to it reads, “Did any unusual situation occur in aisle 3?” A green speech bubble appears underneath and reads, “A shelf collapsed and the boxes fell from shelving at 3:30 P.M. leaving boxes blocking the aisle.” A video appears underneath and shows the boxes falling.
KETV Reporter: Imagine using in-the-field data. Text on screen, “New on KETV 7 Newswatch.” Two men move through a field of crops. Text on screen, “KETV, 2024. Digital twin crops.”

A label with a barcode is attached to one of the crops.
Farmer in Video: Three point two. Two people use a long rod to test the soil in a corn field.
Reporter: Mixing it with artificial intelligence technology. A computer screen shows a simulation of a plant. A man has a conversation with a woman.
James Schnable: To just plug in different parameters and suddenly the plant had a few more leaves, the leaves were longer, the angles were different. Another simulation shows a field of crops. The man speaks on camera.
Reporter: And sprinkle in computer graphics from the gaming world. Plants sway in the breeze and a screen shows a diagram depicting the ratio measurement from leaf to root divided by stem height.
James Schnable: We used those same approaches, originally developed for video games, to simulate corn plants and figure out all the different ways a corn plant can look. A young man looks at a set of data on a computer.
Reporter: University of Nebraska-Lincoln plant scientist James Schnable believes he’s found a way to revolutionize plant research. A computer screen shows a digital image of a crop changing in a software.
James Schnable: What I really hope this brings is a move from a guess-and-check approach to breeding. Another screen shows a video game.
Reporter: It’s building what’s called a digital twin. He received a three-year $2 million grant from the National Science Foundation, collaborating with researchers from Iowa State and Purdue Universities. While it’s groundbreaking for crops, it’s something other industries, such as manufacturing and healthcare, are already using. A young woman sits with a laptop and computer in front of her.
A man closes the front panel of a CNC machine.
A 3D scan of a foot appears on screen.
James Schnable: So now we could test potentially millions of combinations of different properties of plants, thousands of different combinations of planting plants in the field, as well as different environmental conditions. James Schnable speaks to the camera. Text on screen, “SimCity 2000.” The screen shows the game playing out.
Reporter: Think of it like the old computer game SimCity, where you can build different infrastructure and see how it affects traffic flows and development. The screen shows the game playing out.
James Schnable: Testing hybrids in the field, there’s only so many we can test. And so, in both cases, if you can play around a lot, you’re going to find much better ideas that you can test in the real world. And simulations give us that. The screen shows the game playing out.
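Schnable’s “play around a lot” point is essentially a parameter sweep. Here is a toy sketch (my illustration: the traits, ranges, and yield function are invented placeholders for a real crop simulator), showing why software can rank hundreds of combinations where a field trial could only test a handful:

```python
from itertools import product

# A toy sketch of simulation-driven breeding: sweep every combination of
# plant traits in software and rank them. The yield function is an
# invented placeholder for a real crop simulator.
def simulated_yield(leaf_angle, leaf_count, row_spacing_cm):
    light = leaf_count * (90 - leaf_angle) / 90        # toy light capture
    crowding = 1.0 - 30.0 / row_spacing_cm             # toy competition
    return light * max(crowding, 0.1)

combos = product(range(10, 80, 10),    # leaf angle, degrees
                 range(8, 16),         # leaves per plant
                 range(40, 100, 10))   # row spacing, cm

best = max(combos, key=lambda c: simulated_yield(*c))
print(best)                            # most promising combo to field-test
```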
Dr. Alex Kirkpatrick: We can’t talk about machine learning and deep learning without also acknowledging the potential challenges to sustainability. Machine learning comes at a massive cost to the environment. Just training older versions of ChatGPT consumed about as much energy as powering the average US home for about 123 years. Alex speaks in the studio.
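That figure is consistent with commonly cited estimates, though the exact inputs here are my assumptions rather than numbers stated in the video: roughly 1,287 MWh to train GPT-3, against roughly 10.5 MWh per year for an average US household.

```latex
\frac{1{,}287~\text{MWh (GPT-3 training, est.)}}{10.5~\text{MWh/yr (avg.\ US household, est.)}} \approx 123~\text{years}
```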
[Music in background]
Here are the important things to take from this video: LLMs like ChatGPT are models that generate outputs based on patterns. They still have a tendency to hallucinate and get things wrong, and should be used mindfully by verifying the truth and accuracy of their outputs. LLMs are an example of machine learning where an algorithm isn’t trained to give a specific answer to a specific input, but rather the algorithm is trained to recognize patterns and generate a response itself with some autonomy. In the same wheelhouse is deep learning, which utilizes neural networks or layers of algorithms and inputs to come up with responses. Both machine and deep learning are concepts that we’ll circle back to throughout this toolbox, as they might have some pretty smart applications in agriculture.
Text on screen, “The takeaways.” On the next slide, a picture of a light bulb with leaves growing out of it appears on screen. Text on screen: The first bullet point reads, “LLM = Large Language Models. Pattern-based predictive models.” The second bullet point reads, “Machine Learning. The algorithm learns patterns from large datasets to generate outputs.” The third bullet point reads, “Deep Learning. Utilizes neural networks of inputs and algorithms to recognize complex patterns and make complex predictions.”
I’ve reduced the complexity of all of these concepts for expedience and coherence. But hopefully, you’ll now have a clearer picture brought to mind when you hear terms like LLM, deep learning, and machine learning in the future. As ever, thank you very much for your attention and your engagement. [Music]
Text on screen, “the closer.” Alex speaks in the studio.
[Music] Text on screen, “Adieu. USDA National Institute of Food and Agriculture. U.S. Department of Agriculture. This material is based upon work that is supported by the National Institute of Food and Agriculture, U.S. Department of Agriculture, under award number 2023-38640-39571 through the Western Sustainable Agriculture Research and Education program under project number WPDP 24-013. USDA is an equal opportunity employer and service provider. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the U.S. Department of Agriculture. This material is based upon work supported by the AI Research Institutes program supported by NSF and USDA-NIFA under the AI Institute, Agricultural AI for Transforming Workforce and Decision Support (Ag AID). Award Number, 2021-67021-35344.” The logos for Western SARE (Sustainable Agriculture Research and Education), and Ag AID Institute display to the left.