With the proliferation of tools to collect and analyze data that can inform problem solving and decision making, the use of big data and data analytics has become ubiquitous throughout many industries. While the food industry may be slower to adopt big data and data analysis than some other industries, such as healthcare, it’s catching up as food scientists and other experts recognize its potential as a powerful tool to address large, complex problems in the food industry.
Food safety is one such problem. Affecting every step along the food supply chain, food safety relies on a company’s ability to gather reliable data in a timely manner and then act on that information as needed. From food traceability to digital pest management to better detection of foodborne illness outbreaks to reductions in food spoilage, big data and data analytics are being employed to advance food safety at the local and global levels.
“Big data can be used at all steps of the food value chain to improve food safety,” says John Donaghy, PhD, head of food safety at Nestle in Switzerland. On the farm or at primary processing steps, he cites several data types that can be collected to improve food safety, such as water analytical test data, hygiene status of workers, and certification status of farms/processors. At the consumption/public health end of the food chain, he cites the use of big data and data analytics for communicating recalls to consumers and for source tracking foods that cause foodborne illness outbreaks. Between these end points, he indicates numerous areas during manufacturing where data can be collected, e.g., microbiological verification testing, process control data, and environmental monitoring data. “Data relevant to food safety and quality can be collected at so many steps throughout [the] food chain; even real-time monitoring of temperature during logistics and transport in the supply chain can be incorporated into dynamic risk management,” he adds.
For food manufacturers and processors, from small to large businesses, the potential impact of big data and data analytics to improve food safety can be enormous. A 2022 report by the Global Food Safety Initiative (GFSI) Science and Technology Advisory Group (STAG) describes the potential impact on business, as well as what businesses should be thinking about when considering the use of big data and data analytics in their own organization (see Tables 1 and 2, below).
A final key question for businesses, according to the GFSI report, is: “Do businesses understand the strategic impact of big data on their operations and do they have the appropriate talent strategy for these changes?”
Several food safety experts offer their views on the value of big data and data analytics for food manufacturers and processors that may help businesses better answer these questions.
COLLECTING BIG DATA: INTERNAL AND EXTERNAL SOURCES
Suzy Sawyer, food safety, quality and regulatory digital and analytics leader at Cargill, underscores the growing role that data plays in helping food companies ensure safe, quality products. “What we’ve discovered at Cargill is that the vast amount of data collected from internal and external sources can be used to help identify potential food safety improvements, analyze and manage quality control, and mitigate food supply chain risks,” she says.
She cites a number of internal sources of data collection, including data gathered manually (plant floor quality and safety checks and observations), as well as sources from digitized technologies such as sensors (inline processing from machines/processes), data loggers (sensors capturing characteristics such as temperature and humidity), and instrumentation (near infrared detection instruments).
External data sources include technologies designed to exchange data to improve food safety, such as regulatory notifications or alerts, food-related media, weather, and commodity prices.
Digitizing data across the food supply chain enables companies to amass large quantities of internal data and to capture new data sources to reduce food safety risk. New sources of data, such as those available on smartphones and social media, are creating massive data sets, while new technologies allow for the sharing of big data through what is called the internet of things (IoT). Data from sensors, devices, machines, and computing services can now be shared via the internet or a communication medium such as Bluetooth. One example of this is the large amount of data captured by RFID technology, providing information such as batch dates, product variables, weights, and sizes. Wireless devices can be used to automatically read data from RFID tags to improve stock management. Connecting sensors to this system could provide additional data on the environmental condition of goods as they move through the supply chain, such as temperature, humidity, dust, dirt, microbes, or food spoilage chemicals.
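As a rough illustration of how logger or sensor readings might feed risk monitoring, the following Python sketch flags shipments whose temperature stays above a limit for longer than an allowed duration. The record format, the 5 °C limit, the 30-minute tolerance, and every name in the code are assumptions made for this example, not details of any system described in this article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Reading:
    """One hypothetical data-logger record for a shipment."""
    shipment_id: str
    timestamp: datetime
    temp_c: float


def find_excursions(readings, limit_c=5.0, max_duration=timedelta(minutes=30)):
    """Return sorted IDs of shipments that stayed above limit_c
    for longer than max_duration without cooling back down.

    limit_c and max_duration are illustrative defaults, not regulatory values.
    """
    flagged = set()
    start_of_run = {}  # shipment_id -> time the current warm run began
    for r in sorted(readings, key=lambda r: (r.shipment_id, r.timestamp)):
        if r.temp_c > limit_c:
            # Remember when this warm run started (only set on first warm reading).
            start = start_of_run.setdefault(r.shipment_id, r.timestamp)
            if r.timestamp - start > max_duration:
                flagged.add(r.shipment_id)
        else:
            # Temperature is back within the limit, so the warm run ends.
            start_of_run.pop(r.shipment_id, None)
    return sorted(flagged)


# Made-up readings: shipment A warms up and stays warm; B recovers quickly.
t0 = datetime(2023, 1, 1, 8, 0)
readings = [
    Reading("A", t0, 4.0),
    Reading("A", t0 + timedelta(minutes=20), 6.5),
    Reading("A", t0 + timedelta(minutes=40), 7.0),
    Reading("A", t0 + timedelta(minutes=60), 7.2),
    Reading("B", t0, 6.0),
    Reading("B", t0 + timedelta(minutes=20), 4.5),
    Reading("B", t0 + timedelta(minutes=40), 4.0),
]
print(find_excursions(readings))  # prints ['A']
```

In practice the limit and tolerance would come from a company's own food safety plan, and the readings from whatever logger or IoT platform is in use.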
Other sources of data that are being generated from whole genome sequencing (WGS) and other “omics-based” methods offer a way to more precisely identify and characterize, for example, a specific bacterium within a food system. These data rely on advances in technology, such as machine learning and artificial intelligence, to generate algorithms that can offer predictive models of risk. FDA’s GenomeTrakr network, for example, uses WGS to help reduce foodborne illnesses and deaths. Another potential use of GenomeTrakr is to sequence pathogens that are not foodborne but that may still be linked to disruptions in the food supply chain. To date, the GenomeTrakr network has performed WGS on bacteria such as Salmonella, Listeria, E. coli, Campylobacter, Vibrio, and Cronobacter, as well as parasites and viruses, and the resulting sequence data are publicly available via the National Center for Biotechnology Information website.
Abani K. Pradhan, PhD, professor in the department of nutrition and food science and Center for Food Safety and Security Systems at the University of Maryland in College Park, sees the increasing use of “omics-based” methods as a paradigm shift in bacterial surveillance. He says that machine learning has the potential to “extract useful patterns that could help improve current methods and models to predict risk or help improve manufacturing- and processing-related decision making.”
Dr. Pradhan emphasizes, however, that current risk assessment frameworks and predictive models don’t typically incorporate useful information such as pathogen genomics data. He and his colleagues at the University of Maryland are currently testing ways to improve food safety by integrating experimental and field data with mathematical modeling and developing predictive and risk models to help guide policymakers, government agencies, and the food industry in making informed risk management decisions. They are also developing models to use bacterial genomic data, along with the accompanying metadata, to help predict whether a bacterium is more or less virulent in host systems. Further research to develop a new method for incorporating bacterial genomic data into a dose-response modeling framework is underway (Risk Analysis. 2022;43:440-445).
“The primary advantage of these models is that they introduce a way to predict microbial behavior from a genomic perspective, particularly in microbial species that are known to have several subtypes (with potentially different characteristics) that can cause human infection and illness,” he adds.
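For readers unfamiliar with dose-response modeling, the minimal Python sketch below uses the classical exponential dose-response model, P(illness) = 1 − exp(−r × dose), to show where subtype-specific information could in principle enter: each subtype is given its own r. This is not the method from the cited paper. The subtype names and r values are invented purely for illustration.

```python
import math

# Hypothetical per-subtype parameters: r is the probability that a single
# ingested cell causes illness. These values are invented for illustration.
R_BY_SUBTYPE = {"subtype_a": 1e-3, "subtype_b": 1e-5}


def illness_probability(dose_cells, r):
    """Exponential dose-response model: P = 1 - exp(-r * dose)."""
    if dose_cells < 0 or r < 0:
        raise ValueError("dose and r must be non-negative")
    return 1.0 - math.exp(-r * dose_cells)


# Compare the two hypothetical subtypes at the same ingested dose.
for subtype, r in R_BY_SUBTYPE.items():
    p = illness_probability(1000, r)
    print(f"{subtype}: P(illness at 1,000 cells) = {p:.4f}")
```

With the same dose, the subtype with the larger r yields the higher predicted probability of illness, which is the kind of distinction a genomics-informed model would aim to capture.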
Whether it is collected internally or externally, big data is just the source. And as indicated by the research just described, the real impact of big data comes from analyzing it and interpreting what the information means for an actionable goal.
DATA ANALYSIS: TRANSLATING DATA INTO ACTIONABLE INFORMATION
To harness the ability of big data to improve food safety, analytics are needed to translate data into actionable information. The term “precision food safety” is now used to refer to the use of big data, particularly the new data sources obtained from genomic sequencing and other “omics-based” methods, to improve food safety.
Strategic use of big data relies on the ability to analyze the information, whether by a food scientist within a company or a researcher working on developing predictive and risk models to help the food industry mitigate food safety risks.
Experts cite several challenges to this goal, one of which is precisely the “bigness” in big data. Dr. Donaghy refers to this as the volume and veracity of data. “The user has to understand where they can get the most value from all of this data and whether it is reliable enough,” he says.
Dr. Pradhan describes this challenge to processing large quantities of data as needing to “extract meaningful information from it, while ignoring ‘noise’ or irrelevant data.”
Another challenge is the need to digitize data so that it is in a form that can be analyzed, either by machine algorithms or trained personnel. Sawyer notes that in companies that have not modernized and are still working with legacy technology or manual processes, collecting data digitally or in a structured format may not be possible. She says a common theme Cargill hears when benchmarking with companies is that there are large amounts of unstructured data exchange between organizations. “Companies need to have the ability to extract meaningful information from these inconsistent formats and languages,” she says.
Another challenge is the lack of data standardization. “In the absence of industry-wide and cross-industry data standards, sources of data have established their own definitions that don’t always translate between systems internal to an organization or externally,” says Sawyer.
Not only does this make it difficult to connect or exchange data across multiple sources to make information consumable and informative, but, Sawyer says, the lack of data standards can also affect the ability to filter big data sets for what is relevant to an organization or to a problem to be solved. “One way Cargill addresses some of these challenges is through the use of metadata and data science concepts such as natural language processing,” she adds. “Our team of digital, data, and analytics experts within our food safety, quality, and regulatory organization is also focused on new ways of working and improving food safety through data-generating technologies.”
Dr. Donaghy underscores that not all companies will be able to easily meet these challenges. “Companies need to see the value/benefit of moving from their current ways of working,” he says. He cites examples of how different-sized companies can begin to use big data in their operations. For smaller companies, he cites the many off-the-shelf digital solutions that can be purchased, such as recall-ready software that companies can plug into, and programs for ready-made environmental monitoring that companies can use to plug in their test data results.
Larger companies, he says, may employ data scientists who can understand and help improve their internal data—such as supplier management data, certification/audit data, incident management data, cleaning program data, and environmental monitoring data—through data analytics, such as predictive analytics.
Dr. Donaghy notes, however, that companies will still need food safety and quality experts to direct data scientists. He cites the example of next-generation sequencing as a diagnostic/investigation tool for food safety. “Companies can employ a third-party laboratory to do this for them, or they can do this internally,” he says. “If they do the latter, it will require food safety specialists as well as bio-informaticians.”
For Dr. Pradhan, hiring a data analyst to process and analyze big data may seem logical, but he thinks that food manufacturers or other stakeholders may benefit by getting training in these techniques from subject matter experts, such as scientists and researchers who have a good understanding of the food processing, manufacturing, and safety paradigms.
Whether a company hires someone new or educates current employees, a certain skillset will be needed to navigate this new terrain of big data and data analytics as applied to food safety. Sawyer lists four primary skillsets: data literacy (the ability to read, understand, and interpret data), data translation (the ability to understand the business needs, to speak technology, and to translate between the entities), data analytics (the ability to analyze data for insights and decision making), and data science (the ability to uncover patterns in data and build predictive models with artificial intelligence algorithms such as machine learning).
To realize the full potential of big data and data analytics to improve food safety, data sharing among companies, regulatory bodies, and researchers is vital. Amassing large amounts of data from large numbers of sources matters because the more data that is available to work with, the stronger a company’s ability to use the data to see patterns, predict risks, and make decisions.
Barbara Kowalcyk, PhD, director of the Center for Foodborne Illness (CFI) Research and Prevention and associate professor of food safety and public health at The Ohio State University in Columbus, and her colleagues have been working on how to facilitate data sharing, given the need to aggregate data across industry to best inform algorithms based on artificial intelligence and machine learning. “Data sharing between organizations in the [food] industry is difficult from a proprietary perspective, so we need to figure out a way to share data and aggregate it,” she says. “If you have enough data, you can mine the data to help inform specific situations on what works best and then share it.” For example, if an intervention has worked well for one company, sharing that with others may allow those companies to direct resources toward that intervention.
Dr. Kowalcyk and her colleagues are working to develop a data governance framework for sharing public and private sector data that will support the development of risk assessment models and burden-of-disease estimates. The project will help answer questions that many people in the private and public sectors have regarding data sharing, such as who will have access to the data, how it will be used, and how confidentiality will be protected.
Initiatives underway are already highlighting both the reasons for and the benefit of sharing data to improve food safety. Along with the GenomeTrakr Network, FDA is piloting several other initiatives under FDA’s New Era of Smarter Food Safety. Launched in 2020, this initiative employs a number of “smarter” tools and approaches to improve food safety, such as root cause and predictive analyses, as well as other tools such as partnering with states to leverage data and analytics. Other FDA initiatives include the Artificial Intelligence Imported Seafood Pilot, the Domestic Mutual Reliance, and Remote Regulatory Assessments.
With access to new tools to capture large amounts of data and the means to interpret that data to improve food safety, food companies have a powerful new way, right at their fingertips, to ensure the safety of their food products along the food supply chain. “All food sectors will benefit from the further use of big data interlinked from food source to consumption,” says Dr. Donaghy, “from the smarter way we do agriculture through to the more precise way authorities and manufacturers perform source attribution and investigation.”
TABLE 1. HOW DATA ANALYSIS CAN BENEFIT FOOD BUSINESSES
- Provide precise understanding of the reason for food spoilage.
- Improve a food’s shelf life by examining microflora in the plant environment.
- Track how a pathogen was introduced into a plant and how it is transferred from one location to another.
- Track the origin of an ingredient/lot of food.
- Better assess risks related to food/commodities from origin to harvest, to transport, etc.
- Ensure a product is not involved in a foodborne outbreak.
- Rapidly identify a contaminated lot if a product is involved in a foodborne outbreak to reduce the size and scope of a recall.
- Authenticate products.
- Use social media for early warning and mitigation of foodborne recalls or outbreaks.
TABLE 2. WHAT FOOD BUSINESSES SHOULD KNOW WHEN USING BIG DATA
- Recognize how big data can help drive continuous quality improvement, as well as its limitations and gaps.
- Hire personnel who recognize when and where it makes sense to collect, store, analyze, and visualize big data.
- Put mechanisms in place to use outputs from data analytics to make decisions, such as collating data in dashboards for use in, for example, early warning, root cause analyses of incidents, supplier risk profiling, or manufacturing reaction.
- Share data globally and between agencies to help monitor the flow of pathogens through global supply chains.