Evaluating responses by ChatGPT to farmers' questions on irrigated lowland rice cultivation in Nigeria | Scientific Reports – Nature.com
Scientific Reports volume 14, Article number: 3407 (2024)
The limited number of agricultural extension agents (EAs) in sub-Saharan Africa limits farmers’ access to extension services. Artificial intelligence (AI) assistants could potentially aid in providing answers to farmers’ questions. The objective of this study was to evaluate the ability of an AI chatbot assistant (ChatGPT) to provide quality responses to farmers’ questions. We compiled a list of 32 questions related to irrigated rice cultivation from farmers in Kano State, Nigeria. Six EAs from the state were randomly selected to answer these questions. Their answers, along with those of ChatGPT, were assessed by four evaluators in terms of quality and local relevancy. Overall, chatbot responses were rated significantly higher quality than EAs’ responses. Chatbot responses received the best score nearly six times as often as the EAs’ (40% vs. 7%). The evaluators preferred chatbot responses to EAs in 78% of cases. The topics for which the chatbot responses received poorer scores than those by EAs included planting time, seed rate, and fertilizer application rate and timing. In conclusion, while the chatbot could offer an alternative source for providing agricultural advisory services to farmers, incorporating site-specific input rate-and-timing agronomic practices into AI assistants is critical for their direct use by farmers.
In sub-Saharan Africa (SSA), rice productivity is often low due to sub-optimal crop management practices by smallholder farmers [1,2,3]. Farmers have limited access to agricultural extension services because there are few extension agents (EAs), and as a result many rice farmers lack access to up-to-date advice on rice production [4,5]. Furthermore, within rural socio-cultural systems, EAs often do not effectively reach women farmers. In some areas in SSA, women are negatively affected by socio-cultural and religious constraints, which forbid them from communicating freely with men outside their families [5]. A wide variety of technology dissemination and scaling tools (rural radio, videos, etc.) have been developed and used to reach women farmers [6,7]. A dissemination approach in which women service providers reach women farmers has also been proposed for providing field-specific recommendations to farmers, which requires service providers to have digital technologies (smartphone, tablet) [5]. While further efforts are needed to improve access to electricity and the internet to aid the adoption of digital extension services in rural agrarian communities in SSA, the recent development of artificial intelligence (AI) assistants offers an unexplored resource for addressing the challenges farmers face. One such platform, ChatGPT, represents a new generation of AI technologies driven by advances in large language models [8]. A recent study on health care reported that, although the system was not developed to provide health care, chatbot responses were preferred over physician responses and rated significantly higher for both quality and empathy [9]. However, its ability to help address farmers’ questions on rice cultivation in SSA is unexplored.
Therefore, the objective of this study was to evaluate the ability of an AI chatbot assistant (ChatGPT) to provide quality responses to farmers’ questions on rice production. We tested ChatGPT’s ability to respond with high-quality answers to farmers’ questions by comparing the chatbot responses with EAs’ responses to questions in Kano State, one of the major rice-producing areas in northern Nigeria [10,11].
Table 1 shows the questions related to rice production, which were compiled from interviews in which 107 farmers were asked what they would like to ask EAs to improve their rice production. The most common questions concerned types of inputs (variety, fertilizer, herbicide). In terms of the number of questions per intervention area, crop establishment, insect and disease management, and weed management had the most (5, 5, and 4, respectively). Examples of EAs’ and chatbot responses to questions (nos. 1–3) are shown in Table 2. Chatbot responses were significantly longer on average (335 [202–468] words) than EAs’ responses (10 [2–45] words), which did not differ in length between EAs with and without extension materials (Fig. 1).
Figure 1. Number of words per response authored by extension agents (EAs) and chatbot. As there was no difference in the number of words per response by EAs without and with extension materials, data from both were combined. Different letters indicate a significant difference (P < 0.001).
Averaged over the 32 questions, evaluators rated chatbot responses significantly higher in quality than responses by EAs without and with extension materials, by 19% and 15%, respectively (P < 0.01) (Table 3). The mean rating for chatbot responses corresponded to a “good” response (3.8), whereas those for EAs’ responses without and with extension materials corresponded to an “acceptable” response (3.2 and 3.3, respectively). There was no significant difference in scores between EAs’ responses without and with extension materials. The Pearson correlation coefficient between scores of responses by EAs without and with extension materials was positive and significant (r = 0.71, P < 0.01). The correlation coefficients between scores of responses by the chatbot and by EAs without and with extension materials were not significant (r = − 0.13, P > 0.05; r = − 0.15, P > 0.05).
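The percentage gaps quoted above can be checked with simple arithmetic. The sketch below assumes the 19% and 15% figures express the chatbot's mean rating relative to each EA group's mean rating; the text does not state how the percentages were derived, so this is our reading. It uses the rounded means reported above, so the results are approximate, and the function and variable names are illustrative only.

```python
def relative_difference_pct(value: float, baseline: float) -> float:
    """Return how much `value` exceeds `baseline`, as a percentage of `baseline`."""
    return (value - baseline) / baseline * 100.0


# Rounded mean ratings as reported in the text (1-5 scale).
chatbot_mean = 3.8
ea_without_materials_mean = 3.2
ea_with_materials_mean = 3.3

# Assumption: the EA mean is the baseline for each comparison.
diff_without = relative_difference_pct(chatbot_mean, ea_without_materials_mean)
diff_with = relative_difference_pct(chatbot_mean, ea_with_materials_mean)

print(f"Chatbot vs. EAs without materials: about {round(diff_without)}% higher")
print(f"Chatbot vs. EAs with materials: about {round(diff_with)}% higher")
```

Under that assumption, the rounded means reproduce the reported gaps to the nearest whole percent.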
The proportion of responses rated very good (a score of 5 on the 1–5 scale) was significantly higher (P < 0.05) for chatbot responses than for those of EAs without and with extension materials (Table 4). The chatbot achieved the best score nearly six times as often as EAs (40% vs. 6% and 8%). In contrast, the proportion of responses rated acceptable was significantly lower for the chatbot than for EAs without and with extension materials (18% vs. 51% and 46%; Table 4). There was no significant difference in the number of responses rated poor and very poor between the chatbot and EAs without and with extension materials (Table 4).
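The Methods state that a chi-squared test was used to compare evaluators' scores between responders. As a minimal sketch of that kind of comparison, the snippet below computes the Pearson chi-squared statistic for a small table of rating counts. The counts are hypothetical and chosen only for illustration; they are not the study's data. The function name is ours, and the study itself ran its analyses in R, not Python.

```python
def chi_squared_statistic(table):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts.

    Uses the plain Pearson formula, without a continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rating category were independent of responder.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# HYPOTHETICAL counts, not taken from the study.
# Rows: responder; columns: ratings of "very good" vs. any other rating.
hypothetical_counts = [
    [20, 30],  # chatbot
    [10, 40],  # extension agents
]

statistic, dof = chi_squared_statistic(hypothetical_counts)

# Critical value of the chi-squared distribution for 1 degree of freedom
# at the 0.05 significance level.
CRITICAL_VALUE_DF1_P05 = 3.841
significant = statistic > CRITICAL_VALUE_DF1_P05

print(f"chi-squared = {statistic:.3f}, df = {dof}, significant at 0.05: {significant}")
```

Statistical packages may apply a continuity correction to 2 x 2 tables by default, so their output can differ slightly from this uncorrected statistic.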
Across the 32 questions, the evaluators preferred the chatbot response over the responses by EAs without and with extension materials in 78% and 69% of cases, respectively (Fig. 2). When we examined the responses for which the chatbot scored lower than the EAs (questions 11, 12, and 23 in Tables 1 and 3) or scored below 3 (questions 14 and 16), we found that the chatbot had provided inaccurate information (Table 5): the chatbot-recommended seed rate was too high (11); planting time was not correct in the dry season (12); financial services were not available (14); soil testing was not recommended (16); and the recommended number of seedlings per hill differed between the two seasons, although it should not (23).
Figure 2. Cumulative probability of the difference in score between responses authored by extension agents (EAs) without and with extension materials and chatbot. Response scoring options had a 1–5 scale, where higher values indicated greater quality.
After reviewing the chatbot responses, five out of the six EAs who had answered the 32 questions indicated that the chatbot provided relevant answers on rice cultivation and could be used as a tool for EAs to provide farmers with advice (Table 6). All EAs rated the chatbot responses better than their own answers to the questions, and were willing to use the chatbot in the future to get the required information to assist farmers.
Although chatbot responses were much longer than EAs’ responses, the evaluators preferred chatbot-generated responses over those by EAs, even when the latter had extension materials. In fact, having extension materials did not significantly improve quality scores, and the scores of responses by EAs with and without extension materials were highly correlated. The chatbot is programmed to provide detailed and comprehensive responses, whereas EAs may provide more concise and practical advice based on their experience. Although the evaluators valued the detailed and comprehensive information provided by the chatbot, farmers might hold different opinions. Longer answers by the chatbot could overwhelm farmers with too much information. Further evaluation by farmers is needed before the chatbot is used directly by farmers.
This result is also consistent with a recent study on health care [9], which reported that chatbot responses were preferred over physician responses and rated significantly higher for both quality and empathy. The results from this study suggest that a chatbot might become a useful source of information for advising farmers who have limited access to EAs. However, there was no relationship between the scores of responses by the chatbot and those by EAs, and the chatbot provided inaccurate information related to planting time, seed rate, and fertilizer application rate and timing; rural farmers should be made aware of this limitation. Our result supports the paper on large language models (LLMs) and agricultural extension services [12], which proposed an idealized LLM design process with human experts in the loop. Consequently, direct use of this tool by farmers is not recommended at present. Instead, the chatbot could assist EAs when giving advice to farmers by drafting a message based on farmers’ questions. Such an AI-assisted approach could save EAs’ time, enabling them to reach more farmers. Furthermore, EAs could improve their overall communication skills by reviewing and modifying AI-written drafts. Further research is needed to evaluate how an AI assistant can enhance EAs’ responses to farmers’ questions and improve their skills and knowledge.
For direct use by farmers, this study highlights the importance of ensuring that the chatbot is programmed with accurate and up-to-date information and that its responses are regularly reviewed and updated by experts in the field. This could involve ensuring that the AI assistant technologies are tailored to the needs and context of the farmers, providing practical and actionable advice, and ensuring that the AI assistant technologies are developed and implemented in a way that is transparent, accountable, and responsive to the needs and concerns of farmers. By addressing these challenges, farmers could directly benefit from AI assistant technologies. Further research is also needed to evaluate farmers’ perception of advisories provided by AI assistant technologies, changes in farmers’ practices after receiving advisories, and their target impact area (e.g., productivity, resource use efficiency, soil health) [13].
In June 2023, we conducted interviews with farmers who grow rice under irrigated conditions in Kano State, northern Nigeria. Seventeen women and 90 men farmers were randomly selected from 4032 farmers who had participated in an on-farm survey the previous year (unpublished data) and were asked what questions they would like to ask EAs to improve their rice production. Each farmer provided up to five questions. After compiling all the questions, we merged similar questions. We also removed some questions that were not relevant for irrigated rice production (e.g., drought-tolerant varieties). We modified questions to make sure that we consistently included information on location and rice production system and protected farmers’ identities. Table 1 shows the list of 32 questions used in this study, which covered a wide range of agronomic interventions including seed, variety, land preparation, crop establishment method, and management of nutrients, water, weeds, insects, and diseases.
On August 10, 2023, the full text of the questions (Table 1) was put into a fresh chatbot session [8] free of prior questions that could bias the results, and the chatbot response was saved in a Word file.
Six EAs were nominated from an agricultural extension office in Kano based on their expertise and knowledge of rice cultivation practices. To protect EAs’ identities, we do not specify names of the organizations in this paper. Three of the agents were women. None of them had used a chatbot for their extension services before. They were divided into two groups. One group (three agents) used extension materials for answering questions, while the other group did not. They wrote answers to questions on paper in their offices under the supervision of enumerators. The number of words in the responses by EAs with and without extension materials and by the chatbot was counted. After EAs completed their responses, they reviewed the chatbot responses and were then asked about its potential use.
After all responses from the six EAs and the chatbot were compiled, the order of the seven answers was randomized separately for each question, so the order could differ from one question to another. We then labeled the answers 1 to 7 within each question to blind evaluators to the identity of the responders. We removed information that evaluators could use to identify the responders (for the chatbot, we removed statements such as “I’m an artificial intelligence”). All the responses were evaluated by four local rice experts, two from research organizations and two from public extension agencies having good knowledge of local rice production. The evaluators were asked to judge the quality of the responses in terms of local relevance using a Likert scale (1, very poor; 2, poor; 3, acceptable; 4, good; and 5, very good).
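The per-question randomization and labeling described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure the authors actually ran: the responder names, the placeholder answer texts, the fixed random seed, and the `blind_responses` helper are all hypothetical.

```python
import random


def blind_responses(responses_by_author, rng):
    """Shuffle one question's responses and label them 1..n.

    Returns (blinded, key): `blinded` maps label -> response text and is what
    evaluators would see; `key` maps label -> author and is kept aside.
    """
    authors = list(responses_by_author)
    rng.shuffle(authors)  # independent random order for this question
    blinded = {}
    key = {}
    for label, author in enumerate(authors, start=1):
        blinded[label] = responses_by_author[author]
        key[label] = author
    return blinded, key


# Hypothetical responders: six extension agents plus the chatbot.
authors = [f"EA{i}" for i in range(1, 7)] + ["chatbot"]

# Placeholder texts standing in for the real answers.
questions = {
    qid: {author: f"placeholder answer {n}" for n, author in enumerate(authors)}
    for qid in ("Q1", "Q2", "Q3")
}

rng = random.Random(2023)  # fixed seed (an assumption) for reproducibility

blinded_sets = {}
answer_keys = {}
for qid, responses in questions.items():
    blinded_sets[qid], answer_keys[qid] = blind_responses(responses, rng)

for qid in questions:
    print(qid, "label order:", [answer_keys[qid][label] for label in range(1, 8)])
```

Keeping the label-to-author key separate from the blinded set is what allows scores to be mapped back to responders after the evaluation.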
Scores were averaged across evaluators for each question. This method is used when there is no ground truth for the outcome being studied and the evaluated outcomes themselves are inherently subjective. Thus, the mean score reflects evaluator consensus, and disagreements (or inherent ambiguity and uncertainty) between evaluators are reflected in the score variance. Analysis of variance (ANOVA) was conducted to assess differences in quality score between the responses of EAs with and without extension materials and the ChatGPT responses. The chi-squared test was applied to identify significant differences between evaluators’ scores on responses by EAs with and without extension materials and the chatbot. For the chi-squared test, the null hypothesis states that there is no significant difference between the evaluators’ scores, whereas the alternative hypothesis states that these scores differ. Because the number of words in the responses of EAs with and without extension materials was similar, we employed a t-test to compare the number of words between EAs’ and chatbot responses. Shapiro–Wilk and Bartlett tests were used before ANOVA and t-tests to check that the data were normally distributed and had homogeneous variance. Mean separation was done using the Tukey HSD approach. Pearson correlation between scores of the responses of EAs and the chatbot was calculated. All statistical analyses were performed in R statistical software, version 4.3.1 [14].
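To make the first two steps concrete, the sketch below averages scores across evaluators for each question and then computes a one-way ANOVA F statistic across the three response sources. It is written in Python for illustration, whereas the study used R, and all scores are hypothetical (four evaluators, three questions). The function names are ours. The sketch treats per-question mean scores as independent observations, which is a simplifying assumption, and it stops at the F statistic because the standard library has no F distribution for computing a P value.

```python
from statistics import fmean


def mean_score_per_question(scores_by_evaluator):
    """Average across evaluators; input holds one list of per-question scores per evaluator."""
    return [fmean(per_question) for per_question in zip(*scores_by_evaluator)]


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of values."""
    all_values = [value for group in groups for value in group]
    grand_mean = fmean(all_values)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((value - fmean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# HYPOTHETICAL Likert scores (1-5): rows are evaluators, columns are questions.
chatbot_raw = [[4, 5, 3], [4, 4, 3], [5, 5, 4], [3, 4, 2]]
ea_without_raw = [[3, 3, 4], [3, 2, 4], [4, 3, 3], [2, 4, 3]]
ea_with_raw = [[3, 4, 3], [3, 3, 4], [4, 3, 4], [2, 4, 3]]

chatbot_means = mean_score_per_question(chatbot_raw)
ea_without_means = mean_score_per_question(ea_without_raw)
ea_with_means = mean_score_per_question(ea_with_raw)

f_stat, df_between, df_within = one_way_anova_f(
    [chatbot_means, ea_without_means, ea_with_means]
)

print("Mean chatbot score per question:", chatbot_means)
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

In practice a statistics package would also report the P value and handle the post hoc mean separation, which this sketch leaves out.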
The distribution of the expert assessment of the responses is presented in Fig. 2. We report the percentage of questions for which the chatbot response was preferred and identify the questions for which the chatbot responses had lower scores than those of EAs.
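One way to compute the two quantities reported here is sketched below. The text does not define exactly how "preferred" was determined, so this sketch assumes a response is preferred when its mean score is strictly higher, with ties counted for neither side. The scores are hypothetical and the helper name is ours.

```python
def preference_summary(chatbot_scores, ea_scores):
    """Compare mean scores per question.

    Both arguments map question number -> mean score. Returns the percentage of
    questions where the chatbot scored strictly higher, and the sorted list of
    questions where it scored strictly lower. Ties count for neither side.
    """
    preferred = [q for q in chatbot_scores if chatbot_scores[q] > ea_scores[q]]
    lower = [q for q in chatbot_scores if chatbot_scores[q] < ea_scores[q]]
    share_preferred = 100.0 * len(preferred) / len(chatbot_scores)
    return share_preferred, sorted(lower)


# HYPOTHETICAL mean scores for four questions, not the study's data.
chatbot_scores = {1: 4.0, 2: 4.5, 3: 3.0, 4: 3.5}
ea_scores = {1: 3.0, 2: 3.0, 3: 3.5, 4: 3.5}

share_preferred, lower_questions = preference_summary(chatbot_scores, ea_scores)

print(f"Chatbot preferred for {share_preferred:.0f}% of questions")
print("Questions where the chatbot scored lower:", lower_questions)
```

The list of lower-scoring questions is the natural starting point for the kind of manual accuracy review summarized in Table 5.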
The authors confirm that all methods were carried out in accordance with relevant guidelines and regulations. The authors confirm that all experimental protocols were approved by Africa Rice Center Scientific Committee. The authors confirm that informed consent was obtained from all subjects involved in this study.
Data will be made available on request. Requests can be sent by email to Dr. Ali Ibrahim (i.ali@cgiar.org).
1. Senthilkumar, K. et al. Quantifying rice yield gaps and their causes in Eastern and Southern Africa. J. Agron. Crop Sci. 206(4), 478–490 (2020).
2. Saito, K. et al. Status quo and challenges of rice production sub-Saharan Africa. Plant Prod. Sci. 26, 320–333 (2023).
3. Dossou-Yovo, E. R., Vandamme, E., Dieng, I., Johnson, J. M. & Saito, K. Decomposing rice yield gaps into efficiency, resource and technology yield gaps in sub-Saharan Africa. Field Crops Res. 258, 107963 (2020).
4. Achandi, E. L. et al. Women’s access to agricultural technologies in rice production and processing hubs: A comparative analysis of Ethiopia, Madagascar and Tanzania. J. Rural Stud. 60, 188–198 (2018).
5. Zossou, E., Saito, K., Assouma-Imorou, A., Kokou, A. & Tarfa, B. D. Participatory diagnostic for scaling a decision support tool for rice crop management in northern Nigeria. Dev. Pract. 31(1), 11–26 (2021).
6. Zossou, E., Van Mele, P., Wanvoeke, J. & Lebailly, P. Participatory impact assessment of rice parboiling videos with women in Benin. Exp. Agric. 48(3), 438–447 (2012).
7. Zossou, E., Vodouhe, S. D., Van Mele, P., Agboh-Noameshie, A. R. & Lebailly, P. Linking local rice processors’ access to rural radio, gender, and livelihoods in Benin. Dev. Pract. 25(7), 1057–1066 (2015).
8. Chat GPT. https://openai.com/blog/chatgpt. Accessed 10 August 2023.
9. Ayers, J. W. et al. Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social media forum. JAMA Intern. Med. 183(6), 589–596 (2023).
10. NIAR group. Agricultural Performance Survey Report 2019: Rice (Nigeria Institute for Agricultural Research, 2020).
11. Kamai, N., Omoigui, L. O., Kamara, A. Y. & Ekeleme, F. Guide to Rice Production in Northern Nigeria (International Institute of Tropical Agriculture, 2020).
12. Tzachor, A. et al. Large language models and agricultural extension services. Nat. Food 4, 941–947 (2023).
13. Saito, K. et al. Agronomic gain: Definition, approach, and application. Field Crops Res. 270, 108193 (2021).
14. R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2023).
The authors would like to thank the Bill & Melinda Gates Foundation (Seattle, USA; Grant ID INV-005431) for funding this study through the CGIAR Excellence in Agronomy initiative. We would like to thank the extension agents from Kano State who answered farmers’ questions, and the evaluators who dedicated their time to evaluating responses from extension agents and the chatbot. We are grateful to Christian Alvari, Research Assistant, for supervising the extension agents who answered farmers’ questions.
Africa Rice Center (AfricaRice), PMB 82, Abuja, 901101, Nigeria
Ali Ibrahim
Faculté d’Agronomie, Université Abdou Moumouni, B.P. 10960, Niamey, Niger
Ali Ibrahim
Africa Rice Center (AfricaRice), B.P. 1690, 101, Antananarivo, Madagascar
Kalimuthu Senthilkumar
Africa Rice Center (AfricaRice), 01 B.P. 2551, Bouaké 01, Côte d’Ivoire
Kazuki Saito
International Rice Research Institute (IRRI), DAPO Box 7777, 1301, Metro Manila, Philippines
Kazuki Saito
A.I.: Methodology, data collection, curation & analysis, visualization, writing (review and editing); K.S.: Methodology, writing (review and editing); K.S.: Conceptualization, methodology, writing (original draft).
Correspondence to Kazuki Saito.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Ibrahim, A., Senthilkumar, K. & Saito, K. Evaluating responses by ChatGPT to farmers’ questions on irrigated lowland rice cultivation in Nigeria. Sci Rep 14, 3407 (2024). https://doi.org/10.1038/s41598-024-53916-1
DOI: https://doi.org/10.1038/s41598-024-53916-1
© 2024 Springer Nature Limited