CAMAGRI-GPT: A parameter-efficient agricultural knowledge system using domain-adapted large language models with retrieval augmentation
-
Abstract
Despite their transformative potential, large language models (LLMs) remain underutilized in agriculture because of domain-specific data scarcity and computational constraints. This study presents CAMAGRI-GPT, a parameter-efficient agricultural consultation system that addresses both challenges through domain adaptation. A corpus of 2.3 million annotated entries was constructed from raw documents (18 categories, annotation agreement κ=0.82), and LoRA (r=8) and P-tuning v2 were applied to reduce trainable parameters to 0.2% while maintaining 95.8% performance. A retrieval-augmented generation (RAG) framework with HNSW indexing achieves a retrieval latency of (87±12) ms, enabling real-time consultation. CAMAGRI-GPT achieved over 90.0% accuracy on three representative agricultural tasks (crop management, pest and disease diagnosis, and agricultural Q&A), consistently outperforming the GPT-3 and BERT-Agri baselines (p<0.001). Median response latency remained below 2 s across all query categories, meeting field deployment requirements. These results demonstrate that domain-adapted LLMs can deliver expert-level agricultural knowledge to resource-constrained farming communities, offering a scalable and sustainable complement to declining traditional extension services.
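To make the parameter-efficiency figure concrete, the following minimal sketch counts the parameters that rank-r LoRA adapters add to a model. It relies only on the standard LoRA construction, in which a weight matrix of shape d_out x d_in receives two low-rank factors with r x (d_in + d_out) parameters in total. The hidden size, layer count, number of adapted matrices per layer, and base model size below are illustrative assumptions and are not taken from this study, so the result is not expected to reproduce the reported 0.2%, which also includes the P-tuning v2 parameters.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def trainable_fraction(base_params: int, adapted_shapes, r: int) -> float:
    """Share of all parameters that are trainable when only LoRA adapters are trained.

    adapted_shapes is an iterable of (d_in, d_out) pairs, one per adapted matrix.
    """
    added = sum(lora_param_count(d_in, d_out, r) for d_in, d_out in adapted_shapes)
    return added / (base_params + added)


# Hypothetical configuration (assumed for illustration only).
HIDDEN = 4096             # assumed hidden size
LAYERS = 32               # assumed number of transformer layers
ADAPTED_PER_LAYER = 2     # assumed: two square projection matrices per layer
BASE_PARAMS = 7_000_000_000  # assumed frozen base model size
RANK = 8                  # r = 8, as stated in the abstract

shapes = [(HIDDEN, HIDDEN)] * (ADAPTED_PER_LAYER * LAYERS)
fraction = trainable_fraction(BASE_PARAMS, shapes, RANK)

if __name__ == "__main__":
    print(f"LoRA trainable share: {fraction:.4%}")
```

Under these assumed dimensions the adapters account for well under 0.1% of all parameters, which illustrates why low-rank adaptation keeps the trainable share to a fraction of a percent.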
-