Journal Information
- Title: Journal of Hebei Normal University (Natural Science)
- Sponsor: Hebei Normal University
- ISSN: 1000-5854
- CN: 13-1061/N
- Core Journal of China Science and Technology
- Journal selected for the China Journal Matrix
- Excellent Sci-Tech Journal of Chinese Universities
- Excellent Journal of North China
- Excellent Sci-Tech Journal of Hebei Province
A Coarse-to-fine Grained Model for Grass Classification: A CLIP-based Implementation
- School of Intelligent Science and Engineering, Qinghai Minzu University, Xining 810007, Qinghai, China
DOI:10.13763/j.cnki.jhebnu.nse.202601003
CLIP-based Coarse-to-fine Grained Model for Grasses Classification
Abstract
In smart-agriculture applications, computer vision systems often suffer from poor recognition of out-of-distribution (OOD) samples and insufficient fine-grained classification ability in open environments, as in grass image classification. Traditional CNN-based architectures perform poorly in open scenarios: when presented with samples that deviate from the training distribution, they fail to identify them as OOD and instead assign them in-distribution labels, which significantly degrades model stability. To address this, a classification method based on a vision-language model is proposed. The visual and text encoders of a pretrained CLIP (contrastive language-image pretraining) model extract image and text feature embeddings, and cross-modal matching via feature-similarity comparison improves classification performance. First, prompt-guided coarse-grained discrimination quickly separates images of known categories from OOD categories; second, images judged to belong to known categories are passed to a CUM-CLIP (custom adapter-CLIP) module for fine-grained recognition that further distinguishes specific subcategories, yielding refined, hierarchical classification. The method markedly improves training efficiency and model generalization under few-shot conditions. Experimental results show that, compared with traditional models, CUM-CLIP offers significant advantages in training time, computational cost, and classification accuracy, verifying its robustness and practicality in open scenarios. This study provides an efficient, low-cost solution for image classification tasks in smart agriculture and a valuable reference for related research.
In smart agriculture, computer vision often struggles with recognizing out-of-distribution (OOD) samples and performing fine-grained classification under open-set conditions, such as in grass image classification tasks. Traditional CNN-based models tend to misclassify OOD samples as known classes, reducing model stability. To address this, we propose a method based on the vision-language model CLIP. It uses pretrained image and text encoders to extract features and match them via similarity comparison. A coarse-to-fine strategy is applied: first, prompt-based matching identifies whether an input belongs to known or OOD categories. Then, in-distribution samples are further classified using a fine-tuned CUM-CLIP (custom adapter-CLIP) module. This approach improves classification accuracy, efficiency, and generalization, especially in few-shot scenarios. Experiments show that our method outperforms traditional models in open environments, offering a practical and robust solution for agricultural image classification.
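The paper's implementation is not included on this page. As a rough illustration of the two-stage idea described in the abstract, the sketch below assumes precomputed CLIP image and prompt embeddings (the actual encoders are omitted); the `Adapter` here is a generic residual-MLP adapter in the style of CLIP-Adapter, not the actual CUM-CLIP module, and all names and the similarity threshold are hypothetical:

```python
import numpy as np

def normalize(x):
    """L2-normalize along the last axis so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def coarse_ood_gate(image_emb, prompt_embs, threshold=0.5):
    """Stage 1 (coarse): compare the image embedding with prompt-text embeddings.
    If even the best cosine similarity falls below the threshold, flag the
    sample as OOD by returning None for the class index."""
    sims = normalize(prompt_embs) @ normalize(image_emb)
    best = int(np.argmax(sims))
    if sims[best] < threshold:
        return None, float(sims[best])   # OOD: no known category matches well
    return best, float(sims[best])

class Adapter:
    """Stage 2 (fine): a lightweight residual MLP adapter applied to the
    image embedding before fine-grained matching (hypothetical stand-in
    for the CUM-CLIP module)."""
    def __init__(self, dim, hidden, ratio=0.2, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.standard_normal((dim, hidden)) * 0.02
        self.w2 = rng.standard_normal((hidden, dim)) * 0.02
        self.ratio = ratio               # how much adapted signal to blend in

    def __call__(self, x):
        h = np.maximum(x @ self.w1, 0.0) @ self.w2      # small two-layer MLP
        return self.ratio * h + (1.0 - self.ratio) * x  # residual blend

def fine_grained_classify(image_emb, subclass_embs, adapter):
    """Stage 2: adapt the embedding, then pick the best-matching subclass."""
    z = normalize(adapter(image_emb))
    sims = normalize(subclass_embs) @ z
    return int(np.argmax(sims))

# Usage with toy 8-dimensional "embeddings":
e0, e1 = np.eye(8)[0], np.eye(8)[1]
prompts = np.stack([e0, e1])            # text embeddings for two known classes
idx, sim = coarse_ood_gate(0.9 * e0 + 0.1 * e1, prompts)   # in-distribution
ood_idx, _ = coarse_ood_gate(np.ones(8), prompts)          # dissimilar -> OOD
sub = fine_grained_classify(e0, prompts, Adapter(dim=8, hidden=16))
```

In a real pipeline the toy vectors would be replaced by `encode_image`/`encode_text` outputs from a pretrained CLIP model, and only the adapter's weights would be trained, which is what makes the approach cheap in few-shot settings.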
Keywords
References (17)
- [1] HU W Y, CHEN T, LAN C J, et al. SkipResNet: Crop and Weed Recognition Based on the Improved ResNet[J]. Land, 2024, 13(10): 1585. doi:10.3390/land13101585
- [2] FAISAL H M, AQIB M, MAHMOOD K, et al. A Customized Convolutional Neural Network-based Approach for Weeds Identification in Cotton Crops[J]. Frontiers in Plant Science, 2025, 15: 1435301. doi:10.3389/fpls.2024.1435301
- [3] SILVA J A O S, de SIQUEIRA V S, MESQUITA M, et al. Deep Learning for Weed Detection and Segmentation in Agricultural Crops Using Images Captured by an Unmanned Aerial Vehicle[J]. Remote Sensing, 2024, 16(23): 4394. doi:10.3390/rs16234394
- [4] RADFORD A, KIM J W, HALLACY C, et al. Learning Transferable Visual Models from Natural Language Supervision[C]//Proceedings of the International Conference on Machine Learning. ICML, 2021: 8748-8763.
- [5] ZHOU Y Y, YAN H P, DING K, et al. Few-shot Image Classification of Crop Diseases Based on Vision-Language Models[J]. Sensors, 2024, 24(18): 6109. doi:10.3390/s24186109
- [6] LEE C P, LIM K M, SONG Y X, et al. Plant-CNN-ViT: Plant Classification with Ensemble of Convolutional Neural Networks and Vision Transformer[J]. Plants, 2023, 12(14): 2642. doi:10.3390/plants12142642
- [7] ALTHUNIYAN N, AL-SHAMASNEH A R, BAWAZIR A, et al. DeepLeaf: Automated Leaf Classification Using Convolutional Neural Networks[J]. European Scientific Journal, 2024, 20(30): 22. doi:10.19044/esj.2024.v20n30p22
- [8] KOCH G, ZEMEL R, SALAKHUTDINOV R, et al. Siamese Neural Networks for One-shot Image Recognition[C]//BACH F R, BLEI D M. Proceedings of the 32nd International Conference on Machine Learning. Lille: ICML, 2015: 1-30.
- [9] SUNG F, YANG Y X, ZHANG L, et al. Learning to Compare: Relation Network for Few-shot Learning[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 1199-1208. doi:10.1109/CVPR.2018.00131
- [10] GARG A, MAKAM V, OLIVEIRA R, et al. More Barriers for Rank Methods, via a "Numeric to Symbolic" Transfer[C]//2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS). Baltimore: IEEE, 2019: 824-844. doi:10.1109/FOCS.2019.00054
- [11] FINN C, ABBEEL P, LEVINE S. Model-agnostic Meta-learning for Fast Adaptation of Deep Networks[C]//Proceedings of the 34th International Conference on Machine Learning. Sydney: ICML, 2017: 1126-1135.
- [12] VINYALS O, BLUNDELL C, LILLICRAP T P, et al. Matching Networks for One Shot Learning[C]//LEE D D, von LUXBURG U. Proceedings of the 30th International Conference on Neural Information Processing Systems. New York: Curran Associates Inc, 2016: 3637-3645.
- [13] SNELL J, SWERSKY K, ZEMEL R. Prototypical Networks for Few-shot Learning[C]//von LUXBURG U, GUYON I. Proceedings of the 31st International Conference on Neural Information Processing Systems. New York: Curran Associates Inc, 2017: 4080-4090.
- [14] CHEN M T, WANG X G, LUO H, et al. Learning to Focus: Cascaded Feature Matching Network for Few-shot Image Recognition[J]. Science China Information Sciences, 2021, 64: 192105. doi:10.1007/s11432-020-2973-7
- [15] WANG H Y, GOUK H, FRASER H, et al. Experiments in Cross-domain Few-shot Learning for Image Classification[J]. Journal of the Royal Society of New Zealand, 2023, 53(1): 169-191. doi:10.1080/03036758.2022.2059767
- [16] SONG Siyu, MIAO Duoqian. Fine-grained Image Classification Algorithm Based on Multi-granularity Regions Shuffle[J]. CAAI Transactions on Intelligent Systems, 2022, 17(1): 144-150. doi:10.11992/tis.202105040
- [17] CHEN F, SHEN Y, LI G L, et al. Classification of Wheat Grain Varieties Using Terahertz Spectroscopy and Convolutional Neural Network[J]. Journal of Food Composition and Analysis, 2024, 129: 106060. doi:10.1016/j.jfca.2024.106060