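Journal of Hebei Normal University

Journal information

Title: Journal of Hebei Normal University (Natural Science) (河北师范大学学报（自然科学版）)
Sponsor: Hebei Normal University
ISSN: 1000-5854
CN: 13-1061/N
Chinese Science and Technology Core Journal
Selected for the China Periodical Matrix (中国期刊方阵)
Outstanding Science and Technology Journal of Chinese Universities
Outstanding Journal of North China
Outstanding Science and Technology Journal of Hebei Province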

A Non-image Data Preprocessing Method for Convolution Structure

HUANG Tao¹, CHEN Yingyue¹, CHEN Yuming¹, ZENG Nianfeng²

(1. School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, Fujian, China; 2. 易成功(厦门)信息科技有限公司, Xiamen 361024, Fujian, China)
DOI: 10.13763/j.cnkij.hebnu.nse.202301009

Abstract

Convolutional neural networks (CNNs) are widely used in image processing thanks to properties such as local correlation and weight sharing, and they have become one of the most popular neural network architectures. However, for non-image data such as genomic, speech, and financial data, conventional convolutional networks may not be fully applicable. To overcome this limitation, researchers have developed network structures such as recurrent neural networks and attention-based networks that can analyze non-image data, extending the application scope of neural networks. Developing new network architectures is, however, difficult and costly. Approaching the problem from another angle, we propose a data preprocessing method suited to convolutional network structures. The source data are converted into a specific one-dimensional feature vector or two-dimensional image matrix and then fed into a custom convolutional structure to observe the algorithm's performance. The experiments use classic data sets from the UCI and Kaggle platforms, and traditional machine learning models such as SVM, decision tree, and random forest serve as baselines for assessing the feasibility and effectiveness of the method.
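The abstract does not spell out the conversion itself, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes that a tabular sample is min-max scaled, zero-padded to the next perfect square, and reshaped row by row into a two-dimensional matrix, after which one shared kernel is slid across it. The function names (`min_max_scale`, `to_image_matrix`, `conv2d_valid`), the padding rule, the kernel, and the sample values are all hypothetical.

```python
import math


def min_max_scale(values):
    """Scale a list of numbers to [0, 1]; a constant vector maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def to_image_matrix(features):
    """Zero-pad a 1-D feature vector to a perfect square and reshape it row by row.

    Assumption: this padding and layout rule is illustrative only.
    """
    n = len(features)
    if n == 0:
        raise ValueError("need at least one feature")
    side = math.isqrt(n - 1) + 1  # smallest side such that side * side >= n
    padded = list(features) + [0.0] * (side * side - n)
    return [padded[r * side:(r + 1) * side] for r in range(side)]


def conv2d_valid(image, kernel):
    """Slide one shared kernel over the matrix (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical sample with 13 numeric features (values invented for illustration).
sample = [float(v) for v in range(1, 14)]
image = to_image_matrix(min_max_scale(sample))

# Hypothetical 2x2 kernel; in a trained network these weights would be learned.
feature_map = conv2d_valid(image, [[1.0, 0.0], [0.0, 1.0]])
```

Here the 13 features become a 4 x 4 matrix with three zero-padded cells, and the 2 x 2 kernel yields a 3 x 3 feature map. The scaling is done within a single sample only to keep the sketch short; a real pipeline would more commonly scale each feature column across the whole data set, and the order in which features are laid out determines which of them end up as neighbors under the kernel.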

Key words: non-image data; convolutional neural networks; deep learning; data preprocessing; machine learning
