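Human motion recognition[1] aims to automatically identify, online or offline, the motion a person is performing from data collected by sensors. It is the result of the convergence of technologies such as computer vision[2], machine learning[3], pattern recognition[4], and artificial intelligence[5], and it has broad application prospects in fields such as novel human-computer interaction, virtual reality, augmented reality, and assisted training.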
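Human motion recognition algorithms fall into two main categories: template matching and machine learning. A template matching algorithm compares a motion instance with the motions in a template library, and the library motion most similar to the instance is taken as the recognition result; the recognition algorithm based on temporal self-similarity and dynamic warping proposed in [6] belongs to this category. The drawback of template matching is that the cost of template comparison grows as the template library grows. A machine learning algorithm trains a classifier on a set of motion instances; the classifier captures the commonalities and differences among motions and is then used to classify them. The method proposed in [7] belongs to this category: it combines spatio-temporal features with 3D-SIFT descriptors to extract human features and uses the SVM (support vector machine) algorithm to recognize human motions. Like template matching algorithms, machine learning algorithms also require motion features to be extracted manually, which makes them cumbersome to apply and hard to generalize.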
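The template matching idea can be illustrated with a minimal sketch. It assumes dynamic time warping (DTW) as the similarity measure and a toy library of two hypothetical templates (`raise`, `swipe`) whose frames are 2-D feature vectors. These names and data are illustrative assumptions, not the method of [6].

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(instance, templates):
    """Return the label of the template with the smallest DTW distance.

    templates: list of (label, sequence) pairs. Every template is compared,
    so the matching cost grows with the size of the library.
    """
    return min(templates, key=lambda t: dtw_distance(instance, t[1]))[0]


# Toy library (assumption): each frame is a hypothetical 2-D feature,
# for example a hand position relative to the torso.
templates = [
    ("raise", [(0.0, 0.0), (0.0, 0.5), (0.0, 1.0)]),
    ("swipe", [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]),
]

# A slower upward motion with one extra frame.
instance = [(0.0, 0.0), (0.0, 0.2), (0.0, 0.6), (0.0, 1.0)]
print(classify(instance, templates))
```

Because `classify` runs one DTW comparison per template, the sketch also makes the stated drawback concrete: recognition time scales with the number of templates in the library.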
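Unlike conventional machine learning, deep learning algorithms[8-10] can extract target features automatically. Among them, the CNN is especially widely used[11] and performs well in many settings, which makes it suitable for complex tasks such as human motion recognition. This paper therefore proposes a CNN-based algorithm for accurate human motion recognition: human skeleton data are encoded, and a deep learning framework is then built to accurately recognize the human motions contained in the data.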
Data Encoding and CNN Accurate Recognition of Human Body Motion
doi: 10.12178/1001-0548.2019108
- Received Date: 2019-04-22
- Rev Recd Date: 2019-11-26
- Available Online: 2020-05-28
- Publish Date: 2020-05-01
Key words:
- convolutional neural network
- feature extraction
- human motion recognition
- motion data encoding
Abstract: Accurate recognition of human motion is hindered by several adverse factors, such as the effect of light intensity during motion data collection and the vagueness and variability of human motion features. To reduce the effect of these factors and improve the accuracy of motion recognition, this paper investigates the following. First, the human joint data collected by Kinect are preprocessed to overcome the illumination problem. Second, encoding methods are proposed to encode the preprocessed data, and the encoded data are then fed into a CNN that extracts human motion features automatically, which makes the description of motion features easier. Finally, the CNN completes motion classification with SoftMax. Experiments show that the proposed algorithm achieves high recognition accuracy (most F1 values are larger than 0.8) and adapts to different data settings. In single property tests, compound property data perform better than single property data, with F1 values reaching 0.916. F1 values in compound property tests are lower than those in single property tests, with a maximum decrease of 25%.
Citation: HU Qing-song, ZHANG Liang, DING Juan, LI Shi-yin. Data Encoding and CNN Accurate Recognition of Human Body Motion[J]. Journal of University of Electronic Science and Technology of China, 2020, 49(3): 473-480. doi: 10.12178/1001-0548.2019108