
酒店在线评论的情感倾向挖掘方法应用研究
A Study on Sentiment Orientation Analysis of Online Hotel Comments on Chinese Web

Advisor: 莫赞

Discipline: 1201

Degree awarded: Master's

Author:

Institution: Guangdong University of Technology

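Abstract: A growing number of online consumers browse large volumes of online reviews to learn the reputation of products and services and to make reliable decisions. Online customer reviews also act as a feedback mechanism that helps service providers improve their services and become more competitive. However, the rapid growth in the number of reviews has made their content increasingly sprawling, so useful information is hard to extract; in particular, customers find it hard to grasp, in a short time, the opinions and attitudes expressed about people, events, and products. Technical means are therefore urgently needed to make this process more accurate and convenient. "Review mining" emerged in response and has attracted many researchers. It mainly covers sentiment orientation analysis, feature mining, and subjective content identification. Sentiment orientation analysis aims to judge the subjective attitude of a text by mining and analyzing subjective information such as stances, opinions, emotions, and likes and dislikes; it draws on artificial intelligence, machine learning, data mining, and natural language processing. Research on English reviews has produced some initial results, whereas research on Chinese online user reviews is still at an early stage. With the rise of Chinese e-commerce worldwide, advanced techniques for automatically extracting useful information from Chinese reviews are urgently needed.

This thesis takes as its research object hotel reviews on the Chinese web, which are very important in forming travel booking decisions. Online hotel reviews are highly representative. Unlike other online reviews, customers rely on them more heavily, and they play a decisive role in whether a customer books or buys. They reflect customers' actual perception of hotel service quality. Academic work has already used them to study hotel service quality, but mostly through content analysis, which cannot process reviews in batches and therefore greatly limits how the results can be applied.

To address these problems, this thesis applies machine learning to sentiment orientation analysis of online review text, aiming to give customers and businesses in the Chinese-language domain a more convenient and scientific review mining tool. The corpus is collected from customer reviews on Ctrip (携程网) with an open-source crawler framework and classified into six categories of evaluation object. Corpus preprocessing, including Chinese word segmentation and removal of uninformative words, is described in detail. A feature extraction method that ranks features in descending order using random forest and a standard SVM classifier are then combined with the customer review sentiment model proposed in this thesis, in the R language environment, to further improve the classification results of several algorithms. The experimental results show that this computational path yields better classification and higher accuracy. It overcomes the high-dimensional sparse data problem in text analysis and the noise in the training set, and it shows stable practical performance for segmenting massive web text. The results also show that orientation analysis performed after this categorization reflects customers' stances and views more accurately and in finer detail, helping managers quickly grasp how much customers like or dislike each aspect of a hotel, which is of practical significance.

With the deeper and wider application of the Internet, more and more customers browse large numbers of online reviews to learn other customers' word-of-mouth about products and services and to make informed choices. At the same time, online customer reviews serve as a feedback mechanism that helps vendors and manufacturers improve their products and services and gain a competitive advantage. However, as e-business grows, the number of reviews is increasing rapidly and their content is becoming more complicated. It is very difficult to retrieve useful knowledge from customers' reviews, and especially difficult to grasp people's perspectives and attitudes toward many characters, events, and products in a short period. Technical methods are needed to improve the accuracy and convenience of mining this information.

Review mining arose to extract valuable information from customers' reviews, and it has attracted many researchers' attention. It mainly includes sentiment classification, product feature mining, and subjective language learning. The purpose of orientation analysis within review mining is to determine the attitude of an entire text by mining and analyzing the subjective information in it: viewpoints, opinions, emotions, likes and dislikes, and so on. For English reviews, researchers have achieved some results, but few studies have addressed Chinese customer reviews on the Internet. As Chinese e-business has grown dramatically, automatically retrieving useful knowledge from online Chinese reviews has become urgent.

In this paper, our research object is online hotel reviews, which are important for travel booking, highly representative, and heavily relied on by customers. Based on the actual situation, we first categorize the comments into different dimensions. We then propose a sentiment orientation model and use an open-source web crawler framework, the ICTCLAS Chinese text segmentation system, and feature extraction by a descending-order method, and we use the tool R to compare multiple algorithms, with complex review emotion words as experimental subjects. The experimental results show that this method is better and more suitable for web text sentiment classification, and that it overcomes the high-dimensional sparse data problem of text analysis and the noise in the training set. In addition, the proposed method deals efficiently and effectively with huge volumes of web text.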
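The preprocessing and descending-order feature selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, which was done in R with random forest feature ranking and an SVM classifier. The toy reviews, the stop-word list, and the names `preprocess`, `rank_features`, and `vectorize` are all hypothetical. The ranking score used here (absolute difference in per-class document frequency) is a simple stand-in for random-forest importance, and the classification step is omitted. The reviews are assumed to be already segmented into space-separated tokens, as a segmenter such as ICTCLAS would produce.

```python
from collections import Counter

# Hypothetical, pre-segmented toy reviews; label 1 = positive, 0 = negative.
REVIEWS = [
    ("房间 很 干净 服务 很 好", 1),   # "room very clean, service very good"
    ("位置 方便 早餐 很 好", 1),      # "location convenient, breakfast very good"
    ("房间 很 脏 服务 差", 0),        # "room very dirty, service poor"
    ("早餐 差 位置 偏", 0),           # "breakfast poor, location remote"
]

# Hypothetical stop-word list (uninformative function words).
STOP_WORDS = {"很", "的", "了"}


def preprocess(text):
    """Split a pre-segmented review on whitespace and drop stop words."""
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def rank_features(docs, labels):
    """Rank terms in descending order of a class-separation score.

    Score = |share of positive docs containing the term
             - share of negative docs containing the term|.
    This is a stand-in for random-forest importance, not the same measure.
    """
    pos, neg = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        (pos if label == 1 else neg).update(set(tokens))
    n_pos = max(1, sum(1 for y in labels if y == 1))
    n_neg = max(1, sum(1 for y in labels if y == 0))
    scores = {t: abs(pos[t] / n_pos - neg[t] / n_neg) for t in set(pos) | set(neg)}
    # Sort by score descending; break ties by term for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def vectorize(tokens, vocabulary):
    """Turn a token list into term counts over the selected vocabulary."""
    counts = Counter(tokens)
    return [counts[t] for t in vocabulary]


docs = [preprocess(text) for text, _ in REVIEWS]
labels = [label for _, label in REVIEWS]
ranked = rank_features(docs, labels)
top_k = [term for term, _ in ranked[:3]]          # keep the 3 best-ranked terms
matrix = [vectorize(d, top_k) for d in docs]      # reduced document-term matrix

for term, score in ranked[:3]:
    print(term, round(score, 2))
```

Keeping only the top-ranked terms shrinks the document-term matrix, which is the general idea behind using a descending-order ranking to ease the high-dimensional sparsity the abstract mentions. The reduced matrix would then be passed to a classifier.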

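Keywords: online reviews; sentiment orientation analysis; machine learning; random forest

Classification number: [TP391.1 F713.36 F719]

Field: [Automation and Computer Technology] [Economics and Management]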
