
基于文本分类技术的微博平台潜在客户挖掘
Identifying Potential Customers on Microblog Platforms Using Text Classification Techniques

Advisor: 蒋盛益

Discipline code: 120202

Degree awarded: Master's

Author:

Institution: Guangdong University of Foreign Studies (广东外语外贸大学)

Abstract: The rapid growth of social media such as microblogs, Facebook, and YouTube has profoundly changed how companies interact with customers and how customers communicate with one another. On these platforms, customers play an unprecedentedly active role in the market for products and services. Compared with traditional media, social media offers rapid dissemination, strong interactivity, and real-time information sharing. Used effectively, these features help companies improve their brand image, raise brand awareness, and thereby expand their market share. Microblogs have a huge user base, fast message propagation, and a broad range of influence, which makes microblog marketing one of the most important parts of corporate social media marketing. Identifying potential customers is an essential foundation for precisely targeted microblog marketing.

The most fundamental problem in potential customer mining is how to represent customer characteristics effectively, since this representation is decisive for mining performance. Little research has so far addressed potential customer mining on microblog platforms. Existing work mainly extracts features from customers' demographic information and microblog usage behavior to describe their characteristics. Such methods are relatively complicated to operate, and because the resulting descriptions are not accurate enough, their identification accuracy is rather low (the best reported accuracy is about 76%).

This study holds that the interests and preferences of a customer's social network are important for describing that customer's characteristics, and it uses microblog platforms to explore the role of these social relationship characteristics in potential customer mining. We propose fusing the self-defined tags of customers and of their microblog friends to generate textual descriptions of customer characteristics from both the personal and the social perspective. On this basis, we propose a framework for mining potential customers on microblog platforms using text classification techniques.

Extensive experiments show that the proposed descriptions enable potential customer identification models to reach an accuracy of about 86% on average. Five text classifiers, namely K Nearest Neighbors (KNN), Naive Bayes (NB), Rocchio, centroid-based classification (Centroid), and Support Vector Machines (SVM), all achieve high identification accuracy, which validates the effectiveness of the proposed framework. Among the five, SVM obtains the highest accuracy but requires considerable time for model building and classification. NB offers the best trade-off between identification performance and running time, followed by Rocchio and Centroid.

By drawing on the rich social relationship information available on microblog platforms, incorporating the interests of a customer's social network into the customer description not only offers a new perspective and method for potential customer mining, but also provides a useful reference for classic customer relationship management (CRM) problems such as customer segmentation and customer churn.
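The abstract describes the pipeline only at a high level, so the following is a minimal sketch of the general idea rather than the thesis's actual implementation. It fuses a customer's own tags with the tags of their friends into one token list, then classifies that list with a multinomial Naive Bayes model. The function and class names, the toy tags, the labels "potential" and "other", and the add-one smoothing are all assumptions made for illustration; none of them come from the thesis.

```python
import math
from collections import Counter, defaultdict


def describe_customer(own_tags, friends_tags):
    """Fuse a customer's own tags with their friends' tags into one token list.

    Hypothetical helper: the thesis may weight or filter tags differently.
    """
    tokens = list(own_tags)
    for tags in friends_tags:
        tokens.extend(tags)
    return tokens


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed setting)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for w in tokens:
                if w in self.vocab:  # tags never seen in training are skipped
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (invented): each entry is (own tags, friends' tags, label).
training = [
    (["photography", "travel"], [["camera", "travel"], ["photography"]], "potential"),
    (["camera", "hiking"], [["photography", "travel"]], "potential"),
    (["cooking", "music"], [["food", "music"]], "other"),
    (["football", "music"], [["cooking", "football"]], "other"),
]
docs = [describe_customer(own, friends) for own, friends, _ in training]
labels = [label for _, _, label in training]
model = NaiveBayes().fit(docs, labels)

new_customer = describe_customer(["travel"], [["camera", "photography"]])
print(model.predict(new_customer))
```

In this sketch a customer with few tags of their own can still be classified through the friends' tags, which is the intuition the abstract gives for adding the social perspective. Any of the other classifiers named in the abstract could be substituted for the Naive Bayes class, since they all operate on the same fused description.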

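Keywords: customer characteristic description; social relationships; potential customer mining; text classification; microblog marketing

Classification codes: [F274 G206]

Fields: [Economics and Management] [Culture and Science]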
