
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TVCG.2015.2467554, IEEE Transactions on Visualization and Computer Graphics

An Uncertainty-Aware Approach for Exploratory Microblog Retrieval

Mengchen Liu, Shixia Liu, Xizhou Zhu, Qinying Liao, Furu Wei, and Shimei Pan


Fig. 1. Exploratory retrieval of the government shutdown dataset: (a) the hashtag graph with uncertainty and its propagation; (b) uncertainty propagation; (c)-(f) interactive ranking refinement results.

Abstract— Although there has been a great deal of interest in analyzing customer opinions and breaking news in microblogs, progress has been hampered by the lack of an effective mechanism to discover and retrieve data of interest from microblogs. To address this problem, we have developed an uncertainty-aware visual analytics approach to retrieve salient posts, users, and hashtags. We extend an existing ranking technique to compute a multifaceted retrieval result: the mutual reinforcement rank of a graph node, the uncertainty of each rank, and the propagation of uncertainty among different graph nodes. To illustrate the three facets, we have also designed a composite visualization with three visual components: a graph visualization, an uncertainty glyph, and a flow map. The graph visualization with glyphs, the flow map, and the uncertainty analysis together enable analysts to effectively find the most uncertain results and interactively refine them. We have applied our approach to several Twitter datasets. Qualitative evaluation and two real-world case studies demonstrate the promise of our approach for retrieving high-quality microblog data.

Index Terms— microblog data, mutual reinforcement model, uncertainty modeling, uncertainty visualization, uncertainty propagation.

M. Liu and S. Liu are with Tsinghua University. E-mail: simon900314@, shixia@. S. Liu is the corresponding author.
X. Zhu is with USTC. E-mail: ezra0408@.
Q. Liao and F. Wei are with Microsoft. E-mail: {qiliao,fuwei}@.
S. Pan is with University of Maryland, Baltimore County. E-mail: shimei@umbc.edu.

Manuscript received 31 Mar. 2015; accepted 1 Aug. 2015; date of publication xx Aug. 2015; date of current version 25 Oct. 2015. For information on obtaining reprints of this article, please send e-mail to: tvcg@.

1 INTRODUCTION

Microblogs such as Twitter and Facebook are among the most popular platforms for people to share their daily observations and thoughts, including personal status updates and opinions regarding products or government policies. Since the crowd in microblogs provides many individual comments/opinions that were not available before, businesses and organizations have begun to leverage microblogs to profile customers, derive brand perception, gauge citizen sentiments, and predict the stock market [23, 34, 41, 53]. For example, retailers track and examine relevant microblog posts to understand customer opinion toward their products and services.

In spite of the growing interest in quickly analyzing customer opinions or breaking news in microblogs, progress has been hampered by the lack of an effective mechanism to retrieve data of interest from microblogs. For this reason, researchers have developed a number of microblog retrieval methods [6, 17]. The main goal is to generate a list of k microblog posts that are relevant to the information needs represented by a query q. Although these methods have successfully retrieved

1077-2626 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See /publications_standards/publications/rights/index.html for more information.


relevant posts, they have two major drawbacks. First, the unique characteristics of microblog data are not comprehensively considered to improve retrieval performance. Posts, users, and hashtags are three key dimensions of microblog data. These dimensions are not independent, as they often influence one another. For example, a post published by an influential user and labeled with a popular hashtag tends to be salient. However, the existing approaches to microblog retrieval do not tightly integrate the three dimensions and do not take advantage of the relationships among them. These approaches either consider only the post or treat it as the primary dimension and the others as secondary dimensions to filter posts. For example, ScatterBlogs2 [6] finds posts of interest by checking whether the posts contain a certain hashtag. Second, existing methods do not address uncertainty in the retrieval models. Improving the modeling and presentation of uncertainty can help describe retrieved data more accurately, which can in turn assist analysts in making more informed decisions [14, 39, 49].

To address the above problems, we have developed an uncertainty-aware microblog retrieval toolkit, MutualRanker, to estimate uncertainty introduced by the analysis algorithm as well as to quickly retrieve salient posts, users, and hashtags. Since posts are shared and propagated on social networks, the authority of an author and the popularity of a hashtag play an important role in determining the importance of a post and vice versa. Accordingly, we formulate uncertainty-aware microblog retrieval using an uncertainty-based mutual reinforcement graph model (MRG) [16, 45], where the content quality of posts, the social influence of users, and the popularity of hashtags mutually reinforce one another. We adopt a Monte Carlo sampling method to solve MRG because of its effective local update mechanism, fast convergence, and probability-based uncertainty formalization [2]. A dispersion-based measure is used to estimate the uncertainty generated by the Monte Carlo sampling method. In addition, we model the uncertainty propagation as a Markov chain. To help analysts understand the retrieved data, we have designed a composite visualization [20]. Specifically, a density-based graph visualization has been developed to visually illustrate posts, users, hashtags, and their relationships. An uncertainty glyph and a flow map are employed to represent uncertainty and its propagation on a graph (Fig. 1). The three visualization components, together with the uncertainty analysis, enable analysts to quickly detect the most uncertain results and interactively resolve them. The Monte Carlo sampling method is then used to incrementally modify the ranking results to meet user needs.
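The combination of Monte Carlo sampling and a dispersion-based uncertainty measure described above can be illustrated with a small sketch. It is not the MutualRanker formulation, which this section does not spell out. It assumes a single toy hashtag graph, random walks that stop with probability 1 - damping at each step, scores taken as normalized visit counts, and the coefficient of variation across independent batches as the dispersion measure. All function names, parameters, and the graph are hypothetical.

```python
import random
import statistics


def monte_carlo_scores(out_links, walks_per_node, damping, rng):
    """Estimate PageRank-like scores from visit counts of short random walks.

    Assumption: every link target also appears as a key of out_links.
    """
    visits = {node: 0 for node in out_links}
    for start in out_links:
        for _ in range(walks_per_node):
            node = start
            while True:
                visits[node] += 1
                targets = out_links[node]
                # Stop at dangling nodes, or with probability 1 - damping.
                if not targets or rng.random() > damping:
                    break
                node = rng.choice(targets)
    total = sum(visits.values())
    return {node: count / total for node, count in visits.items()}


def scores_with_uncertainty(out_links, batches=20, walks_per_node=200,
                            damping=0.85, seed=0):
    """Return {node: (mean score, coefficient of variation)} over batches.

    The coefficient of variation is one possible dispersion measure for
    strictly positive scores; it is an assumption of this sketch.
    """
    rng = random.Random(seed)
    runs = [monte_carlo_scores(out_links, walks_per_node, damping, rng)
            for _ in range(batches)]
    result = {}
    for node in out_links:
        samples = [run[node] for run in runs]
        mean = statistics.mean(samples)
        spread = statistics.stdev(samples)
        result[node] = (mean, spread / mean if mean > 0 else 0.0)
    return result


if __name__ == "__main__":
    # Hypothetical hashtag graph: an edge points to a related hashtag.
    toy_graph = {
        "#shutdown": ["#obamacare"],
        "#obamacare": ["#shutdown"],
        "#gop": ["#shutdown"],
        "#dc": ["#shutdown"],
    }
    for tag, (score, uncertainty) in scores_with_uncertainty(toy_graph).items():
        print(tag, round(score, 3), round(uncertainty, 3))
```

Because each walk touches only the nodes it visits, adding more walks around a node the analyst has corrected is a purely local update, which is one reason sampling-based solvers suit interactive refinement.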

In summary, our work presents three technical contributions:

• An uncertainty-aware microblog retrieval model that extracts salient posts, users, and hashtags. This model also computes the associated uncertainty and its propagation among graph nodes.
• A composite visualization that enables users to understand the three-level, mutual reinforcement ranking results, the associated uncertainty, and uncertainty propagation patterns.
• A visual analytics system that helps users quickly retrieve data of interest, as well as analyze and understand the ranking results in an interactive and iterative process.

2 RELATED WORK

2.1 Microblog Retrieval

In the field of data mining, a number of approaches have been proposed to retrieve data from microblogs. A comprehensive survey was presented by Cherichi and Faiz [12]. Most recent work can be categorized into two groups: vector-space-based approaches and link analysis approaches.

The vector-space-based approach employs two feature vectors to represent a query and a post. A similarity measure (e.g., cosine similarity) is then adopted to estimate the similarity between the post and the query. There have been some recent research efforts that exploit additional structural features such as URLs and hashtags to enhance retrieval performance [1, 29, 31].
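A minimal sketch of the vector-space idea follows, assuming plain term-frequency vectors over whitespace tokens. The function names and example posts are hypothetical, and the cited systems use richer features (such as URLs and hashtags) than shown here.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector over lowercased whitespace tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_posts(query, posts, k):
    """Return the k (score, post) pairs most similar to the query, best first."""
    q = term_vector(query)
    scored = [(cosine_similarity(q, term_vector(post)), post) for post in posts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    posts = [
        "shutdown of the government continues",
        "new doctor who episode",
        "government shutdown",
    ]
    for score, post in rank_posts("government shutdown", posts, k=2):
        print(round(score, 3), post)
```

This returns the top k posts for a query q, which matches the retrieval goal stated in the introduction, but it scores each post in isolation and ignores users and hashtags.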

Recently, to take advantage of the link structure of social networks, researchers have introduced the PageRank algorithm [7] in microblog retrieval. For example, TwitterRank [46] adopts the follower-followee link structure and the PageRank algorithm to identify influential users. Duan et al. [16] modeled the tweet-ranking problem as an MRG [45], where the social influence of users and the content quality of tweets mutually reinforce each other. Specifically, the post graph, the user graph, and the hashtag graph, as well as the relationships between the three graphs, were used to retrieve salient posts, users, and hashtags. We extend this approach by explicitly modeling the uncertainty of the ranking result, as well as its propagation on the tweet/user/hashtag graph.

In the field of visual analytics, a great deal of research has been conducted on visually analyzing microblog data. The methods applied include event detection [30], topic extraction and analysis [25, 40, 50], information diffusion [8, 52], sentiment analysis [47, 48], and revenue/stock prediction [28, 34]. However, few studies have focused on microblog retrieval.
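To make the link-analysis family discussed above concrete, the following sketch runs plain PageRank by power iteration on a hypothetical follower graph. It is neither the TwitterRank nor the MRG formulation. The graph, the damping value, and the iteration count are assumptions chosen for illustration.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Plain PageRank by power iteration.

    out_links maps each node to the list of nodes it links to; here an edge
    goes from a follower to the user being followed.
    Assumption: every link target also appears as a key of out_links.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the teleportation share first.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = out_links[node]
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical follower graph: alice and bob follow carol, carol follows alice.
    follows = {"alice": ["carol"], "bob": ["carol"], "carol": ["alice"]}
    for user, score in sorted(pagerank(follows).items(), key=lambda kv: -kv[1]):
        print(user, round(score, 4))
```

A mutual reinforcement model applies the same kind of iteration across several linked graphs (posts, users, hashtags) rather than within one, so that a score in one graph feeds the scores in the others.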

Bosch et al. [6] developed ScatterBlogs2 to extract microblog posts of interest. It allows analysts to build customized post filters and classifiers interactively. These filters and classifiers are then utilized to support real-time post monitoring. In post filtering, the post dimension is considered the primary dimension and the hashtag the secondary dimension. In contrast, we tightly integrate the posts, users, and hashtags in the MRG model and use the model to retrieve high-quality microblog data. Moreover, we also model uncertainty in the retrieval process. Since analysts can interactively refine the model, we can further improve retrieval quality by leveraging the uncertainty formalization and analysts' knowledge.

2.2 Interactive Uncertainty Analytics

Frequently, uncertainty is introduced into visual analytics when data is acquired, transformed, or visualized [14, 24, 27]. A number of uncertainty analysis methods have been proposed, which can be categorized into two groups: uncertainty visualization and uncertainty modeling.

Many studies on uncertainty visualization have been conducted in the field of geographic visualization and scientific visualization [32, 37, 42]. Typical uncertainty representation techniques include the addition of glyphs and geometry, the modification of geometry and attributes, animation, sonification, and psycho-visual approaches [32]. Recently, researchers have become increasingly interested in the design of uncertainty representations for information visualization and visual analytics. For example, Collins et al. [13] designed two alternatives, the gradient border and the bubble border, to illustrate uncertainty in lattice graphs. Wu et al. [48] developed a circular wheel representation and subjective logic to convey uncertainty in customer review analysis. Slingsby et al. [38] utilized bar charts to reveal the uncertainty associated with geodemographic classifiers. To represent uncertainty in aggregated vertex sets, Vehlow et al. [43] considered the lightness and shape of the node. Chen et al. [10] [...] Compared with these methods, MutualRanker visualizes not only uncertainty but also its propagation on a graph. We also let users interactively modify the uncertain result.

Another type of uncertainty visualization represents the uncertainty in the analysis process. Zuk and Carpendale [55] studied issues related to uncertainty in reasoning and determined the type of visual support required. Correa et al. [14] developed a framework to represent and quantify the uncertainty in the visual analytics process. Wu et al. [49] extended this framework to show the uncertainty flow in the analysis process. By contrast, our work aims to model uncertainty in microblog retrieval. We focus on visually illustrating topological uncertainty propagation on a graph and on designing an iterative visual analytics process to actively engage analysts in reducing overall uncertainty.

Probability theory, fuzzy set theory, rough set theory, and evidence theory are four major approaches to model uncertainty [54]. Among these approaches, probability theory is the most commonly used method in visual analytics. For example, Correa et al. [14] and Wu et al. [49] regarded uncertainty as a parameter that describes the dispersion of measured values. Specifically, they represented uncertainty as an estimated standard deviation, in which the measured value is defined on the set of both positive and negative real numbers. Since the measured value (the ranking score) in our approach is defined on the set of positive real
