I am deeply indebted to Dr. Michael Berry, my major advisor, for his kind guidance and support. I also thank Dr. Susan Dumais, director of the Information Sciences Research Group at Bellcore, for her technical advice. In addition, she graciously allowed us
documents retrieved. Recall is the ratio of the number of relevant documents retrieved for a query to the number of documents relevant to that query in the entire document collection. Both precision and recall are expressed as values between 0 and 1. An optimal retrieval system would provide precision and recall values of 1, although precision tends to decrease with greater recall in real-world systems [FBY92].
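The two measures can be illustrated with a minimal sketch. It assumes the standard definition of precision (relevant documents retrieved divided by all documents retrieved) together with the definition of recall given above. The function name `precision_recall` and the document identifiers are hypothetical, chosen only for illustration, and returning 0 for an empty set is a convention of this sketch rather than part of either definition.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: ids of the documents the system returned for the query.
    relevant:  ids of all documents in the collection relevant to the query.
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved

    # Assumed convention: report 0.0 when a ratio would divide by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 documents retrieved, 3 of them relevant,
# out of 6 relevant documents in the whole collection.
p, r = precision_recall(
    ["d1", "d2", "d3", "d4"],
    ["d1", "d2", "d3", "d7", "d8", "d9"],
)
print(p, r)  # 0.75 0.5
```

Retrieving more documents for the same query can only keep recall the same or raise it, but each added non-relevant document lowers precision, which is the trade-off noted above.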
2.2 Early Work
The idea of using a vector-space model to represent information has existed since the advent of the computer era. Through the years, various modifications have been suggested to improve the retrieval performance of the models. In addition, as computing resources became more available and reliable, the models have become increasingly sophisticated and complex, often accompanied by significant increases in retrieval performance.
2.2.1 H. P. Luhn and the Origin of Vector-Space Models
In 1953, H. P. Luhn published an initial discussion of vector-space models for information retrieval [Luh53] that summarized many of the key issues and concepts still being considered today. Luhn was motivated by the concern that the controlled vocabularies and classification schemes used in manual indexing may change over time. Luhn was also concerned that by only classifying concepts in a document that seemed important at the time, aspects of the document that might become more important in the future would be lost.
The 1953 article, along with a subsequent article in 1957 [Luh57], proposed the use of a “notion-space” to alleviate the problems of classification and the lack of uniformity in word usage. Luhn suggested each document should be represented in the notion-space by an array whose elements correspond to the various notions (language-independent ideas) in the document collection. To search the collection, the notions contained in the query would be encoded like the notions in the document, and the array would be