SuggestingAccurateMethodandClassNames
MiltiadisAllamanis
School
EarlT.Barr
Dept.
ChristianBird CharlesSutton
ResearchMicrosoft
Redmond,WA,USA
Microsoft
{m.allamanis,csutton}@ed.ac.ukofInformatics
UniversityofEdinburghEdinburgh,EH89AB,UKofComputerScienceUniversityCollegeLondon
London,UK
e.barr@ucl.ac.ukcbird@http://www.77cn.com.cn
ABSTRACT
Descriptivenamesareavitalpartofreadable,andhencemaintain-able,code.Recentprogressonautomaticallysuggestingnamesforlocalvariablestantalizeswiththeprospectofreplicatingthatsuccesswithmethodandclassnames.However,suggestingnamesformeth-odsandclassesismuchmoredif cult.Thisisbecausegoodmethodandclassnamesneedtobefunctionallydescriptive,butsuggestingsuchnamesrequiresthatthemodelgoesbeyondlocalcontext.Weintroduceaneuralprobabilisticlanguagemodelforsourcecodethatisspeci callydesignedforthemethodnamingproblem.Ourmodellearnswhichnamesaresemanticallysimilarbyassigningthemtolocations,calledembeddings,inahigh-dimensionalcontin-uousspace,insuchawaythatnameswithsimilarembeddingstendtobeusedinsimilarcontexts.Theseembeddingsseemtocontainsemanticinformationabouttokens,eventhoughtheyarelearnedonlyfromstatisticalco-occurrencesoftokens.Furthermore,weintroduceavariantofourmodelthatis,toourknowledge,the rstthatcanproposeneologisms,namesthathavenotappearedinthetrainingcorpus.Weobtainstateoftheartresultsonthemethod,class,andeventhesimplervariablenamingtasks.Morebroadly,thecontinuousembeddingsthatarelearnedbyourmodelhavethepotentialforwideapplicationwithinsoftwareengineering.CategoriesandSubjectDescriptors:
D.2.3[SoftwareEngineering]:CodingToolsandTechniquesGeneralTerms:Algorithms
Keywords:Codingconventions,naturalnessofsoftware“Youshallknowawordbythecompanyitkeeps.”
—J.R.Firth
1.INTRODUCTION
Languagestartswithnames.Whileprogramming,developersmustnamevariables,parameters,functions,classes,and les.Theystrivetochoosenamesthataremeaningfulandconventional,i.e.consistentwithothernamesusedinrelatedcontextsintheircodebase.Indeed,leadingindustrialexperts,includingBeck[9],Mc-Connell[34],andMartin[33],havestressedtheimportanceofidenti ernaminginsoftware.Findinggoodnamesforprogram-minglanguageconstructsisdif cult;poornamesmakecodehardertounderstandandmaintain[29,50,30,7].Empiricalevidence
Permissiontomakedigitalorhardcopiesofallorpartofthisworkforpersonalorclassroomuseisgrantedwithoutfeeprovidedthatcopiesarenotmadeordistributedforpro torcommercialadvantageandthatcopiesbearthisnoticeandthefullcitationonthe rstpage.Tocopyotherwise,torepublish,topostonserversortoredistributetolists,requirespriorspeci cpermissionand/orafee.
FSE’15,August30–September4,2015,Bergamo,ItalyCopyright2015ACMX-XXXXX-XX-X/XX/XX...$15.00.
suggeststhatpoornamesalsoleadtosoftwaredefects[13,1].Codemaintenanceexacerbatesthedif cultyof ndinggoodnames,be-causetheappropriatenessofanamechangesovertime:anexcellentchoice,atthetimeaconstructisintroduced,candegradeintoapoorname,aswhenavariableisusedinnewcontextorafunction’ssemanticschanges.
Namesofmethodsandclassesareparticularlyimportant,andcanbedif culttochoose.Høstetal.eloquentlycapturedtheirimportance:“Methodsarethesmallestnamedunitsofaggregatedbehaviorinmostconventionalprogramminglanguagesandhencethecornerstoneofabstraction”[26].Semanticallydistinctmethodnamesarethebasictoolsforreasoningaboutprogrambehaviour.Programmersdirectlythinkintermsofthesenamesandtheircom-positions,sinceaprogrammerchosethemfortheunitsintowhichtheprogrammerdecomposedaproblem.Moreover,methodnamescanbehardtochange,especiallywhentheyareusedinanAPI.Whenpublishedinapopularlibrary,methodnamingdecisionsareespeciallyrigidandpoornamescandoomaprojecttoirrelevance.Inthispaper,wesuggestthatmodernstatisticaltoolsallowustoautomaticallysuggestdescriptive,idiomaticmethodandclassnamestoprogrammers.Wetacklethemethodnamingproblem:theproblemofinferringamethod’snamefromitsbody(oraclassfromitsmethods).Asdevelopersspendapproximatelyhalfoftheirdevelopmenttimetryingtounderstandandcomprehendcodeduringmaintenancealone[17],anyprogresstowardsolvingthemethodnamingproblemwillimprovethecomprehensibilityofcode[49]leadingtoanincreaseprogrammerproductivity[24].
Inpreviouswork,weintroducedtheNATURALIZEframework[2],whichlearnsthecodingconventionsusedinacodebaseandtacklesonenamingproblemprogrammersface—thatofnamingvariables—byexploitingthe“naturalness”orpredictabilityofcode[25].However,themethodnamingproblemismuchmoredif cultthanthevariablenamingproblem,becausetheappropriatenessofmethodandclassnamesdependsnotsolelyontheirusesbutalsoontheirinternalstructure—theirbodyortheirsetofmethods.Anadequatenamemustdescribenotjustwhatthemethodis,butwhatitdoes.Variablenames,bycontrast,canoftenbepredictedsolelyfromafewtokensoflocalcontext;forexample,itiseasytopredictthevariablenamethatfollowsthetokensfor(int.Becausemethodandclassnamesmustbefunctionallydescriptive,theyoftenhaverichinternalstructure:methodnamesareoftenverbphrasesandclassnamesareoftennounphrases.Butthismeansthatmethodandclassnamesareoftenneologisms,thatis,namesnotseeninthetrainingcorpus.Existingprobabilisticmodelsofsourcecode,includingthen-grammodelsusedinNATURALIZE,cannotsuggestneologisms.Theseaspectsofthemethodnamingproblemseverelyexacerbatethedatasparsityproblemfacedbyallprobabilisticlanguagemodels,becauseaddressingthembybuildingmodelsthatconsidermorecontextnecessarilymeansthatanyindividualcontextwillbeobservedlessoften.Therefore,themethodnamingproblemrequiresmodelsthat
百度搜索“77cn”或“免费范文网”即可找到本站免费阅读全部范文。收藏本站方便下次阅读,免费范文网,提供经典小说综合文库2015-FSE-Suggesting accurate method and class names在线全文阅读。
相关推荐: