
Summarization of Software Artifacts: A Review


Affiliations
1 AKTU Lucknow, UP, India
2 Computer Science Department, BIET, Jhansi, UP, India
 

Summarization of software artifacts is an active area of research in the software engineering community because it saves time and effort in tasks such as code search, duplicate bug report detection, and traceability link recovery. Summarization aims to produce a short, concise summary of an artifact. This paper reviews the state of the art of summarization techniques in the software engineering context. It gives a brief overview of the software artifacts that are most often summarized or that benefit from summarization, and briefly describes the general summarization process. It reviews papers published from 2010 to June 2017 and classifies them into extractive and abstractive summarization. It also reviews the evaluation techniques used for software artifact summarization, discusses the open problems and challenges in this field, and outlines future directions for new researchers.

Keywords

Summarization, Software Artifacts, Mining Software Repositories, Extractive Summarization, Abstractive Summarization.



Authors

Som Gupta
AKTU Lucknow, UP, India
S. K. Gupta
Computer Science Department, BIET, Jhansi, UP, India

