Volume 1, Issue 1
  • ISSN: 2665-9972
  • E-ISSN: 2665-9964

Abstract

In an era of information overload, it is difficult for a human reader to quickly make sense of the vast amount of information available on the internet. Even within a specific domain, such as a college or university website, a user may struggle to browse through all the links to find relevant answers quickly.

In this scenario, a chat-bot that can answer questions about college information and compare colleges would be both useful and novel.

In this paper, a novel conversational chat-bot application with information retrieval and text summarization skills is designed and implemented. First, the chat-bot has a simple dialog skill: when it can understand the intent of the user query, it responds from a stored collection of answers. Second, for unknown queries, the chat-bot can search the internet and then summarize the retrieved text using advanced natural language processing (NLP) and text mining (TM) techniques.
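The two-stage behaviour described above (answer from stored responses when the intent is recognised, otherwise fall back to search and summarization) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the intents, keywords, stored answers, and the `search_and_summarize` callback are all hypothetical, and keyword overlap stands in for whatever intent detection the chat-bot actually uses.

```python
import re

# Hypothetical stored answers keyed by intent (illustrative content only).
STORED_ANSWERS = {
    "fees": "Fee details are published on the admissions page.",
    "admission": "Admission criteria are listed in the prospectus.",
}

# Hypothetical keyword sets used to recognise each intent.
INTENT_KEYWORDS = {
    "fees": {"fee", "fees", "tuition"},
    "admission": {"admission", "admissions", "eligibility"},
}


def tokenize(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def detect_intent(query):
    """Return the intent with the largest keyword overlap, or None if no keyword matches."""
    query_words = tokenize(query)
    best_intent, best_overlap = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        overlap = len(query_words & keywords)
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    return best_intent


def answer(query, search_and_summarize):
    """Answer from the stored collection when the intent is known; otherwise
    delegate to the caller-supplied search-and-summarize fallback (an assumption)."""
    intent = detect_intent(query)
    if intent is not None:
        return STORED_ANSWERS[intent]
    return search_and_summarize(query)
```

Passing the fallback as a function keeps the dialog skill independent of how the internet search and summarization are performed, so either part can be replaced without touching the other.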

Advances in the information retrieval and text summarization capabilities of NLP, using the machine learning techniques of Latent Semantic Indexing (LSI), Latent Dirichlet Allocation (LDA), Word2Vec, Global Vectors (GloVe) and TextRank, are first reviewed and compared in this paper before being implemented in the chat-bot design. The chat-bot improves the user experience by answering specific queries concisely, which takes less time than reading an entire document. Students, parents and faculty can more efficiently get answers on a variety of topics such as admission criteria, fees, course offerings, notice board, attendance, grades, placements, faculty profiles, research papers and patents.
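Of the techniques named above, TextRank is the one most directly tied to summarization. The following is a minimal sketch of TextRank-style extractive summarization, assuming a naive regex sentence splitter and a word-overlap similarity normalised by log sentence lengths; the function names and parameters are illustrative and are not taken from the paper.

```python
import math
import re


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def word_set(sentence):
    """Return the set of lower-cased alphanumeric tokens in a sentence."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(a, b):
    """Shared-word count normalised by the log lengths of both sentences."""
    wa, wb = word_set(a), word_set(b)
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the sentence-similarity graph."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to the edge weight.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            updated.append((1 - damping) + damping * rank)
        scores = updated
    return scores


def summarize(text, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return " ".join(sentences[i] for i in sorted(top))
```

Sentences that share vocabulary with many others accumulate score, while a sentence unrelated to the rest stays at the baseline value of `1 - damping` and is unlikely to be selected.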

The purpose of this paper is to follow the advancement of NLP technologies and implement them in a novel application.

/content/journals/cccs/10.2174/2665997201999201022191540
2021-04-01
2024-11-22
