N-Gram & PoS Identifier and Verifier

The project aims to develop services and a plugin that make tasks easier to manage using Natural Language Processing, under the supervision of Dr. Maiga Chang, Professor at the School of Computing and Information Systems, Athabasca University.

About the Current Project

DBpedia is a crowd-sourced community effort to extract structured content from the information created in various Wikimedia projects. This structured information resembles an open knowledge graph (OKG) which is available for everyone on the Web. A knowledge graph is a special kind of database which stores knowledge in a machine-readable form and provides a means for information to be collected, organised, shared, searched and utilised.

  • Using a knowledge graph requires no training time, only the time needed to construct the graph, so the service can be brought online sooner.
  • The service can check whether given words, or sets of words separated by a delimiter, are valid n-grams. It can also check the validity of the part-of-speech (POS) tags of the words entered by the user.
  • The service can extract valid n-grams from sentences, words, or sets of words entered by the user, which can help make the sentence grammatically correct.
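The checking and extraction described above can be sketched roughly as follows. This is a minimal illustration, not the service's actual implementation: the in-memory set VALID_NGRAMS stands in for n-grams derived from DBpedia, the function names are hypothetical, and the greedy longest-match scan is only one plausible extraction strategy.

```python
# Hypothetical stand-in for n-grams harvested from DBpedia; the real
# service's data store and lookup method are not described on this page.
VALID_NGRAMS = {
    ("new", "york"),
    ("new", "york", "city"),
    ("machine", "learning"),
    ("canada",),
}


def is_valid_ngram(phrase):
    """Return True if the phrase (case-insensitive) is a known n-gram."""
    return tuple(phrase.lower().split()) in VALID_NGRAMS


def check_candidates(text, delimiter=","):
    """Split text on the delimiter and report validity for each candidate."""
    candidates = [part.strip() for part in text.split(delimiter) if part.strip()]
    return {candidate: is_valid_ngram(candidate) for candidate in candidates}


def extract_ngrams(sentence, max_n=3):
    """Greedy longest-match scan over whitespace-separated tokens."""
    tokens = sentence.lower().split()  # naive tokenization, no punctuation handling
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest window first so "new york city" beats "new york".
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            window = tuple(tokens[i:i + n])
            if window in VALID_NGRAMS:
                found.append(" ".join(window))
                i += n
                break
        else:
            i += 1  # no known n-gram starts here; move to the next token
    return found
```

For example, check_candidates("New York, machine learning, blue banana") marks the first two candidates as valid and the third as invalid, and extract_ngrams("I studied machine learning in New York City") returns the two longest known n-grams found in the sentence.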

Terms of Use

The VIP Research Group is a research group led by Prof. Maiga Chang (https://www.athabascau.ca/science-and-technology/our-people/maiga-chang.html) at the School of Computing and Information Systems, Athabasca University. This "multi-sentence similarity calculation web service" (https://ws-nlp.vipresearch.ca/) is one of the research group's works. The research group has a follow-up research plan to improve it and to use it in other research projects.

Almost all of Prof. Chang's works are open access (or open source). The web service (https://ws-nlp.vipresearch.ca/) is now open access, and there is no plan to make it open source. It runs on a self-sponsored server and, like all the other research projects (see http://maiga.athabascau.ca/#advanced), it will remain online, improving, and accessible as long as the cost is affordable and can be covered by Prof. Chang.

Of course, if circumstances change, for example if the access volume of the web service becomes high or a business uses it commercially to make money, then the terms of use may change; possibilities include donations, personal/academic/business licenses, and subscription modes. However, it is too early to say.

About Us

Our Mission

Our research aims to provide a sentence similarity service that measures the closeness of two or more sentences or paragraphs using Natural Language Processing and WordNet.

Our Supervisor

Dr. Maiga Chang is a Full Professor in the School of Computing and Information Systems at Athabasca University, Canada.

Research Goal

The research focuses on creating a service capable of verifying valid n-grams from a given set of words. The service can extract valid n-grams and their part-of-speech (POS) tags from the words provided by the user, which can then be used for verification purposes.

Our Team

Greg Fredin

Greg Fredin is an Athabasca University undergraduate student living in Edmonton, Alberta, Canada. He is also pursuing a minor in psychology, which will help with his interest in further AI research.

Rob Schmidt

Rob Schmidt is an Athabasca University undergraduate student from Calgary, Alberta, Canada. He will be pursuing a master's degree in Computer Science and has a particular interest in game-based design, learning, and research.

Bhavesh Gandhi

Bhavesh Gandhi is an undergraduate student studying Electrical and Electronics Engineering at Heritage Institute of Technology, India. His research interests lie in Machine Learning and Natural Language Processing.


Presentation Video

Live demonstrations of the outcome of 12 weeks of work (June 2021~August 2021). This research uses Natural Language Processing basics with DBpedia to identify valid n-grams and important part-of-speech tags. The research outcome implements services that accept users' requests in JSON to help them verify valid part-of-speech tags and identify valid n-grams. The research outcome involves Python, PHP, JavaScript (AJAX and JSON), and DBpedia.

  1. Stage 1: An automated system to extract and store valid n-grams and their POS tags from DBpedia.
  2. Stage 2: Development of the API service.

Stage 1: N-gram Extraction and Storage

Stage 1's major features include (but are not limited to):

  1. Extracting and storing valid n-grams and their POS tags from DBpedia.
  2. Cron jobs for the backend services.
  3. Dashboard that shows backend services' working progress.
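The storage step in Stage 1 might look something like the sketch below. The page does not describe the actual database, schema, or POS tagger, so everything here is an assumption: SQLite is used only because it ships with Python, the ngrams table and its columns are hypothetical, and the POS tags are passed in by hand rather than produced by a tagger or read from DBpedia.

```python
import sqlite3


def create_store(path=":memory:"):
    """Open a SQLite database and create the (hypothetical) n-gram table."""
    conn = sqlite3.connect(path)
    # phrase: lower-cased n-gram; n: word count; pos: one tag per word.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS ngrams (
            phrase TEXT PRIMARY KEY,
            n      INTEGER NOT NULL,
            pos    TEXT NOT NULL
        )
        """
    )
    return conn


def store_ngram(conn, phrase, pos_tags):
    """Insert or update one n-gram together with its POS tags."""
    words = phrase.lower().split()
    if len(words) != len(pos_tags):
        raise ValueError("expected one POS tag per word")
    conn.execute(
        "INSERT OR REPLACE INTO ngrams (phrase, n, pos) VALUES (?, ?, ?)",
        (" ".join(words), len(words), " ".join(pos_tags)),
    )
    conn.commit()


def lookup_pos(conn, phrase):
    """Return the stored POS tags for a phrase, or None if it is unknown."""
    row = conn.execute(
        "SELECT pos FROM ngrams WHERE phrase = ?",
        (" ".join(phrase.lower().split()),),
    ).fetchone()
    return row[0].split() if row else None
```

A scheduled job (such as the cron jobs mentioned above) could call store_ngram for each newly extracted n-gram, and a dashboard could report progress with a simple SELECT COUNT(*) on the same table.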

Stage 2: The API Service

Stage 2's major features include (but are not limited to):

  1. Developing an API service.
  2. Using the stored n-grams and their POS tags to build a service that gives users the information they request.
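As a rough illustration of the second feature, the sketch below answers a JSON request from stored n-grams and POS tags. The request and response field names ("phrases", "results", "valid", "pos") and the STORED_NGRAMS dictionary are hypothetical; the real service's JSON format and endpoints are not documented on this page.

```python
import json

# Hypothetical stored data: n-gram -> POS tags (one tag per word).
STORED_NGRAMS = {
    "machine learning": ["NN", "NN"],
    "new york": ["NNP", "NNP"],
}


def handle_request(body):
    """Answer a JSON request of the assumed form {"phrases": [...]}."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return json.dumps({"error": "request body is not valid JSON"})

    phrases = payload.get("phrases") if isinstance(payload, dict) else None
    if not isinstance(phrases, list):
        return json.dumps({"error": "expected a 'phrases' list"})

    results = []
    for phrase in phrases:
        # Normalize case and whitespace before looking the phrase up.
        key = " ".join(str(phrase).lower().split())
        pos = STORED_NGRAMS.get(key)
        results.append({"phrase": phrase, "valid": pos is not None, "pos": pos})
    return json.dumps({"results": results})
```

Calling handle_request('{"phrases": ["Machine Learning", "blue banana"]}') returns a JSON string with one result per phrase; unknown phrases come back with "valid" set to false and "pos" set to null.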

Frequently Asked Questions