Valid N-Gram and PoS Identifier and Verifier

About the Current Project

DBpedia is a crowd-sourced community effort to extract structured content from the information created in various Wikimedia projects. This structured information resembles an open knowledge graph (OKG) which is available for everyone on the Web. A knowledge graph is a special kind of database which stores knowledge in a machine-readable form and provides a means for information to be collected, organised, shared, searched and utilised.

Using a knowledge graph requires no training time, only the time needed to construct the graph, so the service can go online sooner.
The service can check whether a given word, or a set of words separated by a delimiter, is a valid n-gram. It can also check the validity of the part-of-speech (POS) of the words entered by the user.
The service can extract valid n-grams from sentences, words, or sets of words entered by the user, which helps make the sentence grammatically correct.
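
The two checks described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the service's actual implementation: the VALID_NGRAMS table, its entries, the POS tag names, and all function names are hypothetical stand-ins for the data extracted from DBpedia.

```python
# Hypothetical in-memory stand-in for the n-gram store built from DBpedia.
# Keys are lower-cased n-grams; values are the POS tags recorded for them.
VALID_NGRAMS = {
    "new york": {"NOUN"},
    "machine learning": {"NOUN"},
    "run": {"VERB", "NOUN"},
}


def split_candidates(text, delimiter=","):
    """Split delimiter-separated input into normalized candidate n-grams."""
    parts = (" ".join(part.split()).lower() for part in text.split(delimiter))
    return [part for part in parts if part]


def is_valid_ngram(candidate):
    """Return True if the normalized candidate is a known n-gram."""
    return candidate in VALID_NGRAMS


def is_valid_pos(candidate, pos_tag):
    """Return True if pos_tag is recorded for the (known) candidate."""
    return pos_tag.upper() in VALID_NGRAMS.get(candidate, set())


def check(text, delimiter=","):
    """Map each candidate in the input to its validity."""
    return {c: is_valid_ngram(c) for c in split_candidates(text, delimiter)}
```

For example, check("New York, blorp zat, run") reports "new york" and "run" as valid and "blorp zat" as invalid, given the hypothetical table above.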

Terms of Use

The VIP Research Group is a research group led by Prof. Maiga Chang (https://www.athabascau.ca/science-and-technology/our-people/maiga-chang.html) at the School of Computing and Information Systems, Athabasca University. This "multi-sentence similarity calculation web service" (https://ws-nlp.vipresearch.ca/) is one of the research group's works. The research group has follow-up research plans to improve it and to use it in other research projects.

Almost all of Prof. Chang's works are open access (or open source). The web service (https://ws-nlp.vipresearch.ca/) is currently open access, and there is no plan to make it open source. It runs on a self-sponsored server and, like all of the other research projects (see http://maiga.athabascau.ca/#advanced), it will remain online, improving, and accessible as long as the cost is affordable and covered by Prof. Chang.

Of course, if the access volume of the web service becomes high, or if a business uses it commercially to make money, the terms of use may change; for example, donations, personal/academic/business licenses, or subscription modes. However, it is too early to say.

How to Access

Version 1.2
Version 1.1
Version 1.0

About Us

Our Mission

Our research aims to provide a sentence similarity service that measures the closeness of two or more sentences or paragraphs using Natural Language Processing and WordNet.

Our Supervisor

Dr. Maiga Chang is a Full Professor in the School of Computing and Information Systems at Athabasca University, Canada.

Research Goal

The research focuses on creating a service capable of verifying valid n-grams from a given set of words. The service can extract valid n-grams and their part-of-speech (POS) from the words provided by the user, which can then be used for verification purposes.

Our Team

Bala Guhanesh
2024

I'm Bala Guhanesh G S from VIT Chennai, India. I am a Computer Science Engineering student focusing on the field of Natural Language Processing.

Hsiang-han Cheng (Eleanor)
2024

I am Hsiang-han Cheng, a second-year master’s student in the Department of Computer Science and Information Engineering at Donghua University.

Kang-Fu Zheng (Wayne)
2024

Kang-Fu Zheng is a second-year Master's degree student at National Formosa University, living in Tainan, Taiwan. He obtained his Master's degree in Electrical Engineering in Taiwan.

Rob Schmidt
2022, 2023-2024

Rob Schmidt is an Athabasca University undergraduate student from Calgary, Alberta, Canada. He may be pursuing a master's degree in Computer Science and has a particular interest in game-based design, learning and research.

Arjun Guliya
2023

Arjun is an undergraduate student pursuing Computer Engineering at Queen's University, Canada. He is passionate about using technology to solve complex problems. A strong mathematical background and programming experience make him an excellent fit for such projects.

Greg Fredin
2023

Greg Fredin is an Athabasca University undergraduate student living in Edmonton, Alberta, Canada. He is also getting a minor in psychology which will help with his interest in further AI research.

Bhavesh Gandhi
2021

Bhavesh Gandhi is an undergraduate student pursuing Electrical and Electronics Engineering at Heritage Institute of Technology, India. His research interests lie in the domains of Machine Learning and Natural Language Processing.

Odinakachukwu Nzekwe
2025

Odinakachukwu Nzekwe is an undergraduate student at Athabasca University, pursuing a degree in Computer Science with a focus on Cybersecurity. Based in Mississauga, Ontario, Canada, Odinakachukwu is passionate about digital security, ethical hacking, and the ever-evolving landscape of cyber threats.

Videos

Presentation Video

Live demonstration of the outcome of 12 weeks of work (June 2021 to August 2021). This research uses Natural Language Processing basics with DBpedia to identify valid n-grams and important part-of-speech tags. The research outcome implements services that take users' requests in JSON to help them verify valid part-of-speech tags and identify valid n-grams. The research outcome involves Python, PHP, JavaScript (AJAX and JSON), and DBpedia.

Stage 1: An automated system to extract and store valid n-grams and their POS tags from DBpedia.
Stage 2: Developing the API service.

Stage 1: N-gram Extraction and Storage

Stage 1's major features include (but are not limited to):

Extracting and storing valid n-grams and their POS tags from DBpedia.
Cron jobs for the backend services.
Dashboard that shows backend services' working progress.
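
The storage half of Stage 1 could be sketched in Python roughly as below. This is an assumption-laden illustration, not the project's code: the NgramStore class and its methods are hypothetical, the (n-gram, POS tag) pairs are assumed to come from an upstream DBpedia extraction step that is not shown, and the 4-word limit is taken from the FAQ's mention of 4-grams.

```python
from collections import Counter, defaultdict

MAX_N = 4  # assumption: the FAQ mentions support for up to 4-grams


class NgramStore:
    """Hypothetical store that counts the POS tags observed per n-gram."""

    def __init__(self):
        self._tags = defaultdict(Counter)

    def add(self, ngram, pos_tag):
        """Record one observation; reject n-grams outside 1..MAX_N words."""
        words = ngram.lower().split()
        if not 1 <= len(words) <= MAX_N:
            return False
        self._tags[" ".join(words)][pos_tag] += 1
        return True

    def most_frequent_pos(self, ngram):
        """Return the most frequently observed tag, or None if unknown."""
        counts = self._tags.get(" ".join(ngram.lower().split()))
        if not counts:
            return None
        return counts.most_common(1)[0][0]


# Illustrative observations only; real pairs would come from the extraction job.
store = NgramStore()
for ngram, tag in [("Machine learning", "NOUN"), ("run", "VERB"),
                   ("run", "NOUN"), ("run", "VERB")]:
    store.add(ngram, tag)
```

Keeping a count per tag, rather than a single tag, is one simple way to answer "most frequent POS" queries later without re-reading the source data.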

Stage 2: The API Service

Stage 2's major features include (but are not limited to):

Developing an API service.
Using the stored n-grams and their POS tags to provide a service through which users can get the information they need.
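
As a rough idea of how a JSON request might be answered from the stored data, consider the Python sketch below. The request shape ({"ngrams": [...]}), the response fields, the STORED table, and the handle_request function are all hypothetical; the real API's fields are documented in the How to Access section and may differ.

```python
import json

# Hypothetical stored data: n-gram -> most frequent POS tag.
STORED = {"knowledge graph": "NOUN", "open": "ADJ"}

ERROR = {"error": "expected a JSON object with an 'ngrams' list"}


def handle_request(body):
    """Answer a JSON request of the assumed form {"ngrams": [...]}."""
    try:
        ngrams = json.loads(body)["ngrams"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return json.dumps(ERROR)
    if not isinstance(ngrams, list):
        return json.dumps(ERROR)

    results = []
    for ngram in ngrams:
        # Normalize case and whitespace before looking the n-gram up.
        key = " ".join(str(ngram).lower().split())
        results.append({
            "ngram": ngram,
            "valid": key in STORED,
            "pos": STORED.get(key),  # None (JSON null) when unknown
        })
    return json.dumps({"results": results})
```

Returning an error object instead of raising keeps malformed requests from crashing the handler, which is one reasonable choice for a public web service.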

Frequently Asked Questions

What can the service be used for?

The service can be used for extracting and validating n-grams and their most frequent part-of-speech (POS) tags.
What does the sentence similarity service do?

The sentence similarity service calculates the similarity between two sentences and assigns a score to the overall result.
How is the current service different from other services?

The current service uses the valid n-gram learning service to filter sentences: if a word makes no sense, it is removed from the sentence while the positions of the remaining words are preserved. The service can also be used to extract n-grams and check the grammatical correctness of sentences (up to 4-grams).
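
One way such filtering could work is a greedy longest-match scan, sketched in Python below. This is a guess at the general technique, not the service's confirmed algorithm: the VALID set and the extract_valid_ngrams function are hypothetical, and punctuation handling is left out for brevity.

```python
MAX_N = 4  # the service is described as handling up to 4-grams

# Hypothetical set of valid n-grams; the real data comes from DBpedia.
VALID = {"the", "cat", "sat", "in", "new york"}


def extract_valid_ngrams(sentence, valid=VALID, max_n=MAX_N):
    """Keep valid n-grams in their original order and drop everything else."""
    words = sentence.lower().split()
    kept = []
    i = 0
    while i < len(words):
        # Try the longest window first so "new york" wins over "new".
        for n in range(min(max_n, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in valid:
                kept.append(candidate)
                i += n
                break
        else:
            i += 1  # no valid n-gram starts at this word, so skip it
    return kept
```

With the hypothetical set above, extract_valid_ngrams("The cat blorp sat in New York") returns ["the", "cat", "sat", "in", "new york"]: the nonsense word is dropped and the remaining n-grams keep their relative order.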
How can I use this service?

Please go to the HOW-TO section, where you will find the documentation for each service and instructions on how to use it.
Will the services be updated frequently?

Yes. The services may be updated, or new services may be added, and the webpage will be updated with the latest information about any new service.