Extracting Domain Models from Turkish Text Using NLP and Machine Learning


Domain information and software requirements are mainly collected and documented as natural language text. As the documents get longer, it becomes difficult for humans to get a bird's-eye view of the information presented. Models are abstractions of such information that focus only on the necessary and relevant pieces.

However, building models requires time and effort. Automated model extraction from natural language text helps organizations save both and reduce their costs.

This project aims to apply natural language processing techniques and machine learning to extract various models (class diagrams or models in other domain-specific modeling languages) from Turkish text.
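To make the goal concrete, here is a minimal sketch of what extracting a class diagram from text could look like. It is not the project's method: it assumes the sentences have already been lemmatized and part-of-speech tagged (the tagged input below is written by hand, not produced by a real Turkish NLP tool), and it applies a deliberately naive rule where nouns become candidate classes and the first verb links the first two nouns. The names `DomainModel`, `extract_model`, `tr_capitalize`, and `to_plantuml` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DomainModel:
    """A tiny domain model: class names plus labeled associations."""
    classes: set = field(default_factory=set)
    associations: set = field(default_factory=set)  # (source, label, target)


def tr_capitalize(word):
    """Capitalize a non-empty lemma, respecting Turkish dotted/dotless i."""
    first = {"i": "İ", "ı": "I"}.get(word[0], word[0].upper())
    return first + word[1:]


def extract_model(tagged_sentences):
    """Naive rule (an assumption for illustration): every noun is a candidate
    class, and the first verb links the first two nouns of the sentence."""
    model = DomainModel()
    for sentence in tagged_sentences:
        nouns = [tr_capitalize(lemma) for lemma, pos in sentence if pos == "NOUN"]
        verbs = [lemma for lemma, pos in sentence if pos == "VERB"]
        model.classes.update(nouns)
        if len(nouns) >= 2 and verbs:
            model.associations.add((nouns[0], verbs[0], nouns[1]))
    return model


def to_plantuml(model):
    """Render the model as PlantUML class-diagram text."""
    lines = ["@startuml"]
    lines += [f"class {name}" for name in sorted(model.classes)]
    lines += [f"{src} --> {dst} : {label}"
              for src, label, dst in sorted(model.associations)]
    lines.append("@enduml")
    return "\n".join(lines)


# Hand-annotated (lemma, POS) pairs standing in for real tagger output.
SENTENCES = [
    # "Müşteri sipariş verir." (The customer places an order.)
    [("müşteri", "NOUN"), ("sipariş", "NOUN"), ("ver", "VERB")],
    # "Öğrenci derse kaydolur." (The student enrolls in the course.)
    [("öğrenci", "NOUN"), ("ders", "NOUN"), ("kaydol", "VERB")],
]

model = extract_model(SENTENCES)
print(to_plantuml(model))
```

A rule this simple breaks quickly on real requirements text, since Turkish morphology and free word order make lemmatization and role assignment hard. That gap is where the machine learning techniques mentioned in this description would come in.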

This project is best taken as a group, with the intention of continuing in the following semester as well.

Required skills are:

  • Basic understanding of the Turkish language

  • Eagerness to learn new machine learning techniques such as transfer learning and active learning

  • Eagerness to learn and apply natural language processing techniques

  • Python and probably Java programming skills