Semantic Kernels and Deep Learning for Short Text Classification
Short-text classification is a challenging problem, particularly for Turkish text, because distinguishing features are sparse and the feature space is high-dimensional.
Most existing work on Turkish tweets addresses sentiment analysis. This project instead aims to analyze and classify Turkish tweets on Twitter by topic, using the following categories: politics, economics & investment, health, technology & informatics, history, literature & film, sports, education & personal growth, and magazine.
To the best of our knowledge, no such study has been carried out on Turkish tweets. We use two approaches: a fine-tuned Transformer encoder, and a Support Vector Machine (SVM) with smoothing kernels based on the semantic values of terms. Naive Bayes, Random Forest, and SVM with a linear kernel are also used as baselines for comparing the results.
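To illustrate how a semantic smoothing kernel differs from a linear kernel on sparse short texts, the following is a minimal sketch and not the implementation used in this study. It assumes a kernel of the general form k(d1, d2) = (d1 S)(d2 S)^T, where S is a term-by-term smoothing matrix. The vocabulary, the example texts, and the values in S are hypothetical and chosen only for illustration; how the semantic values of terms are actually computed is not specified here.

```python
from collections import Counter

def bag_of_words(text, vocab):
    """Map a short text to a term-count vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocab]

def smooth(vec, S):
    """Multiply a row vector by the smoothing matrix S (vec * S)."""
    n = len(vec)
    return [sum(vec[i] * S[i][j] for i in range(n)) for j in range(n)]

def linear_kernel(d1, d2):
    """Plain dot product: zero whenever two texts share no terms."""
    return sum(x * y for x, y in zip(d1, d2))

def semantic_kernel(d1, d2, S):
    """Assumed form k(d1, d2) = (d1 S)(d2 S)^T."""
    a, b = smooth(d1, S), smooth(d2, S)
    return sum(x * y for x, y in zip(a, b))

# Hypothetical vocabulary: two sports terms, two economics terms.
vocab = ["gol", "penalti", "borsa", "dolar"]

# Hypothetical smoothing matrix: 1.0 on the diagonal,
# 0.5 between terms assumed to belong to the same topic.
S = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
]

texts = ["gol", "penalti", "borsa dolar"]
X = [bag_of_words(t, vocab) for t in texts]

# Gram matrices over the three example texts.
gram_linear = [[linear_kernel(a, b) for b in X] for a in X]
gram_semantic = [[semantic_kernel(a, b, S) for b in X] for a in X]
```

In this toy setting the first two texts share no terms, so the linear kernel gives them a similarity of zero, while the smoothed kernel gives them a positive similarity because S relates their terms. A Gram matrix of this kind could then be passed to an SVM implementation that accepts precomputed kernels.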