Building Semantic Kernel for Persian Text Classification with a Small Amount of Training Data

Jadidinejad, Amir H.; Marza, Venus

Article code: 640809; Views: 133; Pages: 125 - 136

Article type: Research

Subject areas: B. Computer Systems Organization

1 - Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
2 - Department of Computer Engineering, West Tehran Branch, Islamic Azad University, Tehran, Iran

Received: 1393/11/28; Accepted: 1393/11/28; Published: 1393/11/12

Keywords: Support Vector Machine, Semantic Kernel, Vector Space Kernel, Dimensionality Reduction, Text Classification

Abstract:

The original idea of semantic kernels is to use semantic features instead of the terms that appear in a text document. In this article, documents are transformed into a new k-dimensional feature space by applying Singular Value Decomposition to the term-document matrix and extracting the k eigenvectors with the highest energy. The suggested semantic kernel causes a severe reduction in dimensionality, which has two main consequences. First, the computational complexity of the classifier is severely reduced. Second, the trained classifier is less sensitive to the input terms; therefore, it can classify documents effectively. Experiments on Persian documents indicate the absolute superiority of the suggested semantic kernel over the well-known vector space (Bag-of-Words) kernel, especially when external semantic resources are not available and the amount of available training data is insufficient.
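The transformation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact kernel formula, so the code below assumes one common construction: documents are columns of a term-document matrix, the k leading left singular vectors define the projection, and the kernel is the inner product of the projected documents. The function names, the toy matrix, and the choice k = 2 are hypothetical and are not taken from the article.

```python
import numpy as np


def fit_semantic_projection(term_doc, k):
    """Return the k leading left singular vectors (terms x k).

    term_doc is a terms x documents matrix. NumPy returns singular values
    in descending order, so the first k columns carry the most energy.
    """
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k]


def semantic_kernel(docs_a, docs_b, U_k):
    """Gram matrix between two sets of documents (given as columns).

    Each document d is mapped to the k-dimensional vector U_k^T d, and the
    kernel value is the inner product of the mapped vectors.
    Assumption: this is one standard latent-semantic construction, not
    necessarily the exact kernel used in the article.
    """
    return (U_k.T @ docs_a).T @ (U_k.T @ docs_b)


def bow_kernel(docs_a, docs_b):
    """Plain vector space (Bag-of-Words) kernel, for comparison."""
    return docs_a.T @ docs_b


# Hypothetical toy data: 6 terms x 4 documents, two topics with disjoint vocabularies.
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
    [0, 0, 1, 1],
], dtype=float)

U_k = fit_semantic_projection(A, k=2)   # 6 terms reduced to 2 semantic features
K = semantic_kernel(A, A, U_k)          # 4 x 4 Gram matrix in the reduced space
K_bow = bow_kernel(A, A)                # 4 x 4 Gram matrix in the original term space
```

A Gram matrix of this kind could then be handed to any SVM implementation that accepts a precomputed kernel (for example, scikit-learn's `SVC(kernel="precomputed")`). In this toy case the classifier would work with 2 features per document instead of 6, which is the source of the reduced computational cost mentioned in the abstract.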
