Shamra Search Engine

NLP: Arabic Dataset for the News Classification Task

Last month we released a dataset for Arabic news classification. It can be useful to anyone interested in building classification models for Arabic-language news, especially in the Syrian dialect.


Data Source

The data is collected from the news service of Shamra Search Engine, a project owned by Prime technologies. The service has been running since 2015, so the dataset covers a variety of news and topics that have trended since then.

Acquiring the Dataset

The dataset costs $750 for a one-time license and $1500 for a lifetime license. With the lifetime license, you can acquire the latest version of the dataset at no extra cost whenever it is released.

For more information about the dataset, contact us at sales [at] prime-itech [dot] com.