We have released an Arabic-English parallel corpus for the machine translation task. The corpus is aligned at the document level and contains high-quality documents extracted from the abstracts of journal publications. The abstracts are in English and Arabic and are published on our service, Shamra Academia Search Engine.
The corpus contains 8500 documents published in medical and engineering journals.
If you would like more information about the dataset, contact us at sales [at] prime-itech [dot] com.