

CHENNAI: The Central Institute of Classical Tamil (CICT) has completed the digitisation and verification of 640 of the 1,330 couplets of the Thirukkural from palm-leaf manuscripts, marking a significant milestone in its effort to create a comprehensive digital archive of Tamil's classical literary heritage.
Announcing the progress, the institute said on Friday that the work forms part of its Digital Archives of Classical Tamil initiative under which 64 ground-truth manuscript specimens have been processed so far, covering 48.1 per cent of the Thirukkural. The entire Arathuppal (Book of Virtue) comprising 380 couplets has been verified, while work on Porutpal (Book of Wealth) is under way. The Kamathuppal (Book of Love) section remains to be taken up.
According to the institute, every transcription is cross-verified with critical editions of the text and annotated to record scribal variations, palaeographic features, and elements of Old Tamil orthography. The verified corpus has been integrated with translations in 30 languages and is being developed as a training dataset for handwritten Tamil text recognition technologies.
"All resources are provided through open access. Verified manuscript transcriptions are released progressively as they are completed and assigned individual digital object identifiers for long-term scholarly access," said CICT Director R Chandrasekaran.
The digital archive offers deep-zoom manuscript viewing, searchable concordances, Tamil text-to-speech functionality, and research tools designed for textual analysis and manuscript studies. The resources are released under a Creative Commons licence to support wider academic use.
The institute said the Thirukkural project is one of 13 major resources currently available on the portal. Collectively, the repository contains more than 80,000 headwords and 6.28 million characters of scholarly commentary and exegesis.
Other resources include a searchable index of the Sangam corpus; the Comprehensive Digital Encyclopaedia of Classical Tamil; a digital audio edition of the Madras University Tamil Lexicon; digital editions of Nannool, Tandiyalankaram, Sendan Tivakaram, and Pinkala Nigandu; a Tamil grammatical lexicon; and a digital archive containing first editions of 41 classical Tamil texts.
CICT said the repository is being updated continuously and new verified manuscripts, lexical resources and scholarly editions will be added on a mission-mode basis to strengthen access to classical Tamil literature and research materials worldwide.
640 Thirukkural couplets digitised
48.1% of the corpus completed
Arathuppal fully verified
Porutpal work in progress
Translations available in 30 languages
Corpus to train handwritten Tamil text recognition tools
13 digital resources now online
80,000+ headwords archived
6.28 million characters digitised
Resources available free to users