Technology

OpenAI introduces data partnerships to deeply train AI models

The ChatGPT developer said that it is interested in large-scale datasets that reflect human society and that are not already easily accessible online to the public

IANS

NEW DELHI: Sam Altman-run OpenAI has introduced data partnerships, under which it will work with organisations to produce public and private datasets for training AI models.

The company aims to collaborate with organisations to help AI models understand “all subject matters, industries, cultures, and languages”, which requires as broad a training dataset as possible.

“Data Partnerships are intended to enable more organisations to help steer the future of AI and benefit from models that are more useful to them, by including content they care about,” the company said in a statement.

The ChatGPT developer said that it is interested in large-scale datasets that reflect human society and that are not already easily accessible online to the public.

“We can work with any modality, including text, images, audio, or video. We’re particularly looking for data that expresses human intention (e.g. long-form writing or conversations rather than disconnected snippets), across any language, topic, and format,” the company noted.

OpenAI said it can work with data in almost any form and can use its next-generation in-house AI technology to help people digitise and structure their data.

“For example, we have world-class optical character recognition (OCR) technology to digitise files like PDFs, and automatic speech recognition (ASR) to transcribe spoken words,” the company added.

The company is seeking partners to help it create an open-source dataset for training language models.

“This dataset would be public for anyone to use in AI model training. We would also explore using it to safely train additional open-source models ourselves. We believe open-source plays an important role in the ecosystem,” said OpenAI.

“We are also preparing private datasets for training proprietary AI models, including our foundation models and fine-tuned and custom models,” it added.
