Stack Overflow Will Charge AI Giants for Training Data


The News/Media Alliance, a US trade group of publishers that includes Condé Nast, which owns WIRED, today unveiled principles calling on generative AI developers to negotiate any use of their data for training and other purposes and to respect their right to fair compensation.

Meta, Google, and OpenAI, maker of ChatGPT, have all developed AI systems using data sets that culled content from thousands of online sources, including Stack Overflow and Reddit, according to outside analyses and their own disclosures.

Developing the AI systems behind tools such as ChatGPT and the image generator DALL-E costs hundreds of millions of dollars, and it’s about to get more expensive.

OpenAI, Google, and other companies building large-scale AI projects have traditionally paid nothing for much of their training data, scraping it from the web.

Feeding text from online banter or expert discussions about programming into machine learning algorithms known as large language models, or LLMs, can help AI text generators or chatbots be more fluent and knowledgeable. Using LLMs to generate programming code is viewed as one of the technology’s biggest opportunities, with Microsoft charging as much as $19 a month per person for its code generator GitHub Copilot.

“Community platforms that fuel LLMs absolutely should be compensated for their contributions so that companies like us can reinvest back into our communities to continue to make them thrive,” Stack Overflow’s Chandrasekar says. “We’re very supportive of Reddit’s approach.”

Chandrasekar described the potential additional revenue as vital to ensuring Stack Overflow can keep attracting users and maintaining high-quality information.
