Bluesky is weighing a proposal that gives users consent over how their data is used for AI

Speaking at the SXSW conference in Austin on Monday, Bluesky CEO Jay Graber said the social network has been working on a framework that would let users consent to how their data is used for generative AI.
The public nature of Bluesky’s social network has already allowed others to train their AI systems on users’ content, as was discovered last year when 404 Media came across a dataset built from 1 million Bluesky posts hosted on Hugging Face.
Bluesky competitor X, meanwhile, is feeding users’ posts into sister company xAI to help train its AI chatbot Grok. Last fall, it changed its privacy policy to allow third parties to train their AI on users’ X posts, as well. The move, followed by the U.S. elections that elevated X owner Elon Musk’s status within the Trump administration, helped fuel another exodus of users from X to Bluesky.
As a result, Bluesky’s open source, decentralized X alternative has grown to over 32 million users in just two years’ time.
However, the demand for AI training data means the new social network has to think about its AI policy, even though it doesn’t plan to train its own AI systems on users’ posts.
Graber explained that the company has engaged with partners to develop the framework, which would let users indicate whether their data may be used, or not used, for generative AI.
“We really believe in user choice,” Graber said, adding that users would be able to specify how they want their Bluesky content to be used.
“It could be something similar to how websites specify whether they want to be scraped by search engines or not,” she continued.
“Search engines can still scrape websites, whether or not you have this, because websites are open on the public Internet. But in general, this robots.txt file gets respected by a lot of search engines,” she said. “So you need something to be widely adopted and to have users and companies and regulators to go with this framework. But I think it’s something that could work here.”
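For readers unfamiliar with the robots.txt convention Graber refers to, here is a minimal sketch of how a well-behaved crawler might check it before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleAIBot` and `SearchBot` crawler names, the `example.com` URLs, and the `is_allowed` helper are all hypothetical and chosen for illustration; none of this reflects Bluesky's actual proposal.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site: one named crawler is
# blocked entirely, everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit this crawler to fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("ExampleAIBot", "https://example.com/posts/1"))
    print(is_allowed("SearchBot", "https://example.com/posts/1"))
    print(is_allowed("SearchBot", "https://example.com/private/notes"))
```

As Graber notes, nothing in this mechanism stops a crawler from ignoring the file; the check only matters if the crawler chooses to run it and honor the result.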
The proposal, which is currently on GitHub, would involve getting user consent at the account level or even at the post level, then asking other companies to respect that setting.
“We’ve been working on it with other people in the space concerned about how AI is affecting how we view our data,” Graber added. “I think it’s a positive direction to take.”