Bluesky is weighing a proposal that would give users a say in how their data is used for AI

Speaking at the SXSW conference in Austin on Monday, Bluesky CEO Jay Graber said the social network has been working on a framework that would let users specify how they want their data to be used for generative AI.
The public nature of Bluesky’s social network has already allowed others to train their AI systems on users’ content, as was discovered last year when 404 Media came across a dataset built from 1 million Bluesky posts hosted on Hugging Face.
Bluesky competitor X, meanwhile, feeds users’ posts into sister company xAI to help train its AI chatbot, Grok. Last fall, X also changed its privacy policy to allow third parties to train their AI on users’ posts. That move, along with the U.S. election that elevated X owner Elon Musk’s status within the Trump administration, helped fuel another exodus of users from X to Bluesky.
As a result, Bluesky, the open source, decentralized X alternative, has grown to over 32 million users in just two years.
However, the demand for AI training data means the new social network has to think about its AI policy, even though it doesn’t plan to train its own AI systems on users’ posts.
Graber explained that the company has engaged with partners to develop a framework for user consent over how their data would, or would not, be used for generative AI.
“We really believe in user choice,” Graber said, noting that users would be able to specify how they want their Bluesky content to be used.
“It could be something similar to how websites specify whether they want to be scraped by search engines or not,” she continued.
“Search engines can still scrape websites, whether or not you have this, because websites are open on the public Internet. But in general, this robots.txt file gets respected by a lot of search engines,” she said. “So you need something to be widely adopted and to have users and companies and regulators to go with this framework. But I think it’s something that could work here.”
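For context, the robots.txt convention Graber references is a plain text file served at a site’s root that advisory-compliant crawlers check before scraping. Some AI crawlers already publish the user-agent strings site owners can target; the fragment below uses two real, documented examples (OpenAI’s GPTBot and Common Crawl’s CCBot):

```
# robots.txt — advisory only; well-behaved crawlers check it before scraping
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
```

As Graber notes, nothing technically prevents a crawler from ignoring these directives; the convention works only because major crawlers choose to respect it.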
The proposal, which is currently on GitHub, would record user consent at the account level or even at the post level, and would ask other companies to respect that setting.
“We’ve been working on it with other people in the space concerned about how AI is affecting how we view our data,” Graber added. “I think it’s a positive direction to take.”