Contract Review
How Social Media Is Using Your Data to Train AI, and Whether You Can Opt Out
Aug 6, 2024
Social media platforms such as Facebook, Instagram, TikTok, and LinkedIn are all using your data to train their AI models. While this topic may seem a little different from our usual contract discussions, it’s something users often agree to via each platform’s Terms of Use and Privacy Policy. This week, we sift through the jargon to explain how the major platforms are using your data, the legalities involved, and how you might be able to stop them from doing so.
Why do social media platforms need your data to train their AI?
AI models require an extraordinarily large set of training data to ‘learn’. For example, to accurately generate a picture of a rabbit, a model has likely reviewed thousands of rabbit photos. This makes the sheer volume of contextual data (posts, comments, images, videos) that social media platforms have access to a treasure trove for model development. Since these companies already get user data for free and use it to earn advertising revenue, they have a strong incentive to leverage that data for other product development as well.
What data are Social Media Platforms using and can you opt out?
For each major social media platform, we’ve listed what data they’re using and whether US users can opt out.
Meta (Facebook, Instagram, Threads, WhatsApp) uses all public social media data for training. Private messages and accounts set to private are not used, with the exception of messages with chatbots. Users with public accounts cannot opt out, but they may file a complaint if they have evidence that their personal information has been leaked through the model.
X/Twitter uses all posts and responses on the site for training Grok, a model developed by xAI. Users can opt out on the desktop version of X under “Settings” → “Privacy and safety” → “Grok”.
LinkedIn states that both your usage data and your personal data are fair game for training generative AI models. Similar to Meta, there is no straightforward opt-out method; users can only fill out an objection form and await the company’s response.
TikTok’s practice of using user data to feed AI models predates that of most other platforms, since TikTok revolutionized algorithmic, model-driven feeds. Their Privacy Policy clearly states that collected data is used “to train and improve [their] technology, such as our machine learning models and algorithms”, and there isn’t any way to opt out.
Is it legal for companies to use your data for their models?
This question is tricky, because there isn’t a clear answer yet. Most companies have made very minor edits to their privacy policies to expand their existing usage of user data to include AI models. The ability of AI models to regurgitate what one user has fed them to another user opens up massive holes in privacy and ownership. Lawsuits have already been filed left and right targeting companies such as OpenAI, Microsoft, Meta, and Google. Only time will tell if US legislation continues to give AI model development free rein, or enacts stricter privacy laws similar to the GDPR in Europe.
—
While we have only covered social media platforms in this post, the use of what is perceived as private data to train models is pervasive. Google uses Gmail data, Microsoft uses chats with Bing, and Zoom even considered using video chat data before pulling back on that decision. You only have to look inside the Privacy Policy of any software product you use to find the myriad ways your data gets used. While these contracts are harder to negotiate, we still believe strongly in transparency for the end user and hope this post was helpful in explaining what companies are doing with your data.
For advocacy and beyond,
The Ask Ginkgo Team