Meta President of Global Affairs Nick Clegg said that the company used only public posts to train its AI bot, steering clear of both private posts shared with friends and family and private messages, according to a report from Reuters.
"We've tried to exclude datasets that have a heavy preponderance of personal information," Clegg said during the company's annual Connect conference, adding that the "vast majority" of the data used was already publicly available.
Tech companies have been under fire in recent months over reports that they have been using information from the internet without permission to train AI models, which are capable of sorting through massive amounts of data. In some cases, the data sweeps have resulted in lawsuits, according to Reuters, especially when AI is accused of reproducing copyrighted materials.
Facebook logo on a smartphone. (Credit: Thiago Prudencio/SOPA Images/LightRocket via Getty Images)
"AIs need staggering amounts of training data, so user posts are an ideal way to 'feed the beast,'" Christopher Alexander, chief analytics officer of Pioneer Development Group, told Fox News Digital. "The concern is how these better-trained AI personas are put into use. You have the potential for some incredibly persuasive AIs that speak exactly how a person they are communicating with best identifies with. There are some real concerns about how human-seeming AI can become, and that should be considered."
Jon Schweppe, policy director of American Principles Project, questioned how many "safeguards" have been put in place to protect personal information.
The Instagram logo seen displayed on a smartphone. (Rafael Henrique/SOPA Images/LightRocket via Getty Images)
"Meta appears to be building its AI on the backs of its users’ posts, pictures, and personal data. Is that really what consumers signed up for?" Schweppe told Fox News Digital. "America is at a flashpoint, similar to where we were when social media came onto the scene nearly 20 years ago and launched the largest social experiment in the history of mankind. Either Congress acts now and gives the American people oversight over AI, or we will once again be left to the mercy of our technological overlords."
Meta CEO Mark Zuckerberg publicly unveiled the company's new AI tool during the product conference last week, which was built using a custom model similar to the Llama 2 large language model, according to Reuters. The product will be able to generate text, audio and images while also having access to real-time information through a partnership with Bing search, the report noted.
The public Facebook and Instagram posts were used to train Meta's AI tool for both image generation and chat responses, while users' interactions with the bot will help it improve its features in the future, Meta told Reuters.
Ziven Havens, policy director at the Bull Moose Project, told Fox News Digital that it "should come as no surprise to users" that their posts would be used to train Meta's AI tools, but argued users should be concerned about "whether or not their data is being used in a responsible, secure way."
Mark Zuckerberg, CEO and founder of Facebook Inc., speaks during the Silicon Slopes Tech Summit in Salt Lake City on Jan. 31, 2020. (Credit: George Frey/Bloomberg via Getty Images)
"Without real action from Congress, Americans have to assume that these AI companies are being responsible with their data, which many Americans would find hard to believe given the past decade," Havens said. "If Congress doesn’t act, data privacy concerns are only going to continue to rise."
Phil Siegel, founder of the Center for Advanced Preparedness and Threat Response Simulation, told Fox News Digital it is "not surprising" that Meta is using posts to train its AI, noting that it gives the bot access to "unique data and will allow it to train an LLM (large language model) to act like they’re social media users, and it will be distinct from LLMs that just scrape factual information from the internet world."
But Siegel noted that there could be concerns about the spread of artificial personalities, especially given the mental health impact social media has already had on users.
"I worry that the models will spread bad or offensive information, hyper hallucinate (think about combining emotional human responses with already hallucinating LLMs), create warped personalities to interact with teens and more. This may be how the metaverse actually progresses," he said. "Let's be honest, the social media companies have hurt the mental health of many people old and young without AI in the mix… they need to make sure this AI doesn't amplify the problem even further."
Reached for comment by Fox News Digital, a spokesperson for Meta said that some of the details in the Reuters report were inaccurate, noting that Meta's AI used public Facebook and Instagram posts to train the model for "image generation features," but did not use public or private data to "train the custom model for text/LLM on our AI Assistant and characters." The spokesperson added that Meta's new AI will not "generate audio," a feature that some reports have indicated will be available.