What Your Email Address Reveals About You: LLMs and Digital Footprints
(maximepeabody.com)
LLMs are famous for being trained on massive amounts of data. Estimates for GPT-4, for example, put its training set at up to 1 petabyte. This data comes from crawling the open internet, as well as from collections of books, articles, scientific papers, and similar sources.