ChatGPT is full of sensitive private information and spits out verbatim text from CNN, Goodreads, WordPress blogs, fandom wikis, Terms of Service agreements, Stack Overflow source code, Wikipedia pages, news blogs, random internet comments, and much more.
This is not correct. I understand how neural networks function, I also understand that the neural network is not a complete system in itself. In order to be useful, the model is connected to other things, including a source of reference information. For instance, earlier this year ChatGPT was connected to the internet so that it could respond to queries with more up-to-date information. At that point, the neural network was frozen. It was not being actively trained on the internet, it was just connected to it for the sake of completing search queries.