Data Integrity

Jan 21

I’m low-key worried about data integrity over the next decade. If someone like Wikipedia doesn't step up and offer an open-source dataset, it's going to be impossible to verify factual claims.

I was thinking about how AI hallucinates. From our perspective, a hallucination is a wrong output that doesn't match the target response. From the AI's perspective, though, it could be a correct answer, since the model likely takes into account the general sentiment of the broader Internet. It got so bad that Grok had to be pinned to Elon Musk's political takes on Twitter so it would at least be on brand if they couldn't achieve political correctness.

With cases of political actor influence like the one in this video, it seems that someone needs to teach AI the difference between reality and a simulation. You need real-world data to back everything up. I think a set of open-source experiments, all published to a single place in a predictable format, could be that dataset. If Wikipedia or even GitHub provided a space for that, AI might get past the truth issue. One dataset without all the noise.
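To make "predictable format" a bit more concrete, here is a rough sketch of what one published experiment record could look like. Everything in it is my own assumption: the `ExperimentRecord` name, the field names, and the idea of attaching a SHA-256 digest so anyone can check that a record hasn't been altered. It isn't an existing standard or anything Wikipedia or GitHub offers today, and the sample values are placeholders, not real measurements.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ExperimentRecord:
    # Hypothetical fields; a real shared format would need community agreement.
    title: str
    method: str
    measurements: tuple
    units: str
    source_url: str

    def canonical_json(self) -> str:
        # Sorted keys and fixed separators, so identical data always
        # serializes to the identical string.
        return json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))

    def digest(self) -> str:
        # SHA-256 over the canonical form acts as a tamper-evident fingerprint.
        return hashlib.sha256(self.canonical_json().encode("utf-8")).hexdigest()


def verify(record: ExperimentRecord, expected_digest: str) -> bool:
    # True only if the record still matches the digest it was published with.
    return record.digest() == expected_digest


# Placeholder values for illustration only.
record = ExperimentRecord(
    title="Boiling point of water",
    method="Heated in an open container, read with a thermometer",
    measurements=(99.9, 100.1, 100.0),
    units="degC",
    source_url="https://example.org/experiments/1",
)
published_digest = record.digest()
```

The point of the digest is that a reader or a training pipeline could call `verify(record, published_digest)` and reject anything that was edited after publication. It doesn't prove the experiment was done honestly, only that the record is the same one that was published.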

Source: https://www.youtube.com/watch?v=CsCweuN9Ua8

Joshua Smith
