Why Models Hallucinate
This might also have implications beyond new models. Including the same source data multiple times in a prompt, to increase the likelihood the model lands on the correct answer, might be a decent strategy. Hopefully the files won't get too big and cause other issues, but for generating a timeline, for instance, including multiple copies of the medical charts might increase accuracy without additional time or resources.
That assumes the increased size fits within the token limit so that you don't run into other issues... (here's hoping token limits will be a thing of the past soon! Maybe the trillions in infrastructure spending will help :/)
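A minimal sketch of the idea above: build the prompt with several duplicates of the same source text so the model has more chances to pick up the relevant facts. The function name, the repeat count, and the prompt layout are all assumptions for illustration, not anything from the paper.

```python
def build_prompt(question: str, source_text: str, copies: int = 3) -> str:
    """Concatenate `copies` duplicates of the source text before the question.

    Hypothetical helper: repeating the context is the strategy speculated
    about above, traded off against the larger prompt eating into the
    model's token limit.
    """
    repeated = "\n\n".join([source_text] * copies)
    return f"{repeated}\n\nQuestion: {question}"

chart = "2021-03-02: initial consult. 2021-04-11: follow-up."
prompt = build_prompt("List the patient's visits in chronological order.", chart)
# The prompt now contains three copies of the chart; its length grows
# linearly with `copies`, so check it still fits the context window.
```

Whether the duplication actually helps would need to be measured per task; the point is only that the prompt-side cost is linear in the number of copies.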
Source: https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf