Australian technology news, reviews, and guides to help you

CSIRO finds AI struggles with evidence for answers

AI can be used for many things, from making music to generating images, but if you ask it for answers, you might need to take the results with a grain of salt.

It’s been an interesting few years, but we’re now at the point where “AI” is no longer just a buzzword, but a feature being rolled out across the industry.

AI in PCs, AI in your phone, AI in your browser, and AI in your car. AI is everywhere, and if you ask Google a question, you might even get back an answer from AI. That’s what Google’s Gemini system is for, and it could make its way to search in Australia very soon.

But it’s not the only place where AI is being asked questions, with many people turning to ChatGPT, OpenAI’s platform, to get text and ideas seemingly out of nothing more than a prompt.

Much like how you can make music or images from a text idea, you can ask ChatGPT for almost anything, be it lyrics to a song, a handy poem, a segment of code, or even the answer to complex questions.

In fact, that last one has been a bit of a testing ground for scientists from the CSIRO, which has been working with The University of Queensland to work out whether ChatGPT provides the right information when asked a medical question.

To test this out, researchers asked ChatGPT 100 health questions and compared its answers to the correct responses based on existing medical knowledge. The results showed that ChatGPT didn’t always provide the right evidence-based response, particularly when a question included supporting or contrary evidence, though it performed well when given the question on its own.
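To make the distinction concrete, here is a minimal sketch of the two prompt styles the study contrasts: a question asked on its own, versus the same question with a piece of evidence included. The function names, wording, and example question are illustrative assumptions, not the researchers’ actual protocol.

```python
# Illustrative sketch only: the two prompt styles described in the article.
# Function names and phrasing are hypothetical, not the CSIRO study's wording.

def question_only_prompt(question: str) -> str:
    """The 'question only' case, which the study found ChatGPT handled well."""
    return f"Answer yes or no: {question}"

def question_with_evidence_prompt(question: str, evidence: str) -> str:
    """The same question, but with supporting or contrary evidence included,
    which the study found lowered the accuracy of the model's answers."""
    return (
        f"Consider this evidence: {evidence}\n"
        f"Answer yes or no: {question}"
    )

# A hypothetical example question and evidence snippet:
question = "Can zinc help treat the common cold?"
evidence = "One trial found zinc lozenges shortened cold duration by a day."

print(question_only_prompt(question))
print(question_with_evidence_prompt(question, evidence))
```

In the study, answers to both prompt styles were then checked against established medical knowledge; the surprising finding was that adding the evidence, correct or not, made the model less accurate.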

In essence, it appears an AI’s large language model, or “LLM”, mightn’t deliver the right answer when it’s fed too much information, and this may be a problem with current AI systems.

“We’re not sure why this happens. But given this occurs whether the evidence given is correct or not, perhaps the evidence adds too much noise, thus lowering accuracy,” said Dr Bevan Koopman, Principal Research Scientist for the CSIRO and Associate Professor at The University of Queensland.

One thing the CSIRO didn’t comment on was whether the same tests were run on Google’s Gemini, or other AI systems like it. But all the same, you just might want to take answers from an AI with a grain of salt.
