It takes shedloads of data to train AI systems to perform tasks accurately and reliably. Many companies pay gig workers on platforms like Mechanical Turk to complete tasks that are typically hard to automate, such as solving CAPTCHAs, labelling data and annotating text. This data is then fed into AI models to train them. The workers are poorly paid and are often expected to complete many tasks quickly.
Now, some are turning to tools like ChatGPT to maximise their earning potential. Researchers from the Swiss Federal Institute of Technology (EPFL) hired 44 people on the gig work platform Amazon Mechanical Turk to summarise 16 extracts from medical research papers. They then analysed the responses using an AI model they had trained themselves to look for telltale signals of ChatGPT output, such as a lack of variety in word choice.
They also captured the workers' keystrokes to determine whether they had copied and pasted their answers, an indicator that they had generated their responses elsewhere. They estimated that between 33 and 46 per cent of the workers had used AI models like OpenAI's ChatGPT.
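The EPFL team's actual classifier isn't public, but one of the signals it reportedly looks for, a lack of variety in word choice, can be illustrated with a simple type-token ratio (distinct words divided by total words). This is a hypothetical sketch for illustration only, not the researchers' method:

```python
# Illustrative sketch: approximate "variety in word choice" with a
# type-token ratio. AI-generated text that reuses the same phrasing
# repeatedly would score lower than more varied human writing.

def type_token_ratio(text: str) -> float:
    """Return the ratio of distinct words to total words (0 to 1)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return len(set(words)) / len(words)

varied = "The trial enrolled patients, randomised treatment arms and tracked outcomes."
repetitive = "The study shows results. The study shows results. The study shows results."

print(type_token_ratio(varied))      # every word distinct: high ratio
print(type_token_ratio(repetitive))  # heavy repetition: low ratio
```

A real detector would combine many such features and train a model on labelled examples; a single ratio like this is far too crude on its own.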
The researchers expect this proportion to grow as ChatGPT and other AI systems become more powerful and more easily accessible.
The study has yet to be peer-reviewed by humans, though perhaps that task, too, will soon be outsourced to AI.
The worry is that using AI-generated data to train AI could introduce more errors into already error-prone models. Large language models regularly present false information as fact.