Summary
OpenAI developed a new model by fine-tuning its most powerful offering, GPT-4, to assist human trainers tasked with assessing code. The company found that the new model, dubbed CriticGPT, could catch bugs that humans missed. It is also part of an effort to ensure that AI behaves in acceptable ways even as it becomes more capable.
Dylan Hadfield-Menell, a professor at MIT who researches ways to align AI, says the idea of having AI models help train more powerful ones has been kicking around for a while. He says it remains to be seen how generally applicable and powerful it is.