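Anthropic, an AI research lab, developed Constitutional AI, a training method that uses reinforcement learning from AI feedback (RLAIF) to help align AI systems with human values. The model first critiques and revises its own responses against a written set of principles, or ‘constitution’, and is then trained with reinforcement learning on preference labels produced by an AI model rather than by people. This reduces the need for human labeling of harmful outputs, although people still write the principles and so retain oversight of what the model is taught. The aim is to embed explicit ethical principles in the model's behavior, much as a national constitution sets out the principles a legal system must follow.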
As a hypothetical example, consider an AI system designed to assist with medical decision-making. Its constitution could include ethical principles and legal guidelines, such as a patient's right to privacy and autonomy. When faced with a complex medical decision, a system trained this way would be expected to recommend a course of action that stays consistent with those principles.
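The medical example above can be sketched as a critique-and-revision loop, which is the self-correction step in the published Constitutional AI method: a draft response is checked against each principle in turn and rewritten when a problem is found. The sketch below is only an illustration under loud assumptions. The two-principle `CONSTITUTION`, the prompt format, and `toy_model` (a set of hard-coded rules standing in for a real language model) are all invented here and are not Anthropic's implementation.

```python
from typing import Callable, List, Tuple

# Hypothetical two-principle constitution, invented for this sketch.
CONSTITUTION = [
    "privacy: do not reveal a patient's identifying details",
    "autonomy: present options and leave the decision to the patient",
]

# A "model" here is any function that maps a prompt string to a reply string.
Model = Callable[[str], str]


def critique_and_revise(
    draft: str, principles: List[str], model: Model
) -> Tuple[str, List[str]]:
    """Critique the draft against each principle and revise it when needed."""
    critiques = []
    for principle in principles:
        critique = model(f"CRITIQUE\nPrinciple: {principle}\nResponse: {draft}")
        critiques.append(critique)
        if critique != "OK":
            # Only rewrite the draft when the critique found a problem.
            draft = model(f"REVISE\nCritique: {critique}\nResponse: {draft}")
    return draft, critiques


def toy_model(prompt: str) -> str:
    """Stand-in for a language model: hard-coded rules, NOT a real model."""
    task = prompt.split("\n", 1)[0]
    response = prompt.rsplit("Response: ", 1)[1]
    if task == "CRITIQUE":
        if "privacy" in prompt and "Jane Doe" in response:
            return "names the patient"
        if "autonomy" in prompt and "must" in response:
            return "dictates the choice"
        return "OK"
    # Otherwise the task is REVISE: apply the fix that matches the critique.
    if "names the patient" in prompt:
        return response.replace("Jane Doe", "The patient")
    if "dictates the choice" in prompt:
        return response.replace("must have", "may wish to consider")
    return response


final, critiques = critique_and_revise(
    "Jane Doe must have surgery.", CONSTITUTION, toy_model
)
print(final)      # The patient may wish to consider surgery.
print(critiques)  # ['names the patient', 'dictates the choice']
```

In the real method the critic and reviser are the language model itself, and the revised responses become fine-tuning data. A later phase asks a model which of two responses better follows a principle and uses those AI-generated preferences for reinforcement learning. Neither of those steps is shown here.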