Seeing Seeds Beyond Weeds: Green Teaming Generative AI for Beneficial Uses. (arXiv:2306.03097v1 [cs.HC])

Large generative AI models (GMs) like GPT and DALL-E are trained to generate
content for general, wide-ranging purposes. GM content filters are generalized
to filter out content that carries a risk of harm in many cases, e.g., hate
speech. However, prohibited content is not always harmful: there are
instances where generating prohibited content can be beneficial. So, when GMs
filter out content, they preclude beneficial use cases along with harmful ones.
Which use cases are precluded reflects the values embedded in GM content
filtering. Recent work on red teaming proposes methods to bypass GM content
filters to generate harmful content. We coin the term green teaming to describe
methods of bypassing GM content filters to design for beneficial use cases. We
showcase green teaming by: 1) Using ChatGPT as a virtual patient to simulate a
person experiencing suicidal ideation, for suicide support training; 2) Using
Codex to intentionally generate buggy solutions to train students on debugging;
and 3) Examining an Instagram page using Midjourney to generate images of
anti-LGBTQ+ politicians in drag. Finally, we discuss how our use cases
demonstrate green teaming as both a practical design method and a mode of
critique, which problematizes and subverts current understandings of harms and
values in generative AI.

Source: https://arxiv.org/abs/2306.03097
