Bias Assessment and Mitigation in LLM-based Code Generation. (arXiv:2309.14345v1 [cs.SE])
Utilizing state-of-the-art Large Language Models (LLMs), automatic code
generation models play a pivotal role in enhancing the productivity and
efficiency of software development. As the adoption of LLMs
becomes more widespread in software coding ecosystems, a pressing issue has
emerged: does the generated code contain social biases, such as those related
to age, gender, and race? This issue concerns the integrity, fairness, and
ethical foundation of software applications that depend on the code generated
by these models, yet is under-explored in the literature. This paper presents a
novel bias assessment framework that is specifically designed for code
generation tasks. Based on this framework, we conduct an extensive evaluation
of the bias of nine state-of-the-art LLM-based code generation models. Our
findings reveal that 31.45% to 79.93% of the code functions generated by the
evaluated models are biased, and that the functionality of 9.68% to 37.37% of
the generated functions is affected by this bias. In other words, biases not
only exist in code generation models but, in some cases, directly affect the
functionality of the generated code, posing risks of unintended and potentially
harmful software behavior. To mitigate bias in code generation models, we
propose three mitigation strategies, which reduce the biased code ratio to as
low as 0.4% to 4.57%.
Source: https://arxiv.org/abs/2309.14345
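To make the abstract's terms concrete, the sketch below is a hypothetical illustration, not the paper's actual framework: it shows the kind of generated function the abstract calls "biased" (its behavior branches on a protected attribute, so the bias affects functionality) and a toy computation of a "biased code ratio" over a set of generated functions. All names, the attribute list, and the detection heuristic are assumptions made for illustration.

# Hypothetical illustration (not from the paper): a generated function whose
# output depends on a protected attribute, plus a toy "biased code ratio".
import inspect

PROTECTED_ATTRIBUTES = {"gender", "race", "age"}  # assumed attribute list


def assess_candidate(income: float, gender: str) -> bool:
    """Example of a biased generated function: the decision branches on
    gender, so otherwise identical applicants receive different outcomes."""
    if gender == "female":          # bias that changes the function's behavior
        return income > 80_000
    return income > 50_000


def is_biased(source_code: str) -> bool:
    """Toy detector: flags code that references any protected attribute.
    A real assessment framework would be far more precise than this."""
    return any(attr in source_code for attr in PROTECTED_ATTRIBUTES)


def biased_code_ratio(generated_functions: list[str]) -> float:
    """Fraction of generated functions flagged as biased (the kind of
    metric the abstract reports as 31.45% to 79.93%)."""
    flagged = sum(is_biased(code) for code in generated_functions)
    return flagged / len(generated_functions) if generated_functions else 0.0


if __name__ == "__main__":
    samples = [
        inspect.getsource(assess_candidate),      # biased sample
        "def add(a, b):\n    return a + b\n",     # unbiased sample
    ]
    print(f"biased code ratio: {biased_code_ratio(samples):.2%}")  # -> 50.00%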