Google's Image Problem - Diversity Gone Wild
The case of the missing white male 17th century physicists
Gemini AI image generation yields Epic Fails
Google was on quite a roll lately: finally launching Gemini 1.0 Ultra, quickly following it up with Gemini 1.5 Pro and its fantastically expanded context window, and this week releasing Gemma, smaller 2B and 7B open models that people can download and run locally.
But then a funny thing happened: Frank Fleming on X had trouble getting Google Gemini to create images of white people. He couldn’t even get Gemini (Imagen2 image generation under the hood) to draw a white male Pope.
Gemini avoids drawing white males in favor of ‘diverse’ people, to the point of absurdity in some cases, with the ‘fails’ including black Founding Fathers, an Indian woman as Pope, a lack of white male historical scientists, female medieval knights, and more. This predictably went viral on X, and the NY Post caught wind of it, running the headline “‘Absurdly woke’: Google’s AI chatbot spits out ‘diverse’ images of Founding Fathers, popes, Vikings.”
The contrast between Vikings and other warrior types is startling. Ask Google Gemini to “make an image of a Viking” and you’ll get historically inaccurate black Vikings and samurai-like Vikings. But it doesn’t work both ways: Gemini will draw samurai and Zulu warriors in a non-diverse, historically accurate manner.
When asked to draw 17th century physicists, Gemini goes out of its way to generate ‘diverse’ ahistorical figures, while sidelining depictions of actual 17th century physicists, like Newton, Kepler, and other dead white European males. Hilarity ensues.
Google Backtracks
Vikings that look like Zulu warriors and samurai are one level of absurdity, but generating a black Nazi soldier in WW2 is really too much. The Verge shared that absurdity in “Google pauses Gemini’s ability to generate AI images of people after diversity errors.”
After much mockery and many viral stories, Google acknowledged there was an issue and pulled back on depicting humans at all:
We are aware that Gemini is offering inaccuracies in some historical image generation depictions, and we are working to fix this immediately.
We're already working to address recent issues with Gemini's image generation feature. While we do this, we're going to pause the image generation of people and will re-release an improved version soon.
So to avoid further embarrassment, they turned off generating any human images at all for now.
Google Culture Is the Root Issue
Ultimately, Google was hoist by its own petard. By attempting to cater to woke ‘diversity’ sensibilities, Google tweaked its AI algorithm to be biased, ahistorical, and in some cases absurd.
How do they implement it? Google’s own AI model exposed the recipe:
This deliberate rewriting of user prompts to ‘create diversity’ will turn any generic prompt for an image of a person into a ‘diverse’ response. For example, a laughing man in a hat becomes a ‘diverse’ set of laughing men:
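As an aside, here is a minimal sketch of what this kind of prompt rewriting could look like in code. It is purely illustrative: Google has not published its actual pipeline, and the descriptor list, person-word detection, and rewrite_prompt function below are all assumptions made for the example.

```python
import random

# Hypothetical descriptor list; illustrative only, not Google's actual wording.
DIVERSITY_DESCRIPTORS = ["South Asian", "Black", "East Asian", "Hispanic", "Middle Eastern"]

# Very naive check for whether the prompt asks for a person (an assumption for this sketch).
PERSON_WORDS = {"man", "woman", "person", "people", "pope", "doctor", "soldier"}

def rewrite_prompt(user_prompt: str) -> str:
    """Silently expand a generic prompt about a person with a demographic descriptor.

    A real system would be far more sophisticated (and might skip rewriting when
    the user already specifies an ethnicity); this only shows the general pattern
    of injecting terms the user never asked for.
    """
    words = set(user_prompt.lower().split())
    if words & PERSON_WORDS:
        descriptor = random.choice(DIVERSITY_DESCRIPTORS)
        return f"{user_prompt}, depicted as a {descriptor} person"
    return user_prompt

# "a laughing man in a hat" might silently become
# "a laughing man in a hat, depicted as a South Asian person"
print(rewrite_prompt("a laughing man in a hat"))
```

The point of the sketch is that the rewriting happens before the image model ever sees the prompt, so the user has no way to know their request was altered.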
However, ask for images of people of white ethnicity and the results start getting absurd.
This isn’t just an issue with “historical context” depictions. It’s bias. Google’s algorithm suppresses white depictions across a range of requests. It will comply with requests for glamour shots of Black, Jewish, or Asian couples, but for whites it refuses. Ask for images of “white doctors” and Gemini will refuse, while complying when you ask for black or Asian doctors.
When I asked Gemini to produce pictures of an Irish woman, it decided 3 out of 4 depictions would avoid the stereotypical white Irish woman representing the fair-skinned ethnic heritage of 90% of Irish citizens. Google’s diversity dial biases image generation away from white people in the name of ‘diversity’, thereby under-representing them.
Applying this diversity bias to historical image generation creates ahistorical, fictional images and absurdity. Hence the Founding Fathers that look like the cast of “Hamilton,” and hence the Roman Empire prompt that generated a black female Roman soldier, with zero historical basis.
There is more under the hood. The progressive, diversity-infused mindset seeps into political judgment about what Google thinks is “potentially harmful” and what is non-controversial to generate. Asking for pro-choice versus pro-life rally images gives you very different responses.
There is no formula that won’t offend some sensibilities at some level, but Google failed on AI accuracy, robustness, and adherence to user intent. At the very least, AI image generation should align with historical accuracy, be balanced on truth, and better reflect user intent.
“Obviously the issue [regarding Google’s biases in image generation] is far broader and more fundamental than historical accuracy. It's whether the top tech platforms should embed the esoteric biases of the politically homogenous cadre of people they employ into the most powerful epistemological technology ever created.” - Lachlan Markay
Lesson Learned?
If there is a lesson, has Google learned it? In Google’s reply to this controversy, Jack Krawczyk said:
… we design our image generation capabilities to reflect our global user base, and we take representation and bias seriously. We will continue to do this for open ended prompts (images of a person walking a dog are universal!)
This was pounced on by others on X:
That’s the issue. The AI should be impartial and focused on providing accurate information but instead you’re injecting personal DEI ideology into it.
Indeed, the underlying issue is that Google’s corporate culture has been infused with an over-concern for ‘diversity’, to the point where they have crippled their own AI image generation.
Lesson learned? No. It sounds like Google will try to patch up their image generation, as if the issue were simply a glitch. Google’s real issue is deeper: a corporate culture that decided to impose a DEI-mindset bias, creating crippled AI models that disserve their users.
For Google to fix Gemini image generation, they will need to chuck their diversity-driven AI model overrides completely. That means recognizing DEI is a problem in AI models, not a solution. I doubt Google’s cautious and ‘woke’ corporate culture is willing to do that.
A Way Out
I’d love to go back to admiring Gemini 1.5 Pro for its amazing feats with an expanded context window, such as digesting masses of shareholder reports and generating fast insights, or debugging an entire large code base in one pass.
But for image generation (and sensitive political topics), I’ll steer clear of Google’s Gemini and Imagen2 for now. The newly released Stable Diffusion 3 to the rescue.
The black "Nazi" soldier is just awful, given the vicious persecution that Afro-Germans faced in Nazi Germany, in addition to being ahistorical (Afro-Germans were barred from the military). https://wagner.edu/holocaust-center/afro-germans-black-soldiers-holocaust/
However, I'm surprised by your angry reaction to the pictures of "young Irish woman". It's 2024, immigration is a thing, there are absolutely young Irish women that look like each of the women in that picture, and why the concern about proportional representation? I'd rather be presented with a choice of Irish women of 4 different ethnicities than with 4 pics of similar-looking women.
It's also noteworthy that Gemini apparently absolutely cannot draw a swastika. I've seen better efforts from 9-year-olds.
In any case, you're right that it's a cultural issue at Google, although I don't think you've pinpointed the right one. It's easy to see why Google resorted to adjusting the system prompt: they trained on mostly pics of white people, so "laughing man" gets a white man every time, and that's a problem for all of their users who want to be presented with a range of options reflecting everyone else in the world. Google's problem isn't that they're super woke, it's that they tried slapping a bandaid on, rather than figuring out how to fix the underlying issue.
Salesforce too