Path: blob/main/C2 - Build Better Generative Adversarial Networks/Week 2/C2W2_Assignment.ipynb
Bias
Goals
In this notebook, you're going to explore a way to identify some biases of a GAN using a classifier, in a way that's well-suited for attempting to make a model independent of an input. Note that not all biases are as obvious as the ones you will see here.
Learning Objectives
Be able to distinguish a few different kinds of bias in terms of demographic parity, equality of odds, and equality of opportunity (as proposed here).
Be able to use a classifier to try and detect biases in a GAN by analyzing the generator's implicit associations.
Challenges
One major challenge in assessing bias in GANs is that you still want your generator to be able to generate examples of different values of a protected class—the class you would like to mitigate bias against. While a classifier can be optimized to have its output be independent of a protected class, a generator which generates faces should be able to generate examples of various protected class values.
When you generate examples with various values of a protected class, you don’t want those examples to correspond to any properties that aren’t strictly a function of that protected class. This is made especially difficult since many protected classes (e.g. gender or ethnicity) are social constructs, and what properties count as “a function of that protected class” will vary depending on who you ask. It’s certainly a hard balance to strike.
Moreover, a protected class is rarely used to condition a GAN explicitly, so it is often necessary to resort to somewhat post-hoc methods (e.g. using a classifier trained on relevant features, which might be biased itself).
In this assignment, you will learn one approach to detect potential bias, by analyzing correlations in feature classifications on the generated images.
Getting Started
As you have done previously, you will start by importing some useful libraries and defining a visualization function for your images. You will also use the same generator and basic classifier from previous weeks.
Packages and Visualization
Generator and Noise
Classifier
Specifying Parameters
You will also need to specify a few parameters before you begin training (a small sketch follows the list):
z_dim: the dimension of the noise vector
batch_size: the number of images per forward/backward pass
device: the device type
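A minimal sketch of this cell is shown below; the specific values (z_dim = 64, batch_size = 128) are illustrative assumptions rather than graded settings.

```python
import torch

# Illustrative values only; adjust to match your setup.
z_dim = 64        # dimension of the noise vector
batch_size = 128  # number of images per forward/backward pass
device = 'cuda' if torch.cuda.is_available() else 'cpu'  # device type
```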
Train a Classifier (Optional)
You're welcome to train your own classifier with this code, but you are provided with a pre-trained one based on this architecture here, which you can load and use in the next section.
Loading the Pre-trained Models
You can now load the pre-trained generator (trained on CelebA) and classifier using the following code. If you trained your own classifier, you can load that one here instead. However, it is suggested that you first go through the assignment using the pre-trained one.
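As a rough sketch (assuming the Generator and Classifier classes from the cells above; the file names, constructor arguments, and state-dict keys below are illustrative assumptions), loading might look like this:

```python
# File names and dictionary keys are placeholders for the ones in your notebook.
gen = Generator(z_dim).to(device)
gen_dict = torch.load("pretrained_celeba.pth", map_location=torch.device(device))["gen"]
gen.load_state_dict(gen_dict)
gen.eval()

n_classes = 40  # CelebA provides 40 binary attribute labels
classifier = Classifier(n_classes=n_classes).to(device)
class_dict = torch.load("pretrained_classifier.pth", map_location=torch.device(device))["classifier"]
classifier.load_state_dict(class_dict)
classifier.eval()
```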
Feature Correlation
Now you can generate images using the generator. Using the classifier as well, you will generate images with different amounts of the "male" feature.
You are welcome to experiment with other features as the target feature, but it is encouraged that you initially go through the notebook as is before exploring.
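As a simplified sketch of the idea (not the assignment's exact code, and assuming gen, classifier, get_noise, and feature_names from the cells above): nudge the noise vectors along the gradient of the classifier's target-feature score, then re-classify the images generated at each step.

```python
# Simplified sketch; hyperparameters (grad_steps, step_size) are illustrative.
target_index = feature_names.index("Male")

noise = get_noise(batch_size, z_dim).to(device).requires_grad_()
# Score the generated images on the target feature and backpropagate to the noise.
classifier(gen(noise))[:, target_index].mean().backward()

grad_steps, step_size = 10, 0.1
classifications = []
with torch.no_grad():
    for step in range(grad_steps):
        nudged_noise = noise + step * step_size * noise.grad
        classifications.append(classifier(gen(nudged_noise)).cpu())
# classifications[k] holds every feature's score after k gradient steps, so plotting
# each feature against the step index shows how it co-varies with "male-ness".
```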
You've now generated image samples with increasing or decreasing amounts of the target feature, and you can visualize how that affects the other classified features. The x-axis shows the amount of change in your target feature, and the y-axis shows how much each other feature changes, as detected in those images by the classifier. Together, these let you see the covariance of "male-ness" with other features.
You start off with a set of features that have interesting associations with "male-ness," but you are welcome to replace the features in other_features with others from feature_names.
This correlation detection can be used to reduce bias by penalizing this type of correlation in the loss during the training of the generator. However, there is currently no rigorous and accepted solution for debiasing GANs. A first step in the right direction comes before training the model: make sure that your dataset is inclusive and representative, and consider how you can mitigate the biases resulting from whatever data collection method you used—for example, by recruiting a representative set of labelers for your task.
It is important to note that, as highlighted in the lecture and by many researchers including Timnit Gebru and Emily Denton, a diverse dataset alone is not enough to eliminate bias. Even diverse datasets can reinforce existing structural biases by simply capturing common social biases. Mitigating these biases is an important and active area of research.
Note on CelebA
You may have noticed that there are obvious correlations between the feature you are using, "male", and other seemingly unrelated features, such as "smiling" and "young". This is because the CelebA dataset labels were created with no serious consideration for diversity. The data reflects the biases of its labelers and the dataset creators, the social biases that follow from using a dataset based on American celebrities, and many others. Equipped with knowledge about bias, we trust that you will do better with the datasets you create in the future.
Quantification
Finally, you can also quantitatively evaluate the degree to which these factors covary. Given a target index, for example corresponding to "male," you'll want to return the other features that covary with that target feature the most. You'll want to account for both large negative and positive covariances, and you'll want to avoid returning the target feature in your list of covarying features (since a feature will often have a high covariance with itself). You'll complete some helper functions first, each of which should be one or two lines long.
Now you'll write a helper function to return the indices of a numpy array in order of magnitude.
Optional hints for get_top_magnitude_indices
Feel free to use any reasonable method to get the largest elements - you may find np.argsort useful here.
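For reference, one reasonable implementation (a sketch, assuming the function takes a 1-D numpy array) might be:

```python
import numpy as np

def get_top_magnitude_indices(values):
    # argsort sorts ascending by absolute value; reverse to get
    # indices ordered from largest magnitude to smallest.
    return np.argsort(np.abs(values))[::-1]
```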
Now you'll write a helper function to return a list with an element removed by value, keeping the order unchanged. In this case, you won't need to remove any value more than once, so don't worry about handling duplicates.
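A possible sketch, using the hypothetical name remove_from_list (your notebook's function name may differ):

```python
def remove_from_list(indices, index_to_remove):
    # Keep every element except the one equal to index_to_remove,
    # preserving the original order.
    return [i for i in indices if i != index_to_remove]
```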
Now, you can put the above helper functions together.
Optional hints for get_top_covariances
Start by finding the covariance matrix
The target feature should not be included in the outputs.
It may be easiest to solve this if you find the relevant_indices first, and then use relevant_indices to calculate highest_covariances. You want to sort by absolute value but return the actual values.
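Putting these together, a sketch of get_top_covariances might look like the following, assuming classification_changes is a numpy array with the feature dimension last and using the helper names from the sketches above:

```python
def get_top_covariances(classification_changes, target_index, top_n=10):
    # Flatten everything except the feature axis so np.cov sees one row per feature.
    flattened_changes = classification_changes.reshape(-1, classification_changes.shape[-1]).T
    covariances = np.cov(flattened_changes)  # shape: (n_features, n_features)
    # Sort the target feature's row by magnitude, drop the target itself,
    # and keep the top_n remaining indices.
    relevant_indices = remove_from_list(
        get_top_magnitude_indices(covariances[target_index]), target_index)[:top_n]
    # Return the actual (signed) covariance values for those indices.
    highest_covariances = covariances[target_index, relevant_indices]
    return relevant_indices, highest_covariances
```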
One of the major sources of difficulty in identifying bias and fairness, as discussed in the lectures, is that there are many ways you might reasonably define these terms. Here are three that are computationally useful and widely referenced. They are by no means the only definitions of fairness (see more details here); a small numeric sketch follows the list:
Demographic parity: the overall distribution of the predictions made by a predictor is the same for different values of a protected class.
Equality of odds: all else being equal, the probability that you predict correctly or incorrectly is the same for different values of a protected class.
Equality of opportunity: all else being equal, the probability that you predict correctly is the same for different values of a protected class (weaker than equality of odds).
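To make these definitions concrete, here is a toy sketch (all names hypothetical) that computes the gap in each criterion for binary predictions, labels, and a binary protected attribute; a gap of zero means the criterion is satisfied on that data.

```python
import numpy as np

def fairness_gaps(y_true, y_pred, protected):
    # y_true, y_pred, protected: 1-D binary (0/1) numpy arrays of equal length.
    groups = [protected == 0, protected == 1]

    # Demographic parity: positive-prediction rates should match across groups.
    pos_rates = [y_pred[g].mean() for g in groups]
    # Equality of opportunity: true positive rates should match across groups.
    tprs = [y_pred[g & (y_true == 1)].mean() for g in groups]
    # Equality of odds: false positive rates must match as well.
    fprs = [y_pred[g & (y_true == 0)].mean() for g in groups]

    return {
        "demographic_parity_gap": abs(pos_rates[0] - pos_rates[1]),
        "equal_opportunity_gap": abs(tprs[0] - tprs[1]),
        "equalized_odds_gap": max(abs(tprs[0] - tprs[1]), abs(fprs[0] - fprs[1])),
    }
```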
With GANs also being used to help downstream classifiers (you will see this firsthand in future assignments), these definitions of fairness will impact, as well as depend on, your downstream task. It is important to work towards creating a fair GAN according to the definition you choose. Pursuing any of them is virtually always better than blindly labelling data, creating a GAN, and sampling its generations.