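system does not work as well for one type or group of people as for another. Training data really does matter: a model may make generalized predictions that fit the majority class and fail on the minority class. Because the input data are assumed to be IID (independent and identically distributed), the majority will be heavily represented in the training set and the minority underrepresented.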
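A minimal sketch of the majority/minority effect, under assumptions that are not in the notes: a hypothetical population that is 10% minority, a made-up rule where the correct label differs by group, and a deliberately naive model that always predicts the most common training label. All function names here are illustrative.

```python
import random
from collections import Counter

def sample_groups(n, minority_rate=0.1, seed=0):
    """Draw n IID group memberships; each draw is 'minority' with probability minority_rate."""
    rng = random.Random(seed)
    return ["minority" if rng.random() < minority_rate else "majority" for _ in range(n)]

def true_label(group):
    # Hypothetical ground truth: the correct label depends on the group.
    return 1 if group == "majority" else 0

def fit_majority_label_model(train_groups):
    """Return a constant predictor that outputs the most common training label."""
    most_common = Counter(true_label(g) for g in train_groups).most_common(1)[0][0]
    return lambda group: most_common

def accuracy_by_group(model, groups):
    """Compute accuracy separately for each group present in `groups`."""
    correct, total = Counter(), Counter()
    for g in groups:
        total[g] += 1
        correct[g] += int(model(g) == true_label(g))
    return {g: correct[g] / total[g] for g in total}

train = sample_groups(1000, seed=0)
test = sample_groups(1000, seed=1)
model = fit_majority_label_model(train)

train_counts = Counter(train)
per_group = accuracy_by_group(model, test)
overall = sum(model(g) == true_label(g) for g in test) / len(test)

print("training composition:", dict(train_counts))
print("accuracy by group:", per_group)
print("overall accuracy:", overall)
```

Overall accuracy looks high because most test points come from the majority group, while accuracy on the minority group is zero. Reporting accuracy per group, instead of a single aggregate number, is what exposes the gap.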