At dida, a German AI service provider, we have a strong opinion on data labeling:
It is best to start by developing a high-quality labeling schema: a system in which the domain experts and the machine learning scientists jointly define which aspects must be labeled and which fine-grained details matter. Once the labeling schema is well defined, we start labeling: first our ML scientists label data themselves, and then most of the work is done by our internal data labeling students. We prefer to label data in-house because it gives us more control over labeling quality and makes it easier to adapt the labeling schema.
Take this remote sensing example from computer vision, in which four people labeled the same rooftop for our rooftop segmentation solution. All four created their labels differently. Person 1 did a good job. Person 2 missed some obstacles on the roof. Person 3 drew the labels imprecisely and without clean edges, and person 4 failed the task completely.
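One way to make such differences measurable is to compare the labelers' segmentation masks pairwise, for example with intersection over union (IoU). The following is only a minimal sketch and not our production tooling: the tiny 4x4 masks, the labeler names, and the helper functions `iou` and `pairwise_iou` are made up for illustration.

```python
from itertools import combinations


def iou(mask_a, mask_b):
    """Intersection over union of two binary masks of equal shape."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


def pairwise_iou(masks):
    """Return the IoU for every pair of labelers in a name -> mask dict."""
    return {
        (first, second): iou(masks[first], masks[second])
        for first, second in combinations(sorted(masks), 2)
    }


# Hypothetical toy masks (1 = roof pixel, 0 = background).
masks = {
    "labeler_a": [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]],
    "labeler_b": [[0, 0, 0, 0], [0, 1, 1, 1], [0, 1, 1, 1], [0, 0, 0, 0]],
    "labeler_c": [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
}

for pair, score in pairwise_iou(masks).items():
    print(pair, f"{score:.2f}")
```

A low IoU between two labelers does not say which of them is right, but it flags images where the labeling schema may be unclear or where a labeler may need more guidance.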