When people hear “artificial intelligence,” many envision “big data.” There’s a reason for that: some of the most prominent AI breakthroughs of the past decade have relied on enormous data sets. Image classification made great strides in the 2010s thanks to the development of ImageNet, a data set containing millions of images hand-sorted into thousands of categories. More recently, GPT-3, a language model that uses deep learning to produce humanlike text, benefited from training on hundreds of billions of words of online text.
Full story: ‘Small Data’ Are Also Crucial for Machine Learning.