“One popular misconception [about machine learning] is that people think they have enough data when they don’t. When people say machine learning, a very large segment of predictions are based on existing data. And in order for that to work, you generally have to have a big labeled set of data,” says Hillary Green-Lerman of Codecademy.
Emphasis on labeled.
“People often don’t realize how much of machine learning is getting data into a format so that you can feed it into an algorithm. The algorithms are actually usually available pre-baked,” Hillary said. “In a lot of ways, you need to know how to pick the best linear regression for your data, but you don’t really need to know the intricacies of how it’s programmed. You do need to work the data into a format where each row is a data point, the kind of thing you’d want to pick.