Discussion about this post

Morgan's avatar

Back in the day, I used to lead some research efforts on neural network speech recognition. One approach that looked promising (mostly done in labs other than ours) was to train a large model on many hours of speech, and then use that model to generate synthetic data that could be used to train a much smaller model. While the very "deep" and large models were better at learning, the smaller ones were just about as good, and could be pretty shallow and require much less computation for inference. In our lab, we had also used such methods simply to improve our speech recognition models while keeping the same networks. So I can see that it could be helpful. But there is no question that having more real data, especially in something as time-variable as the political landscape, would be better than synthetic data, which probably just fills in a few holes in the models.

Judd's avatar

Feels like the analogy of trying to navigate a car by looking in the rearview mirror. All the data these models are trained on is based on past experience, not current or predicted feelings. Polls take a snapshot of the current situation, and models like those used by the Silver Bulletin try to predict what that means for a future outcome, like driving a car in pitch darkness where you can see directly in front of you but no further. Still better than looking backwards.
