2 Comments

A technical quibble here: in what way do you see "word embeddings" as unsupervised? They are just computed artifacts based on some other metric of goodness cooked up by humans.

We need more social science of the data engineers.

author

After discussing this with my consiglieri, I agree that my distinction between supervised and unsupervised learning doesn’t really make sense. I was half-remembering a Yann LeCun graphic about a cake from around the time period I was describing, and I think the core intuition has to do with how much and what kind of human information is being given to the machine.

What I’m grasping towards is that with simple supervised learning, we’re evaluating the machine only on its ability to recognize human-imposed categories. There’s something else, with very large and non-human-phenomenological data (so, not text, not images), where we begin by telling the machine how to detect machine-style patterns in the data.

The “newsfeed distance” idea is trying to meet in the middle: using our categories and then seeing how they interact with machine categories. And then yes, 100 times yes, to actually do that in practice we need to know what categories the specific data engineers were using.
