Notes on "The Long-Tail Problem in AI, and How Autonomous Markets Can Solve It"

July 27, 2020

#readingnotes #ai #crypto

For my own mental health, I try to limit my VC Twitter exposure, but this article on decentralizing AI by Ali Yahya of Andreessen Horowitz has been making some wider rounds.

Broadly speaking, these are subjects I care about as I watch both the moral hypotheticals in AI grow more real and the decentralization tech improve with each passing year. I’ve accepted that this technology is not going away, because it’s just too darn useful to nation states and all their various apparatuses. To distill my anxiety down into one sentence: we’ve passed the point where the genie is out of Pandora’s box. It seems like our only course of action is to direct the energy into exploring peaceful and safe uses of the technology that benefit everyone.

Reading Notes

A common refrain in 2018 was “AI is communist; crypto is libertarian.” It was first said by Peter Thiel as a tongue-in-cheek observation that, while crypto is a technology movement that strives to decentralize power, AI is one that can’t seem to help but centralize it.

🎵 The best part of waking up are scissor statements in your cup. 🎵

I don’t take issue with the longer explanation, but let’s sprinkle a couple fun facts in here regardless:

  1. Both China and the US <3 surveillance AI.

  2. Peter Thiel has money and influence in Palantir, Anduril, Clearview AI, and Facebook.

And true, there is no denying that having a lot of data does in fact lend giants an advantage. Large datasets of training examples are still a crucial input to even our most data-efficient Machine Learning algorithms, and that’s unlikely to change anytime soon.

I suspect that this article was published in response to the release of OpenAI’s GPT-3, a model with 175 billion parameters; at 32-bit precision, holding the weights alone would take roughly 700 GB of RAM. Mirroring the author’s sentiments here, it really is worrying that everyone except the most well-funded organizations is locked out of low-level research at the moment.

As I understand it, OpenAI’s current stance is that it would be both unethical and impractical to open source the full GPT-3 model; instead, they are building a paid API around it. OpenAI began as a non-profit (it now operates a capped-profit arm), but I do question how much the monetization potential played into that decision.

The truth is that today’s neural networks are fantastic interpolators but terrible extrapolators. They are powerful pattern matchers with the ability to contort themselves to fit almost any dataset, but their fitting is blind to the mechanisms that generate the data in the first place.

I like this. For a while now, my reductionist name for neural networks has been “function estimators” because it cuts through the hype quite well. It’s not even a real dig at the tech. They’re quite good at what they do.
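To make the interpolation-versus-extrapolation point concrete, here’s a toy sketch. A polynomial fit stands in for a neural network (my substitution, not the article’s); both are flexible function estimators that are blind to the mechanism generating the data, and both fail the same way outside the training range:

```python
import numpy as np

# Fit a high-capacity "function estimator" to sin(x) on [-pi, pi].
# A degree-9 polynomial stands in for a neural network here: it can
# contort itself to fit the data, but knows nothing about sine waves.
xs = np.linspace(-np.pi, np.pi, 200)
coeffs = np.polyfit(xs, np.sin(xs), deg=9)

inside = np.polyval(coeffs, 1.0)         # inside the training range
outside = np.polyval(coeffs, 2 * np.pi)  # outside the training range

print(abs(inside - np.sin(1.0)))         # tiny error: great interpolator
print(abs(outside - np.sin(2 * np.pi)))  # large error: terrible extrapolator
```

Inside the interval the fit is nearly indistinguishable from sin(x); one period out, it has already blown up.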

Our neural networks today are architected and trained via the top-down, hierarchical efforts of a group of people that invariably work for the same company. At X, for instance, my team and I were on the hook for everything from scoping the ambitions of the project, specifying the architecture of our networks, tuning their parameters, building our robots from the ground up, and babysitting them (below) as they collected petabytes of data.

That whole bias problem in AI seems a bit more obvious from this vantage point, doesn’t it? There is a cultural centralization affecting these AI systems. I’m not claiming any originality here. People have been shouting about this for a while, but it’s worth reiterating.

(I’m not sure if the author intended for this to be a commentary on bias, but this passage really stuck out.)

But what if such intelligence could actually emerge bottom up? What if it could be born, not from the efforts of just one company, but from the aggregate knowledge of countless people working independently from far afield to contribute diverse signal to the collective?

Enter crypto.

It’s time to interject here that cryptocurrency is not synonymous with decentralization. It’s one particular framing of the larger problem. Off in another corner of the Internet, the ActivityPub ecosystem seems to be doing quite well for itself in spite of the ever-growing stack of obituaries.

With this new bit of logic, our neural network begins to take on a life of its own as a self-funding, autonomous entity. And, as the flywheel is set into motion, the service it offers will begin to become useful to the demand side too — i.e., developers willing to pay real money for its predictions.

Legitimately, I’m terrified of the mess this will create when applied to tasks like facial recognition. This seems like the road towards decentralized surveillance.

Or maybe it’ll end up being more lucrative as a market for reject paparazzi shots. One can only hope.

We must, for example, build into our smart contract a mechanism that can reliably measure signal in the data that is contributed to it. It must have logic that is capable of telling apart garbage data from the real thing.

In order to train our image classifier we require an oracle that can already solve the problem! /s

More seriously, this isn’t referring to a mechanism that can solve the exact same problem. This screening mechanism would probably only need to coarsely identify likely interesting training data. Perhaps a previous generation of the classifier could even be used as the screener to train the next generation?

The screening mechanism is also functionally similar to the sort metric I needed in this previous CIFAR-10 training order experiment. A good sort metric should measure the amount of “good stuff” (as I called it) in each training image.
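Sketching that idea: if a previous-generation model acts as the screener, one cheap signal is its predictive uncertainty. Keep the candidate examples it is unsure about, on the assumption that confidently handled examples carry little new signal. The entropy threshold and the “uncertain means interesting” heuristic are my assumptions, not the article’s:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(probs):
    # Shannon entropy of each row of class probabilities (in nats).
    return -(probs * np.log(probs + 1e-12)).sum(axis=-1)

def screen(logits, threshold=0.5):
    """Keep candidates the previous-generation screener is *uncertain* about."""
    return entropy(softmax(logits)) > threshold

# Screener logits for two candidate training examples (3 classes).
logits = np.array([
    [9.0, 0.0, 0.0],   # screener is confident -> little new signal
    [1.0, 0.9, 1.1],   # screener is unsure -> plausibly worth keeping
])
print(screen(logits))  # keeps only the second, ambiguous candidate
```

A real screening mechanism would need to be robust to adversarial contributors gaming the metric, which a raw entropy threshold is not, but it shows the shape of a “coarse interestingness” filter.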

A related problem is that, because the state of all smart contracts today is visible on the blockchain to anyone who wishes to look, it would be trivial for an attacker to steal the underlying neural net (once trained) and bypass the need to pay for its predictions.

I’ve predicted before that the open source era of machine learning will draw to a close as companies start locking up their trained models as trade secrets, for both monetary and security reasons. Naturally, I’d love to be wrong about this one.

It is because the core protocols of the internet (i.e. TCP/IP) are decentralized, that it is possible for trillion-dollar companies like Google to be built on top of them. By that same token, it is inconceivable that a company the size of Google could be built on top of Google, in the same way that Google was built on top of TCP/IP.

Yes. On the shoulders of protocols. Please.