
Blob storage is a solved problem: what about compute?

Dion Whitehead

2023-11-26

Storage is a solved problem

Storing blobs of data for your web application used to be more involved; I had to think about it. Now I don’t (much). When I build an app or website that needs some blob storage, my thinking goes:

  1. create a bucket, or whatever it’s called, with some cloud provider; it doesn’t have to be the biggest name, they’re all very reliable
  2. put stuff in
  3. get stuff out (a code sketch of steps 2 and 3 follows this list)
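
In code, steps 2 and 3 look roughly like the sketch below. It assumes an S3-compatible API via boto3; the endpoint, bucket, and key names are placeholders, and credentials come from the environment.

```python
# Minimal sketch of "put stuff in, get stuff out" against any S3-compatible provider.
# The endpoint, bucket, and key are placeholders; credentials come from the
# environment (e.g. AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY).
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://example-object-storage.com",  # omit for AWS S3 itself
)

BUCKET = "my-app-blobs"  # step 1: created once via the provider's console or API

# step 2: put stuff in
s3.put_object(Bucket=BUCKET, Key="uploads/hello.txt", Body=b"hello blob")

# step 3: get stuff out
data = s3.get_object(Bucket=BUCKET, Key="uploads/hello.txt")["Body"].read()
```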

I don’t much care where it is. Why? Because it’s a solved problem. Remote blob storage is:

  • very cheap, and getting cheaper
  • reliable [1]

At scale, yes, you do have to think about cost, but those are pretty straightforward business calculations. At anything below very large volumes of data, you don’t have to think much about it, and that is what makes it a solved problem.
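
To show how straightforward the business math is, here is a toy back-of-envelope calculation; the per-GB price is an illustrative assumption, not any provider’s actual rate.

```python
# Back-of-envelope storage cost. The $/GB-month figure is illustrative only.
price_per_gb_month = 0.02   # assumed blob storage price, USD
data_gb = 5_000             # 5 TB of blobs
monthly_cost = data_gb * price_per_gb_month
print(f"~${monthly_cost:.0f}/month")  # ~$100/month
```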

It’s a solved problem in much the same way nature has solved storing information: replication, with automated mechanisms for damage repair and reconciliation [2].
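
As a conceptual sketch of that replicate-and-reconcile idea (not any provider’s actual implementation): keep several copies, detect damage with checksums, and repair a bad replica from a good one.

```python
# Conceptual sketch of replication with checksum-based repair.
import hashlib

def checksum(blob: bytes) -> str:
    return hashlib.sha256(blob).hexdigest()

def reconcile(replicas: list[bytes], expected: str) -> list[bytes]:
    # find any intact copy, then overwrite damaged copies with it
    good = next(r for r in replicas if checksum(r) == expected)
    return [r if checksum(r) == expected else good for r in replicas]

# three replicas, one silently corrupted
replicas = [b"payload", b"payload", b"paylaod"]
replicas = reconcile(replicas, checksum(b"payload"))
assert all(checksum(r) == checksum(b"payload") for r in replicas)
```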

Compute is not a solved problem

Because you have to think about it. What I want:

I give you some application or workflow, for example a machine learning agent, some program I have created, some tool, and from time to time it requires some level of computing power. You are able to automatically and safely connect that application to the right level of compute resources as needed (a rough sketch of this idea follows the list below).

  • If the program runs in the browser, it might be able to use the GPU, but only while the tab is open and running.
  • If the program is downloaded and installed, it has access to your entire computer, but because of that, security and parasitic programs become a problem.
  • If I make available some cloud computing, things suddenly become complex, with lots of decisions, but also with lots of scalable compute resources of different useful types.
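
To make the “right level of compute, automatically” idea concrete, here is a hypothetical sketch: the application declares what it needs, and a resolver picks whichever backend currently satisfies it. All the names here are made up for illustration; this is not metapage.io’s API or anyone else’s.

```python
# Hypothetical sketch of "compute as a commodity": the application describes
# what it needs, and a resolver picks a backend (browser GPU, local machine,
# cloud) that satisfies it. Names and numbers are invented for illustration.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ComputeNeed:
    cpus: int = 1
    gpus: int = 0
    memory_gb: int = 1

@dataclass
class Backend:
    name: str
    cpus: int
    gpus: int
    memory_gb: int

    def satisfies(self, need: ComputeNeed) -> bool:
        return (self.cpus >= need.cpus and self.gpus >= need.gpus
                and self.memory_gb >= need.memory_gb)

def run(job: Callable[[], object], need: ComputeNeed, backends: list[Backend]):
    backend = next(b for b in backends if b.satisfies(need))
    print(f"running on {backend.name}")  # in reality: ship the job there
    return job()

backends = [
    Backend("browser-tab", cpus=1, gpus=1, memory_gb=2),
    Backend("local-machine", cpus=8, gpus=0, memory_gb=32),
    Backend("cloud-gpu-node", cpus=16, gpus=4, memory_gb=128),
]
run(lambda: sum(range(10)), ComputeNeed(cpus=4, memory_gb=8), backends)
```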

Computing resources are inherently valuable, and can often be converted to $$ efficiently via automation.

Obviously storage !== compute, but if compute were as easy, I could distribute complex scientific simulations, revive them years later, and they would “just work”.
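
What would make that “just work” is pinning everything a simulation needs by content rather than by location: an immutable container image digest, content-addressed inputs, and a declared command. A hypothetical job description (the digests are fake placeholders):

```python
# Hypothetical, illustrative job description: if the environment and inputs are
# pinned by immutable digests, re-running years later is "fetch the same bits,
# run the same command". Digests and registry names below are made up.
simulation_job = {
    "image": "registry.example.com/sim@sha256:<image-digest>",
    "command": ["python", "run_simulation.py", "--steps", "100000"],
    "inputs": {"params.json": "sha256:<input-digest>"},
    "outputs": ["results/"],
    "resources": {"cpus": 16, "memory_gb": 64},
}
```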

The team at https://metapage.io aim to solve that problem: compute as a simple commodity.

When we can treat compute as a commodity, we have more power over the compute providers. When they manage to make their system difficult to move away from, we lose bargaining power.

Personally, I default to https://www.digitalocean.com/. This isn’t a paid plug! They just do a great job of offering plenty of options at the right level of complexity/resolution: not too many, not too few.

For my full stacks, I’m using AWS, though not directly, and I would prefer not to.

References

[1] How data is lost in the cloud

https://spanning.com/blog/how-data-is-lost-in-the-cloud/

[2] Mechanisms of ionizing-radiation resistance

https://en.wikipedia.org/wiki/Deinococcus_radiodurans
