Dumb sensors are everywhere, and pretty soon they’re going to be smart.[1]
The easiest example for me to visualize is the security camera. Let's say you own a liquor store in a not-so-nice neighborhood. Shoplifting is a pretty common occurrence and you want to catch every instance of it. You set up some cameras, and once a day you sift through the footage to see if you missed anything.
You will quickly realize that reviewing 120 hours of footage a day (24 hours × 5 cameras) where mostly nothing happens is tedious, if not impossible. The juice is not worth the squeeze. But flip a switch, and instead of a human reviewing the footage, you pay $1/day for GPT-o2 to review it,[2] and you're quite plausibly making a respectable return on that investment.
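To make that concrete, here's a minimal sketch of what "pay a model to review the footage" might look like, assuming OpenAI's vision-capable chat API. The model name, prompt, and one-frame-every-ten-seconds sampling rate are all illustrative choices, not recommendations:

```python
# Sketch: sample security footage at a low frame rate and ask a vision model
# to flag possible shoplifting. Model, prompt, and sampling rate are made up
# for illustration.
import base64

import cv2  # pip install opencv-python
from openai import OpenAI  # pip install openai

client = OpenAI()

def review_footage(video_path: str, seconds_between_frames: int = 10) -> list[str]:
    """Return the model's verdict on frames sampled every N seconds."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30
    step = int(fps * seconds_between_frames)
    verdicts, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % step == 0:
            _, jpg = cv2.imencode(".jpg", frame)
            b64 = base64.b64encode(jpg.tobytes()).decode()
            resp = client.chat.completions.create(
                model="gpt-4o",  # stand-in; swap for whatever is cheapest
                messages=[{
                    "role": "user",
                    "content": [
                        {"type": "text",
                         "text": "This is a security camera frame from a liquor "
                                 "store. Does anything look like shoplifting? "
                                 "Answer YES or NO, then one sentence why."},
                        {"type": "image_url",
                         "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
                    ],
                }],
            )
            verdicts.append(resp.choices[0].message.content)
        i += 1
    cap.release()
    return verdicts
```

Note how much work the sampling rate does: one frame every ten seconds is 150× fewer frames than the 15 FPS figure in footnote 2, and a big chunk of getting the bill down is simply not re-reading frames where nothing changed.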
Why stop at detecting shoplifting? A smart camera that’s always watching could, for instance, analyze foot traffic patterns to suggest better shelf placement, or track inventory levels and alert the owner (or place orders itself). Footnote 1 again – the models are getting much smarter!!!
Once the models are smart enough, existing fleets of dumb sensors will suddenly become much more valuable. Cameras are an endless source of examples – baby monitors, police bodycams, dashcams.
But let's shift gears and consider wearables. Heart rate monitors (Fitbit, Apple Watch, Whoop) are much more popular than they used to be. Twenty years ago we had HRMs that could put together a graph of your heart rate. You could stare at the graph, download it, look for trends. That analysis was not automatic, so it was hard to extract value from the data. Now it is. A typical HRM will give you detail on your cardiovascular fitness, level of physical exertion, and quality of sleep and recovery.
Something similar goes for novel sensors monitoring data that was previously difficult to act on. Continuous glucose monitors are an interesting example. Diabetics rely on them to make sure they don't die; non-diabetics have started using them to understand how their bodies react to their diet and activity levels. To bring cameras into it again: hook up some Meta Ray-Bans and you might be able to skip the annoying step of logging the food you eat, and just get a weekly report of insights.
Something I haven't seen but am excited for: monitors that capture a broad range of variables, none too exciting on its own, that become valuable when integrated. For instance, I think you could fit many more sensors into the form factor of a CGM. You could monitor lots of stuff in your interstitial fluid, from hormones to micronutrients. If you managed to pack all of this into one sensor, the ML model on the backend might be able to derive all sorts of useful insights, from dietary modifications to alerts about underlying conditions. There are also less invasive approaches to contemplate – for instance, could you stick sensors in your toilet to gather health insights passively?
In some ways cameras are a shortcut for these multi-sensor approaches. This was Elon's point about using cameras for self-driving: yes, you get more information with LIDAR, but you can do the job with cameras. Instead of monitoring specific variables like, say, wind speed, atmospheric pressure, temperature, and precipitation, you can look at a video feed and get a sense of the weather. It won't be as good, but it just needs to be good enough. If I look out my window and see a sunny day where people are wearing shorts and hanging out at the dog park, I'm going to guess it's 75–80 degrees. Good enough to decide what to wear.
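As a toy version of the camera-as-weather-station idea, you could ask a vision model to estimate the specific variables directly from a frame and return structured JSON. Again a sketch – the model, prompt, and schema are assumptions for illustration:

```python
# Sketch: estimate several "sensor" variables from a single webcam frame.
# The JSON schema here is invented for illustration; accuracy would be rough.
import base64
import json

from openai import OpenAI

client = OpenAI()

def weather_from_frame(jpeg_bytes: bytes) -> dict:
    b64 = base64.b64encode(jpeg_bytes).decode()
    resp = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},  # force parseable output
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Estimate the weather in this photo. Reply as JSON with "
                         'keys "temp_f" (number), "wind" ("low"/"medium"/"high"), '
                         '"precipitation" ("none"/"rain"/"snow"), and '
                         '"confidence" (0 to 1).'},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return json.loads(resp.choices[0].message.content)
```

No single estimate will beat a real thermometer, but like the look-out-the-window guess, it only has to be good enough.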
Could microscopy enable this kind of shortcut for biological variables? If I had a good enough microscope stuck in my arm that sent a video feed to my computer, would analysis of the video be able to tell me some interesting things about what's going on in my body? Not the most practical idea – maybe instead you put a microscope on the air filters in your house to monitor for toxic mold?
Besides sensors, lots of companies have amassed valuable datasets, and smarter models will let them understand what’s in those datasets much better. Consider, for instance, Bloomberg, the ludicrously expensive service relied on by ~every financial analyst. In 2023 Bloomberg released BloombergGPT. What information will a GPT-6-level model be able to extract from the corpus of data on Bloomberg, from financials to SEC filings, particularly when it has a huge context window to work with?
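For a flavor of what that might look like, here's a sketch that assumes a long-context chat model and a locally saved 10-K (the file path and model name are placeholders):

```python
# Sketch: dump an entire SEC filing into a long-context model and ask for
# extraction. File path and model name are placeholders for illustration.
from openai import OpenAI

client = OpenAI()

with open("AAPL_10-K_2023.txt") as f:  # hypothetical local copy of a filing
    filing = f.read()                  # a whole 10-K fits in a 100k+ token context

resp = client.chat.completions.create(
    model="gpt-4o",  # stand-in; the point is context windows keep growing
    messages=[
        {"role": "system",
         "content": "You are an equity research assistant."},
        {"role": "user",
         "content": "Here is a 10-K filing:\n\n" + filing +
                    "\n\nSummarize the three risk factors most specific to "
                    "this company, each with a short supporting quote."},
    ],
)
print(resp.choices[0].message.content)
```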
[1] When a technology is improving at the rate ML models are, you have to constantly remind yourself that this is the worst version we will ever have. The iPhone in 2007 was revolutionary, and it had the worst camera, the slowest processor, the blurriest screen, and the quietest speakers that iPhone users would ever have. The same goes for LLMs and ML models – we'll look back on today's models as dumb.
For instance, Gemini's context window is now 1 million tokens – about twice the length of Anna Karenina; what will that be five years from now?
[2] Currently it seems like GPT-4o costs $0.005525/frame, so at 15 FPS (a decent rate for a security camera) you’re looking at roughly $300 an hour, or over $7,000 a day. Not economical today, but I anticipate in a year or two this calculation will look very different.
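The back-of-envelope, for anyone checking (the per-frame price matches a high-detail GPT-4o image of a 1080p frame – about 1,105 image tokens at $5 per million input tokens; the rest is arithmetic):

```python
cost_per_frame = 0.005525                        # dollars per frame
fps = 15                                         # frames per second
cost_per_hour = cost_per_frame * fps * 60 * 60   # ≈ $298/hour
cost_per_day = cost_per_hour * 24                # ≈ $7,160/day
print(f"${cost_per_hour:,.0f}/hour, ${cost_per_day:,.0f}/day")
```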