AI Gets Eyes and Ears | LiveKit’s Russ d’Sa
Welcome to a world where AI uses all of its senses to save lives
We've spent decades teaching ourselves to communicate with computers via text and clicks. Now, computers are learning to perceive the world like us: through sight and sound. What happens when software needs to sense, interpret, and act in real-time using voice and vision?
This week, Andrew sits down with Russ d'Sa, Co-founder and CEO of LiveKit, whose technology acts as the crucial infrastructure enabling machines to interact using real-time voice and vision, impacting everything from ChatGPT to critical 911 responses.
Explore the transition from text-based protocols to rich, real-time data streams. Russ discusses LiveKit's role in this evolution, the profound implications of AI gaining sensory input, the trajectory from co-pilots to agents, and the unique hurdles engineers face when building for a world beyond simple text transfers.
"What is tricky with realtime and with audio and video in particular is that... one, when it's realtime, you do not have a lot of room for corrective measures... [Two], you can make assertions about text... but now you have to [make] assertions around waveforms or around images." —Russ d’Sa
The Download
The Download flips the switch from busy to productive. 💡
1. AI is hallucinating package names and hackers are “slopsquatting” on them 🗑️
Bad actors are getting crafty, registering fake packages based on the AI-generated hallucinations of language models. Dubbed "slopsquatting," this method exploits developers who copy-paste code without verifying package names, leading to the injection of malware through bogus packages like the newly minted python-observe. With AI coding tools becoming more ubiquitous, the risk of supply chain attacks is skyrocketing. How do we guard against agents that might unwittingly import malicious packages?
Read: The Rise of Slopsquatting: How AI Hallucinations Are Fueling a New Class of Supply Chain Attacks
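One possible guard, sketched here as an illustration rather than a recommendation from the linked article: vet every package name an AI tool suggests against a list your team has already reviewed before anything gets installed. The APPROVED set and the vet function below are hypothetical names invented for this sketch, and the similarity cutoff is an arbitrary assumption. Unknown names that closely resemble an approved package are surfaced as likely typos or hallucinations.

```python
import difflib
import re

# Hypothetical allowlist: packages the team has already reviewed.
APPROVED = {"requests", "numpy", "pandas", "opentelemetry-api"}


def normalize(name: str) -> str:
    """Normalize a package name as PEP 503 describes:
    lowercase, with runs of '-', '_' and '.' collapsed to a single '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def vet(names, approved=APPROVED):
    """Split suggested names into (approved, needs_review).

    needs_review maps each unknown name to any approved names that look
    similar, which hints at a typo or a hallucinated variant.
    """
    approved_norm = sorted(normalize(a) for a in approved)
    ok, needs_review = [], {}
    for raw in names:
        name = normalize(raw)
        if name in approved_norm:
            ok.append(name)
        else:
            # cutoff=0.75 is an assumed threshold, not a tuned value.
            needs_review[name] = difflib.get_close_matches(
                name, approved_norm, n=3, cutoff=0.75
            )
    return ok, needs_review


if __name__ == "__main__":
    ok, needs_review = vet(["Requests", "python-observe", "numpi"])
    print("install:", ok)
    for name, similar in needs_review.items():
        print("hold for review:", name, "| similar approved names:", similar)
```

An allowlist only helps if installs are actually gated on it, for example in CI or in the wrapper an agent uses to add dependencies. It also does nothing about a reviewed package that later turns malicious, so treat it as one layer among several.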
2. Microsoft tightens the screws on VSCode extensions, locking out Cursor 🔒
Microsoft has blocked its C/C++ extension from running in non-VS Code tools like Cursor, coinciding with the launch of GitHub Copilot's Agent Mode. While it’s their playground and they can set the rules, this disruption highlights the precarious position of startups relying on larger ecosystems.
Read: Has the VSCode C/C++ Extension been blocked?
3. Chain-of-Vibes: why AI isn't ready for unsupervised coding (yet) 🎢
Pete Hodgson's "Chain-of-Vibes" workflow illustrates how to effectively leverage AI in coding—by breaking tasks into manageable steps and keeping a human in the loop. This approach resonates with many developers who are discovering the power of ✨vibe coding✨. As we navigate this new frontier, the key is to clarify context, select the right AI for the job, and embrace the iterative process. Are you experimenting with this? I’m posting about this on LinkedIn every week. Come join the conversation!
Read: Chain-of-Vibes | Pete Hodgson
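To make "manageable steps with a human in the loop" concrete, here is a minimal sketch of that loop. It is not Pete Hodgson's tooling: run_with_review, generate, and review are hypothetical names made up for this illustration. In practice generate would call whatever AI assistant you use, and review would be a person reading the draft.

```python
from typing import Callable, List, Tuple


def run_with_review(
    steps: List[str],
    generate: Callable[[str, str], str],
    review: Callable[[str, str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> List[Tuple[str, str]]:
    """Work through steps one at a time, moving on only after approval.

    generate(step, feedback) returns a draft for the step.
    review(step, draft) returns (approved, feedback_for_next_attempt).
    """
    accepted = []
    for step in steps:
        feedback = ""
        for _ in range(max_attempts):
            draft = generate(step, feedback)
            approved, feedback = review(step, draft)
            if approved:
                accepted.append((step, draft))
                break
        else:
            # No attempt was approved: stop rather than build on shaky work.
            raise RuntimeError(
                f"step not approved after {max_attempts} attempts: {step}"
            )
    return accepted


if __name__ == "__main__":
    # Stand-ins for an AI assistant and a human reviewer.
    def fake_generate(step: str, feedback: str) -> str:
        return f"{step} (revised)" if feedback else f"{step} (first draft)"

    def fake_review(step: str, draft: str) -> Tuple[bool, str]:
        if "revised" in draft:
            return True, ""
        return False, "please add tests"

    for step, draft in run_with_review(
        ["parse input", "write output"], fake_generate, fake_review
    ):
        print(step, "->", draft)
```

Stopping at the first step that cannot be approved is a deliberate choice in this sketch: later steps usually build on earlier ones, so pressing on tends to compound the problem.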
Transform your code reviews with LinearB's AI solutions 🔧 (sponsored)
Are your code reviews slowing down your team's progress? LinearB offers automations that enhance the review process with AI-powered PR descriptions and instant feedback on every pull request. This means fewer bottlenecks and more focus on building quality software. With smart AI orchestration to flag bugs and suggest improvements, you can eliminate review fatigue and streamline your workflow. It's time to make your code reviews work for you.
Read: Introducing AI-Powered Code Review with LinearB AI
4. The browser wars reboot: blood in the water? 🦈
With Google potentially forced to sell Chrome, a host of contenders (including OpenAI, Yahoo, and DuckDuckGo) are circling like sharks. It’s a pivotal moment where AI, search, and browsing are becoming inseparable. As AI displaces traditional search as our main tool for discovery, browsing itself could look entirely different in the very near future. Will the next big player emerge from Chrome’s shadow, or are we witnessing the start of something new entirely?
Read: OpenAI tells judge it would buy Chrome from Google
5. Hacked crosswalks in Seattle play deepfaked audio 🎤
In a bizarre twist, hackers exploited a simple vulnerability in Seattle's crosswalk signals to play deepfake audio of tech billionaires. This incident highlights how AI can amplify traditional attack vectors. The hackers compromised more than a mundane traffic signal: they also hijacked social media, creating a viral sensation as Seattle residents and visitors uploaded videos of the deepfaked crosswalks online. IoT devices are already deeply integrated into our lives, and the potential for chaotic (but creative) hacks is only going to grow.
Read: Seattle crosswalk signals with deepfake Bezos audio may have been hacked with just a cellphone