Building the data layer of web3: A conversation with Chris Liu

Why web3 needs a decentralized data layer
Blockchains excel at storing transactions securely, but they are not designed for large-scale data processing. Every action recorded on-chain consumes storage space and computational power across the network, making it impractical to use a blockchain for high-frequency data storage.
This is why most dApps still offload their data to centralized storage solutions, such as cloud providers, traditional databases, or content delivery networks (CDNs). While this approach is convenient, it introduces centralization risks, including:
- Censorship and access control – A centralized provider can suspend access or modify data without user consent.
- Single points of failure – If an API or database goes offline, the entire dApp may become unusable.
- Privacy concerns – Users don’t always have control over how their data is stored and shared.
Chris emphasized that building a decentralized data layer is the key to making web3 applications fully independent of centralized intermediaries.
How decentralized data solutions work
To bridge the gap between blockchain and scalable data storage, several decentralized solutions are emerging. Chris explained how these models operate and why they are important:
- Decentralized Storage Networks
  - Protocols like IPFS (InterPlanetary File System) and Arweave allow dApps to store files and retrieve them without relying on a central authority.
  - Unlike traditional cloud storage, these systems distribute files across a peer-to-peer network, making them resistant to censorship and failures.
- Decentralized Indexing and Querying
  - Blockchain data is difficult to query directly, so tools like The Graph create decentralized APIs that allow developers to search blockchain data efficiently.
  - Instead of relying on centralized servers to fetch and sort blockchain transactions, subgraphs allow dApps to access indexed data in a decentralized manner.
- Decentralized Compute Layers
  - Some projects are working on off-chain compute networks that process data outside the blockchain but still maintain decentralized verification.
  - This allows for real-time analytics, machine learning models, and large-scale computations without congesting the main chain.
By integrating these solutions, developers can create applications that are scalable, resilient, and aligned with web3 principles.
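To make the storage idea more concrete, here is a minimal sketch of content addressing, the principle behind networks like IPFS: a file's address is derived from a hash of its contents, so anyone who retrieves it can check that it was not altered. The `ContentStore` class and its `put` and `get` methods are hypothetical names used for illustration only, not the IPFS API. Real networks also split files into chunks and use richer identifiers than a bare SHA-256 hex digest.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, not the IPFS API)."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself, not from a location.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Any peer could have served this block, so re-hash before trusting it.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
address = store.put(b'{"name": "Example NFT"}')
assert store.get(address) == b'{"name": "Example NFT"}'
```

Because the address depends only on the bytes, the same file gets the same address no matter which node stores it, which is what lets a dApp fetch data from untrusted peers.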
The challenges of building decentralized data systems
While the vision for fully decentralized data infrastructure is promising, Chris highlighted some challenges that must be addressed:
1. Performance and Latency
- Fetching data from a decentralized network typically takes longer than querying a centralized database, because the content first has to be located among peers.
- Solutions like caching mechanisms and distributed indexing are being developed to reduce latency.
2. Data Integrity and Security
- Ensuring that stored data remains tamper-proof without relying on a central authority is complex.
- Cryptographic proofs and content-addressed storage help maintain data authenticity.
3. Developer Experience
- Many decentralized data tools are still difficult to integrate with existing web3 applications.
- Improving documentation, SDKs, and infrastructure tools is critical to onboarding more developers.
Chris believes that as these challenges are solved, decentralized data layers will become the standard for web3 applications.
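One common form of the cryptographic proofs mentioned under data integrity is a Merkle inclusion proof, which lets a client confirm that a piece of data belongs to a larger dataset while only holding a single root hash. The sketch below is a simplified illustration, assuming SHA-256, a non-empty list of leaves, and duplication of the last node on odd-sized levels. The function names are hypothetical, and production systems usually add safeguards such as distinct prefixes for leaf and internal hashes.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Return (root, proof) for leaves[index].

    proof is a list of (sibling_hash, sibling_is_left) pairs, bottom to top.
    Assumes leaves is non-empty.
    """
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf, proof, root):
    """Recompute the path from the leaf up to the root and compare."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


records = [b"record-0", b"record-1", b"record-2"]
root, proof = merkle_root_and_proof(records, 2)
assert verify(b"record-2", proof, root)
assert not verify(b"forged", proof, root)
```

The proof grows with the logarithm of the dataset size, so a client can check one record out of millions with only a handful of hashes.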
The real-world impact of decentralized data solutions
Chris shared several industries that could benefit from decentralized data layers:
1. Decentralized Finance (DeFi)
- Real-time price feeds and trading data often rely on centralized oracles.
- Decentralized indexing solutions ensure that financial data remains transparent and verifiable.
2. Gaming and NFTs
- Most NFT metadata is stored off-chain, creating risks if centralized servers disappear.
- A decentralized data layer helps game assets and NFT metadata remain accessible even if the original servers go offline.
3. DAOs and Governance
- Many DAOs rely on centralized voting platforms or Google Docs for proposals.
- A decentralized data layer allows for tamper-proof voting records and transparent decision-making.
4. AI and Machine Learning
- Training AI models requires large datasets, which are usually controlled by a few large tech companies.
- Decentralized compute networks could democratize access to AI training resources.
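As a rough illustration of the tamper-proof voting records mentioned for DAOs, the sketch below chains each vote to the hash of the previous entry, so editing any past record breaks every hash after it. This is only one possible approach, with hypothetical function names and a made-up record layout. It makes tampering detectable rather than impossible, and it assumes the latest hash is published somewhere participants trust, for example on-chain.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    # Serialize with sorted keys so the same record always hashes the same way.
    payload = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()


def append_vote(log: list, record: dict) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append(
        {"record": record, "prev": prev_hash, "hash": entry_hash(prev_hash, record)}
    )


def is_intact(log: list) -> bool:
    prev_hash = GENESIS
    for entry in log:
        expected = entry_hash(prev_hash, entry["record"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_vote(log, {"proposal": 1, "voter": "alice", "choice": "yes"})
append_vote(log, {"proposal": 1, "voter": "bob", "choice": "no"})
assert is_intact(log)

log[0]["record"]["choice"] = "no"  # someone quietly edits an old vote
assert not is_intact(log)
```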
The future of web3’s data infrastructure
Chris believes that the next big evolution in web3 won’t just be about better blockchains; it will be about better data infrastructure.
The industry is moving towards multi-layered architectures, where:
- Layer 1 secures transactions and ownership.
- Layer 2 handles scaling and faster transactions.
- Decentralized data layers manage storage, indexing, and compute power.
This modular approach ensures that web3 applications can scale without sacrificing decentralization or user control.
Chris also emphasized the importance of interoperability. As more projects integrate decentralized data layers, cross-chain compatibility will become essential. Developers should design applications that can operate across multiple ecosystems while keeping data portable and accessible.
Final thoughts
Web3 is about more than just smart contracts and cryptocurrencies. It is about creating a decentralized internet where users control their own data.
Chris Liu’s insights highlight the critical role of decentralized data layers in making this vision a reality. Whether it’s NFTs, DeFi, DAOs, or AI, the ability to store, process, and retrieve data in a trustless manner will define the next phase of web3’s growth.
If you’re a developer, investor, or builder, this episode provides valuable insights into how decentralized infrastructure is evolving and what it means for the future of web3 applications.
Listen to the full conversation
For a deep dive into decentralized data storage, indexing, and computation, listen to the full episode on Spotify or Apple Podcasts.
If you’re excited about building scalable, trustless web3 applications, share this episode with your network!