IPFS

IPFS is a content-addressed protocol for peer-to-peer hypermedia storage and distribution. The IPFS network is mainly used as a storage layer for decentralized applications. Its main purpose is to add, share, find, and transfer files in a globally distributed file system. Components of IPFS, such as libp2p, the p2p networking library, and IPLD, the data model for content-addressed Merkle DAGs, are used separately by applications that do not participate in the IPFS network.

The IPFS Alpha launched in February 2015. IPFS protocols are designed for upgradeability.

Identity

IPFS nodes have a peer ID, which is the hash of their public key. When peers connect, they exchange public keys and check that each key matches the corresponding peer ID. Communications are encrypted using these keys. Peer IDs are pseudonymous, and can be reset as needed to maintain privacy. Node private keys are stored in the IPFS config by default.
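The key-to-ID check described above can be sketched in a few lines. This is a simplified illustration, not the libp2p implementation: real peer IDs are multihashes computed over a serialized public key, whereas here the ID is assumed to be a plain hex SHA-256 digest, and the key bytes are placeholders rather than real key material. The function names are hypothetical.

```python
import hashlib
import hmac


def peer_id_from_public_key(public_key: bytes) -> str:
    """Derive a simplified peer ID: the hex SHA-256 digest of the key bytes."""
    return hashlib.sha256(public_key).hexdigest()


def verify_peer(claimed_peer_id: str, presented_public_key: bytes) -> bool:
    """Return True if the presented public key hashes to the claimed peer ID."""
    expected = peer_id_from_public_key(presented_public_key)
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(expected, claimed_peer_id)


# Placeholder bytes standing in for a real public key (assumption).
alice_key = b"alice-public-key-bytes"
alice_id = peer_id_from_public_key(alice_key)

print(verify_peer(alice_id, alice_key))        # True: key matches the ID
print(verify_peer(alice_id, b"someone-else"))  # False: key does not match
```

Because the ID is derived from the key, a peer cannot claim another node's ID without also presenting that node's public key.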

IPFS can serve as a content-addressed storage system for decentralized identity solutions (such as Microsoft's ION). IPID is an implementation of the DID (decentralized identifiers) specification over IPFS, using IPNS.

Network

IPFS is a distributed protocol. Every node gets to participate in the network in any configuration, enabling different kinds of network topologies to emerge. Nodes can connect to each other independently of the client (browser, mobile, desktop, command line).

When nodes join the network, they bootstrap from long-lived peers or from peers in their local area network. IPFS comes with a default list of trusted peers, which can be modified. A node can opt in to be part of the main public network and/or an alternative network. Nodes joining the main public network will join the DHT as either clients (consumers) or servers (providers of the content routing service). Nodes can also connect directly to peers they're interested in, either by peer ID or by subscribing to relevant pubsub channels.

All nodes in the IPFS network use libp2p, the modular networking library, to make peer-to-peer connections. Libp2p is transport agnostic, leaving the choice of transport protocol up to the developer, and allowing an application to support many different transports at the same time. Peers dial each other using a multiaddr, a self-describing network address that lets peers know a node's preferred way to be dialed. The use of multiaddrs is intended to future-proof addresses, and allow multiple transport protocols and addresses to coexist. All connections in IPFS are end-to-end encrypted and authenticated using public/private key cryptography.
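To illustrate what "self-describing" means for a multiaddr, the sketch below splits the human-readable form (for example `/ip4/127.0.0.1/tcp/4001`) into protocol and value pairs. It is a minimal sketch under stated assumptions: the `PROTOCOLS` table is a small hand-picked subset, `parse_multiaddr` is a hypothetical helper, and the binary encoding used on the wire is not covered.

```python
from typing import List, Optional, Tuple

# Assumed subset of multiaddr protocols; the value says whether the
# protocol is followed by an argument (an address, a port, a peer ID).
PROTOCOLS = {
    "ip4": True,
    "ip6": True,
    "dns4": True,
    "tcp": True,
    "udp": True,
    "p2p": True,
    "quic": False,
    "ws": False,
}


def parse_multiaddr(addr: str) -> List[Tuple[str, Optional[str]]]:
    """Split a textual multiaddr into (protocol, value) pairs."""
    if not addr.startswith("/"):
        raise ValueError("a multiaddr must start with '/'")
    parts = addr.split("/")[1:]
    result: List[Tuple[str, Optional[str]]] = []
    i = 0
    while i < len(parts):
        name = parts[i]
        if name not in PROTOCOLS:
            raise ValueError(f"unknown protocol: {name!r}")
        if PROTOCOLS[name]:
            if i + 1 >= len(parts):
                raise ValueError(f"protocol {name!r} is missing its value")
            result.append((name, parts[i + 1]))
            i += 2
        else:
            result.append((name, None))
            i += 1
    return result


print(parse_multiaddr("/ip4/127.0.0.1/tcp/4001"))
print(parse_multiaddr("/ip4/192.0.2.1/udp/4001/quic"))
```

Each layer names its own protocol, so a dialer can tell from the address alone which transports it needs to reach the peer.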

Gateways allow IPFS to be accessed over HTTP, which makes content stored in IPFS accessible through a standard browser.
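As a rough illustration, a path-style gateway exposes content under `/ipfs/<CID>/<optional path>`. The helper below builds such a URL. The function name is hypothetical, `https://gateway.example` is a placeholder host rather than a real gateway, and the CID in the usage line is a dummy string.

```python
from urllib.parse import quote


def gateway_url(cid: str, path: str = "",
                gateway: str = "https://gateway.example") -> str:
    """Build a path-style gateway URL for a CID and an optional sub-path."""
    url = f"{gateway.rstrip('/')}/ipfs/{cid}"
    if path:
        # Percent-encode the sub-path but keep '/' as a separator.
        url += "/" + quote(path.lstrip("/"))
    return url


# "bafyexamplecid" is a dummy value, not a valid CID.
print(gateway_url("bafyexamplecid", "docs/read me.txt"))
```

A browser can then fetch the resulting URL with an ordinary HTTP GET, without running an IPFS node itself.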

Data

IPFS is commonly known as a distributed file system, but the data layer is closer to a graph database with elements of linked data. IPFS uses IPLD for representing any piece of data available in the network. The IPLD data model treats all hash-linked data structures as subsets of a unified information space. Higher level abstractions can be built on top of it, like the OrbitDB database, or Textile threads.
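The idea of hash-linked data can be sketched with a toy block store in which every node is addressed by the hash of its serialized form and refers to other nodes by their hashes. This is an illustration only: real IPLD uses CIDs and codecs such as dag-cbor or dag-pb, whereas this sketch assumes JSON serialization and hex SHA-256 digests, and `put`/`get` are hypothetical helpers.

```python
import hashlib
import json
from typing import Dict

# In-memory block store: hash of the serialized node -> serialized node.
store: Dict[str, bytes] = {}


def put(node: dict) -> str:
    """Serialize a node deterministically, store it, and return its hash."""
    data = json.dumps(node, sort_keys=True).encode("utf-8")
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key


def get(key: str) -> dict:
    """Load a node by the hash it was stored under."""
    return json.loads(store[key])


# Two leaves and a root that links to them by hash.
leaf_a = put({"data": "hello"})
leaf_b = put({"data": "world"})
root = put({"links": [leaf_a, leaf_b]})

# Changing one leaf yields a new leaf hash, and therefore a new root hash.
leaf_b2 = put({"data": "world!"})
root2 = put({"links": [leaf_a, leaf_b2]})

print(root != root2)                     # True: the root hash changed
print(get(root2)["links"][0] == leaf_a)  # True: the unchanged leaf is shared
```

Because links are hashes, any change propagates up to the root, while unchanged subtrees keep their addresses and can be shared between versions.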

Files are located in the IPFS network by their CID, a content identifier that is based on the hash of the file. Because the CID is derived from the content, changing a file changes its CID, so this form of addressing does not allow updates in place. However, mutable addresses can be built on top by using the hash of a public key as an address. IPFS's native naming system is IPNS (InterPlanetary Name System), although other naming systems (such as ENS) are compatible. The keypair associated with the address is used to sign content that is published under it. DNSLink is also used to map a domain name to an IPFS address. It is currently faster than IPNS and has the advantage of being human-readable and memorable.
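To make content addressing concrete, the sketch below computes a CIDv1 for a single raw block: a version byte (`0x01`), the `raw` codec (`0x55`), and a SHA-256 multihash (`0x12`, length `0x20`, then the digest), encoded as lowercase base32 with the `b` multibase prefix. The function name is hypothetical, and this is a hedged illustration rather than what an IPFS client necessarily produces for a file: adding a file usually involves chunking and a file-system layer, so the resulting CID may differ from the one computed here.

```python
import base64
import hashlib


def cidv1_raw_sha256(data: bytes) -> str:
    """Compute a CIDv1 (raw codec, SHA-256) for a single block of bytes."""
    digest = hashlib.sha256(data).digest()
    multihash = bytes([0x12, 0x20]) + digest     # sha2-256, 32-byte digest
    cid_bytes = bytes([0x01, 0x55]) + multihash  # CID version 1, raw codec
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded                         # 'b' = base32 multibase prefix


first = cidv1_raw_sha256(b"hello")
second = cidv1_raw_sha256(b"hello")
edited = cidv1_raw_sha256(b"hello!")

print(first == second)  # True: identical content gives an identical CID
print(first == edited)  # False: any change to the content gives a new CID
```

The same bytes always map to the same identifier, which is why an update needs a separate mutable pointer such as IPNS or DNSLink.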

IPFS does not have built-in incentives for nodes to persist content to the network. Users store their own data by pinning it to their local IPFS nodes, communities collaborate in backing up data through tools like collaborative clusters, and enterprises pay third-party pinning services like Infura to ensure availability and reliability. Projects like Storj and Filecoin are building blockchain networks for incentivizing persistent IPFS storage.

If all nodes stop hosting a piece of data, it is no longer available on the network. However, as long as at least one node chooses to continue hosting it, it can still be located by its CID.

Moderation & Reputation

IPFS is mostly a layer below identity and reputation, but libp2p has some low-level primitives around connection management which can be used to encode peer reputation. Peer reputation is based on how reliable the peer is at returning requested data. Both libp2p and IPFS support explicit configuration to avoid or block known bad IP addresses.

Each IPFS node is in full control of the data it pins. Nodes can add a denylist to their configuration, optionally using the one used by public gateways to block DMCA takedown content, malware, and other illegal or pernicious content. There is a proposed design for an autonomy-preserving content moderation system by which nodes can subscribe to denylists from entities they trust to help avoid or filter unwanted content.
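The denylist idea can be sketched as a set lookup performed before a node serves content. This is a hypothetical model, not the format used by public gateways or by the proposed moderation design: the function names, the plain-string CIDs, and the source names are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Set


def build_denylist(local: Iterable[str],
                   published: Dict[str, Set[str]],
                   trusted_sources: Iterable[str]) -> Set[str]:
    """Merge the node's own denylist with lists from sources it trusts."""
    denied = set(local)
    for source in trusted_sources:
        # Lists from sources the node has not chosen to trust are ignored.
        denied |= published.get(source, set())
    return denied


def may_serve(cid: str, denied: Set[str]) -> bool:
    """Return True if the node is willing to serve this CID."""
    return cid not in denied


# Dummy CIDs and source names for illustration only.
published = {
    "gateway-list": {"cid-malware"},
    "other-list": {"cid-disputed"},
}
denied = build_denylist(["cid-local-block"], published, ["gateway-list"])

print(may_serve("cid-malware", denied))   # False: blocked by a trusted list
print(may_serve("cid-disputed", denied))  # True: that list is not trusted
```

Each node decides which lists to subscribe to, so filtering stays a local choice rather than a network-wide rule.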

Social & Discovery

IPFS uses a DHT for finding content in the network. Each host advertises the data it is storing once per day, and these advertisements can be looked up through the DHT. Nodes also discover peers through their local area network, and by bootstrapping with other nodes. The DHT is also used for bootstrapping pubsub channels, which groups can subscribe to for topic-based updates to content they care about.
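The public IPFS DHT is Kademlia-based, which means a lookup looks for the peers whose IDs are "closest" to the hash of the content key under an XOR distance. The sketch below shows only that selection step. It assumes SHA-256 for both peer and content keys and uses placeholder peer names; routing tables, network queries, and provider records are left out, and the function names are hypothetical.

```python
import hashlib
from typing import List


def key_id(name: bytes) -> int:
    """Map a peer name or content key into the shared 256-bit key space."""
    return int.from_bytes(hashlib.sha256(name).digest(), "big")


def closest_peers(content_key: bytes, peers: List[bytes], k: int = 2) -> List[bytes]:
    """Return the k peers whose IDs have the smallest XOR distance to the key."""
    target = key_id(content_key)
    return sorted(peers, key=lambda peer: key_id(peer) ^ target)[:k]


# Placeholder peer names (assumption); real peers are identified by peer IDs.
peers = [b"peer-a", b"peer-b", b"peer-c", b"peer-d"]
print(closest_peers(b"some-content-key", peers, k=2))
```

Provider records for a piece of content are stored on the peers closest to its key, so a node searching for that content can query the same neighbourhood of the key space.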

Privacy & Access Control

Content published to IPFS is public by default. Encryption can be used to add privacy and access control layers on top of IPFS (see Peergos).

Interoperability

IPFS Gateways allow the network to be accessed over HTTP in browsers without native IPFS support.

The IPLD data model is designed so that any kind of hash-linked data can be ingested into IPFS, including blockchains like Bitcoin and Ethereum, and git repos.

Scalability

The IPFS public network currently has hundreds of thousands of nodes. Private networks also run IPFS without connecting to the main DHT, and are not included in the node count.

IPFS nodes have historically had high resource consumption, although improvements and 'low power' settings for weaker devices have since been added.

Metrics

Governance & Business Models

IPFS is developed by Protocol Labs, a VC-funded company that raised over $200 million in a token sale for Filecoin. The core implementations working group, consisting of both employees of the company and external contributors, has decision-making authority over contributions to the IPFS protocol. Libp2p, IPLD, and Filecoin are stewarded by separate working groups.

Contributors from the open source community either volunteer their time, or are funded through companies that have raised money to build on top of IPFS.

Implementations & Applications

Implementations of IPFS exist in Go and JavaScript, and a Rust implementation is under development. Projects like Textile, OrbitDB, and 3box have built additional layers of tooling on top of IPFS to support a wider range of applications.

Examples of tools that have expanded the use cases of IPFS include:

  • Textile buckets - dynamic folders for decentralized applications, distributed over IPFS
  • OrbitDB - a serverless, p2p database that uses IPFS to store data, and IPFS PubSub to sync databases with peers. It uses CRDTs to resolve conflicts.
  • 3box - provides a web application framework that stores data in IPFS

Libp2p is used, independently of IPFS, by other decentralized networks such as Polkadot, ETH2, and Matrix, which is experimenting with it as a transport layer for its p2p version.

A list of applications that use IPFS: https://awesome.ipfs.io/

Ecosystem

Notable p2p applications include:

Enterprise adoptions and integrations include: