IPFS

Overview

IPFS is a content-addressed protocol for peer-to-peer hypermedia storage and distribution. It builds on libp2p (the peer-to-peer transport layer) and IPLD (the data model for content-addressed Merkle DAGs). Its main purpose is to add, share, find, and transfer files in a globally distributed file system without any central controller. IPFS is entirely open source, as are libp2p and IPLD, and is supported by an OSS community of over 4,000 contributors. The IPFS protocol continues to evolve through the main reference implementations in Go and JS, stewarded by the core implementations working group.

The IPFS Alpha launched in February 2015, and the ecosystem has since grown to serve millions of monthly users through hundreds of applications, including social networking platforms (Peepeth, Akasha, Peergos), content distribution networks (Dtube, Everipedia, Audius), decentralized identity solutions (Microsoft ION, 3box, ENS), and a host of other projects (Textile, Terminal, Infura, AnyType, and others). Most users today use IPFS as a content-addressable storage layer to back their decentralized applications, with libp2p for networking and IPLD as a low-level data model. Many groups, like Textile, orbitdb, and 3box, build additional layers of tooling on top of IPFS to support a wider range of developers.

Ecosystem

To build a universal, inclusive, resilient, and sustainable network for human knowledge and information, we need a network that provides technical features such as:

- Ability to connect to any other user in the network, regardless of the device or client the user is on (e.g. browser, mobile, VR, desktop)
- Ability to verify that information is sent to and received by the intended destination, without having to reveal what the data is or who the destination is
- Ability to verify the integrity of information received as a consumer of the network
- Support for the creation of applications and businesses
- Support for an ever-growing number of devices and users
- Ability to adapt and evolve to meet new needs (future-proofing)

These challenges need to be solved at the network fabric level in order to preserve the baseline values of the network.

Network architecture & Connectivity

IPFS is fully peer-to-peer. When joining the network, nodes bootstrap off long-lived peers or peers in their local area network. A node running the IPFS protocol can opt in to the main public network, to an alternative network, or to both at once. Nodes on the main public network join the DHT either as clients (consumers) or as servers (participants that provide the content routing service); they can also connect directly to nodes they're interested in, either by peer ID or by subscribing to relevant pubsub channels. All nodes in the IPFS network use libp2p to arrange peer-to-peer connections. libp2p is now also used by other decentralized networks such as Polkadot and ETH2, and Matrix is experimenting with it as a way to become fully p2p.

IPFS is a distributed protocol: every node gets to participate in the network in whatever capacity it chooses, which lets different kinds of network topologies emerge (distributed, decentralized, federated, centralized, full mesh, and so on).

Peers dial each other using a multiaddr, a self-describing address that tells peers how a node prefers to be dialed. All connections in IPFS are end-to-end encrypted and authenticated using modern cryptographic primitives (public/private key cryptography).
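To make "self-describing" concrete, here is a minimal sketch of splitting a multiaddr string into protocol/value pairs. It is not the libp2p implementation: the protocol table is a small illustrative subset, `parse_multiaddr` is a hypothetical helper name, and the peer ID in the usage note is a placeholder.

```python
from typing import List, Optional, Tuple

# Illustrative subset only: protocol name -> whether a value follows it.
PROTOCOLS = {
    "ip4": True,
    "ip6": True,
    "dns4": True,
    "tcp": True,
    "udp": True,
    "p2p": True,
    "quic": False,
    "ws": False,
}


def parse_multiaddr(addr: str) -> List[Tuple[str, Optional[str]]]:
    """Split a textual multiaddr into (protocol, value) pairs."""
    if not addr.startswith("/"):
        raise ValueError("a multiaddr must start with '/'")
    parts = addr.strip("/").split("/")
    result: List[Tuple[str, Optional[str]]] = []
    i = 0
    while i < len(parts):
        name = parts[i]
        if name not in PROTOCOLS:
            raise ValueError(f"unknown protocol: {name!r}")
        if PROTOCOLS[name]:
            # This protocol carries a value in the next path segment.
            if i + 1 >= len(parts):
                raise ValueError(f"protocol {name!r} needs a value")
            result.append((name, parts[i + 1]))
            i += 2
        else:
            # Value-less protocol such as "quic" or "ws".
            result.append((name, None))
            i += 1
    return result
```

For example, `parse_multiaddr("/ip4/127.0.0.1/tcp/4001/p2p/QmExamplePeerId")` yields the IP address, the TCP port, and the (placeholder) peer ID as separate components, so a dialer can read the full transport stack from the address itself.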

Identity

IPFS is frequently used as a content addressed storage system for decentralized identity solutions (like Microsoft ION for example). This allows IPFS to flexibly serve a variety of opinionated privacy constraints and application-layer preferences around reputation, trust, and anonymity.

At the protocol layer, IPFS peer ids are pseudonymous - and can be reset as needed for increased privacy. IPNS and provider records both rely on time-based revocation after which records expire.
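The idea of time-based expiry can be sketched as follows. This is only an illustration of the concept, not the IPNS or provider record format: `Record`, `RecordStore`, and the `ttl` parameter are hypothetical names, and real records are also cryptographically signed, which is omitted here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Record:
    value: str
    expires_at: float  # absolute time after which the record is invalid


class RecordStore:
    """Toy store in which every record carries an expiry time."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def publish(self, name: str, value: str, now: float, ttl: float) -> None:
        # Republishing the same name replaces the record and extends its life.
        self._records[name] = Record(value, now + ttl)

    def resolve(self, name: str, now: float) -> Optional[str]:
        record = self._records.get(name)
        if record is None or now >= record.expires_at:
            # Expired records are dropped; nobody has to revoke them.
            self._records.pop(name, None)
            return None
        return record.value
```

The clock is passed in explicitly (`now`) so the behavior is easy to follow: a record that is not republished before `expires_at` simply stops resolving.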

Data layer

IPFS is not just for files. In fact, the IPFS data layer is much closer to a graph database with elements of linked data than to a file system. IPFS uses IPLD to represent any piece of data available in the network. This is a low-level data structure on which many higher-level abstractions can be built, like the orbitdb database or Textile threads. These application-layer data models also often include privacy-preserving aspects like end-to-end encryption and access controls (like the boom.fyi exploding links). IPFS content identifiers (CIDs) are immutable; however, many groups also build mutability layers on IPFS using public key cryptography. Our native version of this is called IPNS, but many other versions are also compatible and supported (ENS, etc.).
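The following minimal sketch shows why content identifiers are immutable and how linked nodes form a Merkle DAG. It assumes a plain SHA-256 hex digest as the identifier and JSON as the node encoding; real CIDs and IPLD codecs use different formats, and `BlockStore` and `put_node` are hypothetical names.

```python
import hashlib
import json
from typing import Dict, List


def content_id(data: bytes) -> str:
    """Derive an identifier from the bytes themselves (simplified, not a real CID)."""
    return hashlib.sha256(data).hexdigest()


class BlockStore:
    """Toy content-addressed store: the key is always the hash of the value."""

    def __init__(self) -> None:
        self._blocks: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        self._blocks[cid] = data
        return cid

    def get(self, cid: str) -> bytes:
        data = self._blocks[cid]
        # Anyone can re-hash the block to verify its integrity.
        if content_id(data) != cid:
            raise ValueError("block does not match its identifier")
        return data


def put_node(store: BlockStore, payload: str, links: List[str]) -> str:
    """Store a node that links to other nodes by their identifiers."""
    encoded = json.dumps({"payload": payload, "links": list(links)}, sort_keys=True)
    return store.put(encoded.encode("utf-8"))
```

Because a parent node embeds the identifiers of its children, changing any leaf produces a new leaf identifier and therefore a new root identifier. That is why mutability has to be layered on top, by pointing a stable name at the latest root.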

IPFS doesn't have any built-in persistence incentives; however, it is compatible with many. Users tend to store their own data by pinning it to their local IPFS nodes. Small communities collaborate in backing up data relevant to the group through tools like collaborative clusters. Enterprises pay third-party pinning services like Infura to store their data on IPFS and ensure fast, reliable access. There is also a growing contingent of decentralized incentive layers, including Storj, Filecoin, and others.

Since each user is responsible for pinning their own data, deleting data just requires all the users with the data to stop hosting it. This is good for censorship resistance (individual nodes with valuable data being blacklisted by an authoritarian regime won't delete all copies), and also good for agreed content deletion (where all data hosts can unpin or avoid resolving content deemed bad).
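A toy model of this behavior, under the assumption that a node keeps only what it has pinned once it runs garbage collection. `LocalNode` and `is_available` are hypothetical names, and the string identifiers stand in for real CIDs.

```python
from typing import Dict, Iterable, List, Set


class LocalNode:
    """Toy node: blocks survive garbage collection only while pinned."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}
        self.pins: Set[str] = set()

    def add(self, cid: str, data: bytes, pin: bool = True) -> None:
        self.blocks[cid] = data
        if pin:
            self.pins.add(cid)

    def unpin(self, cid: str) -> None:
        self.pins.discard(cid)

    def gc(self) -> List[str]:
        """Remove every unpinned block and report what was removed."""
        removed = sorted(cid for cid in self.blocks if cid not in self.pins)
        for cid in removed:
            del self.blocks[cid]
        return removed


def is_available(nodes: Iterable[LocalNode], cid: str) -> bool:
    """Content stays reachable as long as at least one node still holds it."""
    return any(cid in node.blocks for node in nodes)
```

If two nodes pin the same content and one of them unpins and collects it, `is_available` still returns `True`; the content disappears from this toy network only after every host has done the same.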

Monetization & Business models

IPFS is fully open source and free to use. Each individual in the network is responsible for persisting the data they care about by either adding their own resources (run a node) or incentivizing another group to persist their data (pay a pinning service). Therefore, the network grows in capacity as new users join. Services that rely on IPFS are incentivized to participate in the public DHT as servers to improve performance and availability of their data - and all participating nodes help with peer-to-peer data transfer and routing.

A number of developers in the IPFS ecosystem are supported by companies who have raised money to build projects on top of IPFS - including Protocol Labs, Textile, Anytype, Infura, 3box, Audius, and many others. While many open source developers contribute pro bono part time, other groups are funded through grants or bounties by one or many of these organizations.

Curation/Discovery

IPFS generally uses a DHT for finding content in the network. Each data host advertises the data it is storing once per day, and consumers can look up those advertisements through the DHT. Nodes also discover peers through their local area network, and by bootstrapping with other nodes in the network. The DHT is also used for bootstrapping pubsub channels, which groups can subscribe to for topic-based updates to content they care about.
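The lookup idea can be sketched with a Kademlia-style XOR distance, which is the family of DHT that IPFS builds on. Everything else here is a simplification and an assumption for illustration: `ToyDHT` keeps all records in one process, there is no networking, routing table, or record expiry, and the peer and content identifiers are placeholders.

```python
import hashlib
from typing import Dict, Iterable, List, Set


def dht_key(value: bytes) -> int:
    """Map a peer ID or content ID into one shared key space."""
    return int.from_bytes(hashlib.sha256(value).digest(), "big")


def closest_peers(peers: List[bytes], content_id: bytes, k: int = 2) -> List[bytes]:
    """Return the k peers whose keys have the smallest XOR distance to the content key."""
    target = dht_key(content_id)
    return sorted(peers, key=lambda peer: dht_key(peer) ^ target)[:k]


class ToyDHT:
    """Provider records live on the peers closest to the content key."""

    def __init__(self, peers: Iterable[bytes]) -> None:
        self.peers: List[bytes] = list(peers)
        self.records: Dict[bytes, Dict[bytes, Set[bytes]]] = {p: {} for p in self.peers}

    def provide(self, content_id: bytes, host: bytes) -> None:
        # A host advertises that it stores content_id.
        for peer in closest_peers(self.peers, content_id):
            self.records[peer].setdefault(content_id, set()).add(host)

    def find_providers(self, content_id: bytes) -> Set[bytes]:
        # A consumer asks the same closest peers who provides content_id.
        found: Set[bytes] = set()
        for peer in closest_peers(self.peers, content_id):
            found |= self.records[peer].get(content_id, set())
        return found
```

Both the advertiser and the consumer compute the same set of closest peers from the content ID alone, which is what lets them meet without a central index.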

These tools can be mixed and matched by applications developing on IPFS to create very flexible structures for curating and sharing data. A great example of this is Textile threads and buckets - which are both higher-level structures built on IPFS.

Moderation & Reputation

Libp2p has some low-level primitives around connection management that can be used to encode peer reputation (how good has this peer been at sending me the data I'm asking for); however, IPFS is mostly a layer below identity and reputation right now. Both libp2p and IPFS do support explicit configuration to avoid or block known bad IPs.

In terms of content moderation, each IPFS node is in full control of the data it pins, and we have early designs for an autonomy-preserving content moderation system in which nodes can subscribe to denylists from entities they trust to help avoid or filter unwanted content.
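Since the text describes only early designs, the following is a purely hypothetical sketch of the subscription idea, not an existing IPFS API. `ModeratedNode`, its methods, and the source names are invented for illustration.

```python
from typing import Dict, Iterable, List, Set


class ModeratedNode:
    """Toy node that filters content using denylists it chose to trust."""

    def __init__(self) -> None:
        # Source name -> identifiers that source asks subscribers to avoid.
        self.denylists: Dict[str, Set[str]] = {}

    def subscribe(self, source: str, blocked: Iterable[str]) -> None:
        self.denylists[source] = set(blocked)

    def unsubscribe(self, source: str) -> None:
        # The node stays autonomous: it can drop any list at any time.
        self.denylists.pop(source, None)

    def is_allowed(self, cid: str) -> bool:
        return all(cid not in blocked for blocked in self.denylists.values())

    def filter(self, cids: Iterable[str]) -> List[str]:
        return [cid for cid in cids if self.is_allowed(cid)]
```

The design choice worth noting is that filtering is local and opt-in: two nodes subscribed to different lists will serve different subsets of the same content, and neither decision is enforced on the rest of the network.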

Scalability

The IPFS public network currently has hundreds of thousands of nodes, but there are also many private networks running IPFS without connecting to the main DHT. Most nodes participate as DHT clients, using the network to find desired content or propagate messages or data to other peers.

Other groups have built distributed search indexes over the public DHT either through incentivized curation or by introspecting public data announced to the wider network.

Governance

The core implementations working group is responsible for reviewing and then merging or rejecting internal and external contributions to the IPFS protocol; these decisions are made by the working group rather than through broader consensus. There is also a wider community of 4000+ OSS contributors helping improve and test IPFS.

IPFS contributors work closely with contributors to both libp2p and IPLD, since many features or improvements require cross-cutting collaboration. However, all three protocols are independently stewarded, and each optimizes for its own end users.

OrbitDB is a serverless, distributed, p2p database. It uses IPFS as its data storage, and IPFS PubSub to sync databases with peers. It uses CRDTs to resolve conflicts.
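To illustrate how a CRDT resolves conflicts without coordination, here is a minimal grow-only counter (G-Counter), one of the simplest CRDTs. It is a generic textbook example chosen for brevity, not OrbitDB's actual data structure, and the replica names are placeholders.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: each replica only ever increases its own slot."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Taking the per-replica maximum is commutative, associative, and
        # idempotent, so replicas converge regardless of sync order.
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)
```

Two peers can update their replicas while disconnected and later exchange state in any order, any number of times; after merging in both directions they report the same value.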