Jay Graber 2020-07-08 22:46:49 -07:00
parent bc4870681d
commit 3dc80b0b66
6 changed files with 93 additions and 50 deletions


@ -1,6 +1,6 @@
# Mastodon
Mastodon is a federated Twitter alternative. It is the most popular client using the [ActivityPub](../protocols/activitypub.md) federation protocol.
Mastodon is a federated Twitter alternative, released in 2016. It is the most popular client using the [ActivityPub](../protocols/activitypub.md) federation protocol.
Each server is called an "instance". The entire constellation of instances that can interoperate is called the “Fediverse”.


@ -44,12 +44,12 @@ The distinction between protocols and applications is clearer in the federated s
- Decentralized identity
- In federated applications
- In p2p applications
- Blockchain identity systems
- DIDs
- Blockchain identity
- Key management
- Reputation, Trust
- Failure modes:
- Sybils & spam
- Account loss
- Impersonation
[Data](topics/data.md)
@ -59,9 +59,8 @@ The distinction between protocols and applications is clearer in the federated s
[Discovery](topics/discovery.md)
- Queries
- Curation
- Consistency & availability
- Search
[Moderation](topics/moderation.md)
@ -73,7 +72,7 @@ The distinction between protocols and applications is clearer in the federated s
- User metadata
- Private accounts
- Direct messaging
- Direct messages
[Monetization](topics/monetization.md)


@ -16,9 +16,9 @@ In a recent [roadmap](https://beakerbrowser.com/2020/06/10/roadmap-summer-2020.h
### Networking
Hypercore data-structures are identified by a public key. They may also be identified by URLs which use the `hyper://` scheme and the base64-encoded public key as the domain. A `hyper://` URL may reference any kind of Hypercore-based data structure, including Hyperdrives.
Hypercore data-structures are identified by a public key. They may also be identified by URLs which use the `hyper://` scheme and the base64-encoded public key as the domain. A `hyper://` URL may reference any kind of Hyper-based data structure, including Hyperdrives.
Hypercore uses the [Hyperswarm](https://hypercore-protocol.org/#hyperswarm) networking module. Hyperswarm combines a Kademlia-based DHT for global discovery with mDNS to discover peers on local networks. Users [join the swarm](https://pfrazee.hashbase.io/blog/hyperswarm) for a “topic” and query periodically for other peers who are in the topic. The topic is the hash of the public key that identifies the data being shared. When two peers are ready to connect, Hyperswarm helps create a socket between them using either UTP or TCP. A set of bootstrap servers is used to help connect a new node to the network.
Hyper uses the [Hyperswarm](https://hypercore-protocol.org/#hyperswarm) networking module. Hyperswarm combines a Kademlia-based DHT for global discovery with mDNS to discover peers on local networks. Users [join the swarm](https://pfrazee.hashbase.io/blog/hyperswarm) for a “topic” and query periodically for other peers who are in the topic. The topic is the hash of the public key that identifies the data being shared. When two peers are ready to connect, Hyperswarm helps create a socket between them using either UTP or TCP. A set of bootstrap servers is used to help connect a new node to the network.
The Hyperswarm DHT includes a hole-punching protocol to help nodes establish connections through NATs and firewalls.
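As a rough illustration of how a swarm topic can be derived from a public key, the sketch below hashes a 32-byte key into a 32-byte topic. The hash construction (BLAKE2b keyed with the public key over a fixed label) and the function name `discovery_topic` are assumptions made for this example, not a statement of what Hyperswarm does internally.

```python
import hashlib


def discovery_topic(public_key: bytes) -> bytes:
    """Hash a 32-byte public key into a 32-byte swarm topic.

    Assumption: BLAKE2b-256 keyed with the public key over a fixed label.
    The real derivation may differ.
    """
    if len(public_key) != 32:
        raise ValueError("expected a 32-byte public key")
    return hashlib.blake2b(b"hypercore", digest_size=32, key=public_key).digest()


# Placeholder bytes standing in for a real public key.
example_key = bytes(range(32))
topic = discovery_topic(example_key)
print(topic.hex())
```

Because peers announce and look up the hash rather than the key itself, observing the topic on the DHT does not reveal the public key needed to read the data.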
@ -52,7 +52,7 @@ Beaker browser has implemented an informal [social protocol](https://docs.beaker
By default, Hypercore Protocol connections are encrypted using the [Noise protocol](https://noiseprotocol.org/) encryption framework. Hyperswarm does not hide user IPs or mask metadata.
Hypercore datastructures can only be accessed by users with knowledge of the public key. Because public keys are unguessable, this means they must be shared out of band prior to access (it is not possible to access the data by watching the network).
Hyper datastructures can only be accessed by users with knowledge of the public key. Because public keys are unguessable, this means they must be shared out of band prior to access (it is not possible to access the data by watching the network).
Fine-grained access within Hypercore datastructures, such as access to individual files or folders in a Hyperdrive, is not yet implemented.
@ -62,10 +62,12 @@ Hyper is not interoperable with any other protocol, and has not prioritized impl
The Hyperdrive data-structure is designed to be POSIX-compatible which enables Hyperdrive access via a [FUSE interface](https://en.wikipedia.org/wiki/Filesystem_in_Userspace).
### Scalability & Metrics
### Scalability
Scalability statistics are not available for the Hyper network.
### Metrics
There are about 300 to 400 nodes in the Hyperswarm DHT as of June 2020.
On GitHub, Hyper has [38 contributors](https://github.com/hypercore-protocol/hypercore/graphs/contributors) on the Hypercore module, and 781 dependent projects. The Hyperdrive module has [27 contributors](https://github.com/hypercore-protocol/hyperdrive/graphs/contributors) and 531 dependent projects.
@ -80,6 +82,8 @@ Blue Link Labs's business model is to run cloud Hypercore hosting services. Hype
The [Beaker browser](https://beakerbrowser.com/) is a p2p web browser for Hyper sites.
[Kappa DB](https://github.com/kappa-db/kappa-core/blob/master/intro.md) is an append-only log DB that uses hypercores.
Applications using Hyper include:
- [Blahbity Blog](https://youtu.be/zwR6YyConQI?t=878), a twitter clone built in Beaker browser


@ -1,33 +1,37 @@
# Discovery
In decentralized networks, whether federated or p2p, there is often no global search functionality, as no node has a unified view of the network. This section covers methods of content discovery in decentralized social networks.
Being able to discover great content is key to an engaging social network. This section covers methods of content discovery in decentralized social networks.
Data availability in decentralized social networks can be discussed in terms of the [CAP theorem](https://en.wikipedia.org/wiki/CAP_theorem), which states that a distributed data store can only provide two out of the three guarantees: Consistency, Availability, and Partition tolerance. Centralized social networks are consistent - all users see the most recent state of the network, or an error. Decentralized networks often sacrifice consistency for availability and partition tolerance. Networks that prioritize availability over consistency will always process a query and return the most recent version of the information they have, even if they can't guarantee that it is up to date.
Decentralized social networks take two main approaches to content discovery: "local is better", or "share everything". "Local is better" applications, such as Mastodon and ssb apps, embrace a small-world approach as part of their design philosophy. They do not attempt to offer universal search or content recommendation, as they are not trying to replicate the experience of centralized social apps. "Share everything" approaches, such as Aether, or applications that treat a blockchain as the database, instead choose to share all of the data among all of the peers.
## Curation
Most decentralized social networks use a chronological feed, or rely on upvotes and downvotes to surface content.
### Mastodon
Mastodon servers store content from users followed by members of the server. Users are presented with three timelines: a home timeline with posts from accounts the user follows, a local timeline with posts from the local instance, and a federated timeline with all posts that have been retrieved from remote instances. There is no global search functionality. This issue is being discussed in Mastodon: https://github.com/tootsuite/mastodon/issues/9529
Mastodon's feed is chronological. Users are presented with three timelines: a home timeline with posts from accounts the user follows, a local timeline with posts from the local instance, and a federated timeline with all posts that have been retrieved from remote instances. Servers store content from users followed by members of the server.
Mastodon has [public relays](https://source.joinmastodon.org/mastodon/pub-relay), which rebroadcast anything sent to them to anyone who subscribes to the relay.
To overcome the difficulties of new users finding people to follow to get connected to the network, [Trunk](https://communitywiki.org/trunk/) is a community-built tool that helps users find and follow people by category. Users have requested a global directory for [importing friends from other networks](https://github.com/tootsuite/mastodon/issues/11886). Mastodon users used to be able to find their Twitter friends using `bridge.joinmastodon.org`, but the service was shut down after the developer lost access to API keys and was not granted another set.
Mastodon's feed is chronological, not algorithmic.
Hashtags are used to filter and discover content in ssb, Diaspora, and Mastodon.
### Matrix
All conversations on Matrix take place through rooms, which people either join (if public), peek into (if viewable), or are invited to. Because of its focus on conversations in rooms, there is no focus on having globally discoverable content.
Matrix [optimizes for Availability and Partition Tolerance](https://matrix.org/docs/spec/) at the expense of Consistency. Homeservers model communication history as a partially ordered graph of events known as the room's "event graph", which is synchronised with eventual consistency between the participating servers using the "Server-Server API".
Mastodon [public relays](https://source.joinmastodon.org/mastodon/pub-relay) rebroadcast anything sent to them to anyone who subscribes to the relay.
### Ssb
Content is propagated and discovered through follow relationships in the ssb network. When a follow relationship is initiated, the posts of the user being followed begin to be synced to the follower's node. Those messages and files are stored locally on the user's computer, indefinitely, for applications running ssb to read.
Ssb prioritizes Availability and Partition Tolerance over Consistency. No node has a global view of the network, but can sync data with any other node it is connected to. Ssb applications can be used locally or offline, and the append-only log ensures that data will be ordered correctly, even if it is not up-to-date. Once synced to the network, a user will always see content when they open an ssb application.
## Search
### Blockchain social networks
In decentralized networks, whether federated or p2p, there is often no global search functionality, as no node has a unified view of the network.
Blockchain social networks essentially treat the blockchain as a distributed database guaranteeing global availability of content, prioritizing Consistency. Applications that store data on chain usually do not have Availability, as the user cannot get the most recent state unless they are connected to the blockchain. Tradeoffs of storing social network data on a blockchain include making the data public, immutable, and adding performance overhead for full-node servers that must sync the full state of the chain and perform mining or validation functions.
### Mastodon
Mastodon has no global search functionality. A GitHub issue discusses global indexing: https://github.com/tootsuite/mastodon/issues/9529
To overcome the difficulty new users have in finding people to follow and getting connected to the network, a community-built tool, [Trunk](https://communitywiki.org/trunk/), helps users find and follow people by category. Users have requested a directory for [importing friends from other networks](https://github.com/tootsuite/mastodon/issues/11886). Mastodon users used to be able to find their Twitter friends using `bridge.joinmastodon.org`, but the service was shut down after Twitter API keys were lost and not reissued.
### Ssb
A search of the network will yield results from the data set the user's node has access to.
### Yacy
[Yacy](https://yacy.net/) is a decentralized search engine. Users can define their own web index and start a web crawl.


@ -22,7 +22,9 @@ OAuth is currently the most successful identity standard. OAuth was created to s
### Identity in federated applications:
- XMPP - User identity in XMPP is a username followed by the homeserver, and looks like an email address: `alice@example.com`
Email is the most successful federated social application. As a result, many user identifiers in federated applications look similar to email addresses.
- XMPP - User identity in XMPP is a username followed by the homeserver: `alice@example.com`
- Matrix - User identity in Matrix is a username followed by the homeserver: `@bob:matrix.org`
@ -34,7 +36,7 @@ OAuth is currently the most successful identity standard. OAuth was created to s
### Identity in p2p applications:
P2p systems that put identity entirely in the hands of users must deal with key management, key verification, and key backup. Account recovery is usually not possible, because there is no third party to recover an identity if a user loses their password or key.
P2p systems that put identity entirely in the hands of users must deal with [key management](#key-management), key verification, and key backup. Account recovery is usually not possible, because there is no third party to recover an identity if a user loses their password or key.
- Peergos - Peergos users are identified by unique usernames linked to public keys. The uniqueness of usernames is ensured through a global append-only log for [public key to username](https://book.peergos.org/architecture/pki.html) mappings that is mirrored on every node in the Peergos system. Names are taken on a first come first served basis. Currently, a single server determines the canonical state of this log, and other nodes sync to it. Long-term considerations include decentralizing the name server through a blockchain architecture. Peergos allows [multi-device login](https://book.peergos.org/features/multi.html) through a password-based interface. A user's private keys are derived every time they log in using their username, password and a published salt.
@ -44,15 +46,7 @@ P2p systems that put identity entirely in the hands of users must deal with key
- Aether - Identities in Aether are keypairs. Users can choose a custom nickname, but it is not unique. Multi-device usage is possible, but difficult, and requires manually porting a user config file across devices.
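The Peergos entry above describes private keys being re-derived at each login from a username, a password, and a published salt. A minimal sketch of that idea follows. PBKDF2-HMAC-SHA256, the iteration count, and the name `derive_key_seed` are stand-ins chosen for this example; the key derivation function and parameters Peergos actually uses are not stated here.

```python
import hashlib


def derive_key_seed(username: str, password: str, salt: bytes,
                    iterations: int = 200_000) -> bytes:
    """Derive a 32-byte seed from login credentials and a public salt.

    Assumption: PBKDF2 is used purely for illustration. The seed would
    then be fed to a keypair generator, which is omitted here.
    """
    # Mixing the username into the salt keeps two users with the same
    # password and salt from deriving the same seed.
    material = salt + username.encode("utf-8")
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), material, iterations, dklen=32
    )


seed = derive_key_seed("alice", "correct horse battery staple", b"published-salt")
print(seed.hex())
```

Since the seed depends only on values the user knows or can look up, any device can recompute it, which is what makes password-based multi-device login possible without storing the private key on a server.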
## Decentralized Identifiers (DIDs)
The [DID W3C standard](https://www.w3.org/TR/did-core/) is an emerging standard around decentralized identifiers. [DIDs](https://w3c-ccg.github.io/did-primer/) are a new type of globally unique identifier that do not require a centralized registration authority, and can serve as a decentralized public key infrastructure.
The format of a DID is: a scheme identifier, followed by the DID method, followed by a method-specific identifier. A simple example: `did:example:123456789abcdefghi`
- IPFS - Identity solutions have emerged that use IPFS as a data storage layer for decentralized identifiers. A recent blog post on [IPFS and Decentralized Identity](https://blog.ipfs.io/2020-06-11-identity-ipfs-ion/) lists examples of identity systems on IPFS, including [3ID](https://www.notion.so/3ID-Identity-System-fac2f47862a84602b366af1cd64f3523) and Microsoft's standards-based identity service ION.
### Blockchain Identity
### Blockchain Identity Systems
In 2001, Zooko Wilcox-O'Hearn named three desirable properties of decentralized network identifiers: human-meaningful (memorable), decentralized (global), and secure (unique). This became known as [Zooko's triangle](https://en.wikipedia.org/wiki/Zooko%27s_triangle). Prior to the invention of cryptocurrency blockchains, which enabled decentralized global consensus, it was thought that only two of these three properties could be achieved at one time. Now, many projects have created blockchain-based protocols for naming systems that fulfill all three properties.
@ -64,9 +58,39 @@ In 2001, Zooko Wilcox-O'Hearn named three desirable properties of decentralized
- Handshake - [Handshake](https://handshake.org/) is a blockchain for name registrations.
- Microsoft - [ION](https://techcommunity.microsoft.com/t5/identity-standards-blog/ion-booting-up-the-network/ba-p/1441552) is a Microsoft-led digital identity system built on Bitcoin.
## Decentralized Identifiers (DIDs)
- IBM - IBM is helping to create, operate and maintain [permissioned decentralized identity networks](https://www.ibm.com/blockchain/solutions/identity/networks) built using Hyperledger
[DIDs](https://w3c-ccg.github.io/did-primer/) are a new type of globally unique identifier that do not require a centralized registration authority, and can serve as a decentralized public key infrastructure. The concept is being formalized into an [emerging W3C standard](https://www.w3.org/TR/did-core/). The term "DIDs" was [originally coined](https://github.com/WebOfTrustInfo/rwot5-boston/blob/master/topics-and-advance-readings/did-primer.md) by the W3C Verifiable Claims Task Force in the spring of 2016. The groups researching decentralized identity that converged on the concept of DIDs were inspired by the potential of using blockchains as decentralized, yet global identity registrars.
DIDs aspire to be a [self-sovereign identity](http://www.lifewithalacrity.com/2016/04/the-path-to-self-soverereign-identity.html). They differ from other globally unique identifiers in that they are globally resolvable, decentralized, and cryptographically verifiable. DIDs require a global key-value database, which may be a blockchain, distributed ledger, or decentralized network.
The format of a DID is: a scheme identifier, followed by the DID method, followed by a method-specific identifier. A simple example: `did:example:123456789abcdefghi`
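The three-part format can be illustrated with a small parser. This is a sketch only: the `DID` type and `parse_did` function are hypothetical names, and the code checks the overall shape without validating the full grammar defined in the specification.

```python
from typing import NamedTuple


class DID(NamedTuple):
    scheme: str       # always "did"
    method: str       # e.g. "example"
    identifier: str   # method-specific identifier


def parse_did(did: str) -> DID:
    """Split a DID string into scheme, method, and method-specific identifier."""
    # Split at most twice, since the method-specific identifier
    # may itself contain colons.
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did" or not parts[1] or not parts[2]:
        raise ValueError(f"not a DID: {did!r}")
    return DID(*parts)


parsed = parse_did("did:example:123456789abcdefghi")
print(parsed.method, parsed.identifier)
```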
As of 2020, a [Peer DID Method Specification](https://openssi.github.io/peer-did-method-spec/) is under development, which does not require any central source of truth, and is suitable for private relationships.
#### DID Implementations
DID implementations can store DID documents directly on the blockchain, construct them dynamically based on a blockchain record, or store a pointer on the blockchain to a document in a decentralized storage network like IPFS or STORJ.
There are DID implementations, but few applications, as the technology is still new and untested. The biggest current users of DIDs are applications using [3Box](https://3box.io/), as 3Box creates a DID for the user that is associated with their Ethereum address.
- [3ID](https://www.notion.so/3ID-Identity-System-fac2f47862a84602b366af1cd64f3523) - 3ID is a blockchain-agnostic DID system built by 3Box and Ceramic Network.
- [ION](https://techcommunity.microsoft.com/t5/identity-standards-blog/ion-booting-up-the-network/ba-p/1441552) is a Microsoft-led DID system. It is an implementation of [Sidetree](https://github.com/decentralized-identity/sidetree), a blockchain-agnostic DPKI protocol, that runs on Bitcoin. It stores transaction data in IPFS.
- IBM - IBM is helping to create, operate and maintain [permissioned decentralized identity networks](https://www.ibm.com/blockchain/solutions/identity/networks) that implement DIDs, built using Hyperledger, IBM's permissioned ledger.
## Key Management
Systems that place identities fully in the hands of users, such as p2p systems, blockchain identity systems, and DIDs, encounter the problem of key management. Providing a key management method that is secure yet convenient for users is a major design challenge. Users commonly lose and forget both passwords and cryptographic keys.
The increasing popularity of cryptocurrencies has created new solutions for secure private key management. The most secure solutions, such as [hardware wallets](https://coinfunda.com/best-cryptocurrency-hardware-wallets/) and [third-party custody services](https://www.investopedia.com/news/what-are-cryptocurrency-custody-solutions/#:~:text=Put%20simply%2C%20cryptocurrency%20custody%20solutions,of%20bitcoin%20or%20other%20cryptocurrencies.), are appropriate for high stakes keypairs that may control large amounts of money, but not suitable for social applications that are accessed more frequently and casually.
Web wallets, such as the [Metamask](https://metamask.io/) browser extension for Ethereum, provide a more usable solution for decentralized applications. Most decentralized applications built on Ethereum perform authentication through Metamask.
Brave browser, which enables micropayments between users, advertisers, and publishers, handles [key management for multiple wallets](https://support.brave.com/hc/en-us/articles/360035488071-How-do-I-manage-my-Crypto-Wallets-) natively in the browser.
[Torus](https://tor.us/) is a key management system that allows users to use OAuth with existing user accounts to authenticate with decentralized applications. It uses a Distributed Key Generation protocol and distributes key shards across a network of nodes running a private BFT network. The key is reassembled after the user authenticates.
[Dark Crystal](https://darkcrystal.pw/), a project in the ssb ecosystem, implements social key recovery. User keys are split into shards that are shared with trusted friends and family, and can later be used to reconstruct a lost key.
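One common way to split a key into shards so that a subset can reconstruct it is Shamir's secret sharing. The sketch below assumes that technique for illustration and does not describe Dark Crystal's actual implementation. The field prime, function names, and integer encoding of the secret are all choices made for this example.

```python
import secrets

# A Mersenne prime defining the finite field. Secrets must be smaller than this.
PRIME = 2**127 - 1


def split_secret(secret: int, shares: int, threshold: int):
    """Split `secret` into `shares` points, any `threshold` of which recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= shares:
        raise ValueError("threshold must be between 1 and shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        return y

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shards = split_secret(123456789, shares=5, threshold=3)
print(recover_secret(shards[:3]))
```

With a threshold of 3 out of 5, a user can lose contact with two shard holders and still recover the key, while any two holders acting together learn nothing about it.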
## Reputation & Trust
@ -76,12 +100,12 @@ Reputation in decentralized networks is established using many of the same [mech
- Sybils and spam - Spam, and the creation of many fake users to carry out attacks or misinformation campaigns, are problems for existing centralized social networks. These problems are also present in decentralized networks, and approaches to combat them are still evolving. Federated architectures allow server administrators to intervene and block or filter malicious accounts. However, ongoing harassment and abuse through sockpuppet accounts in Mastodon has motivated the creation of [OCapPub](https://gitlab.com/spritely/ocappub/blob/master/README.org), an object-capability based upgrade of ActivityPub. Steemit, a blockchain social network, requires new user registrations to be approved by a centralized service in order to combat the problem of fake accounts created to rig the voting system that determines monetary rewards for posts. P2p systems also struggle with spam and sockpuppets, although they have not seen a level of adoption that leads to high levels of abuse yet. Aether requires a hash computation to be performed for every event posted, raising the computational power required to mass spam the network.
- Account Loss - Federated networks can allow server admins to help users reset lost or forgotten passwords. For example, Mastodon users can ask their server for a password reset as they would any other service. P2p networks do not generally allow users to recover lost accounts, as there is no third-party to facilitate the exchange. [Dark Crystal](https://darkcrystal.pw/), a project in the ssb ecosystem, implements social key recovery to attempt to address this problem. User keys are split into shards that can be shared with trusted friends and family, and later used to reconstruct a lost key.
- Impersonation - Attempts to impersonate users for fraud or defamation purposes are widespread on centralized social networks. This threat also exists in decentralized social networks, although it has not been exploited to a large extent because these networks have not achieved the same scale and prominence.
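The per-event hash computation mentioned for Aether in the list above is a form of proof of work. The sketch below shows the general idea under assumed parameters: SHA-256, an 8-byte nonce, and a difficulty measured in leading zero bits are choices made for this example and are not taken from Aether.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mine(payload: bytes, difficulty: int) -> int:
    """Search for a nonce whose hash with the payload meets the difficulty."""
    nonce = 0
    while True:
        digest = hashlib.sha256(payload + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce
        nonce += 1


def verify(payload: bytes, nonce: int, difficulty: int) -> bool:
    """Check a claimed nonce with a single hash."""
    digest = hashlib.sha256(payload + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty


post = b"hello, network"
nonce = mine(post, difficulty=12)
print(nonce, verify(post, nonce, 12))
```

Finding a nonce takes about 2^difficulty hashes on average, while verifying one takes a single hash, so the cost falls on the poster and grows linearly with the number of posts.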
## Links
- [What are Decentralized Identifiers](https://www.evernym.com/blog/what-are-decentralized-identifiers-dids/)
- [Decentralizing the Social Web](https://hal.inria.fr/hal-01966561/document)
- [What are Decentralized Identifiers](https://www.evernym.com/blog/what-are-decentralized-identifiers-dids/)
- [DIDs](https://github.com/didecentral/didecentral.github.io)
- [DID Primer](https://github.com/WebOfTrustInfo/rwot5-boston/blob/master/topics-and-advance-readings/did-primer.md)
- [Rebooting the Web of Trust Papers](https://decentralized-id.com/literature/rebooting-web-of-trust/)


@ -1,8 +1,18 @@
# Privacy
Applications designed for public communication require less focus on privacy than those designed for close social circles. However, privacy is still important to consider on several counts: protecting user metadata, respecting private account settings, and supporting private direct messaging.
Applications designed for public communication require less focus on privacy than those designed for close social circles. However, privacy is still important to consider on several counts: protecting user metadata, respecting private account settings, and supporting private direct messages.
### Direct messaging
### User metadata
At a large enough scale, user metadata collected by federated applications becomes a cause for privacy concerns. Examples of these kinds of concerns can be found in this [privacy report on Matrix](https://gitlab.com/libremonde-org/papers/research/privacy-matrix.org), conducted by a privacy-focused nonprofit.
### Private accounts
Mastodon has account-level and post-level privacy controls. When an account is locked, follow requests must be approved. Since posts are copied to the instances of followers, locking an account gives a user more control over where their posts will be distributed.
Individual posts, as well as the default post setting, can be set to "followers-only".
### Direct messages
Many decentralized social applications use e2e encryption to preserve the privacy of direct messages.
@ -15,8 +25,10 @@ Some more e2e messaging encryption options:
- [Noise protocol](http://www.noiseprotocol.org/), used by WhatsApp
- [Messaging Layer Security (MLS)](https://messaginglayersecurity.rocks/)
### Decentralized applications that focus on privacy
### Decentralized social applications focused on privacy
- [Peergos](../protocols/peergos.md) - Peergos provides [capability-based access control](https://github.com/Peergos/Peergos) for files on top of IPFS. Files are kept private. All encryption happens on the client, which could be a native Peergos client or a browser. Data is always encrypted on the servers. Servers do not have access to metadata or sensitive information. Access is controlled through cryptographic capabilities. Access is hierarchical, and stored in an encrypted structure called [cryptree](https://book.peergos.org/security/cryptree.html).
- [Peergos](../protocols/peergos.md) - Peergos provides [capability-based access control](https://github.com/Peergos/Peergos) for files on top of IPFS. Files are kept private. All encryption happens on the client, which could be a native Peergos client or a browser. Data is always encrypted on the servers. Servers do not have access to metadata or sensitive information. Access is controlled through cryptographic capabilities.
- [Zeronet](https://zeronet.io/) - Zeronet is an example of a p2p network that was designed with a focus on privacy. It is a browser for a decentralized network built on BitTorrent and Bitcoin, and instead of having IP addresses, Zeronet site addresses are Bitcoin public keys. ZeroMe is a proof-of-concept Twitter-like social network on Zeronet. It has not received wide usage. Other sites on Zeronet include ZeroTalk (like Reddit), ZeroBlog (microblogging), and ZeroMail (encrypted mail).
- [Zeronet](https://zeronet.io/) - Zeronet is a p2p browser built on BitTorrent and Bitcoin, designed with a focus on privacy. Instead of having IP addresses, Zeronet site addresses are Bitcoin public keys. ZeroMe is a proof-of-concept Twitter-like social network on Zeronet. Other sites on Zeronet include ZeroTalk (like Reddit), ZeroBlog (microblogging), and ZeroMail (encrypted mail).
- [Zbay](https://www.zbay.app/) - Zbay is a Slack-like messaging application with monetary transactions, which uses the Zcash blockchain as a database and transaction settlement layer. User identities are Zcash addresses. Usernames are registered by sending a message to an address everyone has a viewing key for, and providing the new user's public key. Private messages can then be sent to the user's address using encrypted transactions.