Magic Wormhole is What?

March 27, 2025

(Follow-up to Sending a File in 2025).

magic-wormhole is a library and command-line tool (written in Python) which makes it possible to securely and easily get arbitrary-sized files and directories (or short pieces of text) from one computer to another.

Two wizard characters demonstrating Magic Wormhole: speak the same code, receive messages from your peer

The library and protocol themselves allow for much more general use: anything that passes messages between two peers.

(Note that there are implementations besides the Python one; see The Magic Wormhole Ecosystem).

Doesn’t TCP Do That?

It sure does.

Unfortunately, the problem of NAT (Network Address Translation) didn’t go away. There is also the matter of “security”: pre-sharing keys or setting up accounts takes prior work.

The Magic Wormhole protocol solves both of these problems: you get a secure connection between two computers, even if they’re both behind a NAT.

On top of this connection one can run any sort of protocol that works over a TCP stream (or any other kind of stream). The wormhole command-line tool uses this core protocol to implement file-transfer – but many other possibilities exist.

Learning about Magic Wormhole

We will learn about both the “library” uses of Magic Wormhole and the main “file transfer” use-case in the CLI software.

I am working on ways for all languages to integrate without implementing this protocol directly (see fowl).

There are direct implementations in Python, Rust and Haskell (the Rust and Haskell implementations aren’t as complete as the Python one). In a subsequent blog post, I will discuss “Dilation” (which adds “durability”, arbitrary sub-channels and peer-to-peer connections to the core protocol).

Ten Thousand Feet

At the very highest level two applications open “a wormhole” by both using the same “magic code” at approximately the same time. They then pass (end-to-end) encrypted messages over the connection. These messages can only be read by two computers (the one that created the code, and the one that consumed it).

A simple diagram of Magic Wormhole interaction, depicting two ends that establish E2EE communication via a secret code

At the core of the security argument is “Password Authenticated Key Exchange” (PAKE). Magic Wormhole uses “SPAKE2”. Although the codes are short, security is maintained by only allowing a single guess (see below for more on this).

Thanks: All diagrams in this post were made using the fantastic FOSS vector drawing program Inkscape. Cool wizard character from Cream Soda Popsicle.

There Are Some Downsides

So we get end-to-end encrypted messages between any two computers, by humans transcribing a short code.

Sounds great! Why not use this for everything?

In order to avoid long, complicated URLs or some sort of discovery protocol, all participants use “a Mailbox server”. Brian Warner runs the default one, at ws://relay.magic-wormhole.io:4000/v1. (You may self-host your own, but all participants need to use the same one to communicate successfully).

So, by exchanging some messages with the Mailbox server, both sides have a single chance to use the same password and mailbox combination to get a shared key. The peers then exchange end-to-end encrypted messages using this key.

That is, both applications have a way to pass messages to each other and only two computers (total) can possibly have the shared key. Usually, these two computers would be the intended ones – malicious parties (including a possibly malicious Mailbox server) get a single guess.

These Mailbox messages are not for bulk data transfer or large messages.

For the “file transfer” part, the CLI application passes some “hint” messages to attempt direct connections, at least one of which usually includes a hint to use a public “transit relay”. It is also possible to use Tor.

All clients initially connect via a single central server, which presents some problems: metadata leakage, a single point of failure, and a limit on how far it can scale.

Since the Mailbox server sees all the clients, it learns metadata about message timings and sizes (but not content). Clients can use Tor to avoid revealing their network location, but the server still learns some information.

The server could also mount a selective denial-of-service attack against clients (for example, refusing service to certain IP addresses based on inferred geolocations). Again, Tor could be used in this case.

A single server doesn’t scale. That said, a “single machine” can be quite powerful these days. Partially mitigating this is the fact that different applications could have different default servers (for example, application authors can host their own) allowing some amount of decentralization.

I would love to hear of any ideas you have for distributing a single (logical) Mailbox Server across many computers!

Some More Detail

The above high-level view glosses over many details.

Adding in the details from the last section, we get a diagram like this:

The application code at each end makes some calls into the library: one side “creates” a short code (at 1), and the other side “consumes” it (at 2). These codes have two parts: the Nameplate (the short number) and the rest, separated by a “-” (a hyphen). In the example this is 2-newsletter-dogsled.
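To make the two-part structure concrete, here is a minimal sketch that splits a code at the first hyphen. The function name split_code and its validation rules are my own illustration, not part of the magic-wormhole library.

```python
def split_code(code):
    """Split a wormhole-style code into (nameplate, secret_words).

    Hypothetical helper for illustration only: everything before the
    first hyphen is treated as the Nameplate, the rest as the secret.
    """
    nameplate, sep, secret_words = code.partition("-")
    if not sep or not nameplate.isdigit() or not secret_words:
        raise ValueError("not a valid code: %r" % (code,))
    return nameplate, secret_words


nameplate, secret_words = split_code("2-newsletter-dogsled")
print(nameplate)     # 2
print(secret_words)  # newsletter-dogsled
```

Only the Nameplate part ever needs to be shown to the Mailbox server; the secret words stay on the two endpoints.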

Usually, these codes are somehow transmitted between one or two humans who control the computers in question. For example, a human wishing to send a file sees the code output by wormhole send and reads it to their friend sitting beside them. This second human then can give the code as input to their program (e.g. wormhole receive 2-newsletter-dogsled).

Or, a human may wish to connect their phone and their laptop.

In the wormhole send with wormhole receive case, these programs then proceed to do a file transfer. However, it’s important to remember that this “file transfer” portion of the protocol can be replaced with any sort of protocol at all.

For example, the Tahoe-LAFS system uses it when inviting a new user: secret connection details are passed via the secure Mailbox. The Magic Folder tool also uses the Wormhole protocol to exchange secret information, this time granting access to a shared synchronized folder (see invite.py:164).

A Note on Naming

Some of the names involved here follow an analogy to a traditional North American “post office”. Our magic post-office has an “infinite” number of mailboxes (actually, only more than the number of atoms in the universe). By default, none of these mailboxes have any labels on them. (Perhaps we could name this “The Hilbert Post Office”).

The “Nameplates” refer to temporary labels mapping a short number (e.g. “2” or “17”) to an actual mailbox (which has a large, random ID).

The Mailbox is a relatively longer-lived entity, facilitating clients re-connecting in the face of intermittent networking. Mailboxes can last forever, as long as at least one side is connected.

In contrast, the Nameplates are allocated only until the second side has connected to the shared mailbox. After this, they are free to be used by a subsequent connection (mapping to a different random mailbox next time).

By default, the Mailbox Server removes a mailbox 10 minutes after the last side has disconnected (re-setting this timer whenever a client returns). The Mailbox merely contains an unordered bag of (encrypted) messages – any client that successfully connects receives all of the messages (but is only able to decrypt them if they have the correct shared secret).
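The relationship between Nameplates, Mailboxes and the timeout can be sketched as a toy in-memory model. This is not the real server: the class and method names, the "lowest free number" allocation policy and the mailbox ID format are all assumptions made for illustration.

```python
import secrets
import time


class ToyMailboxServer:
    """Hypothetical in-memory stand-in for a Mailbox server."""

    def __init__(self, timeout_seconds=600):
        self.timeout = timeout_seconds
        self.nameplates = {}  # short number -> mailbox ID
        self.mailboxes = {}   # mailbox ID -> list of (phase, body)
        self.left_at = {}     # mailbox ID -> when the last client left

    def allocate(self):
        """Map a free nameplate (assumed: lowest free) to a fresh mailbox."""
        n = 1
        while str(n) in self.nameplates:
            n += 1
        mailbox_id = secrets.token_hex(8)  # assumed ID format
        self.nameplates[str(n)] = mailbox_id
        self.mailboxes[mailbox_id] = []
        return str(n), mailbox_id

    def claim(self, nameplate):
        """Look up the mailbox behind a nameplate."""
        return self.nameplates[nameplate]

    def release(self, nameplate):
        """Free the nameplate for re-use; the mailbox itself lives on."""
        self.nameplates.pop(nameplate, None)

    def add(self, mailbox_id, phase, body):
        self.mailboxes[mailbox_id].append((phase, body))

    def messages(self, mailbox_id):
        """Any client that opens the mailbox receives every message in it."""
        return list(self.mailboxes[mailbox_id])

    def all_disconnected(self, mailbox_id, now=None):
        """Start the expiry timer once the last client has gone."""
        self.left_at[mailbox_id] = time.monotonic() if now is None else now

    def client_returned(self, mailbox_id):
        """A returning client cancels the pending expiry."""
        self.left_at.pop(mailbox_id, None)

    def prune(self, now=None):
        """Drop mailboxes that have been empty of clients for too long."""
        now = time.monotonic() if now is None else now
        for mailbox_id, left in list(self.left_at.items()):
            if now - left > self.timeout:
                self.mailboxes.pop(mailbox_id, None)
                del self.left_at[mailbox_id]
```

Note how release() only removes the short label: the Mailbox and its messages survive until prune() decides that everyone has been gone for longer than the timeout.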

Creating and Using a Code

Let’s look at the steps involved:

We start from one Side’s perspective, which we’ll call “Initiator”.

The Initiator side contacts the Mailbox server, allocates “a Nameplate” and corresponding Mailbox
Supposing the server had the Nameplate “2” free, it could return that (mapping it to a random, fresh Mailbox)
This same Side now creates a secret, selecting (by default two) words from the PGP Word List
It can now show the human a code: "2-newsletter-dogsled" (where “newsletter” and “dogsled” are two words selected at random)
Using the secret value represented by "newsletter-dogsled", it performs SPAKE2 Symmetric and puts the resulting message into the Mailbox (as a "pake" phase message)
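The code-creation step above can be sketched in a few lines. The word list here is a tiny placeholder (the real PGP Word List is far larger), and make_code is a hypothetical helper rather than the library's own function.

```python
import secrets

# Placeholder words for illustration only; NOT the real PGP Word List.
PLACEHOLDER_WORDS = ["newsletter", "dogsled", "alpha", "bravo", "charlie", "delta"]


def make_code(nameplate, word_count=2, words=PLACEHOLDER_WORDS):
    """Join a server-issued nameplate with randomly chosen secret words."""
    # secrets.choice draws from the operating system's secure random source.
    chosen = [secrets.choice(words) for _ in range(word_count)]
    return "-".join([nameplate] + chosen)


print(make_code("2"))
```

The important property is that the words are chosen locally with a cryptographically secure random source, so the server that issued the Nameplate never sees them.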

Now, the human may communicate this code to another human – or type it into a different device of their own.

That new device is the other “Side”. We’ll call this second Side “Responder”. The Responder now “consumes” the code:

Connecting to the Mailbox, the Responder “claims” the Nameplate from the code typed in by the human (“2” in this case)
Supposing no other client has tried “2” in the interim, this will succeed
The Responder now has the same Mailbox as the Initiator open, so the nameplate “2” is de-allocated
There was one message from the Initiator in the Mailbox already: the phase=pake message. This message is delivered to the Responder
The Responder now completes the corresponding SPAKE2 Symmetric math and puts the resulting message into the Mailbox (as its own “phase=pake” message)

Now the fun begins! Assuming both clients are connected (as would be the case without any network disruptions) the Responder’s PAKE message is delivered immediately via WebSocket to the Initiator.

Both clients now complete the SPAKE2 protocol (i.e. they each feed the other side’s message as the reply, completing the SPAKE2 algorithm). If both Sides used the same password (i.e. "newsletter-dogsled") then this results in the same key. If they used a different password (e.g. the second human mistyped it, or a malicious party tried a guess at Nameplate “2”) then the secret will be different.
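To see why a matching password yields a matching key, here is a toy version of the symmetric exchange using plain modular arithmetic. It is emphatically not the real SPAKE2 (the group, the constants G and S, and the transcript hashing are all simplified assumptions) and must not be used for anything but building intuition.

```python
import hashlib
import secrets

# Toy parameters: NOT the real SPAKE2 group and NOT secure.
P = 2**255 - 19  # a well-known prime, used here only as a modulus
G = 2            # base element
S = 3            # second public element, used for password blinding


class ToySide:
    """One end of a simplified, symmetric PAKE-like exchange."""

    def __init__(self, password):
        w = int.from_bytes(hashlib.sha256(password).digest(), "big")
        self._x = secrets.randbelow(P - 2) + 1  # private random exponent
        blind = pow(S, w, P)                    # password-dependent blinding
        self._unblind = pow(blind, P - 2, P)    # modular inverse (Fermat)
        # This value is what would go into the Mailbox as the "pake" message.
        self.message = (pow(G, self._x, P) * blind) % P

    def finish(self, their_message):
        # Strip our idea of the blinding, then raise to our private exponent.
        shared = pow((their_message * self._unblind) % P, self._x, P)
        # Sort the two messages so both sides hash an identical transcript.
        first, second = sorted([self.message, their_message])
        transcript = b"|".join(str(v).encode() for v in (first, second, shared))
        return hashlib.sha256(transcript).digest()


initiator = ToySide(b"newsletter-dogsled")
responder = ToySide(b"newsletter-dogsled")
print(initiator.finish(responder.message) == responder.finish(initiator.message))

guesser = ToySide(b"newsletter-dogfood")  # a wrong guess
print(initiator.finish(guesser.message) == guesser.finish(initiator.message))
```

With the same password the blinding cancels and both sides arrive at the same value; with a wrong password the blinding does not cancel, so the two keys differ and neither side learns the other's password.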

Both sides now send a “phase=version” message that serves two purposes:

gives applications somewhere to put version-negotiation information;
confirms that the key is shared (i.e. if the message decrypts successfully to valid JSON)

This message serves as the thing called “Key Confirmation Message” in the SPAKE2 RFC.

These application messages are symmetrically encrypted with a key derived from the shared secret; when Dilation is used, the application messages are encrypted with Noise (specifically “Noise_NNpsk0_25519_ChaChaPoly_BLAKE2s”).
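The "decrypts successfully to valid JSON" check can be illustrated with a toy authenticated cipher built from the standard library. This is only a stand-in for whatever authenticated encryption the real implementation uses: the seal/open_sealed functions and the payload contents are assumptions for illustration.

```python
import hashlib
import hmac
import json
import secrets


def _keystream(key, nonce, length):
    # Toy keystream derived from the key and a per-message nonce.
    return hashlib.shake_256(key + nonce).digest(length)


def seal(key, plaintext):
    """Toy encrypt-then-MAC; a stand-in for a real authenticated cipher."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def open_sealed(key, blob):
    """Verify the tag, then decrypt; raises ValueError on a wrong key."""
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted message")
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def key_confirmed(my_key, their_version_blob):
    """True if the peer's version message opens and parses as JSON."""
    try:
        json.loads(open_sealed(my_key, their_version_blob))
        return True
    except ValueError:
        return False


shared = secrets.token_bytes(32)
blob = seal(shared, json.dumps({"example": "version info"}).encode())
print(key_confirmed(shared, blob))                   # True
print(key_confirmed(secrets.token_bytes(32), blob))  # False
```

A side holding a different key cannot open the message at all, which is exactly the signal that the PAKE step did not produce a shared secret.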

Aside: These Codes are Really Short?!

The codes have 16 bits of entropy. As per the official documentation, there is a very low probability of a successful Machine in the Middle attack.

It is very important to realize that there can be only a single attempt at using a particular mailbox’s code, whether that attempt comes from the legitimate peer, or a malicious server, or a malicious third-party.

This is because of the features of PAKE, and in particular SPAKE2 as used here. When “making a guess”, you add another message to the Mailbox which is then delivered to the first peer. At this point, one of two things will be true: either the password part is correct, or it is not.

Both peers learn this when exchanging the “version” confirmation messages: if the password was correct, they can decrypt and parse the message. So if a malicious peer (including the server) guessed incorrectly, the legitimate peer learns about the problem.

Thus, a malicious entity gets a single guess, with a 1 in 65536 chance of getting it right. Assuming each retry uses a freshly generated code (so every guess is independent), they’d have to get the humans to re-try roughly 45,000 times for even odds of success, or 65,536 times on average (with the legitimate peer noticing each time). No attacker can expect that many tries!
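A short calculation makes these odds concrete. It assumes every retry uses a freshly generated code, so each guess is an independent 1-in-65536 event; under that assumption the number of attempts needed follows a geometric distribution.

```python
import math

CODE_SPACE = 2 ** 16  # two words at 8 bits each
p = 1 / CODE_SPACE    # chance that a single guess is right


def chance_within(attempts):
    """Probability of at least one correct guess in `attempts` independent tries."""
    return 1 - (1 - p) ** attempts


# Mean of a geometric distribution: average number of tries until success.
expected_attempts = 1 / p

# Smallest number of tries giving at least a 50% chance of success.
even_odds = math.ceil(math.log(0.5) / math.log(1 - p))

print(expected_attempts)  # 65536.0
print(even_odds)          # 45426
print(chance_within(32000))
```

Even 32000 forced retries leave the attacker with less than a 40% chance of a single success, and every one of those failures is visible to the legitimate peer.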

Okay, Wait, Wat?

We added a lot there!

Let’s look at how the diagram looks now and break down more of this.

A more complete picture of the Wormhole protocol

So the stuff that the Initiator (on the left) does first is captured in box “1” (with green borders). It connects via WebSocket to relay.magic-wormhole.io and exchanges a few messages.

At the end of these exchanges, it has enough information to create a code. Note that only it knows the full secret code; the server only knows the Mailbox ID and Nameplate.

The interactions between the applications and the humans are captured at the bottom. Now that the code “2-newsletter-dogsled” has been produced, it is somehow shown to the human (box “2”) who communicates to their partner (box “3”). (Ideally this communication too is secure, but we discuss threats and considerations below!)

On the Responder computer, the human inputs code “2-newsletter-dogsled” (box “4”) and the software completes the parts in box “5” / pink borders. After this, both sides have exchanged the PAKE messages and will immediately each send a “version” message, confirming the correct key.

Now the application protocol takes over: it has a secure channel that lasts as long as both sides keep connected (including the chance to re-connect). The server’s mailbox timeout (by default 10 minutes) mentioned earlier comes into play here: if both sides are disconnected for more than 10 minutes then the Mailbox is de-allocated and all messages are discarded. Note that a malicious Mailbox server could keep these messages, but would need to recover the PAKE secret from one of the endpoints – and non-malicious software will have discarded it already.

Application Protocols

At this point, we have learned about the important points of the initial / core “Magic Wormhole” protocol.

Both applications know a Mailbox ID (32 bits) and have a shared secret (256 bits). Symmetrically encrypted messages using the shared secret may be added to the Mailbox and are pushed (in real-time usually) to the other side. Either side can re-connect and will receive all messages in the Mailbox. Only two computers have the shared secret.

What is put into these subsequent messages is referred to as “the application protocol”. The main one most people encounter is the file-transfer protocol.

For developers: here is where you can imagine what you could do on top of this core protocol. Some ideas:

initial setup or key exchange (Tahoe-LAFS uses this to exchange access credentials)
sharing configuration or capabilities (Magic Folder does this to invite participants to a folder)
bootstrapping a peer-to-peer or similar higher-bandwidth connection (the “file transfer” protocol does this, as does Dilation)
securely pairing on “initial device setup” (e.g. IoT use-cases, WireGuard key exchange, SSH keys, etc).
…

Here Be Yaks: Even More Details

The deeper we delve, the less interesting this is for “normal” users and the more interesting it is for developers or cryptographers attempting to understand or break the system.

Please do contact meejah securely if you believe you’ve discovered a security-relevant issue.

With a visualization of the Mailbox Server’s message database, a more complete diagram looks like this:

A more complete picture of the Wormhole protocol, with messages database

There is an additional piece we haven’t covered here, since it is related to the file-transfer and “Dilation” protocols: direct peer-to-peer connection establishment. If this fails, a relay server (very much like a TURN server) is used.

That will be the topic of a future blog post.

Addendum: Threats, Secret Codes and Verifiers

Ideally, one would communicate the “secret code” in secret. However, often we’re solving this exact problem (“we don’t have a secure channel”)!

A full threat model is well beyond the scope of this blog post; please consult your cryptographer or security professional.

Communicating a code over a voice phone call isn’t “secure” (it’s not end-to-end encrypted) but may be “secure enough” for many applications. A similar argument may be made for speaking the code across a table, or sending it by SMS text-message or email. If the legitimate user consumes the code first, a malicious entity cannot make any guess at all.

Although all of these methods can be subverted, there is an additional “verifier” feature that protocols (including the file-transfer one) can take advantage of: this allows the humans to do a little more work, but confirm they’re really using the same key. The “additional work” is confirming a hash of the shared secret key; if they’re the same, there is no MitM.

Let’s consider the “known threat” from the documentation: a successful attacker would want to establish a connection with the first side, quickly make a new code, and give that to the second side – thus inserting themselves as a Machine in the Middle. However, this scenario results in each of the legitimate peers having a different shared key (i.e. the MitM must decrypt and re-encrypt for the other side) so the verifier protects here.

In the simpler case of a successful (remember: very unlikely!) single guess by an attacker, the “verifier” pause will reveal the problem (because the legitimate user on the other end of the phone will not be connected at all). This gives the Initiator a chance to destroy the subverted connection before any files are sent.

This is definitely in the realm of “security versus convenience”. I cannot write your threat model for you – but the “verifier” gives you a tool allowing the two humans to ensure their computers have a shared secret key. The verifier step literally just compares the hashes of this key.
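As a rough sketch of the idea, each side can derive a short fingerprint from its key and the humans compare the two. The derivation below (a labelled SHA-256 hash) and the function names are assumptions for illustration; the real library has its own way of deriving the verifier.

```python
import hashlib
import hmac
import secrets


def toy_verifier(shared_key):
    """Hypothetical verifier: a labelled hash of the shared key."""
    return hashlib.sha256(b"toy-verifier:" + shared_key).hexdigest()


def verifiers_match(verifier_a, verifier_b):
    """Compare what the two humans read out to each other."""
    return hmac.compare_digest(verifier_a, verifier_b)


# Honest case: both computers hold the same key.
key = secrets.token_bytes(32)
print(verifiers_match(toy_verifier(key), toy_verifier(key)))  # True

# Machine in the Middle: each legitimate side shares a *different* key
# with the attacker, so the two verifiers cannot match.
key_first_side = secrets.token_bytes(32)
key_second_side = secrets.token_bytes(32)
print(verifiers_match(toy_verifier(key_first_side),
                      toy_verifier(key_second_side)))          # False
```

Because the verifier is derived only from the key, a mismatch tells the humans that their computers are not talking directly to each other, before any application data is sent.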