## New Research In

### Physical Sciences

### Social Sciences

#### Featured Portals

#### Articles by Topic

### Biological Sciences

#### Featured Portals

#### Articles by Topic

- Agricultural Sciences
- Anthropology
- Applied Biological Sciences
- Biochemistry
- Biophysics and Computational Biology
- Cell Biology
- Developmental Biology
- Ecology
- Environmental Sciences
- Evolution
- Genetics
- Immunology and Inflammation
- Medical Sciences
- Microbiology
- Neuroscience
- Pharmacology
- Physiology
- Plant Biology
- Population Biology
- Psychological and Cognitive Sciences
- Sustainability Science
- Systems Biology

# Cryptographic hashing using chaotic hydrodynamics

Edited by David A. Weitz, Harvard University, Cambridge, MA, and approved March 26, 2018 (received for review December 15, 2017)

## Significance

An essential component of digital communication is hashing, in which a complex piece of information (a document, video, etc.) is mathematically transformed into a unique signature that can later be used to identify the original piece of data. Here, we show that this process bears strong similarity to the chaotic behavior of certain types of flows observed when ordinary fluids mix, such as the stirring of dye into water. We use this analogy between rearranging information and stirring a fluid to construct a fluid-based hash function with comparable properties to traditional algorithms. Our work bears direct relevance to cases in which the physical representation of information affects its transmission, including in microfluidic self-assembly schemes and characterizing complex natural flows.

## Abstract

Fluids may store and manipulate information, enabling complex applications ranging from digital logic gates to algorithmic self-assembly. While controllable hydrodynamic chaos has previously been observed in viscous fluids and harnessed for efficient mixing, its application to the manipulation of digital information has been sparsely investigated. We show that chaotic stirring of a viscous fluid naturally produces a characteristic signature of the stirring process in the arrangement of particles in the fluid, and that this signature directly satisfies the requirements for a cryptographic hash function. This includes strong divergence between similar stirring protocols’ hashes and avoidance of collisions (identical hashes from distinct stirs), which are facilitated by noninvertibility and a broad chaotic attractor that samples many points in the fluid domain. The hashing ability of the chaotic fluidic map implicates several unexpected mechanisms, including incomplete mixing at short time scales that produces a hyperuniform hash distribution. We investigate the dynamics of hashing using interparticle winding statistics, and find that hashing starts with large-scale winding of kinetically disjoint regions of the chaotic attractor, which gradually gives way to smaller scale braiding of single-particle trajectories. In addition to providing a physically motivated approach to implementing and analyzing deterministic chaotic maps for cryptographic applications, we anticipate that our approach has applications in microfluidic proof-of-work systems and characterizing large-scale turbulent flows from sparse tracer data.

Recent experimental work has highlighted the ability of fluids to encode and store information (1, 2), motivating reciprocal inquiry into the role of computational rules in shaping the behavior of fluids in the natural world (3, 4). Such work has established applications in improving complex microfluidic devices and for characterizing large-scale complex flows using sparse data (5, 6), but it has broader implications for understanding constraints that shape active matter and self-assembly schemes (7). At the same time, studying logical operations and algorithmic performance in physical systems allows analytical tools borrowed from physics to be applied to traditional digital information systems, a line of inquiry that traces from Wheeler’s original “it from bit” conjecture to Landauer’s arguments about the role of physical representation on information processing (8, 9).

A potential new avenue for such inquiries is chaos, which has been widely investigated in digital applications due to the rich statistical structure it affords deterministic (and thus manipulable) dynamical systems (10). While hydrodynamic systems have been shown to exhibit chaos—both ubiquitously in turbulent flows (11, 12) but also unexpectedly in viscous flows via elegant analogies to classical dynamical nonintegrability (13)—the implications of chaos for digital fluid physics remain mostly unexplored.

Here, we exploit recent advances in the field of chaotic hydrodynamics to show how well-understood properties of chaotic maps can encode information about the underlying flow dynamics into the relative arrangements of advected particles. We show that this operation satisfies all of the properties of cryptographic hash functions that typically appear in digital security applications, including noninvertible compression of arbitrary inputs to fixed-length outputs, strong divergence between the hashes of similar inputs, and resistance to collisions between the hashes of two distinct inputs (14). We show that these unexpected properties arise naturally from the time scale-dependent dynamics of stirring a viscous liquid, implying potential new analysis techniques and applications at the interface of nonlinear dynamics and cryptography.

## Model

Our cryptographic hashing scheme is based on chaotic advection at low Reynolds number. Given a time-varying flow and a small set of particles being advected, our hash consists of a short digest containing the relative ordering of the particles along one dimension. In order for the hashing scheme to be effective, this digest must be unique to the specific flow, but the original flow itself should not be easily computed from the hash—a property that naturally emerges in chaotic flows.

Under our approach, if the flow being studied has a known, finite set of governing parameters (such as jet speeds or stirring rates), then the time-dependent flow itself may be denoted by a discrete sequence of L vectors of parameter values 𝝈, which we refer to as a “stirring protocol” for the flow. The time step between parameter changes is arbitrary and may even be infinitesimal (corresponding to an analog signal); however, we assume that dissipation is large enough (and thus the Reynolds number and inertia are small enough) that the stirring protocol fully and invertibly specifies the dynamics of particles in the flow. We associate the specific stirring protocol 𝝈 with a “message” of length L that we wish to encrypt.

We then specify M labeled particles at known initial positions and allow the flow to advect these particles for L time steps with the step-wise parameters specified by 𝝈. The final arrangement of these particles is discretized by recording their ordering, 𝝁, along the x axis, which is taken as the hash of the message 𝝈. This last step ensures that the hash 𝝁 is noninvertible, since it discards information about the final coordinates of the particles and thus ensures that knowledge of the initial conditions and the hash is insufficient to determine the stirring process, a necessary condition for a hash function. Our approach thus differs from standard block-cipher approaches to discrete chaotic encryption in that the original message is encoded in a time-varying set of map parameters rather than in the initial conditions of the particles (15, 16). Instead, our approach generalizes stream ciphers that encode information in the parameters of a chaotic map (15); however, the continuous-domain hydrodynamic map allows the message bandwidth to be made arbitrarily high using an arbitrarily long stirring sequence. The latter property satisfies the requirement that a hash function maps arbitrary-length inputs to fixed-length outputs (14).

Our approach is illustrated in Fig. 1, in which we demonstrate our hashing scheme using the classical “blinking vortex” flow, a canonical and early chaotic fluidic map that has been shown both theoretically and experimentally to demonstrate strong chaos over a wide range of parameter values (6, 13, 17, 18). This map consists of a unit circular domain with no-slip boundary conditions, which contains two vortices at positions *SI Appendix*, section 1.

The blinking vortex system has several properties that render it particularly useful for hashing. The two-vortex map has analytic, circular particle trajectories, which do not require numerical integration to compute and thus retain numerical precision after many iterations of the system. Moreover, the two-vortex flow is isomorphic to the widely studied linked twist map, and so it exhibits phenomena such as domain-wide streamline crossing that are broadly applicable to many other classes of chaotic maps (17).

Here, we exploit the fact that the only relevant parameter describing the map at a given time step is which of the two vortices is currently “on,” which causes the message vector 𝝈 to reduce to a binary encoding of the message information, with the message

While we use the quasi-uniform sunflower packing for different Ms throughout this work, we note that a more optimal set of initial conditions would take into account information about the specific chaotic map’s dynamics when assigning initial particle locations, to maximize mixing. For example, the distribution of principal eigenvalues associated with local stretching of the flow could be used to weigh initial locations; more generally, the set of maximal finite-time Lyapunov exponents corresponding to the message length L could be used to maximize mixing over that specific time scale (19). Conveniently, the blinking vortex flow already has a quasi-uniform spatial stretching distribution for *SI Appendix*, section 9). However, for more complex empirical mixing systems arising in experimental applications, preanalysis of mixing regions may be necessary to assign appropriate initial conditions.

We denote the hash associated with a message 𝝈 as 𝝁, which represents an ordered set of integers corresponding to the final locations of the stirred hash particles, ranked by their relative positions along the *x* axis. For example, for a hash of length *x* coordinates of their initial positions so that *x* axis, leading to a final hash *SI Appendix*, section 6).

## Results

Having defined our system, we next seek to characterize how well it performs in generating uncorrelated hashes **1** should be kept as small as possible, due to system-size saturation of the potential interparticle distances after applying the map.

We generate an ensemble of one million random binary messages, each with a distinct length L, hash length M, and mixing period T. For each message 𝝈, a second message is randomly generated via mutation, insertion, or deletion of a random bit, to generate a dual message

Fig. 2*A* shows

However, while it is intuitive that the relative symbol ratio in the stirring protocol (here, the ratio of 1s and 0s) should be nearly 1, we also observe (Fig. 2*C*) that more switches between the symbols generally increase

While the final projection of the coordinates onto the *x* axis ensures that the deterministic hashing procedure is noninvertible, a stirring-based hashing scheme may be subject to a second preimage or “birthday” attack, wherein two random inputs produce the same hash, resulting in a “collision” that renders the original inputs indistinguishable. To investigate the frequency of hash collisions under various conditions, we compute an ensemble of messages and determine how M and L affect the collision density *A* shows

In addition to avoiding collisions, a strong cryptographic hashing scheme distributes hashes uniformly across the space of possible values. If distributions of collision locations are not uniform, then neighboring hashes are correlated and a single collision will compromise many messages. We assess hash uniformity by generating *B* for various values of M and

We compare these curves to our expectation for a “perfect” hash function, which we assume would uniformly randomly sample (with replacement) the *B*, and it is apparent that, at large message lengths L and hash sizes M, the chaotic hash function approaches this upper bound in performance. Calculation of the expectation value of Eq. **3**, **2** provides a null estimate of the collision density, underlain as a dashed line in Fig. 3*A*. As expected, this function is strongly sigmoidal, predicting a sharp transition from

The analytic estimates provided by Eqs. **3** and **4** illustrate a surprising property of chaotic hashing: for small M and L, the hash function initially encounters fewer collisions than would be expected if it uniformly sampled the hash space. This effect is apparent as the concave transients above the blue trace in Fig. 3*B*. This anomalous scaling is consistent with hyperuniformity (22), a phenomenon in fluids and granular media in which long-wavelength density fluctuations are suppressed (22⇓–24). In our system, its occurrence at small M and L implies a dynamical transient wherein the particles retain information of their initial conditions: The particles have an initial separation

This apparent oversampling of new hashes at relatively short message lengths introduces a tradeoff: Hyperuniformity improves the hash function’s tolerance to random collisions, but it undermines the hash function’s cryptographic security because an attacker intending to infer a message’s hash from that of a similar message could exploit the finite set of locations sampled by a given particle over small L. Thus, for general process validation applications, hyperuniformity at small L and M may be desirable; however, applications with strongly correlated messages would require cryptographic hash security. These limits represent a form of tradeoff between fault tolerance (random collisions) and attack tolerance (collisions informed by prior information about the dynamics), a dichotomy previously observed in random networks (25).

We next investigate the origin of this hyperuniform transient, which we suspect arises from short-time mixing dynamics due to stretching associated with subregions in the fluid’s chaotic attractor that kinetically isolates distinct hash particles. To identify the signature of these dynamics, we study the short time pairwise winding statistics of particle trajectories, which recent work on topological chaos and braiding has shown to represent a characteristic invariant of the dynamical system (6, 26). Using the notation of Caussin and Bartolo (27), we define the pairwise linking number

Fig. 4*A* shows the root-mean-square of the total winding *A*, *Inset*). This suggests that, over short times, mixing in braid space is governed primarily by convergent behavior of hash particles with nearby initial conditions, which explore the chaotic attractor locally but have insufficient time to wander the full domain. These transient dynamical barriers partition the chaotic attractor into subsets of trajectories, which wind together as clusters over short time scales that gradually shrink as particles have more time to explore the attractor and intercalate their worldlines’ windings. This transition is illustrated in Fig. 4*C*, which shows for a single pair of hash points the relative density of their future positions in the domain over short and long time scales (Fig. 4*C*, *Left* and *Right*, respectively). Red (blue) regions correspond to those occupied by potential trajectories originating from the left (right) hash point, and white regions correspond to regions that trajectories for both particles explore to equal degrees (see *SI Appendix*, section 3 for more detail). Clustering indicates that over short time scales the initial positions strongly determine the regions of the attractor explored by a given hash particles, but over longer time scales, both particles sample domain relatively equally. Over short time scales, the large-scale winding of these clusters of kinetically accessible positions dominates the total winding, apparent as a strongly bimodal distribution of ensemble-averaged linking numbers across all possible particle pairs *D*, *Left*). Once the two point clouds fully intercalate, the average linking distribution becomes Gaussian (Fig. 4*D*, *Right*) and further growth in total winding becomes diffusive. For both time scales, the mean linkage is less than zero: because both vortices have positive rotation direction, particles over time tend to travel anticlockwise around the domain, inducing a global net twisting of particle worldlines that produces positive mean winding (27).

Analysis of the hashing process confirms the role that chaos plays in hyperuniformly randomizing the hash over short time scales: In early steps of hashing, large-scale transpositions of groups of neighboring particles within the hash (e.g., *SI Appendix*, section 4 for details). For all tested values of M, the root-mean-square growth of *B*). The two regimes of mixing and their effect on the hash function are thus analogous to initially cutting a deck of playing cards several times and then gradually riffling the halves of the deck back together.

## Discussion

In this article, we have presented a physically motivated chaotic cryptography system implemented using a widely studied, canonical hydrodynamical system. The two-vortex map we study explicitly describes fluid flow at low Reynolds number, allowing interpretation of the properties of the hash function in terms of familiar Lagrangian concepts from dynamical systems such as braiding, particle dispersion, and mixing. Our work suggests a role for tools from dynamical systems theory in constructing and characterizing cryptographic functions as well as potential utility for cryptographic concepts (e.g., collision tolerance and attack resistance) in understanding hydrodynamical systems. The design of modern digital hash functions for computer security is an active area of research (29, 30), and by demonstrating a simple model in which hashing arises from mathematically characterizable chaos, we anticipate our work could inform future efforts to design nonheuristic hash functions from first principles based in dynamical systems theory (15, 16).

However, we anticipate that the most direct application of our work arises in microfluidics and mixing theory, particularly in process verification for particle self-assembly (7, 27) and microfluidic proof-of-work systems (31). Such microscale applications raise the question of whether, in practice, chaos-based hashes are too fragile to external perturbations or irreversible interactions to guarantee repeatability (32). In *SI Appendix*, section 6, we perform extensive characterization of how our hashing scheme’s reproducibility is affected by random noise—either arising from numerical precision loss in digital implementations or from diffusion of particles across streamlines in a physical system. We parametrize the relative strength of noise using the dimensionless Péclet number Pe, which specifies the relative strength of advection relative to diffusive noise in the system: A higher Pe corresponds to larger-scale or faster-stirred physical systems, or computational settings with low error rates per operation. We find general repeatability when *SI Appendix*, sections 7 and 8).

Such constraints would not impede application of our hashing system to other, larger-scale problems in mixing theory, such as the identification and classification of coherent structures in complex flows using topological metrics (6, 37). Such efforts are often motivated by the characterization of large-scale ocean flows from sparse buoy data (38), but they have found diverse uses ranging from the biomechanics of swimming to jet dynamics (19, 39). Our work suggests that complex flows with chaotic dynamics may leave distinguishable signatures in their arrangement of advected particles, suggesting future uses for hash distributions in characterizing arbitrary flows from tracer trajectories.

## Materials and Methods

All simulations were carried out using the Mathematica software package (Version 11.0.1.0; Wolfram Research, Inc.). When possible, all map parameters (such as the domain radius and stirrer coordinates) were expressed using fractional representations (i.e., *SI Appendix*, section 5.

## Acknowledgments

I thank the US Department of Defense for supporting independent research through the National Defense Science & Engineering Fellowship program.

## Footnotes

- ↵
^{1}Email: wgilpin{at}stanford.edu.

Author contributions: W.G. designed research, performed research, contributed new reagents/analytic tools, analyzed data, and wrote the paper.

The author declares no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1721852115/-/DCSupplemental.

Published under the PNAS license.

## References

- ↵
- Katsikis G,
- Cybulski JS,
- Prakash M

- ↵
- Fuerstman MJ,
- Garstecki P,
- Whitesides GM

- ↵
- Woodhouse FG,
- Fawcett JB,
- Dunkel J

- ↵
- ↵
- ↵
- ↵
- ↵
- Wheeler JA

- ↵
- ↵
- Ott E,
- Grebogi C,
- Yorke JA

- ↵
- ↵
- ↵
- ↵
- Paar C,
- Pelzl J

- ↵
- Amigo J,
- Kocarev L,
- Szczepanski J

- ↵
- Amigó J,
- Szczepanski J,
- Kocarev L

- ↵
- Sturman R,
- Ottino JM,
- Wiggins S

- ↵
- Aref H, et al.

- ↵
- ↵
- ↵
- Sinha K,
- Sinha BP

- ↵
- Torquato S,
- Stillinger FH

- ↵
- Weijs JH,
- Jeanneret R,
- Dreyfus R,
- Bartolo D

- ↵
- Jack RL,
- Thompson IR,
- Sollich P

- ↵
- ↵
- Boyland PL,
- Aref H,
- Stremler MA

- ↵
- Caussin JB,
- Bartolo D

- ↵
- Berger MA

- ↵
- Indesteege S

- ↵
- Preneel B

- ↵
- De Leo E,
- Galluccio L,
- Lombardo A,
- Morabito G

- ↵
- Michaelides EE

- ↵
- Stroock AD, et al.

- ↵
- ↵
- ↵
- Posner JD,
- Pérez CL,
- Santiago JG

- ↵
- ↵
- Wiggins S

- ↵

## Citation Manager Formats

## Sign up for Article Alerts

## Article Classifications

- Physical Sciences
- Physics