Searching permutations for constructing uniformly distributed point sets

Edited by Jeffrey Ullman, Stanford University, Stanford, CA; received December 3, 2024; accepted February 28, 2025
April 3, 2025
122 (14) e2424464122

Abstract

Uniformly distributed point sets of low discrepancy are heavily used in experimental design and across a very wide range of applications such as numerical integration, computer graphics, and finance. Recent methods based on Graph Neural Networks [T. K. Rusch, N. Kirk, M. M. Bronstein, C. Lemieux, D. Rus, Proc. Natl. Acad. Sci. U.S.A. 121, e2409913121 (2024).] and solver-based optimization identified point sets having much lower discrepancy than previously known constructions. We show in this note that further substantial improvements are possible by separating the construction of low-discrepancy point sets into i) the relative position of the points, and ii) the optimal placement respecting these relationships. Using tailored permutations, we construct point sets that are of 20% smaller discrepancy on average than those proposed by Rusch et al. In terms of inverse discrepancy, our sets reduce the number of points in dimension 2 needed to obtain a discrepancy of 0.005 from more than 500 points to less than 350. For applications where the sets are used to query time-consuming models, this is a significant reduction.
Point sets approximating a uniform distribution are required in many different fields, ranging from engineering sciences to economics and financial applications. Monte Carlo methods, relying on the O(1/n) error in numerical integration of randomly selected points, have been extensively used, in particular in higher dimensions. For lower dimensions, it is possible to obtain point sets that are much more regular. These constructions rely on discrepancy measures, a family of functions that quantify the distance between the points’ distribution and the uniform one. For all of these measures, the lower the discrepancy, the smaller the difference between the point set and a uniform distribution.
Traditionally, the L star discrepancy has been the most prominent discrepancy measure, largely because of its theoretical links to numerical integration via the Koksma–Hlawka inequality (1, 2). In one dimension, it corresponds exactly to the Kolmogorov–Smirnov test statistic for the uniform distribution. More generally, it is characterized by the worst absolute difference between the proportion of points that fall inside a box [0,q)[0,1]d and the volume of that box. Formally, for a point set P in [0,1)d, its L star discrepancy is given by
d(P):=supq[0,1)d||P[0,q)||P|λ([0,q))|,
[1]
where λ is the Lebesgue measure. Points with small L star discrepancy, low-discrepancy point sets or sequences, have proven essential in a wide variety of contexts from optimization (3) to computer graphics (4), and financial mathematics (5). Nevertheless, their most important application remains numerical integration (6) and design of experiments (7). The design of low-discrepancy points sets for a specific number of points n and dimension d is therefore of great practical interest. Traditional construction methods have relied heavily on number theory (8) and are focused more on the asymptotic order of the L star discrepancy in a fixed dimension d as the number of points n tends to infinity. Sequences such as the Halton–Hammersley sequence or the Sobol’ sequence are such constructions. More recently, attention has turned toward generating point sets of low-discrepancy targeted at specific applications and their associated n and d combinations. This is done with a large variety of methods, from the more application-focused optimization methods in refs. 911, to the Graph Neural Network approach in ref. 12. All these approaches have in common that they are, quite naturally, considering the problem as that of a point placement problem.
Inspired by a question of Dmitriy Bilyk in the Dagstuhl Seminar 23351, we show that this problem can instead be split into two: i) a permutation selection, or assignment problem, where we determine the relative position of the points, and ii) a coordinate selection problem more similar to the point placement problem. Perhaps surprisingly, solving the placement problem for a given assignment is very fast with the nonlinear programming approach from ref. 13, while solving the assignment problem for given coordinates is almost as hard as solving both problems at once. Nevertheless, the simpler coordinate selection problem leads to on average 20% improvements for the point sets’ discrepancy, on top of recent progress in refs. 12 and 13. This gain is particularly relevant from a practical perspective: Having a similar discrepancy value with 350 points rather than 500 can mean days of computation time saved for complex engineering problems, while obtaining the same error guarantee.

Results

Our approach is based on a mixed-integer programming model that is fully formulated and explained in more detail in SI Appendix. It optimizes the star discrepancy relative to two sets of variables: points’ coordinates and their relative position with each other—the permutation. In ref. 13, both were solved at the same time to obtain provably optimal point sets. We now fix the permutation variables beforehand, then solve the model as previously. In other words, given a fixed permutation, the solver then finds the best possible point coordinates that fit the relative position. In the following, given a point set P, π(P) refers to its associated permutation, M(π(P)) to the model, and Opt (π(P)) to its optimal solution, when optimizing only the precise coordinates of the points. Not only does this significantly reduce the difficulty of the problem, but it also allows the solver to eliminate a large number of constraints in the presolving.

Permutation Selection.

While selecting a permutation remains a difficult problem, there exists a large number of low-discrepancy point sets from which we can extract promising permutations. In particular, a well-known low-discrepancy sequence in one dimension, the Kronecker sequence with golden ratio, is defined as ({ϕi})iN, where ϕ is the golden ratio and {x} is the fractional part of x. For a fixed n, this can be lifted to a two-dimensional set, the Fibonacci set
{(i/n,{ϕi}):0in1}.
We can further generalize this to shifted Fibonacci sets, where we consider some integer offset α in the second coordinate: {ϕi} is replaced by {ϕ(i+α)}. Other sets and sequence prefixes tested include the van der Corput, Sobol’, and Kritzinger sequences, as well as Message-Passing Monte Carlo (MPMC), random and lattice sets. The permutations of the shifted Fibonacci sets were consistently among the best, and usually the best. Since they also have the advantage of not requiring any parameter search, they were kept for further experiments.
Fig. 1 shows the discrepancy values for dimension 2 obtained by using the permutations associated with shifted Fibonacci sequences, i.e., by solving M(π(Fib+α)). As there does not seem to be a common best shift α for any number of points n, we generally consider α=1 which gives consistently good results. For all tested values, we clearly outperform the state-of-the-art results from ref. 12. More detailed searches for the best shift for n{50,100,150} show that these results can be improved even further. In particular, we highlight that our results get even closer to the current best asymptotic order for the discrepancy by Ostromoukhov in ref. 14, which was shown to be valid for a sequence of more than 607 points.
Fig. 1.
Best discrepancy values in dimension 2 obtained with our permutation approach M(π(Fib+1)), compared to the traditional Fibonacci set shifted by one point Fib+1 and the MPMC results from ref. 12. The black crosses labeled “Best found” correspond to more extensive searches for the best shift. The current best theoretical upper bound for the asymptotic discrepancy order, obtained by Ostromoukhov (14) for a far larger number of points (between 607 and 608), is given by the green line 0.2223log (n)/n. The 0.3log (n)/n curve in blue is added for comparison.
The improvement in the discrepancy value becomes even more noteworthy when considering the inverse star discrepancy, specifying the number of points that are required to reach a specific discrepancy value. In practice, it specifies how many points are needed for an experimental design to guarantee a specific error. Fig. 2 shows the reduction in the number of points required from using our results.
Fig. 2.
Number of points required (y-axis) to reach specific discrepancy values ε (x-axis) in dimension 2, based on the permutation approach.

Structural Differences.

Visually, Fig. 3 (Right) shows that there is little difference in the overall structure of the point sets. However, the point sets obtained before and after optimization look very different from a local discrepancy perspective, as can be seen by comparing the two plots on the Left of Fig. 3. Here, we plot the local discrepancy values over (0,1)2 for both the original set which generated the permutation (Left) and the set obtained postoptimization with this permutation (Middle). Whereas the initial set had a clear zone coming close to the worst value around (0.65,0.85), the irregularities are much more evenly spread in the second image. As well as being better distributed, the discrepancy value is nearly 40% smaller.
Fig. 3.
Left and Middle: Truncated local discrepancy values over (0,1)2 for the 100-point Fibonacci set with shift 74 (Left), and its optimized variant from solving M(π(Fib+74)) (Middle). Brighter zones represent regions with high local discrepancy value. Local discrepancies more than 1/100 away from the maximum of each set were set to 0, for readability. The first set has a region where discrepancy is much higher, whereas in the second image, several regions come close to the discrepancy value; its dimmer colors visualize that the local discrepancies are much lower than on the Left. Right: Comparison of the two sets, with black being the original and red the optimized point placement.

Higher Dimensions.

The separation into an assignment problem and a coordinate placement problem remains applicable to higher dimensions, with d1 permutations required in dimension d. The main obstacle is computational: When expressing volume terms in the model, products of d variables are required. In combination with the size of the model, this quickly becomes untractable as the dimension increases. Nevertheless, we are able to provide results for d=3 showing that the permutation approach still performs convincingly, as summarized in Table 1. Here, we compare the discrepancy values obtained from solving the coordinate selection problem for the permutations of the (lifted 2d or original 3d) Sobol’ sequence and from the lifted Kritzinger sequence, a greedy construction based on L2 discrepancy minimization (15). All optimized point constructions vastly outperform the discrepancy of the default Sobol’ sequence.
Table 1.
Discrepancy values obtained in dimension 3 from the permutations generated by the lifted 2d Kritzinger and Sobol’ sequences [with an (L)], and the 3d Sobol’ sequence cation text
n25324050
Sobol’0.14740.12990.10660.08997
Opt (π(Sobol’))0.10150.084340.071610.05978
Opt (π(Kritzinger (L)))0.094280.075030.061340.05226
Opt (π(Sobol’ (L)))0.092310.077080.066000.05500
We add the nonoptimized Sobol’ sequence to compare with. For reference, the best 50 point set obtained with MPMC had a discrepancy of 0.05828.

Combining with Other Methods.

To obtain more promising inputs for our coordinate optimization method, we consider the permutations of the more recent MPMC method (12) and sets created using an L2 variant of ref. 11. Table 2 shows the improvement by our approach when used on these point sets, as well as their original discrepancy values. In all cases, we improve by at least 15% on the values obtained with MPMC.
Table 2.
Discrepancy values obtained in dimension 3 from the permutations generated by the point sets from MPMC with 50 000 epochs and an L2 variant of the subset selection method introduced in ref. 11
n25324050
Opt (π(L2 Subset))0.085190.073090.059880.04979
L2 Subset0.10120.079310.063920.05337
Opt (π(MPMC))0.083350.070850.062420.05067
MPMC0.106640.082340.081390.05828
The initial discrepancy values are provided for comparison, and the best values are in bold.

Discussion

In this note, we show how to separate the L star discrepancy optimization problem in two components, permutation selection and coordinate optimization. This approach rephrases a longstanding open problem, allowing a more targeted optimization of each component. We demonstrate its potential by providing state-of-the-art results in low dimensions.

Data, Materials, and Software Availability

Julia models and generated point sets in .txt files data have been deposited in https://github.com/frclement/LowDiscPermu (16).

Acknowledgments

Large parts of this work were done while the first author was employed by Sorbonne Université, CNRS, LIP6. This work was granted access to the High-Performance Computing (HPC) resources of the Service d’Aide au Calcul et á l’Analyse de Données (SACADO) MeSU platform at Sorbonne Université. This project was partially supported by the European Research Council (ERC) grant “dynaBBO”, grant no. 101125586, by the PHC Procope 50969UF project funded by the German Academic Exchange Service (Deutscher Akademischer Austauschdienst, Project-ID 57705092), the French Ministry of Foreign Affairs, and the French Ministry of Higher Education and Research. This work was also partially supported by the “PHC PESSOA” program (project DISCREPANCY-Discrepancy Problems-Algorithms and Complexity, number 49173PH), funded by the French Ministry for Europe and Foreign Affairs, the French Ministry for Higher Education and Research, and the Portuguese Foundation for Science and Technology (FCT). This work is partially funded by the FCT, I.P./Ministério da Ciência, Tecnologia e Ensino Superior (MCTES) through national funds (Programa de Investimentos e Despesas de Desenvolvimento da Administração Central), within the scope of Center for Informatics and Systems of the University of Coimbra (CISUC) R&D Unit-UID/CEC/00326/2020.

Author contributions

F.C., C.D., K.K., and L.P. designed research; F.C., C.D., K.K., and L.P. performed research; and F.C., C.D., K.K., and L.P. wrote the paper.

Competing interests

The authors declare no competing interest.

Supporting Information

Appendix 01 (PDF)

References

1
J. Koksma, A general theorem from the theory of the uniform distribution modulo 1. Math. B (Zutphen) 1, 7–11 (1942/1943).
2
E. Hlawka, Funktionen von beschränkter Variation in der Theorie der Gleichverteilung. Ann. Mat. Pum Appl. 54, 325–333 (1961).
3
M. Cauwet et al., “Fully parallel hyperparameter search: Reshaped space-filling” in Proceedings International Conference on Machine Learning, H. Daumé III, A. Singh, Eds. (PMLR, 2020), vol. 119, pp. 1338–1348.
4
L. Paulin et al., MatBuilder: Mastering sampling uniformiy over projections. ACM Trans. Graph 41, 1–13 (2022).
5
S. Galanti, A. Jung, “Low-discrepancy sequences: Monte-Carlo simulation of option prices” in Journal of Derivatives, J. M. Pimbley, Ed. (Euromoney Publications PLC, 1997), pp. 63–83.
6
J. Dick, F. Pillichshammer, Digital Nets and Sequences (Cambridge Univ. Press, 2010).
7
T. J. Santner, B. Williams, W. I. Notz, The Design and Analysis of Computer Experiments, Springer Series in Statistics, P. l’Écuyer, A. B. Owen, Eds. (Springer, 2003).
8
H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods, SIAM CBMS-NSF Regional Conference Series in Applied Mathematics (Society for Industrial and Applied Mathematics, 1992), vol. 63.
9
B. Doignies et al., Differentiable owen scrambling. ACM Trans. Graphics 43, 1–12 (2024).
10
C. Doerr, F. M. de Rainville, “Constructing low star discrepancy point sets with genetic algorithms” in Proceedings of Genetic and Evolutionary Computation Conference, C. Blum, E. Alba, Eds. (ACM, 2013), pp. 789–796.
11
F. Clément, C. Doerr, L. Paquete, Heuristic approaches to obtain low-discrepancy point sets via subset selection. J. Complex. 83, 101852 (2024).
12
T. K. Rusch, N. Kirk, M. M. Bronstein, C. Lemieux, D. Rus, Message-passing Monte Carlo: Generating low-discrepancy point sets via graph neural networks. Proc. Natl. Acad. Sci. U.S.A. 121, e2409913121 (2024).
13
F. Clément, C. Doerr, K. Klamroth, L. Paquete, Constructing optimal L star discrepancy sets. arXiv [Preprint] (2024). https://doi.org/10.48550/arXiv.2311.17463 (Accessed 01 June 2024).
14
V. Ostromoukhov, “Recent progress in improvement of extreme discrepancy and star discrepancy of one-dimensional sequences 2008” in Monte Carlo and Quasi-Monte Carlo Methods (2009), pp. 561–572.
15
R. Kritzinger, Uniformly distributed sequences generated by a greedy minimization of the L2 discrepancy. Mosc. J. Comb. Number Theory 11, 215–236 (2022).
16
F. Clément, C. Doerr, K. Klamroth, L. Paquete, Low dIscrepancy permutations. GitHub. https://github.com/frclement/LowDiscPermu/. Deposited 4 February 2025.

Information & Authors

Information

Published in

The cover image for PNAS Vol.122; No.14
Proceedings of the National Academy of Sciences
Vol. 122 | No. 14
April 8, 2025
PubMed: 40178899

Classifications

Data, Materials, and Software Availability

Julia models and generated point sets in .txt files data have been deposited in https://github.com/frclement/LowDiscPermu (16).

Submission history

Received: December 3, 2024
Accepted: February 28, 2025
Published online: April 3, 2025
Published in issue: April 8, 2025

Keywords

  1. discrepancy
  2. optimization
  3. permutations

Acknowledgments

Large parts of this work were done while the first author was employed by Sorbonne Université, CNRS, LIP6. This work was granted access to the High-Performance Computing (HPC) resources of the Service d’Aide au Calcul et á l’Analyse de Données (SACADO) MeSU platform at Sorbonne Université. This project was partially supported by the European Research Council (ERC) grant “dynaBBO”, grant no. 101125586, by the PHC Procope 50969UF project funded by the German Academic Exchange Service (Deutscher Akademischer Austauschdienst, Project-ID 57705092), the French Ministry of Foreign Affairs, and the French Ministry of Higher Education and Research. This work was also partially supported by the “PHC PESSOA” program (project DISCREPANCY-Discrepancy Problems-Algorithms and Complexity, number 49173PH), funded by the French Ministry for Europe and Foreign Affairs, the French Ministry for Higher Education and Research, and the Portuguese Foundation for Science and Technology (FCT). This work is partially funded by the FCT, I.P./Ministério da Ciência, Tecnologia e Ensino Superior (MCTES) through national funds (Programa de Investimentos e Despesas de Desenvolvimento da Administração Central), within the scope of Center for Informatics and Systems of the University of Coimbra (CISUC) R&D Unit-UID/CEC/00326/2020.
Author contributions
F.C., C.D., K.K., and L.P. designed research; F.C., C.D., K.K., and L.P. performed research; and F.C., C.D., K.K., and L.P. wrote the paper.
Competing interests
The authors declare no competing interest.

Authors

Affiliations

Department of Mathematics, University of Washington, Seattle, WA 98195
Sorbonne Université, CNRS, LIP6, Paris 75005, France
Department of Mathematics and Computer Science, School of Mathematics and Natural Sciences, University of Wuppertal, Wuppertal 42119, Germany
Department of Informatics Engineering, University of Coimbra, Centre for Informatics and Systems of the University of Coimbra/LaboratÓrio Associado de Sistemas Inteligentes (CISUC/LASI), Departamento de Engenharia Informática (DEI), Coimbra 3030-290, Portugal

Notes

1
To whom correspondence may be addressed. Email: [email protected].

Metrics & Citations

Metrics

Note: The article usage is presented with a three- to four-day delay and will update daily once available. Due to ths delay, usage data will not appear immediately following publication. Citation information is sourced from Crossref Cited-by service.


Altmetrics




Citations

Export the article citation data by selecting a format from the list below and clicking Export.

View Options

View options

PDF format

Download this article as a PDF file

DOWNLOAD PDF

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Personal login Institutional Login

Recommend to a librarian

Recommend PNAS to a Librarian

Purchase options

Purchase this article to access the full text.

Single Article Purchase

Searching permutations for constructing uniformly distributed point sets
Proceedings of the National Academy of Sciences
  • Vol. 122
  • No. 14

Figures

Tables

Media

Share

Share

Share article link

Share on social media