Table S1.

Features and distances used in our algorithm

Feature (ref.): SIFT (28)
Feature implementation details: For each character $j$, we use the normalized SIFT descriptors $\vec{d}_i \in \mathbb{R}^{128}$ (with $\|\vec{d}_i\|_2 = 1$) and the spatial locators $\vec{l}_i \in [1, a_L]^2$ for at most 40 significant key points $k_i = (\vec{d}_i, \vec{l}_i)$, according to the original SIFT implementation. The resulting feature is a set $f_j^{SIFT} = \{k_i\}_{i=1}^{40}$.
Distance implementation details: The distance between $f_1^{SIFT}$ and $f_2^{SIFT}$ is determined as follows:
i) For each key point $k_i^1 \in f_1^{SIFT}$, find a matching key point $m_i^2 \in f_2^{SIFT}$ s.t. $m_i^2 = \arg\min_{(d_j^2, l_j^2) \in f_2^{SIFT}} dist(k_i^1, k_j^2)$, where $dist(k_i^1, k_j^2) = \arccos(\langle d_i^1, d_j^2 \rangle) \cdot \|l_i^1 - l_j^2\|_2^2$. Thus, our definition augments the original SIFT distance by adding spatial information.
ii) The one-sided distance is $D_{SIFT}^{1,2} = median_i\{dist(k_i^1, m_i^2)\}$.
iii) The final distance is $D_{SIFT}(1,2) = \frac{D_{SIFT}^{1,2} + D_{SIFT}^{2,1}}{2}$.

Feature (ref.): Zernike (29)
Feature implementation details: An off-the-shelf (39) implementation was used. Zernike moments up to the fifth order were calculated.
Distance implementation details: $D_{Zernike}$ is the $L_1$ distance between the Zernike feature vectors.

Feature (ref.): DCT
Feature implementation details: MATLAB (R2009a) default implementation was used.
Distance implementation details: $D_{DCT}$ is the $L_1$ distance between the DCT feature vectors.

Feature (ref.): Kd-tree (30)
Feature implementation details: An off-the-shelf (40) implementation was used. Both orders of partitioning are used (first height, then width, and vice versa).
Distance implementation details: $D_{Kd-tree}$ is the $L_1$ distance between the Kd-tree feature vectors.

Feature (ref.): Image projections (31)
Feature implementation details: The implementation results in cumulative distribution functions of the histogram on both axes.
Distance implementation details: $D_{Proj}$ is the $L_1$ distance between the projections' feature vectors; this is similar to the Cramér–von Mises criterion (which uses $L_2$ distance).

Feature (ref.): $L_1$
Feature implementation details: Existing character binarizations.
Distance implementation details: $D_{L_1}$ is the $L_1$ distance between the character images.

Feature (ref.): CMI (32)
Feature implementation details: Existing character binarizations, with values in $\{0,1\}$.
Distance implementation details: The CMI computes a difference between the averages of the foreground and the background pixels of $\Im$, marked by a binary mask $M$: $CMI(M, \Im) = \mu_1 - \mu_0$, where $\mu_k = mean\{\Im(p,q) \mid M(p,q) = k\}$, $k = 0, 1$. In our case, given character binarizations $B_1, B_2$, the one-sided distance is $D_{CMI}^{1,2} = 1 - CMI(B_1, B_2)$. The final distance is $D_{CMI}(1,2) = \frac{D_{CMI}^{1,2} + D_{CMI}^{2,1}}{2}$.
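
The SIFT and CMI distances are the only two entries in the table that are not plain L1 distances, so a minimal sketch of each may help. It is written in Python with the standard library only and is not the authors' implementation (the table mentions MATLAB). The function names, the representation of a key point as a `(descriptor, location)` pair, and the representation of an image as a list of rows are all assumptions made for illustration. The sketch also assumes that descriptors are already normalized to unit length and that each binary mask contains both foreground and background pixels.

```python
import math
from statistics import median


def keypoint_dist(k1, k2):
    """dist(k1, k2) = arccos(<d1, d2>) * ||l1 - l2||_2^2 for k = (descriptor, location)."""
    (d1, l1), (d2, l2) = k1, k2
    dot = sum(a * b for a, b in zip(d1, d2))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    spatial = sum((a - b) ** 2 for a, b in zip(l1, l2))  # squared Euclidean distance
    return math.acos(dot) * spatial


def one_sided_sift(f1, f2):
    """Median over key points of f1 of the distance to the best match in f2."""
    return median(min(keypoint_dist(k1, k2) for k2 in f2) for k1 in f1)


def sift_distance(f1, f2):
    """Symmetrized distance: mean of the two one-sided distances."""
    return 0.5 * (one_sided_sift(f1, f2) + one_sided_sift(f2, f1))


def cmi(mask, image):
    """CMI(M, I) = mu_1 - mu_0, the means of I over mask == 1 and mask == 0."""
    fg, bg = [], []
    for mask_row, image_row in zip(mask, image):
        for m, value in zip(mask_row, image_row):
            (fg if m == 1 else bg).append(value)
    # Assumption: both pixel classes are non-empty, otherwise a mean is undefined.
    return sum(fg) / len(fg) - sum(bg) / len(bg)


def cmi_distance(b1, b2):
    """Symmetrized distance built from the one-sided distances 1 - CMI(., .)."""
    return 0.5 * ((1 - cmi(b1, b2)) + (1 - cmi(b2, b1)))


# Toy usage with 2-D descriptors for brevity (real SIFT descriptors are 128-D).
f1 = [([1.0, 0.0], (1, 1))]
f2 = [([0.0, 1.0], (2, 1))]
print(sift_distance(f1, f2))

b1 = [[1, 0], [0, 1]]
b2 = [[0, 1], [1, 0]]
print(cmi_distance(b1, b1), cmi_distance(b1, b2))
```

Because the angular term is multiplied by the squared spatial offset, two key points at the same location have distance zero regardless of their descriptors. This follows directly from the product form given in the table.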