Table S1.

Features and distances used in our algorithm

| Feature (ref.) | Feature implementation details | Distance implementation details |
| --- | --- | --- |
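| SIFT (28) | For each character $j$, we use the normalized SIFT descriptors $d_i \in \mathbb{R}^{128}$ (with $\lVert d_i \rVert_2 = 1$) and the spatial locators $l_i \in [1, a_L]^2$ for at most 40 significant key points $k_i = (d_i, l_i)$, according to the original SIFT implementation. The resulting feature is the set $f_j^{\mathrm{SIFT}} = \{k_i\}_{i=1}^{40}$. | The distance between $f_1^{\mathrm{SIFT}}$ and $f_2^{\mathrm{SIFT}}$ is determined as follows: i) For each key point $k_i^1 \in f_1^{\mathrm{SIFT}}$, find a matching key point $m_i^2 \in f_2^{\mathrm{SIFT}}$ such that $m_i^2 = \arg\min_{k_j^2 = (d_j^2, l_j^2) \in f_2^{\mathrm{SIFT}}} \mathrm{dist}(k_i^1, k_j^2)$, where $\mathrm{dist}(k_i^1, k_j^2) = \arccos(\langle d_i^1, d_j^2 \rangle) \cdot \lVert l_i^1 - l_j^2 \rVert_2^2$. Thus, our definition augments the original SIFT distance by adding spatial information. ii) The one-sided distance is $D_{\mathrm{SIFT}}^{1,2} = \mathrm{median}_i \{\mathrm{dist}(k_i^1, m_i^2)\}$. iii) The final distance is $D_{\mathrm{SIFT}}(1,2) = \frac{D_{\mathrm{SIFT}}^{1,2} + D_{\mathrm{SIFT}}^{2,1}}{2}$. |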
| Zernike (29) | An off-the-shelf implementation (39) was used. Zernike moments up to the fifth order were calculated. | $D_{\mathrm{Zernike}}$ is the $L_1$ distance between the Zernike feature vectors. |
| DCT | The default MATLAB (R2009a) implementation was used. | $D_{\mathrm{DCT}}$ is the $L_1$ distance between the DCT feature vectors. |
| Kd-tree (30) | An off-the-shelf implementation (40) was used. Both orders of partitioning are used (first height, then width, and vice versa). | $D_{\mathrm{Kdtree}}$ is the $L_1$ distance between the Kd-tree feature vectors. |
| Image projections (31) | The implementation results in cumulative distribution functions of the histogram on both axes. | $D_{\mathrm{Proj}}$ is the $L_1$ distance between the projections' feature vectors; this is similar to the Cramér–von Mises criterion (which uses the $L_2$ distance). |
| $L_1$ | Existing character binarizations. | $D_{L_1}$ is the $L_1$ distance between the character images. |
| CMI (32) | Existing character binarizations, with values in $\{0,1\}$. | The CMI computes the difference between the averages of the foreground and the background pixels of an image $I$, as marked by a binary mask $M$: $\mathrm{CMI}(M, I) = \mu_1 - \mu_0$, where $\mu_k = \mathrm{mean}\{I(p,q) \mid M(p,q) = k\}$, $k = 0, 1$. In our case, given character binarizations $B_1, B_2$, the one-sided distance is $D_{\mathrm{CMI}}^{1,2} = 1 - \mathrm{CMI}(B_1, B_2)$. The final distance is $D_{\mathrm{CMI}}(1,2) = \frac{D_{\mathrm{CMI}}^{1,2} + D_{\mathrm{CMI}}^{2,1}}{2}$. |
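
The SIFT and CMI rows of Table S1 both build a symmetric distance by averaging two one-sided distances, and several other rows use a plain L1 distance. The following is a minimal Python sketch of these three definitions, not the authors' implementation. All function names are hypothetical. It assumes that key points are already extracted and given as (descriptor, locator) pairs with unit-norm descriptors, that the key-point distance is the product of the descriptor angle and the squared Euclidean distance between locators, and that binarized characters are equally sized nested lists with values in {0, 1} containing at least one foreground and one background pixel.

```python
import math
from statistics import median


def keypoint_dist(k1, k2):
    """Assumed form: arccos(<d1, d2>) * ||l1 - l2||_2^2 for key points k = (d, l)."""
    (d1, l1), (d2, l2) = k1, k2
    dot = sum(a * b for a, b in zip(d1, d2))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    angle = math.acos(max(-1.0, min(1.0, dot)))
    spatial = sum((a - b) ** 2 for a, b in zip(l1, l2))
    return angle * spatial


def one_sided_sift(f1, f2):
    """Median, over key points of f1, of the distance to the best match in f2."""
    return median(min(keypoint_dist(k1, k2) for k2 in f2) for k1 in f1)


def sift_distance(f1, f2):
    """Symmetric distance: average of the two one-sided distances."""
    return 0.5 * (one_sided_sift(f1, f2) + one_sided_sift(f2, f1))


def l1_distance(u, v):
    """L1 distance between two equally long feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def cmi(mask, image):
    """Mean of image pixels where mask == 1 minus mean where mask == 0."""
    pairs = [(m, p) for mrow, irow in zip(mask, image) for m, p in zip(mrow, irow)]
    fg = [p for m, p in pairs if m == 1]
    bg = [p for m, p in pairs if m == 0]
    # Assumes both fg and bg are non-empty; otherwise this divides by zero.
    return sum(fg) / len(fg) - sum(bg) / len(bg)


def cmi_distance(b1, b2):
    """Symmetric CMI distance: average of 1 - CMI in both directions."""
    return 0.5 * ((1 - cmi(b1, b2)) + (1 - cmi(b2, b1)))
```

With these definitions, two identical binarizations give a CMI distance of 0, and a feature set compared with itself gives a SIFT distance of 0, since every key point matches itself at zero angle and zero spatial offset.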