New Research In
Physical Sciences
Social Sciences
Featured Portals
Articles by Topic
Biological Sciences
Featured Portals
Articles by Topic
 Agricultural Sciences
 Anthropology
 Applied Biological Sciences
 Biochemistry
 Biophysics and Computational Biology
 Cell Biology
 Developmental Biology
 Ecology
 Environmental Sciences
 Evolution
 Genetics
 Immunology and Inflammation
 Medical Sciences
 Microbiology
 Neuroscience
 Pharmacology
 Physiology
 Plant Biology
 Population Biology
 Psychological and Cognitive Sciences
 Sustainability Science
 Systems Biology
A robust method for searching the smallest set of smallest rings with a pathincluded distance matrix

Edited by SungHou Kim, University of California, Berkeley, CA, and approved May 1, 2009 (received for review December 22, 2008)
This article has a Correction. Please see:
Abstract
The perception of rings in graphs is widely used in many fields of science and engineering. Algorithms developed in the chemistry community, called smallest set of smallest rings (SSSR), applicable only for simple graphs or chemical structures. In contrast, algorithms developed by the computer science community, called minimum cycle basis (MCB) are identical to SSSR yet exhibit greater robustness. MCBbased algorithms can correctly reveal all rings in any complex graph. However, they are slow when applied to large complex graphs due to the inherent limitations of the algorithms used. Here, we suggest a heuristic method called RPPath. This method is a robust, simple, and fast search method with O(n^{3}) runtime algorithm that correctly identifies the SSSR of all of the test case of complex graphs by using approach different from the MCBbased method. Both the robustness and improvement in speed are achieved by using a pathincluded distance matrix and describing the characteristic features of rings in the matrix. This method is accurate and faster than any other methods and may find many application in various fields of science and engineering that use complicated graphs with thousands of nodes.
Complicated systems, for example, biological networks, telecommunication networks, and electrical circuits are described with graphs for mathematical manipulation. Smallest set of smallest ring (SSSR) is the characteristic feature of such graphs. Therefore, it is crucial to have robust and fast SSSR calculation method for accurate analysis and manipulations of a graph.
All reported algorithms developed by the chemistry community to find the SSSR, are based on two main methods, a walkthrough connection table (1–4), and graph manipulations (5–8). Some theoretical aspects are summarized by Downs et al. (9). Although various methods to find SSSRs have been developed using algorithms based on both methods, none are robust (10).
From a computer science perspective, identifying SSSRs, known as minimum cycle basis (MCB), is also interesting. In addition to its SSSR identification capability, the compute time of a particular method represented by polynomial time is an important consideration toward its performance. Horton proposed a robust polynomial time algorithm to find the MCB of a graph (11). The runtime was obtained as O(m^{3}n) for a graph with m edges and n vertices. In this approach, the algorithm created a superset of MCB and then extracted it as the shortest m − n + 1 linearly independent cycles from the superset using Gaussian elimination. De Pina (12) proposed a method with a runtime of O(m^{3} + mn^{2}logn) by using a method similar to Padberg and Rao (13) to solve the minimum weighted Todd cut problem. Several methods have been developed based on Horton and De Pina methods and their computing performance are summarized in Table 1(14–17).
Here, we present a heuristic method for ring perception method called RPPath. Instead of using a distance matrix as in Horton and De Pina's methods, our method uses pathincluded distance (PID) matrices, which allow us simply calculate path information and size of the ring simultaneously. With the aids of PID matrices, a robust ring perception method was developed with a runtime of O(n^{3}). Consisting of two units, ring candidate perception followed by ring making and SSSR selection from the candidates, RPPath is faster than any other algorithms developed thus far.
RPPath is consist of two units, (i) ring candidates perception and (ii) ring making and SSSR selection from the ring candidates.
PathIncluded Distance Matrix.
Let G = (V, E) be a connected graph with n vertices and m edges. V(G) = {v_{1,} v_{2,} v_{3}, …, v_{n}} is a finite nonempty set of vertices and E(G) = {e_{1,} e_{2,} e_{3}, …, e_{m}} is a finite (possibly empty) set of edges e_{ij} represented by unordered pairs of vertices in V. The distance matrix D = D(G) of a graph G with n vertices is the square n × n symmetric matrix in which diagonal elements are zero, d_{ii} = 0. The element d_{ij} corresponds to the minimum path length between vertices i and j. The Floyd–Warshall method is efficient at finding allpair shortest paths on a graph (18, 19). For example, the distance matrix of the graph in Fig. 1 is shown below.
To find a ring in a circuit with the shortest distance matrix (SDM), Horton's method uses the distance information between one point(v) and two points(x and y) connected with one edge, C(v, x,y) = p(v, x) + p(v, y) + p(x, y), as described in Fig. 2. Using the SDM, this method finds the paths of p(v, x), p(v, y), and p(x, y). In order for the three points v, x, and y to be members of a ring, the edges of the three paths, p(v, x), p(v, y), and p(x, y), are not shared among them (Fig. 2A). In the case of Fig. 2B, the three points are not members of a ring because p(v, x) and p(v, y) overlap. Because the SDM contains information on the shortest distances between two vertices only, another timeconsuming procedure is needed to identify the exact path. All possible rings can be revealed by considering all possible combinations of one point and two points connected by one edge, C(v, x, y) = p(v, x) + p(v, y) + p(x, y); however, this is also a timeconsuming procedure. To optimize the SSSR identification procedure, variations of Horton's method have tried to improve performance by increasing the speed of calculating the shortest path between all vertices or by eliminating unnecessary calculations for finding rings with the shortest path. However, due to shortcoming inherit in using a distance matrix, Horton's method and its variations have limitations in reducing computation time.
To overcome the limitations in path search resulting from using distance matrix, we introduced the PID matrix. The PID matrix calculation was based on the Floyd–Warshall method, which was originally developed to calculate the distance matrix. During the shortest path calculation, the path and the shortest path's number were stored in the PID matrix, which is crucial for increasing the speed of the ring perception. In addition to distance matrix, RPpath introduce additional two PID matrices, P and P′, that store shortestpath information compared with the other methods in Table 1. RPpath requires additional memory for P and P′ than other method. Whereas, the other methods in Table 1 require more memory than RPpath during ring perception procedure because other methods have to save all of the intermediate rings informations(Horton Set). The memory requirement varies depending on the types of graphs, a typical three graph types are shown in Table 2. Each elements of PID matrix are integers, which are assigned to an edge for an integer, and have different sizes depending on the length of the path information. For implementation, some programming languages, e.g., C, might need a indexing to indicate the starting points of each element.
For RPPath, two matrices, P_{e} and P_{e}′ were newly defined. P_{e} stores all of the possible shortest paths with a length of d_{ij} between vertices i and j. P_{e}′ stores all paths that have one more vertex than the shortest paths with a length of d_{ij} + 1. P_{e} and P_{e}′ for the graph in Fig. 1 are presented in Fig. 3.
With these two matrices, one can identify easily all possible rings and the size of these rings. Knowing the size of ring is very important for speeding up in terms of ring perception. Let P_{e} [i, j] be the number of possible path of P_{e} [i, j]. If P_{e} [i, j] is larger than 2 or P_{e} [i, j] is 1 and P_{e}′[i, j] is larger than 1, then the vertices i and j are members of a ring (Fig. 4).
In this manner, if the paths p_{ij,1} and p_{ij,2} in Fig. 4 are the shortest path between v_{i} and v_{j} and the edges are not shared, then the vertices v_{i} and v_{j} are bounded by two paths p_{ij,1} and p_{ij,2}, which form an evenmembered ring. where p_{ij,1} and p_{ij,2} are the lengths of p_{ij,1} and p_{ij,2}, respectively.
If p_{ik} is the shortest path (d_{ik}) between v_{i} and v_{k} in the P_{e} matrix and the d_{ij} + 1 path p_{ik}′ in the P_{e}′ matrix exists with edges not shared by the paths p_{ik} and p_{ik}′, then the vertices v_{i} and v_{k} are bounded by the paths p_{ik} and p_{ik}′ to form an oddmembered ring.
Finding the SSSR.
Because a set of all possible rings in a graph {C} can be described by two PID matrices P_{e} and P_{e}′, then the next step was to select SSSR members {C_{SSSR}} within {C}. The XOR operation (20) was introduced to select the SSSR from {C}. Once a ring is characterized by the PID matrices, the ring is compared with previously identified rings in {C_{SSSR}} by XOR. If a newly identified ring is not identical to any members of {C_{SSSR}} or does not contain any members in {C_{SSSR}}, it becomes a new member of {C_{SSSR}}.
Because largemembered rings may contain smaller rings and are not members of {C_{SSSR}}, ring identification starts from the smallest ring (e.g., a threemembered ring) and repeated with increasing numbers of ring members until the total number of SSSRs reaches the theoretical number of smallest rings, n_{SSSR} (nullity, cyclomatic number), as defined by the Euler rule (9). With Horton's methods, ring size is unknown until all paths are fully identified. In contrast, RPPath recognize the ring size and paths immediately. With this information, RPPath can start with the smallest rings to identify the SSSR. RPPath consider fewer candidate rings than Horton's methods. A detailed comparison of computing time for each stage in Horton's method and RPPath is summarized in Table 3.
Results
The robustness of RPPath in finding the SSSR was compared with some widely used SSSR finding methods in chemical information, namely the Figueras (21) and Welch (5) methods, which are implemented in the chemistry development kit (CDK) (22) and PreADMET (BMDRC), respectively.
A set of graphs tested by Berger et al. was used to test the robustness of RPPath in finding SSSR (10). Fig. 5A has a unique minimum cycle basis consisting entirely of 36 triangles. Our RPPath identified all threemembered rings correctly, whereas the Figueras and Welch methods failed to uncover all of them(Table 4). Moreover, these methods did not produce the correct outcome even for simpler test graphs that were analogous to the molecular graph, whereas the RPPath correct identified the SSSR of all graphs (B, C, and D in Table 4).
Computing performance described by the worst case performance of RPPath was compared with Horton's method (Table 3). RPPath calculates the PID(P_{e} and P_{e}′) and the distance matrices, whereas Horton method calculates only the distance matrix, Thus, RPPath has a slightly longer computing time during stage 1. However, the whole procedure for ring perception became much simpler and faster once PID was introduced. In stage 2, because Horton's method considers three paths using the SDM, all rings could be identified by considering all possible combinations of the three points. In contrast, RPPath identifies the rings and their size easily with little computation. Because RPPath discovers rings using a simple algorithm and considers fewer candidates in finding the SSSR, The computing speed of the stages 3 and 4 are much faster than that of Horton's method. The variations of Horton's methods have tried to focus on increasing the speed of this calculation. However, due to limitations inherent in using an SDM, Horton's method and its variations cannot be improved much more.
For the case of a graph with a single cycle, m is equal to n and the performance of RPpath algorithm is comparable with the De Pina, Berger, and Kavitha algorithms. For the case of a complete graph, every pair of distinct vertices are connected by edges, m is n(n − 1)/2 and then m is proportional to n^{2}. In complete graph, RPPath is faster than the other methods in Table 1. General cases are between the two extreme cases.
Conclusion
We present a robust and computationally fast method for ring perception and SSSR identification from complex graphs. Instead of using an SDM, RPPath uses a PID matrix that contains information on the paths and size of the rings. With this information, we developed a robust ring perception method with dramatically reduced computing time. This method may have considerable influence on various fields of science and engineering, including electrical circuit testing (23, 24), periodic timetable planning (25), dealing with rigidity of framework structure (26), chemical ring perception (9), and in network analysis of complicated biological pathways with thousands of nodes.
Methods
A ring perception method is presented here followed by the SSSR identification method that incorporates all of the features described by the RPPath method.

Step 1. Take information from a graph with M vertices and N edges.

Step 2. Remove all open acyclic nodes with one edge(connectivity equal to 1, corresponding to terminal nodes) and repeat the procedure, stoping at nodes with a connectivity larger than 1. Now, M and N become m and n, respectively.

Step 3. Calculate the theoretical number of SSSR : n_{SSSR} = m − n + 1, If n_{SSSR} = 0, stop process.

Step 4. Calculate the PID matrices P_{e} and P_{e}′.

Step 5. Create the cycle set([C_{num}, P_{e}[i,j], P_{e}′[i,j]]) using the PID matrices. Cases in which the shortest path number equals zero, or [P_{e}[i,j] = 1 and P_{e}′[i,j] = 0] can be omitted.

Step 6. Order the cycle set by C_{num} number.

Step 7. Construct ring using the PID matrix from the smallestmembered rings.

Step 8. Use the XOR operation to find the SSSR from this ring.

Step 9. If a ring is a new member of the SSSR, then save it.

Step 10. Repeat steps 7 to 9, until n_{ringidx} reaches n_{SSSR}, where n_{ringidx} is the number of founded SSSR.
Acknowledgments
This work was partly supported by the IT R&D program of MKE/IITA, which is funded by MKE/IITA Grant 2008F02901, Development of eOrgan system based on Cyber Computing; and the 21st Century Frontier R&D Program Grant CBM31B2000010000 of the Center for Biological Modulators funded by the Ministry of Education, Science and Technology, Korea.
Footnotes
 ^{1}To whom correspondence should be addressed. Email: ktno{at}yonsei.ac.kr

Author contributions: C.J.L., Y.M.K., K.H.C., and K.T.N. designed research, performed research, contributed new reagents/analytic tools, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/cgi/content/full/0813040106/DCSupplemental.
 Received December 22, 2008.
References
 ↵
 Bersohn M
 ↵
 Esack A
 ↵
 Zamora A
 ↵
 ↵
 ↵
 ↵
 Fujita S
 ↵
 Fujita S
 ↵
 Downs G,
 Gillet V,
 Holliday J,
 Lynch M
 ↵
 ↵
 ↵
 de Pina J
 ↵
 ↵
 Golynski A.,
 Horton J
 ↵
 ↵
 Kavitha T,
 Mehlhorn K,
 Michail D,
 Paluch K
 ↵
 Mehlhorn K,
 Michail D
 ↵
 Floyd R
 ↵
 ↵
 Wipke W,
 Dyott T
 ↵
 Figueras J
 ↵
 ↵
 Chua L,
 Chen L
 ↵
 ↵
 ↵
Citation Manager Formats
Sign up for Article Alerts
Article Classifications
 Biological Sciences
 Biophysics and Computational Biology