Add an example app with paths extraction #49
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,246 @@ | ||
| >[!CAUTION] | ||
| >For demo purposes only! | ||
| >Do not expect big graphs to be processed successfully (in reasonable time or without out-of-memory errors). | ||
|
|
||
| This demo is based on UCFS, which, for a given grammar represented as an RSM, a graph, and start vertices, produces an SPPF. | ||
|
|
||
| **RSM** (Recursive State Machine) is an automaton-like representation of context-free languages. | ||
|
|
||
| **SPPF** (Shared Packed Parse Forest) is a derivation-tree-like structure that represents **all** possible paths satisfying the specified grammar. If the number of such paths is infinite, the SPPF contains cycles. | ||
| SPPF consists of nodes of the types listed below. Each node has a unique Id and detailed information specific to its type. | ||
|
|
||
| * **Nonterminal** node contains the name of the non-terminal and pairs of vertices from the input graph that are the start and end of paths derived from that non-terminal. | ||
|
|
||
|  | ||
|
|
||
| This node has number ```0``` and is the root of all derivations for all paths from 1 to 4 derivable from non-terminal ```S```. | ||
|
|
||
| * **Terminal** node is a leaf and corresponds to an edge. | ||
|
|
||
|  | ||
|
|
||
| This node depicts edge ```3 -alloc-> 4```. | ||
|
|
||
| * **Epsilon** node is a simplified way to represent that $\varepsilon$ is derived at a specific position. | ||
|
|
||
|  | ||
|
|
||
| * **Range** node is a supplementary node that helps reuse subtrees. | ||
|
|
||
|  | ||
|
|
||
| This node represents all subpaths from 0 to 4 that are accepted while the RSM transitions from ```S_0``` to ```S_2```. | ||
|
|
||
| * **Intermediate** node is a supplementary node used to connect subpaths. | ||
|
|
||
|  | ||
|
|
||
| This node depicts that the path from 0 to 2 is composed of two parts: from 0 to 1 and from 1 to 2. | ||
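To make the five node kinds above easier to compare, here is a minimal Kotlin sketch of them as a sealed hierarchy. All type and field names (`SppfNodeSketch`, `rsmFrom`, `middle`, and so on) are assumptions made for illustration. They are not the UCFS classes, which carry more information than shown here.

```kotlin
// Hypothetical model of the five node kinds; these are NOT the UCFS classes.
sealed interface SppfNodeSketch {
    val id: Int
}

data class NonterminalNode(override val id: Int, val name: String, val from: Int, val to: Int) : SppfNodeSketch
data class TerminalNode(override val id: Int, val label: String, val from: Int, val to: Int) : SppfNodeSketch
data class EpsilonNode(override val id: Int, val position: Int) : SppfNodeSketch
data class RangeNode(
    override val id: Int, val from: Int, val to: Int,
    val rsmFrom: String, val rsmTo: String,
) : SppfNodeSketch
data class IntermediateNode(override val id: Int, val from: Int, val middle: Int, val to: Int) : SppfNodeSketch

// One-line description per node kind; `when` is exhaustive because the interface is sealed.
fun describe(node: SppfNodeSketch): String = when (node) {
    is NonterminalNode -> "paths ${node.from} -> ${node.to} derivable from ${node.name}"
    is TerminalNode -> "edge ${node.from} -${node.label}-> ${node.to}"
    is EpsilonNode -> "epsilon at ${node.position}"
    is RangeNode -> "subpaths ${node.from} -> ${node.to} while the RSM moves ${node.rsmFrom} -> ${node.rsmTo}"
    is IntermediateNode -> "path ${node.from} -> ${node.to} split at ${node.middle}"
}
```

For instance, `describe(TerminalNode(1, "alloc", 3, 4))` describes the terminal node from the figure above (the id `1` is made up).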
|
|
||
| **Requirements**: Java 11 | ||
|
|
||
| **To run (from project root)**: | ||
|
|
||
| ```bash | ||
| ./gradlew :cfpq-app:run | ||
| ``` | ||
|
|
||
| **Input graphs:** ```src/main/resources/``` | ||
|
|
||
| **Grammar and code for paths extraction:** ```src/main/kotlin/me/vkutuev/Main.kt``` | ||
|
|
||
| >[!NOTE] | ||
| > We implemented a very naive path extraction algorithm solely to demonstrate SPPF traversal. | ||
|
|
||
| ## Examples | ||
|
|
||
| We provide a few code snippets, the corresponding graphs to be analyzed, parts of the resulting SPPFs, and extracted paths. | ||
|
|
||
| For analysis, we use the following extended points-to grammar (start non-terminal is ```S```), which allows us to analyze chains of fields. | ||
| ``` | ||
| PointsTo -> ("assign" | ("load_i" Alias "store_i"))* "alloc" | ||
| FlowsTo -> "alloc_r" ("assign_r" | ("store_i_r" Alias "load_i_r"))* | ||
| Alias -> PointsTo FlowsTo | ||
| S -> (Alias? "store_i")* PointsTo | ||
| ``` | ||
|
|
||
| For all our examples, we use a common grammar with $i \in [0..3]$. | ||
| The corresponding RSM is presented below: | ||
|
|
||
|  | ||
|
|
||
| ### Example 1 | ||
| Code snippet: | ||
| ```java | ||
| val n = new X() | ||
| val y = new Y() | ||
| val z = new Z() | ||
| val l = n | ||
| val t = y | ||
| l.u = y | ||
| t.v = z | ||
| ``` | ||
|
|
||
| Respective graph: | ||
|
|
||
|  | ||
|
|
||
| Resulting SPPF: | ||
|
|
||
|  | ||
|
|
||
| Three trees are extracted because there are three paths of interest from node 1. | ||
| We do not extract subpaths derivable from non-terminals ```Alias``` and ```PointsTo```, as they contain no useful information for restoring fields. | ||
|
|
||
|
|
||
| Respective paths: | ||
|
|
||
| * [(1-PointsTo->0)] | ||
|
|
||
| This path is trivial. Such paths will be omitted in further examples. | ||
|
|
||
| * [(1-Alias->2), (2-store_0->3), (3-PointsTo->4)] | ||
|
|
||
| This path means that ```n.u = new Y()```. Vertex 2 is an alias for 1 (corresponding to ```n```), and 2 has a field ```u``` that points to ```new Y()``` (```store_0``` corresponds to ```l.u = y```). | ||
|
|
||
| * [(1-Alias->2), (2-store_0->3), (3-Alias->5), (5-store_1->6), (6-PointsTo->7)] | ||
|
|
||
| This path means that ```n.u.v = new Z()```. | ||
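The reading of a path as a chain of fields can be sketched in a few lines of Kotlin. This is a minimal sketch under stated assumptions: `Edge`, `parsePath`, and `toFieldChain` are hypothetical names, and the mapping from `store_i` indices to source-level field names must be supplied by the caller, because the demo itself only knows the indices.

```kotlin
// Hypothetical edge type mirroring the printed form (start-label->end).
data class Edge(val start: Int, val symbol: String, val end: Int)

// Parses the printed form, e.g. "[(1-Alias->2), (2-store_0->3), (3-PointsTo->4)]".
fun parsePath(text: String): List<Edge> =
    Regex("""\((\d+)-(\w+)->(\d+)\)""").findAll(text).map { m ->
        val (start, symbol, end) = m.destructured
        Edge(start.toInt(), symbol, end.toInt())
    }.toList()

// Renders the chain of stored fields, e.g. "n.u.v".
// `root` and `fieldNames` are assumptions provided by the caller;
// an index without a known name falls back to "field<index>".
fun toFieldChain(path: List<Edge>, root: String, fieldNames: Map<Int, String>): String {
    val fields = path
        .filter { it.symbol.startsWith("store_") }
        .map { edge ->
            val index = edge.symbol.removePrefix("store_").toInt()
            fieldNames[index] ?: "field$index"
        }
    return (listOf(root) + fields).joinToString(".")
}
```

With `root = "n"` and `fieldNames = mapOf(0 to "u", 1 to "v")`, the third path above is rendered as `n.u.v`.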
|
|
||
|
|
||
| ### Example 2 | ||
|
|
||
| Code snippet: | ||
| ```java | ||
| val n = new X() | ||
| val l = n | ||
| while (...){ | ||
| l.next = new X() | ||
| l = l.next | ||
| } | ||
| ``` | ||
|
|
||
| Respective graph: | ||
|
|
||
|  | ||
|
|
||
| Part of resulting SPPF: | ||
|
|
||
|  | ||
|
|
||
| This part contains a cycle formed by vertices 27–31–34–37–38–40–42–44–47–49–52–56 (colored in red). This is because there are infinitely many paths of interest. We extract some of them: | ||
|
|
||
| * [(0-Alias->2), (2-store_0->3), (3-PointsTo->4)] | ||
|
|
||
| ```n.next = new X () // line 4``` | ||
|
|
||
| * [(0-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-PointsTo->4)] | ||
|
|
||
| ```n.next.next = new X () // line 4``` | ||
|
|
||
| * [(0-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-PointsTo->4)] | ||
|
|
||
| ```n.next.next.next = new X () // line 4``` | ||
|
|
||
| * [(0-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-PointsTo->4)] | ||
|
|
||
| ```n.next.next.next.next = new X () // line 4``` | ||
|
|
||
| * [(0-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-PointsTo->4)] | ||
|
|
||
| ```n.next.next.next.next.next = new X () // line 4``` | ||
|
|
||
| * [(0-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-Alias->2), (2-store_0->3), (3-PointsTo->4)] | ||
|
|
||
| ```n.next.next.next.next.next.next = new X () // line 4``` | ||
|
|
||
| More paths can be extracted if needed. Traversal should be tuned accordingly. | ||
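Why a bound is needed, and how raising it yields more paths, can be illustrated on a simplified model. The sketch below enumerates paths over a plain list of labelled edges that mimics the shape of the extracted paths in this example. It does not traverse UCFS SPPF nodes, and `LabeledEdge`, `boundedPaths`, and `example2Edges` are hypothetical names.

```kotlin
data class LabeledEdge(val from: Int, val label: String, val to: Int)

// Enumerates every path from `start` to `target` that uses at most `maxEdges` edges.
fun boundedPaths(
    edges: List<LabeledEdge>,
    start: Int,
    target: Int,
    maxEdges: Int,
): List<List<LabeledEdge>> {
    val result = mutableListOf<List<LabeledEdge>>()
    fun go(vertex: Int, prefix: List<LabeledEdge>) {
        if (vertex == target) result += prefix
        if (prefix.size == maxEdges) return // the bound guarantees termination on cycles
        for (edge in edges.filter { it.from == vertex }) go(edge.to, prefix + edge)
    }
    go(start, emptyList())
    return result
}

// Abstracted shape of the paths above: the cycle 2 -> 3 -> 2 makes the path set infinite.
val example2Edges = listOf(
    LabeledEdge(0, "Alias", 2),
    LabeledEdge(2, "store_0", 3),
    LabeledEdge(3, "Alias", 2),
    LabeledEdge(3, "PointsTo", 4),
)
```

In this model, a bound of 3 edges admits only the shortest path, and each additional 2 edges admits one more trip around the cycle.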
|
|
||
| ### Example 3 | ||
|
|
||
| Code snippet: | ||
| ```java | ||
| val n = new X() | ||
| val l = n | ||
| while (...){ | ||
| val t = new X() | ||
| l.next = t | ||
| l = t | ||
| } | ||
| ``` | ||
|
|
||
| Respective graph: | ||
|
|
||
|  | ||
|
|
||
| Part of resulting SPPF: | ||
|
|
||
|  | ||
|
|
||
| This SPPF also contains a cycle (3–5–7–11), so there are infinitely many paths of interest, and we extract only a few of them. | ||
|
|
||
| * [(0-Alias->1), (1-store_0->2), (2-PointsTo->3)] | ||
|
|
||
| ```n.next = new X() // line 4``` | ||
|
|
||
| * [(0-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-PointsTo->3)] | ||
|
|
||
| ```n.next.next = new X() // line 4``` | ||
|
|
||
| * [(0-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-PointsTo->3)] | ||
|
|
||
| ```n.next.next.next = new X() // line 4``` | ||
|
|
||
| * [(0-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-PointsTo->3)] | ||
|
|
||
| ```n.next.next.next.next = new X() // line 4``` | ||
|
|
||
| * [(0-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-PointsTo->3)] | ||
|
|
||
| ```n.next.next.next.next.next = new X() // line 4``` | ||
|
|
||
| * [(0-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-Alias->1), (1-store_0->2), (2-PointsTo->3)] | ||
|
|
||
| ```n.next.next.next.next.next.next = new X() // line 4``` | ||
|
|
||
|
|
||
| ### Example 4 | ||
|
|
||
| Code snippet: | ||
|
|
||
| ```java | ||
| val n = new X() | ||
| val z = new Z() | ||
| val u = new U() | ||
| z.x = n | ||
| u.y = n | ||
| val v = z.x | ||
| v.p = new Y() | ||
| val r = u.y | ||
| r.q = new P() | ||
| ``` | ||
| Respective graph: | ||
|
|
||
|  | ||
|
|
||
| For this example, we omit the figure of the SPPF due to its size. However, we present the respective paths. Note that in this example, we specify two vertices as start: 1 and 8. | ||
|
|
||
| * [(1-Alias->9), (9-store_3->11), (11-PointsTo->13)] | ||
|
|
||
| ```n.q = new P()``` | ||
|
|
||
| * [(1-Alias->8), (8-store_2->10), (10-PointsTo->12)] | ||
|
|
||
| ```n.p = new Y() ``` | ||
|
|
||
| * [(8-Alias->9), (9-store_3->11), (11-PointsTo->13)] | ||
|
|
||
| ```v.q = new P() ``` | ||
|
|
||
| * [(8-store_2->10), (10-PointsTo->12)] | ||
|
|
||
| ```v.p = new Y() ``` | ||
|
|
||
| * [(8-Alias->8), (8-store_2->10), (10-PointsTo->12)] | ||
|
|
||
| ```v.p = new Y() ``` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| plugins { | ||
| kotlin("jvm") version "2.3.0" | ||
| application | ||
| } | ||
|
|
||
| repositories { | ||
| mavenCentral() | ||
| } | ||
|
|
||
| dependencies { | ||
| implementation(kotlin("stdlib")) | ||
| implementation(project(":solver")) | ||
| } | ||
|
|
||
| kotlin { | ||
| jvmToolchain(11) | ||
| } | ||
|
|
||
| application { | ||
| mainClass = "org.ucfs.paths.MainKt" | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,129 @@ | ||
| package org.ucfs.paths | ||
|
|
||
| import org.ucfs.grammar.combinator.Grammar | ||
| import org.ucfs.grammar.combinator.extension.StringExtension.or | ||
| import org.ucfs.grammar.combinator.extension.StringExtension.times | ||
| import org.ucfs.grammar.combinator.regexp.Nt | ||
| import org.ucfs.grammar.combinator.regexp.Option | ||
| import org.ucfs.grammar.combinator.regexp.many | ||
| import org.ucfs.grammar.combinator.regexp.or | ||
| import org.ucfs.input.DotParser | ||
| import org.ucfs.input.InputGraph | ||
| import org.ucfs.input.TerminalInputLabel | ||
| import org.ucfs.parser.Gll | ||
| import org.ucfs.sppf.getSppfDot | ||
| import org.ucfs.sppf.node.* | ||
| import java.nio.file.Files | ||
| import java.nio.file.Path | ||
|
|
||
| class PointsToGrammar : Grammar() { | ||
| val S by Nt().asStart() | ||
| val Alias by Nt() | ||
| val PointsTo by Nt(many("assign" or ("load_0" * Alias * "store_0") | ||
|
Collaborator
Is there a paper or online page with a more detailed description of this task and grammar? A link would be great. |
||
| or ("load_1" * Alias * "store_1") | ||
| or ("load_2" * Alias * "store_2") | ||
| or ("load_3" * Alias * "store_3") | ||
| ) * "alloc") | ||
| val FlowsTo by Nt("alloc_r" * many("assign_r" or ("store_0_r" * Alias * "load_0_r") | ||
| or ("store_1_r" * Alias * "load_1_r") | ||
| or ("store_2_r" * Alias * "load_2_r") | ||
| or ("store_3_r" * Alias * "load_3_r"))) | ||
|
|
||
| init { | ||
| Alias /= PointsTo * FlowsTo | ||
| S /= many( Option(Alias) * ("store_0" or "store_1" or "store_2" or "store_3")) * PointsTo | ||
| } | ||
| } | ||
|
|
||
| fun readGraph(name: String): InputGraph<Int, TerminalInputLabel> { | ||
| val dotGraph = object {}.javaClass.getResource("/$name")?.readText() | ||
| ?: throw RuntimeException("File $name not found in resources") | ||
| val dotParser = DotParser() | ||
| return dotParser.parseDot(dotGraph) | ||
| } | ||
|
|
||
| data class OutEdge(val start: Int, val symbol: String, val end: Int){ | ||
| override fun toString(): String = "($start-$symbol->$end)" | ||
| } | ||
|
|
||
| fun getPathFromSppf(node: RangeSppfNode<Int>, maxDepth: Int): List<List<OutEdge>>? { | ||
| if (maxDepth == 0) { | ||
| return null | ||
| } | ||
|
Comment on lines +49 to +52
Collaborator
It would be great to have unit tests for this specific part and for the points-to task in general. |
||
| when (val nodeType = node.type) { | ||
| is TerminalType<*> -> { | ||
| val range = node.inputRange ?: throw RuntimeException("Null inputRange for TerminalType node of SPPF") | ||
| return listOf(listOf(OutEdge(range.from, nodeType.terminal.toString(), range.to))) | ||
| } | ||
|
|
||
| //Do not extract subpaths for non-terminal Alias because they are useless. | ||
| is NonterminalType if nodeType.startState.nonterminal.name == "Alias" -> { | ||
| val range = node.inputRange ?: throw RuntimeException("Null inputRange for Alias Nonterminal node of SPPF") | ||
| return listOf(listOf(OutEdge(range.from, "Alias", range.to))) | ||
| } | ||
|
|
||
| //Do not extract subpaths for non-terminal PointsTo because they are useless. | ||
| is NonterminalType if nodeType.startState.nonterminal.name == "PointsTo" -> { | ||
| val range = node.inputRange ?: throw RuntimeException("Null inputRange for PointsTo Nonterminal node of SPPF") | ||
| return listOf(listOf(OutEdge(range.from, "PointsTo", range.to))) | ||
| } | ||
|
|
||
| is EpsilonNonterminalType -> { | ||
| return listOf(emptyList()) | ||
| } | ||
|
|
||
| is EmptyType -> { | ||
| throw RuntimeException("SPPF cannot contain EmptyRange") | ||
| } | ||
|
|
||
| is IntermediateType<*>, is NonterminalType -> { | ||
| val subPaths = node.children.map { getPathFromSppf(it, maxDepth - 1) } | ||
| if (subPaths.any { it == null }) { | ||
| return null | ||
| } | ||
| val paths = subPaths.filterNotNull().fold(listOf(listOf<OutEdge>())) { acc, lst -> | ||
| acc.flatMap { list -> lst.map { element -> list + element } } | ||
| } | ||
| return paths | ||
| } | ||
|
|
||
| is Range -> { | ||
| val paths = node.children.map { | ||
| getPathFromSppf(it, maxDepth - 1)?.filterNotNull() | ||
| }.filterNotNull().flatten() | ||
| if (paths.isEmpty()) { | ||
| return null | ||
| } | ||
| return paths | ||
| } | ||
|
|
||
| else -> { | ||
| println("Type of node is ${node.type.javaClass}") | ||
| throw RuntimeException("Unknown RangeType in SPPF") | ||
| } | ||
| } | ||
| } | ||
|
|
||
| fun saveSppf(name: String, sppf: Set<RangeSppfNode<Int>>) { | ||
| val graphName = name.removeSuffix(".dot") | ||
| val genPath = Path.of("gen", "sppf") | ||
| Files.createDirectories(genPath) | ||
| val file = genPath.resolve("${graphName}_sppf.dot").toFile() | ||
|
|
||
| file.printWriter().use { out -> | ||
| out.println(getSppfDot(sppf)) | ||
| } | ||
| } | ||
|
|
||
| fun main() { | ||
| listOf("graph_1.dot", "graph_2.dot", "graph_3.dot", "graph_4.dot").forEach { graphName -> | ||
| val graph = readGraph(graphName) | ||
| val grammar = PointsToGrammar() | ||
| val gll = Gll.gll(grammar.rsm, graph) | ||
| val sppf = gll.parse() | ||
| println("Found paths in $graphName") | ||
| sppf.forEach { getPathFromSppf(it, maxDepth = 30)?.forEach { println(it.toString()) } } | ||
| println() | ||
| saveSppf(graphName, sppf) | ||
| } | ||
| } | ||
Why are "me" and "vkutuev" in the path? It would be better to use a name that corresponds to the project/task semantics rather than the author (you can set the author in the file's Javadoc if needed).
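As a possible starting point for the unit tests requested above, the combination step from the `IntermediateType`/`NonterminalType` branch of `getPathFromSppf` can be lifted into a standalone function and checked without building an SPPF. This is only a sketch: `combineChildPaths` is a hypothetical name, and the function is a generic copy of the fold in `Main.kt`, not code from the PR.

```kotlin
// Stand-alone, generic copy of the combination step so it can be tested in isolation.
// Each child contributes a list of alternative paths; `null` means the depth limit
// was reached somewhere below that child, which invalidates the whole combination.
fun <T> combineChildPaths(childPaths: List<List<List<T>>?>): List<List<T>>? {
    if (childPaths.any { it == null }) return null
    // Cartesian product: every alternative of each child is appended
    // to every path accumulated from the children before it.
    return childPaths.filterNotNull().fold(listOf(listOf<T>())) { acc, alternatives ->
        acc.flatMap { prefix -> alternatives.map { suffix -> prefix + suffix } }
    }
}
```

Cases worth covering include a child with several alternatives, an epsilon child (a single empty path), and a child that hit the depth limit. The same cases could then be turned into tests against `getPathFromSppf` itself once small SPPFs are available as fixtures.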