For the last week, I have been prototyping a graph planner and executor for graph languages. Specifically, I have been motivated to pursue this line of thought for Gremlin, but the engine being designed is generic to support any finite grammar. This has been demonstrated by supporting a dummy grammar in the initial prototype. The prototype has support for Gremlin.
Dummy Grammar
The grammar supported as the prototype is known as "DummyGrammar". The structure of the graph language supported is as below:
NodeName.WhereClause.MatchesClause
Parser
The parser parses the raw query and populates an AST for further processing.
Heuristic Planner
The heuristic planner works in two stages:
1) First Phase Optimizer: First phase optimizer applies heuristics on the original AST and allows the AST to become more structurally optimized. These optimizations can be language agnostic.
2) Second Phase Optimizer: Second Phase Optimizer is responsible for the actual generation of the plan for execution. Second phase optimizer takes dynamic factors into account, such as the current memory usage to take decisions about the best execution strategy for the search.
Executor
The executor works by creating candidates for the next partial solution using the chosen execution strategy as defined by the heuristic planner. However, the executor has a dynamic feedback mechanism, where the executor has the flexibility to generate candidates using different heuristics suited for different conditions within the same algorithm. For eg, within Hill Climbing algorithm, a choice of simple heuristic vs steep climbing heuristic may be done by the executor during execution by generating candidates using both heuristics and then assigning them a cost. This allows for any number of heuristic functions to be dynamically added and used within a search algorithm, thanks to the service based design of the cost factory referred to by the executor.
ToDo
The project is in no way complete and needs extensive tests, planner heuristics and better execution in terms of multiple heuristics present.
The objective of the prototype is to demonstrate how a cost-based heuristical model can be applied to a graph traversal system, thus allowing a cost-based optimizer and declarative queries on top of graph-structured storage systems. This will open new avenues in the space of graph query languages, especially around ease of usage, performance and analytics.
Code
Github Link
Dummy Grammar
The grammar supported as the prototype is known as "DummyGrammar". The structure of the graph language supported is as below:
NodeName.WhereClause.MatchesClause
Parser
The parser parses the raw query and populates an AST for further processing.
Heuristic Planner
The heuristic planner works in two stages:
1) First Phase Optimizer: First phase optimizer applies heuristics on the original AST and allows the AST to become more structurally optimized. These optimizations can be language agnostic.
2) Second Phase Optimizer: Second Phase Optimizer is responsible for the actual generation of the plan for execution. Second phase optimizer takes dynamic factors into account, such as the current memory usage to take decisions about the best execution strategy for the search.
Executor
The executor works by creating candidates for the next partial solution using the chosen execution strategy as defined by the heuristic planner. However, the executor has a dynamic feedback mechanism, where the executor has the flexibility to generate candidates using different heuristics suited for different conditions within the same algorithm. For eg, within Hill Climbing algorithm, a choice of simple heuristic vs steep climbing heuristic may be done by the executor during execution by generating candidates using both heuristics and then assigning them a cost. This allows for any number of heuristic functions to be dynamically added and used within a search algorithm, thanks to the service based design of the cost factory referred to by the executor.
ToDo
The project is in no way complete and needs extensive tests, planner heuristics and better execution in terms of multiple heuristics present.
The objective of the prototype is to demonstrate how a cost-based heuristical model can be applied to a graph traversal system, thus allowing a cost-based optimizer and declarative queries on top of graph-structured storage systems. This will open new avenues in the space of graph query languages, especially around ease of usage, performance and analytics.
Code
Github Link