ATLAS-5218: Add command-line utility to run Gremlin queries #531

Open
pinal-shah wants to merge 2 commits into master from ATLAS-5218

pinal-shah (Collaborator) commented Feb 17, 2026

What changes were proposed in this pull request?

Adds a command-line utility, intended for developers only, that runs Gremlin queries against Atlas' embedded JanusGraph backend.

How was this patch tested?

  • Built and started the Atlas server.
  • Extracted the tarball from distro/target/apache-atlas--atlas-gremlin-cli.tar.gz.
  • Set the ATLAS_CONF environment variable to the path of the Atlas configuration directory.
  • Set the ATLAS_CLASSPATH environment variable to the path of the dependency jars.
  • Ran the atlas-gremlin-cli.sh script.


Copilot AI left a comment

Pull request overview

This PR adds a command-line utility for running Gremlin queries against Atlas' embedded JanusGraph backend. The tool provides direct access to the graph database for debugging and administrative queries, supporting both inline queries and script files with optional transaction commit.

Changes:

  • Introduces a new maven module atlas-gremlin-cli-tool with core CLI functionality
  • Adds a shell script wrapper to configure and launch the Java tool
  • Provides logback configuration for logging and a README with usage instructions

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| tools/atlas-gremlin-cli/src/main/java/org/apache/atlas/tools/GremlinCli.java | Main Java implementation handling query execution, transaction management, and command-line argument parsing |
| tools/atlas-gremlin-cli/scripts/atlas-gremlin-cli.sh | Shell script wrapper that sets up classpath, validates environment, and launches the Java tool |
| tools/atlas-gremlin-cli/src/main/resources/atlas-logback.xml | Logback configuration for logging to file with rolling policy |
| tools/atlas-gremlin-cli/pom.xml | Maven module definition with dependencies scoped as "provided" |
| tools/atlas-gremlin-cli/README | Documentation explaining setup and usage |
| distro/src/main/assemblies/atlas-gremlin-cli.xml | Assembly configuration to package the tool as a tar.gz distribution |
| distro/pom.xml | Adds the new assembly descriptor to the distribution build |
| pom.xml | Adds the new tool module to the parent POM |

Comment on lines +39 to +52
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.atlas</groupId>
            <artifactId>atlas-graphdb-janus</artifactId>
            <version>${project.version}</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <scope>provided</scope>

Copilot AI Feb 18, 2026

The tool relies on ATLAS_CLASSPATH being set by the user to provide all dependencies (commons-cli, atlas-graphdb-janus, slf4j-api, and their transitive dependencies). This differs from other tools like notification-analyzer which bundle all dependencies in their distribution. If ATLAS_CLASSPATH is not properly set or is missing required JARs, the tool will fail with ClassNotFoundException at runtime. Consider documenting the exact JAR requirements in the README or following the notification-analyzer pattern of bundling dependencies in the distribution for better user experience.

Suggested change
-            <scope>provided</scope>
-        </dependency>
-        <dependency>
-            <groupId>org.apache.atlas</groupId>
-            <artifactId>atlas-graphdb-janus</artifactId>
-            <version>${project.version}</version>
-            <scope>provided</scope>
-        </dependency>
-        <dependency>
-            <groupId>org.slf4j</groupId>
-            <artifactId>slf4j-api</artifactId>
-            <scope>provided</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.atlas</groupId>
+            <artifactId>atlas-graphdb-janus</artifactId>
+            <version>${project.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>org.slf4j</groupId>
+            <artifactId>slf4j-api</artifactId>

Comment on lines +45 to +91

    public static void main(String[] args) throws Exception {
        CommandLine cmd = parseArgs(args);

        if (cmd.hasOption("h")) {
            printHelp();
            return;
        }

        final String query = getQuery(cmd);
        final boolean commit = cmd.hasOption("commit");

        new AtlasJanusGraphDatabase();

        JanusGraph graph = AtlasJanusGraphDatabase.getGraphInstance();

        try {
            GraphTraversalSource g = graph.traversal();
            GremlinGroovyScriptEngine engine = new GremlinGroovyScriptEngine();

            Bindings bindings = engine.createBindings();
            bindings.put("graph", graph);
            bindings.put("g", g);
            bindings.put("__", __.class);
            bindings.put("P", P.class);

            Object result = eval(engine, bindings, query);

            if (result instanceof Traversal) {
                result = ((Traversal<?, ?>) result).toList();
            }

            System.out.println(String.valueOf(result));

            finishTx(graph, commit);
        } catch (ScriptException se) {
            safeRollback(graph);
            throw se;
        } catch (Throwable t) {
            safeRollback(graph);
            throw t;
        } finally {
            try {
                graph.close();
            } catch (Exception e) {
                LOG.warn("Failed to close graph", e);
            }

Copilot AI Feb 18, 2026

The main method lacks proper exit code handling. Similar Atlas tools like RepairIndex and BulkFetchAndUpdate use System.exit with explicit exit codes (EXIT_CODE_SUCCESS=0, EXIT_CODE_FAILED=1) to signal success or failure to the calling process. Without this, the CLI will always return exit code 0 even when exceptions occur, which prevents shell scripts from detecting failures. Consider adding exit codes and System.exit calls similar to other tools in the repository.

Suggested change
-    public static void main(String[] args) throws Exception {
-        CommandLine cmd = parseArgs(args);
-        if (cmd.hasOption("h")) {
-            printHelp();
-            return;
-        }
-        final String query = getQuery(cmd);
-        final boolean commit = cmd.hasOption("commit");
-        new AtlasJanusGraphDatabase();
-        JanusGraph graph = AtlasJanusGraphDatabase.getGraphInstance();
-        try {
-            GraphTraversalSource g = graph.traversal();
-            GremlinGroovyScriptEngine engine = new GremlinGroovyScriptEngine();
-            Bindings bindings = engine.createBindings();
-            bindings.put("graph", graph);
-            bindings.put("g", g);
-            bindings.put("__", __.class);
-            bindings.put("P", P.class);
-            Object result = eval(engine, bindings, query);
-            if (result instanceof Traversal) {
-                result = ((Traversal<?, ?>) result).toList();
-            }
-            System.out.println(String.valueOf(result));
-            finishTx(graph, commit);
-        } catch (ScriptException se) {
-            safeRollback(graph);
-            throw se;
-        } catch (Throwable t) {
-            safeRollback(graph);
-            throw t;
-        } finally {
-            try {
-                graph.close();
-            } catch (Exception e) {
-                LOG.warn("Failed to close graph", e);
-            }
+    private static final int EXIT_CODE_SUCCESS = 0;
+    private static final int EXIT_CODE_FAILED = 1;
+    public static void main(String[] args) throws Exception {
+        int exitCode = EXIT_CODE_SUCCESS;
+        JanusGraph graph = null;
+        try {
+            CommandLine cmd = parseArgs(args);
+            if (cmd.hasOption("h")) {
+                printHelp();
+                return;
+            }
+            final String query = getQuery(cmd);
+            final boolean commit = cmd.hasOption("commit");
+            new AtlasJanusGraphDatabase();
+            graph = AtlasJanusGraphDatabase.getGraphInstance();
+            try {
+                GraphTraversalSource g = graph.traversal();
+                GremlinGroovyScriptEngine engine = new GremlinGroovyScriptEngine();
+                Bindings bindings = engine.createBindings();
+                bindings.put("graph", graph);
+                bindings.put("g", g);
+                bindings.put("__", __.class);
+                bindings.put("P", P.class);
+                Object result = eval(engine, bindings, query);
+                if (result instanceof Traversal) {
+                    result = ((Traversal<?, ?>) result).toList();
+                }
+                System.out.println(String.valueOf(result));
+                finishTx(graph, commit);
+            } catch (ScriptException se) {
+                safeRollback(graph);
+                LOG.error("Error executing Gremlin script", se);
+                exitCode = EXIT_CODE_FAILED;
+            } catch (Throwable t) {
+                safeRollback(graph);
+                LOG.error("Unexpected error executing Gremlin CLI", t);
+                exitCode = EXIT_CODE_FAILED;
+            }
+        } catch (Exception e) {
+            LOG.error("Error running Gremlin CLI", e);
+            exitCode = EXIT_CODE_FAILED;
+        } finally {
+            if (graph != null) {
+                try {
+                    graph.close();
+                } catch (Exception e) {
+                    LOG.warn("Failed to close graph", e);
+                }
+            }
+            System.exit(exitCode);

Comment on lines +61 to +92
        try {
            GraphTraversalSource g = graph.traversal();
            GremlinGroovyScriptEngine engine = new GremlinGroovyScriptEngine();

            Bindings bindings = engine.createBindings();
            bindings.put("graph", graph);
            bindings.put("g", g);
            bindings.put("__", __.class);
            bindings.put("P", P.class);

            Object result = eval(engine, bindings, query);

            if (result instanceof Traversal) {
                result = ((Traversal<?, ?>) result).toList();
            }

            System.out.println(String.valueOf(result));

            finishTx(graph, commit);
        } catch (ScriptException se) {
            safeRollback(graph);
            throw se;
        } catch (Throwable t) {
            safeRollback(graph);
            throw t;
        } finally {
            try {
                graph.close();
            } catch (Exception e) {
                LOG.warn("Failed to close graph", e);
            }
        }

Copilot AI Feb 18, 2026

The GremlinGroovyScriptEngine should be cleaned up after use by calling reset() to prevent memory leaks from script compilation. The existing codebase pattern in AtlasJanusGraph shows that the script engine should be released after execution. Consider adding a finally block to release the engine, similar to how it's done in AtlasJanusGraph.releaseGremlinScriptEngine().

Comment on lines +167 to +183
    private static String getQuery(CommandLine cmd) throws IOException {
        String q = cmd.getOptionValue("q");
        String f = cmd.getOptionValue("f");

        if ((q == null || q.trim().isEmpty()) && (f == null || f.trim().isEmpty())) {
            throw new IllegalArgumentException("Missing query. Provide -q/--query or -f/--file. Use -h for help.");
        }

        if (q != null && f != null) {
            throw new IllegalArgumentException("Provide only one of -q/--query or -f/--file.");
        }

        if (q != null) {
            return q;
        }

        return new String(Files.readAllBytes(Paths.get(f)), StandardCharsets.UTF_8);

Copilot AI Feb 18, 2026

No validation is performed on the file path before reading. This could potentially allow reading arbitrary files that the Atlas process has access to. Consider validating that the file path doesn't contain directory traversal patterns (..) or restrict it to a specific directory. Additionally, verify that the file is readable and exists before attempting to read it to provide better error messages.
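
A minimal sketch of the check described above, using only the JDK. The QueryFileValidator class and the idea of a caller-supplied base directory are assumptions for illustration; neither exists in this PR, and the right base directory (if any) is a design decision for the author.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper, not part of the PR: checks a script path before it is read. */
final class QueryFileValidator {
    private QueryFileValidator() { }

    /**
     * Returns the normalized absolute path of {@code file}, or throws if it falls
     * outside {@code baseDir}, is not a regular file, or is not readable.
     * Symbolic links are not resolved here; Path.toRealPath() would be needed for that.
     */
    static Path validate(Path baseDir, String file) throws IOException {
        Path base = baseDir.toAbsolutePath().normalize();
        // resolve() returns absolute inputs unchanged; normalize() collapses "." and ".."
        Path resolved = base.resolve(file).normalize();

        if (!resolved.startsWith(base)) {
            throw new IllegalArgumentException("Script file must be under " + base + ": " + file);
        }
        if (!Files.isRegularFile(resolved)) {
            throw new IOException("Script file not found or not a regular file: " + resolved);
        }
        if (!Files.isReadable(resolved)) {
            throw new IOException("Script file is not readable: " + resolved);
        }
        return resolved;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: QueryFileValidator <baseDir> <file>");
            return;
        }
        System.out.println(validate(Paths.get(args[0]), args[1]));
    }
}
```

With a helper like this, getQuery could validate the -f value first and pass the returned path to Files.readAllBytes, so a missing or unreadable file produces a specific message.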

Comment on lines +57 to +58
        new AtlasJanusGraphDatabase();


Copilot AI Feb 18, 2026

The AtlasJanusGraphDatabase is instantiated but the return value is not used. Looking at similar tools like RepairIndex, they directly call AtlasJanusGraphDatabase.getGraphInstance() without instantiating the database object first. This unnecessary instantiation could be removed as the static initialization happens inside getGraphInstance().

Suggested change
-        new AtlasJanusGraphDatabase();

Comment on lines +80 to +83
        } catch (ScriptException se) {
            safeRollback(graph);
            throw se;
        } catch (Throwable t) {

Copilot AI Feb 18, 2026

Exception handling is inconsistent with other tools in the repository. RepairIndex and BulkFetchAndUpdate both use LOG.error() with the message and exception together in a single call. The catch blocks here rethrow exceptions without logging error messages, which makes debugging harder for users. Consider logging errors before rethrowing, similar to the pattern: LOG.error("Failed!", e).

Suggested change
-        } catch (ScriptException se) {
-            safeRollback(graph);
-            throw se;
-        } catch (Throwable t) {
+        } catch (ScriptException se) {
+            LOG.error("Gremlin script execution failed", se);
+            safeRollback(graph);
+            throw se;
+        } catch (Throwable t) {
+            LOG.error("Gremlin CLI execution failed", t);

Comment on lines +62 to +92
            GraphTraversalSource g = graph.traversal();
            GremlinGroovyScriptEngine engine = new GremlinGroovyScriptEngine();

            Bindings bindings = engine.createBindings();
            bindings.put("graph", graph);
            bindings.put("g", g);
            bindings.put("__", __.class);
            bindings.put("P", P.class);

            Object result = eval(engine, bindings, query);

            if (result instanceof Traversal) {
                result = ((Traversal<?, ?>) result).toList();
            }

            System.out.println(String.valueOf(result));

            finishTx(graph, commit);
        } catch (ScriptException se) {
            safeRollback(graph);
            throw se;
        } catch (Throwable t) {
            safeRollback(graph);
            throw t;
        } finally {
            try {
                graph.close();
            } catch (Exception e) {
                LOG.warn("Failed to close graph", e);
            }
        }

Copilot AI Feb 18, 2026

The GraphTraversalSource 'g' should be closed when done, as it may hold resources. Consider adding g.close() in the finally block, similar to how graph.close() is handled. This ensures proper resource cleanup even when exceptions occur.
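
One way to get this cleanup without nested finally blocks is try-with-resources, which closes resources in reverse declaration order even when the body throws. The sketch below illustrates only that ordering with stand-in classes; CloseOrderSketch and Resource are hypothetical and are not the TinkerPop or JanusGraph types. Whether the real graph and traversal source can be managed this way in GremlinCli is an assumption that would need checking.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: stand-in resources, not the TinkerPop or JanusGraph classes. */
final class CloseOrderSketch {
    private CloseOrderSketch() { }

    /** Records its own close() call so the order can be observed. */
    static final class Resource implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Resource(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    /** Opens "graph" then "g", optionally fails in the body, and returns the recorded events. */
    static List<String> run(boolean failInBody) {
        List<String> log = new ArrayList<>();

        try (Resource graph = new Resource("graph", log);
             Resource g = new Resource("g", log)) {
            if (failInBody) {
                throw new IllegalStateException("query failed");
            }
            log.add("query ok");
        } catch (IllegalStateException e) {
            // Both resources are already closed by the time this handler runs
            log.add("caught " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints [query ok, close g, close graph]
        System.out.println(run(true));  // prints [close g, close graph, caught query failed]
    }
}
```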

            return q;
        }

        return new String(Files.readAllBytes(Paths.get(f)), StandardCharsets.UTF_8);

Copilot AI Feb 18, 2026

Reading entire file contents into memory with Files.readAllBytes could cause OutOfMemoryError for large Groovy script files. Consider adding a file size check or using a streaming approach. For example, check the file size before reading and reject files larger than a reasonable threshold (e.g., 10MB) to prevent potential DoS scenarios.
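
A minimal sketch of the size check, using only the JDK. BoundedScriptReader is a hypothetical name, and the 10 MB default simply mirrors the example figure in this comment; it is not a value taken from the Atlas codebase.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper, not part of the PR: reads a script file only if it is within a size cap. */
final class BoundedScriptReader {
    // Assumed threshold, taken from the "e.g., 10MB" figure in the review comment
    static final long DEFAULT_MAX_BYTES = 10L * 1024 * 1024;

    private BoundedScriptReader() { }

    /** Reads {@code file} as UTF-8, or throws IOException if it is larger than {@code maxBytes}. */
    static String read(Path file, long maxBytes) throws IOException {
        long size = Files.size(file); // fails with an IOException if the file does not exist

        if (size > maxBytes) {
            throw new IOException("Script file " + file + " is " + size + " bytes; limit is " + maxBytes);
        }
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: BoundedScriptReader <file>");
            return;
        }
        System.out.println(read(Paths.get(args[0]), DEFAULT_MAX_BYTES));
    }
}
```

The size is checked before any bytes are read, so an oversized file is rejected without allocating a buffer for it.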

Comment on lines +21 to +29
    <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
        <param name="Target" value="System.out"/>
        <encoder>
            <pattern>%date [%thread] %level{5} [%file:%line] %msg%n</pattern>
        </encoder>
        <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
            <level>INFO</level>
        </filter>
    </appender>

Copilot AI Feb 18, 2026

The console appender is defined but never used in the root logger configuration. Lines 44-45 reference only the FILE appender. If console output is intended for debugging, add the appender to the root logger; otherwise, remove its definition to avoid confusion.

if [ -z "${ATLAS_CONF:-}" ]; then
  echo "ATLAS_CONF is not set. Example: export ATLAS_CONF=/etc/atlas/conf" >&2
  echo "This script will set: -Datlas.conf=\$ATLAS_CONF" >&2
  exit 2
Contributor

  • I suggest introducing ATLAS_HOME, defaulting to /opt/atlas (the location where Atlas is installed in the Docker container).
  • Instead of failing here, I suggest defaulting ATLAS_CONF to ${ATLAS_HOME}/conf.
  • Similarly, when ATLAS_CLASSPATH is not specified, include a default set of libraries from ${ATLAS_HOME}/server/webapp/atlas/WEB-INF/lib/.
