WALA: Call graph edges missing in programs with reflection set to STRING_ONLY compared to NONE

Hi,

I’ve noticed some (seemingly) strange behavior when running WALA on programs under different reflection settings.

For example, consider the antlr.jar (antlr.zip) program from DaCapo-2006: specifically, the following piece of code from antlr.Tool.doEverything.

        try {
            ANTLRParser p = new ANTLRParser(tokenBuf, behavior, this);
            p.setFilename(grammarFile);
            p.grammar();
            if (hasError()) {
                fatalError("Exiting due to errors.");
            }
            checkForInvalidArguments(modifiedArgs, cmdLineArgValid);

            // Create the right code generator according to the "language" option
            CodeGenerator codeGen;

            // SAS: created getLanguage() method so subclass can override
            //      (necessary for VAJ interface)
            String codeGenClassName = "antlr." + getLanguage(behavior) + "CodeGenerator";
            try {
                // HERE
                Class codeGenClass = Class.forName(codeGenClassName);
                codeGen = (CodeGenerator)codeGenClass.newInstance();
                codeGen.setBehavior(behavior);
                codeGen.setAnalyzer(analyzer);
                codeGen.setTool(this);
                codeGen.gen();
            }
            catch (ClassNotFoundException cnfe) {
                panic("Cannot instantiate code-generator: " + codeGenClassName);
            }
            catch (InstantiationException ie) {
                panic("Cannot instantiate code-generator: " + codeGenClassName);
            }
            catch (IllegalArgumentException ie) {
                panic("Cannot instantiate code-generator: " + codeGenClassName);
            }
            catch (IllegalAccessException iae) {
                panic("code-generator class '" + codeGenClassName + "' is not accessible");
            }

When I run WALA with the NONE reflection option, I get a call graph that includes the calls to Class.forName and Class.newInstance in the inner try block, but not the calls to setBehavior, setAnalyzer, or setTool. Even more strangely, when I run WALA with the STRING_ONLY setting, I don’t even get the edges to Class.forName or Class.newInstance. Notably, if I compile antlr with Java 8 (the attached version was compiled with Java 11), the edges to setBehavior, setAnalyzer, and setTool are still missing, but both configurations report the edges to Class.forName and Class.newInstance. So maybe this is an issue with some newer JVM bytecode feature like invokedynamic? However, I was able to inspect the control flow graph and see that the calls to newInstance and forName were reachable.
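For orientation, the pattern in question boils down to the sketch below: the class name is assembled at run time, the object is created with Class.newInstance, and the interesting calls are virtual calls on the result. If STRING_ONLY resolves only constant class-name strings, as its name suggests, the name here gives it nothing to work with. All names (`ReflectiveDemo`, the nested `CodeGenerator` and `JavaCodeGenerator`) are hypothetical stand-ins, and this has not been verified to reproduce the missing edges in WALA.

```java
// Hypothetical stand-ins for antlr's CodeGenerator hierarchy.
class ReflectiveDemo {

    public interface CodeGenerator {
        String gen();
    }

    public static class JavaCodeGenerator implements CodeGenerator {
        @Override
        public String gen() {
            return "generated by JavaCodeGenerator";
        }
    }

    // The class name is assembled at run time, so it is not a string
    // constant at the Class.forName call site.
    @SuppressWarnings("deprecation") // Class.newInstance mirrors the antlr code
    static String run(String language) {
        String name = ReflectiveDemo.class.getName() + "$" + language + "CodeGenerator";
        try {
            Class<?> codeGenClass = Class.forName(name);
            CodeGenerator codeGen = (CodeGenerator) codeGenClass.newInstance();
            return codeGen.gen(); // virtual call on the reflectively created object
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("Java"));
    }
}
```

A sound call graph for `run` would be expected to contain edges to Class.forName and Class.newInstance regardless of whether the reflective target itself can be resolved.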

I’ve noticed this behavior on multiple programs. For example, on hsqldb.jar and hsqldb-deps.jar (hsqldb.zip) we see strange behavior on the following try-catch block in org.hsqldb.TestBase.setUp:

        try {
            Class.forName("org.hsqldb.jdbcDriver");
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println(this + ".setUp() error: " + e.getMessage());
        }

WALA run with the NONE reflection option reports an edge to Throwable.getMessage from the call to Exception.getMessage in the catch block. WALA run with the STRING_ONLY option does not report any outbound edges from this block or, indeed, from the enclosing method at all.

Finally, on pmd.jar (pmd.zip), consider the following method in net.sourceforge.pmd.RuleSetFactory:

    private void parseInternallyDefinedRuleNode(RuleSet ruleSet, Node ruleNode)
            throws ClassNotFoundException, InstantiationException, IllegalAccessException {
        Element ruleElement = (Element) ruleNode;

        String attribute = ruleElement.getAttribute("class");
        Rule rule = (Rule) classLoader.loadClass(attribute).newInstance();
    }

The WALA configuration with STRING_ONLY does not report the calls to ClassLoader.loadClass or Class.newInstance on the last line, while the configuration with NONE does.

Any insight as to why this is happening? I’m not sure if this is a bug or expected behavior of the STRING_ONLY option. Thanks!

Austin

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 34 (18 by maintainers)

Most upvoted comments

Yes, we used ZERO_CFA for these.

Sure. It’s kind of a long file since we made our driver try to support all of WALA’s configurability:

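import com.ibm.wala.classLoader.CallSiteReference;
import com.ibm.wala.classLoader.Language;
import com.ibm.wala.ipa.callgraph.AnalysisCacheImpl;
import com.ibm.wala.ipa.callgraph.AnalysisOptions;
import com.ibm.wala.ipa.callgraph.AnalysisScope;
import com.ibm.wala.ipa.callgraph.CGNode;
import com.ibm.wala.ipa.callgraph.CallGraph;
import com.ibm.wala.ipa.callgraph.CallGraphBuilder;
import com.ibm.wala.ipa.callgraph.CallGraphBuilderCancelException;
import com.ibm.wala.ipa.callgraph.Entrypoint;
import com.ibm.wala.ipa.callgraph.impl.Util;
import com.ibm.wala.ipa.callgraph.propagation.InstanceKey;
import com.ibm.wala.ipa.cha.ClassHierarchy;
import com.ibm.wala.ipa.cha.ClassHierarchyException;
import com.ibm.wala.ipa.cha.ClassHierarchyFactory;
import com.ibm.wala.util.MonitorUtil;
import com.ibm.wala.util.WalaException;
import com.ibm.wala.util.config.AnalysisScopeReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Iterator;
// CommandLine comes from the project's command-line parsing library and
// CommandLineOptions from the WALAInterface project itself.

class Application {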

    //private static final Logger logger = LoggerFactory.getLogger(Application.class);

    private static CommandLineOptions clo;

    public static void main(String[] args)
            throws WalaException, CallGraphBuilderCancelException, IOException {
        // Initialize command line and print help if requested.
        Application.clo = new CommandLineOptions();
        new CommandLine(clo).parseArgs(args);
        if (clo.usageHelpRequested) {
            CommandLine.usage(new CommandLineOptions(), System.out);
            return;
        }

        // Build call graph.
        CallGraph cg = new Application().makeCallGraph(clo);

        // Print one tab-separated line per call graph edge. try-with-resources
        // closes the writer even if a write fails.
        try (FileWriter fw = new FileWriter(String.valueOf(clo.callgraphOutput))) {
            for (CGNode cgn : cg) {
                Iterator<CallSiteReference> callSiteIterator = cgn.iterateCallSites();
                while (callSiteIterator.hasNext()) {
                    CallSiteReference csi = callSiteIterator.next();
                    for (CGNode target : cg.getPossibleTargets(cgn, csi)) {
                        fw.write(String.format(
                                "%s\t%s\t%s\t%s\t%s\n",
                                cgn.getMethod(),
                                csi.toString(),
                                cgn.getContext(),
                                target.getMethod().getSignature(),
                                target.getContext()));
                    }
                }
            }
        }
        System.out.println("Wrote callgraph to " + clo.callgraphOutput.toString());
    }

    public CallGraph makeCallGraph(CommandLineOptions clo)
            throws ClassHierarchyException, IOException, CallGraphBuilderCancelException {
        AnalysisScope scope =
                AnalysisScopeReader.makeJavaBinaryAnalysisScope(clo.appJar, null);

        ClassHierarchy cha = ClassHierarchyFactory.make(scope);

        Iterable<Entrypoint> entrypoints =
                com.ibm.wala.ipa.callgraph.impl.Util.makeMainEntrypoints(scope, cha);
        AnalysisOptions options = new AnalysisOptions(scope, entrypoints);
        options.setReflectionOptions(clo.reflection);
        options.setHandleStaticInit(clo.handleStaticInit);
        options.setUseConstantSpecificKeys(clo.useConstantSpecificKeys);
        options.setUseStacksForLexicalScoping(clo.useStacksForLexicalScoping);
        options.setUseLexicalScopingForGlobals(clo.useLexicalScopingForGlobals);
        options.setMaxNumberOfNodes(clo.maxNumberOfNodes);
        options.setHandleZeroLengthArray(clo.handleZeroLengthArray);

        // //
        // build the call graph
        // //
        CallGraphBuilder<InstanceKey> builder;
        switch (clo.callGraphBuilder) {
            case NCFA:
                builder = Util.makeNCFABuilder(clo.sensitivity, options, new AnalysisCacheImpl(), cha, scope);
                break;
            case NOBJ:
                builder = Util.makeNObjBuilder(clo.sensitivity, options, new AnalysisCacheImpl(), cha, scope);
                break;
            case VANILLA_NCFA:
                builder =
                        Util.makeVanillaNCFABuilder(clo.sensitivity, options, new AnalysisCacheImpl(), cha, scope);
                break;
            case VANILLA_NOBJ:
                builder =
                        Util.makeVanillaNObjBuilder(clo.sensitivity, options, new AnalysisCacheImpl(), cha, scope);
                break;
            case RTA:
                builder = Util.makeRTABuilder(options, new AnalysisCacheImpl(), cha, scope);
                break;
            case ZERO_CFA:
                builder = Util.makeZeroCFABuilder(Language.JAVA, options, new AnalysisCacheImpl(), cha, scope);
                break;
            case ZEROONE_CFA:
                builder = Util.makeZeroOneCFABuilder(Language.JAVA, options, new AnalysisCacheImpl(), cha, scope);
                break;
            case VANILLA_ZEROONECFA:
                builder =
                        Util.makeVanillaZeroOneCFABuilder(Language.JAVA, options, new AnalysisCacheImpl(), cha, scope);
                break;
            case ZEROONE_CONTAINER_CFA:
                builder = Util.makeZeroOneContainerCFABuilder(options, new AnalysisCacheImpl(), cha, scope);
                break;
            case VANILLA_ZEROONE_CONTAINER_CFA:
                builder = Util.makeVanillaZeroOneContainerCFABuilder(options, new AnalysisCacheImpl(), cha, scope);
                break;
            case ZERO_CONTAINER_CFA:
                builder = Util.makeZeroContainerCFABuilder(options, new AnalysisCacheImpl(), cha, scope);
                break;
            default:
                throw new IllegalArgumentException("Invalid call graph algorithm.");
        }
        long startTime = System.currentTimeMillis();

        MonitorUtil.IProgressMonitor pm = new MonitorUtil.IProgressMonitor() {
            private boolean cancelled;

            @Override
            public void beginTask(String s, int i) {

            }

            @Override
            public void subTask(String s) {

            }

            @Override
            public void cancel() {
                cancelled = true;
            }

            @Override
            public boolean isCanceled() {
                if (System.currentTimeMillis() - startTime > clo.timeout) {
                    cancelled = true;
                }
                return cancelled;
            }

            @Override
            public void done() {

            }

            @Override
            public void worked(int i) {

            }

            @Override
            public String getCancelMessage() {
                return "Timed out.";
            }
        };
        return builder.makeCallGraph(options, pm);
    }
}

The actual call graph is built and printed in main. clo is a JCommander command line object; you can see the options we support at https://github.com/amordahl/WALAInterface/blob/main/src/main/java/edu/utdallas/amordahl/CommandLineOptions.java. The pom.xml file is at https://github.com/amordahl/WALAInterface/blob/main/src/main/java/edu/utdallas/amordahl/CommandLineOptions.java; we are using WALA 1.5.7.
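Because the driver writes one tab-separated line per edge (caller method, call site, caller context, target signature, target context), the edges that one configuration drops can be listed by diffing two output files. The following is a minimal sketch under that assumption; the class `EdgeDiff` and its methods are hypothetical helpers, not part of WALA or WALAInterface.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of WALA or WALAInterface.
class EdgeDiff {

    // Reduce each output line to "caller -> target", ignoring the call site
    // and context columns. Lines with fewer than four columns are skipped.
    static Set<String> edges(List<String> lines) {
        Set<String> result = new TreeSet<>();
        for (String line : lines) {
            String[] cols = line.split("\t");
            if (cols.length >= 4) {
                result.add(cols[0] + " -> " + cols[3]);
            }
        }
        return result;
    }

    // Edges present in the baseline run but absent from the other run.
    static Set<String> missing(List<String> baseline, List<String> other) {
        Set<String> diff = edges(baseline);
        diff.removeAll(edges(other));
        return diff;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: output of the NONE run, args[1]: output of the STRING_ONLY run
        List<String> none = Files.readAllLines(Paths.get(args[0]));
        List<String> stringOnly = Files.readAllLines(Paths.get(args[1]));
        missing(none, stringOnly).forEach(System.out::println);
    }
}
```

Dropping the context columns keeps the comparison meaningful across builders with different context sensitivity; keep them in the key if context-level differences matter.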

This is a strange one, I agree. Can you come up with a reduced test case for this one? Ideally a self-contained program that does not involve hsqldb? That would make it much easier for me to debug.

Sure. Here’s a program that exhibits the behavior:

// Application.java
public class Application {

    public static void main(String[] args) {
        try {
            Class clazz = Class.forName("MyClass");
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}
// MyClass.java
public class MyClass {

    public String sayHello() {
        return "Hello from MyClass!";
    }
}

Under STRING_ONLY, WALA does not report an edge to Exception.getMessage(). Under NONE, it does. I compiled the program with Temurin JDK 8, and am using the code here to run WALA and print the call graph: https://github.com/amordahl/WALAInterface

This I don’t fully follow. Are you saying that with Java 11, WALA successfully finds the call graph edges to setBehavior() etc?

No. If I compile the antlr JAR with Java 8, then both STRING_ONLY and NONE show edges to Class.forName and Class.newInstance. If I compile the jar with Java 11, then only NONE has edges to these methods. I don’t think either configuration has edges to setBehavior etc. under either compilation setting. That’s why I was wondering whether it might have something to do with invokedynamic.
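One compiler difference that may be relevant here: since JDK 9 (JEP 280), javac compiles `+` string concatenation to an invokedynamic instruction when targeting Java 9 or later, while for a Java 8 target it emits StringBuilder calls. The antlr snippet builds codeGenClassName with exactly such a concatenation right before Class.forName. Whether this accounts for the difference between the two builds is not established in this thread, and it would not explain the reduced test case compiled with JDK 8. The sketch below, with the hypothetical class `ConcatForms`, shows the two forms so that their bytecode can be compared with `javap -c`.

```java
// Hypothetical class for inspecting javac's output with `javap -c ConcatForms`.
class ConcatForms {

    // With javac targeting Java 9 or later, this `+` is compiled to an
    // invokedynamic instruction; targeting Java 8, to StringBuilder calls.
    static String viaPlus(String language) {
        return "antlr." + language + "CodeGenerator";
    }

    // Explicit StringBuilder calls compile to ordinary invokevirtual
    // instructions regardless of the target version.
    static String viaBuilder(String language) {
        return new StringBuilder("antlr.")
                .append(language)
                .append("CodeGenerator")
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(viaPlus("Java"));
        System.out.println(viaBuilder("Java"));
    }
}
```

Both methods return the same string; only the bytecode differs, which makes the pair a convenient way to test the invokedynamic hypothesis in isolation.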

Yeah, I agree this is inconsistent with my explanation. I haven’t looked at this code in a long time and I have forgotten to some degree how it works. Do you have a small example of code where the WALA call graph is missing CG edges to Class.forName and Class.newInstance? That would be helpful to track down what is going on.

This is tricky. I’ve been trying to do this myself, but it has been a difficult task. Here’s a reduced version of hsqldb.jar (hsqldb.zip) that shows the behavior (only about 300 LoC). Specifically, the configuration with STRING_ONLY misses the edges here, in org.hsqldb.util.ScriptTool.java (this is the decompiled code as produced by IntelliJ):

    public void execute(String[] var1) {
        Properties var2 = pProperties;
        String var3 = var2.getProperty("driver", "org.hsqldb.jdbcDriver");
        boolean var4 = var2.getProperty("log", "false").equalsIgnoreCase("true");

        try {
            Class.forName(var3).newInstance();
        } catch (Exception var6) {
        }

    }
}