
Running the Hypercube Engine locally

Owner: Daniel Miller
Approvers: Jun Qi, Kathy Guo
Participants: Craig Boucher, Jong Ho Lee

The Mangrove build currently runs some integration tests which verify that generated code can actually run and compute the expected values. Here is the current status of those tests per compute fabric:

Kusto: Implemented
Scope: In progress
Hypercube: Planned
Spark SQL: Not planned

Since the build runs a Kusto script and asserts that the results are as expected, developers can make changes to the Mangrove codebase confident that those changes won't break basic Kusto code-gen functionality. Soon, they will have the same confidence for Scope code-gen. However, there is currently no simple way to verify that generated Java code can be compiled into a metric set JAR, let alone run to produce the expected results. In light of that, this RFC (Request for Comments) provides a set of requirements for Hypercube integration testing and proposes a way of satisfying those requirements. It also discusses the alternatives that were considered and any open concerns.

Requirements

The Hypercube integration tests should test real functionality. That is, they should use the MasterCoordinator to generate Java files, compile those files with standard tooling (Gradle or SBT), and run a scorecard job against the latest production copy of the Hypercube engine.

The developer experience should remain as close as possible to "clone repo + build + test in Visual Studio". The current suite of integration tests doesn't require a developer to do anything beyond logging into Visual Studio; the tests then use the developer's credentials to obtain the secrets needed to connect to Kusto and the Metric Service.

The Azure DevOps build defined in azure-pipelines.yml should successfully run all unit and integration tests. That is, it can't simply skip the Hypercube tests.

The Hypercube integration tests should be run as part of the standard suite. That is, they need to be expressible as xUnit tests, not a one-off PowerShell script or command in the build.
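As a concrete illustration of that last requirement, the sketch below shows what a Hypercube integration test could look like as an ordinary xUnit test. It is only a sketch: the helper types (TestCodeGen, TestConfigs), the HypercubeComputeFabric class, and the output location and format are assumptions, not existing Mangrove APIs.

```csharp
using System.IO;
using System.Threading.Tasks;
using Xunit;

public class HypercubeIntegrationTests
{
    [Fact]
    public async Task GeneratedJavaCode_ComputesExpectedMetricValues()
    {
        // Generate Java source for a small, known metric set (hypothetical test helper).
        string workDir = TestCodeGen.GenerateHypercubeJava("SampleMetricSet");

        // Compile the generated code and run a scorecard job via the Docker-based
        // compute fabric proposed below.
        var fabric = new HypercubeComputeFabric();
        await fabric.RunScorecardJobAsync(workDir, TestConfigs.SampleComputation);

        // Assert against the same expected values the Kusto tests already check.
        // The output path and format are assumptions for this sketch.
        string output = await File.ReadAllTextAsync(Path.Combine(workDir, "output", "results.json"));
        Assert.Contains("\"ConversionRate\":0.25", output);
    }
}
```

Because this is just an ordinary [Fact], it runs with the rest of the suite under dotnet test and in the Azure DevOps build, with no extra scripts.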

Proposal

Compile the generated Java files and run the resulting metric set JAR using a Linux Docker container that contains both Gradle and the Hypercube engine.

The good news: the Mangrove codebase stays pretty self-contained. It's probably okay to require Docker to be installed for certain integration tests to run.
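Since Docker becomes a prerequisite for these tests, it is worth failing fast with an actionable message when it is missing. The guard below is a minimal sketch of one way to do that; the DockerGuard class is hypothetical and not part of the existing Mangrove codebase.

```csharp
using System;
using System.Diagnostics;

public static class DockerGuard
{
    // Throws a descriptive error if the Docker CLI is missing or the daemon is unreachable,
    // so the Hypercube tests fail with a clear message instead of an obscure process error.
    public static void EnsureDockerAvailable()
    {
        try
        {
            var psi = new ProcessStartInfo("docker", "info")
            {
                RedirectStandardOutput = true,
                RedirectStandardError = true,
                UseShellExecute = false
            };

            // "docker info" exits non-zero when the daemon is not running.
            using var process = Process.Start(psi);
            if (process != null && process.WaitForExit(10_000) && process.ExitCode == 0)
            {
                return;
            }
        }
        catch (Exception)
        {
            // Fall through: for example, the docker executable is not on PATH.
        }

        throw new InvalidOperationException(
            "The Hypercube integration tests require Docker. " +
            "Install Docker, make sure the daemon is running, and re-run the tests.");
    }
}
```

Calling a guard like this at the start of each Hypercube test keeps the "clone + build + test" experience intact: a developer without Docker gets a clear instruction rather than a cryptic process failure.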

The bad news: since Linux Containers on Windows (LCOW) is still in preview, it is not supported on the Windows Azure DevOps agents. To run an Ubuntu container in the build, the tests have to be split into Windows and Linux phases.

To actually implement Hypercube integration tests, we need to do the following (a rough sketch of the resulting compute fabric follows the list):

  1. Update the IComputeFabric abstraction to accept a ComputationConfig object.
  2. Add logic to convert a Mangrove config object into a job JSON file, using the C# object model published in the NuGet package AnE.ExP.Hypercube.JobConfig.Contract.
  3. Add an implementation of IComputeFabric which accepts the generated Java files, compiles them to a JAR using a Linux container, converts the computation config into a job JSON, and then runs a single Hypercube job using the generated metric set JAR and job JSON.
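The sketch below illustrates step 3 under stated assumptions: the shape of IComputeFabric is not reproduced here, the container image name, in-container paths, and entry-point script are placeholders, and the step 2 conversion is represented by a hypothetical JobConfigConverter rather than the actual AnE.ExP.Hypercube.JobConfig.Contract types.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

// Sketch of a Docker-backed compute fabric; the real class would implement IComputeFabric.
public sealed class HypercubeComputeFabric
{
    // Hypothetical image that bundles Gradle and the Hypercube engine.
    private const string Image = "mangrove/hypercube-test:latest";

    public async Task RunScorecardJobAsync(string generatedJavaDir, ComputationConfig config)
    {
        // Step 2: convert the Mangrove config into a Hypercube job JSON file
        // (JobConfigConverter is a hypothetical helper standing in for the contract types).
        string jobJsonPath = Path.Combine(generatedJavaDir, "job.json");
        await File.WriteAllTextAsync(jobJsonPath, JobConfigConverter.ToJobJson(config));

        // Step 3: compile the generated Java into a metric set JAR and run the job,
        // all inside the Linux container. The run-scorecard.sh entry point is assumed
        // to be baked into the image.
        await RunDockerAsync(
            $"run --rm -v \"{generatedJavaDir}:/work\" {Image} /opt/hypercube/run-scorecard.sh /work/job.json");
    }

    private static async Task RunDockerAsync(string arguments)
    {
        var psi = new ProcessStartInfo("docker", arguments)
        {
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false
        };

        using var process = Process.Start(psi)
            ?? throw new InvalidOperationException("Failed to start the docker process.");
        await process.WaitForExitAsync();

        if (process.ExitCode != 0)
        {
            string stderr = await process.StandardError.ReadToEndAsync();
            throw new InvalidOperationException($"docker {arguments} failed:{Environment.NewLine}{stderr}");
        }
    }
}
```

Keeping Gradle and the engine inside a single image means the only host-side dependencies are Docker itself and the generated sources, which is what keeps the Mangrove codebase self-contained.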

Alternatives

This RFC provides one proposal for running Hypercube integration tests, but there were other alternatives. This section describes them, along with their pros and cons.

Databricks

Use a small Azure Databricks cluster to run the job, similar to how the Kusto integration tests use the ane Kusto cluster.

Pros: Keeps the Mangrove build and codebase clean.

Cons: We need to maintain an Azure resource, and any outage in that resource causes the Mangrove build to fail. Also, even if we can run the scorecard job against a cluster, where would we compile the generated Java files into a JAR? We would also have to maintain a separate "Java compiler" service somewhere.

Gradle

Run the tests using Gradle and checked-in Hadoop binaries on a Windows agent, similar to the hypercube-spark repository.

Pros: Codebase is relatively self-contained.

Cons: You need to use some evil hacks to get a local copy of the Hadoop binaries. Also, the codebase takes a dependency on Java being installed and the environment variable JAVA_HOME having the correct value.
