brainstorm: splitting mozharness
[stating the problem]
Mozharness currently handles a lot of complexity. (It was designed to be able to, but the ideal is still elegantly simple scripts and configs.)
Our production-oriented scripts take (and sometimes expect) config inputs from multiple locations, some of them dynamic; and they contain infrastructure-oriented behavior like clobberer, mock, and tooltool, which don't apply to standalone users.
We want mozharness to be able to handle the complexity of our infrastructure, but make it elegantly simple for the standalone user. These are currently conflicting goals, and automating jobs in infrastructure often wins out over making the scripts user friendly. We've brainstormed some ideas on how to fix this, but first, some more details:
A lot of the current complexity involves config inputs from many places:
- buildbot-configs, through
  - environment variables,
  - command line options, and
  - buildbot properties sent through buildprops.json
- mozharness, through
  - in-tree config files (e.g. testing/config/mozharness/linux_config.py or b2g/config/emulator/config.json), and
  - web-based resources, whether through hgweb or a service like mapper.
We want to lock the running config at the beginning of the script run, but we also don't want to have to clone a repo or make external calls to web resources during __init__(). Our current solution has been to populate runtime configs during one of our script actions, but then to support those runtime configs we have to check multiple config locations for our script logic.
We're able to handle this complexity in mozharness, and we end up with a single config dict that we then dump to the log + to a json file on disk, which can then be reused to replicate that job's config. However, this has a negative effect on humans who need to either change something in the running configs, or who want to simplify the config to work locally.
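To make the "many inputs, one config dict, dumped to disk" flow concrete, here is a minimal sketch. The function names, the precedence order, and the sample keys are all assumptions for illustration, not mozharness' actual API:

```python
import json


def build_config(*sources):
    """Merge config dicts in order; later sources override earlier ones."""
    merged = {}
    for source in sources:
        merged.update(source)
    return merged


def dump_config(config, path):
    """Write the final config to disk so the job's config can be replayed."""
    with open(path, "w") as handle:
        json.dump(config, handle, indent=4, sort_keys=True)


# Hypothetical inputs standing in for the config sources listed above.
in_tree = {"test_suite": "mochitest", "chunks": 5}
buildprops = {"branch": "mozilla-central", "chunks": 8}
command_line = {"work_dir": "build"}

config = build_config(in_tree, buildprops, command_line)
```

Someone replicating the job locally would only need the dumped file, not the three sources that produced it. The catch described above still applies: the merged dict says nothing about which source a given value came from.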
[in-tree vs out-of-tree]
We also want some of mozharness' config and logic to ride the trains, but other portions need to be able to handle outside-of-tree processes and config, for various reasons:
- some processes are volatile enough that they need to change across all trees at once, on a frequent basis;
- some processes act across multiple trees and revisions, like the bumper scripts and vcs-sync;
- some infrastructure-oriented code needs to be able to change across all processes, including historical-revision-based processes; and
- some processes have nothing to do with the gecko tree at all.
Part of the solution is to move logic out of mozharness. Desktop Firefox builds and repacks moving to mach makes sense, since they're
- configurable by separate mozconfigs,
- tasks completely shared by developers, and
- completely dependent on the tree, so tying them to the tree has no additional downside.
However, Andrew Halberstadt wanted to write the in-tree test harnesses in mozharness, and have mach call the mozharness scripts. This broke some of the above assumptions, until we started thinking along the lines of splitting mozharness: a portion in-tree running the test harnesses, and a portion out-of-tree doing the pre-test-run machine setup.
(I'm leaning towards both splitting mozharness and using helper objects, but am open to other brainstorms at this point...)
In effect, the wrapper, out-of-tree portion of mozharness would be taking all of the complex inputs, simplifying them for the in-tree portion, and setting up the environment (mock, tooltool, downloads+installs, etc.); the in-tree portion would take a relatively simple config and run the tests.
We could do this by having one mozharness script call another. We'd have to fix the logging bug that causes us to double-log lines when we instantiate a second BaseScript, but that's not an insurmountable problem. We could also try execing the second script, though I'd want to verify how that works on Windows. We could also modify our buildbot ScriptFactory to be able to call two scripts consecutively, after the first script dynamically generates the simplified config for the second script.
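One way the outer-to-inner handoff could look, as a rough sketch: the outer script reduces its config to a short list of keys, writes that to a JSON file, and runs the inner script as a child process. The key names, the function names, and the "config path as last argument" convention are assumptions, not existing mozharness behavior:

```python
import json
import os
import subprocess
import tempfile

# Hypothetical: the only keys the inner, in-tree script is allowed to see.
INNER_KEYS = ("test_suite", "installer_path", "work_dir")


def simplify_config(full_config):
    """Reduce the complex outer config to the keys the inner script needs."""
    return {key: full_config[key] for key in INNER_KEYS if key in full_config}


def run_inner(inner_command, full_config):
    """Write the simplified config to disk and run the inner script on it.

    inner_command is the argv prefix for the inner script; the path to the
    generated config file is appended as the final argument.
    """
    handle, config_path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(handle, "w") as config_file:
            json.dump(simplify_config(full_config), config_file)
        # A child process rather than os.execv(), so the outer script is
        # still around afterwards to report on the exit status.
        return subprocess.call(list(inner_command) + [config_path])
    finally:
        os.remove(config_path)
```

A call might look like run_inner([sys.executable, "inner_script.py"], full_config), where inner_script.py is a placeholder name. Running a child process sidesteps the second-BaseScript logging question, at the cost of the two scripts only sharing what is in the config file.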
We could land the portions of mozharness needed to run test harnesses in-tree, and leave the others out-of-tree. There will be some duplication, especially in the mozharness.base code, but that's changing less than the scripts and
We would be able to present a user-friendly "inner" script with limited inputs that rides the trains, while also allowing for complex inputs and automation-oriented setup beforehand in the "outer" script. We'd most likely still have to allow for automation support in the inner script, if there's some reporting or error checking or other automation task that's needed after the handoff, but we'd still be able to limit the complexity of that inner script. And we could wrap that inner script in a mach command for easy developer use.
Currently, most of mozharness' logic is encapsulated in self. We do have helper objects: the BaseConfig and the self.config for config; the self.log_obj that handles all logging; MercurialVCS for cloning, SUTDeviceHandler for mobile device wrangling. But a lot of what we do is handled by mixins inherited by
A while back I filed a bug to create a LocalLogger and BaseHelper to enable parallelization in mozharness scripts. Instead of cloning 90 locale repos serially, we could create 10 helper objects that each clone a repo in parallel, and launch new ones as the previous ones finish. This would have simplified Armen's parallel emulator testing code. But even if we're not planning on running parallel processes, creating a helper object allows us to simplify the config and logic in that object, similar to the "inner" script if we split mozharness into in-tree and out-of-tree instances, which could potentially also be instantiated by other non-mozharness scripts.
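The "10 at a time, launch new ones as the previous ones finish" pattern maps fairly directly onto a worker pool. The sketch below uses Python 3's concurrent.futures; CloneHelper and run_helpers are hypothetical names, and the clone itself is replaced by a placeholder:

```python
from concurrent.futures import ThreadPoolExecutor


class CloneHelper:
    """Hypothetical helper: handles one repo and keeps its own log lines."""

    def __init__(self, repo):
        self.repo = repo
        self.log_lines = []

    def run(self):
        # Placeholder for the real work; an actual helper would clone here.
        self.log_lines.append("cloning %s" % self.repo)
        return self


def run_helpers(repos, max_workers=10):
    """Run at most max_workers helpers at a time.

    The executor starts the next helper as soon as a worker frees up, and
    map() returns the finished helpers in the same order as repos.
    """
    helpers = [CloneHelper(repo) for repo in repos]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(CloneHelper.run, helpers))
```

Because each helper only writes to its own log_lines, nothing interleaves; the main script can replay each helper's lines into the main log once that helper is done.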
Essentially, as long as the object has a self.log_obj, it will use that for logging. The LocalLogger would log to memory or disk, outside of the main script log, to avoid parallel log interleaving; we would use this if we were going to run the helper objects in parallel. If we wanted the helper object to stream to the main log, we could set its log_obj to our self.log_obj. Similarly with its config. We could set its config to our self.config, or limit what config we pass to simplify.
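A minimal sketch of that injection, assuming a LocalLogger that buffers in memory and a BaseHelper that takes its config and log_obj as constructor arguments. Neither class exists in this form today; the signatures are made up for illustration:

```python
class LocalLogger:
    """Hypothetical logger that buffers lines in memory instead of streaming."""

    def __init__(self):
        self.lines = []

    def info(self, message):
        self.lines.append(message)


class BaseHelper:
    """Hypothetical helper that logs through whatever log_obj it is given."""

    def __init__(self, config, log_obj=None):
        # Copy the config so the helper cannot alter the caller's dict.
        self.config = dict(config)
        # Default to a private in-memory log; pass log_obj to share one.
        self.log_obj = log_obj if log_obj is not None else LocalLogger()

    def info(self, message):
        self.log_obj.info(message)


main_log = LocalLogger()  # stand-in for the calling script's self.log_obj
streaming = BaseHelper({"repo": "a"}, log_obj=main_log)
buffered = BaseHelper({"repo": "b"})
streaming.info("goes to the main log")
buffered.info("stays in the helper's own buffer")
```

The same choice applies to config: the caller can hand over its whole config or, as here, a small dict with only what the helper needs.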
(Mozharness' config locking is a feature that promotes easier debugging and predictability, but in practice we often find ourselves trying to get around it somehow. Other config dicts, self.variables, editing _pre_config_lock() ... Creating helper objects lets us create dynamic config at runtime without violating this central principle, as long as it's logged properly.)
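To illustrate how a helper could carry runtime config without touching the locked one, here is a small sketch. LockedConfig and helper_config are hypothetical stand-ins, not mozharness' real locking code, and only item assignment is guarded:

```python
class LockedConfig(dict):
    """Hypothetical dict that refuses item assignment once lock() is called.

    Only `config[key] = value` is guarded in this sketch; update(), pop()
    and del would need the same treatment in a real implementation.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._locked = False

    def lock(self):
        self._locked = True

    def __setitem__(self, key, value):
        if self._locked:
            raise TypeError("config is locked")
        super().__setitem__(key, value)


def helper_config(locked_config, log, **runtime_values):
    """Build a helper's config from the locked one plus runtime values.

    The locked config itself is never modified, and the additions are
    logged so the run can still be reconstructed from the log.
    """
    derived = dict(locked_config)
    derived.update(runtime_values)
    log("helper config additions: %r" % (sorted(runtime_values.items()),))
    return derived
```

The main config stays exactly as it was dumped at lock time, and anything dynamic shows up as an explicit, logged addition on the helper.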
Because this "helper object" solution overlaps considerably with the "splitting mozharness" solution, we could use a combination of the two to great efficacy.
[functions and globals]
This idea completely alters our implementation of mozharness, by moving self.config to a global config, and directly calling logging methods (or wrapped logging methods). By making each method a standalone function that's only slightly different from a standard python function, it lowers the bar for contribution or re-use of mozharness code. It does away with both the downsides and benefits of objects.
The first, large downside I see is that this solution appears incompatible with the "helper objects" solution. By relying on a global config and logging in our functions, it's difficult to create standalone helpers that use minimized configs or alternate logging configurations. I also think the global logging may make the double-logging bug more prevalent.
It's quite possible I'm downplaying the benefit of importing individual functions like a standard python script. There are decorators to transform functions into class methods and vice versa, which might allow for both standalone functions and object-based methods with the same code.
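As a sketch of what "same code, both ways" might look like: a standalone function takes its config and log explicitly, and a small hand-written decorator (as_method, a made-up name rather than an existing library decorator) adapts it into a method that fills those in from the instance:

```python
import functools


def log_and_count(items, config, log):
    """Standalone function: explicit config and log, no object required."""
    log("%s: %d items" % (config["label"], len(items)))
    return len(items)


def as_method(func):
    """Adapt func(..., config, log) into a method that supplies
    self.config and self.log from the instance."""

    @functools.wraps(func)
    def method(self, *args):
        return func(*args, config=self.config, log=self.log)

    return method


class Script:
    """Hypothetical script object reusing the standalone function."""

    def __init__(self, config):
        self.config = config
        self.messages = []

    def log(self, message):
        self.messages.append(message)

    count = as_method(log_and_count)
```

Passing config and log as arguments instead of reading globals is what keeps the function usable from a helper object with a minimized config; a version built on a global config would not adapt this cleanly.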
BaseConfig, but Argparse requires python 2.7 and mozharness locks the config.