Skip to content

Extract embedded proguard specs from JARs and AARs during R8#460

Open
snazhmudinov wants to merge 2 commits intobazelbuild:mainfrom
snazhmudinov:extend-r8-proguard-extractor
Open

Extract embedded proguard specs from JARs and AARs during R8#460
snazhmudinov wants to merge 2 commits intobazelbuild:mainfrom
snazhmudinov:extend-r8-proguard-extractor

Conversation

@snazhmudinov
Copy link
Contributor

@snazhmudinov snazhmudinov commented Feb 12, 2026

Add jar_embedded_proguard_extractor to scan META-INF/proguard/ and META-INF/com.android.tools/ in the deploy JAR before R8, and extend aar_embedded_proguard_extractor to also read targeted R8 rules from META-INF/com.android.tools/ inside classes.jar. Based on https://developer.android.com/topic/performance/app-optimization/library-optimization#support-different.

Add jar_embedded_proguard_extractor to scan META-INF/proguard/ and
META-INF/com.android.tools/ in the deploy JAR before R8, and extend
aar_embedded_proguard_extractor to also read targeted R8 rules from
META-INF/com.android.tools/ inside classes.jar.
@snazhmudinov snazhmudinov marked this pull request as ready for review February 12, 2026 10:47
@@ -0,0 +1,74 @@
# pylint: disable=g-direct-third-party-import
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is almost the exact same as the aar_embedded_proguard_extractor tool. Let's find a way to implement this new behavior without copying 90% of the other tool's code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a unified tool.


if proguard_spec in aar.namelist():
output.write(aar.read(proguard_spec))

# For AARs, META-INF/com.android.tools/ lives inside classes.jar
if classes_jar in aar.namelist():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

classes_jar is essentially a const here. Can we just check if "classes.jar" in aar.namelist()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assigned it to the var and also used it below.

@snazhmudinov snazhmudinov requested a review from ted-xie February 12, 2026 21:13
@snazhmudinov
Copy link
Contributor Author

if the changes look good, I can squash them into one commit before merge.

@snazhmudinov
Copy link
Contributor Author

@ted-xie can you check the latest changes I pushed?

@ted-xie
Copy link
Contributor

ted-xie commented Mar 4, 2026

I think some more context about how the PR import process works would be helpful here. The latest commits you pushed strayed a little far from the vision I had (I probably should have explained my reasoning better).

Background information

This repository is an export of the internal Starlark Android rules that Google uses in our monorepo. We use a tool called Copybara to 1) do a bunch of text-replace transformations; 2) copy translated files to Github. The text transformations are a mixed bag and can range from changing package import paths to changing the default values of attributes to even removing certain code paths entirely. There is a lot of stuff in the Google rules that is not applicable to OSS builds, and vice versa: there are some features that are OSS-specific that we don't want to enable internally. R8 compilation is one of those features that is specific to OSS builds; we have another code optimization tool for internal builds.

To import a PR, we have to pull these changes in via a reverse Copybara transform, which is sometimes tricky, but mostly straightforward. There are certain things that make the import more complicated, since we have to change the Copybara configuration to deal with them. File name changes or deletions are one of the change types that can make this process complicated.

The next step is always tricky and labor-intensive: we have to verify that newly-introduced code does not change how builds work for every single Android target in the entire monorepo. Not just binaries, but tests and libraries, too. This includes huge apps like Gmail, Maps, etc. The more logic that a PR touches, especially in intricate parts of the codebase such as resource processing or proguard spec mangling, the more likely it is that the PR will cause an app breakage. We have had to permanently revert community PRs before because they broke internal apps. Just setting up the monorepo-wide tests can take a day+, not to mention running the tests (takes a whole day), and triaging breakages.

This PR

Your proposed change has a lot of features that complicate the import considerably.

  • Renaming a toolchain attribute: This isn't the worst thing, but a little unnecessary (I'll elaborate below)
  • Renaming a tool and adding new features to it
    • Since git doesn't really support file "move" changes, the diff appears as a deletion and new file addition, so it's impossible for me to see what actually changed. You might have only changed 4 lines, but it appears as ~300 new lines, which is unreviewable.
  • Unconditionally changing how AAR proguard collection works. Maybe this works for your particular usecase, but it might cause proguard rule forwarding behavior to change for internal apps, not to mention non-Google apps that depend on this codebase.

My suggestions

Given the background information and my notes above, I think there are a few things we could do to still try and get this change imported.

  • I would much rather see a change where much of the business logic in tools/android/aar_embedded_proguard_extractor.py is abstracted to a library, since that logic would be common to proguard spec extraction for both AARs and JARs. Any new additional behavior for jar proguard extraction could be added on top of the core business logic, and any new jar_proguard_extractor_tool would basically be a thin shell around the library call. This minimizes logic duplication and follows general conventions for behavior encapsulation.
    • You'll still probably have to add a new toolchain attribute, but attr additions are much easier to test and vet than renames for us.
  • The new logic you've added in aar_import() should be hidden behind some kind of feature flag. Is it possible to check within aar_import() whether R8 was used for compilation? If not, please consider adding a Starlark flag that optionally enables this behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants