Don't inline callees if they might require _compute_sparams #59013


Status: Open. Wants to merge 2 commits into base: master.

Conversation

@serenity4 (Member) commented Jul 15, 2025

Fixes #58915. See #58996 for an initial workaround to that issue.

This PR prevents inlining callees that use static parameters in a way that would insert a _compute_sparams call that may survive post-inlining optimizations, because that built-in is quite expensive. The implementation in this PR is quite conservative, since post-inlining optimizations may in fact remove _compute_sparams, so I'm not sure this is the right approach (one alternative might be DCE on the inlined body when the returned value is unused and the static parameter was only used to compute that return value). A more robust approach would most likely require inlining, optimizing, and then de-inlining if _compute_sparams survives, which seems like it would add quite a lot of complexity.

EDIT: a more recent PR #59018 also addresses the original issue with a different approach, which might be a more robust fix and easier to get in.
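To make the failure mode concrete, here is a hedged sketch (not code from this PR; `needs_T` and `callsite` are illustrative names): a callee that uses its static parameter in its body. If such a callee is inlined into a caller where the argument type is imprecise, the inlined body cannot resolve the static parameter statically, which is exactly the situation where `Core._compute_sparams` gets inserted.

```julia
# Illustrative only: `T` appears in the body of `needs_T`. With `x` only known
# as `Any` at the call site, inlining `needs_T` into `callsite` leaves `T`
# unresolved statically, so the inlined body may need a runtime
# `Core._compute_sparams` call to recover it.
needs_T(x::T) where {T} = sizeof(T)
callsite(x) = needs_T(x)
```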

@serenity4 serenity4 requested review from vtjnash and topolarity July 15, 2025 19:11
@serenity4 serenity4 self-assigned this Jul 15, 2025
@topolarity (Member) commented Jul 15, 2025

Is there any concern for `if @generated` functions that don't use static parameters?

I wasn't able to produce an MWE that would behave poorly, but it seems like it would still be a bad decision to inline those if it means we effectively lock into the fallback "else" path instead of being eligible for evaluation of the "then" block.

edit: Here's a test case:

julia> function foo(x)
           if @generated
               if x isa DataType && x.name === Type.body.name
                   return Core.sizeof(x.parameters[1])
               end
           else
               return (println("slow path"); Core.invokelatest(Core.sizeof, x))
           end
       end;
julia> z = Int; # intentionally not a `const` for poor inference
julia> bar() = foo(z)
julia> @code_typed bar()
CodeInfo(
1 ─ %1 = Main.z::Any
│        invoke Main.println("slow path"::String)::Any
│   %3 =   dynamic builtin (invokelatest)(Core.sizeof, %1)::Any
└──      return %3
) => Any

@serenity4 (Member, Author):

This concern seems valid to me. Where do we decide whether or not to take the fallback path for functions containing `if @generated`?

if uses_sparams && !validate_sparams(mi.sparam_vals)
# Inlining this would require the use of `_compute_sparams`,
# which is a bit too expensive to reasonably inline.
return MAX_INLINE_COST
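For context, a rough approximation of what the guard above tests (hedged; `has_unresolved_sparam` is a hypothetical helper, not the actual compiler code): `validate_sparams` fails when any static parameter value in the MethodInstance is still an unresolved TypeVar.

```julia
# Hypothetical sketch: if any entry of a MethodInstance's `sparam_vals` is
# still a TypeVar, the static parameters were not fully resolved at inference
# time, so inlining the body would require a runtime `Core._compute_sparams`
# call — the case the guard above refuses to inline.
has_unresolved_sparam(sparam_vals) = any(v -> isa(v, TypeVar), sparam_vals)
```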
Review comment (Member):

Does this mean that Core._compute_sparams is completely unused now?
(unless an adventurous user writes it by hand)

Reply (Member, Author):

Actually, yes, unless we decide to make the decision post-inlining (where we would rely on Core._compute_sparams eventually being removed); in that case it would live during optimization but would never be executed.

@topolarity (Member):

> This concern seems valid to me. Where do we decide whether or not to take the fallback path for functions containing `if @generated`?

I'm not very familiar, but I believe this is handled by retrieve_code_info / get_staged and implicitly depends on the MethodInstance queried (via Base.may_invoke_generator I think)
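A hedged sketch of how that query could be exercised from the REPL (these are unexported internals and the exact signatures are an assumption on my part, not something stated in this thread):

```julia
# Assumption: Base.method_instances and Base.may_invoke_generator are internal,
# unexported APIs and may change. For the `foo` test case above, this would ask
# whether the generator can actually be invoked for a given specialization,
# i.e. whether the argument types are concrete enough.
mis = Base.method_instances(foo, (Type{Int},))
# isempty(mis) || Base.may_invoke_generator(first(mis))
```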

@serenity4 (Member, Author) commented Jul 15, 2025

Thanks for the test case! I would consider it reasonable not to inline methods that have generators if we will not be able to invoke said generator because of non-concrete argument type information. This way, they get the chance to have their generator evaluated through dynamic dispatch, under the assumption that a @generated fallback will be slower than a dynamic dispatch.

Would there be any other situations where we evaluate the fallback branch after that change?

EDIT: this could be an alternative to this PR, as it would also address the raised issue. Nevertheless, I think the original concern about _compute_sparams surviving through inlining is valid and might justify keeping the approach taken in this PR (or another one, e.g. doing that post-inlining) as well.

@JeffBezanson (Member):

> I would consider it reasonable not to inline methods that have generators if we will not be able to invoke said generator

Agree. Originally, if we didn't infer concrete argument types for a generated function we did no optimizations at the caller at all, and punted to run-time code generation. That's the point of @generated. Later when we added fallbacks, we only used them to try to get better type inference at the call site but nothing else. That's generally how it should work.

The only case where we could statically decide to call or inline the fallback is in a build for compiler-less deployment (e.g. trimming), but that's a very marginal/unimportant case since anyone doing such a build will want their generated functions fully generated and optimized at compile time anyway.

@serenity4 (Member, Author) commented Jul 16, 2025

Thanks for the clarification. Am I right that the following would then be an unsound implementation that we should not expect to work properly?

function f59013_generator_const(x)
    if @generated
        if x isa DataType && x.name === Type.body.name
            return 0
        end
    else
        return -1
    end
end

(currently, inference goes through the else branch, then constprops the result to -1 regardless of inlining)

@serenity4 (Member, Author):

> The implementation in this PR is quite conservative, because post-inlining optimizations may remove _compute_sparams, therefore I'm not sure this is the right approach

A concrete example of this is

julia> named_tuple_elim(name::Symbol, result) = NamedTuple{(name,)}(result)
named_tuple_elim (generic function with 1 method)

julia> src = code_typed1(named_tuple_elim, Tuple{Symbol, Tuple})
CodeInfo(
1 ─ %1 =   builtin Core.tuple(name)::Tuple{Symbol}
│   %2 =   builtin Core.apply_type(Main.NamedTuple, %1)::Type{NamedTuple{_A}} where _A
│   %3 =   dynamic (%2)(result)::NamedTuple
└──      return %3
)

whereas previously we inlined the NamedTuple constructor and optimized _compute_sparams away:

julia> src = code_typed1(named_tuple_elim, Tuple{Symbol, Tuple})
CodeInfo(
1 ─ %1 =   builtin Core.tuple(name)::Tuple{Symbol}
│   %2 =   builtin Core.typeof(result)::Type{<:Tuple}
│   %3 =   builtin Core.apply_type(Core.NamedTuple, %1, %2)::Type{NamedTuple{_A, var"#s185"}} where {_A, var"#s185"<:Tuple}
│   %4 = %splatnew(%3, result)::NamedTuple
└──      return %4
)

@JeffBezanson (Member):

> Thanks for the clarification. I assume this would be an unsound implementation that we expect to not work properly?

Yes, this would be undefined behavior.

Successfully merging this pull request may close these issues.

Poor codegen with setindex on a NamedTuple