Skip to content

Conversation

@kuzjka
Copy link

@kuzjka kuzjka commented Aug 11, 2025

New feature

This PR continues the mailing list thread and suggests system properties to control base64Binary values formatting:

  • org.apache.xml.security.base64.lineSeparator, org.apache.xml.security.base64.lineLength - allows to override default MIME encoding settings
  • org.apache.xml.security.base64.ignoreLineBreaks - disables line wrapping for base64 value, but allows to keep the whole XML pretty-printed. Takes precedence over line wrapping settings above.
  • existing org.apache.xml.security.ignoreLineBreaks takes precedence over all new options.

Default values result to CRLF line breaks with 76 chars in the line, making the whole thing fully compatible with previous implementation.

Other changes

Some of the logic is included to XMLUtils to provide better encapsulation:

  • adding line breaks in the beginning and in the end of wrapped base64 values in the element is implemented as XMLUtils.encodeElementValue
  • Base64OutputStream from Commons, used in XML encryption, is replaced with a java.util implementation and provided by XMLUtils, making it consistent with configured Base64 encoder.

Base64.Encoder/Decoder instances are thread-safe, so sharing a single instance should give us a little performance gain.

Test coverage

As the base64 configuration takes place during class load, unit-tests leverage from custom ClassLoader and Reflection API to reinitialize the class in each test.

Motivation

https://www.w3.org/TR/xmlschema-2/#base64Binary - base64Binary definition has note, that original RFC2045 line-length limitation must not be enforced. In current implementation it is only possible to remove all line breaks in the document using org.apache.xml.security.ignoreLineBreaks, sacrificing human readability.

Also, using LF instead of CRLF may be desired in systems where verifying side does not expect escape sequences in base64Binary values.


I would be thankful for your comments and suggestions.

@seanjmullan
Copy link
Member

How about also removing org.apache.xml.security.utils.Base64? Is it needed anymore?

@kuzjka
Copy link
Author

kuzjka commented Oct 31, 2025

I can't see any usage of this class inside Santuario. However, I didn't touch it to avoid breaking changes. Despite deprecated, I can see it was retained in 8016fc3 for compatibility. Probably, @coheigea would give us better insight.

Copy link
Member

@seanjmullan seanjmullan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments on a first pass.

* DOM and XML accessibility and comfort functions.
*
* @implNote
* Following system properties affect XML formatting:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "The following ..."

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 8a598d0, thank you

import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you revert this change? I think it is better not to wildcard imports.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, my IDE did it :) reverted

return ignoreLineBreaks;
}

public void setIgnoreLineBreaks(boolean ignoreLineBreaks) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of set methods, why not add a constructor that takes the options and also checks for illegal syntax?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we would like to retain meaningful log messages for invalid property values, I couldn't find a better way than to move all properties reading together with validation logic to the constructor of Base64FormattingOptions.

Added warnings in case a property is set, but ignored.
Added handling of NumberFormatException as suggested in the comment below.

Example log:

testIgnoreLineBreaksTakesPrecedence:
... org.apache.xml.security.ignoreLineBreaks property takes precedence over org.apache.xml.security.base64.ignoreLineBreaks, line breaks will be ignored
... Property org.apache.xml.security.base64.lineSeparator has no effect since line breaks are ignored
... Property org.apache.xml.security.base64.lineLength has no effect since line breaks are ignored

testIllegalPropertiesAreIgnored:
... Illegal value of org.apache.xml.security.base64.lineSeparator property is ignored: illegal
... Illegal value of org.apache.xml.security.base64.lineLength property is ignored: illegal

(PrivilegedAction<Boolean>) () -> Boolean.getBoolean("org.apache.xml.security.ignoreLineBreaks"));
(PrivilegedAction<Boolean>) () -> Boolean.getBoolean(IGNORE_LINE_BREAKS_PROP));

private static Base64FormattingOptions base64Formatting =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if org.apache.xml.security.ignoreLineBreaks is true, none of these options matter, so did you consider skipping the getting of the other properties?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also should log a warning if the other properties are set, since they have no impact if org.apache.xml.security.ignoreLineBreaks is true.

}

private static final Logger LOG = System.getLogger(XMLUtils.class.getName());
Integer lineLength = Integer.getInteger(BASE64_LINE_LENGTH_PROP);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to use System.getProperty and then Integer.valueOf(String) because Integer.getInteger does not distinguish between the property not being set and an invalid value for the integer.

/**
* Creates new formatting options by reading system properties.
*/
public Base64FormattingOptions() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constructors and methods of this class should be package-private since the class is package-private.

this.bytes = bytes;
}

public byte[] getBytes() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to package-private.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing Apache license header.

import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.*;
import static org.junit.jupiter.api.Assertions.*;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment describing what this test does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider testing the properties by generating signed XML and then checking the format? This test won't catch issues where the internal methods are not called accidentally.

Copy link
Author

@kuzjka kuzjka Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that XMLUtils is a utility class with static methods, and the configuration can only be done with system properties at the class load time makes it hard to test. That's why the test contains custom class loader which reloads the class in each test, allowing to run static fields initialization in the middle of the test.

To build a sort of integration test against public API e.g. XMLSignature, I believe we will need to load XMLSignature and all the classes it depends on with this custom classloader. There are additional considerations regarding Jigsaw module system: since the new classloader loads classes in its own module, all packages from xmlsec which are not exported cannot be used by this classloader. I can try to implement such a test, but I'm afraid it will be totally unreadable...

Intuitively, a good test interacts with the code in a way it is intended to be used. In our case, parts of the public interface are system properties. It looks proper to run the test suite with different JVMs started with different properties. Probably, Maven has something to achieve this, I'll investigate.

On the other hand, we can make it more testable (and probably a little more convenient for usage) if we provide a way to set Base64 options in runtime, for example:

  • refer to system properties and resolve the options each time we do base64 encoding (e.g. inside encodeToString etc.)
  • add public setters for base64FormattingOptions and ignoreLineBreaks
  • add public setter for base64Encoder - then we don't need new options at all. We can default to Base64.getEncoder() / Base64.getMimeEncoder() depending on ignoreLineBreaks, and if the clients need custom formatting - they just provide any implementation of Base64.Encoder

However, I'm not aware of security implications behind this...

All other inline comments were addressed in 8e67aaf. Thank you for checking the code carefully.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a limitation of junit in that it isn't possible or easy to run tests with different system property settings. Anyway, I don't think you should spend an inordinate amount of time trying to address my concern. See what you can do, but if it seems like too much work or is not feasible for some reason, let's just use the current test. I would prefer to avoid adding hooks to set the options at runtime as that has the potential to be misused. Thanks.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, take a look at 908c97f

We can check the formatting in various scenarios with these steps:

  1. Annotate test class or method with @FormattingTest (this adds a JUnit tag)
  2. Get a FormattingChecker instance using FormattingCheckerFactory - it returns an implementation depending on system properties
  3. Implement test scenario, i.e. create an XML signature.
  4. Check line breaks in the whole document and formatting of Base64 values using FormattingChecker
  5. Maven Surefire takes care to run the tagged tests with various configurations because of additional <executions>

Looks a little clearer than cheating with class loaders.

I started by refactoring the XMLUtilsTest, which can be retained as a unit test suite. It looks much simpler now.
Now, if the approach looks good, I can add a few high-level tests for basic scenarios (signature/encryption, DOM/StAX)

@coheigea
Copy link
Contributor

8016fc3

I think we are good to remove Base64, as OpenSAML stopped using it here https://shibboleth.atlassian.net/browse/OSJ-287

* added FormattingTest annotation (JUnit tagging)
* added FormattingChecker interface, various implementations for different formatting configurations and a factory to get appropriate implementation
* added formatting options properties sets and multiple executions for Surefire plugin
* refactored XMLUtilsTest: made it a @FormattingTest, removed hacks with classloader
@kuzjka
Copy link
Author

kuzjka commented Nov 11, 2025

I think we are good to remove Base64

Thank you for clearing it up. There are also two duplicating getters XMLUtils#ignoreLineBreaks() and XMLUtils#isIgnoreLineBreaks(). After removing Base64 neither of them is used internally. Probably we can deprecate at least one of them.

Please let me know if you would like me to do this cleanup as a separate change.

@coheigea
Copy link
Contributor

If they're not used then let's deprecated both of them, thanks.

@seanjmullan
Copy link
Member

I think we are good to remove Base64

Thank you for clearing it up. There are also two duplicating getters XMLUtils#ignoreLineBreaks() and XMLUtils#isIgnoreLineBreaks(). After removing Base64 neither of them is used internally. Probably we can deprecate at least one of them.

Please let me know if you would like me to do this cleanup as a separate change.

Probably best to do this as a separate change just in case we have to later restore them for some reason.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants