Skip to content

Conversation

jholecek-rh
Copy link
Contributor

All Pull Requests:

Check all that apply:

  • Have you followed the guidelines in our Contributing document, including the instructions about commit messages?
  • No Sensitive Data in this change?
  • Is this PR to correct an issue?
  • Is this PR an enhancement?

Complete Description of Additions/Changes:

Fixes two issues and adds two improvements:

  • FileProvider would read files outside the root if the root were a prefix of the file path string-wise. Consider a root path /tmp/sample/root and a relative path ../root.txt, resulting in reading a file /tmp/sample/root.txt.

  • The "inside root" check was performed after checking if file existed. This would expose information about files outside the root through the exception. While an exploit would be difficult (if possible at all), it is better to fix this weakness anyway.

  • The exception type is changed from the most generic Exception to ValueError to make it easier to distinguish it from ContentException that is raised in other cases.

  • The "/" literal in path operations is changed to os.sep to improve portability.

Fixes two issues and adds two improvements:

* FileProvider would read files outside the root if the root were
  a prefix of the file path string-wise. Consider a root path
  `/tmp/sample/root` and a relative path `../root.txt`, resulting in
  reading a file `/tmp/sample/root.txt`.

* The "inside root" check was performed after checking if file existed.
  This would expose information about files outside the root through the
  exception. While an exploit would be difficult (if possible at all),
  it is better to fix this weakness anyway.

* The exception type is changed from the most generic `Exception` to
  `ValueError` to make it easier to distinguish it from
  `ContentException` that is raised in other cases.

* The `"/"` literal in path operations is changed to `os.sep` to improve
  portability.

Signed-off-by: Jan Holeček <71874510+jholecek-rh@users.noreply.github.com>
@codecov-commenter
Copy link

codecov-commenter commented Jun 5, 2025

Codecov Report

Attention: Patch coverage is 81.81818% with 2 lines in your changes missing coverage. Please review.

Project coverage is 77.67%. Comparing base (771b3ba) to head (a039111).

Files with missing lines Patch % Lines
insights/core/spec_factory.py 81.81% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #4474   +/-   ##
=======================================
  Coverage   77.67%   77.67%           
=======================================
  Files         745      745           
  Lines       41737    41742    +5     
  Branches     6703     6704    +1     
=======================================
+ Hits        32418    32423    +5     
  Misses       8290     8290           
  Partials     1029     1029           
Flag Coverage Δ
unittests 77.66% <81.81%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.


self.validate()

def _is_inside_root(self):
Copy link
Contributor

@xiangce xiangce Jun 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this method is not used anywhere out the validate, better make it an inner funciton of the validate, Otherwise, we may need to add particular tests it (although it's a private one)

Copy link
Contributor Author

@jholecek-rh jholecek-rh Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot agree with this. In my opinion, nested function and class definitions are a niche feature for specific cases (e.g. wrappers). Nested functions cannot be accessed, reused, or called manually when troubleshooting.

I intentionally chose a private name for _is_inside_root so that it remains an implementation detail. It is not part of the public interface and thus should not need a dedicated test. The implementation is properly covered by tests that I did add (in this PR and in #4458) that use the public validate method.

In fact, I would argue that validate should not be public either. We touched the issue of interfaces of provider classes in our private conversations and I do intend to continue those. However, clarifying the public interface of the FileProvider class was beyond the scope of this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S. The FileProvider.validate() method is not used only in FileProvider.__init__. Should it be defined as a nested function in the __init__ method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't get your point

  1. Although the validate() is called only in the __ini__, but for different child classes, it has different implementation. It should not be designed as inner function.
  2. inner function, only works for function which is very simple (just like this one) and sometimes a bit common (can be called times) for its parent function.

If its function is common enough, and is useful not only for its parent caller. We can upgrade it as you did or might need to consider moving it to the insights.util as a utility function.

Because I didn't get from the code and feel hard to imagine the possibility that it will be used by other cases in this class, I asked you move it to a inner function or just expand it in the caller.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although the validate() is called only in the ini, but for different child classes, it has different implementation. It should not be designed as inner function.

I do agree that the validate() method should not be defined as an inner function. I used this example to demonstrate why I do not agree with the reasons that you provided for defining _is_inside_root() as an inner function.

As for the "different implemenations", no subclass of FileProvider overrides the validate() method. The CommandOutputProvider class happens to define a similar method, but the implementation is completely independent. I have brought up issues in the insight-core hierarchy of provider classes with you in a separate discussion. I think the hierarchy deserves a close review of the interfaces and separation of responsibilities. I would prefer to keep this topic separate from this PR that aims to fix a vulnerability.

As for the rest, it seems that our views of what a well-structured, readable code looks like differ significantly. The structure and readability of insights-core code is actually one of the broader topics that we wanted to discuss with you, because my team continuously struggles when dealing with the insights-core code base.

I cannot continue this PR without a better understanding of/alignment on desired code style and its justification. I will reach out to you in private to agree next steps and timelines.

@xiangce xiangce added the data processing Changes that would impact the data processing/analyzing label Jul 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
data processing Changes that would impact the data processing/analyzing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants