Skip to content

Conversation

@wenyingd
Copy link

@wenyingd wenyingd commented Sep 4, 2025

  1. Refine the inbound function in openflow stream.
  2. Fix byte slice copy issues.
  3. Fix the potential overflow issue when encoding packetOut message

@vanytsvetkov
Copy link

vanytsvetkov commented Sep 4, 2025

A few questions, if I may.

  1. If it's not a secret, what problem does this patch solve?
  2. Are you sure that adding a proprietary vendor AntreaLargePacket field is the right way to develop the project? Why I'm asking this question is that when making changes to the openvswitch source code, we encountered a similar problem with the extensibility of the libOpenflow source code, which forced us to make a fork and make our own edits directly to it. I suggest looking towards the possibility of increasing the number of fields from outside the repository, rather than adding a custom field at the first problem.

@wenyingd
Copy link
Author

wenyingd commented Sep 4, 2025

A few questions, if I may.

1. If it's not a secret, what problem does this patch solve?

2. Are you sure that adding a proprietary vendor AntreaLargePacket field is the right way to develop the project? Why I'm asking this question is that when making changes to the openvswitch source code, we encountered a similar problem with the extensibility of the source code, which forced us to copy the source code and make our own edits directly to it. I suggest looking towards the possibility of increasing the number of fields from outside the repository, rather than adding a custom field at the first problem.

@vanytsvetkov Thanks for providing the feedback.

It is the issue this change is expected to resolve: antrea-io/antrea#7386

The new value of the property is set/consumed only inside libOpenflow rather than updating OVS source code. The new flag is used only when a large packetIn message is received in which the actual size of message is overflow, so the length directly decoded from the bytes is untrusted. If we search solution from outside repository, it means we need to ask OVS to extend the OpenFlow message format, e.g., switch the type of "length" field from uint16 to uint32, it may introduce more change in most of the OpenFlow users. So I don't think it is acceptable in a short time.

@vanytsvetkov
Copy link

A few questions, if I may.

1. If it's not a secret, what problem does this patch solve?

2. Are you sure that adding a proprietary vendor AntreaLargePacket field is the right way to develop the project? Why I'm asking this question is that when making changes to the openvswitch source code, we encountered a similar problem with the extensibility of the source code, which forced us to copy the source code and make our own edits directly to it. I suggest looking towards the possibility of increasing the number of fields from outside the repository, rather than adding a custom field at the first problem.

@vanytsvetkov Thanks for providing the feedback.

It is the issue this change is expected to resolve: antrea-io/antrea#7386

The new value of the property is set/consumed only inside libOpenflow rather than updating OVS source code. The new flag is used only when a large packetIn message is received in which the actual size of message is overflow, so the length directly decoded from the bytes is untrusted. If we search solution from outside repository, it means we need to ask OVS to extend the OpenFlow message format, e.g., switch the type of "length" field from uint16 to uint32, it may introduce more change in most of the OpenFlow users. So I don't think it is acceptable in a short time.

Thank you for your reply, now everything is clear.

@antoninbas
Copy link

@wenyingd Does this PR supersede #66 as well?

n += 4
p.ExperimenterType = binary.BigEndian.Uint32(data[n:])
n += 4
if len(data) < int(p.Length) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am having a hard time understanding this code. I took a look at the Openflow spec and we should always have len(data) >= int(p.Length), i.e., the length of the message should be at least equal to the property length (which includes header + experimenter data). What am I missing?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, it should be an issue existing for long. I think here should be if n < int(p.Length)

@wenyingd
Copy link
Author

wenyingd commented Sep 5, 2025

@wenyingd Does this PR supersede #66 as well?

I don't think so. Do you want me to do that?

@wenyingd wenyingd force-pushed the refine_stream branch 2 times, most recently from a1f3333 to 3bcc3ff Compare September 5, 2025 02:22
@antoninbas
Copy link

@wenyingd Does this PR supersede #66 as well?

I don't think so. Do you want me to do that?

Not necessarily. I just noticed that you commented on that PR, so I was wondering if there was any relationship between the 2 changes.

@wenyingd wenyingd force-pushed the refine_stream branch 2 times, most recently from 319558c to 58ad864 Compare September 8, 2025 10:54
@wenyingd wenyingd requested a review from antoninbas September 9, 2025 08:51
@wenyingd wenyingd requested a review from antoninbas September 10, 2025 05:30
@wenyingd
Copy link
Author

@antoninbas I removed "AntreaLargePacket" by modifying the message "unmarshal" logic in the latest update, could you take another look?

@wenyingd wenyingd force-pushed the refine_stream branch 2 times, most recently from ddba8b4 to e3d225b Compare September 12, 2025 06:33
return uint16(p.ActLength())
}

func (p *PacketIn2PropPacket) ActLength() int {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the code I am having a hard time understanding why we need the ActLength method instead of just putting the correct implementation in the Len method for PacketIn2PropPacket? Then we would not need getPacketIn2PropertyLength?

Copy link
Author

@wenyingd wenyingd Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type of return value for function Len is uint16, it always map to the value of field "Length". And the function Len is defined in the interface Message. If we update the value type from uint16 to int, the impact is huge, we need to update almost all messages.

@wenyingd
Copy link
Author

@antoninbas Do you have other comments on this PR? Or can we move forward?

Copy link

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I am fine with the change, as I mentioned to you offline it feels like this is a bug in OVS (they should never generate an invalid Openflow message like this), and we should try to get to the bottom of this

}

return nil
func getPacketIn2PropertyLength(prop Property) int {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The getPacketIn2PropertyLength function returns the length of any property, not just PacketIn2PropPacket, so I am not sure why the function is named like this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed as getPropertyLength.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem that this comment has been addressed?
Can you make sure other resolved comments have been addressed as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

@wenyingd
Copy link
Author

OVS issue is created: openvswitch/ovs-issues#371

@wenyingd wenyingd requested a review from antoninbas September 22, 2025 01:36
@wenyingd
Copy link
Author

@antoninbas Do you have other comments on this change?

}

return nil
func getPacketIn2PropertyLength(prop Property) int {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem that this comment has been addressed?
Can you make sure other resolved comments have been addressed as well?

util/stream.go Outdated
}
// Check if the message is Type_PacketIn2 or not.
experimenterType := binary.BigEndian.Uint32(venderHeaderMessageBytes[12:])
if experimenterType != 30 {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, and anyway this code is not supposed to be specific to openflow15

But maybe we should just redefine local constants in this file?

}

// Handle inbound messages
func (m *MessageStream) inbound() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have good confidence that the updated function will perform as well, or almost as well (performance-wise), as the previous version?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not did some performance test yet. In theory, the new logic should perform better.

antoninbas
antoninbas previously approved these changes Sep 24, 2025
Copy link

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I think @tnqn should take a look too if he has time

util/stream.go Outdated
}
// Check if the message is Type_PacketIn2 or not.
experimenterType := binary.BigEndian.Uint32(venderHeaderMessageBytes[12:])
if experimenterType != 30 {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just suggesting adding local constants const experimenterTypePacketIn2 = 30 and const messageTypeVendorHeader = 4 in this file, not any major refactoring

Comment on lines 776 to 782
return uint16(p.actualLength())
}

func (p *PacketIn2PropPacket) actualLength() int {
n := int(p.PropHeader.Len()) + int(p.Packet.Len())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Len() just calls actualLength(), why it needs two functions?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function Len return type is uint16, which is defined in the interface, if the sum of the fields is exceeds 2^16, the result will overflow. Function actualLength return type is int,

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comments in the code.

@wenyingd wenyingd requested a review from tnqn September 26, 2025 06:07
1. Refine the inbound function in openflow stream.
2. Fix byte slice copy issues.
3. Fix the potential overflow issue when encoding packetOut message

Signed-off-by: Wenying Dong <wenying.dong@broadcom.com>
length, err := getMessageLength(reader)
var smallOFErr *smallMessageError
var netErr *net.OpError
if errors.Is(err, io.EOF) ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't Peek return ErrBufferFull when n is larger than buffer size? Why doesn't it handle that but EOF and other errors?

Copy link
Author

@wenyingd wenyingd Sep 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get your question, "n" is always controlled by our call on Peek, the max value used in our function "getMessageLength" is 20, while reader may set its buf size with 4096 at least, so we shall not hit the issue.

While other errors we explicitly set are to catch the exceptions that we may hit in the runtime like connection issues and the message encoding issues from the sender (e.g. OVS).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, thanks for explaining.

@wenyingd wenyingd merged commit 4ae5ac9 into antrea-io:main Sep 26, 2025
4 checks passed
antoninbas pushed a commit to antrea-io/antrea that referenced this pull request Sep 26, 2025
Fixes #7386

See antrea-io/libOpenflow#80

Signed-off-by: Wenying Dong <wenying.dong@broadcom.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants