(After a lengthy discussion on the IPv6 WG mailing list, Jari Arkko summarizes the results:)

I'll try to summarize the discussion about this issue and propose a way forward.

The proposal was to allow the use of randomly generated addresses a la 3041 for link-local addresses and care-of addresses in mobile nodes while they are roaming around. Since RFC 2462 requires less drastic collision actions (disable address) for random addresses than for EUI-64 based addresses (disable interface), this was thought to have a positive effect on the resiliency of the system, if after failure one would still like to use the interface after moving to another link.

A part of the discussion that arose dealt with the issue of skipping DAD and how useful (or not) it is. This part of the discussion is not relevant for our original question, because we follow the current RFCs and do DAD as specified. However, it appears to be a periodically discussed subject in the IPv6 WG. As such, some further work in the WG might look at that. The ongoing effort to look at optimistic DAD is one such item and I will suggest another one later in this e-mail.

There was also a lot of discussion of EUI-64 based address conflicts. This is also outside the scope of our proposal, and is ignored in this e-mail.

But back to the proposal and the issues around it. The following points were made:

Protocol issues:

* Clearly, mobile nodes can use RFC 3041 addresses like any other application. No issue here (Hesham).
* Mobile IPv6 should disable the link but when moving to a new link can re-try DAD (Pekka).
* Discussion concludes that MAY use 3041 (Mika).
* Move on without this and discuss the issue in depth in IPv6 later (Jim).
* There is a need to work on clarifications and treatment of wireless cases for DAD (Greg).

Desired behaviour:

* Collision is so rare, should not bother to prepare for nice treatment if it happens (Mika).
* Rare errors should also be handled, many IPv6 error handling mechanisms such as DAD work with rare events (Hesham, Jari).
* Collision means something is seriously wrong, should not continue (Bob, Pekka).
* Devices should recover by themselves after moving elsewhere, otherwise device needs user attention or even service (Jari).
* Disabling the interface is harmful to users (Charlie).
* Diagnostics on the mobile node will be able to tell what's going on (Greg).

Specifics of collisions:

* There's a difference if the DAD failure is due to this node or another node, in the latter case disabling interface harms future use. Trouble is, hard to know which case it is (Jari).
* Inspection of SSLA can tell you whether to disable the interface or not (Greg).
* DAD failure means the address is bad, not interface (Charlie).
* Why does the collision happen - EUI-64 collision, software/hardware problem, malicious user, random collision? This has an impact on the proper "cure" (Jari, Pekka, Vijay).

Summary

A key requirement question is of course whether it is desirable for the mobile nodes to disable themselves or have some form of automatic resilience against temporary problems. There was no clear consensus on this issue (though the question about whether EUI-64 or something else is used affected the discussion). My belief is that some level of resilience is necessary, unless EUI-64 addresses are used. This idea seems to be built into RFC 2462 as well, because it has different collision actions.

Another important question is to what extent the existing protocols already support the desired behaviour. Clearly, mobile nodes can use 3041 addresses when they move around just like other nodes. Given the current 2462 rules, collisions for such addresses do not disable the interface. There are two limitations of this, however: (a) 3041 does not specify the creation of link-local addresses and (b) 3041 says that after 5 collisions, stop generating any more addresses on that interface.
This is almost sufficient for the desired behaviour, since a mobile node could presumably use only global care-of addresses, and getting to five collisions is highly unlikely. (Though it would seem better to have the limit per link.)

Another current protocol facility is Neighbor Discovery in general, and its concept of an interface. We've lately started to realize that it may not always be clear what is considered as an interface. Does movement to a new link constitute the initialization of an interface, resetting all knowledge of DAD failures, reconstructing all data structures?

Recommended way forward:

Existing specifications already can give (very) limited support for the feature that we desired. We could add the link-local case and in doing so say that the 5-collision limit is per link. We could also specify that DAD issues in general are "reset" upon movement to a new link. The latter in particular would appear quite reasonable. However, I am afraid of two things:

(1) Perhaps the understanding of what ND information gets reset in what kind of movements should be developed in full for all of ND, and not just for the particular case of collisions.
(2) Continued discussion.

Therefore, my recommendation is that we delete Section 7.6 from the Mobile IPv6 draft that deals with this issue, and work for the better understanding of what an interface means in the context of the following work:

- A full & optimized movement detection scheme that has been suggested to be worked on in the MIP WG or one of its follow-up WGs.
- Optimized DAD.
- Possible other new work taking place wrt ND or DAD.

----------------

Jim Bound responds to Jari Arkko:

Charlie and I have been having an offline discussion. We don't fully agree but we agree on a lot more :--) And reading the mail. Here is a wild suggestion, close to what you first proposed but with a restriction clause. In the spec: MAY use 3041 in DAD failure.
But I would suggest wording that this is for the specific case of a "mobile node" moving.

----------------

Jari Arkko responds to Jim Bound:

Well, perhaps this could be one way forward. But the above e-mail and some private conversations with other folks have alerted me to the fact that the original text that we proposed was unclear. It seems that many people interpreted it as "IF you run into a DAD failure, THEN you MAY start to use temporary addresses". However, this was not the intention. What we wanted to say was "You MAY use temporary addresses all of the time". Clarifying that might be another way forward. I do like your idea of binding this to movements only, perhaps that could also be done. How about:

7.6 Failures from Duplicate Addresses

Upon failing Duplicate Address Detection, [13] requires IPv6 nodes to stop using the address and wait for manual configuration of a new address. In addition, if the failed address was a link-local address formed from an interface identifier, the interface should be disabled. Mobile nodes that wish to avoid a disabled link MAY avoid the use of interface identifiers when generating link-local addresses (and subsequent global addresses). If a collision is detected for such an address, it is no longer used but the status of the interface stays unchanged. Instead, mobile nodes MAY use temporary link-local addresses by generating a random interface identifier and using it for assigning itself a link-local address and derived care-of addresses for use on this link. In order to do this, the mobile node applies to the link-local address the procedure described in RFC 3041 [18] for global addresses. Implementations SHOULD NOT make more than 5 consecutive attempts to generate such addresses and test them through Duplicate Address Detection.
If after these attempts no unique address was found, the mobile node SHOULD log a system error and give up attempting to find a link-local address on that interface, until the node moves to a new link.

----------------

Pekka Savola responds to Jari Arkko:

> Bob: I have changed "failure" terminology below, thanks for
> noticing that. Regarding possible remaining problems, I'm not
> sure I see what they are -- can you explain? Note that if there
> are problems, it would appear that use of 3041 addresses in
> general would have to have same kind of problems...?

The use of RFC3041 addresses *does* have problems, that's acknowledged. The scope of such problems and possible fixes (other than disclaimers not to trust in it entirely etc.) are unclear at the moment. This is one reason I'd prefer not to create a dependency, although voluntary, on RFC3041.

> Pekka: I have tried to clarify the wording.
> I agree that no interoperability is required in the
> sense that all involved protocols (3041, DAD) exist already
> and it's all on the mobile node side. But the purpose of the
> text was to give advice on exactly what you said "implementations
> can do something like that if they really feel it's necessary.".

I can live with this text, but IMO it's an unnecessary complication. The use of RFC3041 doesn't fix any problems -- unless there is a genuine clash of EUI64 generated addresses on that specific link, which is not likely. On the other hand, it creates a dependency on RFC3041 and applies it to a situation which is unknown and likely has some problems; RFC3041 is typically used as a form of additional addresses, which are short-lived: what if the MN stays in a subnet longer than the lifetime of the RFC3041 link-local address, for example? On the other hand, saying that disabling the interface until movement to a new link fixes the issue -- in the case of a malicious/misconfigured link.
Therefore, leaving it unspecified _or_ just "clarifying" that disabling the interface only refers to the interface when connected to the link where the DAD failed (you can argue this is true: RFC2461/2462 disable-if-clash originated with an assumption of a stationary node connected to one link with an interface -- the situation could be different on a different link) seems like the best way forward.

I'd be rather sorry to see MN implementations implementing RFC3041 based on the MIPv6 spec recommendation even though it doesn't fix the real underlying problem. But I can live with it.

----------------

Bob Hinden responds to Jari Arkko:

> Bob: I have changed "failure" terminology below, thanks for
> noticing that. Regarding possible remaining problems, I'm not
> sure I see what they are -- can you explain? Note that if there
> are problems, it would appear that use of 3041 addresses in
> general would have to have same kind of problems...?

Right. I see this as adding complexity with little value. Except for the case that is the least likely to happen (a real duplicate MAC based EUI-64 IID), I don't see how the proposed mechanism helps. The 3041 addresses are also likely to fail.

My preference would be to defer this (e.g., remove the third paragraph below) and wait until there is some practice with the protocol and see if there is any evidence of DAD detecting duplicates. I am not aware of this happening in any of the current IPv6 deployments.

I don't see the text as creating any real harm other than making mobile IPv6 implementations more complex, so I won't object further if it stays in the draft.
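----------------

(Illustration, not part of the thread: a minimal sketch of the retry behaviour that the proposed Section 7.6 text describes, i.e. try a random interface identifier, run DAD, and give up after 5 consecutive collisions until the node moves to a new link. The names `random_interface_id`, `link_local_from_iid`, `configure_link_local` and the `is_duplicate` callback are hypothetical. The identifier is drawn directly from a random source, which is a simplification and not the hash-based generation procedure of RFC 3041. Clearing the universal/local bit follows the usual convention for identifiers that are not globally unique.)

```python
import ipaddress
import secrets

MAX_ATTEMPTS = 5  # limit named in the proposed Section 7.6 text


def random_interface_id() -> int:
    """Return a random 64-bit interface identifier (simplified, not RFC 3041's hash)."""
    iid = secrets.randbits(64)
    # Clear the universal/local bit (0x02 in the first octet) to mark the
    # identifier as locally assigned.
    return iid & ~(0x02 << 56)


def link_local_from_iid(iid: int) -> ipaddress.IPv6Address:
    """Combine the fe80::/64 prefix with a 64-bit interface identifier."""
    return ipaddress.IPv6Address((0xFE80 << 112) | iid)


def configure_link_local(is_duplicate, max_attempts: int = MAX_ATTEMPTS):
    """Try up to max_attempts random link-local addresses.

    is_duplicate is a hypothetical stand-in for Duplicate Address Detection:
    it takes a tentative address and returns True if a collision was seen.
    Returns the first unique address, or None if every attempt collided.
    """
    for _ in range(max_attempts):
        tentative = link_local_from_iid(random_interface_id())
        if not is_duplicate(tentative):
            return tentative
    # All attempts failed: the caller would log a system error and stop
    # trying on this interface until the node moves to a new link.
    return None
```

(A caller that treats the attempt counter as per-link state would simply call `configure_link_local` again after movement detection reports a new link. Whether DAD state should be reset on movement is exactly the open question in the thread, so this is one possible reading rather than agreed behaviour.)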
----------------

Jari Arkko responds to Pekka Savola:

> Therefore, leaving it unspecified _or_ just "clarifying" that disabling
> the interface only refers to the interface when connected to the link
> where the DAD failed (you can argue this is true: RFC2461/2462
> disable-if-clash originated with an assumption of a stationary node
> connected to one link with an interface -- the situation could be
> different on a different link) seems like the best way forward.

Well, I'd be happy with defining that the disabled situation lasts only while we are on this link. This does fix the problem, and as you say it does not create a dependency on 3041. But I had the feeling that this definition would have been too drastic in other people's opinion (Bob?). I think the concern is that if you did use EUI-64s and did have a collision, something is very wrong and recovery may not be desirable. I'm not sure I agree, but if this is what people feel, we are left with the leave-it-unspecified option...

----------------

Jari Arkko writes:

Trouble is, Bob and Pekka were still leaning to alternative 1 (not include the text), though maybe not so strongly as before. We might run out of time to agree. A key thing is of course that Erik also agrees on what to do, since he wanted us to raise the discussion... Erik, Gabriel, I think you talked about this. What was the conclusion?

----------------

Erik Nordmark responds to Jari Arkko:

> Yes. Trouble is, Bob and Pekka were still leaning to alternative
> 1 (not include the text), though maybe not so strongly as before.
> We might run out of time to agree. A key thing is of course that
> Erik also agrees on what to do, since he wanted us to raise the
> discussion... Erik, Gabriel, I think you talked about this.
> What was the conclusion?
My conclusion is that as long as the alternative does not apply to link-locals derived from EUI-64 then trying a different link-local address a few times should be harmless; worst case is some extra traffic when there is a DAD DoS attack i.e. when all such nodes on the link will try 5 times instead of 1 time before giving up.

Whether it is actually useful depends on whether folks will use non-EUI-64 derived addresses and what their collision properties will be.

Some nits on the text.

> Upon detecting a collision in Duplicate Address Detection, IPv6
> nodes do not use the address which was determined to be non-unique.
> In addition, if this address was a link-local address formed
> from an interface identifier, the interface should be disabled.

... as specified in RFC 2462.

> Address Detection. If after these attempts no unique address was
> found, the mobile node SHOULD log a system error and give up
> attempting to find a link-local address on that interface, until
> the node moves to a new link.

"If after ..." reads as if the node will do 5 attempts even if the first one is successful. I suggest "If all of these attempts fail ...". Presumably the node should also disable the interface when all fail.

----------------

Gabriel Montenegro writes:

Hmm... Yes, Bob and Pekka are now in "don't object" mode. (and Jim Bound has said: sure, go ahead with the text). Perhaps we should not expect anything beyond "don't object" from them. If they have removed their objection, that's about as strong a statement as we'll get (i.e., I don't believe getting them to actually like the mechanism and ask for it was ever part of the plan).

At this point I would only worry about Erik. To answer your question from another email, I don't believe we arrived at any conclusion when we chatted. If we go back to what prompted this whole thread: Erik didn't seem to have an objection per se. He just wanted this to be discussed in the IPv6 alias.
This is now done, and despite some initial objections, I interpret the current state of affairs as lack of objection. So we've satisfied Erik's request for discussion in the IPv6 alias. Hence, I'd say: move forward with your latest text. I'm cc-ing Erik in case I'm misinterpreting anything here.

----------------

Pekka Savola writes:

It seems to me that the consequence of the latest proposal is that, at least those implementing to the letter, would *never* use EUI64 generated addresses at all. Because when you detect a duplicate, the interface should be disabled (arguable) -- NOT that you should generate an RFC3041 address and try with that 5 times. And that consequence -- completely moving to RFC3041-only -- quite frankly gives me the creeps.

----------------

Erik Nordmark responds to Pekka Savola:

I'm not aware of any technical basis on which to prefer RFC3041 style link-locals over EUI-64 link-locals. The former can see some duplicates (due to randomness) and the latter should in theory not see any duplicates at all. And in both cases significant duplicates are evidence of a DAD DoS attack. So if the text in the MIPv6 spec is read as saying RFC3041 style link-locals are better then I think this is a mistake.

----------------

Pekka Savola responds to Erik Nordmark:

No technical basis, indeed, but spec-wise basis, maybe; RFC2462:

   5.4.5. When Duplicate Address Detection Fails

   A tentative address that is determined to be a duplicate as described above, MUST NOT be assigned to an interface and the node SHOULD log a system management error. If the address is a link-local address formed from an interface identifier, the interface SHOULD be disabled.

> The former can see some duplicates (due to randomness) and the latter should
> in theory not see any duplicates at all.
> And in both cases significant duplicates are evidence of a DAD DoS attack.

Yep.

> So if the text in the MIPv6 spec is read as saying RFC3041 style link-locals
> are better then I think this is a mistake.
The latest proposal (which was worded so as to be 100% compliant with the letter of RFC2462) certainly can be. And I agree that would be a mistake.

----------------

Jari Arkko responds to Erik Nordmark:

The only difference is the severity of the collision action per 2462, as Pekka already noted. So the question remains what to do:

(1) Explain the situation, and say that you MAY avoid the collision action by doing X (the current text).
(2) Fix DoS attacks, then we could claim that no significant sources of collision remain (in progress at SEND).
(3) Make DAD more resilient for the case of multiple links and movements. For instance, say that disabled interface lasts only while on this link, or make a permanent disable only after getting the same problem on N different links...
(4) Some combination of the above.

----------------

Basavaraj Patil responds to Jari Arkko:

> So the question remains what to do:
>
> (1) Explain the situation, and say that you MAY
> avoid the collision action by doing X (the
> current text).

Preferably in an appendix. Maybe we should move the whole section 7.6 to an appendix?

----------------

Jari Arkko responds to Basavaraj Patil:

Doesn't it look more like we should delete all of it? I'm sorry I have continued this thread for so long (and for such an insignificant issue), but I think it is now apparent that we don't agree on the right solution. With Pekka's and Bob's don't-object-but-don't-like-it opinion, we could have kept Section 7.6. But since even Erik disagrees...

----------------

Jim Bound responds to Jari Arkko:

Sounds about right.

----------------

Basavaraj Patil responds to Jari Arkko:

I can live with that.

----------------

Jari Arkko writes:

I'm sure you saw the thread that went on in the IPv6 list about Section 7.6 in our Mobile IPv6 specification. At the end of that thread we were sort of on-and-off whether the text should be kept. Since then we have continued the discussion privately in an attempt to find consensus.
You can see most of the e-mails through the issue 302 description. My interpretation of the discussion is that we appear to have differing opinions and it's probably best to skip Section 7.6 so we can get to publishing the draft, and deal with DAD recovery issues later in other contexts such as SEND, optimistic DAD, or movement detection.

----------------

Erik Nordmark responds to Jari Arkko:

Color me concerned - not outright disagreeing.

If folks want this text because they think random + 5 tries is strictly better than EUI-64 with one try then I think they are deluding themselves. How serious the consequences of such a delusion are is hard to tell.

If the node/interface doesn't have an EUI-64 then using random + 5 tries would be quite reasonable IMHO.

----------------

Jari Arkko responds to Erik Nordmark:

> Color me concerned - not outright disagreeing.

I'd rather see you in black or white since I want to make a black and white decision on including the text today! Or are you saying "concerned" color leads to putting the text in an appendix, as Basavaraj suggested?

> If folks want this text because they think random + 5 tries is
> strictly better than EUI-64 with one try then I think they
> are deluding themselves. How serious the consequences of such a delusion
> are is hard to tell.

I think the root cause of this issue is the need for resilience for various IP layer functions, and the need for this is higher when you move around. But admittedly the text does not really deal with the root cause directly. Rather, it tries to lessen the impacts without modifying the RFCs that deal with the issue directly.

> If the node/interface doesn't have an EUI-64 then using random + 5 tries
> would be quite reasonable IMHO.

Yes, though I suspect that should already be covered in some other specification. Otherwise even stationary nodes without EUI-64 couldn't get access...

----------------
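(Illustration, not part of the thread: a rough estimate of how rare purely random collisions are, which bears on the "random + 5 tries" discussion above. The sketch assumes identifiers are drawn uniformly and independently, that 63 bits are random (64 bits with the universal/local bit fixed), and that the addresses already on the link are distinct. The function names and the example of 1000 nodes are hypothetical. It says nothing about DoS attacks or misconfiguration, which several participants regarded as the more likely cause of repeated DAD failures.)

```python
def single_try_collision_probability(nodes_on_link: int, random_bits: int = 63) -> float:
    """Chance that one uniformly random identifier matches one already in use."""
    return min(1.0, nodes_on_link / 2**random_bits)


def all_tries_collide_probability(
    nodes_on_link: int, tries: int = 5, random_bits: int = 63
) -> float:
    """Chance that every one of `tries` independent random identifiers collides."""
    return single_try_collision_probability(nodes_on_link, random_bits) ** tries


if __name__ == "__main__":
    nodes = 1000  # hypothetical link population
    print("one try collides:  ", single_try_collision_probability(nodes))
    print("all 5 tries collide:", all_tries_collide_probability(nodes))
```

(Under these assumptions both numbers are vanishingly small, which is consistent with the remarks in the thread that repeated duplicates point to an attack or a fault rather than bad luck.)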