Discussion:
[ofw] IBAL CM ignores responder resources of the passive side
Leonid Keller
2011-06-22 15:46:50 UTC
Permalink
Hi guys,

Shortly:
As far as understand, IBAL CM ignores responder resources of the passive side and use the active side InitDepth as an ultimate value for the responder resources.
It's look like a bug.

Details:
Look at __save_user_rep(), which stores REP data into the CEP.
It trims p_cep->resp_res in case it exceeds the HW capabilities.
Then CM uses this field to set REP MAD resp_res field.
For some reason __save_user_rep() doesn't look at p_cm_rep->resp_res field, which claims the amount of responder resources the passive side is ready to allocate.

static void
__save_user_rep(
IN cep_agent_t* const p_port_cep,
IN kcep_t* const p_cep,
IN const iba_cm_rep* const p_cm_rep,
IN uint8_t rnr_nak_timeout )
{
AL_ENTER( AL_DBG_CM );

p_cep->local_qpn = p_cm_rep->qpn;
p_cep->rq_psn = p_cm_rep->starting_psn;
p_cep->init_depth = p_cm_rep->init_depth;

ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
/* Check the CA's responder resource max and trim if necessary. */
if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr->max_qp_resp_res <
p_cep->resp_res )
{
/*
* The CA cannot handle the requested responder resources.
* Set the response to the CA's maximum.
*/
p_cep->resp_res =
p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr->max_qp_resp_res;
}

ci_ca_unlock_attr( p_port_cep->h_ca->obj.p_ci_ca );

p_cep->rnr_nak_timeout = rnr_nak_timeout;

AL_EXIT( AL_DBG_CM );
}

I suggest to fix it the following way. What do you think ?

static void
__save_user_rep(
IN cep_agent_t* const p_port_cep,
IN kcep_t* const p_cep,
IN const iba_cm_rep* const p_cm_rep,
IN uint8_t rnr_nak_timeout )
{
AL_ENTER( AL_DBG_CM );

p_cep->local_qpn = p_cm_rep->qpn;
p_cep->rq_psn = p_cm_rep->starting_psn;
p_cep->init_depth = p_cm_rep->init_depth;

ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
/* Check the CA's responder resource max and trim if necessary. */
if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr->max_qp_resp_res <
p_cep->resp_res )
{
/*
* The CA cannot handle the requested responder resources.
* Set the response to the CA's maximum.
*/
p_cep->resp_res =
p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr->max_qp_resp_res;
}

/* set responder resources, claimed by Accept, if they are less then Connect ones */
if ( p_cm_rep->resp_res && (p_cep->resp_res > p_cm_rep->resp_res) )
p_cep->resp_res = p_cm_rep->resp_res;

ci_ca_unlock_attr( p_port_cep->h_ca->obj.p_ci_ca );

p_cep->rnr_nak_timeout = rnr_nak_timeout;

AL_EXIT( AL_DBG_CM );
}
Hefty, Sean
2011-06-22 16:04:18 UTC
Permalink
without looking at more of the code, it looks like a bug
Post by Leonid Keller
__save_user_rep(
IN
cep_agent_t* const p_port_cep,
IN
kcep_t* const p_cep,
IN const iba_cm_rep* const
p_cm_rep,
IN
uint8_t
rnr_nak_timeout )
{
AL_ENTER( AL_DBG_CM );
p_cep->local_qpn = p_cm_rep->qpn;
p_cep->rq_psn = p_cm_rep->starting_psn;
p_cep->init_depth = p_cm_rep->init_depth;
I think we can just always do the assignment:

p_cep->resp_res = p_cm_rep->resp_res;

here. Is the p_cep->resp_res the value that actually goes into the REP?
Post by Leonid Keller
ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
/* Check the CA's responder resource max and trim if necessary. */
if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr-
Post by Leonid Keller
max_qp_resp_res <
whine: The max_qp_resp_res isn't going to dynamically change. We should use a saved value somewhere that doesn't require serializing all CM users for this check.

- Sean
Leonid Keller
2011-06-22 17:02:43 UTC
Permalink
I'm afraid that just

p_cep->resp_res = p_cm_rep->resp_res;

can break some existing applications, which do not fill p_cm_rep->resp_res and still work.
Maybe

if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;
-----Original Message-----
Sent: Wednesday, June 22, 2011 7:04 PM
To: Leonid Keller; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
without looking at more of the code, it looks like a bug
Post by Leonid Keller
__save_user_rep(
IN
cep_agent_t* const p_port_cep,
IN
kcep_t* const
p_cep,
Post by Leonid Keller
IN const iba_cm_rep*
const
Post by Leonid Keller
p_cm_rep,
IN
uint8_t
rnr_nak_timeout )
{
AL_ENTER( AL_DBG_CM );
p_cep->local_qpn = p_cm_rep->qpn;
p_cep->rq_psn = p_cm_rep->starting_psn;
p_cep->init_depth = p_cm_rep->init_depth;
p_cep->resp_res = p_cm_rep->resp_res;
here. Is the p_cep->resp_res the value that actually goes into the REP?
Post by Leonid Keller
ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
/* Check the CA's responder resource max and trim if necessary. */
if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr-
Post by Leonid Keller
max_qp_resp_res <
whine: The max_qp_resp_res isn't going to dynamically change. We
should use a saved value somewhere that doesn't require serializing all
CM users for this check.
- Sean
Hefty, Sean
2011-06-22 17:42:33 UTC
Permalink
Post by Leonid Keller
I'm afraid that just
p_cep->resp_res = p_cm_rep->resp_res;
can break some existing applications, which do not fill p_cm_rep->resp_res and still work.
So, there's no way for an app to accept the connection, but reject RDMA? Oh well...
Post by Leonid Keller
Maybe
if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;
This sounds good to me then.
Fab Tillier
2011-06-22 23:52:55 UTC
Permalink
I'd just let existing broken apps get fixed and fix it in the CM properly. That is, do the straight assignment, not the conditional assignment.

Cheers,
-Fab

-----Original Message-----
From: Leonid Keller [mailto:***@mellanox.co.il]
Sent: Wednesday, June 22, 2011 10:03 AM
To: Hefty, Sean; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side

I'm afraid that just

p_cep->resp_res = p_cm_rep->resp_res;

can break some existing applications, which do not fill p_cm_rep->resp_res and still work.
Maybe

if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;
-----Original Message-----
Sent: Wednesday, June 22, 2011 7:04 PM
To: Leonid Keller; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
without looking at more of the code, it looks like a bug
Post by Leonid Keller
__save_user_rep(
IN
cep_agent_t* const p_port_cep,
IN
kcep_t* const
p_cep,
Post by Leonid Keller
IN const iba_cm_rep*
const
Post by Leonid Keller
p_cm_rep,
IN
uint8_t
rnr_nak_timeout )
{
AL_ENTER( AL_DBG_CM );
p_cep->local_qpn = p_cm_rep->qpn;
p_cep->rq_psn = p_cm_rep->starting_psn;
p_cep->init_depth = p_cm_rep->init_depth;
p_cep->resp_res = p_cm_rep->resp_res;
here. Is the p_cep->resp_res the value that actually goes into the REP?
Post by Leonid Keller
ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
/* Check the CA's responder resource max and trim if necessary. */
if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr-
Post by Leonid Keller
max_qp_resp_res <
whine: The max_qp_resp_res isn't going to dynamically change. We
should use a saved value somewhere that doesn't require serializing all
CM users for this check.
- Sean
Leonid Keller
2011-06-23 12:07:40 UTC
Permalink
I'd say, it's not "broken" applications, but applications, that do not care what Responder Resources should be.
And if an application in a third company works with previous drivers on 1000 computers and now they get new drivers and install them on 1000 computers and suddenly the application fails ...

I'm going to add in our code the following lines:

ASSERT(p_cm_rep->resp_res);
if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;

You may think of leaving in the OFED code only the last line.
-----Original Message-----
Sent: Thursday, June 23, 2011 2:53 AM
To: Leonid Keller; Hefty, Sean
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'd just let existing broken apps get fixed and fix it in the CM
properly. That is, do the straight assignment, not the conditional
assignment.
Cheers,
-Fab
-----Original Message-----
Sent: Wednesday, June 22, 2011 10:03 AM
To: Hefty, Sean; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'm afraid that just
p_cep->resp_res = p_cm_rep->resp_res;
can break some existing applications, which do not fill p_cm_rep-
Post by Leonid Keller
resp_res and still work.
Maybe
if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;
Post by Leonid Keller
-----Original Message-----
Sent: Wednesday, June 22, 2011 7:04 PM
To: Leonid Keller; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
without looking at more of the code, it looks like a bug
Post by Leonid Keller
__save_user_rep(
IN
cep_agent_t* const
p_port_cep,
Post by Leonid Keller
Post by Leonid Keller
IN
kcep_t* const
p_cep,
Post by Leonid Keller
IN const iba_cm_rep*
const
Post by Leonid Keller
p_cm_rep,
IN
uint8_t
rnr_nak_timeout )
{
AL_ENTER( AL_DBG_CM );
p_cep->local_qpn = p_cm_rep->qpn;
p_cep->rq_psn = p_cm_rep->starting_psn;
p_cep->init_depth = p_cm_rep->init_depth;
p_cep->resp_res = p_cm_rep->resp_res;
here. Is the p_cep->resp_res the value that actually goes into the REP?
Post by Leonid Keller
ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
/* Check the CA's responder resource max and trim
if
Post by Leonid Keller
Post by Leonid Keller
necessary. */
if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr-
Post by Leonid Keller
max_qp_resp_res <
whine: The max_qp_resp_res isn't going to dynamically change. We
should use a saved value somewhere that doesn't require serializing
all
Post by Leonid Keller
CM users for this check.
- Sean
Leonid Keller
2011-06-27 07:54:40 UTC
Permalink
We did the below patch and got instantly the assert.
Which brought me to understanding, that assert should be removed, but 'if' should be kept.
Why?
Because the active side do not look at ReponderResources and InitDepth of the passive side (sent in REP).
It allows CM to set all the parameters.
The old code gave to the passive side a possibility not to negotiate parameters, but to leave it to CM.
And I think, that we should keep this possibility.
-----Original Message-----
From: Leonid Keller
Sent: Thursday, June 23, 2011 3:08 PM
To: 'Fab Tillier'; Hefty, Sean
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'd say, it's not "broken" applications, but applications, that do not care
what Responder Resources should be.
And if an application in a third company works with previous drivers on 1000
computers and now they get new drivers and install them on 1000 computers and
suddenly the application fails ...
ASSERT(p_cm_rep->resp_res);
if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;
You may think of leaving in the OFED code only the last line.
-----Original Message-----
Sent: Thursday, June 23, 2011 2:53 AM
To: Leonid Keller; Hefty, Sean
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'd just let existing broken apps get fixed and fix it in the CM
properly. That is, do the straight assignment, not the conditional
assignment.
Cheers,
-Fab
-----Original Message-----
Sent: Wednesday, June 22, 2011 10:03 AM
To: Hefty, Sean; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'm afraid that just
p_cep->resp_res = p_cm_rep->resp_res;
can break some existing applications, which do not fill p_cm_rep-
Post by Leonid Keller
resp_res and still work.
Maybe
if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;
Post by Leonid Keller
-----Original Message-----
Sent: Wednesday, June 22, 2011 7:04 PM
To: Leonid Keller; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
without looking at more of the code, it looks like a bug
Post by Leonid Keller
__save_user_rep(
IN
cep_agent_t* const
p_port_cep,
Post by Leonid Keller
Post by Leonid Keller
IN
kcep_t* const
p_cep,
Post by Leonid Keller
IN const iba_cm_rep*
const
Post by Leonid Keller
p_cm_rep,
IN
uint8_t
rnr_nak_timeout )
{
AL_ENTER( AL_DBG_CM );
p_cep->local_qpn = p_cm_rep->qpn;
p_cep->rq_psn = p_cm_rep->starting_psn;
p_cep->init_depth = p_cm_rep->init_depth;
p_cep->resp_res = p_cm_rep->resp_res;
here. Is the p_cep->resp_res the value that actually goes into the REP?
Post by Leonid Keller
ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
/* Check the CA's responder resource max and trim
if
Post by Leonid Keller
Post by Leonid Keller
necessary. */
if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr-
Post by Leonid Keller
max_qp_resp_res <
whine: The max_qp_resp_res isn't going to dynamically change. We
should use a saved value somewhere that doesn't require serializing
all
Post by Leonid Keller
CM users for this check.
- Sean
Fab Tillier
2011-06-27 16:06:42 UTC
Permalink
Leonid Keller wrote on Mon, 27 Jun 2011 at 00:54:40
Post by Leonid Keller
We did the below patch and got instantly the assert.
Running which ULP in WinOFED? Looking at the code, the ND provider does the right thing. WSD seems to not set resp_res and I'd say that was a bug. Since the WSD provider gets updated with any driver update, there's little chance of users running into this if both get fixed concurrently.
Post by Leonid Keller
Which brought me to understanding, that assert should be removed, but 'if' should be kept.
I think it's important to allow the CEP Manager to support 0 as a valid value, indicating that no RDMA reads are supported. I would keep the assert and remove the if. If you had a different value to indicate "auto-negotiate" you could have the proxy trap for zero and convert before passing down to the CEP manager.
Post by Leonid Keller
Why?
Because the active side do not look at ReponderResources and InitDepth of
the passive side (sent in REP).
It allows CM to set all the parameters.
The old code gave to the passive side a possibility not to negotiate
parameters, but to leave it to CM.
And I think, that we should keep this possibility.
Both the active and passive sides should support auto-negotiating, but it shouldn't be done using the value 0.

I think clients should be able to specify whatever values they desire, and the CM should then cap to the abilities of the hardware/negotiate on behalf of the client. The actual values can always be examined during connection establishment (when retrieving the QP attributes for QP transitions) or queried from the QP.

If an application knows that it could have up to 8 RDMA reads in flight at a time, it could always use 8 as the value. If the hardware doesn't support 8, it would cap it to whatever it does support. The client would only get an error if the underlying hardware had a limit of zero (unable to do RDMA reads), though I'm not sure that's even allowed by the IB standard.

-Fab
Post by Leonid Keller
-----Original Message-----
From: Leonid Keller
Sent: Thursday, June 23, 2011 3:08 PM
To: 'Fab Tillier'; Hefty, Sean
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'd say, it's not "broken" applications, but applications, that do not
care what Responder Resources should be. And if an application in a
third company works with previous drivers on 1000 computers and now
they get new drivers and install them on 1000 computers and suddenly
the application fails ...
ASSERT(p_cm_rep->resp_res);
if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;
You may think of leaving in the OFED code only the last line.
-----Original Message-----
Sent: Thursday, June 23, 2011 2:53 AM
To: Leonid Keller; Hefty, Sean
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'd just let existing broken apps get fixed and fix it in the CM
properly. That is, do the straight assignment, not the conditional
assignment.
Cheers,
-Fab
-----Original Message-----
Sent: Wednesday, June 22, 2011 10:03 AM
To: Hefty, Sean; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'm afraid that just
p_cep->resp_res = p_cm_rep->resp_res;
can break some existing applications, which do not fill p_cm_rep-
Post by Leonid Keller
resp_res and still work.
Maybe
if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;
Post by Leonid Keller
-----Original Message-----
Sent: Wednesday, June 22, 2011 7:04 PM
To: Leonid Keller; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
without looking at more of the code, it looks like a bug
Post by Leonid Keller
__save_user_rep(
IN
cep_agent_t* const p_port_cep,
IN
kcep_t* const p_cep,
IN const iba_cm_rep*
const
Post by Leonid Keller
p_cm_rep,
IN
uint8_t
rnr_nak_timeout )
{
AL_ENTER( AL_DBG_CM );
p_cep->local_qpn = p_cm_rep->qpn;
p_cep->rq_psn = p_cm_rep->starting_psn;
p_cep->init_depth = p_cm_rep->init_depth;
p_cep->resp_res = p_cm_rep->resp_res;
here. Is the p_cep->resp_res the value that actually goes into the REP?
Post by Leonid Keller
ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
/* Check the CA's responder resource max and trim
if
Post by Leonid Keller
Post by Leonid Keller
necessary. */
if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr-
Post by Leonid Keller
max_qp_resp_res <
whine: The max_qp_resp_res isn't going to dynamically change. We
should use a saved value somewhere that doesn't require serializing
all CM users for this check.
- Sean
Leonid Keller
2011-06-27 17:24:54 UTC
Permalink
I think there are several application, that should be fixed and tested with your propose.
See, for example, __ndi_fill_cm_rep()

BTW, could someone remind me why do we have ib_cm_rep (which doesn't contain resp_res at all) and iba_cm_rep ?
Is the first one regarded as outdated ?
Do we have two CM APIs ?
What's the diff ?
-----Original Message-----
Sent: Monday, June 27, 2011 7:07 PM
To: Leonid Keller; Hefty, Sean
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
Leonid Keller wrote on Mon, 27 Jun 2011 at 00:54:40
Post by Leonid Keller
We did the below patch and got instantly the assert.
Running which ULP in WinOFED? Looking at the code, the ND provider does the
right thing. WSD seems to not set resp_res and I'd say that was a bug. Since
the WSD provider gets updated with any driver update, there's little chance of
users running into this if both get fixed concurrently.
Post by Leonid Keller
Which brought me to understanding, that assert should be removed, but 'if'
should be kept.
I think it's important to allow the CEP Manager to support 0 as a valid value,
indicating that no RDMA reads are supported. I would keep the assert and
remove the if. If you had a different value to indicate "auto-negotiate" you
could have the proxy trap for zero and convert before passing down to the CEP
manager.
Post by Leonid Keller
Why?
Because the active side do not look at ReponderResources and InitDepth of
the passive side (sent in REP).
It allows CM to set all the parameters.
The old code gave to the passive side a possibility not to negotiate
parameters, but to leave it to CM.
And I think, that we should keep this possibility.
Both the active and passive sides should support auto-negotiating, but it
shouldn't be done using the value 0.
I think clients should be able to specify whatever values they desire, and the
CM should then cap to the abilities of the hardware/negotiate on behalf of the
client. The actual values can always be examined during connection
establishment (when retrieving the QP attributes for QP transitions) or
queried from the QP.
If an application knows that it could have up to 8 RDMA reads in flight at a
time, it could always use 8 as the value. If the hardware doesn't support 8,
it would cap it to whatever it does support. The client would only get an
error if the underlying hardware had a limit of zero (unable to do RDMA
reads), though I'm not sure that's even allowed by the IB standard.
-Fab
Post by Leonid Keller
-----Original Message-----
From: Leonid Keller
Sent: Thursday, June 23, 2011 3:08 PM
To: 'Fab Tillier'; Hefty, Sean
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'd say, it's not "broken" applications, but applications, that do not
care what Responder Resources should be. And if an application in a
third company works with previous drivers on 1000 computers and now
they get new drivers and install them on 1000 computers and suddenly
the application fails ...
ASSERT(p_cm_rep->resp_res);
if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;
You may think of leaving in the OFED code only the last line.
-----Original Message-----
Sent: Thursday, June 23, 2011 2:53 AM
To: Leonid Keller; Hefty, Sean
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'd just let existing broken apps get fixed and fix it in the CM
properly. That is, do the straight assignment, not the conditional
assignment.
Cheers,
-Fab
-----Original Message-----
Sent: Wednesday, June 22, 2011 10:03 AM
To: Hefty, Sean; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'm afraid that just
p_cep->resp_res = p_cm_rep->resp_res;
can break some existing applications, which do not fill p_cm_rep-
Post by Leonid Keller
resp_res and still work.
Maybe
if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;
Post by Leonid Keller
-----Original Message-----
Sent: Wednesday, June 22, 2011 7:04 PM
To: Leonid Keller; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
without looking at more of the code, it looks like a bug
Post by Leonid Keller
__save_user_rep(
IN
cep_agent_t* const p_port_cep,
IN
kcep_t* const p_cep,
IN const iba_cm_rep*
const
Post by Leonid Keller
p_cm_rep,
IN
uint8_t
rnr_nak_timeout )
{
AL_ENTER( AL_DBG_CM );
p_cep->local_qpn = p_cm_rep->qpn;
p_cep->rq_psn = p_cm_rep->starting_psn;
p_cep->init_depth = p_cm_rep->init_depth;
p_cep->resp_res = p_cm_rep->resp_res;
here. Is the p_cep->resp_res the value that actually goes into the REP?
Post by Leonid Keller
ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
/* Check the CA's responder resource max and trim
if
Post by Leonid Keller
Post by Leonid Keller
necessary. */
if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr-
Post by Leonid Keller
max_qp_resp_res <
whine: The max_qp_resp_res isn't going to dynamically change. We
should use a saved value somewhere that doesn't require serializing
all CM users for this check.
- Sean
Hefty, Sean
2011-06-27 17:40:28 UTC
Permalink
Post by Leonid Keller
I think there are several application, that should be fixed and tested with your propose.
See, for example, __ndi_fill_cm_rep()
BTW, could someone remind me why do we have ib_cm_rep (which doesn't
contain resp_res at all) and iba_cm_rep ?
Is the first one regarded as outdated ?
Do we have two CM APIs ?
What's the diff ?
The iba_cm_* structures are defined in ib_cm_ifc.h. This is a low level interface for the IB CM. It is used by winverbs.

The ib_cm_* structures are used as part of the ibal integrated interface, and requires access to other ibal data handles (such as the ibal qp handle). This basically sits over the iba_cm_* functionality.

- Sean
Fab Tillier
2011-06-27 17:44:11 UTC
Permalink
Leonid Keller wrote on Mon, 27 Jun 2011 at 10:24:54
Post by Leonid Keller
I think there are several application, that should be fixed and tested with your propose.
Yes.
Post by Leonid Keller
See, for example, __ndi_fill_cm_rep()
That's a bug, the IOCTL contains the resp_res value, but it is being ignored by the IOCTL handler. Just like WSD, though, the ND provider gets updated along with the drivers so fixing it should be no problem.

-Fab
Post by Leonid Keller
BTW, could someone remind me why do we have ib_cm_rep (which doesn't
contain resp_res at all) and iba_cm_rep ?
Is the first one regarded as outdated ?
Do we have two CM APIs ?
What's the diff ?
-----Original Message-----
Sent: Monday, June 27, 2011 7:07 PM
To: Leonid Keller; Hefty, Sean
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
Leonid Keller wrote on Mon, 27 Jun 2011 at 00:54:40
Post by Leonid Keller
We did the below patch and got instantly the assert.
Running which ULP in WinOFED? Looking at the code, the ND provider
does the right thing. WSD seems to not set resp_res and I'd say that
was a bug. Since the WSD provider gets updated with any driver update,
there's little chance of users running into this if both get fixed
concurrently.
Post by Leonid Keller
Which brought me to understanding, that assert should be removed, but
'if' should be kept.
I think it's important to allow the CEP Manager to support 0 as a valid
value, indicating that no RDMA reads are supported. I would keep the
assert and remove the if. If you had a different value to indicate
"auto-negotiate" you could have the proxy trap for zero and convert
before passing down to the CEP manager.
Post by Leonid Keller
Why? Because the active side do not look at ReponderResources and
InitDepth of the passive side (sent in REP). It allows CM to set all
the parameters. The old code gave to the passive side a possibility
not to negotiate parameters, but to leave it to CM. And I think, that
we should keep this possibility.
Both the active and passive sides should support auto-negotiating, but it
shouldn't be done using the value 0.
I think clients should be able to specify whatever values they desire,
and the CM should then cap to the abilities of the hardware/negotiate
on behalf of the client. The actual values can always be examined
during connection establishment (when retrieving the QP attributes for
QP transitions) or queried from the QP.
If an application knows that it could have up to 8 RDMA reads in flight at a
time, it could always use 8 as the value. If the hardware doesn't support 8,
it would cap it to whatever it does support. The client would only get an
error if the underlying hardware had a limit of zero (unable to do RDMA
reads), though I'm not sure that's even allowed by the IB standard.
-Fab
Post by Leonid Keller
-----Original Message-----
From: Leonid Keller
Sent: Thursday, June 23, 2011 3:08 PM
To: 'Fab Tillier'; Hefty, Sean
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'd say, it's not "broken" applications, but applications, that do not
care what Responder Resources should be. And if an application in a
third company works with previous drivers on 1000 computers and now
they get new drivers and install them on 1000 computers and suddenly
the application fails ...
ASSERT(p_cm_rep->resp_res);
if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;
You may think of leaving in the OFED code only the last line.
-----Original Message-----
Sent: Thursday, June 23, 2011 2:53 AM
To: Leonid Keller; Hefty, Sean
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'd just let existing broken apps get fixed and fix it in the CM
properly. That is, do the straight assignment, not the conditional
assignment.
Cheers,
-Fab
-----Original Message-----
Sent: Wednesday, June 22, 2011 10:03 AM
To: Hefty, Sean; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
I'm afraid that just
p_cep->resp_res = p_cm_rep->resp_res;
can break some existing applications, which do not fill p_cm_rep-
Post by Leonid Keller
resp_res and still work.
Maybe
if (p_cm_rep->resp_res)
p_cep->resp_res = p_cm_rep->resp_res;
Post by Leonid Keller
-----Original Message-----
Sent: Wednesday, June 22, 2011 7:04 PM
To: Leonid Keller; Fab Tillier
Cc: ofw_list
Subject: RE: IBAL CM ignores responder resources of the passive side
without looking at more of the code, it looks like a bug
Post by Leonid Keller
__save_user_rep(
IN
cep_agent_t* const p_port_cep,
IN
kcep_t* const p_cep,
IN const iba_cm_rep*
const
Post by Leonid Keller
p_cm_rep,
IN
uint8_t
rnr_nak_timeout )
{
AL_ENTER( AL_DBG_CM );
p_cep->local_qpn = p_cm_rep->qpn;
p_cep->rq_psn = p_cm_rep->starting_psn;
p_cep->init_depth = p_cm_rep->init_depth;
p_cep->resp_res = p_cm_rep->resp_res;
here. Is the p_cep->resp_res the value that actually goes into the REP?
Post by Leonid Keller
ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
/* Check the CA's responder resource max and trim
if
Post by Leonid Keller
Post by Leonid Keller
necessary. */
if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr-
Post by Leonid Keller
max_qp_resp_res <
whine: The max_qp_resp_res isn't going to dynamically change. We
should use a saved value somewhere that doesn't require serializing
all CM users for this check.
- Sean
Vijay Jayapala
2014-05-10 15:54:29 UTC
Permalink
Post by Leonid Keller
Hi guys,
 
As far as understand, IBAL CM ignores responder resources of the passive
side and use the active side InitDepth as an ultimate value for the
responder resources.
Post by Leonid Keller
It’s look like a bug.
 
Look at __save_user_rep(), which stores REP data into the CEP.
It trims p_cep->resp_res in case it exceeds the HW capabilities.
Then CM uses this field to set REP MAD resp_res field.
For some reason __save_user_rep() doesn’t look at p_cm_rep->resp_res
field, which claims the amount of responder resources the passive side is
ready to allocate.
Post by Leonid Keller
 
static void
__save_user_rep(
               
IN                                                           cep_agent_t*
const                                       p_port_cep,
Post by Leonid Keller
               
IN                                                           kcep_t*
const                                                   p_cep,
Post by Leonid Keller
                IN                           const     iba_cm_rep*
const                                         p_cm_rep,
Post by Leonid Keller
               
IN                                                          
uint8_t                                                                     
            rnr_nak_timeout )
Post by Leonid Keller
{
                AL_ENTER( AL_DBG_CM );
 
                p_cep->local_qpn = p_cm_rep->qpn;
                p_cep->rq_psn = p_cm_rep->starting_psn;
                p_cep->init_depth = p_cm_rep->init_depth;
 
                ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
                /* Check the CA's responder resource max and trim if
necessary. */
Post by Leonid Keller
                if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr-
max_qp_resp_res <
                                p_cep->resp_res )
                {
                                /*
                                 * The CA cannot handle the requested
responder resources.
Post by Leonid Keller
                                 * Set the response to the CA's maximum.
                                 */
                                p_cep->resp_res =
                                                p_port_cep->h_ca-
obj.p_ci_ca->p_pnp_attr->max_qp_resp_res;
                }
 
                ci_ca_unlock_attr( p_port_cep->h_ca->obj.p_ci_ca );
 
                p_cep->rnr_nak_timeout = rnr_nak_timeout;
 
                AL_EXIT( AL_DBG_CM );
}
 
I suggest to fix it the following way. What do you think ?
 
static void
__save_user_rep(
               
IN                                                           cep_agent_t*
const                                       p_port_cep,
Post by Leonid Keller
               
IN                                                           kcep_t*
const                                                   p_cep,
Post by Leonid Keller
                IN                           const     iba_cm_rep*
const                                         p_cm_rep,
Post by Leonid Keller
               
IN                                                          
uint8_t                                                                     
            rnr_nak_timeout )
Post by Leonid Keller
{
                AL_ENTER( AL_DBG_CM );
 
                p_cep->local_qpn = p_cm_rep->qpn;
                p_cep->rq_psn = p_cm_rep->starting_psn;
                p_cep->init_depth = p_cm_rep->init_depth;
 
                ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca );
                /* Check the CA's responder resource max and trim if
necessary. */
Post by Leonid Keller
                if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr-
max_qp_resp_res <
                                p_cep->resp_res )
                {
                                /*
                                 * The CA cannot handle the requested
responder resources.
Post by Leonid Keller
                                 * Set the response to the CA's maximum.
                                 */
                                p_cep->resp_res =
                                                p_port_cep->h_ca-
obj.p_ci_ca->p_pnp_attr->max_qp_resp_res;
                }
 
                /* set responder resources, claimed by Accept, if they are
less then Connect ones */
Post by Leonid Keller
                if ( p_cm_rep->resp_res && (p_cep->resp_res > p_cm_rep-
resp_res) )
                                p_cep->resp_res = p_cm_rep->resp_res;
               
                ci_ca_unlock_attr( p_port_cep->h_ca->obj.p_ci_ca );
 
                p_cep->rnr_nak_timeout = rnr_nak_timeout;
 
                AL_EXIT( AL_DBG_CM );
}
_______________________________________________
ofw mailing list
ofw <at> lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
Hi Leonid,

I have been using Mlnx_Winof_4.2.1165 drivers/SDK at my work place. Iam
seeing this issue, i.e when i set the responder resource during the cm_req,
it is not reflected in the cm_reply. So, when I call cm_rtu, I always get a
response as invalid parameter.

Has this fix been included in the 4.2 drivers?
Is there something Im missing in the ULP because of which im hitting this
issue?

Would really appreciate any kind of help on this.

Thanks in advance.
Best Regards
Vijay

Loading...