Commit 22e3817

Daniel Jurgens authored and davem330 committed
net/mlx4_core: Do not BUG_ON during reset when PCI is offline
The PCI channel could go offline during reset due to EEH. Don't bug on in this case, the error is recoverable.

Fixes: f6bc11e ('net/mlx4_core: Enhance the catas flow to support device reset')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1 parent 6b94bab commit 22e3817

1 file changed, with 9 additions and 2 deletions
drivers/net/ethernet/mellanox/mlx4/catas.c

Lines changed: 9 additions & 2 deletions
@@ -182,10 +182,17 @@ void mlx4_enter_error_state(struct mlx4_dev_persistent *persist)
 		err = mlx4_reset_slave(dev);
 	else
 		err = mlx4_reset_master(dev);
-	BUG_ON(err != 0);
 
+	if (!err) {
+		mlx4_err(dev, "device was reset successfully\n");
+	} else {
+		/* EEH could have disabled the PCI channel during reset. That's
+		 * recoverable and the PCI error flow will handle it.
+		 */
+		if (!pci_channel_offline(dev->persist->pdev))
+			BUG_ON(1);
+	}
 	dev->persist->state |= MLX4_DEVICE_STATE_INTERNAL_ERROR;
-	mlx4_err(dev, "device was reset successfully\n");
 	mutex_unlock(&persist->device_state_mutex);
 
 	/* At that step HW was already reset, now notify clients */
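
The decision this hunk introduces can be read as a three-way classification of the reset result. The following is a minimal user-space sketch of that logic, not the driver code: `enum reset_outcome` and `classify_reset` are hypothetical names invented for illustration, and the `channel_offline` parameter stands in for the result of `pci_channel_offline(dev->persist->pdev)`.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; none of these exist in mlx4. */
enum reset_outcome {
	RESET_OK,	/* reset returned 0 */
	RESET_DEFERRED,	/* reset failed, but the PCI channel is offline, so
			 * the PCI error flow is expected to recover */
	RESET_FATAL	/* reset failed while the channel was still online */
};

/*
 * err:             return value of the reset routine (0 on success)
 * channel_offline: stand-in for pci_channel_offline(pdev)
 */
enum reset_outcome classify_reset(int err, bool channel_offline)
{
	if (!err)
		return RESET_OK;
	if (channel_offline)
		return RESET_DEFERRED;
	return RESET_FATAL;	/* the patched driver hits BUG_ON(1) here */
}
```

In the patch itself, both non-fatal outcomes fall through to the same code: `MLX4_DEVICE_STATE_INTERNAL_ERROR` is set and the mutex is released in either case, and only the success path logs "device was reset successfully".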
