CentOS Client - Unable to Establish iSCSI connection with multiple interfaces on the initiator

Posted by slashdot on Server Fault
Published on 2012-01-05T01:54:09Z Indexed on 2012/06/06 16:42 UTC

After upgrading to CentOS 6.2, I can no longer log in to my iSCSI targets. The system has multiple interfaces on different subnets, and I first suspected that the initiator was not binding to the correct interfaces. The netstat output supports this, as the source address is clearly wrong:

[root]? netstat -na|grep .90
tcp        0      1 10.10.100.60:42354          10.10.8.90:3260             SYN_SENT    
tcp        0      1 10.10.100.60:40777          10.10.9.90:3260             SYN_SENT 
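
A quick way to spot this kind of mismatch is to compare the local and remote subnets of every connection to port 3260. The following is a minimal sketch, not part of the original troubleshooting: the function name `check_iscsi_sources` is hypothetical, and it assumes the storage networks are /24 subnets, as in the routing table in this post.

```shell
# Flag iSCSI connections (remote port 3260) whose local address is not in
# the same /24 as the portal. Reads `netstat -na` style lines on stdin.
# Assumption: storage subnets are /24, so comparing the first three octets
# of the local and remote address is sufficient.
check_iscsi_sources() {
    awk '
        $1 == "tcp" && $5 ~ /:3260$/ {
            split($4, l, ":"); split($5, r, ":")
            split(l[1], lo, "."); split(r[1], ro, ".")
            lnet = lo[1] "." lo[2] "." lo[3]
            rnet = ro[1] "." ro[2] "." ro[3]
            status = (lnet == rnet) ? "OK" : "MISMATCH"
            print status, l[1], "->", r[1], $6
        }'
}
```

It can be used as `netstat -na | check_iscsi_sources`. Separately, `ip route get 10.10.8.90` prints the route and source address the kernel itself would select for that portal, which helps tell a routing problem apart from an initiator binding problem.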

I then disabled all but one interface. As a result netstat looks correct, but the login issue remains. I am positive that the target never sees a packet, because I see nothing but SYN_SENT. I know the problem is on my client, because the target is serving multiple other systems, none of which run CentOS 6.2. At this point I am fairly confident that something changed between CentOS 6.0/6.1 and 6.2. If anyone has any thoughts or has run into this, I would very much like to hear them.

[root]? iscsiadm --mode node --targetname iqn.2011-12.dom.homer:01:lab-centos-servers-00001 --portal 10.10.8.90:3260,2 --interface=sw-iscsi-0 --login
Logging in to [iface: sw-iscsi-0, target: iqn.2011-12.dom.homer:01:lab-centos-servers-00001, portal: 10.10.8.90,3260] (multiple)
iscsiadm: Could not login to [iface: sw-iscsi-0, target: iqn.2011-12.dom.homer:01:lab-centos-servers-00001, portal: 10.10.8.90,3260].
iscsiadm: initiator reported error (8 - connection timed out)
iscsiadm: Could not log into all portals


[root]? netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.10.8.0       0.0.0.0         255.255.255.0   U         0 0          0 eth2.7
10.10.9.0       0.0.0.0         255.255.255.0   U         0 0          0 eth3.7
10.10.100.0     0.0.0.0         255.255.252.0   U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth1
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth2
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth3
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth2.7
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth3.7
0.0.0.0         10.10.100.1     0.0.0.0         UG        0 0          0 eth0

Output of ip addr show for the two interfaces involved:

[root]? for i in 2.7 3.7; do ip addr show eth$i; done
6: eth2.7@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 00:0c:29:94:5b:8d brd ff:ff:ff:ff:ff:ff
    inet 10.10.8.60/24 brd 10.10.8.255 scope global eth2.7
    inet6 fe80::20c:29ff:fe94:5b8d/64 scope link 
       valid_lft forever preferred_lft forever
7: eth3.7@eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 00:0c:29:94:5b:97 brd ff:ff:ff:ff:ff:ff
    inet 10.10.9.60/24 brd 10.10.9.255 scope global eth3.7
    inet6 fe80::20c:29ff:fe94:5b97/64 scope link 
       valid_lft forever preferred_lft forever

Update 01/06/2012:

This issue seems to get more interesting by the day. I went back a few weeks and grabbed a snapshot of this system taken before the upgrade to 6.2. I spun up a new system from the snapshot and reconfigured the interface info and host keys, as well as the iSCSI initiator and iSCSI interface info to match the new MACs. I changed nothing else.

I then attempted to connect to my targets and had no issues at all, which was not unexpected. Next I compared sysctl settings between the two systems. There were differences after the upgrade, but nothing seemingly relevant to iSCSI or IP that could contribute to this. I also noticed that after the upgrade two sessions per connection were enabled by default, so I changed it back to 1 session in /etc/iscsi/iscsid.conf.
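
For anyone repeating the sysctl comparison, something along these lines should work. This is a minimal sketch, assuming each host's settings were saved beforehand with `sysctl -a > file`; the function name `sysctl_diff` and the file names in the usage note are hypothetical.

```shell
# Show sysctl settings that differ between two saved dumps.
# Each dump is assumed to come from `sysctl -a > file` on the respective
# host. The dumps are sorted first so key ordering does not create noise.
sysctl_diff() {
    a=$(mktemp)
    b=$(mktemp)
    sort "$1" > "$a"
    sort "$2" > "$b"
    # Lines starting with "<" exist only in the first dump,
    # lines starting with ">" only in the second.
    diff "$a" "$b"
    rm -f "$a" "$b"
}
```

For example, `sysctl_diff centos61.sysctl centos62.sysctl` prints only the settings that changed between the two dumps.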

On the problematic system we can see that the source interface is seemingly wrong, but even when I disable the 10.10.100 interface, the problem persists. So while this may be relevant, I could not confirm it. Further research is necessary. Something is clearly different between releases: the working system is on 6.1, and the non-working one is on 6.2.

::Working System::
tcp        0      0 10.10.8.210:39566           10.10.8.90:3260             ESTABLISHED 
tcp        0      0 10.10.9.210:46518           10.10.9.90:3260             ESTABLISHED 

[root]? ip route show
10.10.8.0/24 dev eth2.6  proto kernel  scope link  src 10.10.8.210 
10.10.9.0/24 dev eth3.7  proto kernel  scope link  src 10.10.9.210 
10.10.100.0/22 dev eth0  proto kernel  scope link  src 10.10.100.210 
169.254.0.0/16 dev eth0  scope link  metric 1002 
169.254.0.0/16 dev eth2.6  scope link  metric 1006 
169.254.0.0/16 dev eth3.7  scope link  metric 1007 
default via 10.10.100.1 dev eth0

::Non-working System::
tcp        0      1 10.10.100.60:44737          10.10.9.90:3260             SYN_SENT    
tcp        0      1 10.10.100.60:55479          10.10.8.90:3260             SYN_SENT

[root]? ip route show
10.10.8.0/24 dev eth2.6  proto kernel  scope link  src 10.10.8.60 
10.10.9.0/24 dev eth3.7  proto kernel  scope link  src 10.10.9.60 
10.10.100.0/22 dev eth0  proto kernel  scope link  src 10.10.100.60 
169.254.0.0/16 dev eth0  scope link  metric 1002 
169.254.0.0/16 dev eth2.6  scope link  metric 1006 
169.254.0.0/16 dev eth3.7  scope link  metric 1007 
default via 10.10.100.1 dev eth0 

And the result is still the same:

[root]? iscsiadm: Could not login to [iface: sw-iscsi-0, target: iqn.2011-12.dom.homer:01:lab-centos-servers-00001, portal: 10.10.8.90,3260].
iscsiadm: initiator reported error (8 - connection timed out)
iscsiadm: Could not login to [iface: sw-iscsi-1, target: iqn.2011-12.dom.homer:02:lab-centos-servers-00001, portal: 10.10.9.90,3260].
iscsiadm: initiator reported error (8 - connection timed out)
iscsiadm: Could not log into all portals

Update 01/08/2012:

I believe I have figured out the answer to my issue. It is quite obscure, and I doubt it will happen to anyone else any time soon. It turns out that setting both iface.iscsi_ifacename and iface.hwaddress in the interface configuration file is not legal. When one manually adds an iSCSI target, as below, all settings from the interface config file are copied into the node config file that the command creates. The result is that iface.iscsi_ifacename and iface.hwaddress end up together in the same config file. These parameters are seemingly mutually exclusive, which does not exactly make sense, or perhaps there is an oversight in the code path. Perhaps I will investigate further.

# iscsiadm -m node --op new -T iqn.2011-12.dom.homer:01:lab-centos-servers-00001 -p 10.10.8.90,3260,2 -I sw-iscsi-0
# iscsiadm -m node --op new -T iqn.2011-12.dom.homer:02:lab-centos-servers-00001 -p 10.10.9.90,3260,2 -I sw-iscsi-1
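
Because the iface settings are copied into the node record when it is created, a record created earlier may still carry old values. A conservative sequence is to delete the record, create it again, and then log in. The sketch below only prints the commands unless RUN=1 is set. The function name `readd_target` is hypothetical, and the delete step is an assumption on my part rather than something confirmed as required.

```shell
# Print (or, with RUN=1, execute) the iscsiadm commands that delete a node
# record, recreate it, and log in through a given iface.
# Assumption: deleting first is a precaution so that no previously copied
# iface.* settings remain in the node record.
readd_target() {
    target=$1
    portal=$2
    iface=$3
    for op in "-o delete" "-o new" "--login"; do
        cmd="iscsiadm -m node $op -T $target -p $portal -I $iface"
        if [ "${RUN:-0}" = 1 ]; then
            $cmd
        else
            echo "$cmd"
        fi
    done
}
```

For example, `readd_target iqn.2011-12.dom.homer:01:lab-centos-servers-00001 10.10.8.90:3260 sw-iscsi-0` prints the three commands for review before anything is changed.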

Notice that below I commented out iface.hwaddress and iface.ipaddress, after which I re-added the targets with the same commands as above. Everything now works just fine.

[root]? cat *
# BEGIN RECORD 2.0-872.33.el6
iface.iscsi_ifacename = sw-iscsi-0
iface.net_ifacename = eth2.6
#iface.hwaddress = XX:XX:XX:XX:XX:XX 
#iface.ipaddress = 10.10.8.60
iface.transport_name = tcp
iface.vlan_id = 6
iface.vlan_priority = 0
iface.iface_num = 0
iface.mtu = 0
iface.port = 0
# END RECORD
# BEGIN RECORD 2.0-872.33.el6
iface.iscsi_ifacename = sw-iscsi-1
iface.net_ifacename = eth3.7
#iface.hwaddress = XX:XX:XX:XX:XX:XX
#iface.ipaddress = 10.10.9.60
iface.transport_name = tcp
iface.vlan_id = 7
iface.vlan_priority = 0
iface.iface_num = 0
iface.mtu = 0
iface.port = 0
# END RECORD
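
To check whether any iface record still has one of these settings active, the files can be searched for uncommented iface.hwaddress or iface.ipaddress lines. This is a minimal sketch: the function name `list_active_bindings` is hypothetical, and the default directory /var/lib/iscsi/ifaces is an assumption about where the records live, so verify it on your system.

```shell
# List uncommented iface.hwaddress / iface.ipaddress lines in iface records.
# Assumption: the records live in /var/lib/iscsi/ifaces unless another
# directory is passed as the first argument.
list_active_bindings() {
    dir="${1:-/var/lib/iscsi/ifaces}"
    # -H prints the file name, so each hit shows which record needs editing.
    grep -EH '^[[:space:]]*iface\.(hwaddress|ipaddress)[[:space:]]*=' "$dir"/* 2>/dev/null
}
```

No output means every record has those two settings absent or commented out, as in the listing above.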

Again, the chances of this happening to someone else are slim to none, so this was likely a waste of time to type up. But if someone does encounter this issue, I hope this post will help.

© Server Fault or respective owner
