
November 3, 2006

Strange OS X Ethernet Behavior Driving Me Batsh*t

I've stumbled upon a strange and frustrating problem on my MacBook Pro this week and it's driving me crazy. Through various tests in the last few days, I've narrowed the problem down to a recognizable pattern, but I've yet to determine if there is anything I can do to fix it. More than anything, though, my frustration stems from not being able to determine why this problem has suddenly cropped up.

I first ran into the problem around Wednesday morning. iChat Video has replaced the telephone for most conversations that I have with my Mom. If we're both at our computers (which is a lot of the time... at least in my case), clicking the green camera icon in the Buddy List is easier than picking up the telephone. We do this a lot and seldom, if ever, have any problems outside of the occasional blips typical of real-time data traversing the wild and woolly Internet.

Starting that morning, we couldn't keep a connection going for more than a few minutes before it would freeze and spit out an error indicating the other end hadn't sent any data for ten seconds. This, of course, annoyed me to no end and I started looking for some kind of obvious cause.

Three days later and after numerous elaborate experiments, I've opened Pandora's Box and I believe I've found a reproducible pattern.

In a nutshell, nearly every time my Mac blurps out a seemingly normal IGMP Membership Report (V1 & V2), it goes mute for anywhere from a couple of seconds to tens of seconds before returning to normal. Of course, a span of that many seconds will cause the iChat Video stream to crash and burn (as well as stall downloads, delay web page loading, etc.).


I initially thought it was a problem with my wireless link to the outside world (a 1.2 mile link between the campus where I work and the house). I was startled to realize as I began testing, though, that I could see these little seizures in my Mac's output even on traffic between my Mac and the Linux workstation sitting immediately to its left. All that separates the two systems is a high-end 10/100 ethernet switch here in the house.

I ran an ethernet cable directly between the two machines (no switches or hubs) and I was able to ping thousands of packets without any loss (as one would expect). This made me suspect the switch until I swapped it out temporarily and had the same problem with another model.

On a whim, I shut down the port on the switch that uplinks to my wireless span (therefore cutting myself off from the campus and Internet) and found that this scenario also eliminated the packet loss. This made me realize it had something to do with an outside influence, so I started looking more closely at packet captures until I spotted a pattern.

As mentioned above, the pattern I found was IGMP announcements from my Mac always immediately preceding the periods of my Mac going silent. These IGMP reports don't tend to happen with any frequency (if at all) when I detach the house from the broadcast traffic coming from hundreds of computers on the campus LAN.
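For anyone wanting to check a pattern like this against their own captures, here is a rough sketch rather than the procedure actually used above. It assumes the ping reply times and IGMP report times have already been exported from the capture as plain lists of seconds; the function names, the thresholds, and the sample numbers are all hypothetical.

```python
from typing import List, Tuple

Gap = Tuple[float, float]


def find_silent_gaps(reply_times: List[float], threshold: float = 2.0) -> List[Gap]:
    """Return (last_reply, next_reply) pairs that are more than
    `threshold` seconds apart, i.e. periods with no replies."""
    gaps = []
    for prev, cur in zip(reply_times, reply_times[1:]):
        if cur - prev > threshold:
            gaps.append((prev, cur))
    return gaps


def gaps_near_igmp(gaps: List[Gap], igmp_times: List[float],
                   window: float = 1.0) -> List[Gap]:
    """Keep only the gaps whose start lies within `window` seconds
    of at least one IGMP report timestamp."""
    return [
        (start, end)
        for start, end in gaps
        if any(abs(t - start) <= window for t in igmp_times)
    ]


# Made-up sample data: one reply per second, with two silent stretches.
replies = [0, 1, 2, 3, 15, 16, 17, 30, 31]
igmp_reports = [3.4, 25.0]

gaps = find_silent_gaps(replies)
print(gaps)                                # [(3, 15), (17, 30)]
print(gaps_near_igmp(gaps, igmp_reports))  # [(3, 15)]
```

If most of the gaps survive the second filter, that supports the idea that the IGMP reports and the silences are related; it does not by itself show which one causes the other.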

During these periods while my Mac is effectively mute, it is continuing to accept and record broadcast traffic, but not unicast traffic. In other words, I don't see the normal broadcast noise mixed with the incoming ping requests that are going unanswered... I simply don't see any unicast traffic at all in the Mac's Ethereal capture when this is happening. There is no sign of the incoming pings until the machine snaps out of its funk and, even then, it just resumes with the most current incoming ping request and starts replying like nothing ever happened. The simultaneous capture on the Linux side, however, clearly shows the pings going out unanswered. (see my diagram above for a visual illustration)
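The comparison between the two captures can be sketched in a few lines, assuming the ICMP echo sequence numbers have been pulled out of each capture into simple lists. This is only an illustration with invented numbers; `missing_runs` is a hypothetical helper, not something taken from Ethereal.

```python
from typing import Iterable, List, Tuple


def missing_runs(sent_seqs: Iterable[int],
                 seen_seqs: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (first, last) ranges of sequence numbers that appear in
    the sender's capture but never in the receiver's capture."""
    missing = sorted(set(sent_seqs) - set(seen_seqs))
    runs: List[Tuple[int, int]] = []
    for seq in missing:
        if runs and seq == runs[-1][1] + 1:
            # Extend the current run of consecutive missing pings.
            runs[-1] = (runs[-1][0], seq)
        else:
            runs.append((seq, seq))
    return runs


# Invented example: the Linux box sent pings 1..10,
# but the Mac's capture only recorded 1-3 and 8-10.
sent_from_linux = range(1, 11)
seen_on_mac = [1, 2, 3, 8, 9, 10]

print(missing_runs(sent_from_linux, seen_on_mac))  # [(4, 7)]
```

Each range returned corresponds to one stretch where requests left the sending machine but never showed up in the receiving machine's capture.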

I've also determined that the problem does NOT affect my Mac's AirPort connection... only the physical ethernet.

So, as it stands now, I have a sense of what is happening but no idea what caused it to suddenly start behaving in this manner or what I can do to effectively correct it. I'm still planning to do a clean OS X install on an external drive in hopes I can narrow down a software problem by comparison. The only thing I recall updating or installing immediately prior to this was the iTunes 7.0.2 update, but I don't see anything in that package that would have this kind of influence at such a low level.

I've attached an illustration I've made showing the packet capture comparison between the Linux workstation and my MacBook Pro. It should help clarify what I'm describing.

If anybody has any additional insights or suggestions, please drop me a note. If I make any discoveries of note, I'll post them here.

- Aaron

Posted by amahler on November 3, 2006 at 5:12 PM


Comments

What happens if you use UDP pings rather than ICMP pings? Are those packets dropped too? Or is it just ICMP packets?

Does it happen if you just direct connect the linux box and the mac? Is it a switch issue? I see you swapped the switch for another switch to test it but have you tried it without the switch?

IGMP reminds me of Rendezvous and mDNS.

I've seen odd problems with multicast traffic (for Rendezvous) with unmanaged switches when the configuration changes (bad caching of multicast MAC addresses for devices moving between ports, I think) but unicast works fine.

Posted by: Kevin Purcell on November 6, 2006 at 4:59 PM
