
Hardware or Software Failure?

Posted: Sat Jul 22, 2006 8:35 am
by dann
I've been getting the following error from Firewire drives:

sd 3:0:0:0: rejecting I/O to offline device
sd 3:0:0:0: rejecting I/O to offline device
printk: 5381 messages suppressed.
Buffer I/O error on device sda1, logical block 7110
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 7111
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 7112

I have tried three different drives on two different Firewire ports. Thinking the partition tables or the partitions themselves might have been hosed, I repartitioned and formatted them with either ext3 or reiserfs. This changed nothing.

When I start to copy files over, it seems to be working. After a bit I get this error and it bombs out. The files I am trying to copy are in excess of 5 GB. This was never a problem before.

Since there have been a lot of power outages at work, I thought maybe the system (Slackware 10.1) had become corrupted. So I installed Arch, but I am getting the same error.

I'm beginning to wonder if it is a hardware issue. The only thing I have not tried yet is moving the disks to another machine. But since I have already tried three different disks, I find it improbable that all three would be hosed.

Posted: Sun Jul 23, 2006 1:54 am
by Tsuroerusu
Back in October of last year, I got very similar errors in FreeBSD on my data partition. I backed up very quickly, and two days later... the drive died. I'd say it's a hardware error; you could try running some kind of diagnostic tool.

Posted: Sun Jul 23, 2006 10:16 am
by dann
I'm thinking it is a hardware failure, but in the machine's Firewire controller chip. I don't think the drives are failing. Three external drives at one time? That's pretty improbable.

Posted: Mon Jul 24, 2006 8:12 am
by dann
I'm pretty sure this is a hardware failure on the system end. I took one of the drives and had no problem writing to it on another machine. When I put it back on this backup system, it threw errors after a few minutes of writing the file to the drive.

Re: Hardware or Software Failure?

Posted: Mon Jul 24, 2006 9:36 am
by Gomer_X
dann wrote: I've been getting the following error from Firewire drives:

sd 3:0:0:0: rejecting I/O to offline device
sd 3:0:0:0: rejecting I/O to offline device
printk: 5381 messages suppressed.
Buffer I/O error on device sda1, logical block 7110
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 7111
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 7112
These errors look like what I get when I unplug my USB drive without properly unmounting it. The system is trying to write to the drive, but it's not there. I'd suspect a faulty connection, but the fact that you've switched hardware and still had the problem is puzzling. Using multiple ports on the same machine could still mean a hardware problem on that machine.

I'd try tests with files under 2 gigs to make sure it's not a filesystem problem. Sometimes a specific tool (ssh, tar, rsync) balks at large files when the underlying FS doesn't.
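A size test like that could be scripted roughly as follows. This is only a sketch: it assumes Python is available on the backup box, and the function name `write_test_file` and the mount point in the usage note are made up for illustration. It writes zeros in chunks, syncs after each chunk so a write error surfaces promptly instead of sitting in the page cache, and reports how far it got.

```python
import os

def write_test_file(path, total_bytes, chunk_bytes=1024 * 1024):
    """Write total_bytes of zeros to path; return bytes written before any I/O error."""
    written = 0
    chunk = b"\0" * chunk_bytes
    try:
        with open(path, "wb") as f:
            while written < total_bytes:
                n = min(chunk_bytes, total_bytes - written)
                f.write(chunk[:n])
                # Push the data to the device now so errors show up at this offset.
                f.flush()
                os.fsync(f.fileno())
                written += n
    except OSError as e:
        # The count is approximate: it is the last offset that synced cleanly.
        print(f"I/O error after about {written} bytes: {e}")
    return written
```

For example, `write_test_file("/mnt/fw/test.bin", 1500 * 1024 * 1024)` (hypothetical mount point) stays under 2 gigs; repeating it with a size over 4 GB shows whether the failure follows file size or simply elapsed writing time.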

Posted: Mon Jul 24, 2006 5:52 pm
by godzero
Also look at running a memtest.

Probably the controller (or some other bottleneck between the devices).

Another thought: 5 GB is just bigger than 4 GB. I'm only stating this to be complete. Is there a chance that the software/firmware can't handle anything bigger than a 32-bit unsigned int? Have you tried moving a >4 GB file on this setup before?
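To put numbers on that, here is a small sketch of the arithmetic. The helper name `size_limits_exceeded` is made up for illustration, and it only compares a byte count against the two usual 32-bit boundaries; it says nothing about what this particular hardware or software actually uses.

```python
INT32_MAX = 2**31 - 1    # 2147483647 bytes, just under 2 GiB
UINT32_MAX = 2**32 - 1   # 4294967295 bytes, just under 4 GiB

def size_limits_exceeded(size_bytes):
    """Report which 32-bit byte-count limits a file of this size would overflow."""
    return {
        "signed_32bit": size_bytes > INT32_MAX,
        "unsigned_32bit": size_bytes > UINT32_MAX,
    }

five_gb = 5 * 1024**3  # 5368709120 bytes
print(size_limits_exceeded(five_gb))
```

A 5 GB file is past both limits, while a 3 GB file is past only the signed one, so test files on either side of 2 GB and 4 GB would tell the two cases apart.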