Hardware or Software Failure?

dann
Site Admin
Posts: 1132
Joined: Mon Apr 26, 2004 10:55 pm
Location: Hampton, Va, USA

Hardware or Software Failure?

Post by dann » Sat Jul 22, 2006 8:35 am

I've been getting the following error from Firewire drives:

sd 3:0:0:0: rejecting I/O to offline device
sd 3:0:0:0: rejecting I/O to offline device
printk: 5381 messages suppressed.
Buffer I/O error on device sda1, logical block 7110
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 7111
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 7112

I have tried three different drives on two different Firewire ports. Thinking the partition tables or the partitions themselves might have been hosed, I repartitioned and formatted them with either ext3 or reiserfs. This has changed nothing.

When I start to copy files over, it seems to be working. After a bit I get this error and the copy bombs out. The files I am trying to copy are in excess of 5 GB. This was never a problem before.

Since there have been a lot of power outages at work, I thought maybe the system (Slackware 10.1) had become corrupted. So I installed Arch, but I am getting the same error.

I'm beginning to wonder if it is a hardware issue. The only thing I have not tried yet is moving the disks to another machine. But since I have already tried three different disks, I find it improbable that all three would be hosed.

Tsuroerusu
Posts: 2551
Joined: Mon Sep 05, 2005 8:51 am
Location: Silkeborg, Denmark

Post by Tsuroerusu » Sun Jul 23, 2006 1:54 am

Back in October of last year, I got very similar errors in FreeBSD regarding my data partition. I backed up very quickly, and two days later... the drive died. I'd say it's a hardware error; you could try running some kind of diagnostic tool.

"Hatred does not cease by hatred, but only by love. This is the eternal rule."
- Siddhattha Gotama (Buddha), founder of Buddhism.

dann
Site Admin
Posts: 1132
Joined: Mon Apr 26, 2004 10:55 pm
Location: Hampton, Va, USA

Post by dann » Sun Jul 23, 2006 10:16 am

I'm thinking it is a hardware failure, but in the Firewire controller chip rather than in the drives. Three external drives failing at the same time? That's pretty improbable.

dann
Site Admin
Posts: 1132
Joined: Mon Apr 26, 2004 10:55 pm
Location: Hampton, Va, USA

Post by dann » Mon Jul 24, 2006 8:12 am

I'm pretty sure this is a hardware failure on the system end. I took one of the drives and had no problem writing to it on another machine. When I put it back on this backup system, it threw errors after a few minutes of writing the file to the drive.

Gomer_X
Posts: 901
Joined: Fri Jun 03, 2005 1:31 pm
Location: Cincinnati, Ohio, USA

Re: Hardware or Software Failure?

Post by Gomer_X » Mon Jul 24, 2006 9:36 am

dann wrote:I've been getting the following error from Firewire drives:

sd 3:0:0:0: rejecting I/O to offline device
sd 3:0:0:0: rejecting I/O to offline device
printk: 5381 messages suppressed.
Buffer I/O error on device sda1, logical block 7110
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 7111
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 7112

These errors look like what I get when I unplug my USB drive without properly unmounting it. The system is trying to write to the drive, but it's not there. I'd suspect a faulty connection, but the fact that you've switched hardware and still had the problem is puzzling. Using multiple ports on the same machine could still mean a hardware problem on that machine.
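
If the log is long, it may help to boil it down to which device went offline and which blocks failed. Here is a rough Python sketch that does that for lines in the format quoted above. The function name `summarize` and the `SAMPLE` text are my own for illustration, and the patterns only cover the two message types seen in this thread.

```python
import re
from collections import Counter

# Patterns for the two kernel messages quoted in this thread (other formats are ignored).
BLOCK_ERR = re.compile(r"Buffer I/O error on device (\w+), logical block (\d+)")
OFFLINE = re.compile(r"sd (\d+:\d+:\d+:\d+): rejecting I/O to offline device")


def summarize(log_text):
    """Return (failed blocks per device, offline-rejection counts per SCSI address)."""
    blocks = {}
    offline = Counter()
    for line in log_text.splitlines():
        match = BLOCK_ERR.search(line)
        if match:
            blocks.setdefault(match.group(1), []).append(int(match.group(2)))
            continue
        match = OFFLINE.search(line)
        if match:
            offline[match.group(1)] += 1
    return blocks, dict(offline)


# Sample input copied from the error output posted above.
SAMPLE = """\
sd 3:0:0:0: rejecting I/O to offline device
sd 3:0:0:0: rejecting I/O to offline device
printk: 5381 messages suppressed.
Buffer I/O error on device sda1, logical block 7110
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 7111
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 7112
"""

if __name__ == "__main__":
    failed_blocks, offline_counts = summarize(SAMPLE)
    print("failed blocks:", failed_blocks)
    print("offline rejections:", offline_counts)
```

Consecutive block numbers right after an "offline device" message would fit the picture of the whole device dropping off the bus, rather than scattered bad sectors.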

I'd try tests with files under 2 gigs to make sure it's not a filesystem problem. Sometimes a specific tool (ssh, tar, rsync) balks at large files when the underlying FS doesn't.
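
To take the copy tool out of the picture, something like the following Python sketch could write plain test files of a few sizes straight to the drive and report which sizes fail. The helper names `write_test_file` and `try_sizes` and the mount point in the usage comment are made up for illustration; adjust them to your setup.

```python
import os


def write_test_file(path, size_bytes, chunk_size=1024 * 1024):
    """Write size_bytes of zeros to path, force them to disk, return bytes written."""
    chunk = bytes(chunk_size)
    written = 0
    with open(path, "wb") as f:
        while written < size_bytes:
            n = min(chunk_size, size_bytes - written)
            f.write(chunk[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # push data past the page cache so I/O errors surface here
    return written


def try_sizes(directory, sizes):
    """Try each size in turn; map size -> 'ok' or the error message."""
    results = {}
    for size in sizes:
        path = os.path.join(directory, f"testfile_{size}.bin")
        try:
            write_test_file(path, size)
            results[size] = "ok"
        except OSError as exc:
            results[size] = f"failed: {exc}"
        finally:
            try:
                os.remove(path)  # clean up; may itself fail if the device went offline
            except OSError:
                pass
    return results


# Example usage (the mount point is an assumption, not taken from the thread):
#   GIB = 1024 ** 3
#   print(try_sizes("/mnt/firewire", [1 * GIB, 3 * GIB, 5 * GIB]))
```

If the 1 GB file goes through and the larger ones fail at about the same point in time rather than at a particular size, that would point more toward the hardware than toward a file size limit.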

godzero
Posts: 34
Joined: Wed May 31, 2006 7:50 am

Post by godzero » Mon Jul 24, 2006 5:52 pm

Also look at running a memtest.

Probably the controller (or some other bottleneck between the devices).

As another thought: 5 GB is just bigger than 4 GB. I'm only stating this to be complete. Is there a chance that the software or firmware can't handle anything bigger than a 32-bit unsigned int? Have you tried moving a >4 GB file on this setup before?
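
For the arithmetic behind that thought, here is a small Python sketch. The helper names are just for illustration, and it assumes the limit in question would be a byte count or offset stored in 32 bits, which is only a guess about this setup.

```python
GIB = 1024 ** 3
UINT32_MAX = 2 ** 32 - 1  # largest value of a 32-bit unsigned int (just under 4 GiB)
INT32_MAX = 2 ** 31 - 1   # largest value of a 32-bit signed int (just under 2 GiB)


def fits_unsigned_32(size_bytes):
    """True if a byte count can be stored in a 32-bit unsigned int."""
    return 0 <= size_bytes <= UINT32_MAX


def fits_signed_32(size_bytes):
    """True if a byte count can be stored in a 32-bit signed int."""
    return 0 <= size_bytes <= INT32_MAX


def wrapped_offset(size_bytes):
    """Value left over if the count silently wraps around at 2**32."""
    return size_bytes % (2 ** 32)


if __name__ == "__main__":
    file_size = 5 * GIB
    print("5 GiB in bytes:", file_size)
    print("fits in unsigned 32-bit:", fits_unsigned_32(file_size))
    print("fits in signed 32-bit:", fits_signed_32(file_size))
    print("wraps to:", wrapped_offset(file_size), "bytes")
```

The signed limit is where the "under 2 gigs" test suggested earlier comes from. A 5 GB file is past both limits, so a plain 32-bit counter anywhere in the chain could not represent its size.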
