Windows Forum / Windows 98 / Disks / File System / May 2004

Maxtor HDDs - Powermax Surface Test Error Mysteriously Disappears??

Saidean - 26 Apr 2004 19:52 GMT
Hi,

Hoping someone with experience using powermax can help me understand
this. My system specs:

Win98SE
P4 1.8 GHz CPU, 512 MB PC2700 RAM
2 Maxtor 40 GB DiamondMax UDMA133 drives in a RAID array (striped) via
MBFastTrack 133 lite built into Gigabyte GA8IEXP Mobo (Promise Tech
RAID Controller).
1 Seagate 40 GB HDD (Data)

2 days ago, my PC hung and I had to shutdown the system (win98se) and
reboot. When I rebooted, I did a scandisk and it got stuck at 48%
scanning the file allocation table (FAT), and promptly coughed up
"Scandisk encountered data error while reading the FAT on drive C".
Panic attack. Tried using scandisk several times with various options
- no go. Used Norton Disk Doctor (in DOS mode) - no luck either.
Strangely enough, I was still able to reboot, and because I had my
startup menu options available by default, I could always boot to
command prompt mode and even the Norton Antivirus would be able to
scan and run there. I was able to copy out my critical files from my C
Drive without any problems.

This made me think that there is simply a mismatch between the
original FAT and the backup (2nd) FAT which needs to be corrected
somehow. Or bad sectors might be the cause. So I downloaded Powermax
from Maxtor to check the 2 HDDs. It detected them properly and then I
ran an advanced surface test on the first HDD. After some time it
encountered an error and asked if I wanted to fix it. As I hadn't
completed copying out my files, I said no to the fix.

I tried running Win98 in Safe mode and that worked without any
problems. I used scandisk from within safemode and it recognised the
FAT error and fixed it along with some other errors (time stamp
errors) it encountered. Rebooted to normal windows after this, and no
problems. Did Scandisk and NDD 3-4 times including surface tests and
again no problems.

Did a powermax surface test again, and this time it certified both
HDDs as clear of errors/bad sectors!

What happened to the initial error? Does powermax's surface test
'check' the FAT to see if there is a problem there? I thought all it
does is check the surface for bad sectors? If it does check FAT then I
can understand why it's now cleared of errors, but if not... where did
my error/bad sector disappear to?

Is it time to change my HDDs??

Any help greatly appreciated on this!
glee - 26 Apr 2004 22:46 GMT
Most disk diags from the drive manufacturers test the file system before they test the surface of the drive.  However you state that you were (apparently) well into the surface check when Powermax reported an error.

Modern hard drives have the ability to repair themselves, insofar as they can use a spare cluster to "replace" a damaged one, copying the data over if possible, and then removing the bad cluster from use.

Once the drive accomplishes that, neither Scandisk nor Powermax should see the bad cluster at all anymore.  This may be what occurred in your case.  Not a very technical answer, I know, but I think I have the gist of it right.  :-)
Signature

Glen Ventura, MS MVP W95/98 Systems
http://dts-l.org/goodpost.htm

> Hi,
>
[quoted text clipped - 47 lines]
>
> Any help greatly appreciated on this!
Folkert Rienstra - 26 Apr 2004 23:31 GMT
> Most disk diags from the drive manufacturers test the file system before they test the surface of the drive.  However you state
> that you were (apparently) well into the surface check when Powermax reported an error.
>
> Modern hard drives have the ability to repair themselves,

> insofar as they can use a spare cluster to "replace" a damaged one,

Hard drives do not use clusters.

> copying the data over if possible, and then removing the bad cluster from use.
>
> Once the drive accomplishes that, neither Scandisk nor Powermax should see the bad cluster at all anymore.  This may be what
> occurred in your case.  

> Not a very technical answer, I know, but I think I have the gist of it right.  :-)

Barely. And it doesn't explain what he saw.

> --
> Glen Ventura, MS MVP W95/98 Systems

> http://dts-l.org/goodpost.htm

And now they even allow trolls to be MVPs.

> > Hi,
> >
[quoted text clipped - 47 lines]
> >
> > Any help greatly appreciated on this!
AlmostBob - 27 Apr 2004 03:11 GMT
Just a hint,
a google search for "Hard drive cluster" returns 397,000 responses
Details view in defrag displays a cluster map
Start/Help/Search/"cluster" returns this text
Drive Converter (FAT32) is an improved version of the File Allocation Table
(FAT) that allows hard drives over two gigabytes to be formatted as a single
drive. Drive Converter uses smaller clusters than FAT drives, resulting in
more efficient space use. Windows 98 includes a graphical Drive Converter
conversion utility, which quickly and safely converts a hard drive from the
original FAT to FAT32.

1. Perhaps hard drives do use clusters,
and 2. Glee earns his stripes, you haven't
Signature

Adaware http://www.lavasoft.de
spybot http://security.kolla.de
AVG http://www.grisoft.com
Panda online scan http://www.pandasoftware.com/ActiveScan/
Catalog of removal tools http://www.pandasoftware.com/download/utilities/
Blocking Unwanted Parasites with a Hosts file
http://mvps.org/winhelp2002/hosts.htm
links provided as a courtesy,
Grateful thanks to the authors/webmasters

| > Most disk diags from the drive manufacturers test the file system before they test the surface of the drive.  However you state
| > that you were (apparently) well into the surface check when Powermax reported an error.
[quoted text clipped - 4 lines]
|
| Hard drives do not use clusters.

<<Snipped for brevity>>
glee - 27 Apr 2004 04:02 GMT
Clusters, however, is not the correct word for what I was saying...I meant to use "sectors".
I spent time in the comp.sys.ibm.pc.hardware.storage group a few years ago, and Folkert was just as rude then as now.  It is unfortunate, because he does know more than a little about hard drives.  I think his tendency to answer posts only to berate or belittle the other replies pretty well shows who the troll is.
Signature

Glen Ventura, MS MVP W95/98 Systems
http://dts-l.org/goodpost.htm

> Just a hint,
> a google search for "Hard drive cluster" returns 397,000 responses
[quoted text clipped - 23 lines]
> |
> <<Snipped for brevity>>
Eric Gisin - 27 Apr 2004 15:08 GMT
What a moron. Spend a day at pcguide.com. Hard drives do not have clusters.

> Just a hint,
> a google search for "Hard drive cluster" returns 397,000 responses
[quoted text clipped - 9 lines]
> 1.Perhaps Hard drives do use clusters,
> and 2.Glee earns his stripes, you havent
AlmostBob - 27 Apr 2004 18:15 GMT
A cluster is the group of sectors allocated by the OS as the minimum
manipulated disk size. Therefore a zero byte file occupies 1 cluster, and
continues to occupy 1 cluster until the size of the file is 1 byte larger than
the cluster size, at which point it occupies 2 clusters. Cluster size under
fat32 is dependent on Drive size, and may be 1k 2k 4k etc, but regardless of
the number of physical sectors that comprise a cluster, it is still a cluster,
the clusters still exist on the drive, which is why scandisk /surface displays
the cluster number as it scans
and for you Morons, this reference from your information source of choice
http://www.pcguide.com/ref/hdd/file/partCluster-c.html
http://www.pcguide.com/ref/hdd/file/fat.htm
and this quoted text "However, for performance reasons, individual sectors are
not allocated to files in the FAT system. The reason is that it would take a
lot of overhead (time and space) to keep track of pieces of files that were
this small: a 10 GB disk partition has 20,000,000 sectors! The hard disk is
instead broken into larger pieces called clusters, or alternatively,
allocation units. Each cluster contains a number of sectors. Typically,
clusters range in size from 2,048 bytes to 32,768 bytes, which corresponds to
4 to 64 sectors each"
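
The figures in that quote can be checked with a few lines of arithmetic. The Python sketch below assumes 512-byte sectors (as in the quote) and counts the 10 GB partition as 10 x 1024^3 bytes; the function names are made up for illustration, and it only counts the clusters needed to hold a file's data, not how any particular OS treats an empty file.

```python
SECTOR_BYTES = 512  # sector size assumed throughout the quoted text


def sectors_in_partition(partition_bytes: int) -> int:
    """Number of 512-byte sectors in a partition of the given size."""
    return partition_bytes // SECTOR_BYTES


def sectors_per_cluster(cluster_bytes: int) -> int:
    """How many sectors make up one cluster of the given size."""
    return cluster_bytes // SECTOR_BYTES


def clusters_for_data(file_bytes: int, cluster_bytes: int) -> int:
    """Clusters needed to hold file_bytes of data (integer ceiling division)."""
    return (file_bytes + cluster_bytes - 1) // cluster_bytes


def slack_bytes(file_bytes: int, cluster_bytes: int) -> int:
    """Allocated but unused bytes at the end of the file's last cluster."""
    return clusters_for_data(file_bytes, cluster_bytes) * cluster_bytes - file_bytes
```

With 4 KB clusters, a 4,096-byte file fits in one cluster and a 4,097-byte file needs two, which is the step described above; the 2,048-byte and 32,768-byte cluster sizes come out as 4 and 64 sectors, and the 10 GB partition as roughly 21 million sectors.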

can you spell a.shole, or do you just assume it when you look in the mirror

| What a moron. Spend a day at pcguide.com. Hard drives do not have clusters.
|
[quoted text clipped - 11 lines]
| > 1.Perhaps Hard drives do use clusters,
| > and 2.Glee earns his stripes, you havent
Folkert Rienstra - 27 Apr 2004 22:19 GMT
> Just a hint,
> a google search for "Hard drive cluster" returns 397,000 responses

Hardly surprising when one doesn't know how to conduct a search.

> Details view in defrag displays a cluster map
>
[quoted text clipped - 5 lines]
> conversion utility, which quickly and safely converts a hard drive from the
> original FAT to FAT32.

None of that has to do with how a harddrive is organized/accessed on the
hardware level. Obviously you don't even know the difference between a
physical drive (harddrive) and a logical drive (formatted partition) and how
filesystems work.

> 1.Perhaps Hard drives do use clusters,

Nope, they use sectors (or blocks, depending on
whether you speak IDE or SCSI).  Filesystems use clusters.

> and 2.Glee earns his stripes,

Sure he does, applied with a whip, obviously.

> you havent

Clueless.
Can't even set up his newsreader properly, so that's actually no surprise.

> | > Most disk diags from the drive manufacturers test the file system before
> they test the surface of the drive.  However you state
[quoted text clipped - 8 lines]
> |
> <<Snipped for brevity>>
glee - 27 Apr 2004 03:57 GMT
I meat to say "sector" not "cluster".
I gave a possible scenario, not being sure from the info in the post what was fixed by scandisk in Safe Mode, nor what was found originally by Powermax.  I answered the post in the win98.gen_discussion group, which is not the hang-out of disk specialists....there is no need for you to be rude.

BTW, Folkert, you calling someone else a troll is pretty funny, from what I recall of the time I spent in the pc.hardware.storage group a few years ago....why not do a search in Google groups archive, where you will see that I am certainly not a troll.
Signature

Glen Ventura, MS MVP W95/98 Systems
http://dts-l.org/goodpost.htm

> > Most disk diags from the drive manufacturers test the file system before they test the surface of the drive.  However you state
> > that you were (apparently) well into the surface check when Powermax reported an error.
[quoted text clipped - 72 lines]
> > >
> > > Any help greatly appreciated on this!
Folkert Rienstra - 27 Apr 2004 23:25 GMT
> I meat to say "sector" not "cluster".

Well, that can happen when you are a very lazy person that doesn't proofread
his posts, which you obviously didn't do this time as well. A very lazy person that
obviously lets the newsclient do the line breaking instead of doing that himself.

> I gave a possible scenario,

> not being sure from the info in the post what was fixed by scandisk in Safe Mode,

A good reason to refrain from answering, I would think.
And if you had read his post properly you would have found that he did post at
what point the drive was corrected.

> nor what was found originally by Powermax.  

He was very clear on that, initially.

> I answered the post in the win98.gen_discussion group, which is not the hang-out
> of disk specialists....

Obviously more reason to refrain and let the disk specialists answer it.

> there is no need for you to be rude.  

Of course there is.

> BTW, Folkert, you calling someone else a troll is pretty funny,

Nope, it is pretty serious.

> from what I recall of the time I spent in the pc.hardware.storage
> group a few years ago....why not do a search in Google groups archive,

> where you will see that I am certainly not a troll.

You are, in my book, when you post a link to goodpost.htm but obviously have
never read it yourself and break about every rule in there and then some.

You top-post, you post using quoted-printable, don't use line breaks, and to top
it off you have an account at Mindspring. Yeah, you obviously are no troll.

>>> Most disk diags from the drive manufacturers test the file system before they test the surface of the drive.  However you state
>>> that you were (apparently) well into the surface check when Powermax reported an error.
[quoted text clipped - 72 lines]
>>>>
>>>> Any help greatly appreciated on this!
Folkert Rienstra - 26 Apr 2004 23:45 GMT
> Hi,
>
[quoted text clipped - 26 lines]
> ran an advanced surface test on the first HDD. After some time it
> encountered an error and asked if I wanted to fix it.

> As I hadn't completed copying out my files, I said no to the fix.

> I tried running Win98 in Safe mode and that worked without any
> problems.

> I used scandisk from within safemode and it recognised the FAT error and
> fixed it along with some other errors  (time stamp errors) it encountered.

> Rebooted to normal windows after this, and no problems. Did Scandisk
> and NDD 3-4 times including surface tests and  again no problems.
[quoted text clipped - 3 lines]
>
> What happened to the initial error?

It got cleared.

> Does powermax's surface test 'check' the FAT to see if there is a problem
> there?

Probably not, although some utes are FS aware.

> I thought all it does is check the surface for bad sectors?

> If it does check FAT then I can understand why it's now cleared of errors,

Nope, you said you didn't allow it to.

> but if not... where did my error/bad sector disappear to?

You answered that yourself:
You used scandisk from within safemode and it recognised the FAT error and
fixed it along with some other errors  (time stamp errors) it encountered.

Simple, no?

> Is it time to change my HDDs??
>
> Any help greatly appreciated on this!
Saidean - 28 Apr 2004 01:39 GMT
> > Did a powermax surface test again, and this time it certified both
> > HDDs as clear of errors/bad sectors!
> >
> > What happened to the initial error?
>
> It got cleared.
While I can understand that I 'cleared' it via scandisk in safe mode,
I ask this question because as you stated next:

> > Does powermax's surface test 'check' the FAT to see if there is a problem
> > there?
>
> Probably not, although some utes are FS aware.

what I'm still confused over is that IF (as you stated) powermax's
surface test does NOT check the FAT, then the initial problem detected
by powermax's surface test should still be there, even though the FAT
was repaired by scandisk in safe mode. Isn't this correct?

(Also what is FS aware?)

> > but if not... where did my error/bad sector disappear to?
>
[quoted text clipped - 3 lines]
>
> Simple, no?
I don't see it though - IF scandisk repaired the error (which is
true), then powermax should STILL detect its error via its surface
test, since that doesn't involve the FAT. But after the FAT has been
repaired, powermax's surface test no longer showed this bad sector
error anymore.

If as a previous poster said, the HDD 'repaired itself', how does the
HDD know there are bad sectors if they were never marked? The HDD
never had any bad sectors marked or any problems with bad sectors
before this.  Even now after several surface scans using scandisk, NDD
and powermax, it shows all clear.

So unless powermax DOES verify the FAT when it does the surface test,
(which would account for the error no longer being reported after
scandisk fixed it), I cannot understand how the bad sector error
reported by powermax disappeared after scandisk fixed the FAT.
Jeff Richards - 28 Apr 2004 07:19 GMT
Hard disk errors can be transient. It is quite possible for the system to
report an error on one pass, and then report no errors the next time the
same procedure is run. The original hang might have been a disk error - it's
not really possible to say.  The message reported by Scandisk was probably a
disk data error, although it might be the controller or even RAM.

It is quite possible that a FAT error will not affect the OS at all. A large
part of the FAT (corresponding to the unused portion of the disk) is not
used by the OS, so an error in that section will be ignored. That's probably
why you were able to boot and run tests.

Powermax does not check the FAT, but it does test the area of the disk where
the FAT is stored. Note that timestamps are in the directory entry, not the
FAT, so Scandisk found and fixed errors in different parts of the disk. What
Scandisk fixed sounds like a temporary corruption of data, as can be caused
by a glitch during a write procedure. What Powermax detected seems more like
a hard disk fault. Powermax never indicated it had fixed anything, so
perhaps it was marginal.  You could consider using Powermax regularly to
check that the drive isn't deteriorating.
Signature

Jeff Richards
MS MVP W95/W98

> > > Did a powermax surface test again, and this time it certified both
> > > HDDs as clear of errors/bad sectors!
[quoted text clipped - 40 lines]
> scandisk fixed it), I cannot understand how the bad sector error
> reported by powermax disappeared after scandisk fixed the FAT.
Folkert Rienstra - 28 Apr 2004 14:17 GMT
> > > Did a powermax surface test again, and this time it certified both
> > > HDDs as clear of errors/bad sectors!
[quoted text clipped - 15 lines]
> by powermax's surface test should still be there, even though the FAT
> was repaired by scandisk in safe mode.

> Isn't this correct?

No, it isn't.
By correcting the FAT it also corrected the bad sector in the FAT by
overwriting it. That sector's problem was either corrected at that point
or it was replaced by a spare sector.

> (Also what is FS aware?)

File System aware: the utility knows where the file system's administrative
structures are and does not touch them. E.g., IBM's DFT can overwrite (= zero)
bad sectors that are in user data but refuses to do so for sectors in the MBR,
FAT or directories.

> > > but if not... where did my error/bad sector disappear to?
> >
[quoted text clipped - 7 lines]
> true), then powermax should STILL detect its error via its surface
> test, since that doesn't involve the FAT.

Of course it involves the FAT; the FAT is on that 'surface'.

> But after the FAT has been repaired, powermax's surface test no longer
> showed this bad sector error anymore.

Because scandisk repaired it.
The bad sector was the cause of the FAT needing repair in the first place.

> If as a previous poster said, the HDD 'repaired itself', how does the
> HDD know there are bad sectors if they were never marked?

Because it can't read them, maybe? Simple, eh? All it has to do is to mark
internally the sector(s) that refused to read. On the next write to such a
sector the drive can test it beforehand and decide to reuse it or replace it.
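
A toy Python model may make that sequence easier to follow. It is only a sketch of the behaviour described here, not real drive firmware: the class name, the "unreadable" and "worn_out" sets and the spare pool are all invented for illustration. A failed read puts the sector on a pending list; the next write either reuses the original location or redirects the same address to a spare, and the host keeps using the same sector number either way.

```python
SECTOR_BYTES = 512


class ToyDrive:
    """Toy model of drive-internal defect handling (hypothetical, for illustration)."""

    def __init__(self, sectors: int, spares: int):
        self.data = {}           # physical sector -> stored bytes
        self.remap = {}          # host sector number -> spare physical sector
        self.pending = set()     # host sectors that failed a read
        self.unreadable = set()  # physical sectors whose contents cannot be read
        self.worn_out = set()    # physical sectors that also fail the write test
        # Spare sectors live past the end of the host-visible range.
        self.spares = list(range(sectors, sectors + spares))

    def _physical(self, lba: int) -> int:
        return self.remap.get(lba, lba)

    def read(self, lba: int) -> bytes:
        phys = self._physical(lba)
        if phys in self.unreadable:
            self.pending.add(lba)  # remember it; nothing is replaced yet
            raise OSError(f"read error at sector {lba}")
        return self.data.get(phys, bytes(SECTOR_BYTES))

    def write(self, lba: int, payload: bytes) -> None:
        phys = self._physical(lba)
        if lba in self.pending:
            if phys in self.worn_out:
                # The location is no good: redirect this address to a spare.
                phys = self.spares.pop(0)
                self.remap[lba] = phys
            else:
                # The fresh write fixed it: reuse the original location.
                self.unreadable.discard(phys)
            self.pending.discard(lba)
        self.data[phys] = payload
```

In this model, a surface test run after the sector has been rewritten sees a perfectly readable sector at the same address, whichever branch was taken.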

> The HDD never had any bad sectors marked or any problems with bad sectors
> before this.  

A flat tyre is only a flat tyre because it wasn't flat before it became a flat tyre.

> Even now after several surface scans using scandisk, NDD and powermax, it
> shows all clear.

Yes, your tyre is no longer flat and will stay so until it springs a new leak.

> So unless powermax DOES verify the FAT when it does the surface test,
> (which would account for the error no longer being reported after
> scandisk fixed it), I cannot understand how the bad sector error
> reported by powermax disappeared after scandisk fixed the FAT.
PCR - 28 Apr 2004 01:44 GMT
It could be as Glee said, that it was auto-fixed by a chip on the hard
drive. But... indeed you do appear to have said Scandisk fixed the
error, two paragraphs before you ask where it went to.... "I used
scandisk from within safemode and it recognized the FAT error and fixed
it".

But, in case you are as batty as Rienstra/Gisin are nasty, just look
inside C:\Scandisk.log. What does it say in there?

Uhhhhh, I think, as the drive now passes all scans, it is still good.
Also, there was a good reason for the error-- you did crash first. Go
on, make a full system backup, though. And next week run those scans
again!

Signature

Thanks or Good Luck,
There may be humor in this post, and,
Naturally, you will not sue,
should things get worse after this,
PCR
pcrrcp@netzero.net

| Hi,
|
[quoted text clipped - 47 lines]
|
| Any help greatly appreciated on this!
cquirke (MVP Win9x) - 28 Apr 2004 13:35 GMT
>It could be as Glee said, that it was auto-fixed by a chip on the hard
>drive. But... indeed you do appear to have said Scandisk fixed the
>error, two paragraphs before you ask where it went to.... "I used
>scandisk from within safemode and it recognized the FAT error and fixed

Several items to clarify.

Firstly, Scandisk spends most of its time dealing with file system
logic errors that are unrelated to the HD's physical condition.  Only
when Scandisk does a surface scan, does it look for defective
clusters, tho bad sectors can trip up its logic checks and repairs.

Secondly, it's important to remember that while the HD's firmware and
the OS's surface scanning do the same sort of thing, they work at
different levels, and the one is oblivious of the other.

At the hardware level, disk space is split up into physical 512-byte
sectors.  These are addressed via the hard drive's own internal
hardware; everything else, from the UIDE controller backwards, can
only "ask nicely" for the HD to access these sectors.  

If the HD's firmware defect management copies the contents from a
failing sector, writes it to another sector, and from then on maps the
old sector's raw address to the new one - then no software running on
the PC is any the wiser, unless it can query the vendor-specific HD
firmware in some way.  Scandisk and the rest of the OS can't do that;
only HD-vendor-specific tools may have a chance there.

At the OS level, the OS sees an expanse of disk that has been set
aside as one or more volumes for its use (according to the
system-level partitioning scheme).  It divides the volume into a file
system structure area and a data cluster area (a cluster contains
multiple sectors).  

When Scandisk surface scan tests the cluster area, it can "fix"
clusters with failing sectors by copying them to a new cluster
address.  This is the addressing scheme it sees; not raw sectors.

>Uhhhhh, I think, as the drive now passes all scans, it is still good.

I'm less sure.  The thing to do is use the drive vendor's diagnostics,
or failing that to do a Scandisk surface scan in DOS mode so that you
can watch the cluster progress counter.

When the OS (including Scandisk surface scan) accesses a failing
sector, it may take several retries before the HD finds the sector and
passes CRC-OK data from it.  During that time, the HD's firmware may
"fix" the defect by remapping the bad sector, which you may hear as
nyaak-nyaak noises as the heads move about.  

If it takes long enough to do this - and Scandisk accepts an absurdly
long time as "normal" - then Scandisk may timeout and call the cluster
Bad, and do its own cluster-map-level relocation.

The point about all this is that a HD that has latency and perhaps
even visible bad sectors or clusters that "get better" is still a
highly suspect HD that IMO should be dragged out and pulped once the
data's been evacuated.  The HD firmware may have relocated the bad
sector, with or without loss of that sector's contents, and if so, a
fresh format may find no bad clusters.  Plus, because the HD keeps
"spare" sectors for this purpose, you won't see a capacity drop.  

Nonetheless, this is a HD that has started to fail.  You can judge
whether all these auto-fixing shenanigans are about delaying support
calls until the HD warranty expires, or are a genuine attempt to make life
smoother for the user (even if unsuccessful sector moves lose data).

>Also, there was a good reason for the error-- you did crash first.

Maybe the HD defect was the good reason for the crash?

>-------------------- ----- ---- --- -- - -  -   -
 Running Windows-based av to kill active malware is like striking
 a match to see if what you are standing in is water or petrol.
>-------------------- ----- ---- --- -- - -  -   -
Folkert Rienstra - 28 Apr 2004 18:16 GMT
>> It could be as Glee said, that it was auto-fixed by a chip on the hard drive.
>> But... indeed you do appear to have said Scandisk fixed the error,
[quoted text clipped - 5 lines]
> Firstly, Scandisk spends most of its time dealing with file system
> logic errors that are unrelated to the HD's physical condition.  

> Only when Scandisk does a surface scan, does it look for defective
> clusters,

Not only clusters, it also scans the system area, apparently.

> tho bad sectors can trip up its logic checks and repairs.

Please explain.

> Secondly, it's important to remember that while the HD's firmware
> and the OS's surface scanning do the same sort of thing,

No, they don't.

> they work at different levels, and the one is oblivious of the other.
>
[quoted text clipped - 5 lines]
> If the HD's firmware defect management copies the contents from a
> failing sector, writes it to another sector,

That is 'failing' in the sense of 'about to fail', 'not yet failed'.
Obviously, it can only copy them when the sectors can still be read.

> and from then on maps the  old sector's raw address to the new one -
> then no software running on the PC is any the wiser, unless it can query
[quoted text clipped - 21 lines]
> sector, it may take several retries before the HD finds the sector
> and passes CRC-OK data from it.  

ECC actually.

> During that time, the HD's firmware may "fix" the defect by remapping the
> bad sector, which you may hear as nyaak-nyaak noises as the heads move about.

Right, although the nyaak-nyaak noises are the retries, not the remapping.

> If it takes long enough to do this - and Scandisk accepts an absurdly
> long time as "normal" -

So the drive will likely succeed (or fail) well within that time.

> then Scandisk may timeout and call the cluster Bad, and do its own cluster-
> map-level relocation.

I don't think that there is relocation. Where-to? There are no spare clusters.
The cluster is marked as bad and not available anymore.

> The point about all this is that a HD that has latency and perhaps
> even visible bad sectors or clusters that "get better" is still a
> highly suspect HD that IMO should be dragged out and pulped once the
> data's been evacuated.  

That is because you have no idea what a bad sector is and your
lower instincts are taking over: Don't understand? Smash it!

> The HD firmware may have relocated the bad
> sector, with or without loss of that sector's contents, and if so, a
> fresh format may find no bad clusters.  Plus, because the HD keeps
> "spare" sectors for this purpose, you won't see a capacity drop.
>
> Nonetheless, this is a HD that has started to fail.  

Or a power supply or PS connector that has started to fail or is getting
old and tired.

> You can judge whether all these auto-fixing shenanigans go about delaying
> support calls until the HD warranty expires, or a genuine attempt to make
> life smoother for the user

> (even if unsuccessful sector moves lose data).

Huh?

>> Also, there was a good reason for the error-- you did crash first.
>
[quoted text clipped - 4 lines]
>   a match to see if what you are standing in is water or petrol.
>> -------------------- ----- ---- --- -- - -  -   -
cquirke (MVP Win9x) - 29 Apr 2004 15:51 GMT
On Wed, 28 Apr 2004 19:16:23 +0200, "Folkert Rienstra"
>"cquirke (MVP Win9x)" <cquirkenews@nospam.mvps.org> wrote

>> Firstly, Scandisk spends most of its time dealing with file system
>> logic errors that are unrelated to the HD's physical condition.  

>> Only when Scandisk does a surface scan, does it look for defective
>> clusters,

>Not only clusters, it also scans the system area, apparently.

Yes, where "system area" refers to file system structure within the
volume, but not disk outside the volume.  That implies a Scandisk
surface scan will miss bad sectors in Cyl 0, Head 0 no matter what
drive letters you tell it to scan, as that's not in any volume.

Even with one primary (almost) filling the HD as C:, it's confusing:
 - CHS addressing addresses entire HD, but may be "virtual"
 - physical sector addressing addresses entire HD
 - logical sector addressing addresses entire volume (from PBR)
 - cluster addressing addresses volume data space (from root)
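
As a rough illustration of how those four addressing levels relate, here is a Python sketch. The CHS-to-sector formula is the standard one and FAT clusters are numbered from 2; the geometry (255 heads, 63 sectors per track), the partition start and the data-area offset used in the note afterwards are assumed example values, not figures from this thread, and the function names are invented.

```python
def chs_to_physical(cyl: int, head: int, sector: int,
                    heads: int = 255, sectors_per_track: int = 63) -> int:
    """Whole-disk sector number for a CHS address (CHS sectors count from 1)."""
    return (cyl * heads + head) * sectors_per_track + (sector - 1)


def physical_to_logical(physical: int, volume_start: int) -> int:
    """Volume-relative sector number, counted from the volume's boot record."""
    return physical - volume_start


def cluster_to_logical(cluster: int, data_start: int,
                       sectors_per_cluster: int) -> int:
    """First volume-relative sector of a data cluster (clusters count from 2)."""
    return data_start + (cluster - 2) * sectors_per_cluster
```

With a volume assumed to start at Cyl 0, Head 1, Sector 1, physical sectors 0 to 62 (all of Cyl 0, Head 0) have no logical sector or cluster number at all, which is why a per-volume surface scan never visits them.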

>> tho bad sectors can trip up its logic checks and repairs.

>Please explain.

Well, Scandisk may be intent only on checking the file system logic,
and trip over a bad sector within the file system structure (say,
within a FAT as I now see this thread is about).  How it reports this
may be a bit unpredictable; so far I've not seen it often enough (or in
all permutations) to say.  The structures where this may happen
include MBR, PBR, FATs, root, and subdir chains.

>> Secondly, it's important to remember that while the HD's firmware
>> and the OS's surface scanning do the same sort of thing,

>No, they don't.

They do, in the sense they react to difficulties in reading disk.  The
support code for NTFS does this on the fly in XP, whereas AFAIK the
rest of the system flags the event in the FAT's "bad disk" bit so that
the next startup will run a surface check.

>> they work at different levels, and the one is oblivious of the other.

>> At the hardware level, disk space is split up into physical 512-byte
>> sectors.  These are addressed via the hard drive's own internal
[quoted text clipped - 3 lines]
>> If the HD's firmware defect management copies the contents from a
>> failing sector, writes it to another sector,

>That is 'failing' in the sense of 'about to fail', 'not yet failed'.

In the sense that it takes too many retries to come up with correct
CRC value, or whatever error-checking it does.

>Obviously, it can only copy them when the sectors can still be read.

Yes, but what do you think happens when it can't do this?  I suspect
it just maps the address to the new sector and to hell with the
contents.  After all, "everyone knows Windows sucks", so it's easy to
play kill/bury/deny and let the user blame something else.

I've seen this often enough in data recovery:
 - attempts to copy off file haks and haks on retries
 - then suddenly it works with no delay
 - but there's 512 bytes of void or garbage in the file

When copying stuff off sick HDs, I select source and dest and go
Properties to see that the file and byte counts match.  With one
particular large subtree I noticed the *source* file and byte counts
were dropping, suggesting some sectors in subdirs were zeroing out.

(and no, I wasn't moving instead of copying!  <g> )

>> When the OS (including Scandisk surface scan) accesses a failing
>> sector, it may take several retries before the HD finds the sector
>> and passes CRC-OK data from it.  

>ECC actually.

Thanks; couldn't remember the term   ;-)

>> During that time, the HD's firmware may "fix" the defect by remapping the
>> bad sector, which you may hear as nyaak-nyaak noises as the heads move about.

>Right, although the nyaak-nyaak noises are the retries, not the remapping.

Most likely, yes; I always interpreted them to be attempts to find the
cylinder, but they could be trips from source to destination (tho why
it would go to dest without good data from source is a question) or
they could even be difficulties in writing to dest (more likely the
case with Scandisk surface, which AFAIK chooses "far" clusters)

>> If it takes long enough to do this - and Scandisk accepts an absurdly
>> long time as "normal" -

>So the drive will likely succeed (or fail) well within that time.

Success is relative.  A failing surface that has to be "fixed" is not
something I'd accept as "good".  Remember, these are not inbuilt
manufacturing defects but acquired defects, so they are NOT normal.  

Plus, a head strike that damages disk can have three other effects:
 - pollution of the sealed airspace with abrasive debris
 - head damage
 - damage to adjacent disk surface currently seen as "OK"

>> then Scandisk may timeout and call the cluster Bad, and do its own cluster-
>> map-level relocation.

>I don't think that there is relocation. Where-to? There are no spare clusters.
>The cluster is marked as bad and not available anymore.

If there's data in the cluster, then Scandisk copies (what it can of)
that data to another free cluster within the FAT.  The FAT chain is
updated to point to this replacement cluster, while the original
cluster is marked Bad.  You're right; there's no "spare" clusters to
use, and the process IS visible at the FAT level, unlike firmware "fix".

So yes; there's relocation, though typically there may be a 512-byte
hole in the data for the sector that couldn't be copied.  Unaffected
sectors within the cluster get copied OK though, and this may help
with the problem of sick adjacent sectors to a limited extent.
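
To make the FAT-level relocation concrete, here is a minimal Python sketch of the bookkeeping. It is a toy model, not Scandisk's actual code: the FAT is a plain list, the data area is a dict, and relocate_cluster is a hypothetical name. The FAT16 marker values (0x0000 free, 0xFFF7 bad, 0xFFFF end of chain) are the only real-world details used. It assumes a free cluster exists and that the bad cluster really is in the chain.

```python
FREE = 0x0000   # FAT16 value for an unused cluster
BAD = 0xFFF7    # FAT16 value for a cluster marked bad
EOC = 0xFFFF    # one of the FAT16 end-of-chain values

def relocate_cluster(fat, data, start, bad, salvaged):
    """Move cluster `bad` of the chain starting at `start` to a free cluster.

    fat:      list where fat[n] is the FAT entry for cluster n
    data:     dict of cluster number -> bytes (stand-in for the data area)
    salvaged: whatever could be read from the failing cluster
    Returns the start cluster of the chain, which changes if `bad` was first.
    """
    # Clusters 0 and 1 are reserved, so search from 2.
    new = next(n for n in range(2, len(fat)) if fat[n] == FREE)

    # Walk the chain to find the cluster that points at `bad`.
    prev = None
    cur = start
    while cur != bad:
        prev, cur = cur, fat[cur]

    data[new] = salvaged
    data.pop(bad, None)
    fat[new] = fat[bad]      # new cluster inherits the rest of the chain
    fat[bad] = BAD           # original is taken out of service
    if prev is None:
        return new           # directory entry must now point at `new`
    fat[prev] = new
    return start
```

Note that the salvaged bytes are written as-is; any sector that could not be read is where the 512-byte hole ends up.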

>> The point about all this is that a HD that has latency and perhaps
>> even visible bad sectors or clusters that "get better" is still a
>> highly suspect HD that IMO should be dragged out and pulped once the
>> data's been evacuated.  

>That is because you have no idea what a bad sector is and your
>lower instincts are taking over: Don't understand? Smash it!

No, not really.  I know enough about this industry not to trust vendor
motives in such cases, having also noted that some HD vendors'
refurbished "serviceable used part" warranty replacements tend to die
within 6 to 18 months.  A bad HD is a bad HD.

>> The HD firmware may have relocated the bad
>> sector, with or without loss of that sector's contents, and if so, a
[quoted text clipped - 5 lines]
>Or a powersupply or ps-connector that has started to fail or is getting
>old and tired.

On what basis do you posit that?

>> You can judge whether all these auto-fixing shenanigans go about delaying
>> support calls until the HD warranty expires, or a genuine attempt to make
>> life smoother for the user

>> (even if unsuccessful sector moves lose data).

>Huh?

Well, if the vendor's happy to accept user data loss as a fair price
for keeping them off the support lines, then I think that really does
point away from a "customer value uber alles" mindset.

>-------------------- ----- ---- --- -- - -  -   -
 Running Windows-based av to kill active malware is like striking
 a match to see if what you are standing in is water or petrol.
>-------------------- ----- ---- --- -- - -  -   -
Folkert Rienstra - 29 Apr 2004 23:51 GMT
> On Wed, 28 Apr 2004 19:16:23 +0200, "Folkert Rienstra"
> > > On Tue, 27 Apr 2004 20:44:07 -0400, "PCR" <pcrrcp@netzero.net> wrote:
[quoted text clipped - 28 lines]
> in all permutations) to say.  The structures where this may happen
> include MBR, PBR, FATs, root, and subdir chains.

Ok, just saying that a bad sector in the system area may cause logical
FS errors.  It shouldn't but it may be different in practice, especially
under conditions other than Scandisk.

> > > Secondly, it's important to remember that while the HD's firm-
> > > ware and the OS's surface scanning do the same sort of thing,
>
> > No, they don't.
>
> They do, in the sense they react to difficulties in reading disk.  

Yes, but a drive doesn't "scan".
(Though SMART may do if set to do so).

> The support code for NTFS does this on the fly in XP,

Ok, I will have to believe you on that. That comes closer to what
a drive may do.

> whereas AFAIK the rest of the system flags the event in the FAT's
> "bad disk" bit so  that the next startup will run a surface check.
[quoted text clipped - 13 lines]
> In the sense that it takes too many retries to come up with correct
> CRC value, or whatever error-checking it does.

ECC, yes.

> > Obviously, it can only copy them when the sectors can still be read.
>
> Yes, but what do you think happens when it can't do this?  I suspect
> it just maps the address to the new sector and to hell with the contents.  

Of course not. That would create a very unreliable drive.
1st    you would have corruption without ever knowing it
2nd    you'd lose any opportunity to recover the data

There have been some Quantums though that indeed reassigned
bad sectors "by default". The default was changeable.

> After all, "everyone knows Windows sucks", so it's easy to
> play kill/bury/deny and let the user blame something else.

Pity harddrives aren't used exclusively for Windows.

> I've seen this often enough in data recovery:
>   - attempts to copy off file haks and haks on retries
>   - then suddenly it works with no delay

>   - but there's 512 bytes of void or garbage in the file

So actually it didn't work and the routine substituted an empty
sector or the actual bad data in the sector. It is possible
to read bad data from a sector by specific PIO commands.

> When copying stuff off sick HDs, I select source and dest and go
> Properties to see that the file and byte counts match.  With one
[quoted text clipped - 17 lines]
>
> Most likely, yes; I always interpreted them to be attempts to find the cylinder,

It is, from the 'zero' position.

> but they could be trips from source to destination (tho why
> it would go to dest without good data from source is a question)

It shouldn't be a question. Just No.

> or they could even be difficulties in writing to dest

Not normally but with Reassigns the target sector might well be write-checked.

> (more likely the case with Scandisk surface, which AFAIK chooses "far"
> clusters)
[quoted text clipped - 7 lines]
> something I'd accept as "good".  Remember, these are not inbuilt
> manufacturing defects but acquired defects, so they are NOT normal.

There is no such thing as an inbuilt defect, they are just defects that
turn up immediately, whereas "acquired defects" are just defects that
turn up later. They are the same type of defect except that the (to be)
acquired one still fell within the margins at the time the drive left the
factory. That is what the sparing system is designed for in the first
place, to take care of the sectors that fall through the margins.

> Plus, a head strike that damages disk can have three other effects:
>   - pollution of the sealed airspace with abrasive debris

Not a big problem.
The damage to the surface itself is a far greater risk to the head.

>   - head damage

Yup.

>   - damage to adjacent disk surface currently seen as "OK"

A latent bad sector that may become reassigned sooner or later as long
as it is accessed before it becomes an uncorrectable error bad sector.

> > > then Scandisk may timeout and call the cluster Bad, and do its own
> > > clustermap-level relocation.
[quoted text clipped - 4 lines]
> If there's data in the cluster, then Scandisk copies (what it can of)
> that data to another free cluster within the FAT.  

I'll assume that Scandisk will notify the user of lost data.

> The FAT chain is
> updated to point to this replacement cluster, while the original
[quoted text clipped - 18 lines]
> refurbished "serviceable used part" warranty replacements tend to die
> within 6 to 18 months.  

What is so strange about that? The fact that you got one of those
means that you sent in a dead one in the first place. They *do* die.
And what killed your original may well kill your replacement too.

> A bad HD is a bad HD.
>
[quoted text clipped - 9 lines]
>
> On what basis do you posit that?

From an IBM manual:
"
6.1 Data loss at Power off
o ...
o No more than one sector can be lost by power down during write operation
  while write cache is disabled.
o Power off during write operations may make an incomplete sector which
 will report hard data error when read. The sector can be recovered by a
 rewrite operation.
"

Like I said  "That is because you have no idea what a bad sector is " ;-)

> > > You can judge whether all these auto-fixing shenanigans go about delaying
> > > support calls until the HD warranty expires, or a genuine attempt to make
[quoted text clipped - 7 lines]
> for keeping them off the support lines, then I think that really does
> point away from a "customer value uber alles" mindset.

Quantum stopped doing that and now no drives reassign uncorrectable
read error bad sectors on reads anymore.

From that same IBM manual:

"
13.13.2 Nonrecovered read errors

When a read operation is failed after defined ERP is fully carried
out, a hard error is reported to the host system. This location is
registered internally as a candidate for the reallocation. When a
registered location is specified as a target of a write operation,
a sequence of media verification is performed automatically.
When the result of this verification  meets the criteria, this
sector is reallocated.

13.13.3 Recovered read errors
When a read operation for a sector has failed once and then has
recovered at the specific ERP step, this sector of data is reallocated
automatically. A media verification sequence may be run
prior to the relocation according to the predefined conditions.
"
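
Put as a toy model in Python, the two rules quoted above look roughly like this. The class and method names are invented for illustration and nothing here comes from the manual beyond those two rules. What happens to a candidate that passes verification is not stated in the quote, so dropping it from the list is an assumption.

```python
class DriveModel:
    """Toy model of the two reallocation rules quoted from the manual."""

    def __init__(self):
        self.candidates = set()    # LBAs registered after unrecovered reads
        self.reallocated = set()   # LBAs remapped to spare sectors

    def read(self, lba, outcome):
        """outcome is 'ok', 'recovered' or 'unrecovered' (after full ERP)."""
        if outcome == "recovered":
            # Good data was obtained, so it can be moved to a spare.
            self.reallocated.add(lba)
            return "ok"
        if outcome == "unrecovered":
            # No good data: report the error and remember the location,
            # but do not remap on the read.
            self.candidates.add(lba)
            return "hard error"
        return "ok"

    def write(self, lba, verification_meets_criteria):
        """A write to a registered candidate triggers media verification."""
        if lba in self.candidates:
            # Assumption: the candidate is cleared whatever the result.
            self.candidates.discard(lba)
            if verification_meets_criteria:
                self.reallocated.add(lba)
        return "ok"
```

The point of the model is that an unreadable sector stays visible to the host until something overwrites it.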
cquirke (MVP Win9x) - 01 May 2004 01:24 GMT
On Fri, 30 Apr 2004 00:51:28 +0200, "Folkert Rienstra"
>"cquirke (MVP Win9x)" <cquirkenews@nospam.mvps.org>
>> On Wed, 28 Apr 2004 19:16:23 +0200, "Folkert Rienstra"
>> > > On Tue, 27 Apr 2004 20:44:07 -0400, "PCR"

>> > > Firstly, Scandisk spends most of its time dealing with file system
>> > > logic errors that are unrelated to the HD's physical condition.
>> > > ...tho bad sectors can trip up its logic checks and repairs.
>> > Please explain.

>> Well, Scandisk may be intent only on checking the file system logic,
>> and trip over a bad sector within the file system structure

>Ok, just saying that a bad sector in the system area may cause logical
>FS errors.  It shouldn't but it may be different in practice, especially
>under conditions other than Scandisk.

Yes it can, but more to the point, an unexpected failure to read disk
can throw off the logic checker.  Most software ASSumes disk access
will always succeed; by its nature, Scandisk should half-expect
problems, but MS seems to have lost that sort of clue.

From what I recall, Scandisk does handle it gracefully (unless it
never emerged from the inevitable retry blues before the user's lost
patience and reset the system).  But I've not seen it often enough,
nor have I tested the effects of bad sectors in all possible contexts.

Whether it reports it non-ambiguously is another matter.

>> > > Secondly, it's important to remember that while the HD's firm-
>> > > ware and the OS's surface scanning do the same sort of thing,

>> > No, they don't.

>> They do, in the sense they react to difficulties in reading disk.  

>Yes, but a drive doesn't "scan".
>(Though SMART may do if set to do so).

Oh, ICWYM.  Yes, both HD firmware, NTFS's support code and FATxx's
flagging code operate opportunistically, whereas Scandisk surface scan
and ChkDsk /R formally scan the whole volume looking for errors.

>> > Obviously, it can only copy them when the sectors can still be read.
>>
>> Yes, but what do you think happens when it can't do this?  I suspect
>> it just maps the address to the new sector and to hell with the contents.  

>Of course not. That would create a very unreliable drive.
>1st    you would have corruption without ever knowing it
>2nd    you'd lose any opportunity to recover the data

Quite.  But from what I see, this is exactly what happens.

HD vendors aren't alone in this; the tendency from MS has been to
throw your data if that means fewer support calls:
 - Win98 defaulted auto-Scandisk to kill/bury/deny
 - WinME did same, but offered far less control to override this
 - XP offers NO control to override this, no interactive checking,
   plus on-the-fly rollback and "fixing" of bad clusters

No-one else has any responsibility for your data.  If you don't look
after it, no-one else will.

>There have been some Quantums though that indeed reassigned
>bad sectors "by default". The default was changeable.

>> After all, "everyone knows Windows sucks", so it's easy to
>> play kill/bury/deny and let the user blame something else.

>Pity harddrives aren't used exclusively for Windows.

That's most of the market, but now that you mention it, I think the
heads-up on what these HD vendors were doing may well have started
from the Linux community.  I suspect S.M.A.R.T. was a retro-fitted
window into what all this auto-fixing was doing, after some complaints
arose.  However, I find most sick HDs were NOT alerted by S.M.A.R.T.
even when this was enabled in CMOS setup (it's revealing that most
BIOS/CMOS settings duhfault to disabling S.M.A.R.T.)

>> I've seen this often enough in data recovery:
>>   - attempts to copy off file haks and haks on retries
[quoted text clipped - 3 lines]
>So actually it didn't work and the routine substituted an empty
>sector or the actual bad data in the sector.

Yep - without any alerts.  Let's be charitable and posit that a sector
full of nuls may match the same CRC or ECC value, especially if that
also is nul.  Or wonder if the vendor just shrugs off the implication.

>It is possible to read bad data from a sector by specific PIO commands.

At some point, even that must fail, surely?

>There is no such thing as an inbuilt defect, they are just defects that
>turn up immediately, whereas "acquired defects" are just defects that
>turn up later.

There's a difference.  Some "flat spots" have always been accepted,
and in the old days these were visible to the user.  HDs shipped with
a defect map pasted on the shell, and low-level format (LLF) utilities
allowed defect management such as entering this map, re-scanning to
build the map from fresh and so on.

The idea was always to identify all questionable sectors and exclude
these from use, so the rest could be relied upon.  The LLF marked the
sector preambles and IDs, and was used to "freshen" these should the
HD later become unreliable.  Before voice-coil servo positioning came
in, the heads were positioned by stepper motor much as they are for
diskettes, and hot HD platters could expand away from optimal
positioning; hence "warm up HD before format".

Even then, there was a distinction made between "manufactured" and
"acquired" defects.  If the only defects were those in the defect
list, then nothing's changed; the HD is as good as it was.  But when
new defects are acquired, it implies the disk is deteriorating.

Today, "manufactured" defects are hidden by the manufacturing process,
and the LLF is done at the factory and cannot be renewed.  Even
"acquired" defects can be hidden from the user, as described.  So by
the time defects are visible to Scandisk (noting how lax Scandisk
surface scan is about failing "slow" clusters) it's highly significant.

>They are the same type of defect except that the (to be)
>acquired one still fell within the margins at the time the drive left the
>factory. That is what the sparing system is designed for in the first
>place, to take care of the sectors that fall through the margins.

Point is, a sector that was "this" side of the margin that is now
"that" side of the margin represents a failing disk.  My policy is
zero tolerance to failing HDs; too much is at stake.

>> > > then Scandisk may timeout and call the cluster Bad, and do its own
>> > > clustermap-level relocation.
[quoted text clipped - 4 lines]
>> If there's data in the cluster, then Scandisk copies (what it can of)
>> that data to another free cluster within the FAT.  

>I'll assume that Scandisk will notify the user of lost data.

Same as for firmware auto-fixing - I wouldn't rely on that.  Some
partitioning/geometry errors can cause a wad of "bad clusters" to
appear at the end of the last volume (because the cluster address
space extends beyond the end of physical disk) and in such cases,
surface scan relocates do fail.

>> No, not really.  I know enough about this industry not to trust vendor
>> motives in such cases, having also noted that some HD vendors'
>> refurbished "serviceable used part" warranty replacements tend to die
>> within 6 to 18 months.  

>What is so strange about that? The fact that you got one of those
>means that you sent in a dead one in the first place. They *do* die.

Point is, the failure rate on refurbs is waaaay higher than "real" new
HDs, and refurbs were not always clearly marked as such (some butt has
been kicked there; now several vendors who used to claim they were
replacing with new HDs are doing refurbs marked as such).

When a HD fails under warranty, the replacement HD is warrantied only
as long as the original HD's warranty (irrespective of the date on the
HD).  Fair enough.  Sometimes you get a new replacement; sometimes you
wait for a "special" new replacement even when the same model is in
stock as new product, sometimes you get an acknowledged refurb.

Often, these replacement HDs fail considerably sooner than you'd
expect.  Seagate started the "refurb" thing (and more recently, were
first to go 1-yr rather than 3-yr warranty) and it was getting
ridiculous; I'd have clients with a Seagate that failed after 18
months who would go through two replacements in 3 months and refuse to
accept the third.  In some cases this was in a different PC running in
a different building, so...

>And what killed your original may well kill your replacement too.

...does not apply.  This is common where the user elects to get a new
HD for the PC (to speed turnaround time and upgrade the PC), while the
warranty replacement goes to an older PC to upgrade that.

>> > > Nonetheless, this is a HD that has started to fail.

>> > Or a powersupply or ps-connector that has started to fail or is getting
>> > old and tired.

>> On what basis do you posit that?

>From an IBM manual:
>"
[quoted text clipped - 8 lines]
>
>Like I said  "That is because you have no idea what a bad sector is " ;-)

In such cases, you shouldn't see increased latency elsewhere on the HD
when surface scanning it from DOS mode; yet that's what I usually do
see.  I've seen fresh refurbs do that too; no Bad clusters found, but
clearly, this is someone else's sick HD that's been papered over.

Think about this whole refurb thing.  I was told, with a straight
face, that refurbs replaced everything within the original drive shell
with brand new parts, as assembled in a clean room.  

Does that make *any* sense at all?  Is the value of the shell so
significant that it's cost-effective to manually rebuild the thing in
a clean room, rather than pull a new one off the conveyer belt?

I suspect all that happens is, one person's bad HD return gets the
"hide bad sectors" treatment using factory tools, then gets shipped to
someone else as a "serviceable used part".

>Quantum stopped doing that and now no drives reassign uncorrectable
>read error bad sectors on reads anymore.

Ask yourself why they did this, and why did they stop...

Quantum's gone.  IBM got class-actioned for the dodgy 75 series.  WD
'fessed up to bad batches and had the grace to accept returns of these
on a no-questions-asked basis - but I found ongoing high failure rates
on unlisted batches and HD models after the crisis passed.

Commodity drives are commodity-priced, HD vendors are struggling with
low margins, warranties have been getting flakier and shorter, and
several HD vendors have shaken out or consolidated.

Do I trust 'em?  Not much.  Empathise, yes; trust, no.

>------------ ----- ---- --- -- - -  -    -
 Our senses are our UI to reality
>------------ ----- ---- --- -- - -  -    -
Folkert Rienstra - 01 May 2004 14:25 GMT
> On Fri, 30 Apr 2004 00:51:28 +0200, "Folkert Rienstra"
> > "cquirke (MVP Win9x)" <cquirkenews@nospam.mvps.org>
[quoted text clipped - 49 lines]
>
> Quite.  But from what I see, this is exactly what happens.

Nope, it can't be. All the mfgrs manuals I checked stated the same.

> HD vendors aren't alone in this; the tendency from MS has been to
> throw your data if that means fewer support calls:
[quoted text clipped - 5 lines]
> No-one else has any responsibility for your data.  If you don't look
> after it, no-one else will.

Your drive does (as far as not automatically copying
bad data to replacement sectors, unnoticed).

> > There have been some Quantums though that indeed reassigned
> > bad sectors "by default". The default was changeable though.
[quoted text clipped - 23 lines]
> full of nuls may match the same CRC or ECC value, especially if that
> also is nul.  Or wonder if the vendor just shrugs off the implication.

By routine I meant software, not hardware.

> > It is possible to read bad data from a sector by specific PIO commands.
>
> At some point, even that must fail, surely?

Well, unless the magnetic sheet is scraped off completely there is data
there, whatever it is that is left there. The PIO command will fail but
the bad data is in the buffer nevertheless. Your recovery routine must
be written so that it copies the buffer even when fail is reported.
It's up to you whether you notify the user or not when you do.
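
A minimal Python sketch of such a routine follows. One loud caveat: ordinary OS-level reads give you no buffer at all when they fail, so this version substitutes a filler sector instead of the drive's bad data; getting at the real buffer needs the low-level commands mentioned above. The names salvage and read_sector are hypothetical, and read_sector stands in for whatever actually talks to the disk.

```python
SECTOR = 512

def salvage(read_sector, sector_count, filler=b"\x00"):
    """Copy sector_count sectors, never aborting on a read failure.

    read_sector(n) is a hypothetical callable that returns 512 bytes
    or raises OSError. Returns (image_bytes, bad_sector_numbers).
    """
    out = bytearray()
    bad = []
    for n in range(sector_count):
        try:
            chunk = read_sector(n)
        except OSError:
            # Keep going: pad the hole and remember where it was.
            chunk = filler * SECTOR
            bad.append(n)
        out += chunk
    return bytes(out), bad
```

Returning the list of bad sector numbers is what makes notifying the user possible; a routine that throws that list away produces exactly the silent 512-byte holes discussed earlier.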

> > There is no such thing as an inbuilt defect, they are just defects that
> > turn up immediately, whereas "acquired defects" are just defects that
[quoted text clipped - 5 lines]
> allowed defect management such as entering this map, re-scanning to
> build the map from fresh and so on.

Less automatic but still the same idea, I would think? And limited to
hardware maintenance. Software still won't know about it.
You can still look up the defects in SCSI and I believe they are working
on it for (S)ATA too. Not that the info serves any purpose other than
for people who have a need to worry about everything and nothing.

> The idea was always to identify all questionable sectors and exclude
> these from use, so the rest could be relied upon.  The LLF marked the
[quoted text clipped - 7 lines]
> "acquired" defects.  If the only defects were those in the defect
> list, then nothing's changed; the HD is as good as it was.  

> But when new defects are acquired, it implies the disk is deteriorating.

That is what disks do, they deteriorate.
The goal is that they do that very gradually and over a very long time.

> Today, "manufactured" defects are hidden by the manufacturing process,
> and the LLF is done at the factory and cannot be renewed.  

Nope, it can.
The servo writing cannot be renewed but the LLF is certainly possible.
IBM drives even have it implemented.

> Even "acquired" defects can be hidden from the user, as described.  

SCSI introduced it, ATA just followed.

> So by the time defects are visible to Scandisk (noting how lax Scandisk
> surface scan is about failing "slow" clusters) it's highly significant.

Unrecoverable read error bad sectors are never reassigned on reads
so they always will be visible to scandisk.

> > They are the same type of defect except that the (to be)
> > acquired one still fell within the margins at the time the drive left the
[quoted text clipped - 3 lines]
> Point is, a sector that was "this" side of the margin that is now
> "that" side of the margin represents a failing disk.  

Maybe.
This is expected to happen on an aging disk, and that is what it is designed
for. So when it actually happens that isn't necessarily cause for alarm.
The mechanism is also based on accessing a sector before it goes
unrecoverably bad. If that doesn't happen just because the sector is never
accessed then a sector may go bad and become visible. But overwriting that
sector will then take care of it.

> My policy is  zero tolerance to failing HDs; too much is at stake.

Like I said elsewhere, problem is how to determine a failing disk.

> > > > > then Scandisk may timeout and call the cluster Bad, and do its own
> > > > > clustermap-level relocation.
[quoted text clipped - 12 lines]
> space extends beyond the end of physical disk) and in such cases,
> surface scan relocates do fail.

Sorry, can't see the relevance of nonexisting sectors to notifying
the user of not copied data.

> > > No, not really.  I know enough about this industry not to trust vendor
> > > motives in such cases, having also noted that some HD vendors'
[quoted text clipped - 53 lines]
> see.  I've seen fresh refurbs do that too; no Bad clusters found, but
> clearly, this is someone else's sick HD that's been papered over.

Uhh, we are talking about his HD, not yours.

> Think about this whole refurb thing.  I was told, with a straight
> face, that refurbs replaced everything within the original drive shell
> with brand new parts, as assembled in a clean room.

Yes, that tale appears to go around.

> Does that make *any* sense at all?  Is the value of the shell so
> significant that it's cost-effective to manually rebuild the thing in
[quoted text clipped - 3 lines]
> "hide bad sectors" treatment using factory tools, then gets shipped to
> someone else as a "serviceable used part".

Which is not really all that wrong when it is rigorously tested first
in a climate chamber and under lowered voltage supply conditions.
I have a 8 year old IBM sitting somewhere with a manufacture date
of 3-4 years back. It was LLFed factory style, all grown defects
were added to the primary defect list and then some more because of
the lowered conditions. Then it was ruthlessly tested under the same
conditions. If it didn't pass the tests it was scrapped. I have another
one that passed the tests but they would have scrapped it because
they felt the Primary Defects number was too high for their liking.
Otherwise the drive was fine. Because it was mine I got it back.

> > Quantum stopped doing that and now no drives reassign uncorrectable
> > read error bad sectors on reads anymore.
>
> Ask yourself why they did this, and why did they stop...

Actually, I don't know if they really did that.
It was the manual that stated that it was the default.

> Quantum's gone.  

IBM too.

> IBM got class-actioned for the dodgy 75 series.  WD 'fessed up
> to bad batches and had the grace to accept returns of these on a
[quoted text clipped - 10 lines]
>   Our senses are our UI to reality
> > ------------ ----- ---- --- -- - -  -    -
PCR - 28 Apr 2004 22:21 GMT
| >It could be as Glee said, that it was auto-fixed by a chip on the hard
| >drive. But... indeed you do appear to have said Scandisk fixed the
[quoted text clipped - 7 lines]
| when Scandisk does a surface scan, does it look for defective
| clusters, tho bad sectors can trip up its logic checks and repairs.

Seems reasonable to me. Let me see what that nasty one (Rienstra)
said... "also scans the system area". Well! It may do so, but still be
looking for logic errors, as you say, cquirke. It is looking to see that
the two FATs match & their pointers are sensible.

| Secondly, it's important to remember that while the HD's firmware and
| the OS's surface scanning do the same sort of thing, they work at
| different levels, and the one is oblivious of the other.

That seems reasonable too, "same sort of thing" meaning they want to
know whether the hard drive can hold its data & "different levels"
meaning Scandisk gets what the other decides to pass it. It's a wonder
Scandisk should ever find a surface flaw, though. Well, I've never seen
a red-slashed cluster in Defrag, EVEN during my hard drive crash of
2001! However, ultimately, I refrained from running Defrag, & was just
doing six-hour Scandisks. What a horrible, stinging barrage of ugly,
dread error messages it finally did spew out! That is why the nastiness
of Rienstra/Gisin is nothing to me!

| At the hardware level, disk space is split up into physical 512-byte
| sectors.  These are addressed via the hard drive's own internal
| hardware; everything else, from the UIDE controller backwards, can
| only "ask nicely" for the HD to access these sectors.

Sounds reasonable to me. The HDD chip has first access, if the drive is
new enough to have such a chip that does repairs/remapping.

| If the HD's firmware defect management copies the contents from a
| failing sector, writes it to another sector, and from then on maps the
| old sector's raw address to the new one - then no software running on
| the PC is any the wiser, unless it can query the vendor-specific HD
| firmware in some way.  Scandisk and the rest of the OS can't do that;
| only HD-vendor-specific tools may have a chance there.

Sounds as reasonable now as it did in the ancient thread where we first
discussed it, if I may presume to call it a discussion. Or was it you &
Blanton? Then, it was a discussion.

| At the OS level, the OS sees an expanse of disk that has been set
| aside as one or more volumes for its use (according to the
| system-level partitioning scheme).  It divides the volume into a file
| system structure area and a data cluster area (a cluster contains
| multiple sectors).

1. MBR (Master Boot Record)
2. PBR (Partition Boot Record)
3. FAT1 & FAT2. (Pointers to used/available clusters)
4. Folders & files.

I guess those four are all there is, except for perhaps unused sectors
or portions of sectors.

| When Scandisk surface scan tests the cluster area, it can "fix"
| clusters with failing sectors by copying them to a new cluster
| address.  This is the addressing scheme it sees; not raw sectors.

I think you have said elsewhere, if Scandisk finds a bad surface, it
must really indicate a dead/dying drive. This is because the
chip/firmware on the HDD should only be passing it good sectors. I'm
thinking, the data on a remapped sector may have been copied badly (from
a bad one the chip found), but Scandisk should only discover a logic
error in that case. The surface should be fine.

| >Uhhhhh, I think, as the drive now passes all scans, it is still good.
|
| I'm less sure.  The thing to do is use the drive vendor's diagnostics,
| or failing that to do a Scandisk surface scan in DOS mode so that you
| can watch the cluster progress counter.

cquirke, you snipped out where I said to do that: "Go on, make a full
system backup, though. And next week run those scans again!" But, I do
think, unless there are multiple occasions of surface errors, one may
continue with the hard drive, theoretically.

| When the OS (including Scandisk surface scan) accesses a failing
| sector, it may take several retries before the HD finds the sector and
[quoted text clipped - 22 lines]
|
| Maybe the HD defect was the good reason for the crash?

I think I surely would take the precaution of a full system backup &
occasionally run the manufacturer's tool &/or a Scandisk /Surface. But as
it has passed further scans, I still believe it was possibly the chicken
that came first, not the egg!

| >-------------------- ----- ---- --- -- - -  -   -
|   Running Windows-based av to kill active malware is like striking
|   a match to see if what you are standing in is water or petrol.
| >-------------------- ----- ---- --- -- - -  -   -

Thanks or Good Luck,
There may be humor in this post, and,
Naturally, you will not sue,
should things get worse after this,
PCR
pcrrcp@netzero.net

Eric Gisin - 28 Apr 2004 23:58 GMT
> | Firstly, Scandisk spends most of its time dealing with file system
> | logic errors that are unrelated to the HD's physical condition.  Only
> | when Scandisk does a surface scan, does it look for defective
> | clusters, tho bad sectors can trip up its logic checks and repairs.

Scandisk only deals with logical errors, unless you specify /surface which
checks for bad sectors.

> Seems reasonable to me. Let me see what that nasty one (Rienstra)
> said... "also scans the system area". Well! I may do so, but still be
[quoted text clipped - 14 lines]
> dread error messages it finally did spew out! That is why the nastiness
> of Rienstra/Gisin is nothing to me!

Whatever.

> | At the hardware level, disk space is split up into physical 512-byte
> | sectors.  These are addressed via the hard drive's own internal
[quoted text clipped - 3 lines]
> Sounds reasonable to me. The HDD chip has first access, if the drive is
> new enough to have such a chip that does repairs/remapping.

It's the firmware, not the chip. Bad sector reallocation has been around for
10 years.

> | If the HD's firmware defect management copies the contents from a
> | failing sector, writes it to another sector, and from then on maps the
[quoted text clipped - 17 lines]
> 3. FAT1 & FAT2. (Pointers to used/available clusters)
> 4. Folders & files.

The standard terms are:
2: Boot sectors; 3: FATs; 4: Clusters
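
To make those terms concrete, here is a rough Python sketch of how the
regions of a FAT volume follow one another, and how a cluster number maps
to a sector number counted from the start of the volume. The class and
field names are made up for illustration, and the numbers in the usage note
are invented, not read from any real drive.

```python
from dataclasses import dataclass


@dataclass
class FatLayout:
    """Illustrative (hypothetical) description of a FAT volume's regions."""
    reserved_sectors: int     # boot sector(s) plus any other reserved sectors
    num_fats: int             # normally 2: FAT1 and its backup copy, FAT2
    sectors_per_fat: int      # size of one FAT copy, in sectors
    root_dir_sectors: int     # fixed root directory on FAT12/16; 0 on FAT32
    sectors_per_cluster: int  # cluster size chosen when the volume was formatted

    def first_fat_sector(self) -> int:
        # The FATs start right after the reserved (boot) area.
        return self.reserved_sectors

    def first_data_sector(self) -> int:
        # The cluster (data) area starts after the FATs and any fixed root directory.
        return (self.reserved_sectors
                + self.num_fats * self.sectors_per_fat
                + self.root_dir_sectors)

    def cluster_to_sector(self, cluster: int) -> int:
        # Data clusters are numbered from 2, so cluster 2 is the first one.
        if cluster < 2:
            raise ValueError("data clusters are numbered from 2")
        return self.first_data_sector() + (cluster - 2) * self.sectors_per_cluster
```

With an invented layout such as FatLayout(32, 2, 1000, 0, 8), the data area
begins at sector 2032, so cluster 2 starts there and cluster 3 starts 8
sectors later. The point is only that a cluster is an offset the file
system computes, not something the drive itself knows about.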

> I guess those four are all there is, except for perhaps unused sectors
> or portions of sectors.
>
> | When Scandisk surface scan tests the cluster area, it can "fix"
> | clusters with failing sectors by copying them to a new cluster
> | address.  This is the addressing scheme it sees; not raw sectors.

It truncates the file at the first bad cluster. It cannot recover bad sectors.

> I think you have said elsewhere, if Scandisk finds a bad surface, it
> must really indicate a dead/dying drive. This is because the
> chip/firmware on the HDD should only be passing it good sectors. I'm
> thinking, the data on a remapped sector may have been copied badly (from
> a bad one the chip found), but Scandisk should only discover a logic
> error in that case. The surface should be fine.

If scandisk sees a bad sector, it means it had too many bad bits for ECC
correction. That happens whenever power is lost during a write. It is also
caused by long media defects, which could be a minor scratch or a failing
head.

> | >Uhhhhh, I think, as the drive now passes all scans, it is still good.
> |
[quoted text clipped - 29 lines]
> | calls until the HD warranty expires, or a genuine attempt to make life
> | smoother for the user (even if unsuccessful sector moves lose data).

The only way to tell if your drive is failing is to run the manufacturer
diagnostics.
PCR - 29 Apr 2004 07:00 GMT
| > | Firstly, Scandisk spends most of its time dealing with file system
| > | logic errors that are unrelated to the HD's physical condition.  Only
[quoted text clipped - 3 lines]
| Scandisk only deals with logical errors, unless you specify /surface which
| checks for bad sectors.

I think that's what cquirke said, all right. It does seem to me,
though, that the longer /Surface scan incorporates/begins with
the shorter System scan. IOW, I doubt there is a way to get JUST a
/Surface.

Well, actually, here is the blurb from "START, Run, Scandisk,
Thorough"... "Checks the files and folders on the selected drives for
errors, and also checks the physical integrity of your disk's surface.
To change the settings that ScanDisk uses to check files and folders,
click Advanced. To change the settings that ScanDisk uses to check the
surface of your disks, click Options."

And this, at "...Options, Do not perform write testing" (which is
generally how I run it)... "Specifies whether ScanDisk reads the
contents of each sector of your disk but does not write it back or
whether ScanDisk reads the contents of each sector and then writes the
contents back to verify that the disk can be read from and written to
correctly."

Well, I guess I am getting a read test that way ("Do not perform...") of
every cluster, on top of a System check.

| > Seems reasonable to me. Let me see what that nasty one (Rienstra)
| > said... "also scans the system area". Well! I may do so, but still be
[quoted text clipped - 16 lines]
| >
| Whatever.

Believe me, it was quite a traumatic experience, for me & for my
ceiling!

| > | At the hardware level, disk space is split up into physical 512-byte
| > | sectors.  These are addressed via the hard drive's own internal
[quoted text clipped - 6 lines]
| It's the firmware, not the chip. Bad sector reallocation has been around for
| 10 years.

I mean the chip on the hard drive, which is firmware. (I guess it was
Blanton I picked up "chip" from.)

| > | If the HD's firmware defect management copies the contents from a
| > | failing sector, writes it to another sector, and from then on maps the
[quoted text clipped - 20 lines]
| The standard terms are:
| 2: Boot sectors; 3: FATs; 4: Clusters

Whatever. But I was a tad more descriptive!

| > I guess those four are all there is, except for perhaps unused sectors
| > or portions of sectors.
[quoted text clipped - 4 lines]
| >
| It truncates the file at the first bad cluster. It cannot recover bad sectors.

I leave you in cquirke's hands, (poor fellow)...
http://users.iafrica.com/c/cq/cquirke/scandisk.htm

| > I think you have said elsewhere, if Scandisk finds a bad surface, it
| > must really indicate a dead/dying drive. This is because the
[quoted text clipped - 7 lines]
| caused by long media defects, which could be a minor scratch or a failing
| head.

Watch it! Rienstra may thrash you! Scandisk deals with clusters, which
do contain multiple Sectors. I think it's really best when the HDD
chip/firmware discovers a surface flaw. Then, it will map away a single
sector, 512 bytes by cquirke's count. The smallest cluster will be about
4 KB.

| > | >Uhhhhh, I think, as the drive now passes all scans, it is still good.
| > |
[quoted text clipped - 32 lines]
| The only way to tell if your drive is failing is to run the manufacturer
| diagnostics.

I believe doing so may be preferable to waiting on Scandisk.

Signature

Thanks or Good Luck,
There may be humor in this post, and,
Naturally, you will not sue,
should things get worse after this,
PCR
pcrrcp@netzero.net

cquirke (MVP Win9x) - 29 Apr 2004 16:19 GMT
>"cquirke (MVP Win9x)" <cquirkenews@nospam.mvps.org> wrote in message

>| >Uhhhhh, I think, as the drive now passes all scans, it is still good.

>| I'm less sure.  The thing to do is use the drive vendor's diagnostics,
>| or failing that to do a Scandisk surface scan in DOS mode so that you
>| can watch the cluster progress counter.

>cquirke, you snipped out where I said to do that: "Go on, make a full
>system backup, though. And next week run those scans again!"

No, I didn't miss that (tho snip it I did).  I just don't think that's
a safe enough alternative to dumping a defective HD - but then I am
assuming the HD is to hold material the user wants to see again.

>But, I do think, unless there are multiple occasions of surface errors,
>one may continue with the hard drive, theoretically.

That's where we disagree.  If you want HDs with "just one bad
cluster", I have a shelf full here (in case I need the logic boards).
Most of these will pass surface scan without showing any new bad
clusters, but then again, most of these will show patchy latency that
you'd miss if you ran the test unattended, or used Windows Scandisk.

>--------------- ----- ---- --- -- -  -    -
  Who is General Failure and
  why is he reading my disk?
>--------------- ----- ---- --- -- -  -    -
PCR - 29 Apr 2004 21:43 GMT
Although I did have a hard drive crash late in 2001 (just after the
warranty did expire), still most of what I say is theoretical.
Theoretically, as OP said there was a Windows crash before his Powermax
discovered an error, it could well be that the crash caused something
that was interpreted to be a surface flaw. It's gone: he scans
well now. If it should show up again, especially this time w/o a crash
of Windows first, then I would worry. (It could happen w/o a crash of
Windows, if it happens in a non-system/sensitive area.) But I understand
your side of it, I guess. So, better do a full system backup. That's
all.

I do know, when my hard drive did crash, there was little doubt about
it. It put a hole in the ceiling where my own head smashed through!

The HDD heads hung on for about a week, though. I was still able to boot
once/twice a day after multiple noisy attempts, & Windows appeared to
work flawlessly for up to 20 mins. I guess this part of the story
supports your thought, in that one could have 20 mins. of bliss on an
HDD that one knows must be full of flaw. I guess the firmware was going
nuts remapping it all away. But it was obvious the thing was dying. (I
only wish I could recall what those Scandisk errors were, but it was too
traumatic! And did I see a red-slashed cluster in Defrag?)

Signature

Thanks or Good Luck,
There may be humor in this post, and,
Naturally, you will not sue,
should things get worse after this,
PCR
pcrrcp@netzero.net

| >"cquirke (MVP Win9x)" <cquirkenews@nospam.mvps.org> wrote in message
|
[quoted text clipped - 24 lines]
|    why is he reading my disk?
| >--------------- ----- ---- --- -- -  -    -
Folkert Rienstra - 29 Apr 2004 23:27 GMT
> > "cquirke (MVP Win9x)" <cquirkenews@nospam.mvps.org> wrote in message
>
[quoted text clipped - 9 lines]
> No, I didn't miss that (tho snip it I did).  I just don't think that's
> a safe enough alternative to

> dumping a defective HD -

Problem is, what *is* a defective HD?

> but then I am assuming the HD is to hold material the user wants to see again.
>
[quoted text clipped - 6 lines]
> clusters, but then again, most of these will show patchy latency that
> you'd miss if you ran the test unattended, or used Windows Scandisk.

Well, to run the test properly you should run a zero wipe first so
that any trace of a previous externally caused problem is erased.
Eric Gisin - 28 Apr 2004 15:09 GMT
What did I say to piss you off? Are you one of the idiots who think disks have
clusters?

> It could be as Glee said, that it was auto-fixed by a chip on the hard
> drive. But... indeed you do appear to have said Scandisk fixed the
[quoted text clipped - 9 lines]
> on, make a full system backup, though. And next week run those scans
> again!
glee - 28 Apr 2004 19:38 GMT
Have you never in your life posted a wrong word, in error?
Signature

Glen Ventura, MS MVP W95/98 Systems
http://dts-l.org/goodpost.htm

> What did I say to piss you off? Are you one of the idiots who think disks have
> clusters?
[quoted text clipped - 12 lines]
> > on, make a full system backup, though. And next week run those scans
> > again!
Hugh Candlin - 28 Apr 2004 20:05 GMT
> What did I say to piss you off?

Does "What a moron" sound familiar?

> Are you one of the idiots who think disks have clusters?

No. I'm one of the volunteers in microsoft.public.win98.gen_discussion
who is secure enough that I see no need to insult people less technically
knowledgeable than average, as traumatic discontinuities usually occur.

I would much rather demonstrate my grasp of the technicalities by explaining
the intricacies of the subject at hand in a straightforward manner.

Perhaps you could have explained that cluster is a logical concept, not a physical entity.
Instead, you didn't explain ANYTHING, leading the cognoscenti to believe
that you are nothing but a clueless Google jockey looking for attention.

I learned a long time ago, in a galaxy far far away,
that knowledge doesn't make you intellectually superior
to anyone else if you act like a prat while dispensing it.

Cluster Size

An operating system function or term,
describing the number of sectors
that the operating system allocates
each time disc space is needed.
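
As a rough illustration of that definition, the Python sketch below rounds
a file size up to whole clusters. It assumes 512-byte sectors, as quoted
elsewhere in this thread; the function names are made up for the example.

```python
SECTOR_BYTES = 512  # assumed sector size, as quoted elsewhere in the thread


def clusters_needed(file_bytes: int, sectors_per_cluster: int) -> int:
    """Number of whole clusters the OS must allocate for a file of this size."""
    cluster_bytes = SECTOR_BYTES * sectors_per_cluster
    # Ceiling division: any partly used cluster still counts as a full one.
    return -(-file_bytes // cluster_bytes)


def slack_bytes(file_bytes: int, sectors_per_cluster: int) -> int:
    """Allocated but unused bytes at the end of the file's last cluster."""
    cluster_bytes = SECTOR_BYTES * sectors_per_cluster
    return clusters_needed(file_bytes, sectors_per_cluster) * cluster_bytes - file_bytes
```

With 8 sectors per cluster (4 KB clusters), a 1-byte file still takes one
whole cluster, and a 4097-byte file takes two.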

"Piss me off"?  No.  You simply demeaned yourself and your reputation.
Eric Gisin - 28 Apr 2004 20:18 GMT
> > What did I say to piss you off?
>
> Does "What a moron" sound familiar?

Appropriate way to deal with someone who repeats a false statement, and tries
to use google "hard disk clusters" as proof.

> > Are you one of the idiots who think disks have clusters?
>
[quoted text clipped - 4 lines]
> I would much rather demonstrate my grasp of the technicalities by explaining
> the intricacies of the subject at hand in a straightforward manner.

No, I am not going to explain disk fundamentals to newbies. I gave him
pcguide.com, which does so.

> Perhaps you could have explained that cluster is a logical concept, not a physical entity.
> Instead, you didn't explain ANYTHING, leading the cognoscenti to believe
> that you are nothing but a clueless Google jockey looking for attention.

Now you are name calling too. I am not supposed to?

> I learned a long time ago, in a galaxy far far away,
> that knowledge doesn't make you intellectually superior
[quoted text clipped - 8 lines]
>
> "Piss me off"?  No.  You simply demeaned yourself and your reputation.

Hardly. I just don't have time for idiots.
chrisv - 29 Apr 2004 13:59 GMT
>> I would much rather demonstrate my grasp of the technicalities by explaining
>> the intricacies of the subject at hand in a straightforward manner.
>
>No, I am not going to explain disk fundamentals to newbies. I gave him
>pcguide.com, which does so.

I agree.  It's silly to explain the same thing over and over, when Web
pages (or googling the newsgroups) have the answers.
Hugh Candlin - 29 Apr 2004 22:23 GMT
> > Eric Gisin <ericgisin@graffiti.net> wrote in message
> news:c6oenu03bs@enews4.newsguy.com...
[quoted text clipped - 4 lines]
> Appropriate way to deal with someone who repeats a false statement, and tries
> to use google "hard disk clusters" as proof.

I would have handled it differently, but I am sure that
I am not going to convert you to my way of thinking.

> > > Are you one of the idiots who think disks have clusters?
> >
[quoted text clipped - 7 lines]
> No, I am not going to explain disk fundamentals to newbies. I gave him
> pcguide.com, which does so.

A favorite reference of mine also, and one that I have frequently pointed
people to.  I would simply have supplied the reference,
and left off the "moron" comment.

> > Perhaps you could have explained that cluster is a logical concept, not a physical entity.
> > Instead, you didn't explain ANYTHING, leading the cognoscenti to believe
> > that you are nothing but a clueless Google jockey looking for attention.
> >
> Now you are name calling too.

No.  I am not.  You are reading more into that than I said.

If you check my record, you will see that I am NOT in the habit
of throwing insults at people, although I assure you that I am
perfectly capable of doing so.

I didn't say that the conclusion was true, just that your conduct
was indistinguishable from that of such people.  Big difference.

> I am not supposed to?

I expressed my opinion on that already.

> > I learned a long time ago, in a galaxy far far away,
> > that knowledge doesn't make you intellectually superior
[quoted text clipped - 10 lines]
> >
> Hardly. I just don't have time for idiots.

What are you going to do when everyone is smarter than you are?
chrisv - 30 Apr 2004 15:47 GMT
>I would simply have supplied the reference,
>and left off the "moron" comment.

Well, AlmostBob's google comment was pretty moronic.  Hell, I just
googled "dinosaurs alive" and got over 100,000 hits - I don't think
that's proof that dinosaurs are alive...   8)
chrisv - 30 Apr 2004 19:51 GMT
>>I would simply have supplied the reference,
>>and left off the "moron" comment.
>
>Well, AlmostBob's google comment was pretty moronic.  Hell, I just
>googled "dinosaurs alive" and got over 100,000 hits - I don't think
>that's proof that dinosaurs are alive...   8)

For that matter, I googled "rod speed human" and got over 200,000
hits!

8)
PCR - 30 Apr 2004 21:57 GMT
I am not a proponent of the use of such terms as "moronic". However, if
users of such terms saw fit to apply it to AlmostBob's Google search,
what could they possibly think of these Google searches YOU are coming
up with?

Signature

Thanks or Good Luck,
There may be humor in this post, and,
Naturally, you will not sue,
should things get worse after this,
PCR
pcrrcp@netzero.net

|
| >>I would simply have supplied the reference,
[quoted text clipped - 8 lines]
|
| 8)
Eric Gisin - 30 Apr 2004 23:15 GMT
It's a f.cking joke! Get a life, luser!

> I am not a proponent of the use of such terms as "moronic". However, if
> users of such terms saw fit to apply it to AlmostBob's Google search,
[quoted text clipped - 12 lines]
> |
> | 8)
glee - 01 May 2004 03:33 GMT
> Well, AlmostBob's google comment was pretty moronic.  Hell, I just
> googled "dinosaurs alive" and got over 100,000 hits - I don't think
> that's proof that dinosaurs are alive...   8)

You mean, they're not??
http://www.sandiegozoo.org/calendar/wap_dino_mtn.html
PCR - 01 May 2004 05:01 GMT
That's Hardmeier's zoo, I guess.

Signature

Thanks or Good Luck,
There may be humor in this post, and,
Naturally, you will not sue,
should things get worse after this,
PCR
pcrrcp@netzero.net

"chrisv" wrote in message
news:05j490htpqf63obkgetgj3chl4puum22qk@4ax.com...
> Well, AlmostBob's google comment was pretty moronic