|Meet the Gang 1 2 3 4 5 6 7 8 9|
Answered By Faber Fedor, Guy Milliron, Ben Okopnik, Robos, Karl-Heinz Herrman
Today I rebooted my happily working winXP/RH 7.2 system to a grub error 17. I can boot with a grub floppy into windows (chainloader +1), but not Linux. When I try to mount the linux partition in rescue mode (booting a Redhat 7.2 CD) and mount the partition it doesn't work.
[Faber] Partitioning problems! I like these!
An error message here would be nice, but you did so well in the rest of the message, I'll forgive you this time.
Here is the listing from fdisk -l:
Disk /dev/hda: 255 heads, 63 sectors, 2482 cylinders Units = cylinders of 16065 * 512 bytes Device Boot Start End Blocks Id System /dev/hda1 1025 1723 5614717+ 7 HPFS/NTFS /dev/hda2 1718 2416 5614717+ 83 Linux /dev/hda3 2417 2481 522112+ 82 Linux swap
If I run "fdisk /hda" and go (x)pert mode and then (v)erify partition table I get "warning: partition 1 overlaps partition 2. 16466623 unallocated sectors"
And as you may have noticed in the fdisk listing, my ntfs partition does indeed end after the beginning of my linux partition.
[Faber] So? Un-overlap it!
A flippant answer, you say? That's what I'm known for! But seriously, let's think about this...
Somehow, someway, your partition table got flaky. Now, the overlap occurs at the end of one parition and the beginning of another. What are the chances that you have Windows data residing at the end of hda1? If there is a good chance, then you're frelled, and you have learned why you should never put your operating system, your user data, and your application all on one partition.
If you had, say, put / on hda2, /home on hda6 and /usr on hda7, then in your scenario only / would be affected and that could be taken care of with a re-install as a worse case. As it is, a re-install would end up wiping out everything on hda2.
However, if there is a low chance that there is data at the end of hda1, you might/should be in good shape.
So, I'm assuming there is no Windows data written to the end of hda1, which means the Linux data is still on the hard drive (the partition table is read by the computer to determine where the partitions start and stop; there is no division done on the hard drive itself).
So I don't see why you can't fire up fdisk, and go in and set the end of hda1 to block 1717. Write the changes to disk, shut down the machine, sacrifice two chickens under a full moon (which it is tonight, so you're lucky you don't have to wait another month) and restart the machine. If you sacrificed the right kind of chickens (which is left as an exercise for the reader), it will come back up.
That actually occured to me, but I thought I could only make things worse by manually toying with this stuff.
[Ben] There's a far higher probability of making things worse by letting some automatic process twiddle with it. I'm afraid it doesn't usually get resolved by divine intervention, so the manual method is what's left.
Uninformed manual twiddling is something to be afraid of. Dynamite is not of itself dangerous; however, it becomes a terrifying thing when handled by the ignorant. Knowledge is the key factor that makes all the difference. Just to throw in my $0.02, Faber has hit this particular nail on the head. Also, note that just changing the disk parameters as he has suggested is fairly harmless, as long as you don't write any data to those partitions; if you write down the current numbers, you can always revert to them in the worst case (however, you already know that they're wrong, so that's not much help.)
One question though - what command under fdisk do I use to set the end of a partition?
[Ben] 'x' to get you into expert mode, 'c' to change the number of cylinders. Again, writing down the current values is a good thing, even if it's of marginal value in this specific case.
When I try to fsck /dev/hda2 I get "Bad magic number in super-block while trying to open /dev/hda2"
[Faber] That makes sense. fsck needs to read the super-block to do it's thing. It assumes that the first super-block is at block...uh...1 *from the beginning of the partition*. f that super-block is frelled, you can try the backup superblocks; the first one is located at block 32 and the others are located Ghu knows where on your system, but I'm sure we could devise a way to find out if you pleaded very nicely.
My guess is that the first super-block is located in the overlap area, so that wouldn't help you anyway.
So try my suggestion <spooky music>if you dare</spooky music> and let us know how it turns out.
Oh, and next time, make some backups...
I'm at the end of my rope here. There is a small amount of data on the partition I'd really like to retrieve. I can't think of anything unusual that I've done recently to cause this problem - certainly nothing with my partition tables.
Regarding the last sentences I have an idea: might it have been windows doing some "defragmenting"? Someone quite recently told me that win packs the stuff it intends to move to some other place temporarily at the end of the partition since there is most of the time some space left. This, in conjunction with your overlapping partitions, might have been the end of that particular linux partition (and might also not be saved by the methods destribed by the others since they assume that the data in the linux part is still intact...so I'd recommend 2 chicken, 2 ox and, if handy, a virgin...)
I appreciate any help you guys can offer.
PS: Error 17 is described in the GRUB manual (http://www.gnu.org/manual/grub-0.90/html_mono/grub.html) as "17 : Cannot mount selected partition: This error is returned if the partition requested exists, but the filesystem type cannot be recognized by GRUB."
[Guy] I just wanted to say, Faber, very well done. Few people really understand Partitioning so well.
I spent about a year working at STAC Electronics in SQA (Software Quality Assurance - AKA Alpha Stage Testing) and nearly had intimate relations with HD's and their functionality.
For those lost in this conversation, Faber answered a question concerning HD Partitioning very well. STAC Electronics writes a program called Stacker (Double your disk capacity - Runtime/Real Time disk compression) For DOS/W!n3.x-9x and OS/2. I was the lead tester for the software as it came downstairs from the programmers.
Well, I still haven't been able to effect any permenant change with fdisk. If I do "fdisk /dev/hda" the (c)hange command described before wants to change the number of cylinders on the whole disk.
[K.-H.] Hmm.... I don't know that "c" command, maybe that's what it does. Were you in expert mode?
If I do "fdisk /dev/hda1" and do that command
[K.-H.] bad idea. the partition table you want to change is definitely at /dev/hda and nowhere else. I guess what you did change is a "partition table " at the beginning of hda1 (therefore changing the first block of hda1 which may or may not be important, it's the Win boot sector IIRC).
it seems to ask the right question "Number of cylinders? Default: 699" ...699 being about right. You get 698 when you subtract 1723 from 1025 (See my fdisk listing).
[K.-H.] If there is no size change command do in fdisk (norml mode, not expert): p for the table
then delete the partition 1 (d or r ?)
and create a new one wih the same start cylinder but the correct end cylinder number. Then it's smaller. This only changes the data in the partiton talbe and nothing on the drive itself and is simply a resizeing of the partiton. The deletion in the partition table will not delete anything on the partition itself.
Now, when I try to set it to a lower number (692 by my subtraction) and (w)rite to partition table it calls ioctl() etc. and does its thing, but the partition table is still the same when I "fdisk -l"...Am I missing something here?
Also, if the beginning of the ext3 partition have been written to, are there no recovery tools to get the data back? I know in FAT land, Norton Utilities has saved my bacon more than once in similar situations.
[Robos] Concerning the last part, I one heard that the norton clone midnight commander has some option to undelete stuff, alternatively there is an undeletion howto at linuxdoc.org. But I never tried either of 'em... Concerning the fdisk part:
sfdisk has four (main) uses: list the size of a partition, list the partitions on a device, check the partitions on a device, and - very dangerous - repartition a device.
This is the tool for the real "nerves made of steel" types. Never used it (I'm such a sissy ) but a friend of mine uses it in cases like yours.
[K.-H.] You can run e2fsck and it will do what it can -- but overwritten data at the beginning of the drive is overwritten data. The problem is that you never know which of your files got corrupted unless you can check then one by one. You could try to keep the inode numbers e2fsck reports somewhere (logfile option of e2fsck?) and there is an option to ls to show the inode numbers of files and compare, but it's tedious work -- even to write the script/program doing the work.
if your NTFS has written over your ext3 partition it trashed all information in the first part of hda2 -- including inode information, data blocks,... So e2fsck in that part does not understand it's inodes anymore, maybe misinterprets some of them causing even more problems. On the other hand you may be quite lucky and it's not overwritten at all or e2fsck can repair the rest of the partition without to much problems. You don't have the original partition table around? Do you maybe remember with what "+500M" or whatever size parameters you made them? That would help a lot in finding the real cylinder border between hda1 and hda2.
I used DOS fdisk to make the first partition (later converted to NTFS by WinXP) and Disk Druid to install the rest of it. So I don't know what parameters were used to create the table. Also, I don't have the old partition table. Is that something I would keep around if I were L33T? I guess it would be easy to print out once the system is up and running.
[K.-H.] I don't know if I'm L33T, but I do keep printed (i.e. on paper) partiton tables around or at least digital version on other computers.
I started with this behaviour when suddenly my partition table on a multi boot system got messed up (NT 3.51 was told by me to install itself in a logic partition where other logic partitions were used by Linux -- NT chose to disregard this wish of mine.... This was evolutionary not a wise thing to do, one NT down . At that time fortunately I knew that I made my partitions by giving Linux fdisk sizes in round numbers (like +500M) and fdisk did indeed calculated the same cylinder boundaries. (Data fully recovered, only the partition right after NT was clobbered -- which was / and had no user data /usr and /home stayed untouched (but were gone in the partition table)).
There's a tip for the future.
[K.-H.] I think so, yes.
As for the FAT -> NTFS conversion causing this problem, I doubt it because that happened a couple of months ago.
So how dangerous to my winXP partition is this operation? I will, at worst, lose some data if it happens to be at the end of the partition, right? I probably won't make the NTFS partition un-bootable, right? (I use "probably" because I know nothing is certain in this case)
[K.-H.] Hmm.... NT usually had all important stuff at the beginning of the drive. Win9X always shows some block at the partition end as "system and unmovable" but on the other hand after that Partition Magic can resize the partition easily. So I think neither Win Fileformat has usually important thing at the end.
So I think you should be reasonably save from damaging the XP partiton completely. And if XP did increase its partition over its original size there shouldn't be anything belongig to XP anyway.
e2fsck has some option to just check but change nothing -- that could help testing if by this partition change you can recover the Linux partition. If this seems to work let it write and hope for the best.
from man e2fsck:
-n Open the filesystem read-only, and assume an answer of `no' to all questions. Allows e2fsck to be used non-interactively. (Note: if the -c, -l, or -L options are specified in addition to the -n option, then the filesystem will be opened read-write, to permit the bad-blocks list to be updated. However, no other changes will be made to the filesystem.)
|Meet the Gang 1 2 3 4 5 6 7 8 9|