Too Busy For Words - the PaulWay Blog

Preventative partitioning

The virus which is attacking my nose and throat is not letting up, and while the lab I work in is really interested in identifying viruses quickly, I don't think they want to be identifying this first-hand. But I'm not doing myself any favours if I stay up until 2:30 in the morning hacking.

Last night's adventure was redoing that command that I've been killing the MythTV machine with. My first suspect was the mirrored LV - something about mirrors came up in the messages when I did it. The second time, I ran pvmove with the -v (verbose) option, which showed the following bit of logging:

[root@media ~]# pvmove -v /dev/hdb /dev/hda3
    Wiping cache of LVM-capable devices
    Finding volume group "vg_storage"
    Archiving volume group "vg_storage" metadata (seqno 80).
    Creating logical volume pvmove0
    Moving 56559 extents of logical volume vg_storage/lv_storage
    Moving 86 extents of logical volume vg_storage/lv_swap
    Moving 0 extents of logical volume vg_storage/lv_backup_root
    Found volume group "vg_storage"
    Found volume group "vg_storage"
    Updating volume group metadata
    Creating volume group backup "/etc/lvm/backup/vg_storage" (seqno 81).
    Found volume group "vg_storage"
    Found volume group "vg_storage"
    Suspending vg_storage-lv_storage (253:0)
    Found volume group "vg_storage"
    Found volume group "vg_storage"
    Suspending vg_storage-lv_swap (253:1)
    Found volume group "vg_storage"
    Creating vg_storage-pvmove0
    Loading vg_storage-pvmove0 table
  device-mapper: reload ioctl failed: Invalid argument
  ABORTING: Temporary mirror activation failed.  Run pvmove --abort.
    Found volume group "vg_storage"
    Loading vg_storage-pvmove0 table
  device-mapper: reload ioctl failed: Invalid argument
    Loading vg_storage-lv_storage table
  device-mapper: reload ioctl failed: Invalid argument
    Found volume group "vg_storage"
    Loading vg_storage-pvmove0 table
  device-mapper: reload ioctl failed: Invalid argument
    Loading vg_storage-lv_swap table
  device-mapper: reload ioctl failed: No such device or address
[root@media ~]# pvmove --abort

And there it fails. The ioctl makes me think that moving a PV that sits on a raw disk device (rather than in its own partition) must be what's causing it. Maybe the IO commands just don't like working on that part of the disk. So, test number three - try moving the data off the 160GB disk. Nothing wrong with that, the PV is all on a partition just fine.

Nup. Same problem again.

This time I'm quicker to restore the system to working order. I booted up off the System Rescue CD, but I couldn't issue the pvmove --abort command because the VG isn't complete (because I've unplugged one drive for the DVD). That's OK. I resize my root partition, create a new, small partition in the available space, copy the root filesystem off the System Rescue CD (i.e. from where the CD has uncompressed it - nifty trick: the partition image has a header that's a shell script that loads the cloop driver and mounts itself), copy the initrd and vmlinuz files off where the CD boots from, mangle up a new item in grub's config with those files, and, after changing the drives back so the LVM can initialise correctly, boot it.

It doesn't quite go perfectly, but it gets a long way further than I'd feared. I spend a bit of time trying to work out how this particular mangling of gentoo boots, but I don't get too far. So I just mount the uncompressed file system where it should be, run pvmove --abort and check the LVM state: everything comes up good. Reboot again and it's all working.

So now I'm making my own little equivalent to Dell's system rescue install - an unobtrusive partition that will give me all the tools to get the MythTV machine back on line without having to fiddle with cables and disable the LVM and so forth. Maybe I should sit down and make something like this - a version of the System Rescue CD, or a tool within its boot config, to install a version of the System Rescue CD on a partition, including (if possible) plumbing the correct options into grub to get it to come up as an option. It makes a lot of sense for those times when you want to get around a failure but can't put in a CD (or can't get in the box to attach one).

Maybe all the true Linux gurus just boot off their distro's rescue CD and go from there, though.

Last updated: | path: tech / fedora | permanent link to this entry

Tue 21st Nov, 2006

Not learning the hard way

It's now mid Sunday, and I'm mentally and physically wrecked. This is partly due to a throat infection thing that's going around, and partly due to weird LVM stuff, and partly due to sheer bloody-minded stupidity.

It started a couple of days ago, when I took a day off because of a sore throat. I decided, finally, to upgrade my MythTV machine to Fedora Core 6, and in the process remove the old boot drive and change over to a new Logical Volume (LV) under LVM. After several attempts at this problem before, I'd decided to use a mirrored LV to store the root volume on. Luckily I had three disks - mirroring in LVM requires a disk per mirror and an extra for the 'transaction log' - and I set it up, copied the old root file system to the new mirror, and that was enough for Fedora Core to recognise in order to install.

But, after a couple of mysterious crashes that ended up in file system checks throwing pages and pages of errors, I started really wondering. LVM is wonderful and stable and allows you to agglomerate disks in ways you would otherwise pay lots of money for hardware solutions to achieve, but my experience so far is that when it goes bad, it starts getting rather difficult to recover. Having the root file system stored in a way that I wasn't sure I could ever recover if one disk went bad - all FAQs and HowTos to the contrary - I decided to go back to plain old partitions.

I bought a 400GB disk for $199 at Aus PC Market with the intention of pensioning the 160GB drive off in the MythTV machine, and giving it a bit more recording headroom. But for various otiose reasons Friday knocked me out and I started feeling very congested and the sore throat had returned from Wednesday. Unable to sleep, I put the new drive in the MythTV machine, partitioned it, copied all the files from the old root file system across, and booted it - it came up fine. In a fit of what seemed at the time to be inspiration but I now know to be a madness brought on by addiction to Lemsip, I also decided to move the data off one of the 250GB drives temporarily so I could partition it.

Long ago when I was setting up the system, I had realised that LVM PVs can be created on the raw disk device as well as in partitions. This sounded like a brilliant idea - no partition to worry about, LVM could put LVs on it anyway, and one less command to perform. Interestingly, you also get about 96MB of extra space. However, this decision has come back to haunt me.

Firstly, back when I was first trying to eliminate the old 40GB disk, I wanted to have a three-way RAID. LVM doesn't do that, but MD does. But you need three partitions the same size. I couldn't repartition /dev/hdb because, well, there wasn't a partition on there to alter. So that idea eventually went out the window.

Now, I thought, I could lay the problem to rest. I had used pvmove before to move space off a SATA disk that I'd bought without knowing that (at the time, at least) the way SATA drives are accessed also causes my DVB cards to stutter (I think it's something to do with DMA, but I haven't traced this down). So, innocently, I issued pvmove /dev/hdb /dev/hda3.

Nothing happened. It wouldn't respond to Ctrl-C or Ctrl-Z (although other characters, uselessly, came up fine). Then every process that tried to access the LVM also seized up. "OK," I thought, "reboot and it'll be fine." But no: rebooting threw up a bunch of errors about a bad LVM state and the kernel panicked. It's 5AM and I'm not feeling well and I have a dead MythTV machine - brilliant.

Of course, to add to my complications, I had returned to the old four-drive problem - I had to unplug one of the LVM drives (and thus render the LVM inoperable) in order to plug in the DVD drive to install something. I had the old MythTV partition still backed up in LVM (hopefully), so I reinstalled Fedora Core 6 from scratch (after a bunch of fruitless searching about how to disable the LVM checks at boot-up - it's possible, but you have to edit the nash init script and repack your initrd image and even then it didn't work perfectly; I was hoping for a nice kernel command-line option). Oh, and I have to install in Text mode because I didn't feel like lugging the monitor from downstairs, and even though the NVidia GeForce 5200 will display boot-up on all monitors and TV sets you have plugged in, it won't thereafter show any graphical modes on the TV without options in the Xorg config. Yay.

The new Fedora Core install allowed me to do a pvmove --abort, which then allowed me to see the storage VG and the old root VG. "Hell," I thought, "while I'm here I'll just rebuild the thing from scratch - I've got too much ATRPMS kruft in there anyway." That merrily ate up the hours from six until nine - copying config across, setting daemons to start, turning unwanted services off, updating the repository config with local mirrors, getting the video drivers working again, and so forth.

That night, I woke up for otiose reasons at about four in the morning. Unable to get back to sleep, I decided to look at the config again. The wool in my head and the nettles in my throat made me decide that retrying the pvmove command would be perfectly reasonable - it must have been a temporary glitch. This time, just in case, I dd'd the entire newly-created partition over to another system on my network, created a new 'old root' LV that wasn't striped, mirrored or afraid of water, copied the old 'old root' LV over to that, and removed the old one just in case it was something to do with the mirroring that had caused LVM to bork out. Now secure in my preventative measures, I issued the pvmove command again.

Same result. System locked up.

I rebooted, this time using the System Rescue CD, which allowed me to see the network and the partitions. Right, copy the dd image back again, and reboot... Nope, same problem. Worse, now the LVM partition on /dev/hda3 doesn't exist. Hmmmm. This is bad. Hmmmmm. /dev/hda3 sounds familiar - with that growing horror that computer problems specialise in, I realise that I copied the 20GB partition to /dev/hda3 (the LVM PV) rather than /dev/hda2 (the ext2 file system). Bugger. I can boot, and everything runs, but now the VG won't come up because one of its PVs is AWOL.

I tried grabbing the first couple of sectors of another PV, inserting the correct UUID (which, fortunately, the VG still knows about and includes in its complaints) in the correct spot (after a bit of guesswork - thank Bram Moolenaar for the binary editing capabilities of vim). Nup, no luck - didn't think I could fool it that easily. No-one in any of the IRC channels I was in could offer any assistance (#lvm on freenode is usually quiet as a grave anyway).

One of my worst habits is the way I avoid any problem that's stumped me a bit. Several games of Sudoku, Spider and Armagetron and a lot of idle chatting on various IRC channels later, I was still no nearer a solution. Then, realising that no-one was going to help me and I had to do it myself, I probed around in the options of pvcreate, and found I could specify a UUID. Brilliant! Suddenly the PV, VG and LV were back on the air. Five hours after I'd woken up, I collapsed back into bed. It was Sunday. (At this point, LVM hadn't put anything permanently in the /dev/hda3 PV, so it was merely a question of making sure it was included.)
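The pvcreate-with-a-UUID trick can be sketched roughly as follows. This is a sketch under assumptions, not a record of the exact commands used here: the function name is made up, the UUID is a placeholder, and the --restorefile and vgcfgrestore steps are what the LVM documentation describes for a lost PV label. The script only prints the commands, so nothing is touched until they have been read and checked.

```shell
#!/bin/sh
# Print (do not run) the commands to re-create a lost PV label with a known UUID.
# Assumptions: the UUID is the one the VG complains about, and the VG metadata
# backup lives in /etc/lvm/backup/<vg> (the path shown in the pvmove log).
recreate_pv_commands() {
    uuid=$1
    device=$2
    vg=$3
    backup="/etc/lvm/backup/$vg"
    echo "pvcreate --uuid $uuid --restorefile $backup $device"
    echo "vgcfgrestore -f $backup $vg"
    echo "vgchange -ay $vg"
}

# Placeholder UUID; substitute the one reported for the missing PV.
recreate_pv_commands "<missing-pv-uuid>" /dev/hda3 vg_storage
```

This only makes sense when, as here, nothing had yet been allocated on the lost PV; if it held real data, a fresh label would not bring that data back.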

That afternoon, I made sure that MythTV was going to update its programme guide and relaxed, watching a few TV shows. It seemed an uncommon luxury.

Wed 19th Jul, 2006

Disks Gone Mad

Or: keep on going when you've lost sight of your aim. I should know by now. Whenever I think I'm doing pretty well at something, and I think I've got a reasonable handle on it, in about forty-five seconds (on average) something is going to come along and shatter that perception entirely. The more confident that I've been that some technological idea will work, the more likely that I'll be sweating and swearing away in six hours' time still trying to fix the broken pieces, having long since forgotten what I changed, what my objectives were, and why it seemed like such a good idea in the first place.

This was brought home to me forcefully yesterday. At 9:15, in my cycling pants ready to go to work, I thought "I'll just switch over the root partition labels and bring the MythTV machine back up". At 9:18 I was looking at a kernel panic, as it failed to find the new root disk, because either LVM and MD weren't started yet or my clever tripartite disk wasn't set to come up automatically. At 9:30 I was burning a copy of the Fedora Core 5 rescue CD. At 11:45 I'd successfully put the labels back but the thing was getting to the point where it loads the MBR off disk and stopping. At 1:45 I was going through grub-install options with a patient guy on the #fedora channel. At 2:00, in desperation I pulled all the drive connectors off except the one I was trying to boot off. Success! I felt like curling up in bed. I still had my bike pants on.

Of course, I'd made things more difficult for myself. I have four IDE drives in this system, so I had to unplug one in order to put the IDE CD-ROM drive in. Which meant either the LVM would be down because of a missing disk, or I couldn't access the 40GB drive that I was wanting to restore boot functionality to. What I hadn't thought of was that the CD-ROM was a master device - when I put it in place of another master device that IDE chain would be fine, but when I put it in place of an IDE slave the two masters would get grumpy and not speak to anyone, which was causing the boot lockup. And of course I was being quick-and-dirty and not swapping the CD-ROM out and the correct drive in when I rebooted just in case I needed it again. So I also made a one hour problem into a six hour problem by just not thinking.

I can only assume that there are a couple of readers of Planet Linux Australia out there chuckling away to themselves at my LVM / MD exertions. Because, in hindsight, MD on LVM makes no sense whatsoever. If one of the disks in a VG goes missing, and that disk has allocated blocks on it, the whole VG is considered dead. This then means that the entire MD is dead, and no amount of persuasion is going to bring it back. This is why an MD device needs to go on a raw disk partition - because MD itself is then doing the fault tolerance, not LVM. Lesson learnt.

Just got to keep thinking, I guess... And pull my head in.

Mon 17th Jul, 2006

Crazy LVM Partitioning continued

Today, I decided to finally attack the root partition on my MythTV box. By "attack", here, I mean "move off the 40GB drive", or at least do as much of this process as I can. After a bit of a think, what I wanted was a RAID5 array, but LVM doesn't offer RAID. MD does, but all the space on the drives in question is used by LVM. Presto chango, I pvmoved a bit of space around until I had 8GB of space free on each drive, then I made three logical volumes (called, unoriginally, lv_root_1, lv_root_2, and lv_root_3) and created a RAID5 device across them with mdadm -C /dev/md1 -l raid5 -n 3 -p ra /dev/vg_storage/lv_root_*. This time I kept the mkfs parameters fairly standard; then I mounted it and started copying the root directory across with cp --preserve=all -rxv / /mnt/test.
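As a quick sanity check on what that array gives back, RAID5 spends one member's worth of space on parity. A minimal sketch of the arithmetic (the function name is made up for illustration):

```shell
#!/bin/sh
# Usable capacity of a RAID5 array: one member's worth of space goes to parity.
raid5_usable_gb() {
    members=$1
    member_gb=$2
    echo $(( (members - 1) * member_gb ))
}

raid5_usable_gb 3 8   # three 8GB LVs: prints 16
```

So three 8GB logical volumes yield about 16GB of root volume, before file system overhead.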

The thing that most impresses me about this process is that the copying is taking about 5% (on average) CPU for the actual cp process, and barely any time at all for the md1_raid5 and kjournald processes. So doing RAID5 in software certainly doesn't require a huge grunty CPU (this is an Athlon 2400, yes, but it hasn't even broken into a sweat yet). This'd all be possible on a VIA EPIA motherboard... And, once it's finished, I'll have a root volume that can stand a complete drive failure before it starts worrying, and when it does I'll simply add a new drive, create a new PV, add it to the VG, create a new LV for the new drive, add the drive as a new hot spare, and remove the faulty LV; all at my leisure.

The fact that I can understand all this and think it 'relatively simple' gives me a small measure of pride. One of the few things I would thank EDS for in the time I spent there was sending me on the Veritas volume management training course. LVM is still pretty easy to get a grip on without that kind of training, but it's still made things a little easier.

(Educated readers would be asking why I specified -p ra rather than using its default, ls - or, in other words, why use the parity write policy of right asymmetric rather than left symmetric? There's no particularly good reason. Firstly, I want asymmetric rather than symmetric to spread the parity load across disks, as is consistent with RAID5. Secondly, when I see an option like this I tend to want to choose the non-default option because, if all options are tested equally but most people use the default, then any failure mode is more likely to be found in the default case, and it may not affect the non-default case. It's a version of the "all your eggs in one basket" argument. I don't give it much weight.)

And all of this over a command line, through SSH, to home. I love technology (when it works...)

Fri 14th Jul, 2006

Why I Love Linux part 002

Or: Linux Disk Craziness continued

I'm reconfiguring my MythTV server so I can remove the 40GB disk - it's old, it's dwarfed by the other drives and I'd like to have one IDE channel back for the DVD drive. This involved a bit more fun with LVM, so I thought I'd document it...

The first task was building a non-LVM /boot partition. That was accomplished by using pvresize. I wanted 100MB, but the LVM tools display size in GB, allow you to set it in MB, but round up to an extent size. I played it safe and reduced it by three times as much as I thought I'd need. Then I used fdisk to resize the LVM partition and add a new partition. First challenge: fdisk doesn't allow you to modify a partition, you can only delete it and then create it anew. Luckily I remembered to change the type of the new (LVM) partition to 8e. Second challenge: fdisk only allows you to specify partition sizes in number of cylinders or sectors. So a couple of quick back-of-envelope calculations later, I had something which was roughly the right size. I then formatted the new partition, using -i 16384 because /boot contains relatively few files that are relatively large - this saved a relatively trivial 3MB on copying. I then used pvresize to expand the LVM area back to its maximum extent inside the partition. All done.
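The back-of-envelope sums for that step look something like the sketch below. Both defaults are assumptions rather than values taken from this machine: a 4MB extent size (the LVM2 default; vgdisplay shows the real one) and the common 255-head, 63-sector geometry that fdisk usually reports. The function names are made up for illustration.

```shell
#!/bin/sh
# Round a size in MB up to a whole number of LVM extents.
# The 4MB extent size is an assumption (the LVM2 default); check vgdisplay.
extents_for_mb() {
    size_mb=$1
    extent_mb=${2:-4}
    echo $(( (size_mb + extent_mb - 1) / extent_mb ))
}

# Round a size in MB up to whole fdisk cylinders.
# 255 heads x 63 sectors x 512 bytes is an assumed (common) geometry.
cylinders_for_mb() {
    size_mb=$1
    cyl_bytes=$(( 255 * 63 * 512 ))
    echo $(( (size_mb * 1024 * 1024 + cyl_bytes - 1) / cyl_bytes ))
}

extents_for_mb 100      # prints 25
cylinders_for_mb 100    # prints 13
```

Rounding up in both places is the safe direction for the new partition; the PV, on the other hand, needs to be shrunk by at least that much before fdisk is let loose, which is why over-shrinking and then growing it back with pvresize is the cautious order.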

The second task was creating a swap partition. Because I'm a speed-mad power freak, I wanted a stripe across all three LVM physical disks. Ooops, the current LV already takes up all free extents on the first two disks, so I go to work with pvmove. After I remember to specify /dev/hdc1:(starting PE)-(ending PE), rather than /dev/hdc1:(starting PE)-(number of extents), this works rather nicely. Then it's a simple matter of swapoff -a, lvcreate -L 1G -i 3 -n lv_swap vg_storage, mkswap /dev/vg_storage/lv_swap, vi /etc/fstab to change the swap device, and swapon -a to get it working again. That's the easy part. :-)
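The start-to-end form is easy to get wrong by one, so here is a tiny helper as a sketch. The function name and the extent numbers are hypothetical; the script only prints the pvmove command rather than running it.

```shell
#!/bin/sh
# Turn "start PE + number of extents" into the start-end form pvmove expects.
pe_range() {
    start=$1
    count=$2
    echo "$start-$(( start + count - 1 ))"
}

# Hypothetical numbers: 250 extents beginning at PE 1000 on /dev/hdc1.
echo "pvmove /dev/hdc1:$(pe_range 1000 250)"
```

The end PE is inclusive, hence the minus one: 250 extents starting at 1000 finish at 1249, not 1250.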

The third task is to create a root partition. Hmmm, slight problem: There's not enough space in the volume group for a 20GB root partition. Further problem: the storage LV in the main VG is formatted as XFS, and XFS can't be shrunk (at the moment). OK, I'm going to have to think about this one. But, driven mad by power now, I consider how I'd configure it: how about a mirror of two three-disk-striped LVM arrays? At first blush it sounds reasonable. How fast would it be? I set up two 1GB LVM partitions, and, delving into mdadm, create a new /dev/md1 as a RAID1 mirror across them (the full command was mdadm -C /dev/md1 -l mirror -n 2 /dev/vg_storage/lv_test_1 /dev/vg_storage/lv_test_2). time dd if=/dev/zero of=/mnt/test/thing count=200000 (displaying my great ability to choose meaningless names) reports 2 seconds to write a 100MB file. That's pretty good, I reckon.
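For the record, the sums behind that figure, as a rough sketch (function names are made up): dd writes count blocks of 512 bytes unless told otherwise, so count=200000 is a shade over 100 million bytes.

```shell
#!/bin/sh
# Bytes written by dd: count x block size (dd's default block size is 512 bytes).
dd_bytes() {
    count=$1
    bs=${2:-512}
    echo $(( count * bs ))
}

# Whole MiB per second for a given byte count and elapsed seconds.
throughput_mib_s() {
    bytes=$1
    seconds=$2
    echo $(( bytes / seconds / 1048576 ))
}

total=$(dd_bytes 200000)        # 102400000 bytes, i.e. roughly 100MB
throughput_mib_s "$total" 2     # prints 48
```

A caution on reading too much into it: without forcing the data to disk, a write this small may partly be measuring the page cache rather than the mirror itself.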

Of course, if any one disk goes down, that does take the whole thing with it, which is not exactly the required effect. But I'll remember that bit of the Veritas Storage Manager course sooner or later, and in the meantime I have larger fish to fry...

Wed 14th Jun, 2006

Why I Hate Linux Printing Part 02

Anton Blanchard might have shot Eric Raymond, and everyone might be happy, but, by Linus and the penguin that bit him, ESR's right about The Horror Of CUPS Printing. As far as I can see, the people who designed CUPS and the printconf-gui interface knew exactly what they were doing and wanted to provide a vaguely friendly interface on it, rather than actually trying to solve the problem of printing.

It all started when I upgraded the cups package and all my printers ceased to work. I found out that the package upgrade had decided that it was going to put all of the backend drivers in /usr/lib/cups/backend, but that cups itself still looks in /usr/lib64/cups/backend for them. A few symlinks later and at least the laserjet works correctly. You can read all about this in Bug 193987; basically the justification is that they're executables, not libraries, so they should be in /usr/lib. Because that makes so much more sense than /usr/bin/cups/backend, for instance.
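The symlink workaround can be sketched as a small function. This is an illustration, not the exact commands used: the function name is made up, and the real invocation is left as a comment because it needs root and touches system directories.

```shell
#!/bin/sh
# Symlink every backend found in one directory into another,
# skipping names that already exist in the destination.
link_backends() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    for f in "$src"/*; do
        [ -e "$f" ] || continue          # empty directory: nothing to link
        name=$(basename "$f")
        [ -e "$dst/$name" ] || ln -s "$f" "$dst/$name"
    done
}

# On the affected x86_64 machine this would be run (as root) as:
#   link_backends /usr/lib/cups/backend /usr/lib64/cups/backend
```

Once a fixed cups package lands, the links should be removed again so they don't mask whatever the package installs.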

Because of this problem, I plugged the USB printer into my testing (i386) machine and got it working. "Fine," says I, "I'll just print to it across the LAN." "Hah-hah!" said cups, "We're going to make things as difficult for you as possible!" I followed Eric Raymond's steps and managed, at least, to get cups to bind to the network interface. But it now still steadfastly refuses to either broadcast itself or let other people use it; every time I run the actual printer configuration program, it resets everything so that other people can't see it either. This is just A1 weapons-grade stupid.

OK, maybe there's something screwy in my configuration. Why can't something tell me this? Why did the problem with the 64-bit backend drivers being moved manifest itself as an unknown lpr error, rather than something meaningful? Why did it appear at all? (It seems to be gone in the latest update to cups, 1.2.1-1.7, crossed fingers and touching wood.) Why doesn't the configuration un-screw itself? Why is it not Doing What I Say, let alone Doing What I Mean? And why do I seem to get no help on this at all?

Fri 26th May, 2006

Why I Hate Printing Under Fedora part 01

Printing under Fedora Core has been a mixed bag. The most recent installment of non-stop 360° fun was when I discovered all my printers had stopped working on my newish install of FC5. A bit of probing discovered that all the backend drivers that are supposed to be in /usr/lib64/cups/backend/ that actually do something - usb, socket (for JetDirect printers), half a dozen others - have just magically disappeared.

I do a bit of experimenting. yum whatprovides /usr/lib64/cups/backend/usb says that the cups package from the base repository provides them, but it's not installed. I download it, remove the old package with rpm -ev --nodeps cups and then rpm -ivh --nodeps --oldpackage cups-1.1.23-30.2.x86_64.rpm to install the base package. Goodie, all the backend drivers are back, but ugh, cupsd now fails with client error 127, whatever the hell that means. yum upgrade upgrades cups, which then gleefully (I swear I heard the words "A working cups installation! Let's adger it thoroughly with pitchforks!" coming from the motherboard...) removed all the backend drivers again.

And no-one on #fedora seems able to help me. I'm up to begging for help on the blogosphere / lazyweb.

(Maybe it's an x86_64 thing - my test i386 machine seems to have both the old package and the backend drivers. But that still doesn't help me ...)

Tue 2nd May, 2006

Return Of The SELinux Security Contexts

I wrote my own SELinux policy file today!

Today I realised my CGI pages weren't coming up because the scripts weren't allowed to connect to, read and write the Postgres socket. (They also seem to require the ability to getattr and read the krb5.conf file, and I have absolutely no freaking clue why, because my code doesn't use Kerberos in any way). I'd done a bit of research and found the command:

grep <error messages> /var/log/messages | audit2allow

A bit of questioning on #selinux on irc.freenode.org revealed that if I did audit2allow -M <name>, I'd get a module with the name I gave, so that later I can identify what particular policy modules I've loaded into SELinux and what they do (rather than just 'local'). (They even have version numbers too!) The module .te file is a text file, so you can edit it. Including all the various permissions you want to set in one file means that when you compile and load it with:

checkmodule -m -M -o <name>.mod <name>.te
semodule_package -o <name>.pp -m <name>.mod
semodule -i <name>.pp

(audit2allow -M does all but the last of these for you automatically) then you can have all your policy revisions in one neat place, rather than grabbing each separate error, making a separate policy for it, and then probably overwriting the last policy module you called 'local' or whatever.

Neat.

Now all I have to learn is how to create new SELinux types, so I can say "only these HTTP scripts are allowed to read and write to this directory". Then I will truly know what the hell I'm doing. Possibly.

Wed 26th Apr, 2006

Learning SELinux-fu 101b

Incidentally, this is much better than the audit2allow method of fixing this problem, which just blasts a new rule to cover that specific case in. This might solve access to the directory but not then allow access to the files therein, requiring further audit2allow calls to fix, and so on. You're better off finding out what the original policy was for this daemon and then adding a new rule that covers your new configuration.

It seems so easy, I wonder why I haven't found it before...

Although I still want to know where the policy rules file is so I can make sure it's backed up...

Learning SELinux-fu 101

Today I get to play around with SELinux all day, because today I'm trying to get all the services running that need to be. Because I've moved some of the directories around (to put all my data on /opt, out of habit), and restored some of those directories from DAR backup, not only do the files not have the right contexts but the rules for determining contexts aren't in place for those new directories. So, after a bit of Q&A time with the folks in #selinux on irc.freenode.org, I worked out how to use the semanage command.

semanage fcontext -l | grep mysql told me what I needed to know about the existing context rules. With a bit of copy and paste,

semanage fcontext -a -t mysqld_db_t "/opt/mysql(/.*)?"
restorecon -v -R /opt/mysql

installed the new rule and relabelled the /opt/mysql tree. Finally I found out that I had to put a [client] section into the /etc/my.cnf file with a socket line to tell it to look in the new path for the socket, and all was well.

Ironically, the server was starting just fine; it was the 'check that the server is now running' part of the script that was failing. It took me a while to work this out... :-/

Mon 10th Apr, 2006

SELinux Strikes Back!

Now I've found that my USB hard drive isn't mounting because of these SELinux denials:

Apr 10 09:37:52 biojanus kernel: audit(1144625872.015:1953): avc:  denied  { getattr } for  pid=2316 comm="hald" name="/" dev=dm-1 ino=2 scontext=system_u:system_r:hald_t:s0 tcontext=system_u:object_r:file_t:s0 tclass=dir
Apr 10 09:37:52 biojanus kernel: audit(1144625872.015:1954): avc:  denied  { getattr } for  pid=2316 comm="hald" name="/" dev=sdc1 ino=2 scontext=system_u:system_r:hald_t:s0 tcontext=system_u:object_r:file_t:s0 tclass=dir
Apr 10 09:37:52 biojanus kernel: audit(1144625872.167:1955): avc:  denied  { getattr } for  pid=2316 comm="hald" name="/" dev=dm-1 ino=2 scontext=system_u:system_r:hald_t:s0 tcontext=system_u:object_r:file_t:s0 tclass=dir
Apr 10 09:37:52 biojanus kernel: audit(1144625872.167:1956): avc:  denied  { getattr } for  pid=2316 comm="hald" name="/" dev=sdc1 ino=2 scontext=system_u:system_r:hald_t:s0 tcontext=system_u:object_r:file_t:s0 tclass=dir
Apr 10 09:37:52 biojanus kernel: audit(1144625872.335:1957): avc:  denied  { getattr } for  pid=12272 comm="hal-system-stor" name="/" dev=sdc1 ino=2 scontext=system_u:system_r:hald_t:s0 tcontext=system_u:object_r:file_t:s0 tclass=dir
Apr 10 09:37:52 biojanus kernel: audit(1144625872.335:1958): avc:  denied  { getattr } for  pid=12272 comm="hal-system-stor" name="/" dev=sdc1 ino=2 scontext=system_u:system_r:hald_t:s0 tcontext=system_u:object_r:file_t:s0 tclass=dir
Apr 10 09:37:52 biojanus kernel: audit(1144625872.339:1959): avc:  denied  { search } for  pid=12280 comm="touch" name="/" dev=sdc1 ino=2 scontext=system_u:system_r:hald_t:s0 tcontext=system_u:object_r:file_t:s0 tclass=dir

... and so on. Now to figure out how to fix it...
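A wall of denials like that is easier to read once it's boiled down to who was denied what. A minimal sketch with sed, assuming the log format shown above (the function name is made up):

```shell
#!/bin/sh
# Reduce AVC denial lines to: command, permission(s), source context,
# target context, object class. Lines that don't match are dropped.
avc_summary() {
    sed -n 's/.*denied  *{ \([^}]*\) }.* comm="\([^"]*\)".* scontext=\([^ ]*\) tcontext=\([^ ]*\) tclass=\([^ ]*\).*/\2 \1 \3 \4 \5/p'
}

# Typical use, counting repeats of the same denial:
#   grep 'avc:' /var/log/messages | avc_summary | sort | uniq -c
```

Run over the messages above, every line collapses to the same story: processes in hald_t being refused getattr or search on a directory labelled file_t.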

Fri 7th Apr, 2006

Scratching the Surface of SELinux

In restoring the server to its former glory I needed to restore the installation of Lahey Fortran for our old Fortran programmers. (With Fortran programmers, as with Fortran, one learns to not question why they need to do something, or to ask them to learn something new, but to just work with what they're giving you. I'm not about to start asking them why they can't write their programs in some version of Fortran that the GNU Fortran compiler supports...)

Unfortunately, running the program gave me the error "library name: cannot restore segment prot after reloc: Permission denied.". A Google on this error message showed me that it's caused by SELinux, which doesn't just allow anyone to come along and install new shared object libraries - you have to make sure that they're set to be a shared library (type 'textrel_shlib_t'). So applying the command 'chcon -v -t textrel_shlib_t /path/to/library' made everything suddenly work. And made me learn a little more about how SELinux sits in with all the other parts of Linux.

Learning: it's a good thing.

Thu 30th Mar, 2006

How not to install Fedora Core 5

I wanted to reinstall the operating system on my server. After my totally newbie install of Fedora Core 2 test 3, then using the development and bleeding-edge repositories and adgering the Python system (including up2date), and then using atrpms with innocent glee until I found out that they had a totally different and even nastier version of 'bleeding edge', the whole system felt as if it had been limping along being delicately prodded to stay upright. The final straw was the attack on another machine in the lab which, while it didn't seem to have actually broken into my machine, still didn't give me the Ring Of Confidence.

But here's how not to do an install of Fedora Core 5:

Start on the day you've just installed a Java system for another person in the lab - the day before he goes overseas for a conference.
Make sure the last disk image you have is slightly corrupted so that, after getting through all the other packages, Fedora Core says it can't install trivial-program-1.0.0.1-x86_64.rpm and has to reboot, leaving the entire install adgered.
Don't check your images against the SHA1SUM file but try for a while to install using the net boot disk off your local hard drive, getting the same install error as above in different places.
Use the i386 net install disk when you want to install an x86_64 server. Waste some more time finding and burning the x86_64 net install disk, as you can't quite get either the internal or the external burner to erase and write the disk correctly. Waste some more time finding this out the hard way by booting off a disk that's not correctly burnt.
When you finally get the system upright, find out that the portable USB disk that has its own power supply seems to not recognise any system you plug it into, with the exception of the Windows XP system that is, for some reason, adgered sufficiently that it cannot see anything on the network anywhere.
Once you've got the disk freed from its enclosure and plugged into a separate machine (because finding an IDE cable and installing it in the actual install machine is 'too hard'), remember to make the temporary logical volume too small to take all the files. If you're lucky, you can catch yourself doing this and re-make it before you actually copy anything across the network.
Remember to install specific rules in your backup scripts that don't back up your Thunderbird email or your scratch directory full of music. Especially, remember not to disable these rules when making that final backup.
Try restoring the differential backup first, without restoring the base backup. Assume that doing the first automatically also does the second. Install the files in the places they're supposed to go. Wonder why half your configuration doesn't seem to be there, or working. Waste some more time redoing it the right way. Here, again, you can be lucky if you learn from previous mistakes and don't just restore over the places things were backed up from but restore to a temporary directory and be selective about what you copy back.
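
The checksum step I skipped in the list above would have taken about a minute. A rough sketch, assuming the ISO images and the SHA1SUM file sit together in one directory (the verify_images function name and the directory layout are my own invention for illustration):

```shell
#!/bin/sh
# Sketch: verify downloaded ISO images against the SHA1SUM file
# before burning them or installing from them.

verify_images() {
    # $1 = directory holding the ISO images and the SHA1SUM file (assumed layout)
    dir="$1"
    if [ ! -f "$dir/SHA1SUM" ]; then
        echo "no SHA1SUM file in $dir" >&2
        return 2
    fi
    # sha1sum -c reads "checksum  filename" lines, reports each file as
    # OK or FAILED, and exits non-zero if any file does not match.
    ( cd "$dir" && sha1sum -c SHA1SUM )
}
```

Something like `verify_images /path/to/isos || echo "do not install from these"` before booting the installer would have saved me an afternoon.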

Why am I so stupid?

On the plus side, I learnt a few things:

Booting from the net install disk and then installing off a local hard drive's worth of ISOs is superior to burning a bunch of CDs in every respect. The fact that you can do this from a removable USB disk is all the sweeter.
dar is the backup program of the gods. It will back up and restore SELinux security information in version 2.3.0. Put a copy of dar_static on your backup media and away you go.
Use LVM for flexibility. Use Software RAID to span your root partition across two disks and get mirroring. Then use LVM to create everything else. If it made any sense I'd make an LVM logical volume as a hot spare for the software RAID, but they're all on the same disks anyway. This way I can add another disk and span the LVM across it, and then add the hot spare out of space on that Physical Volume.
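
The RAID-plus-LVM layout in the last point might look roughly like the sketch below. Every device name, volume name and size here is a made-up placeholder rather than my actual setup, and the run helper and DRY_RUN variable are illustrative: by default the script only prints the commands, because running them for real destroys data.

```shell
#!/bin/sh
# Sketch only: all device and volume names below are hypothetical.
# Nothing is executed unless DRY_RUN=0 is set (and you are root).

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

initial_layout() {
    # Mirror the root partition across two disks with software RAID 1.
    run mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    # Put the remaining space on both disks under LVM.
    run pvcreate /dev/sda2 /dev/sdb2
    run vgcreate vg_main /dev/sda2 /dev/sdb2
    run lvcreate -L 20G -n lv_home vg_main
}

add_third_disk() {
    # Span the volume group across the new disk.
    run pvcreate /dev/sdc1
    run vgextend vg_main /dev/sdc1
    # Carve the hot spare out of the new physical volume only; it must be
    # at least as large as the existing RAID members.
    run lvcreate -L 10G -n lv_spare vg_main /dev/sdc1
    # Adding a device to a healthy array makes it a spare.
    run mdadm /dev/md0 --add /dev/vg_main/lv_spare
}

initial_layout
add_third_disk
```

Restricting lv_spare to the new physical volume is the whole point: a spare that lives on the same disks as the mirror halves protects against nothing.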
