User:Djgreen/Linux Administration

From WolfTech

AFS Software List

Basic Commands

Prep Machine for KickStart

  • Make sure hostname has config file in /afs/bp/system/config/linux-kickstart/configs/ece
  • Make sure host is using PXE-all DHCP template in QIP.

Details on the keywords in the kickstart file:

Rename Linux Boxes

If the machine is using DHCP, I think you just need to switch it in QIP and reboot into the new lease / IP / hostname.

If the machine has a static configuration, you need to edit:




with the new values.

Try editing these files:




The hostname should be stored there.

After your box is on the network as the new name, run this command as root:

/usr/sbin/rhnreg_ks --force --activationkey <your_key_here>

Where <your_key_here> comes from the "activationkey" line in the web kickstart file. This will create a new "object" in Red Hat Network.

Configure Networking

[a] GUI tool (X Windows required) - system-config-network
[b] Command-line text-based tool (no X Windows required) - system-config-network (on RHEL the text-mode front end is usually installed as system-config-network-tui)
[c] Edit the configuration files stored in the /etc/sysconfig/network-scripts/ directory.


Disk Space

(2:54:22 PM) djgreen: best cmd to find out how much stuff is in a subdir? best cmd to find out available quota?

  • (2:55:13 PM) rsmclane: df -h tells you how much space is used/available on a partition
  • (2:55:52 PM) rsmclane: du -s . gives you how much space is used by a directory (that you're in)
  • (2:56:08 PM) rsmclane: du -s * will give you how much is used by all the sub directories in the current directory.
  • (2:56:29 PM) rsmclane: There are other options to du, but the -s (summary) is probably what you're most looking for.
  • (2:56:55 PM) rsmclane: By quota, I'm assuming you mean disk space on local disks, not AFS.
  • (2:57:05 PM) djgreen: both actually
  • (2:57:07 PM) ajbarnes: i like the -h option myself du -sh
  • (2:57:43 PM) rsmclane: Yeah, forgot the h on the du, but I had it on the df. h is human readable (MB, GB, etc instead of all in bytes)
  • (2:58:06 PM) rsmclane: AFS quota is fs lq
  • (2:58:25 PM) rsmclane: In the appropriate volume
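
The commands from the chat above can be combined into one small script. This is only a sketch: target_dir is a made-up variable name, the script defaults to the current directory, and the AFS step is skipped when the fs client is not installed.

```shell
#!/bin/bash
# Summarize disk usage for a directory and the free space on its partition.
# target_dir is a hypothetical name; it defaults to the current directory.
target_dir="${1:-.}"

# Total size of the directory itself, human readable (du -sh).
dir_total=$(du -sh "$target_dir" | cut -f1)
echo "Total used by $target_dir: $dir_total"

# Per-entry sizes one level down, largest first. Sorting uses kilobyte
# counts (du -sk) because a numeric sort cannot order "4.0K" against "1.2G".
du -sk "$target_dir"/* 2>/dev/null | sort -rn | head -n 10

# Used/available space on the partition that holds the directory.
df -h "$target_dir"

# AFS quota for the volume, only when the AFS "fs" command is available.
if command -v fs >/dev/null 2>&1; then
    fs lq "$target_dir" || true
fi
```

Run it from inside the AFS volume you care about (or pass the path as the first argument) so that fs lq reports on the right volume.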

Get Login Logs

On Linux, login records are kept in /var/log/wtmp, but that file is in a binary format, so the recommended interface is the last command.

last -n 100 will show the 100 most recent login entries.

You can also specify alternate wtmp files like

last -n 100 -f /var/log/wtmp.1

Controlling Access

It is possible to use pts groups to control access to Realm Linux.

cluster <cell> <PTS group>

is what goes in the config file. You can also hand edit /etc/update.conf if you don't want to re-install a box. It should look like this:



cluster> eos itecs-admin:helpdesk

where you replace the pts group with what group you want to use. If there is more than one pts group then you just add more cluster> lines.

Let us know if you need any further information. see also:

Remote Reinstall of Existing RHEL box

(11:45:53 AM) gsgatlin: djgreen: You can edit /boot/grub/grub.conf and change the default boot item to re-install this workstation.

(11:47:56 AM) gsgatlin: it starts numbering at 0 so if re-install is the first item it would be default=0, then you reboot and it starts installing.
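
The edit gsgatlin describes can be scripted. The following is a rough sketch rather than a tested procedure: set_grub_default is a hypothetical helper name, and the demonstration runs against a scratch file with made-up menu titles instead of the real /boot/grub/grub.conf.

```shell
#!/bin/bash
# set_grub_default FILE INDEX
# Rewrites the "default=" line of a GRUB legacy config, keeping a backup copy.
# Entries are numbered from 0, so INDEX 0 selects the first "title" stanza.
set_grub_default() {
    local conf="$1" index="$2"
    cp -p "$conf" "$conf.bak" || return 1
    sed -i "s/^default=.*/default=$index/" "$conf"
}

# Demonstration on a scratch file (the titles below are invented).
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
default=1
timeout=5
title Reinstall this workstation
title Red Hat Enterprise Linux
EOF

set_grub_default "$demo_conf" 0
grep '^default=' "$demo_conf"
```

On a real machine you would run set_grub_default /boot/grub/grub.conf N as root, where N is the position of the re-install entry, and then reboot.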

Access Local Commandline

When the linux box is freezing during the boot cycle, you can bring up a cmdline by:

  1. At boot, press any key to trigger kernel selection screen.
  2. Press "p", then enter the password for the root account.
  3. Select the kernel you want, then press "a" to add to the boot parameters.
  4. Add "single" without the quotes to the end and hit enter.

That will boot into runlevel 1.

Mount an NTFS drive

  • mount -t ntfs-3g <device> <mount location>
    • mounts drive to specified location.
    • <device> is going to be the device path of the drive. For SATA hard drives, it's /dev/sda, /dev/sdb, /dev/sdc for the first connected drive, second, and so forth. For IDE drives, it's /dev/hda, /dev/hdb, etc.
      • Additionally, drives will have partitions, which is what you actually want to connect to. Most Windows formatted drives are going to have two partitions, a boot partition, and then the main partition. So what you normally want to mount is partition 2. For example, /dev/sdb2 is the second partition of the second SATA drive
    • the mount location must be an existing folder, so if you want to mount to /mnt/windows, you will have to create that windows folder under /mnt first
    • example, mount -t ntfs-3g /dev/sdb2 /mnt, mounts the second partition of the second SATA drive under /mnt
  • cp -vau <source> <destination>
    • for copying files off the drive. v is for verbose, a for archive, u for update. The destination must be an existing folder.
      • cp normally reports an error for any file it cannot copy and carries on with the rest, so check the output (or the exit status) for failures
    • example: (assuming the drive is mounted to /mnt) cp -vau /mnt/bob /local/backup, copies the bob folder from the drive and anything in it to /local/backup
    • Paths and filenames in linux are case-sensitive!
    • Use \ to get linux to recognize spaces and odd characters in a path. So a folder named "This is mine" would need to be typed as "This\ is\ mine". Also, hitting tab as you type a path or file name will get Linux to try to auto-complete.

Some extra helpful disk utilities

  • dmesg | tail -##
    • gives you the last ## entries from the message buffer. Can use this to figure out if the drive is connected and where.
  • fdisk -l <device>
    • lists the partitions on the device
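  • /usr/sbin/smartctl -d ata --all <device>
    • prints all SMART information for the device (health status, attributes, error and self-test logs); -d ata sets the device type. To actually start a self-test, use smartctl -t short <device>.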

Using vi

  • Press i (or INSERT) to enter editing mode.
  • Once finished editing, press ESC to exit editing mode.
  • Type in :wq to save and quit
  • Other commands:
    • :q! (quit without saving)
    • /string (search forward for the string you input)
    • dd (delete line)

Add users individually

sudo vi /etc/users.local.base

sudo vi /etc/users.local

  • add their unity username. Each one goes on a separate line

Add sudo

sudo su -


  • add as "username ALL=(ALL) ALL"
  • again, each on separate line
  • any sudo users must also have been added to the two users.local files as well

Local Home Dir

[10:14] <djgreen> Micah -- anything I need to know about creating local user home folders (/home/*) in Linux?
[10:15] <macolon> Umm... make sure that /home is a symlink to /local/home, aside from that, can't really think of anything.
[10:16] <djgreen> how about perms?
[10:17] <djgreen> just chown to the user and I'm ok?
[10:17] <macolon> mkdir directory; chmod 700 directory; chown username.ncsu directory
[10:17] <macolon> Assuming that no one else is supposed to access the dir.
[10:18] <macolon> if it chokes on chown username.ncsu directory -- two steps then: chown username dir; chgrp ncsu dir

[11:09] <djgreen> Micah -- is there a way for someone to have /home/userid be set as their "homedir" when they login to a machine? Rather than their regular AFS home dir?
[11:11] <macolon> setenv HOME /home/userid in .mycshrc
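
The mkdir / chmod / chown sequence from the chat can be wrapped in a small function. This is a sketch under assumptions: make_local_home is a hypothetical name, and the demonstration uses a scratch directory and the current user so that it runs without root.

```shell
#!/bin/bash
# make_local_home HOME_ROOT USER GROUP
# Creates HOME_ROOT/USER, makes it private (700) and hands it to USER:GROUP.
# Uses the two-step chown + chgrp form that macolon suggests as the fallback.
make_local_home() {
    local home_root="$1" user="$2" group="$3"
    local dir="$home_root/$user"
    mkdir -p "$dir" || return 1
    chmod 700 "$dir" || return 1
    # Changing the owner to a different user normally requires root.
    chown "$user" "$dir" && chgrp "$group" "$dir"
}

# Demonstration in a scratch directory with the current user and group.
scratch=$(mktemp -d)
make_local_home "$scratch" "$(id -un)" "$(id -gn)"
ls -ld "$scratch/$(id -un)"
```

On a real machine this would be run as root, for example make_local_home /local/home userid ncsu, with /home being a symlink to /local/home as noted in the chat.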

Repair DotFiles

[10:20] <elliot> /usr/bin/ will do it for linux
[10:21] <elliot> make sure they run that from the root of their home directory

Chinese Fonts

> 01157197 - any idea about adding the Chinese language pack? Is this just another "yum" install?

I'm not sure. Try this:

/usr/bin/yum install fonts-chinese

And see if it fixes his problem. That package isn't installed by default in RHEL 5, it seems. It contains Chinese TrueType fonts, which will hopefully fix the issue. Let me know if that fixes it? If so, we might want to add it to the Eos lab machines.

Administrative Scripts

> How did you go about doing that? I'd like to be able to replicate if you're not around.

I've been using realm-crons, which have the option of being run either every 20 minutes, hour, day, or month. You can take a look at some of the scripts I've used for random stuff in:


A repository of scripts is in the scripts directory, while things that actually run are in one of the cron.* directories. The names are pretty self explanatory, with the exception of wsr, which is the 20 minute cron job.

To avoid killing AFS, all crons are set up to have a random wait before running, with a maximum time of 20 minutes, so doing things at an EXACT instant is not generally feasible. It could be done with a couple of combos of scripts, but it would be messy.
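
To illustrate the random-wait idea, here is a minimal sketch. It is not the actual realm-cron implementation; random_delay and run_with_jitter are hypothetical names, and the demonstration caps the wait at 2 seconds instead of 20 minutes (1200 seconds).

```shell
#!/bin/bash
# random_delay MAX_SECONDS
# Prints a pseudo-random whole number of seconds between 0 and MAX_SECONDS.
random_delay() {
    local max="$1"
    echo $(( RANDOM % (max + 1) ))
}

# run_with_jitter MAX_SECONDS COMMAND [ARGS...]
# Sleeps for a random delay, then runs the command, so that many machines
# started by the same cron schedule do not all hit AFS at the same instant.
run_with_jitter() {
    local max="$1"
    shift
    sleep "$(random_delay "$max")"
    "$@"
}

# Demonstration with a short maximum wait.
run_with_jitter 2 echo "job ran"
```

The trade-off is the one described above: a job wrapped this way starts somewhere inside the window, never at a predictable instant.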

Nvidia and X

> P.S. Should I just use nVidia's installer, or is it packaged nicely for
> RHEL 5/Realm Kit somewhere?

Normally the open source nv driver is more stable than the binary only nvidia. However, RHEL5 isn't exactly new anymore. An updated X driver would be the thing to try. has packages for Fedora. Ah...

has the nvidia packages for RHEL 5.



Step 1. Download the Drivers

The easiest method to downloading the latest drivers is by following the directions on Nvidia’s own driver download page. If you are in the mood to try out beta or older versions of the drivers, check out this page.

Step 2. Kill the X Server

The Nvidia installer will complain if you try to install new drivers while the X server (a.k.a. all the graphic user interface stuff) is running. So, you’ll have to jump to a new session by hitting Ctrl+Alt+F1. This will bring you down to a text-only terminal. Login if it asks you to.

Now, GNOME (which uses gdm) users will usually enter this to stop the X Server:

sudo /etc/init.d/gdm stop

As for KDE (kdm) users:

sudo /etc/init.d/kdm stop

Step 3. Start the Driver Installer

Navigate to the directory where the driver installer downloaded to. For me, this was /home/eddie/Downloads:

cd ~/Downloads

Now, you must have root permissions to install new drivers (because it ties itself in with the kernel), so make sure you either switch to the root user or use sudo (recommended) before running the installer:

sudo sh ./NVIDIA-Linux-x86_64-195.36.15-pkg2.run

NOTE: Remember to use the name of the driver file you downloaded, not the one above.

Step 4. Follow the Installer’s Instructions

The installer should ask you a few questions as it installs the new drivers. It is usually safe (and recommended by myself) to say yes to all of the questions asked (install 32-bit OpenGL libraries, create a fresh Xorg.conf, etc.). After the questions, sit back and let the installer finish.

Step 5. Reboot and Enjoy

And now you are done! Reboot and enjoy the up-to-date drivers:

sudo reboot

Troubleshooting and How to Handle Errors

In this section, I will describe the methods I used to work around a few of the issues I have encountered when installing new drivers. (I will update this section whenever a new problem arises!)

1. “Provided install script failed”

If you run Ubuntu, then you will see this every time you try to install a new driver. Just ignore it; the install script provided by the Ubuntu developers fails on purpose.

2. Error locating kernel source

If you are like me and have compiled your own custom kernel, this problem will probably affect you. If you do not run a custom kernel, and use the default, distribution-provided kernel, then you probably do not have the kernel headers installed. On Ubuntu, this is simple to fix:

sudo apt-get install kernel-source

But if you ARE on a custom kernel, or you have the correct kernel headers installed but it still cannot find them, append the --kernel-source-path option to the installer command. Kernel headers are usually located in the /usr/src directory. In my case, the command I use to start the 195.36.15 driver installer is:

sudo sh ./ --kernel-source-path=/usr/src/linux-headers-2.6.32-bfs311-idlesoft-desktop-amd64/


Clean up error reports

/usr/bin/run-realm-cron wsr ; /usr/bin/run-parts /etc/cron.wsr


Best way to make it stop is to clean up the /etc/aliases file.

In RL6 there is a script that keeps the local-mail-users table updated with whatever you dump into aliases and any local accounts that are created. Normally clients don't listen on port 25 other than from localhost. Servers, however, have a minimal sendmail configuration. It was generally assumed when I wrote the configs that servers know what they are doing and need to be able to receive the bounces they send.


Mounting Drives

  • (3:42:45 PM) slack: you can set user mounts in fstab, or just the automounter
  • (3:42:50 PM) slack: s/just/use/
  • (3:45:48 PM) slack:
  • (3:46:06 PM) djgreen: fstab, as Micah explained it to me, required that we pre-create mount points for each user ahead of time
  • (3:46:39 PM) slack: yes. For a user to be able to use mount, the mount point and end point must pre-exist.
  • (3:46:59 PM) djgreen: and they have to own it
  • (3:47:01 PM) slack: Otherwise, a user could remount /usr with their own binaries and own you
  • (3:47:15 PM) djgreen: so how does one automate that?
  • (3:48:02 PM) slack: with the automounter...which is how /ncsu works
  • (3:48:14 PM) djgreen: and since you have multiple people logging into the same box... ok.
  • (3:48:22 PM) slack: /etc/auto.master set's up /ncsu and points it an an LDAP map
  • (3:48:39 PM) djgreen: you have a link to point me at for automount doc?
  • (3:48:43 PM) slack: although you can have a static, file based map on the machine for ease of use
  • (3:48:48 PM) slack: let me see
  • (3:49:09 PM) djgreen: for celerra, I'll need a good number of diff mount points on each machine.
  • (3:49:50 PM) djgreen: but if I can make a list as you've mentioned and just push that to all, it could work.
  • (3:50:04 PM) slack: *nod*
  • (3:50:42 PM) djgreen: though I'm not sure what happens when there's 50 mount points and user only has privs to 1 or 2
  • (3:51:22 PM) slack: then they can only get to 1 or 2
  • (3:51:41 PM) djgreen: but does that cause errors/delays due to retries, etc?
  • (3:51:45 PM) slack: with the automounter you can easily set it so you can see everything under, say, /ncsu or you only see directories there after you access them
  • (3:52:28 PM) slack: if the automounter can't mount the share its been told to, yes
  • (3:53:02 PM) djgreen: not sure how well it'll work with the celerra cifs paths then...
  • (3:53:04 PM) slack:
  • (3:53:53 PM) slack: wouldn't we all like to know a little more about the celerra and cifs and the plan for doom?
  • (3:54:18 PM) slack: if the krb auth fails it may fail gracefully, but I've nothing to test to make sure

Finding Install Date

  • (9:48:55 AM) slack: there should also be a /.version if the machine was installed by webkickstart that has the time stamp of the install and a list of rpms at install time

Installing Applications in AFS

  1. Download the installation media -- store a copy in the software share
  2. Place a copy in the /data folder of the test installation box -- currently.
  3. Install the software on the test install box. Note the size of the installation. You may need to split the folders into smaller 2GB volumes.
  4. Now email / assign the queue to Gary Gatlin (gsgatlin). In your email, include the following:
    • the paths of all the volumes you want created
    • what PTS groups should have access -- we'll try to standardize on itecs-swtest:ece
  5. Things to note:
    • "eos" (/afs/eos/dist) is for engineering stuff and "bp" (/afs/bp/dist) is for stuff that might be useful for the rest of campus.
    • "eos" cell addable locker volumes can be up to 4 GB while bp cell volumes must be 2 GB.
    • and the bp cell servers were getting crowded the last time I checked.
  6. Once the space has been created, install the software into the "." location -- aka, /afs/.eos/ or /afs/.bp/
  7. Set the "add" here:
  8. Give everyone read access...
    find . -type d -exec fs sa {} username permissions >>& /dev/null \;

Setup RAS servers

  1. use ....
  2. configure firewall:

*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [27103570:11873538963]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A OUTPUT -d -j REJECT --reject-with icmp-port-unreachable
-A OUTPUT -d -j REJECT --reject-with icmp-port-unreachable
-A OUTPUT -d -j REJECT --reject-with icmp-port-unreachable
-A OUTPUT -d -j REJECT --reject-with icmp-port-unreachable
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -p esp -j ACCEPT
-A RH-Firewall-1-INPUT -p ah -j ACCEPT
-A RH-Firewall-1-INPUT -d -p udp -m udp --dport 5353 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m udp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 7001 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 7001 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 7006 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 7006 -j ACCEPT
-A RH-Firewall-1-INPUT -s -p tcp -m state --state NEW -m tcp --dport 49696 -j ACCEPT
-A RH-Firewall-1-INPUT -s -p udp -m state --state NEW -m udp --dport 49696 -j ACCEPT
-A RH-Firewall-1-INPUT -s -p tcp -m state --state NEW -m tcp --dport 161 -j ACCEPT
-A RH-Firewall-1-INPUT -s -p udp -m state --state NEW -m udp --dport 161 -j ACCEPT
-A RH-Firewall-1-INPUT -s -p tcp -m state --state NEW -m tcp --dport 9600:9700 -j ACCEPT
-A RH-Firewall-1-INPUT -s -p tcp -m state --state NEW -m tcp --dport 9600:9700 -j ACCEPT
-A RH-Firewall-1-INPUT -s -p udp -m state --state NEW -m udp --dport 9600:9700 -j ACCEPT
-A RH-Firewall-1-INPUT -s -p udp -m state --state NEW -m udp --dport 9600:9700 -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT

  3. Also need to set up Dell OpenManage Server:
    1. wget -q -O - | bash
    2. yum install srvadmin-base
    3. yum install srvadmin-rac4
    4. Start the service:
      service dataeng start


Installing VMWare Tools on Linux VM

  • Bring up VM Console. Login to the Linux VM.
  • Inventory>Virtual Machine>Guest>
  • mkdir /mnt/cdrom
  • mount /dev/cdrom /mnt/cdrom
  • cd /tmp
  • rpm -Uhv /mnt/cdrom/VMWareTools-blah-blah.rpm
  • umount /dev/cdrom
    Accept the defaults.

Didn't do it, but this came up at the end -- might need for later:

To use the vmxnet driver, restart networking using the following commands:
/etc/init.d/network stop
rmmod pcnet32
rmmod vmxnet
modprobe vmxnet
/etc/init.d/network start

Printing Error Logs

(3:34:59 PM) slack: in /etc/cups/cupsd.conf there is a LogLevel setting that you can change to "debug" to get a lot more information out of the client about what's going on
(3:36:35 PM) slack: /var/log/cups/error_log is where the debug information goes
(3:36:45 PM) djgreen: thanks, that'll help
(3:41:32 PM) tpgrimes: djgreen: if LogLevel "debug" isn't enough to figure it out, you can try "debug2" which supposed to log everything.
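
The LogLevel change discussed above could be scripted along these lines. Treat it as a sketch: set_cups_loglevel is a hypothetical helper, the demonstration edits a scratch file with made-up contents rather than the real /etc/cups/cupsd.conf, and cupsd has to be restarted afterwards for the new level to take effect.

```shell
#!/bin/bash
# set_cups_loglevel FILE LEVEL
# Sets the LogLevel directive in a cupsd.conf-style file, replacing an
# existing LogLevel line or appending one when the file has none.
set_cups_loglevel() {
    local conf="$1" level="$2"
    if grep -q '^LogLevel' "$conf"; then
        sed -i "s/^LogLevel.*/LogLevel $level/" "$conf"
    else
        echo "LogLevel $level" >> "$conf"
    fi
}

# Demonstration on a scratch file (contents invented for the example).
demo_cups=$(mktemp)
printf 'LogLevel info\nMaxLogSize 0\n' > "$demo_cups"

set_cups_loglevel "$demo_cups" debug
cat "$demo_cups"
```

Debug logging is verbose, so it is worth setting the level back once /var/log/cups/error_log has shown what you needed.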

Firewall Management


To run realmconfig remotely:

sudo /usr/sbin/realmconfig

Allowing Remote Access to GRENDELS

(1:03:33 PM) slack: pam_ncsu_access is always in the pam files (by default) and realmconfig fills out the /etc/security/realmconfig-*.conf files
(1:03:48 PM) slack: what I would do is edit /etc/security/realmconfig-remote-cluster.conf
(1:04:06 PM) slack: the format is specified here: /usr/share/doc/realm-auth-tools-0.5.0/README
(1:05:07 PM) slack: But it would be something like this:
(1:05:09 PM) slack: + @engrstaff
(1:05:09 PM) slack: + @engr_fall
(1:05:11 PM) slack: + @engr_spring
(1:05:14 PM) slack: + @engr_sum
(1:05:53 PM) slack: /etc/security/realmconfig-remote-local.conf should only affect logins at the text console or gdm screen
(1:07:00 PM) djgreen: "/etc/security/realmconfig-remote-cluster.conf" says that its written by realmconfig. 
(1:07:10 PM) djgreen: issues w/ using that?
(1:07:19 PM) djgreen: or is that something that isn't used anymore
(1:07:56 PM) slack: just fill it out and don't use realmconfig to rewrite it ;-)
(1:08:08 PM) slack: RC isn't smart enough to handle a config that quite that complex

Fixing Corrupted Disk

  • First figure out what's corrupted. Try looking at "dmesg" to figure out the errors.
  • If you're given a /dev/* that you don't recognize, try
    ls -l /dev/<dev> /dev/mapper/* /dev/md*
    No luck? Try some other commands:
dmsetup table
dmsetup ls --tree
cat /proc/mdstat

You're looking for information that indicates which device is messed up. The minor number in the listing may be the indicator. For example, when looking for /dev/dm-6, I found that /Volume00-tmp was 253, 6, showing that it was the device in question. Plus, I had a bunch of /tmp errors anyway!
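
Matching a dm-N device to its name by minor number can be done mechanically. The sketch below assumes dmsetup ls prints one "name (major, minor)" or "name (major:minor)" pair per line; find_dm_name is a hypothetical helper and the sample lines are invented to mirror the example above.

```shell
#!/bin/bash
# find_dm_name MINOR
# Reads "dmsetup ls"-style lines on stdin and prints the name of the
# device-mapper entry whose minor number equals MINOR.
find_dm_name() {
    local minor="$1"
    # Accepts both "name (253, 6)" and "name (253:6)" layouts.
    sed -n "s/^\([^[:space:]]*\)[[:space:]]*([0-9]*[,:] *${minor})\$/\1/p"
}

# Demonstration with invented sample output.
printf 'Volume00-root\t(253, 0)\nVolume00-tmp\t(253, 6)\nVolume00-var\t(253:16)\n' \
    | find_dm_name 6
```

On a live system the equivalent would be dmsetup ls | find_dm_name 6 (as root) when chasing errors reported against /dev/dm-6.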

  • Unmount the disk:
    umount /tmp
  • Run fsck:
    fsck /tmp
  • Remount:
    mount /tmp
  • Then, since it's /tmp, let's clear out the fluff:
tmpwatch --mtime --all 1 /tmp

(where the 1 = number of hours since a file has been updated)

  • Reboot