By default most Linux installations come with the console screen saver on. This is generally a bad idea for a server system because if you do experience kernel panic issues you will not be able to see what caused the panic. The screen will remain blank.
Put the following line in your /etc/rc.d/rc.local or in your login scripts:
setterm -blank 0
If you use X-Windows append one more line:
This should allow you to view the screen in the event of a kernel panic.
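To confirm the setting took effect, you can read the kernel's console blank timeout back. This is only a sketch: the sysfs path below is an assumption about reasonably recent kernels, and console_blank_timeout is a helper name I made up for illustration. A value of 0 should mean blanking is disabled.

```shell
# Print the console blank timeout in seconds (0 should mean "never blank").
# The default sysfs path is an assumption; pass a different file as $1.
console_blank_timeout() {
    param="${1:-/sys/module/kernel/parameters/consoleblank}"
    if [ -r "$param" ]; then
        cat "$param"
    else
        # Older kernels may not expose this parameter at all.
        echo "unknown"
    fi
}

# Usage: console_blank_timeout
```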
Hey everyone, I don’t normally have hardware issues on production systems but, on occasion, I do see a few hardware failures. The most common failure I see on a day-to-day basis is a failed hard disk. Today, however, I hit a Machine Check Exception, which had me puzzled. After turning to Google I found that this most typically (at least on Linux) points to bad cache on the processor. This particular system was a dual-processor system, so I had virtually nothing to worry about if I had to run on a single processor until the part arrived.
The Machine Check Exception looks like this:
CPU 3: Machine Check Exception 0000000000000005
Bank 0: b200004010000400
Bank 5: b200121020000400
On Windows systems an MCE (Machine Check Exception) could also mean bad RAM, a bad motherboard or a bad processor. Reseating the processor is another good option if you see this error message.
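If the machine stayed up long enough to write the message to disk, you can pull earlier MCE reports out of the kernel log. A rough sketch: find_mce is a helper name I made up, and /var/log/messages is only an assumed default, so pass whatever log file your distribution uses.

```shell
# Print Machine Check Exception reports (and their Bank lines) from a log.
# The default log path is an assumption; pass your own file as $1.
find_mce() {
    grep -E 'Machine Check Exception|Bank [0-9]+: [0-9a-f]+' \
        "${1:-/var/log/messages}"
}

# Usage: find_mce /var/log/messages
```

Keep in mind that a panic caused by an MCE may never reach the log, which is another reason to keep the console from blanking.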
I recently ran into an issue where calling functions from java.awt.Color caused a "NoClassDefFoundError" in a JSP page. I restarted resin and kept refreshing the JSP page, and then saw a different error message that looked like this:
libXp.so.6: cannot open shared object file: No such file or directory
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.security.AccessController.doPrivileged(Native Method)
This error was much different from the previous one, but it shows that when AWT was trying to initialize it could not locate "libXp.so.6". After some more research I found that libXp.so.6 is part of the "xorg-x11-deprecated-libs" package in CentOS 4.5. I ran "yum -y install xorg-x11-deprecated-libs", followed by "ldconfig" to be safe, and restarted resin. My java.awt.Color functions worked after this.
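If you hit a similar error, a quick way to see which shared libraries are missing is to run ldd against the native library named in the error and keep only the "not found" lines. A minimal sketch: missing_libs is a name I made up, and the libawt.so path in the usage comment is a placeholder, so substitute the path from your own JVM.

```shell
# Print the names of libraries that ldd reports as "not found".
# Reads ldd-style output on stdin, so it also works on saved output.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage (the path is a placeholder for your JVM's AWT library):
#   ldd /path/to/jre/lib/libawt.so | missing_libs
```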
Hopefully this helps someone!
One great command to add to your arsenal is the “lsof” command. This command prints out all open files in Linux.
lsof can be used to help resolve issues like:
- Can’t unmount device because it is busy; even if you believe it’s not busy.
umount: /mountpoint: device is busy
- A process is using a file but you have no idea which process
- To view a list of active connections (netstat works better for this) and which program and PID (Process ID) is using this socket.
To successfully unmount a device which still complains about being in use simply run the following command:
# lsof | grep "/mountpoint"
This command returns a list of processes, their PIDs and the user that has the directory or its files open. Look for regular files (usually marked with “REG”), which will help you locate the service or program that has the file open. Stop that service or, as a last resort, kill -9 the process.
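On a busy mount point the raw list can get long, with many rows per process. Here is a small sketch that boils it down to one row per PID. It assumes lsof's default column order (COMMAND, PID, USER first) and that the header line is still present, so feed it lsof output directly rather than output already filtered through grep. summarize_lsof is a name I made up.

```shell
# Reduce lsof output (read from stdin) to one "PID COMMAND USER" row per
# process. Assumes the default column order: COMMAND PID USER ...
summarize_lsof() {
    awk 'NR > 1 { print $2, $1, $3 }' | sort -u -k1,1n
}

# Usage: lsof /mountpoint | summarize_lsof
```

Passing the mount point to lsof as an argument lists the open files on that file system, which avoids matching unrelated paths that merely contain the same string.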
To search for a file which is in use simply use an alteration of the command above:
# lsof | grep "openfile"
This allows you to locate the process and user using that file.
To view a list of active connections run this command:
# lsof | grep "IPv4"
This returns a list of all open IPv4 connections.
Also be aware that the “lsof” command can take quite some time to run on servers with very large file counts open (Oracle Servers, Web Servers) so please be patient. It’s not uncommon for the lsof command to take about 2-4 seconds to run.
I believe that most performance issues related to slowness occur because of slow disks or poor application tuning. Memory is a big factor when it comes to OS-level caching and buffering but there’s nothing like a fast SCSI array or even a few WD Raptors in RAID-1.
The Linux utility "iostat" gives you an overview of disk utilization. It does this by looking at the time each device is active in relation to the device's average transfer rate.
Using the iostat utility with the -x flag (-x is for extended statistics) will yield results that look like this:
If the iostat command is not available on your system, run one of the following commands to install the sysstat package.
CentOS/RHEL – # yum -y install sysstat
Ubuntu/Debian – # apt-get install sysstat
Pay special attention to the "%util" column of the results. It shows the percentage of time during which I/O requests were being issued to the device. In the example above this percentage is quite high for /dev/sdb. That device is actually a large RAID-6 array and has not yet reached its 100% utilization mark. The closer a device or array gets to 100%, the closer it is to total saturation.
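If you want to watch for this automatically, here is a rough sketch that prints only the devices at or above a utilization threshold. It finds the %util column by its header name, since the column layout differs between sysstat versions. busy_devices and the default threshold of 80 are my own choices, not anything that ships with iostat.

```shell
# Print "device %util" for devices at or above a threshold (default 80).
# Reads `iostat -x` output on stdin and locates %util from the header row.
busy_devices() {
    awk -v limit="${1:-80}" '
        NF == 0   { col = 0; next }          # blank line ends a device table
        /^Device/ {                          # header row: find the %util column
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        col && NF >= col && ($col + 0) >= (limit + 0) { print $1, $col }
    '
}

# Usage: iostat -x | busy_devices 90
```

Remember that the first report iostat prints covers the time since boot, so for a live picture run it with an interval (for example "iostat -x 5") and look at the later reports.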
If your utilization numbers are higher than expected take the following into consideration:
- Tune the application (This is where you can gain the cheapest and most performance)
- Obtain faster disks (10K+ RPM SATA/SAS/SCSI)
- Use a larger and more efficient RAID array for your application (RAID 0 for video editing, RAID-10 for databases, RAID-5 for file storage and general access and RAID-6 on newer controllers for increased redundancy)