BVLog Bryan Voss’ mental synchronization point

14 Sep 2009

Linux force password change

I needed to have a user change their password on their next login, so I had to look up how to do that. Since I had to do a little more searching than usual to find an answer, I'm posting it here for posterity.

First, change the user's password to a temporary one:
passwd [username]

Next, reset the password expiration to 90 days (-M) and set the last change date to 0 (-d) to force a change:
chage -M 90 -d 0 [username]

Finally, verify the info:
chage -l [username]
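
The two chage steps can be wrapped in a small shell function. This is only a sketch: the name force_password_change and the DRY_RUN variable are made up for illustration, and the interactive passwd step is left out. With DRY_RUN set, it prints the commands instead of running them (running them for real needs root).

```shell
# Hypothetical helper, not part of any package.
# Usage: force_password_change USERNAME [MAX_DAYS]
force_password_change() {
    user=${1:-}
    max_days=${2:-90}
    if [ -z "$user" ]; then
        echo "usage: force_password_change USERNAME [MAX_DAYS]" >&2
        return 2
    fi
    # When DRY_RUN is set and non-empty, prefix each command with echo.
    run=${DRY_RUN:+echo}
    # -M sets the maximum password age; -d 0 sets the last change date to
    # day 0, which forces a change at next login. The second command lists
    # the resulting aging settings for verification.
    $run chage -M "$max_days" -d 0 "$user" &&
        $run chage -l "$user"
}
```

For example, DRY_RUN=1 force_password_change jdoe prints the two chage commands without touching the account.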

Filed under: linux, sysadmin

6 Jul 2009

Windows: svchost.exe taking up a lot of memory

One of my coworkers has been keeping an eye on a couple of Windows Server 2003 boxes at work due to some problems we have had with them. He sent me a screenshot of Task Manager showing a svchost.exe instance using more than 1GB of memory. Unfortunately, by the time I checked the server, the process was gone.

Since svchost.exe can host many things, I did a little research and sent the following command to him to help us identify what's consuming so much memory:
tasklist /svc /fo list /fi "imagename eq svchost.exe" /fi "memusage gt 1000000"

That will show any svchost.exe processes that are consuming more than roughly 1GB of memory (the memusage filter is expressed in KB). The /svc switch also lists the services hosted in each process so we can track down the culprit. Of course, all the same info is available in Process Explorer, but we don't have that installed on the box in question (although I am thinking of suggesting that we make that a standard part of our server loadout).

I may also encapsulate the tasklist command above into a Nagios check and run it against all our Windows boxen. That's what I like about having a flexible monitoring system like Nagios in place. As we find new things to check, we can just add them and the whole thing is automated from that point on.
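
A rough sketch of what such a check could look like, following the usual Nagios plugin convention of one status line plus an exit code (0 for OK, 2 for CRITICAL). The function name check_svchost_mem is made up, and how the tasklist output reaches it (Cygwin on the server, an agent, or something else) is left open. It simply treats any svchost.exe entry in the filtered output as a hit, since the /fi filters have already done the selection.

```shell
# Hypothetical Nagios-style check; reads tasklist output on stdin.
check_svchost_mem() {
    output=$(cat)
    # The tasklist filters only let through svchost.exe instances above
    # the memory threshold, so any match means the threshold was exceeded.
    if printf '%s\n' "$output" | grep -qi 'svchost\.exe'; then
        echo "CRITICAL: svchost.exe over memory threshold"
        # Pass the details (PID, hosted services) along as extra output.
        printf '%s\n' "$output"
        return 2
    fi
    echo "OK: no svchost.exe over memory threshold"
    return 0
}
```

It would be fed by piping the tasklist command above into it, with the function's return value used as the plugin's exit status.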

Filed under: sysadmin, windows

22 May 2009

Linux process elapsed time

I'm working on a script to dump a daily audit log from one of our clinical systems. The script invokes Oracle's sqlplus and runs a query provided by the vendor to show all user activity in the past 24 hours. I started the query running with the expectation that it would take at most a few minutes to run. That was before lunch. It's now late afternoon and the script is still running.

The output file is slowly growing, so it's still working. I attached an strace to the process and see it mostly waiting in a read state, so I assume that we just need to optimize the indexes on the database to make it run faster. That's mostly outside my responsibility, but I did want to check to see how long the process had been running in order to make a preliminary report to my coworkers. Digging around in the /proc directory for the process didn't immediately show me what I wanted to know, so I turned to the "ps" command. I had to do some reading, but ended up with the following:
ps -o etime,stime,time,cmd -C sqlplus

This gave me what I was looking for: elapsed time (etime), start time (stime), cumulative CPU time (time), and the command line (cmd) for all sqlplus processes.

Total elapsed time so far: 5 hours. Wow. That's quite a while for a single query.
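
For use in a script, the etime value (formatted as [[dd-]hh:]mm:ss) can be converted to seconds. The helper below is a sketch with a made-up name, etime_to_seconds; it leans on awk so that fields with leading zeros or leading spaces are read as plain numbers.

```shell
# Hypothetical helper: convert a ps etime value to seconds.
# Accepts mm:ss, hh:mm:ss, or dd-hh:mm:ss.
etime_to_seconds() {
    printf '%s\n' "$1" | awk -F'[-:]' '{
        n = NF
        s = $n + 60 * $(n - 1)            # seconds and minutes
        if (n >= 3) s += 3600 * $(n - 2)  # hours, if present
        if (n >= 4) s += 86400 * $(n - 3) # days, if present
        print s
    }'
}
```

A single process can be queried with something like etime_to_seconds "$(ps -o etime= -p 1234)", where 1234 is a placeholder PID and the trailing = suppresses the column header.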

Filed under: linux, sysadmin

9 Mar 2009

1.) Is it turned on?

I was recently contacted by EMC Support saying they had not received a health report from our new secondary Centera cluster in a while. They had tried dialing into the cluster via modem, but were not getting a response. They asked me to reset the modem on the cluster to ensure that it was working correctly so they could dial in and check things out.

As soon as I hung up the phone, I brought up my Centera Viewer client and tried to log in to the cluster. No response. No ping response either. As I walked down the hall to the datacenter, I was reviewing network connectivity for the cluster in my mind. If a system isn't working correctly, blame it on the network, right?

Once in the datacenter, I opened the back of the rack and found the modem dead. No lights at all. After checking cables, it occurred to me that I wasn't feeling any breeze from the fans in all the nodes. A quick glance told me that there were no lights on the back of the cluster. I walked around to the front and found no lights there either.

As possible causes for a complete power failure to the rack began whizzing through my head, one tidbit floated to the surface: About two weeks before, we had been coordinating with the Maintenance department on moving our datacenter power feeds to a new powerhouse the hospital recently built. We have big APC UPSes that will power the datacenter for a few minutes until generators kick in. Since Maintenance wasn't sure how long it would take to reroute power through the new powerhouse and generators were out of the question, we had to prepare for the worst and assume the UPSes would drain and shut down before power was restored. One of the steps we took was powering down all non-critical systems. Since the new Centera was a replication target and replication was not in full swing yet, I decided to power it down for the move.

Of course, I'm sure you've already determined the problem. We forgot to power it back up! Since the Centera was new, I had not yet added it to our Nagios monitoring system and was not paying much attention to it. I powered the cluster up and sheepishly called EMC Support to report my little flub.

Take-aways (don'tcha love biz-speak terms like that?):

  • Even experienced tech guys like me fall victim to noob shenanigans like forgetting to check power on a system before diving into troubleshooting.
  • Add systems to your monitoring solution early, even if they're not in production yet. You can always disable alerting for that particular system until it's in production, and it's a good shakedown to make sure your thresholds are reasonable. It will also tell you if you maybe shut down the system and forget to turn it back on! (Like anybody would ever do something like that...)

27 Oct 2008

Red light, green light

One of our enterprise application vendors recommends the following procedure for the Daylight Saving Time change:

On the first Sunday in November you need to shut down blah applications during the time change to avoid duplicate chart times.

This is for a mission-critical 24x7 application that is used daily by hundreds of people.

Why don't we just go ahead and close the hospital for an hour to avoid duplicating any information?

Wow.

Filed under: funny, sysadmin

27 Oct 2008

Today’s post brought to you by the letter zero

Me: Ok, your account has been created. Your username is blah blah blah zero.

User: Um, is that the number zero or the letter zero?

Me: (Dumbfounded silence) That would be the number zero.

User: Ok, thanks!

Filed under: funny, sysadmin

16 Sep 2008

ssh login via shared key

Since I only need to set this up the first time I get a new PC/server online, I figured I might as well document it here to make it easier to remember.

Very briefly, we're creating a new key with ssh-keygen. (Don't run this if you have an existing key you want to use.) We're then appending the public portion of the key to authorized_keys on [server], which will allow us to log in without a password from now on. Many assumptions are made here, so if you run into problems, google "ssh key login" for more info.

Use wisely.

ssh-keygen
cat ~/.ssh/id_rsa.pub | ssh [server] 'mkdir -p .ssh ; cat >> .ssh/authorized_keys'
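
One of the assumptions worth spelling out: sshd in its default configuration (StrictModes) may ignore authorized_keys if the file or the .ssh directory has loose permissions. The sketch below shows the server-side half with permissions set and duplicate entries avoided. The name install_pubkey is made up, and it assumes the public key file has already been copied to the machine where it runs. Many OpenSSH installs also ship ssh-copy-id, which handles the same job.

```shell
# Hypothetical helper: add a public key to authorized_keys once.
# Usage: install_pubkey PUBKEY_FILE [SSH_DIR]
install_pubkey() {
    pubkey_file=$1
    ssh_dir=${2:-$HOME/.ssh}
    auth=$ssh_dir/authorized_keys
    # Directory readable only by the owner.
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    # Key file writable only by the owner.
    touch "$auth" && chmod 600 "$auth" || return 1
    key=$(cat "$pubkey_file") || return 1
    # -x: whole-line match, -F: fixed string, -q: no output.
    # Append only if this exact key line is not already present.
    grep -qxF -- "$key" "$auth" || printf '%s\n' "$key" >> "$auth"
}
```

Running it twice with the same key leaves a single entry in authorized_keys.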

Filed under: linux, sysadmin

12 Jun 2008

More shell scripting

for file in Pyramis*.ps ; do name=$(echo "$file" | cut -d'.' -f 1) ; ps2ascii "$file" > "$name.txt" ; done

Converting a bunch of Postscript files to ASCII text files. Posted here for my future reference. (And anybody else that may happen to be interested.) It's a fairly basic example, but I tend to fumble around on these if I haven't done much shell scripting in a while.

We're iterating through the list of files named Pyramis*.ps . The filename ($file) is passed through cut to chop off the extension (.ps) and the resulting name is assigned to the $name variable. We then run the original filename ($file) through ps2ascii to do the actual conversion and write the output to $name with a .txt extension. We end up with a bunch of files with the same name as the original Postscript file, but with a .txt extension.

Groovy.
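
A variation on the same loop, as a sketch: shell parameter expansion can strip the extension without calling cut. The difference matters for names containing more than one dot, where cut -d'.' -f 1 keeps only the part before the first dot. The function names txt_name and convert_all are made up for illustration.

```shell
# Hypothetical helper: derive the output name by dropping a trailing .ps
txt_name() {
    # ${1%.ps} removes the shortest trailing match of ".ps".
    printf '%s\n' "${1%.ps}.txt"
}

# Convert every Pyramis*.ps file in the current directory.
convert_all() {
    for file in Pyramis*.ps; do
        # If nothing matched, the pattern is left as-is; skip it.
        [ -e "$file" ] || continue
        ps2ascii "$file" > "$(txt_name "$file")"
    done
}
```

With this, a file named Pyramis.2008.06.ps would become Pyramis.2008.06.txt rather than Pyramis.txt.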

Filed under: linux, sysadmin

19 May 2008

Why commandline is important

Some interesting BASHing I did today:

find . -type f -print | cut -d'/' -f 2 | sort -u > content
for files in *; do echo $files; done | sort | grep -v content | grep -v -f content | xargs -n 1 rm -rf

The first line finds all subdirectories of the current directory that contain files and prints the list to the file content. The second line deletes all the subdirectories that are not listed in the file content.

I had to do this to clean up a huge number of orphaned batch directories left by our document imaging system (Windows, but I'm running Cygwin and pointing to a drive mapped to the DI fileserver). The vendor provides a GUI app to do this, but it takes the larger part of eternity to map out all the directories, then you have to click on each directory and click delete. Pretty useless for the thousands of directories I had to deal with. My commandline solution took about 5 minutes to work out and around 15 seconds to run. Several hours of menial mouse clicking saved.

I had to shut down a release process in order to make sure I didn't catch any legitimate directories that had been created but not yet populated. The next iteration will probably include a bit more logic to only include directories that are more than a day old or so. A simple -mtime tweak to the find command should do it. That will enable me to run it while the entire system is live.
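
The age check could look roughly like this. It is a sketch rather than the script described above: list_orphans is a made-up name, it relies on GNU find (-mindepth, -maxdepth, -mmin, -quit), and it only prints candidates so the list can be reviewed before anything is deleted. It uses -mmin +1440 instead of -mtime because -mtime counts whole days, which makes "more than a day old" easy to get wrong.

```shell
# Hypothetical helper: print top-level subdirectories that are older than
# a given age and contain no regular files anywhere beneath them.
# Usage: list_orphans DIR [MIN_AGE_MINUTES]   (default 1440 = 24 hours)
list_orphans() {
    # The age test looks at each directory's own modification time,
    # which changes when entries are added to or removed from it.
    find "$1" -mindepth 1 -maxdepth 1 -type d -mmin "+${2:-1440}" |
    while IFS= read -r dir; do
        # -quit stops at the first file found, so big trees stay cheap.
        if [ -z "$(find "$dir" -type f -print -quit)" ]; then
            printf '%s\n' "$dir"
        fi
    done
}
```

Once the printed list looks right, it can be piped to a delete step of your choosing.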

Filed under: linux, sysadmin

8 May 2008

grep for yesterday’s syslog entries

I was asked by a coworker how to grep syslog files for entries from the past 24 hours. Although it is simple to do manually, I thought it might be nice to put together a simple script to do the work for her. Here's what I came up with:

#!/bin/sh
# bvoss 2008/05/08 grep for yesterday's date in any syslog-formatted file
# (e.g. "May  7"; single-digit days are padded with a space)
yesterday=$(date -d "-24 hours" | cut -b 5-10)
grep "$yesterday" "$1"

Just put it somewhere in the path (/usr/local/bin works nicely) and "chmod +x" to make it executable. Works on any syslog-formatted file with month and day at the beginning of each line. Syntax is grep_yesterday [file]. It returns all entries from midnight until 23:59:59 yesterday.
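
A variation, offered as a sketch: GNU date can print the syslog-style stamp directly with a format string, which avoids depending on the column layout of the default date output. The function names are made up; -d yesterday and %e (space-padded day of month) are GNU date features, and LC_ALL=C keeps the month abbreviation in English. Anchoring the pattern with ^ restricts matches to the start of each line.

```shell
# Hypothetical helper: print yesterday's date as it appears in syslog,
# e.g. "May  7" (abbreviated month, space-padded day).
syslog_yesterday() {
    LC_ALL=C date -d yesterday '+%b %e'
}

# Print the lines of a syslog-formatted file that start with that stamp.
grep_yesterday() {
    # The trailing space keeps the day from matching a longer number.
    grep "^$(syslog_yesterday) " "$1"
}
```

Usage is the same as before: grep_yesterday [file].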

Filed under: linux, sysadmin