Friday, February 27, 2009

Creates a minimalist xorg.conf

$ dpkg-reconfigure -phigh xserver-xorg


Wednesday, February 25, 2009

Show biggest files/directories, biggest first with 'k,m,g' eyecandy

du --max-depth=1 | sort -r -n | awk '{split("k m g",v); s=1; while($1>1024){$1/=1024; s++} print int($1)" "v[s]"\t"$2}'

Thursday, February 19, 2009

Study SPI4.2 standard.

10 Awk Tips, Tricks and Pitfalls

Hi guys and girls, this is the first guest post on my blog. It’s written by Waldner from #awk on FreeNode IRC Network. He works as a sysadmin and does shell scripting as a hobby. Waldner will be happy to take any questions about the article. You can ask them in the comments of this post or on IRC.

This article takes a look at ten tips, tricks and pitfalls in the Awk programming language. They are mostly taken from discussions in the #awk IRC channel. Here they are:

Be idiomatic!

In this paragraph, we give some hints on how to write more idiomatic (and usually shorter and more efficient) awk programs. Many awk programs you’re likely to encounter, especially short ones, make large use of these notions.

Suppose one wants to print all the lines in a file that match some pattern (a kind of awk-grep, if you like). A reasonable first shot is usually something like

awk '{if ($0 ~ /pattern/) print $0}'

That works, but there are a number of things to note.

The first thing to note is that it is not structured according to awk’s definition of a program, which is

condition { actions }

Our program can clearly be rewritten using this form, since both the condition and the action are very clear here:

awk '$0 ~ /pattern/ {print $0}'

Our next step in the perfect awk-ification of this program is to note that /pattern/ is the same as $0 ~ /pattern/. That is, when awk sees a single regular expression used as an expression, it implicitly applies it to $0, and returns success if there is a match. Then we have:

awk '/pattern/ {print $0}'

Now, let’s turn our attention to the action part (what’s inside braces). print $0 is a redundant statement, since print alone, by default, prints $0.

awk '/pattern/ {print}'

But now we note that, when it finds that a condition is true, and there are no associated actions, awk performs a default action that is (you guessed it) print (which we already know is equivalent to print $0). Thus we can do this:

awk '/pattern/'

Now we have reduced the initial program to its simplest (and most idiomatic) form. In many cases, if all you want to do is print some lines according to a condition, you can write awk programs composed only of a condition (however complex):

awk '(NR%2 && /pattern/) || (!(NR%2) && /anotherpattern/)'

That prints odd lines that match /pattern/, or even lines that match /anotherpattern/. Naturally, if you don’t want to print $0 but instead do something else, then you’ll have to manually add a specific action to do what you want.

From the above, it follows that

awk 1
awk '"a"' # single quotes are important!

are both awk programs that just print their input unchanged. Sometimes, you want to operate only on some lines of the input (according to some condition), but also want to print all the lines, regardless of whether they were affected by your operation or not. A typical example is a program like this:

awk '{sub(/pattern/,"foobar")}1'

This tries to replace “pattern” with “foobar”. Whether or not the substitution succeeds, the always-true condition “1” prints each line (you could even use “42”, or “19”, or any other nonzero value if you want; “1” is just what people traditionally use). This results in a program that does the same job as sed 's/pattern/foobar/'. Here are some examples of typical awk idioms, using only conditions:

awk 'NR % 6' # prints all lines except those whose line number is divisible by 6
awk 'NR > 5' # prints from line 6 onwards (like tail -n +6, or sed '1,5d')
awk '$2 == "foo"' # prints lines where the second field is "foo"
awk 'NF >= 6' # prints lines with 6 or more fields
awk '/foo/ && /bar/' # prints lines that match /foo/ and /bar/, in any order
awk '/foo/ && !/bar/' # prints lines that match /foo/ but not /bar/
awk '/foo/ || /bar/' # prints lines that match /foo/ or /bar/ (like grep -e 'foo' -e 'bar')
awk '/foo/,/bar/' # prints from line matching /foo/ to line matching /bar/, inclusive
awk 'NF' # prints only nonempty lines (or: removes empty lines, where NF==0)
awk 'NF--' # removes last field and prints the line
awk '$0 = NR" "$0' # prepends line numbers (assignments are valid in conditions)

Another construct that is often used in awk is as follows:

awk 'NR==FNR { # some actions; next} # other condition {# other actions}' file1 file2

This is used when processing two files. When processing more than one file, awk reads each file sequentially, one after another, in the order they are specified on the command line. The special variable NR stores the total number of input records read so far, regardless of how many files have been read. The value of NR starts at 1 and always increases until the program terminates. Another variable, FNR, stores the number of records read from the current file being processed. The value of FNR starts from 1, increases until the end of the current file, starts again from 1 as soon as the first line of the next file is read, and so on. So, the condition “NR==FNR” is only true while awk is reading the first file. Thus, in the program above, the actions indicated by “# some actions” are executed when awk is reading the first file; the actions indicated by “# other actions” are executed when awk is reading the second file, if the condition in “# other condition” is met. The “next” at the end of the first action block is needed to prevent the condition in “# other condition” from being evaluated, and the actions in “# other actions” from being executed while awk is reading the first file.
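To make the behaviour of NR and FNR concrete, here is a minimal sketch that prints both counters for every record of two small files. The file names and contents are made up for illustration; the last column only shows which block of the two-files idiom would run for that record.

```shell
# Throwaway files, created only for this illustration.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/file1"
printf 'c\nd\ne\n' > "$tmp/file2"

# For each record: file name, NR, FNR, and which block of the idiom applies.
nr_fnr=$(cd "$tmp" && awk '{ print FILENAME, NR, FNR, (NR==FNR ? "first block" : "second block") }' file1 file2)
printf '%s\n' "$nr_fnr"

rm -rf "$tmp"
```

NR keeps counting across files (1 to 5 here), while FNR restarts at 1 for file2, so NR==FNR holds only for the two records of file1.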

There are really many problems that involve two files that can be solved using this technique. Here are some examples:

# prints lines that are both in file1 and file2 (intersection)
awk 'NR==FNR{a[$0];next} $0 in a' file1 file2

Here we see another typical idiom: a[$0] has the only purpose of creating the array element indexed by $0. During the pass over the first file, all the lines seen are remembered as indexes of the array a. The pass over the second file just has to check whether each line being read exists as an index in the array a (that’s what the condition $0 in a does). If the condition is true, the line is printed (as we already know).

Another example. Suppose we have a data file like this

20081010 1123 xxx
20081011 1234 def
20081012 0933 xyz
20081013 0512 abc
20081013 0717 def
...thousands of lines...

where “xxx”, “def”, etc. are operation codes. We want to replace each operation code with its description. We have another file that maps operation codes to human readable descriptions, like this:

abc withdrawal
def payment
xyz deposit
xxx balance
...other codes...

We can easily replace the opcodes in the data file with this simple awk program, that again uses the two-files idiom:

# use information from a map file to modify a data file
awk 'NR==FNR{a[$1]=$2;next} {$3=a[$3]}1' mapfile datafile

First, the array a, indexed by opcode, is populated with the human readable descriptions. Then, it is used during the reading of the second file to do the replacements. Each line of the datafile is then printed after the substitution has been made.
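Here is a small end-to-end run of that program on the sample data, as a sketch. The extra line with the code “zzz” is my own addition to show what happens with an opcode that is missing from the map file; the guarded variant (using “$3 in a”) is likewise an assumption about what one might want, not part of the original program.

```shell
tmp=$(mktemp -d)
cat > "$tmp/mapfile" <<'EOF'
abc withdrawal
def payment
xyz deposit
xxx balance
EOF
cat > "$tmp/datafile" <<'EOF'
20081010 1123 xxx
20081011 1234 def
20081012 0933 xyz
20081014 0800 zzz
EOF

# The program from the text: unknown codes end up replaced by an empty string.
mapped=$(awk 'NR==FNR{a[$1]=$2;next} {$3=a[$3]}1' "$tmp/mapfile" "$tmp/datafile")

# Hypothetical variant: only replace codes that actually exist in the map.
guarded=$(awk 'NR==FNR{a[$1]=$2;next} ($3 in a){$3=a[$3]} 1' "$tmp/mapfile" "$tmp/datafile")

printf '%s\n' "$guarded"
rm -rf "$tmp"
```

With the guard, the “zzz” line is printed unchanged, while the three known codes become “balance”, “payment” and “deposit”.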

Another case where the two-files idiom is useful is when you have to read the same file twice, the first time to get some information that can be correctly defined only by reading the whole file, and the second time to process the file using that information. For example, you want to replace each number in a list of numbers with its difference from the largest number in the list:

# replace each number with its difference from the maximum
awk 'NR==FNR{if($0>max) max=$0;next} {$0=max-$0}1' file file

Note that we specify “file file” on the command line, so the file will be read twice.

Caveat: programs that use the two-files idiom will not work correctly if the first file is empty (in that case, awk will execute the actions associated with NR==FNR while reading the second file). To correct that, you can reinforce the NR==FNR condition by adding a test that also checks that FILENAME equals ARGV[1].
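The caveat is easy to reproduce. The sketch below uses an empty first file and a two-line second file (both made up), and compares the plain idiom with the reinforced condition.

```shell
tmp=$(mktemp -d)
: > "$tmp/empty"                      # empty first file
printf 'x\ny\n' > "$tmp/second"

# Plain idiom: NR==FNR is true for the whole second file, so its lines are
# swallowed by the first block and nothing is printed.
plain=$(awk 'NR==FNR{a[$0];next} {print "second:", $0}' "$tmp/empty" "$tmp/second")

# Reinforced condition: the first block runs only while reading ARGV[1].
safe=$(awk 'NR==FNR && FILENAME==ARGV[1]{a[$0];next} {print "second:", $0}' "$tmp/empty" "$tmp/second")

printf 'plain: [%s]\n' "$plain"
printf '%s\n' "$safe"
rm -rf "$tmp"
```

The plain version prints nothing, while the reinforced one treats both lines as belonging to the second file.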

Pitfall: shorten pipelines

It’s not uncommon to see lines in scripts that look like this:

somecommand | tail -n +2 | grep foo | sed 's/foo/bar/' | tr '[a-z]' '[A-Z]' | cut -d ' ' -f 2

This is just an example. In many cases, you can use awk to replace parts of the pipeline, or even all of it:

somecommand | awk 'NR>1 && /foo/{sub(/foo/,"bar"); print toupper($2)}'

It would be nice to collect here many examples of pipelines that could be partially or completely eliminated using awk.
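As one such example, here is a sketch that runs a long pipeline and a single awk command on the same made-up input and keeps both results for comparison. It assumes the intent of the pipeline is to skip the first line (tail -n +2), which is what the NR>1 test in the awk version does; the sample() helper is purely illustrative.

```shell
# Made-up input: a header line followed by three data lines.
sample() { printf '%s\n' 'header foo x' 'a foo c' 'x y z' 'foo b c'; }

# Six processes...
long=$(sample | tail -n +2 | grep foo | sed 's/foo/bar/' | tr 'a-z' 'A-Z' | cut -d ' ' -f 2)

# ...versus one awk process doing the same job.
short=$(sample | awk 'NR>1 && /foo/{sub(/foo/,"bar"); print toupper($2)}')

printf '%s\n' "$short"
```

Both variables end up holding the same two lines. Note that cut splits on single spaces while awk splits on runs of whitespace, so the two are only equivalent for input with single-space separators, as here.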

Print lines using ranges

Yes, we all know that awk has builtin support for range expressions, like

# prints lines from /beginpat/ to /endpat/, inclusive
awk '/beginpat/,/endpat/'

Sometimes however, we need a bit more flexibility. We might want to print lines between two patterns, but excluding the patterns themselves. Or only including one. A way is to use these:

# prints lines from /beginpat/ to /endpat/, not inclusive
awk '/beginpat/,/endpat/{if (!/beginpat/&&!/endpat/)print}'

# prints lines from /beginpat/ to /endpat/, not including /beginpat/
awk '/beginpat/,/endpat/{if (!/beginpat/)print}'

It’s easy to see that there must be a better way to do that, and in fact there is. We can use a flag to keep track of whether we are currently inside the interesting range or not, and print lines based on the value of the flag. Let’s see how it’s done:

# prints lines from /beginpat/ to /endpat/, not inclusive
awk '/endpat/{p=0};p;/beginpat/{p=1}'

# prints lines from /beginpat/ to /endpat/, excluding /endpat/
awk '/endpat/{p=0} /beginpat/{p=1} p'

# prints lines from /beginpat/ to /endpat/, excluding /beginpat/
awk 'p; /endpat/{p=0} /beginpat/{p=1}'

All these programs just set p to 1 when /beginpat/ is seen, and set p to 0 when /endpat/ is seen. The crucial difference between them is where the bare “p” (the condition that triggers the printing of lines) is located. Depending on its position (at the beginning, in the middle, or at the end), different parts of the desired range are printed. To print the complete range (inclusive), you can just use the regular /beginpat/,/endpat/ expression or use the flag technique, but reversing the order of the conditions and associated patterns:

# prints lines from /beginpat/ to /endpat/, inclusive
awk '/beginpat/{p=1};p;/endpat/{p=0}'

It goes without saying that while we are only printing lines here, the important thing is that we have a way of selecting lines within a range, so you can of course do anything you want instead of printing.
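To see how the position of the bare “p” changes the result, the following sketch runs the four flag-based programs on the same six-line input. The input and the sample() helper are made up for illustration.

```shell
# Made-up input with one range in the middle.
sample() { printf '%s\n' x beginpat 1 2 endpat y; }

neither=$(sample | awk '/endpat/{p=0};p;/beginpat/{p=1}')     # excludes both markers
no_end=$(sample | awk '/endpat/{p=0} /beginpat/{p=1} p')      # excludes endpat only
no_begin=$(sample | awk 'p; /endpat/{p=0} /beginpat/{p=1}')   # excludes beginpat only
both=$(sample | awk '/beginpat/{p=1};p;/endpat/{p=0}')        # includes both markers

printf '%s\n' "$both"
```

The lines “1” and “2” appear in every result; only the presence of the “beginpat” and “endpat” lines differs.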

Split file on patterns

Suppose we have a file like this

line1
line2
line3
line4
FOO1
line5
line6
FOO2
line7
line8
FOO3
line9
line10
line11
FOO4
line12
FOO5
line13

We want to split this file on all the occurrences of lines that match /^FOO/, and create a series of files called, for example, out1, out2, etc. File out1 will contain the first 4 lines, out2 will contain “line5” and “line6”, etc. There are at least two ways to do that with awk:

# first way, works with all versions of awk
awk -v n=1 '/^FOO[0-9]*/{close("out" n);n++;next} {print > ("out" n)}' file

Since we don’t want to print anything when we see /^FOO/, but only update some administrative data, we use the “next” statement to tell awk to immediately start processing the next record. Lines that do not match /^FOO/ will instead be processed by the second block of code. Note that this method will not create empty files if an empty section is found (eg, if “FOO5\nFOO6” is found, the file “out6” will not be created). The “-v n=1” is used to tell awk that the variable “n” should be initialized with a value of 1, so effectively the first output file will be called “out1”.
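Here is a sketch of the first method on a shortened, made-up input that contains an empty section (FOO2 immediately followed by FOO3). The output file name is written as ("out" n), in parentheses, because an unparenthesized concatenation after “>” is parsed differently by some awk implementations.

```shell
tmp=$(mktemp -d)
printf '%s\n' line1 line2 FOO1 line3 FOO2 FOO3 line4 > "$tmp/file"

# Run inside the temporary directory so the out* files land there.
(
  cd "$tmp" &&
  awk -v n=1 '/^FOO[0-9]*/{close("out" n); n++; next} {print > ("out" n)}' file
)

created=$(cd "$tmp" && ls out*)
out1=$(cat "$tmp/out1")
out4=$(cat "$tmp/out4")
printf '%s\n' "$created"
rm -rf "$tmp"
```

Only out1, out2 and out4 are created: the section between FOO2 and FOO3 is empty, so nothing is ever written to out3.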

Another way (which however needs GNU awk to work) is to read one chunk of data at a time, and write that to its corresponding out file.

# another way, needs GNU awk
LC_ALL=C gawk -v RS='FOO[0-9]*\n' -v ORS= '{print > "out"NR}' file

The above code relies on the fact that GNU awk supports assigning a regular expression to RS (the standard only allows a single literal character or an empty RS). That way, awk reads a series of “records”, separated by the regular expression matching /FOO[0-9]*\n/ (that is, the whole FOO… line). Since newlines are preserved in each section, we set ORS to empty since we don’t want awk to add another newline at the end of a block. This method does create an empty file if an empty section is encountered. On the downside, it’s a bit fragile because it will produce incorrect results if the regex used as RS appears somewhere else in the rest of the input.

We will see other examples where gawk’s support for regexes as RS is useful. Note that the last program used LC_ALL=C at the beginning…

Locale-based pitfalls

Sometimes awk can behave in an unexpected way if the locale is not C (or POSIX, which should be the same). See for example this input:

-rw-r--r-- 1 waldner users 46592 2003-09-12 09:41 file1
-rw-r--r-- 1 waldner users 11509 2008-10-07 17:42 file2
-rw-r--r-- 1 waldner users 11193 2008-10-07 17:41 file3
-rw-r--r-- 1 waldner users 19073 2008-10-07 17:45 file4
-rw-r--r-- 1 waldner users 36332 2008-10-07 17:03 file5
-rw-r--r-- 1 waldner users 33395 2008-10-07 16:53 file6
-rw-r--r-- 1 waldner users 54272 2008-09-18 16:20 file7
-rw-r--r-- 1 waldner users 20573 2008-10-07 17:50 file8

You’ll recognize the familiar output of ls -l here. Let’s use a non-C locale, say, en_US.utf8, and try an apparently innocuous operation like removing the first 3 fields.

$ LC_ALL=en_US.utf8 awk --re-interval '{sub(/^([^[:space:]]+[[:space:]]+){3}/,"")}1' file
-rw-r--r-- 1 waldner users 46592 2003-09-12 09:41 file1
-rw-r--r-- 1 waldner users 11509 2008-10-07 17:42 file2
-rw-r--r-- 1 waldner users 11193 2008-10-07 17:41 file3
-rw-r--r-- 1 waldner users 19073 2008-10-07 17:45 file4
-rw-r--r-- 1 waldner users 36332 2008-10-07 17:03 file5
-rw-r--r-- 1 waldner users 33395 2008-10-07 16:53 file6
-rw-r--r-- 1 waldner users 54272 2008-09-18 16:20 file7
-rw-r--r-- 1 waldner users 20573 2008-10-07 17:50 file8

It looks like sub() did nothing. Now change that to use the C locale:

$ LC_ALL=C awk --re-interval '{sub(/^([^[:space:]]+[[:space:]]+){3}/,"")}1' file
users 46592 2003-09-12 09:41 file1
users 11509 2008-10-07 17:42 file2
users 11193 2008-10-07 17:41 file3
users 19073 2008-10-07 17:45 file4
users 36332 2008-10-07 17:03 file5
users 33395 2008-10-07 16:53 file6
users 54272 2008-09-18 16:20 file7
users 20573 2008-10-07 17:50 file8

Now it works. Another localization issue is the behavior of bracket expressions matching, like for example [a-z]:

$ echo 'èòàù' | LC_ALL=en_US.utf8 awk '/[a-z]/'
èòàù

This may or may not be what you want. When in doubt or when facing an apparently inexplicable result, try putting LC_ALL=C before your awk invocation.

Parse CSV

This is another thing people do all the time with awk. Simple CSV files (with fields separated by commas, and commas cannot appear anywhere else) are easily parsed using FS=','. There can be spaces around fields, and we don’t want them, like eg

    field1  ,   field2   , field3   , field4

Exploiting the fact that FS can be a regex, we could try something like FS='^ *| *, *| *$'. This can be problematic for two reasons:

  • actual data fields might end up corresponding either to awk’s fields 1 … NF or 2 … NF, depending on whether the line has leading spaces or not;
  • for some reason, assigning that regex to FS produces unexpected results if fields have embedded spaces (does anybody know why?).

In this case, it’s probably better to parse using FS=',' and remove leading and trailing spaces from each field:

# FS=','
for(i=1;i<=NF;i++){
gsub(/^ *| *$/,"",$i);
print "Field " i " is " $i;
}
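The loop above is only the body of a rule. As a sketch of how it might be wired into a complete command, the following runs it on the sample line shown earlier (the csv_fields variable name is just for illustration):

```shell
csv_fields=$(printf '    field1  ,   field2   , field3   , field4\n' |
  awk -F ',' '{
    for (i = 1; i <= NF; i++) {
      gsub(/^ *| *$/, "", $i)        # strip leading and trailing spaces
      print "Field " i " is " $i
    }
  }')
printf '%s\n' "$csv_fields"
```

Each of the four fields is reported without its surrounding spaces, and the field numbers stay 1 to 4 regardless of leading blanks on the line.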

Another common CSV format is

"field1","field2","field3","field4"

Assuming double quotes cannot occur in fields. This is easily parsed using FS='^"|","|"$' (or FS='","|"' if you like), keeping in mind that the actual fields will be in position 2, 3 … NF-1. We can extend that FS to allow for spaces around fields, like eg

   "field1"  , "field2",   "field3" , "field4"

by using FS='^ *"|" *, *"|" *$'. Usable fields will still be in positions 2 … NF-1. Unlike the previous case, here that FS regex seems to work fine. You can of course also use FS=',', and remove extra characters by hand:

# FS=','
for(i=1;i<=NF;i++){
gsub(/^ *"|" *$/,"",$i);
print "Field " i " is " $i;
}

Another CSV format is similar to the first CSV format above, but allows fields to contain commas, provided that the field is quoted:

 field1, "field2,with,commas"  ,  field3  ,  "field4,foo"

We have a mixture of quoted and unquoted fields here, which cannot be parsed directly by any value of FS (that I know of, at least). However, we can still get the fields using match() in a loop (and cheating a bit):

$0=$0",";                                  # yes, cheating
while($0) {
match($0,/[^,]*,| *"[^"]*" *,/);
f=substr($0,RSTART,RLENGTH);               # save what matched in f
gsub(/^ *"?|"? *,$/,"",f);                 # remove extra stuff
print "Field " ++c " is " f;
$0=substr($0,RSTART+RLENGTH);              # "consume" what matched (by position, so regex metacharacters in the field are harmless)
}

As the complexity of the format increases (for example when escaped quotes are allowed in fields), awk solutions become more fragile. Although I should not say this here, for anything more complex than the last example, I suggest using other tools (eg, Perl just to name one). Btw, it looks like there is an awk CSV parsing library here: http://lorance.freeshell.org/csv/ (I have not tried it).

Pitfall: validate an IPv4 address

Let’s say we want to check whether a given string is a valid IPv4 address (for simplicity, we limit our discussion to IPv4 addresses in the traditional dotted quad format here). We start with this seemingly valid program:

awk -F '[.]' 'function ok(n){return (n>=0 && n<=255)} {exit (ok($1) && ok($2) && ok($3) && ok($4))}'

This seems to work, until we pass it '123b.44.22c.3', which it happily accepts as valid. The fact is that, due to the way awk’s string-to-number conversion works, some strings may “look like” numbers to awk, even if we know they are not. The correct thing to do here is to perform a string comparison against a regular expression:

awk -F '[.]' 'function ok(n) {
return (n ~ /^([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])$/)
}
{exit (ok($1) && ok($2) && ok($3) && ok($4))}'
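Note that the program above exits with status 1 when the address is valid, which is the opposite of the usual shell convention. The sketch below wraps the same regex test in a hypothetical shell function, is_ipv4, that returns 0 for a valid address. The negation and the NF == 4 check (which rejects inputs with too few or too many parts) are my additions, not part of the original program.

```shell
# Hypothetical helper: succeeds (status 0) only for a dotted quad with four
# parts, each in the range 0-255.
is_ipv4() {
  printf '%s\n' "$1" | awk -F '[.]' '
    function ok(n) {
      return (n ~ /^([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])$/)
    }
    { exit !(NF == 4 && ok($1) && ok($2) && ok($3) && ok($4)) }'
}

for addr in 192.168.1.1 123b.44.22c.3 256.1.1.1 1.2.3.4.5; do
  if is_ipv4 "$addr"; then echo "$addr: valid"; else echo "$addr: invalid"; fi
done
```

Only the first address in the loop is reported as valid.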

Check whether two files contain the same data

We want to check whether two (unsorted) files contain the same data, that is, the set of lines of the first file is the same set of lines of the second file. One way is of course sorting the two files and processing them with some other tool (for example, uniq or diff). But we want to avoid the relatively expensive sort operation. Can awk help us here? The answer (you guessed it) is yes. If we know that the two files do not contain duplicates, we can do this:

awk '!($0 in a) {c++;a[$0]} END {exit(c==NR/2?0:1)}' file1 file2

and check the return status of the command (0 if the files are equal, 1 otherwise). The assumption we made that the two files must not contain duplicate lines is crucial for the program to work correctly. In essence, what it does is to keep track of the number of different lines seen. If this number is exactly equal to half the number of total input records seen, then the two files must be equal (in the sense described above). To understand that, just realize that, in all other cases (ie, when a file is only a partial subset or is not a subset of the other), the total number of distinct lines seen will always be greater than NR/2.

The program’s complexity is linear in the number of input records.
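Here is a minimal sketch of how the exit status might be used from a shell script. The same_lines function name and the three small files are made up for illustration, and the no-duplicates assumption still applies.

```shell
# Hypothetical wrapper: status 0 if both files hold the same set of lines.
same_lines() {
  awk '!($0 in a) {c++; a[$0]} END {exit (c == NR/2 ? 0 : 1)}' "$1" "$2"
}

tmp=$(mktemp -d)
printf '%s\n' a b c > "$tmp/f1"
printf '%s\n' c a b > "$tmp/f2"     # same lines as f1, different order
printf '%s\n' a b d > "$tmp/f3"     # one line differs

same_lines "$tmp/f1" "$tmp/f2" && r12=same || r12=different
same_lines "$tmp/f1" "$tmp/f3" && r13=same || r13=different
echo "f1 vs f2: $r12"
echo "f1 vs f3: $r13"
rm -rf "$tmp"
```

For f1 and f2, three distinct lines are seen out of six records, so the counts match; for f1 and f3, four distinct lines are seen, which is more than NR/2.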

Pitfall: contexts and variable types in awk

We have this file:

1,2,3,,5,foo
1,2,3,0,5,bar
1,2,3,4,5,baz

and we want to replace the last field with “X” only when the fourth field is not empty. We thus do this:

awk -F ',' -v OFS=',' '{if ($4) $6="X"}1'

But we see that the substitution only happens in the last line, instead of the last two as we expected. Why?

Basically, there are only two data types in awk: strings and numbers. Internally, awk does not assign a fixed type to the variables; they are literally considered to be of type “number” and “string” at the same time, with the number 0 and the null string being equivalent. Only when a variable is used in the program does awk automatically convert it to the type it deems appropriate for the context. Some contexts strictly require a specific type; in that case, awk automatically converts the variable to that type and uses it. In contexts that do not require a specific type, awk treats variables that “look like” numbers as numbers, and the other variables are treated as strings. In our example above, the simple test “if ($4)” does not provide a specific context, since the tested variable can be anything. In the first line, $4 is an empty string, so awk considers it false for the purposes of the test. In the second line, $4 is “0”. Since it looks like a number, awk uses it like a number, ie zero. Since 0 is considered false, the test is unsuccessful and the substitution is not performed.

Luckily, there is a way to help awk and tell it exactly what we want. We can use string concatenation and append an empty string to the variable (which does not change its value) to explicitly tell awk that we want it to treat it like a string, or, conversely, add 0 to the variable (again, without changing its value) to explicitly tell awk that we want a number. So this is how our program should be to work correctly:

awk -F ',' -v OFS=',' '{if ($4"") $6="X"}1'   # the "" forces awk to evaluate the variable as a string

With this change, in the second line the if sees the string “0”, which is not considered false, and the test succeeds, just as we wanted.

As said above, the reverse is also true. Another typical problematic program is this:

awk '/foo/{tot++} END{print tot}'

This, in the author’s intention, should count the number of lines that match /foo/. But if /foo/ does not appear in the input, the variable tot retains its default initial value (awk initializes all variables with the dual value “” and 0). print expects a string argument, so awk supplies the value “”. The result is that the program prints just an empty line. But we can force awk to treat the variable as numeric, by doing this:

awk '/foo/{tot++} END{print tot+0}'

The seemingly innocuous +0 has the effect of providing numeric context to the variable “tot”, so awk knows it has to prefer the value 0 of the variable over the other possible internal value (the empty string). Then, numeric-to-string conversion still happens to satisfy print, but this time what awk converts to string is 0, so print sees the string “0” as argument, and prints it.
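The two context tricks from this section can be checked side by side. The sketch below feeds the three-line sample file to both versions of the first program, then runs the counter program on input that contains no match. The variable names are illustrative.

```shell
data='1,2,3,,5,foo
1,2,3,0,5,bar
1,2,3,4,5,baz'

# Untyped test: the field "0" looks like a number, so it counts as false.
untyped=$(printf '%s\n' "$data" | awk -F ',' -v OFS=',' '{if ($4) $6="X"}1')

# String context: only the empty string is false, so "0" counts as true.
as_string=$(printf '%s\n' "$data" | awk -F ',' -v OFS=',' '{if ($4"") $6="X"}1')

# Numeric context for a never-incremented counter: prints 0, not an empty line.
count=$(printf 'no match here\n' | awk '/foo/{tot++} END{print tot+0}')

printf '%s\n' "$as_string"
echo "count: $count"
```

Only the string-context version replaces the last field on the second line as well as the third.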

Note that, if an explicit context has been provided to a variable, awk remembers that. That can lead to unexpected results:

# input: 2.5943 10
awk '{$1=sprintf("%d",$1); # truncates decimals, but also explicitly turns $1 into a string!
if($1 > $2) print "something went wrong!" }' # this is printed

Here, after the sprintf(), awk notes that we want $1 to be a string (in this case, “2″). Then, when we do if($1>$2), awk sees that $2 has no preferred type, while $1 does, so it converts $2 into a string (to match the wanted type of $1) and does a string comparison. Of course, 99.9999% of the times this is not what we want here. In this case, the problem is easily solved by doing “if ($1+0 > $2)” (doing $2+0 instead WON’T work!), doing “$1=$1+0” after the sprintf(), or using some other means to truncate the value of $1, that does not give it explicit string type.
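As a sketch of the suggested fixes, the commands below run the comparison on the same input three ways: the problematic version, the “$1+0” fix, and a variant that truncates with int() (one possible “other means” that keeps $1 numeric; choosing int() is my assumption). Each prints “greater” or “not greater” depending on how the comparison turns out.

```shell
# Pitfall: after sprintf(), $1 is a string, so the comparison is a string comparison.
pitfall=$(echo '2.5943 10' | awk '{$1 = sprintf("%d", $1); print (($1 > $2) ? "greater" : "not greater")}')

# Fix 1: force numeric context on $1 at comparison time.
fix1=$(echo '2.5943 10' | awk '{$1 = sprintf("%d", $1); print (($1+0 > $2) ? "greater" : "not greater")}')

# Fix 2: truncate with int(), which yields a number rather than a string.
fix2=$(echo '2.5943 10' | awk '{$1 = int($1); print (($1 > $2) ? "greater" : "not greater")}')

echo "pitfall: $pitfall"
echo "fix1:    $fix1"
echo "fix2:    $fix2"
```

Both fixes compare 2 with 10 numerically and report “not greater”. The first command is the case described in the text, where “2” and “10” are compared as strings.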

Pulling out things

Suppose you have a file like this:

Yesterday I was walking in =the street=, when I saw =a
black dog=. There was also =a cat= hidden around there. =The sun= was shining, and =the sky= was blue.
I entered =the
music
shop= and I bought two CDs. Then I went to =the cinema= and watched =a very nice movie=.
End of the story.

Ok, silly example, fair enough. But suppose that we want to print only and all the parts of that file that are like =something=. We have no knowledge of the structure of the file. The parts we’re interested in might be anywhere; they may span lines, or there can be many of them on a single line. This seemingly daunting and difficult task is actually easily accomplished with this small awk program:

awk -v RS='=' '!(NR%2)'
# awk -v RS='=' '!(NR%2){gsub(/\n/," ");print}' # if you want to reformat embedded newlines

Easy, wasn’t it? Let’s see how this works. Setting RS to ‘=’ tells awk that records are separated by ‘=’ (instead of the default newline character). If we look at the file as a series of records separated by ‘=’, it becomes clear that what we want are the even-numbered records. So, just throw in a condition that is true for even-numbered records to trigger the printing.
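Here is the technique applied to a shortened version of the story (the two-line text is abridged from the sample above), using the variant that turns embedded newlines into spaces:

```shell
story='Yesterday I was walking in =the street=, when I saw =a
black dog=. End of the story.'

# Records are separated by "="; the even-numbered ones are the delimited parts.
parts=$(printf '%s\n' "$story" | awk -v RS='=' '!(NR%2){gsub(/\n/," ");print}')
printf '%s\n' "$parts"
```

Splitting on “=” gives five records here; records 2 and 4 are “the street” and the two-line “a black dog”, which is printed on one line after the gsub().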

GNU awk can take this technique a step further, since it allows us to assign full regexes to RS, and introduces a companion variable (RT) that stores the part of the input that actually matched the regex in RS. This allows us, for example, to apply the previous technique when the interesting parts of the input are delimited by different characters or strings, like for example when we want everything that matches <tag>something</tag>. With GNU awk, we can do this:

gawk -v RS='</?tag>' 'RT=="</tag>"'

or again

gawk -v RS='</?tag>' '!(NR%2)'

and be done with that. Another nice thing that can be done with GNU awk and RT is printing all the parts of a file that match an arbitrary regular expression (something otherwise usually not easily accomplished). Suppose that we want to print everything that looks like a number in a file (simplifying, here any sequence of digits is considered a number, but of course this can be refined); we can do just this:

gawk -v RS='[0-9]+' 'RT{print RT}'

Checking that RT is not null is necessary because for the last record of a file RT is null, and an empty line would be printed in that case. The output produced by the previous program is similar to what can be obtained using grep -o. But awk can do better than that. We can use a slight variation of this same technique if we want to add context to our search (something grep -o alone cannot do). For example, let’s say that we want to print all numbers, but only if they appear inside “--”, eg like --1234--, and not otherwise. With gawk, we can do this:

gawk -v RS='--[0-9]+--' 'RT{gsub(/--/,"",RT);print RT}'

So, a carefully crafted RS selects only the “right” data, that can be subsequently extracted safely and printed.

With non-GNU awk, matching all occurrences of an expression can still be done, it just requires more code. See FindAllMatches.

Have fun!

Have fun learning Awk! It’s a fun language to know.

Ps. I will go silent for a week. I have an on-site interview with Google in Mountain View, California. I’ll be back on 31st of October and will post something new in the first week of November!


29 Responses

  1. Unix User says:

    Something I often need to do is match lines against a regexp, and print out a matching group within that line. But I have never been able to find a way to do this in awk, and end up resorting to Perl.

    So - is there a way to do something like this?

    /abc([0-9]+)def/ { print group(1); }

    so that input of:

    abc654def

    produces:

    654

    Thanks!

  2. waldner says:

    Unix User:

    That is easily done with gawk, see the last tip. You could do eg

    gawk -v RS='abc([0-9]+)def' 'RT{gsub(/[^0-9]/,"",RT); print RT}'

    Of course, the exact regexes used for RS and in the gsub vary from time to time depending on what you want to achieve. Another solution is using gensub(), again from gawk.

    Unfortunately, standard awk regexes lack backreferences, so getting what you want using standard awk would not be easy.

  3. waldner says:

    To pkrumins: something got lost during reformatting. The first two examples that use GNU awk and RT should be as follows:

    “…like for example when we want everything that matches <tag>something</tag>. With GNU awk, we can do this:

    gawk -v RS='</?tag>' 'RT=="</tag>"'

    or again

    gawk -v RS='</?tag>' '!(NR%2)'”

  4. waldner says:

    ok, now I see :-)

    Let’s see if it works this time:

    “…like for example when we want everything that matches <tag>something</tag>. With GNU awk, we can do this:

    gawk -v RS='</?tag>' 'RT=="</tag>"'

    or again

    gawk -v RS='</?tag>' '!(NR%2)'”

  5. waldner says:

    pkrumins:

    no, using int(n)==n to check if a number is valid in an IPv4 address won’t work. It will accept, for example, “+100”, which is not valid in a dotted quad IPv4 address.

  6. Thorsten Strusch says:

    @Unix User:
    tr should be the tool of your choice:

     echo "abc654dEF" | tr -d 'a-zA-Z'
    654

  8. zts says:

    Alternate solution for the IP address validation function:

    function ok(n){ return (n !~ /[^0-9]/) && (n>=0 && n<=255) }

    This just adds an additional test to assert that the value being tested contains only numeric characters.
  9. zts says:

    (I’ll try that again) Alternate solution for the IP address validation function - same as your first suggestion, but with a condition allowing only numeric values:

    function ok(n){ return (n !~ /[^0-9]/) && (n>=0 && n<=255) }

  13. Ruslan Abuzant says:

    Keep us updated with what goes there at Mt. View, CA :D

    Good luck newbie googler (Y)

  14. PS says:

    Found this site due to the article you wrote on perl one-liner youtube downloader (which no longer works) and I see you’ve digressed. Why would anyone go from perl to awk? Did you hit your head?

  16. waldner says:

    @zts:

    That’s a good one! Thank you.

    @PS:

    I agree perl is more powerful than awk. But I think you’ll agree that that is not a valid reason to stop using awk (or sed, or cat, or all the other tools that perl could easily replace).

  17. Steve Kinoshita says:

    Hi!

    This is a really helpful article thanks!

    I am trying to remove all double quotes and angled brackets, and replace all semicolons and colons with newlines in a text file with gawk.

    Can you help?

    I have trouble making my scripts work.
    I use gawk3.1.6 for Windows and following are some of the codes I have tried.

    awk {gsub(/,/,"\n")}1 
    awk {gsub(/"/,"")}1 
  18. waldner says:

    @Steve Kinoshita:

    To remove all double quotes and angled brackets, try this:

    gsub(/["<>]/,"")

    To replace all semicolons and colons with newlines, try this:

    gsub(/[;:]/,"\n")

    Since you say you’re using windows, I suggest you put your awk program in a separate file, and then run it using

    awk -f prog.awk yourfile
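
On a Unix-like shell, where single quotes protect the program from the shell, the two substitutions can be combined into one command. This is only a minimal sketch: clean_text is a hypothetical helper name and the sample input is invented for illustration.

```shell
# Remove double quotes and angle brackets, then turn every semicolon
# or colon into a newline. Reads the named files, or stdin if none given.
clean_text() {
    awk '{ gsub(/["<>]/, ""); gsub(/[;:]/, "\n") } 1' "$@"
}

# Example input invented for illustration.
printf '%s\n' '"alpha";<beta>:gamma' | clean_text
```

The trailing 1 is the usual awk idiom for "print the (modified) line".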
  19. Marco R. says:

    In “Pitfall: validate an IPv4 address” awk returns non-zero when the input is a valid IPv4 address and zero otherwise. That’s because awk’s boolean arithmetic assigns 1 to True and 0 to False.

    This is not what a shell programmer would expect because shells usually act in the opposite way: true=0 and false=1.

    Thus, the final “shell-compatible” script should be:

    awk -F '[.]' 'function ok(n) {
    return (n ~ /^([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])$/)
    }
    {exit (! ( ok($1) && ok($2) && ok($3) && ok($4) ) )}'

    However, I’d prefer to use something simpler:

    function ok(n) {
    if (n ~ /[^[:digit:]]/)
    return 1==0;
    return (n
    Not fully tested but should work the same

    my 2 pennies ;)
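
The idea discussed in these comments (digits only, value no larger than 255, exit status following the shell convention) can be sketched as a shell function. This is an illustration rather than any commenter's exact code: the name valid_ipv4 is hypothetical, and the NF == 4 check is an extra assumption added here so that addresses with too few or too many fields are rejected.

```shell
# Exit status 0 if the argument is a dotted quad with four octets in 0..255,
# 1 otherwise (shell convention: 0 means success).
valid_ipv4() {
    printf '%s\n' "$1" | awk -F '[.]' '
        # An octet is all digits and no larger than 255.
        function ok(n) { return (n ~ /^[0-9]+$/) && (n + 0 <= 255) }
        # NF == 4 is an extra check added in this sketch.
        { exit !(NF == 4 && ok($1) && ok($2) && ok($3) && ok($4)) }'
}

for ip in 192.168.1.1 256.1.1.1 1.2.3 1.2.3.x; do
    if valid_ipv4 "$ip"; then echo "$ip: valid"; else echo "$ip: invalid"; fi
done
```

Note that this version accepts leading zeros (for example 010.1.1.1), which may or may not be what you want.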

Wednesday, February 18, 2009

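S60 3rd Edition: An Introduction and Some Tips

Because many users still do not fully understand S60 3rd Edition phones, I have collected some information and usage tips about S60 3rd Edition here for reference. This is my first post, so please point out anything that is wrong.

1. Introduction to S60 3rd Edition
Phones running Symbian 9.0 with S60 3rd Edition (Symbian S60 OS v9.1) differ from earlier Symbian phones. Software written for earlier S60 versions must now carry Symbian certification before it can be installed, so software you developed yourself in the past can no longer be installed on Symbian S60 OS v9.1 phones; it has to be recompiled. This applies even to a tiny screenshot utility.
S60 3rd Edition is built on Symbian OS v9.1 and changes some important procedures, which Symbian C++ developers must understand before they can develop new software. Every new version of the S60 platform usually arrives with a long list of new features that make application development smoother, and 3rd Edition is no exception. However, 3rd Edition also makes some fundamental changes that affect the installation or execution of software written for earlier S60 versions. The two major changes are an overhaul of the compiler, which is what affects S60 developers the most, and the addition of platform security features.
S60 3rd Edition includes an installer that manages .sis files. This means that every .sis file must be properly certified before it can be installed on a Symbian phone, and software for versions before 3rd Edition can no longer be installed on 3rd Edition phones. This certification, called "Symbian Signed", is the official certification required of Symbian phone applications from now on, for 3rd Edition and later versions alike. In addition, a "Symbian developer certificate" is equally necessary for software developers.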
2. Analysis of the internal system files on 3rd Edition phones
(1) The Private folder
Dictionary: 101f9cfe
Software registration information (rsc files): 10003a3f\import\apps
Backups of software installation files (if a removed program leaves remnants in the application manager, delete them here): 10202dce
Java application folder: 102033E6\MIDlets
Folder for themes stored on the card: 10207114\import
BounceMP3 Ringtoneeditor: 20000c0f
QuickMark: 20004FFE
Java programs: 102033E6
MAIL2 SMS mail: 1000484b
CapsuleSE: 20001271
ThemeDIY: 20004A20
office suit sheet: 20002ee2
office suit word: 20002ee3
office suit docslauncher: 20002ee4
smartmovie: a0000b68
skyforce: A0000BF4
skyforce reload: A0000BF5
QReader: a0000c49
MWeather: A0000C98
Y-brower: A00007A6
photorite: a00008B1
SuperMiners: A020D913
BestCalc: A0000790
Sudoku: AB736950
OggPlay: F000A661
S-Tris2: F0202C7F
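(2) Directories on the SD card
data\mbook: configuration folder created when the 掌上书院 e-book reader is installed. If a book will not open, delete umdrcnt.lst and mdstng in this folder and open it again.
Images: where photos and pictures are stored. Photos go into the "图像" (Images) folder by default; when viewed with phone-management software or over a data cable (cable mode shows only drive E:, i.e. the memory card), the folder appears as Images.
In detail: if photos are set to be saved to phone memory, the file path is C:\Date\Images\2006××**\图像◎◎◎.JPG; if they are set to be saved to the card, the file path is E:\Images\2006××**\图像◎◎◎.JPG. If you browse via Tools -> File manager, photos are stored under the Images folder whether you open phone memory or the memory card, and are shown in the same way: 2006××**
2006××**\图像◎◎◎.JPG is what you get when the photo name is set to the "text" naming format. ×× is the month, and ** is a combination of a letter followed by a digit. This guarantees that even if you take photos on a great many occasions in one month, 24 letters and 10 digits give you up to 240 sessions, and each session can hold any number of photos.
If the name is set to the date format, the pattern above becomes 2006××**\2006××**◎◎◎.JPG
◎◎◎ is a running total; unless you format or reflash the phone, it keeps accumulating.
Installs: installation files.
Music Downloads: music downloaded with the built-in browser is stored here.
MyMusic: songs are stored here in music mode.
Sounds: ringtones.
Videos: video clips.
resource\apps: application text resources, mostly rsc files.
resource\help: help files bundled with applications.
resource\plugins: apparently the place for plug-ins, but at present it holds only rsc files.
System\[102072c3]: purpose currently unknown.
System\Install\Registry: Java application installation records.
System\Apps\Opera: created when Opera is installed.
System\Data\Opera: opera.ini in this folder adjusts the cache size; the cache4 directory is the cache directory.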
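(3) The system folder on drive C:
[Contacts] → c:\system\data\contacts.Cdb and C:\system\data\cntmodel.ini
[Menu] → c:\system\data\applications.Dat
[Standby mode] → c:\system\data\scshortcutengine.ini
[MMS settings] → c:\system\data\mms_setting.Dat
[SMS settings] → c:\system\data\smsreast.Dat,smssegst.Dat,sms_settings.Dat
[Alarm settings] → c:\system\data\alarmserver.lnl
[Connection settings] → c:\system\data\cdbv3.Dat
[Notepad] → c:\system\data\notepad.Dat
[WAP bookmarks] → c:\system\data\bookmarks1.db
[Profiles] → c:\system\data\profiles
[Calendar] → c:\system\data\calendar
[Favourites] → c:\system\favourites Note: these files can be moved to e:\system\favourites
Uninstall files: the files under c/system/install (provided the software was installed on drive C:) can all be deleted. Once they are deleted, however, the program no longer appears in the application manager list, and the only way to remove it is to delete the corresponding directory under e\system\apps\ directly.
Installation log: install.log under C/system/install. To remove the installation log, simply delete this file.
The directories under c\system\apps hold settings and saved data.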
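3. Commonly used secret codes for smartphones
*#06#: the IMEI, i.e. what we call the phone's serial number; works on almost every phone. IMEI stands for International Mobile Equipment Identity. IMEI = TAC + FAC + SNR + SP, where TAC is the type approval code (6 digits) and FAC is the final assembly code (2 digits). Because unscrupulous dealers are now able to alter serial numbers, Nokia has set digits 7 and 8 to 00 on all phones, so the place of manufacture can no longer be read from the IMEI. SNR is the serial number (6 digits) and SP is a spare digit (1 digit).
*#0000# (if this does not work on some models, try *#model code#, e.g. *#6110#): phone version information. Three lines are displayed: the first is the current firmware version, the second is the release date of that firmware (28 June 2004 in this example), and the third is the phone model code.
*#7370#: restore factory settings (soft format). This command is generally used when the phone is in an error state or has accumulated too much system junk. Before formatting, back up your contacts and any data you need with third-party software or the 6600 PC Suite. Make sure the battery is fully charged, and do not format with the charger connected. During formatting the phone shows only the "NOKIA" logo on a lit screen; never force a power-off or remove the battery before it finishes, or serious damage may result. When formatting is complete, re-enter the time and restore your contacts and data. Formatting restores all original settings: it wipes drive C: completely and writes fresh system information. Note that this format does not affect the contents of the MMC card.
*#7780#: restore factory settings, equivalent to Menu -> Tools -> Settings -> Phone settings -> General -> Original settings. Note that this command only restores settings and is not a format: contacts, pictures, documents and so on all remain, and only the settings are reset. It is useful if you have misconfigured something and do not know how to change it back.
*#92702689#: shows the total call time. This figure is not changed by formatting or reflashing, which makes it an effective check against second-hand phones.
Some of the codes above require the lock code, i.e. the phone password; do not confuse it with the SIM card password. The lock code is set under Menu -> Tools -> Settings -> Security settings -> Phone and SIM card -> Lock code. The initial lock code is 12345, and that is the default wherever a lock code is requested; if you have changed the lock code, use the new one.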
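
The IMEI layout given for *#06# (6-digit TAC, 2-digit FAC, 6-digit SNR, 1-digit SP) can be illustrated with a small shell function. This is only a sketch of the field split as the post describes it; split_imei is a hypothetical name, and the sample number is a made-up digit string rather than a real IMEI.

```shell
# Split a 15-digit IMEI into the four fields described in the post:
# TAC (6 digits), FAC (2), SNR (6) and SP (1).
split_imei() {
    imei=$1
    case $imei in
        ''|*[!0-9]*) echo "IMEI must contain digits only" >&2; return 1 ;;
    esac
    if [ "${#imei}" -ne 15 ]; then
        echo "IMEI must be 15 digits long" >&2
        return 1
    fi
    tac=$(printf '%s' "$imei" | cut -c1-6)
    fac=$(printf '%s' "$imei" | cut -c7-8)
    snr=$(printf '%s' "$imei" | cut -c9-14)
    sp=$(printf '%s' "$imei" | cut -c15)
    echo "TAC=$tac FAC=$fac SNR=$snr SP=$sp"
}

# Made-up digit string, not a real handset IMEI.
split_imei 123456001234567
```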
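4. Formatting the phone
There are two ways to format:
1) Soft format: enter *#7370# on the phone and you will be asked for the lock code. The initial code is 12345; if you have changed the phone password, use the new one (not the SIM card password). A white screen showing only the NOKIA logo then appears. Formatting finishes after 2 to 3 minutes, and you re-enter the time.
2) Hard format: switch the phone off. Hold down the call key, the "*" key and the "3" key and keep holding them while you switch the phone on, until the "NOKIA" logo appears (do not release any of the keys during this time). Wait a few seconds until "Formating……/" appears; only then release the keys. Formatting now begins. This format is more thorough and does not suffer from formats that fail to take effect. After a few minutes formatting completes, and the phone restarts by itself and goes to the standby screen.
Points to note for both methods: keep the battery fully charged, do not try to switch off partway through, and do not plug in the charger. In general, try the soft format first. (Remember to back up the data you want to keep before formatting.)
5. Restoring factory settings
Enter *#7780# on the standby screen. This is equivalent to Menu -> Tools -> Settings -> Phone settings -> General -> Original settings. Note that this command only restores settings and is not a format: contacts, pictures, documents and so on all remain, and only the settings are reset. It is useful if you have misconfigured something and do not know how to change it back.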
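6. Switching back to Chinese if the phone is in English after formatting
Open MENU - SETTINGS - PHONE SETT - GENERAL - PERSONALISATION - LANGUAGE
- PHONE LANGUAGE and choose 简体中文 (Simplified Chinese)
7. Tips for cleaning up drive C:
1) File-transfer method (recommended when less than 8 MB is free): first set message storage to phone memory, then check how much phone memory you have. Have another Bluetooth device send you a file larger than 8 MB until your phone reports that there is not enough space and drops the transfer. (When phone memory runs short while receiving a file, the system cleans up memory automatically, and if that is still not enough it drops the connection.) The free space on drive C: will then be larger.
2) Browsing method: go online with the built-in web browser (which uses a lot of memory) and open many pages until it reports that there is not enough memory to open a page. Exit the browser and clear the cache. (This method, too, only suits phones with fairly small internal memory, such as the 3250, N71 and 7610.)
3) SIM-swap method: if you only ever use one SIM card, the phone gradually slows down and the junk files on drive C: need clearing. The simplest way is to remove the miniSD card, swap in a different SIM card and switch the phone on. After 3 to 5 minutes in standby, switch off and put the original SIM card back. The Series 60 system then rewrites the data on drive C: and automatically clears out the files that are no longer used.
8. Notes on dealing with system freezes
1) When operating applications, press keys slowly rather than quickly; otherwise the phone may freeze, restart, or show a black or white screen. (On 3rd Edition phones in particular, some menus can take about 20 seconds to open, and pressing keys at random during that time causes system conflicts and freezes.)
2) If "system error" appears at power-on and the phone sticks at the white "NOKIA" screen, hold down the pencil key and press the confirm (tick) key repeatedly to force your way in. Then use seleq to delete the system/recogs folder on both C: and E:, and switch the phone off and on again. (This method also helps when the phone will not start with the card inserted after new software has been installed.)
3) If the phone freezes when launching an application, let it restart by itself if at all possible. It is better to wait a few extra minutes than to restart by pulling the battery, which is bad for the hardware.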

Monday, February 16, 2009

The CMB bank is 15.33 now.

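22 Handy Online Image Tools

The Web 2.0 era really has brought us a lot of convenience: many things that used to require specialist knowledge can now be done with simple online tools, image effects for example. Below are the handy online image tools I have bookmarked. Unlike the novelty image-generator sites covered before, these are small image-editing tools for things like reflections and watermarks, which used to require Photoshop.

1. Mirroreffect: an online image-effect tool. After uploading a picture, choose the reflection direction (bottom, right, left or top) and set the size, gap and opacity of the reflection to give your picture a mirror effect instantly.

2. Watereffect: a simple online tool. Select a local picture; once uploaded, it is automatically turned into a water-reflection effect, in GIF format.

3. Pizap: a fun online image-editing service. After uploading a picture you can add speech bubbles, text and all kinds of decorations online; a few simple additions bring a picture to life.

4. Picmarkr: a compact online watermarking service that stamps your own badge on every picture. It offers three kinds of watermark: text, image and tiled.

5. Smushit: an online tool that optimizes and compresses the file size of your pictures without affecting their quality or appearance in the slightest.

6. Gifmake: a very simple, compact online GIF maker that can be used directly without registering. The process is extremely easy: upload at least two pictures (both local upload and upload from the web are supported), set the display time, and click to generate.

7. Reflectionmaker: lets you add a reflection to a picture easily. It supports uploads of up to 200 KB as well as loading pictures directly from the web, and the reflection background and size can be configured.

8. Photo-notes: a site offering a simple image-annotation service. Put simply, you can add personalized text notes to your photos to dress them up or make them more fun.

9. BubbleSnaps: a site offering simple online text captions for pictures; it can also serve as a simple comic-making tool.

10. Wavemypic: adds a rippling water reflection to photos; quite fun.

11. finetuna: a small image-editing tool whose main function is annotating the elements within a picture, making the picture clearer and more vivid.

12. roundpic: an online tool for rounding image corners. It rounds the four corners of a picture to varying degrees, and lets you set the background color behind the corners and adjust the image quality.

13. Picshadow: a simple, free online image tool. It does only one thing, but a useful one: it adds a shadow to a picture.

14. pixizer: like resizeyourimage, a service for changing image dimensions. With it you can modify a picture in several ways, including proportional resizing, custom width and height, and changing the file size.

15. quickthumbnail: a very good online image-resizing tool; once a picture is uploaded you can adjust its size.

16. resizeyourimage.com: offers a very professional image-resizing service. You can freely crop, scale and rotate, and then crop the scaled and rotated picture again.

17. Fotowoosh: converts any picture into a 3D image. Users upload a 2D image, which is converted into a 3D animation that can be embedded in any website.

18. web resizer: an online tool for optimizing and compressing pictures. Its distinguishing feature is that it can compress pictures substantially, making them smaller.

19. Picslice: a small online image tool that offers two functions once a picture is uploaded: (1) slicing, which conveniently splits a large picture into equal smaller pictures, with a grid you set yourself between 1×1 and 20×20; (2) cropping, which cuts the picture down to the size you need.

20. SayTweet: lets users add Twitter-conversation-style bubbles to any picture, updated in real time as Twitter updates. Great fun!

21. Picfont: an online image service that lets you add text captions to any picture, handy whenever Photoshop is not convenient.

22. picreflect: offers somewhat richer reflection options. With picreflect you can set the reflection's height, opacity, size and rotation, whether the mirror leans left or right, the background color and the rotation angle.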

Saturday, February 7, 2009

Get To Know Linux: Understanding smb.conf

Next to the xorg.conf file (read my Get To Know Linux: Understanding xorg.conf for more) the smb.conf file might be the most misunderstood of all files. Part of the reason is that the default file is, well, rather large and confusing. When you compare what you need vs what you have (in the default at least), you will be surprised at how simple Samba can be to configure.

After Samba is installed the smb.conf file will be around 533 lines long. Fear not. It’s much easier than it seems.

The smb.conf file is broken into sections. Each section will start with a line that looks like:

[TITLE]

Where TITLE is the actual title of the block. Each block represents either a configuration or a share that other machines can connect to. You will, at minimum, have a global block and a single share.

Global

The global block is one of the more important blocks in your smb.conf file. This block defines the global configuration of your Samba server. This block begins with:

[global]

Within your blocks your configuration lines will be made up of:

option = value

statements.

The most important statements you will need in your global block are:
netbios name = NAME
workgroup = WORKGROUP_NAME
security = SECURITY_TYPE
encrypt passwords = YES/NO
smb passwd file = /path/to/smbpasswd
interfaces = ALLOWED_ADDRESSES

The values for each option above should be self explanatory. But there is one thing to note. If you are encrypting passwords you will need to add users (with passwords) with the smbpasswd command.
Within the global block one of the more important options is the security option. This option refers to authentication (how users will be able to log in). There are five different types of security:

  • ADS - Active Directory Domain
  • Domain - User verification through an NT Primary or Backup Domain Controller
  • Server - Samba server passes on authentication to another server
  • Share - Users do not have to enter username or password (until they try to access a specific directory)
  • User - Users must provide valid username/password. This is the default.

Share Blocks

The next blocks will refer to individual shares. You will need a different block for each directory you want to share to Samba users. A typical share block will look like this:
[SHARE NAME]
comment = COMMENT
path = /path/to/share
writeable = YES/NO
create mode = NUMERIC VALUE
directory mode = NUMERIC VALUE
locking = YES/NO

Everything in caps above will be defined according to your needs. The tricky entries are the create and directory modes, which define the permissions given to files and directories created in the share. The values are octal permission modes such as 0700 or 0600 (depending upon your permission needs). Remember, you will need a share block for every directory you want to share out.
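
To make the numeric values concrete, here is a small shell sketch (not from the original article) that applies the same octal modes to a scratch file and directory and reads them back. It assumes GNU coreutils for stat -c; the octal_mode helper and the file names are made up for illustration, and nothing here touches Samba itself.

```shell
# Sketch: what the octal modes 0600 and 0700 grant, shown on a scratch
# file and directory. Assumes GNU coreutils (stat -c).
octal_mode() {
    stat -c '%a' "$1"
}

scratch=$(mktemp -d)
touch "$scratch/song.ogg"
mkdir "$scratch/albums"

chmod 0600 "$scratch/song.ogg"   # owner: read+write; group/other: nothing
chmod 0700 "$scratch/albums"     # owner: read+write+enter; group/other: nothing

ls -ld "$scratch/song.ogg" "$scratch/albums"
echo "file mode: $(octal_mode "$scratch/song.ogg"), directory mode: $(octal_mode "$scratch/albums")"
```

Directories need the execute bit to be entered, which is why the directory value is 0700 where the file value is 0600.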

Naturally there are plenty of options that can be used in Samba. Many of these options will fall in the global block.

Printer Block

You can also define a block to share out printers. This block will start with:

[printers]

and will contain options like:
comment = COMMENT
path = /PATH/TO/PRINTER/SPOOL
browseable = YES/NO
guest ok = YES/NO
writable = YES/NO
printable = YES/NO
create mode = NUMERIC VALUE

Sample smb.conf

I have an external drive that I mount to /media/music and I share out to my home network with the following smb.conf file:
[global]
netbios name = MONKEYPANTZ
workgroup = MONKEYPANTZ
security = user
encrypt passwords = yes
smb passwd file = /etc/samba/smbpasswd
interfaces = 192.168.1.1/8
[wallen music]
comment = Music Library
path = /media/music
writeable = yes
create mode = 0600
directory mode = 0700
locking = yes

And that’s it. That is my entire smb.conf file. Granted I am only sharing out a single directory, but it shows how simple smb.conf can be to configure.
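
Because every block starts with a [TITLE] line, the shares a file defines can be listed with a few lines of awk. The following is a rough sketch, not part of Samba: list_shares is a hypothetical helper, and the scratch file is a trimmed copy of the sample above.

```shell
# Print every [section] name in an smb.conf-style file except [global].
list_shares() {
    awk '/^[ \t]*\[.*\][ \t]*$/ {
             name = $0
             gsub(/^[ \t]*\[|\][ \t]*$/, "", name)
             if (tolower(name) != "global") print name
         }' "$1"
}

# Scratch copy of the sample configuration, trimmed to a few lines.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[global]
netbios name = MONKEYPANTZ
security = user
[wallen music]
path = /media/music
writeable = yes
EOF

list_shares "$conf"
```

For a real syntax check of a configuration, Samba's own testparm utility is the tool to reach for; the sketch only shows how the block structure can be read.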

Friday, February 6, 2009

Get To Know Linux: Understanding xorg.conf

The xorg.conf file is one of those files that makes many Linux users cringe with fear at the thought of having to configure it. There is a reason for that: it’s complex. But when you have an understanding of the pieces that make up the whole puzzle, configuring X Windows becomes much, much easier.

But now the Linux community has distributions, such as Fedora 10, that do not default to using an xorg.conf file. This is great news for many users. However, it’s bad news when, for some reason, X isn’t working or you have specific needs that the default isn’t meeting. With that in mind we’re going to break down the xorg.conf file so that you will be able to troubleshoot your X Windows configuration when something is wrong.

The Basics

The first thing you need to know is that xorg.conf (located typically in /etc/X11) is broken up into sections. Each section starts with the tag Section and ends with the tag EndSection. Each section can be broken into subsections as well. A subsection starts with the tag SubSection and ends with the tag EndSubSection. So a typical section with subsections contains the tags:
Section Name
Section Information
SubSection Name
SubSection information
EndSubSection
EndSection

Of course you can’t just use random sections. There are specific sections to use. Those sections are:

  • Files - pathnames for files such as fontpath
  • ServerFlags - global Xorg server options
  • Module - which modules to load
  • InputDevice - keyboard and pointer (mouse)
  • Device - video card description/information
  • Monitor - display device description
  • Modes - define video modes outside of Monitor section
  • Screen - binds a video adapter to a monitor
  • ServerLayout - binds one or more screens with one or more input devices
  • DRI - optional direct rendering infrastructure information
  • Vendor - vendor specific information
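
To see which of these sections a given file actually contains, the Section tags can be picked out with awk. This is a minimal sketch: list_sections is a hypothetical helper, and the scratch file holds a shortened, made-up configuration rather than a complete xorg.conf.

```shell
# Print the name of each top-level Section in an xorg.conf-style file.
# SubSection lines are skipped because their first word differs.
list_sections() {
    awk 'tolower($1) == "section" { gsub(/"/, "", $2); print $2 }' "$1"
}

xorg=$(mktemp)
cat > "$xorg" <<'EOF'
Section "Device"
    Identifier "device1"
    Driver "openchrome"
EndSection
Section "Screen"
    Identifier "screen1"
    Device "device1"
EndSection
EOF

list_sections "$xorg"
```

On a real system the argument would typically be /etc/X11/xorg.conf.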

Each section will have different information/options and is set up:

Option Variable

Let’s take a look at a sample section. We’ll examine a Device section from a laptop. The section looks like:

Section "Device"
Identifier "device1"
VendorName "VIA Technologies, Inc."
BoardName "VIA Chrome9-based cards"
Driver "openchrome"
Option "DPMS"
Option "SWcursor"
Option "VBERestore" "true"
EndSection

The above section configures a Via Chrome video card (often a tricky one to get running) using the openchrome driver. Here’s how this section breaks down:

  • The identifier (labeled “device1”) connects this section to the Screen section, which refers to it with its Device “device1” line.
  • The VendorName and BoardName both come from the make and model of the video adapter.
  • The Driver is the driver the video card will use.
  • Option “DPMS” - this enables Display Power Management Signaling.
  • Option “SWcursor” - this enables the cursor to be drawn by software (as opposed to the HWcursor drawing by hardware).
  • Option “VBERestore” “true” - allows a laptop screen to restore from suspend or hibernate.

The lengthiest section of your xorg.conf file will most likely be your Screen section. This section will contain all of the subsections that contain the modes (resolutions) for your monitor. This section will start off like this:

Section "Screen"
Identifier "screen1"
Device "device1"
Monitor "monitor1"
DefaultColorDepth 24

Notice how the above section references both a device and a monitor. These will refer to other sections in the xorg.conf file. This section also contains the DefaultColorDepth which will define the default color depth for your machine. In the case above the default is 24. Now, take a look below at the SubSections of this section:

Subsection "Display"
Depth 8
Modes "1440x900" "1280x800"
EndSubsection
Subsection "Display"
Depth 15
Modes "1440x900" "1280x800"
EndSubsection
Subsection "Display"
Depth 16
Modes "1440x900" "1280x800"
EndSubsection
Subsection "Display"
Depth 24
Modes "1440x900" "1280x800"
EndSubsection
EndSection

As you can see there is a SubSection for each of four different color depths. Included in those subsections is the default 24. So when X reads the DefaultColorDepth option it will automatically attempt to set the modes configured in the Depth 24 subsection. Also notice that each subsection contains two resolutions. X will attempt to set the first resolution (in the case above our first default is 1440×900) and move on to the next if it cannot set the first. Most likely X will be able to set the first.
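
The selection logic just described (take the default depth, find the Display subsection with the matching Depth, try its Modes in order) can be mimicked with awk to check what a file asks for. This is a sketch of how to read the file, not of how the X server is implemented; default_modes is a hypothetical helper, and it assumes the default-depth line comes before the subsections, as in the example.

```shell
# Print the Modes of the Display subsection whose Depth matches the
# default depth. Assumes the default-depth line precedes the subsections.
default_modes() {
    awk '
        tolower($1) == "defaultcolordepth" || tolower($1) == "defaultdepth" { want = $2 }
        tolower($1) == "subsection"    { depth = ""; modes = "" }
        tolower($1) == "depth"         { depth = $2 }
        tolower($1) == "modes"         { modes = $0; sub(/^[ \t]*[A-Za-z]+[ \t]+/, "", modes) }
        tolower($1) == "endsubsection" { if (depth != "" && depth == want) print modes }
    ' "$1"
}

# Scratch file with two of the four subsections from the example.
xorg=$(mktemp)
cat > "$xorg" <<'EOF'
Section "Screen"
    Identifier "screen1"
    DefaultColorDepth 24
    Subsection "Display"
        Depth 16
        Modes "1440x900" "1280x800"
    EndSubsection
    Subsection "Display"
        Depth 24
        Modes "1440x900" "1280x800"
    EndSubsection
EndSection
EOF

default_modes "$xorg"
```

The modes are printed in file order, which is also the order in which X tries them.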

Final Thoughts

This is only meant to be an introduction to the xorg.conf configuration file. As you might guess, xorg.conf can get fairly complex. Add to the complexity the numerous options available for each section and you have a valid case to make sure you RTFM (read the fine man page). And the man page is an outstanding resource to find information on all of the available options. To read the man page issue the command man xorg.conf from the command line.

By having a solid understanding of the xorg.conf file you won’t have any problems fixing a fubar’d X installation or tweaking your xorg.conf file to get the most from your new video card.

Wednesday, February 4, 2009

Recovering Ubuntu After Installing Windows

Using the Ubuntu Desktop/Live CD

Quick Start

This option will use the Desktop/Live CD to install Grub into your MBR (Master Boot Record). This option will overwrite your Windows Boot Loader. It is OK to do this; in fact that is the goal of this how-to (in order to boot Ubuntu).

1. Boot the Desktop/Live CD. (Use Ubuntu 8.04 or later)

2. Open a terminal (Applications -> Accessories -> Terminal)

3. Start grub as root with the following command :

  • sudo grub

4. You will get a grub prompt (see below) which we will use to find the root partition and install grub to the MBR (hd0)

  • [ Minimal BASH-like line editing is supported. For
    the first word, TAB lists possible command
    completions. Anywhere else TAB lists the possible
    completions of a device/filename. ]

    grub>

    Type the following and press enter:

    grub> find /boot/grub/stage1

    If you get "Error 15: File not found", try the following:

    grub> find /grub/stage1

    Using this information, set the root device (fill in X,Y with whatever the find command returned):

    grub> root (hdX,Y)

    Install Grub:

    grub> setup (hd0)

    Exit Grub:

    grub> quit

5. Reboot (to hard drive). Grub should be installed and both Ubuntu and Windows should have been automatically detected.

6. If, after installing grub, Windows will not boot you may need to edit /boot/grub/menu.lst (That is a small "L" and not the number 1 in menu.lst)

  • Open a terminal and enter :
     gksu gedit /boot/grub/menu.lst
    Or, in Kubuntu:
     kdesu kate /boot/grub/menu.lst
    Your Windows stanza should look something like this :
     title Windows XP/Vista # You can use any title you wish, this will appear on your grub boot menu
    rootnoverify (hd0,0) #(hd0,0) will be most common, you may need to adjust accordingly
    makeactive
    chainloader +1
    Note: Put your Windows stanza before or after AUTOMAGIC KERNEL LIST in the menu.lst

Overwriting the Windows bootloader

Boot from a Live CD and open a terminal. You'll need to run a few commands as root so you can use sudo -i to get a root shell and run them normally instead of using sudo on each of them. Be extra careful when running a root shell, especially for typos !

We'll need to find which partition your Ubuntu system is installed on. Type the command fdisk -l. It will output a list of all your partitions, for example :

fdisk -l

Disk /dev/hda: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/hda1 1 8 64228+ 83 Linux
/dev/hda2 9 1224 9767520 83 Linux
/dev/hda3 * 1225 2440 9767520 a5 FreeBSD
/dev/hda4 2441 14593 97618972+ 5 Extended
/dev/hda5 14532 14593 498015 82 Linux swap / Solaris
/dev/hda6 2441 14530 97112862 83 Linux

Partition table entries are not in disk order

Here I have three Linux partitions. /dev/hda2 is my root partition, /dev/hda1 is my /boot partition and /dev/hda6 is my /home partition. If you only have one, obviously this is the one your Ubuntu system is installed on. If you have more than one and you don't know which one your Ubuntu is installed on, we'll look for it later. First, create a mountpoint for your partition, for example :

mkdir /mnt/root

Then mount your partition in it. If you don't know which one it is, then mount any of them, and we'll see if it's the correct one.

mount -t ext3 /dev/hda2 /mnt/root

Of course, replace /dev/hda2 with the correct name of your partition. You can check if it's the correct one by running ls /mnt/root, which should output something like this :

bin    dev      home        lib    mnt   root     srv  usr
boot etc initrd lib64 opt sbin sys var
cdrom initrd.img media proc selinux tmp vmlinuz

If what you have looks nothing like this, you didn't mount the correct partition. Do umount /mnt/root to unmount it and try another one. You also need to mount your /boot partition if you made one, like this :

mount -t ext3 /dev/hda1 /mnt/root/boot

To make sure it was the correct one, run ls /mnt/root/boot, which should output something like this :

config-2.6.18-3-686      initrd.img-2.6.18-3-686.bak  System.map-2.6.18-3-686
grub lost+found vmlinuz-2.6.18-3-686
initrd.img-2.6.18-3-686 memtest86+.bin

Once again, if what you have doesn't fit, unmount it and try another partition.
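
The "does this look like a root partition?" check done by eye with ls can also be scripted. The following is a minimal sketch under stated assumptions: looks_like_root is a hypothetical helper, the list of directories it tests is a guess based on the ls output above, and the demo uses scratch directories in place of real mounted partitions.

```shell
# Return success if the directory contains the top-level directories
# a Linux root filesystem normally has.
looks_like_root() {
    for d in bin boot etc usr var; do
        [ -d "$1/$d" ] || return 1
    done
}

# Scratch directories standing in for two mounted partitions.
good=$(mktemp -d)
bad=$(mktemp -d)
mkdir "$good/bin" "$good/boot" "$good/etc" "$good/usr" "$good/var"

for mnt in "$good" "$bad"; do
    if looks_like_root "$mnt"; then
        echo "$mnt looks like a root partition"
    else
        echo "$mnt does not; unmount it and try another"
    fi
done
```

In the real procedure the argument would be /mnt/root after each mount attempt.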

Now that everything is mounted, we just need to reinstall GRUB :

sudo grub-install --root-directory=/mnt/root /dev/hda

If you got BIOS warnings try:

sudo grub-install --root-directory=/mnt/root /dev/hda --recheck

Of course, replace /dev/hda with the location you want to install GRUB on. If all went well, you should see something like this :

Installation finished. No error reported.
This is the contents of the device map /boot/grub/device.map.
Check if this is correct or not. If any of the lines is incorrect,
fix it and re-run the script `grub-install'.

(hd0) /dev/hda

Now you can reboot and the GRUB menu should appear. If you see a warning message regarding XFS filesystem, you can ignore it.

Preserving Windows Bootloader

The method shown above puts GRUB back on the MBR (master boot record) of the hard drive instead of in the root partition. But you probably won't want that, if you use a third-party boot manager like Boot Magic or System Commander. (The original poster also suggested that this would be useful to restore the Grub menu after a re-ghosting.) In that case, use this alternative.

This alternative, used without a third-party boot manager, will not cause Ubuntu to boot.

If you have your Linux system on a second (or third...) hard disk this method will not work. Please check Super Grub Disk's method, which addresses this problem.

1. Boot from a Live CD, like Ubuntu Live, Knoppix, Mepis, or similar.

2. Open a Terminal. Open a root terminal (that is, type "su" in a non-Ubuntu distro, or "sudo -i" in Ubuntu). Enter root passwords as necessary.

3. Type "grub" which makes a GRUB prompt appear.

4. Type "find /boot/grub/stage1". You'll get a response like "(hd0)" or in my case "(hd0,3)". Use whatever your computer spits out for the following lines. Note that you should have mounted the partition which has your Linux system before typing this command. (e.g. In Knoppix Live CD partitions are shown on the desktop but they're not mounted until you double-click on them or mount them manually)

5. Type "root (hd0,3)".

6. Type "setup (hd0,3)". This is key. Other instructions say to use "(hd0)", and that's fine if you want to write GRUB to the MBR. If you want to write it to your linux root partition, then you want the number after the comma, such as "(hd0,3)".

7. Type "quit".

8. Restart the system. Remove the bootable CD.

From: http://ubuntuforums.org/showpost.php?p=121355&postcount=5

From Inside Ubuntu

You have to run "grub" not from the Ubuntu Desktop/Live CD, but from your disk installation to make it work. To do this mount your root partition (following examples assume a root partition on hda1):

sudo mkdir /mnt/linux
sudo mount /dev/hda1 /mnt/linux

then change directory to your installation sbin and run grub from there

cd /mnt/linux/sbin
sudo ./grub

Using the Unofficial "Super Grub Disk"

From within Windows

  • Download Auto Super Grub Disk

  • Double-click auto_super_grub_disk_1.0 icon, install it, and reboot.

  • On the next boot, select the UNetbootin-supergrubdisk menu entry; this will launch the Auto Super Grub Disk.

  • Do nothing till you see your Grub menu again.

  • Next time you boot Windows, click yes when asked to remove UNetbootin-supergrubdisk to remove the Super Grub Disk menu entry.

As a standalone cd/floppy/usb

  • Download Super Grub Disk

  • Burn into a cdrom (better) or a floppy
  • Boot from it
  • Select: GRUB => MBR & !LINUX! (>2) MANUAL |8-)

  • Select the Linux or Grub installation you want to restore.

  • You see the message: SGD has done it!
  • Reboot
  • You're done.

Preserving Windows Bootloader

The method shown above puts GRUB back on the MBR (master boot record) of the hard drive instead of in the root partition. But you probably won't want that, if you use a third-party boot manager like Boot Magic or System Commander. (The original poster also suggested that this would be useful to restore the Grub menu after a re-ghosting.) In that case, use this alternative.

This alternative, used without a third-party boot manager, will not cause Ubuntu to boot.

This alternative will let you boot Linux installations on a second hard disk from Windows, whereas the "Preserving Windows Bootloader" instructions under "Using the Ubuntu Desktop/Live CD" will not.

Either:

  • Download Super Grub Disk

  • Burn into a cdrom (better) or a floppy
  • Boot from it

Or:

  • Download UNetbootin Super Grub Disk Loader (Windows .exe version)

  • Run the installer and reboot once it has finished installing.
  • On the next boot, select the "UNetbootin-supergrubdisk" menu entry; this will launch the Super Grub Disk interface.

Then:

  • Super Grub Disk (WITH HELP) :-)))

  • Select: your language

  • Select: Windows

  • Select: Windows chainloads Grub!

  • Select the Linux or Grub installation you want to restore to its own partition.

  • You see the message: SGD has done it!
  • Reboot
  • You're done.

Using Microsoft Vista

If you have Vista installed, then installed Ubuntu, and Ubuntu did not show up as a dual-boot option after you rebooted, go into Vista (since that is all you can do) and use the program EasyBCD version 1.7. It looks like this: http://aycu01.webshots.com/image/31560/2002188190250314159_rs.jpg

Add your Linux install to the boot sequence.

Troubleshooting

This section applies to...

  • Dual-boot setups in which Windows was installed after Ubuntu
  • Conditions where Windows failure forced a re-installation
  • Windows recovery techniques involving the "restoration" of the MBR
  • Cases where GRUB failed to install

Prerequisites:

  • Your Ubuntu partitions are all still intact
  • You have a LiveCD, such as the Ubuntu Desktop CD, or anything you're comfortable with
  • You're familiar enough with your LiveCD to gain access to a console
  • You remember how you set up your partitions (having a printout of /etc/fstab is ideal, though you can make do with the output of fdisk -l /dev/hda)

  • Knowledge of how your kernel works (specifically with regards to initrd), if you're using a non-Ubuntu kernel or built your own
  • Your kernel's version; this howto assumes 2.6.10-5-386

Preparing Your Working Environment

To begin the restoration procedure, insert your LiveCD and reboot your computer. Proceed with your LiveCD's bootup procedure until you are presented with an interface. If your LiveCD does not immediately present you with a console, also called a terminal, open one -- to do this with the Ubuntu LiveCD, click Applications -> System Tools -> Terminal.

Note: Since this is a LiveCD environment, any changes to user accounts or filesystem layouts at this level will not be permanent. This means you can set a temporary root password and create directories without affecting your actual installation.

Now, you need to gain root access. Under Ubuntu, this can be done with the following commands:

sudo -i

Under Knoppix, the following command will suffice, and you will not be prompted for a password.

su -

Now that you have root access, you need to mount the partition(s) containing your bootloader files.

You will need access to both your /sbin/ and /boot/ directories. If you have a /boot/ listing in your fstab, you are among those who will need to mount two partitions.

Begin by creating a mount point for your working environment -- you'll notice this is the same as creating a directory.

mkdir /mnt/work

If you need to mount /boot/, too, run the following command.

mkdir /mnt/work/boot

Now it's time to actually load your filesystem data. Review your fstab and identify the location(s) of / and /boot/; these will likely look something like /dev/hda4 and /dev/hda3 respectively, though the letter 'a' and the numbers 4 and 3 may differ.

Note: For the remainder of this howto, / on /dev/hda4 and /boot/ on /dev/hda3 will be assumed, so alter them as needed when typing them in yourself.

Enter the following commands to load your filesystem and some information GRUB may need.

mount /dev/hda4 /mnt/work
mount -o bind /dev /mnt/work/dev
mount -o bind /proc /mnt/work/proc
cp /proc/mounts /mnt/work/etc/mtab

Now, you have to enter your working environment. The following command will take care of that.

chroot /mnt/work/ /bin/bash

Warning: From this point on, any files you modify will affect your Ubuntu system. You have left the safety of the LiveCD. Exercise caution.

Recovering GRUB Automatically

If you have a separate /boot/ partition, type the following line.

sudo mount /dev/hda3 /boot/

Reinstalling GRUB from this point is easy. Just enter the following command.

sudo /sbin/grub-install /dev/hda

If the command you used above failed, which is unlikely, you will need to configure GRUB manually (it isn't too hard); if it succeeded, you should read the note at the start of the final section: "Configuring the GRUB Menu".

Recovering GRUB Manually

Before you can undertake the next step, it's important that you understand how GRUB identifies partitions.

To GRUB, numbers begin with 0, and letters are expressed numerically, also beginning with 0.

For example, /dev/hda1 is "hd0,0" to GRUB. Similarly, /dev/hdb3 is "hd1,2".
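
The naming rule can be written down as a small shell function. This is a sketch under assumptions: grub_legacy_name is a hypothetical helper, it only understands names of the form /dev/hdXN or /dev/sdXN, and it takes for granted that the BIOS drive order matches the drive letters, which is what the examples above assume but which a real machine does not guarantee.

```shell
# Convert a Linux partition name such as /dev/hdb3 to GRUB's notation.
grub_legacy_name() {
    dev=${1#/dev/}                           # e.g. hdb3
    case $dev in
        [hs]d[a-z][1-9]*) ;;
        *) echo "unsupported device name: $1" >&2; return 1 ;;
    esac
    letter=$(printf '%s' "$dev" | cut -c3)   # drive letter, e.g. b
    part=${dev#???}                          # partition number, e.g. 3
    code=$(printf '%d' "'$letter")           # character code of the letter
    disk=$(( code - 97 ))                    # a -> 0, b -> 1, ...
    idx=$(( part - 1 ))                      # partitions count from 0
    echo "(hd${disk},${idx})"
}

grub_legacy_name /dev/hda1
grub_legacy_name /dev/hdb3
```

Running grub's own find command, as shown earlier, remains the reliable way to confirm the result on a given machine.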

Note: The "root" line must point to the location of your /boot/ partition if you have one. If you do not have one, point it at your / partition.

sudo /sbin/grub
grub> root (hd0,2)
grub> setup (hd0)
grub> quit

Configuring the GRUB Menu

Note: This step does not need to be done if you're just trying to recover your MBR. Installing Windows will not alter the contents of your existing menu.lst, so if everything was working right before, everything will continue to work right now, and you can restart your computer.

Open the GRUB menu file, /boot/grub/menu.lst, with your favourite text editor. An example follows.

sudo nano /boot/grub/menu.lst

Note: Your menu.lst file is used to control the operating systems GRUB displays on startup, as well as its visual appearance. This howto will only explain how to get your operating systems to boot; it will not tell you how to make your bootloader pretty.

A sample menu.lst, stripped of unnecessary comments, appears below. It is based on the /dev/hda3 and /dev/hda4 example above, and assumes Windows resides at /dev/hda1.

timeout 5 #The number of seconds GRUB should wait before booting an OS
default 0 #The entry which should be booted by default
fallback 1 #The entry which should be booted in the event of the first one failing

title Ubuntu, 2.6.10
#A 32-bit Ubuntu entry
#This (or something like it) should be in your configuration
root (hd0,2)
#GRUB must load the kernel before the initrd
kernel /vmlinuz-2.6.10-5-386 root=/dev/hda4
initrd /initrd.img-2.6.10-5-386

title Ubuntu, 2.6.10
#Another 32-bit Ubuntu entry
#This is an example of an Ubuntu entry which does not have a separate /boot/ partition
#(it is provided only as an alternate to the example above -- do not use them together)
root (hd0,2)
#root=/dev/hda3 assumes / is the partition GRUB calls (hd0,2)
kernel /boot/vmlinuz-2.6.10-5-386 root=/dev/hda3
initrd /boot/initrd.img-2.6.10-5-386

title Microsoft Windows XP Home
#An entry for a Windows installation
#If you're reading this guide, you probably want this
root (hd0,0)
makeactive
chainloader +1

And that's it. Save and close the file, then reboot and try out the entries.
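Because default and fallback refer to entries by position, counting from 0, it can help to list the titles with their numbers before you reboot. The following is a minimal sketch; list_grub_entries is a hypothetical helper name, and it assumes each entry starts with a title line as in the sample above.

```shell
# Print each menu entry with the number that "default" and "fallback" use for it.
# list_grub_entries is a hypothetical helper; it reads /boot/grub/menu.lst unless
# another file is given as the first argument.
list_grub_entries() {
    awk '/^[ \t]*title[ \t]/ {
        sub(/^[ \t]*title[ \t]+/, "")    # keep only the title text
        printf "%d: %s\n", n++, $0       # entries are numbered from 0
    }' "${1:-/boot/grub/menu.lst}"
}

# Only run against the real file if it exists and is readable.
if [ -r /boot/grub/menu.lst ]; then
    list_grub_entries /boot/grub/menu.lst
fi
```

With only the first Ubuntu entry and the Windows entry from the sample in the file, entry 0 is Ubuntu and entry 1 is Windows, which matches default 0 and fallback 1.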

Using the Ubuntu Alternate/Install CD

This section explains how to rescue GRUB (the GRand Unified Bootloader) using the Ubuntu alternate/install CD-ROM.

  1. Enter your computer's BIOS and check that the computer can boot from CD-ROM. If it can, insert the CD-ROM into the drive, then exit the BIOS (saving your settings if needed, so that the computer boots from the CD-ROM).
  2. When the Ubuntu splash screen comes up with the boot: prompt, type rescue and press Enter.
  3. Choose your language, location (country) and keyboard layout as if you were doing a fresh install.
  4. Enter a host name, or leave the default (Ubuntu).
  5. You are now presented with a screen where you select your root partition from a list of the partitions on your hard drive, so you need to know which partition number Ubuntu is on. This will be /dev/discs/disc0/partX, where X is the partition number.
  6. You are then presented with a command prompt (a hash, #).
  7. Type grub-install /dev/hdaX, where X is the number of your Ubuntu root partition.

The GUI Way: Using the Alternate/Install CD and Overwriting the Windows bootloader

  1. Boot your computer with the Ubuntu CD
  2. Go through the installation process until you reach "[!!!] Disk Partition"
  3. Select Manual Partition
  4. Mount your appropriate Linux partitions:
    • /
    • /boot
    • swap
    • ...
  5. DO NOT FORMAT THEM.

  6. Finish the manual partition
  7. Say "Yes" when it asks you to save the changes
  8. After that, it will give you errors saying that "the system couldn't install ....."
  9. Ignore them and keep selecting "Continue" until you get back to the Ubuntu installation menu
  10. Jump to "Install Grub ...."
  11. Once it is finished, just restart your computer

From: http://doc.gwos.org/index.php/Restore_Grub and http://ubuntuforums.org/showthread.php?t=76652

GRUB Resources