O’Reilly news

O'Reilly Releases Official User's Guide to the awk Programming Language

June 5, 2001

Sebastopol, CA--Using the awk language to program is a little like using Notepad rather than a full-fledged word processor: unencumbered by features and frills you don't need, you can focus quickly and efficiently on the task at hand. Sometimes a less complex tool is all you need. Yet the awk programming language can be surprisingly powerful, too. In Effective awk Programming (O'Reilly, US $39.95), author Arnold Robbins explains how to perform sophisticated text processing and report generation using awk's powerful regular expression matching and text substitution facilities, associative arrays, and user-defined functions.

"It's important to program effectively with whatever language or tools you happen to be using," Robbins explains. "Tools should help you get a job done, using them isn't an end in and of itself. With that in mind, for a certain class of problems, awk's pattern-action programming paradigm is very expressive and elegant. Often awk solutions are adequate, without the need to turn an awk prototype into a 'production' version in C or C++."

The awk language differs from other programming languages in that awk is data-driven rather than procedural: you describe the data you want to work with and what to do when you find it. Programs written in awk are usually much smaller than they would be in other languages; a typical awk program amounts to 100 lines of code or fewer. A programmer might quickly compose an awk program at his or her terminal, use it once, and throw it away. As Robbins explains in his book, awk programs are interpreted, allowing programmers to avoid the (usually lengthy) compilation part of the typical edit-compile-test-debug cycle of software development.
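As a rough illustration of that data-driven, pattern-action style (a minimal sketch, not an example taken from the book), the short program below sums hours per person using an associative array. The helper name hours_report, the sample data, and the threshold of 5 are all made up for this illustration.

```shell
# hours_report is a hypothetical helper name, not something from the book.
hours_report() {
    awk '
        # Pattern-action rule: the action runs only for records
        # whose second field is greater than 5.
        $2 > 5 { total[$1] += $2 }   # total is an associative array keyed by name

        # END runs once, after all input has been read.
        END { for (name in total) print name, total[name] }
    ' | sort   # for-in order is unspecified in awk, so sort for stable output
}

# Made-up sample data: a name and an hour count on each line.
printf 'alice 12\nbob 7\nalice 3\nalice 6\n' | hours_report
# prints:
# alice 18
# bob 7
```

The whole program is two rules. There is no explicit loop over the input, because awk reads each record and applies every matching rule on its own.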

There are many variants of awk, including gawk, which is the GNU version that currently ships with every GNU/Linux distribution. In addition to providing in-depth coverage of the POSIX awk language, Effective awk Programming also serves as the "official documentation" for gawk. Robbins, who was one of the lead developers of gawk, currently maintains the gawk language and its documentation.

"The release of this book coincides with the release of GNU awk 3.1, the first major release of gawk in about five years!" says Robbins. "There are lots and lots of new features in this release, as well as several bug fixes over the last minor release. The most important new features have to do with networking, profiling awk programs, and internationalizing awk programs. There are other, smaller, new features as well."

In his book, Robbins clearly distinguishes standard awk features from gawk-specific features, points out the "dark corners" of the language (areas to watch out for when programming), and devotes two entire chapters to example programs. The book also covers:

  • Internationalization of gawk
  • Interfacing to i18n at the awk level
  • Two-way pipes
  • TCP/IP networking via the two-way pipe interface
  • The new PROCINFO array, which provides information about running gawk
  • Profiling and printing awk programs
  • Dynamically adding built-in functions at run time

As the official gawk user's guide, this book will also be available electronically, and can be freely copied and distributed under the terms of the Free Software Foundation's Free Documentation License (FDL). A portion of the proceeds from sales of this book will go to the Free Software Foundation to support further development of free and open source software.

Arnold Robbins is a professional programmer and technical author. He has been working with Unix systems since 1980 and with gawk since 1988. As a member of the POSIX 1003.2 balloting group, he helped shape the POSIX standard for awk. In addition to this book, Arnold is the author of Unix in a Nutshell, Third Edition and the sed & awk Pocket Reference. He is the coauthor of sed & awk, Second Edition and Learning the vi Editor, 6th Edition.

Online Resources:

Effective awk Programming, 3rd Edition
By Arnold Robbins
ISBN 0-596-00070-7, 421 pages, $39.95 (US)

About O’Reilly

O’Reilly, the premier learning platform for technology professionals, offers the industry’s most extensive catalog of high-quality technical and professional skills development courses. From AI, programming, and cloud technologies to essential business skills such as leadership training and critical thinking, O’Reilly delivers highly trusted content from its network of renowned experts that meets a diverse array of learning needs, with over 5,000 role-based on-demand courses, nearly 200 live events each month, access to interactive sandboxes and labs, and more. For more information, visit www.oreilly.com.
