Text Processing in Python

This book digs into restructuring, reformatting, and extracting bits of textual information using Python. It is aimed at readers just past the introductory level.

Tag(s): Python

Publication date: 12 Jun 2003

ISBN-10: 0321112547

ISBN-13: 0076092017905

Paperback: 544 pages

Views: 31,002

Type: N/A

Publisher: Addison-Wesley Professional

License: n/a

Post time: 24 Oct 2005 06:13:58

This book is suggested by Mike Goins.

Terms and Conditions:
David Mertz wrote: This stuff is copyrighted by Addison Wesley (except the code samples, which are released to the public domain). Feel free to use this material personally, but no permission is given for further distribution beyond your personal use.

Book excerpts:

At the broadest level text processing is simply taking textual information and doing something with it. This doing might be restructuring or reformatting it, extracting smaller bits of information from it, algorithmically modifying the content of the information, or performing calculations that depend on the textual information.

Text processing is arguably what most programmers spend most of their time doing. Configuration files, log files, CSV and fixed-length data files, error files, documentation, and source code itself are all just sequences of words with bits of constraint and formatting applied.
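As a small illustration of several of the kinds of "doing" named above (restructuring, reformatting, extracting, and calculating), the following sketch processes a few made-up log lines. The log format, the sample data, and every function name here are assumptions chosen for illustration; none of them come from the book, and the code targets Python 3.

```python
import re

# Hypothetical sample data: this log format is an assumption for illustration.
LOG_LINES = [
    "2003-06-12 10:01:07 ERROR disk full on /dev/sda1",
    "2003-06-12 10:02:15 INFO backup finished in 42 s",
    "2003-06-12 10:05:33 ERROR timeout after 30 s",
]

# date, time, level, then the free-form message
LINE_RE = re.compile(r"^(\S+) (\S+) (\w+) (.*)$")


def parse(line):
    """Restructure: turn a flat line into a dict of named fields."""
    date, time, level, message = LINE_RE.match(line).groups()
    return {"date": date, "time": time, "level": level, "message": message}


def reformat(record):
    """Reformat: emit the same record as a comma-separated row."""
    return ",".join(
        [record["date"], record["time"], record["level"], record["message"]]
    )


def extract_errors(records):
    """Extract: keep only the messages of ERROR records."""
    return [r["message"] for r in records if r["level"] == "ERROR"]


def total_seconds(records):
    """Calculate: sum every '<number> s' duration found in the messages."""
    return sum(
        int(n) for r in records for n in re.findall(r"(\d+) s\b", r["message"])
    )


records = [parse(line) for line in LOG_LINES]
print(reformat(records[0]))     # 2003-06-12,10:01:07,ERROR,disk full on /dev/sda1
print(extract_errors(records))  # ['disk full on /dev/sda1', 'timeout after 30 s']
print(total_seconds(records))   # 72
```

Each function is a few lines long, which is typical of this kind of work: the effort goes into knowing the shape of the text, not into the code.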

In this book, Python was chosen in large part because it is such a clear, expressive, and general-purpose language. But for all Python's virtues, text editors and small utilities will always have an important place for developers getting the job done. As simple as Python is, it is still more complicated than you need to achieve many basic tasks. But once you get past the very simple, Python is a perfect language for making the difficult things possible (and it is also good at making the easy things simple).

What sets text processing most clearly apart from other tasks computer programmers accomplish is the frequency with which we perform it on an ad hoc or one-shot basis. You retrieve the text and write a script to process it. Then the next requirement arrives, and the script you reluctantly use a second time turns out to be quite similar to a more general task you will need to perform frequently, perhaps even automatically. You imagine that with only a slight amount of extra work you can generalize and expand the script, maybe add a little error checking and some runtime options while you are at it.

The goal of this book is to enable its readers to write such scripts.
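To make the idea of a generalized one-shot script concrete, here is a minimal sketch of a line counter that has grown "a little error checking and some runtime options". It is not taken from the book: the task, the option names, and the functions `count_matches` and `main` are hypothetical, and the code targets Python 3 using only the standard `argparse`, `re`, and `sys` modules.

```python
import argparse
import re
import sys


def count_matches(lines, pattern, ignore_case=False):
    """Return how many lines contain a match for the regular expression."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    return sum(1 for line in lines if regex.search(line))


def main(argv=None):
    """Command-line entry point; returns a process exit status."""
    parser = argparse.ArgumentParser(description="Count lines matching a pattern.")
    parser.add_argument("pattern", help="regular expression to search for")
    parser.add_argument("path", nargs="?", help="input file (default: standard input)")
    parser.add_argument("-i", "--ignore-case", action="store_true",
                        help="match without regard to letter case")
    args = parser.parse_args(argv)

    try:
        if args.path:
            with open(args.path, encoding="utf-8") as handle:
                total = count_matches(handle, args.pattern, args.ignore_case)
        else:
            total = count_matches(sys.stdin, args.pattern, args.ignore_case)
    except (OSError, re.error) as exc:
        # Error checking: unreadable file or invalid regular expression.
        print(f"error: {exc}", file=sys.stderr)
        return 1

    print(total)
    return 0


# The core function also works on any iterable of strings (made-up sample data).
sample = ["ERROR disk full", "info: ok", "Error: timeout"]
print(count_matches(sample, "error", ignore_case=True))  # 2
```

Keeping the processing in `count_matches` and the option handling in `main` is one possible design: the original one-shot logic stays reusable from other code, and a real script would finish by calling `sys.exit(main())`.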

Intended Audience:

This book is ideally suited for programmers who are a little bit familiar with Python, and whose daily tasks involve a fair amount of text processing chores. Programmers who have some background in other programming languages--especially with other scripting languages--should be able to pick up enough Python to get going by reading Appendix A.

While Python is a rather simple language at heart, this book is not intended as a tutorial on Python for nonprogrammers. Instead, this book is about two other things: getting the job done, pragmatically and efficiently; and understanding why what works works and what doesn't work doesn't work, theoretically and conceptually. As such, this book can be useful both to working programmers and to students of programming at a level just past the introductory.

Reviews:

Amazon.com
"... his exposition is well organized and wonderfully lucid. If you're the sort of person who likes books that have a chapter zero, you'll enjoy his style."

"It will be a most useful reference source for when I am doing various text related tasks for some time to come, and it was also a delightful and educational quick read in the here and now."

"As an illustration of how good this book is, I am now using regular expressions (selectively), and this was only possible with the help of this book!"
 




About The Author(s)


David Mertz

David is a Senior Software Engineer and Senior Trainer at Continuum Analytics, Inc., and a well-known author and speaker in the Python community. He wrote the long-running columns Charming Python and XML Matters for IBM developerWorks and the Addison-Wesley book Text Processing in Python, has spoken at OSCon and PyCon, and has keynoted at PyCon India.

