Sep 5

This expression demonstrates one of the strengths of PCREs: they provide ways to modify the captured strings as the back references place them into the replacement expression. Unfortunately, this recipe isn't available using POSIX regular expressions. This recipe will turn the following:

proper noun

into this:

Proper Noun


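#!/usr/bin/perl
use strict;
use warnings;

# open the file named on the command line for reading
open( my $fh, '<', $ARGV[0] ) || die "Cannot open file: $!";

while ( my $line = <$fh> )
{
    # capitalize the first letter of each lowercase word
    $line =~ s/\b([a-z])(\w+)\b/\u$1$2/g;
    # print the filtered line
    print $line;
}

close( $fh );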

Regular Expression Explanation:

This expression needs to capture the first character of a word in one group and then capture the rest of the word in another group, so the second group can be left unaltered in the replacement expression. You can break down the search expression like this:

\b

is a word boundary, followed by

[

a character class containing

a-z

the letters a through z

]

followed by

\w

a word character

+

one or more times, followed by

\b

a word boundary.

In this breakdown, I didn’t include the parentheses, which are used to capture the pieces of text that match the expressions inside them. The first group, ([a-z]), wraps the character class and grabs the first letter of a word if it is a lowercase letter from a to z. The second group, (\w+), grabs the rest of the word up to its boundary.
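
To see what the two groups hold, here is a minimal sketch that runs the search expression against the sample text and prints the captures. The helper name split_word is my own, made up for illustration, and is not part of the recipe.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# split_word is a hypothetical helper (not from the recipe above).
# It returns the two captured groups for the first matching word,
# or an empty list if no word matches.
sub split_word {
    my ($text) = @_;
    if ( $text =~ /\b([a-z])(\w+)\b/ ) {
        return ( $1, $2 );
    }
    return ();
}

my ( $first, $rest ) = split_word('proper noun');
print "group 1: $first\n";    # group 1: p
print "group 2: $rest\n";     # group 2: roper
```

Only the first word is reported here because the match runs without the g modifier. A word that already starts with an uppercase letter, such as Noun, doesn't match at all, since [a-z] accepts lowercase letters only.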

The replacement expression breaks down into the following:

\u

changes the next character to uppercase, in this case the single letter held in

$1

whatever was found in the first group, followed by

$2

whatever was found in the second group.
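
As a quick check of the replacement expression, the sketch below applies the same substitution to a string instead of a file. The helper name capitalize_words is an assumption of mine for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# capitalize_words is a hypothetical helper wrapping the recipe's substitution.
# It works on a copy so the caller's string is left untouched.
sub capitalize_words {
    my ($text) = @_;
    my $copy = $text;
    $copy =~ s/\b([a-z])(\w+)\b/\u$1$2/g;
    return $copy;
}

print capitalize_words('proper noun'), "\n";      # Proper Noun
print capitalize_words('a proper noun'), "\n";    # a Proper Noun
```

The second call shows a side effect of the search expression: because (\w+) needs at least one character after the first letter, a one-letter word such as "a" never matches and stays lowercase.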


