DW Search/Replace Regular Expressions


Linda Baldwin
05-16-2006, 05:55 AM
I'm trying to use search and replace to remove a column from a table. This table contains 70 or 80 of these:
<tr>
<td>1 </td>
<td>Schromburgkea undulata </td>
<td>semi alba</td>

<td class="pq">250</td>

</tr>

The text is different in each one of them, and what I want to do is remove the third column altogether. I could simply go through and delete the third cell, in this case <td>semi alba</td>, but if I knew how to use regular expressions in the search and replace it sure would be easier. According to the Help, \w is a wildcard for alpha-numeric text. How do I use that? I would search for the above and replace with the above minus the third <td></td>

Thanks,
Linda

iamback
05-16-2006, 06:36 AM
I don't have DW but do have HomeSite, which supports POSIX-style REs in search and replace; the \w suggests DW is maybe using Perl-style REs? Does it have a shortcut for a newline? Let me know and I can have a stab at it (I love regular expressions!).

Linda Baldwin
05-16-2006, 06:44 AM
Marjolein, I don't even know what you mean. <g> Here's the list

Regular expressions

Regular expressions are patterns that describe character combinations in text. Use them in your code searches to help describe concepts such as "lines that begin with ‘var’" and "attribute values that contain a number." For more information on searching, see Searching and replacing tags and attributes.

The following table lists the special characters in regular expressions, their meanings, and usage examples. To search for text containing one of the special characters in the table, "escape" the special character by preceding it with a backslash. For example, to search for the actual asterisk in the phrase some conditions apply*, your search pattern might look like this: apply\*. If you don’t escape the asterisk, you’ll find all the occurrences of "apply" (as well as any of "appl", "applyy", and "applyyy"), not just the ones followed by an asterisk.
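The escaping rule in the Help text above can be checked outside Dreamweaver. This is a minimal sketch in Python, whose `re` module handles `*` and `\*` the same way for this case; the sample strings are made up for illustration, and Dreamweaver's own engine is not being run here.

```python
import re

# Hypothetical sample strings, chosen to mirror the Help text's wording.
samples = ["some conditions apply*", "appl", "apply", "applyy"]

# Escaped asterisk: only the literal text "apply*" matches.
escaped = [s for s in samples if re.search(r"apply\*", s)]

# Unescaped asterisk: "y*" means zero or more "y", so "appl" alone matches.
unescaped = [s for s in samples if re.search(r"apply*", s)]

print(escaped)
print(unescaped)
```

With the backslash, only the first sample is found; without it, every sample is.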

Character | Matches | Example
^ | Beginning of input or line. | ^T matches "T" in "This good earth" but not in "Uncle Tom’s Cabin"
$ | End of input or line. | h$ matches "h" in "teach" but not in "teacher"
* | The preceding character 0 or more times. | um* matches "um" in "rum", "umm" in "yummy", and "u" in "huge"
+ | The preceding character 1 or more times. | um+ matches "um" in "rum" and "umm" in "yummy" but nothing in "huge"
? | The preceding character at most once (that is, indicates that the preceding character is optional). | st?on matches "son" in "Johnson" and "ston" in "Johnston" but nothing in "Appleton" or "tension"
. | Any single character except newline. | .an matches "ran" and "can" in the phrase "bran muffins can be tasty"
x|y | Either x or y. | FF0000|0000FF matches "FF0000" in bgcolor="#FF0000" and "0000FF" in font color="#0000FF"
{n} | Exactly n occurrences of the preceding character. | o{2} matches "oo" in "loom" and the first two o’s in "mooooo" but nothing in "money"
{n,m} | At least n, and at most m, occurrences of the preceding character. | F{2,4} matches "FF" in "#FF0000" and the first four F’s in #FFFFFF
[abc] | Any one of the characters enclosed in the brackets. Specify a range of characters with a hyphen (for example, [a-f] is equivalent to [abcdef]). | [e-g] matches "e" in "bed", "f" in "folly", and "g" in "guard"
[^abc] | Any character not enclosed in the brackets. Specify a range of characters with a hyphen (for example, [^a-f] is equivalent to [^abcdef]). | [^aeiou] initially matches "r" in "orange", "b" in "book", and "k" in "eek!"
\b | A word boundary (such as a space or carriage return). | \bb matches "b" in "book" but nothing in "goober" or "snob"
\B | Anything other than a word boundary. | \Bb matches "b" in "goober" but nothing in "book"
\d | Any digit character. Equivalent to [0-9]. | \d matches "3" in "C3PO" and "2" in "apartment 2G"
\D | Any nondigit character. Equivalent to [^0-9]. | \D matches "S" in "900S" and "Q" in "Q45"
\f | Form feed. |
\n | Line feed. |
\r | Carriage return. |
\s | Any single white-space character, including space, tab, form feed, or line feed. | \sbook matches " book" (including the leading space) in "blue book" but nothing in "notebook"
\S | Any single non-white-space character. | \Sbook matches "ebook" in "notebook" but nothing in "blue book"
\t | A tab. |
\w | Any alphanumeric character, including underscore. Equivalent to [A-Za-z0-9_]. | b\w* matches "barking" in "the barking dog" and both "big" and "black" in "the big black dog"
\W | Any non-alphanumeric character. Equivalent to [^A-Za-z0-9_]. | \W matches "&" in "Jake&Mattie" and "%" in "100%"
Control+Enter or Shift+Enter (Windows), or Control+Return or Shift+Return or Command+Return (Macintosh) | Return character. Make sure that you deselect the Ignore Whitespace Differences option when searching for this, if not using regular expressions. Note that this matches a particular character, not the general notion of a line break; for instance, it doesn’t match a <br> tag or a <p> tag. Return characters appear as spaces in Design view, not as line breaks. |


Use parentheses to set off groupings within the regular expression to be referred to later. Then use $1, $2, $3, and so on in the Replace With field to refer to the first, second, third, and later parenthetical groupings.

NOTE: In the Search For text box, to refer to a parenthetical grouping earlier in the regular expression, use \1, \2, \3, and so on instead of $1, $2, $3.

For example, searching for (\d+)\/(\d+)\/(\d+) and replacing it with $2/$1/$3 swaps the day and month in a date separated by slashes, thereby converting between American-style dates and European-style dates.
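The date swap described above can be tried in any regex engine with capture groups. The sketch below uses Python's `re` module as a stand-in, so it is an approximation of what Dreamweaver does rather than Dreamweaver itself: in Python the replacement string uses \1, \2, \3 where the Replace With field uses $1, $2, $3, and the function name `swap_day_month` is made up for this example.

```python
import re

def swap_day_month(text: str) -> str:
    # Three runs of digits separated by slashes, each captured in its own group.
    # \2/\1/\3 writes the groups back with the first two exchanged.
    return re.sub(r"(\d+)/(\d+)/(\d+)", r"\2/\1/\3", text)

print(swap_day_month("Posted on 05/16/2006"))
```

Text without a slash-separated date passes through unchanged, because `re.sub` only rewrites what the pattern matches.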

donmcc
05-16-2006, 07:22 AM
The problem is that most S&R routines can search on alphanumeric text, but they cannot replace it. So unless every cell of your table is identical, and I know it isn't, you really can't do much with it.

Is it possible to select the column in browser view and delete it that way? (Sorry, I don't know DW well enough to know.)

Linda Baldwin
05-16-2006, 07:38 AM
Nope, that won't work. It just deletes the contents of the cells AFAICT.

Thanks,
Linda

iamback
05-16-2006, 08:14 AM
Marjolein, I don't even know what you mean. <g> Here's the list (...)

OK, let's try this. First make a backup, in case it doesn't work first time.

Now, search for:
(<tr>\r\n(<td>[^<]*</td>\r\n){2})<td>[^<]*</td>\r\n

Replace with:
$1

Update: I just realized that the search expression isn't going to work if the actual code uses indentation. Linda: when posting code, wrap it in the BBcode code tags so any layout will be preserved.
You might try this variant:
(<tr>(\r?\s*<td>[^<]*</td>){2})\r?\s*<td>[^<]*</td>
That should also take care of differences between Windows/DOS and Unix line endings.
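For anyone who wants to try both search expressions before running them in DW, here is a minimal sketch in Python. It is only an approximation of DW's behaviour: Python's `re` module stands in for DW's engine, \1 in the replacement plays the role of $1, and the names `STRICT`, `LENIENT` and `drop_third_cell` are made up for this example. The sample rows are Linda's row without the blank lines, once with Windows line endings and once indented with Unix line endings.

```python
import re

# First expression from the post: expects CRLF line endings and no indentation.
STRICT = re.compile(r"(<tr>\r\n(<td>[^<]*</td>\r\n){2})<td>[^<]*</td>\r\n")
# Variant from the update: tolerates indentation and either line-ending style.
LENIENT = re.compile(r"(<tr>(\r?\s*<td>[^<]*</td>){2})\r?\s*<td>[^<]*</td>")

def drop_third_cell(html: str, pattern: re.Pattern) -> str:
    # Group 1 holds <tr> plus the first two cells; the third cell is matched
    # outside the group, so writing back only group 1 removes it.
    return pattern.sub(r"\1", html)

crlf_row = (
    "<tr>\r\n"
    "<td>1 </td>\r\n"
    "<td>Schromburgkea undulata </td>\r\n"
    "<td>semi alba</td>\r\n"
    '<td class="pq">250</td>\r\n'
    "</tr>\r\n"
)

indented_row = (
    "<tr>\n"
    "  <td>1 </td>\n"
    "  <td>Schromburgkea undulata </td>\n"
    "  <td>semi alba</td>\n"
    '  <td class="pq">250</td>\n'
    "</tr>\n"
)

print(drop_third_cell(crlf_row, STRICT))
print(drop_third_cell(indented_row, LENIENT))
```

On the indented row the first expression finds nothing and leaves the text alone, which is the limitation the update describes; the variant handles both rows.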

Daudio
05-16-2006, 10:25 AM
Linda,

Nope, that won't work. It just deletes the contents of the cells AFAICT.

I think you may be able to do it, perhaps by using another command from the menubar.

I can remove a row or column from a table in FrontPage, but I have to use a menu command. If I just hit the delete key, it only empties the cells. Haven't tried it in Dreamweaver though...

I would pursue a DW design view solution before spending a whole lot of time on the RE track (not that that isn't a valuable exercise), which very likely won't do what you want.

dacoyle
05-16-2006, 11:15 AM
Nope, that won't work. It just deletes the contents of the cells AFAICT.

Thanks,
Linda

Linda, are you deleting the entire column? If so, you don't need REs. Yes, if you select the column and press the Delete key you only clear the contents. But if you right-click while in the column, you can select Table > Delete Column.

iamback
05-16-2006, 12:06 PM
Linda, are you deleting the entire column? If so, you don't need REs.

Maybe not if it's a single document. But an RE search and replace will work across any number of documents - and once I knew what dialect DW uses, it took me less than a minute to type those REs in.

REs don't cost a lot of time: they save a lot of time!

gary
05-16-2006, 02:12 PM
As Dennis suggested: instead of using an RE, why not use a design window -- should be far simpler...

The problem with an RE is that you need a pretty specific pattern to match and if you have the columns distributed on individual lines then it may be difficult to come up with a suitable RE. There is no simple way in RE to say "the third <td>...</td> after a <tr>".

I would be inclined as a first pass to make an RE that added a class="delete" to the candidate columns so that I could do a visual verification before the actual delete.
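A rough sketch of that two-pass idea, written in Python rather than DW's search dialog. It assumes the cells look like the sample Linda posted (plain <td> tags with no attributes in the first three columns); the pattern is adapted from the variant posted earlier in the thread, and the names `MARK_THIRD`, `mark_third_cell` and `remove_marked_cells` are invented for illustration.

```python
import re

# <tr> plus the first two plain cells are captured; the third cell's opening
# tag is matched separately so that it alone can be rewritten.
MARK_THIRD = re.compile(r"(<tr>(?:\s*<td>[^<]*</td>){2}\s*)<td>")

def mark_third_cell(html: str) -> str:
    # Pass 1: tag the candidate cells so they can be inspected (for example
    # with a temporary CSS rule) before anything is deleted.
    return MARK_THIRD.sub(r'\1<td class="delete">', html)

def remove_marked_cells(html: str) -> str:
    # Pass 2: once the marked cells have been checked, delete them together
    # with the whitespace that precedes them.
    return re.sub(r'\s*<td class="delete">[^<]*</td>', "", html)

row = (
    "<tr>\n"
    "<td>1 </td>\n"
    "<td>Schromburgkea undulata </td>\n"
    "<td>semi alba</td>\n"
    '<td class="pq">250</td>\n'
    "</tr>\n"
)

marked = mark_third_cell(row)
print(marked)
print(remove_marked_cells(marked))
```

The first pass changes nothing but the third cell's opening tag, so it is easy to undo if the wrong cells turn out to be marked.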

Daudio
05-16-2006, 02:58 PM
M,

Now, search for:
(<tr>\r\n(<td>[^<]*</td>\r\n){2})<td>[^<]*</td>\r\n

Perhaps you could explain to us how that expression does what it does?

iamback
05-16-2006, 03:21 PM
Perhaps you could explain to us how that Expression does what it does ?

I want to hear from Linda first whether it actually does what it's intended to do :) Then I'll explain!

iamback
05-16-2006, 03:25 PM
As Dennis suggested: instead of using an RE, why not use a design window -- should be far simpler...

Only if you have a single page to change; REs come into their own when you need to make the same change(s) on multiple pages.

The problem with an RE is that you need a pretty specific pattern to match and if you have the columns distributed on individual lines then it may be difficult to come up with a suitable RE. There is no simple way in RE to say "the third <td>...</td> after a <tr>".

My RE is pretty simple and that's exactly what it does say!

Now let's hear from Linda whether it does what I think it will do - I have no way of checking the exact implementation of the dialect DW is using (though from the description it looks pretty close to Perl REs); there are always minor differences...

dacoyle
05-16-2006, 06:09 PM
Maybe not if it's a single document. But an RE search and replace will work across any number of documents - and once I knew what dialect DW uses, it took me less than a minute to type those REs in.

REs don't cost a lot of time: they save a lot of time!

Marjolein,

I agree entirely; we maintain a spreadsheet of handy REs at work that I use daily. You wrote a couple of them. :)

My point was that if it is a single file and a single column, an RE isn't necessary, as DW has a function to delete columns. (That function also works for complex tables with colspan.)

One of my favorite RE search strings, useful to update an older site to current standards:

<[/]?font[^>]*>

Replaced with nothing, it removes all font tags, including the closing </font>.
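Here is a small sketch of that search string at work, using Python's `re` module in place of DW's search dialog. The sample markup and the name `strip_font_tags` are made up for illustration, and the pattern is case-sensitive as written, so this assumes the tags are lowercase.

```python
import re

# The search string from the post, unchanged: an optional slash, the word
# "font", then anything up to the closing angle bracket.
FONT_TAG = re.compile(r"<[/]?font[^>]*>")

def strip_font_tags(html: str) -> str:
    # Replacing each match with an empty string removes both the opening
    # and the closing tags while leaving the text between them in place.
    return FONT_TAG.sub("", html)

print(strip_font_tags('<p><font color="#FF0000" size="2">Sale</font> ends soon</p>'))
```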

When I gave that to my work's webmaster and told him it would work across subfolders, I almost expected a hug.

donmcc
05-17-2006, 04:03 AM
I had an idea on this at 12:26 last night. My apologies for waiting so long to post it. :)

I would take the table and cut and paste it to a single page. Then save the HTML, and open it in MS Word. In Word you can easily delete a column. Then save the page back into HTML and open it in DW. I think DW has some tools to clean up the horrible HTML that Word creates.

Should be a snap, if you can't make replace work correctly.

gary
05-17-2006, 07:14 AM
My RE is pretty simple...

Perhaps for anyone who uses them on a regular basis (cf. "what does it do"); many users never use grouping and substitution.

and that's exactly what it does say!

...assuming that ALL the cells are identically formatted (probably so), that there aren't any trailing spaces on lines (which can be accommodated), and that ONLY the desired cells match the format.

iamback
05-17-2006, 07:40 AM
Perhaps for anyone who uses them on a regular basis (cf. "what does it do"); many users never use grouping and substitution.
...assuming that ALL the cells are identically formatted (probably so), that there aren't any trailing spaces on lines (which can be accommodated), and that ONLY the desired cells match the format.

Grouping and substitution are exactly what is called for here (and I cannot even live with REs without grouping!). Pretty easy, too - there are things in REs much more complicated and harder to grasp than that.

As to assumptions: sure - but I wrote the RE for the format that Linda published. I could easily generalize it; it'd just be a bit larger, but not by much.

iamback
05-17-2006, 07:41 AM
Should be a snap, if you can't make replace work correctly.

But we can, I promise!