DTP


 
Lively discussions on the graphic arts and publishing — in print or on the web


08-05-2009, 01:07 AM   #1
Mike
Staff
 
Join Date: Oct 2004
Location: Llanwrtyd Wells
Posts: 1,450
Tricky search and replace

Quote:
Originally Posted by Benwiggy
That's where grep searching works wonders.
Any idea as to how I'd go about using grep to replace double quotes with single while ensuring that quotes within quotes are double without also substituting apostrophes in possessives or abbreviations?

He said, "I was told, 'The Jones' songs of the '60s can't be compared to today's 'cos they're older,' by a real aficionado."

which needs to become:

He said, 'I was told, "The Jones' songs of the '60s can't be compared to today's 'cos they're older," by a real aficionado.'

   
__________________
Mike

www.welshframing.com
08-05-2009, 01:11 AM   #2
Richard Waller
Member
 
Join Date: Aug 2005
Location: Goring-by-Sea, West Sussex UK
Posts: 732

Use italics perhaps?

   
__________________
Richard Waller
www.waller.co.uk
www.goring-by-sea.uk.com
08-05-2009, 01:19 AM   #3
annc
Sysop
 
Join Date: Oct 2004
Location: Subtropical Queensland, Australia, between the mountains and the Coral Sea
Posts: 4,434

It's pretty difficult and I don't have a solution. In most books, it's just a tedious task to go through page by page after the initial S&R. I'm no GREP wizard, so in my limited experience, there will be problems with setting the maximum number of characters, multiple runs through the text etc. If you've got a copy editor available, that's probably the best bet.

We probably need Brad Walrod for this, because he's had experience the other way, running scripts to change English quotes etc. to American.

   
08-05-2009, 07:21 AM   #4
ktinkel
Founding Sysop
 
Join Date: Oct 2004
Location: In Connecticut, on the Housatonic River near its mouth at Long Island Sound.
Posts: 11,189

Quote:
Originally Posted by Mike
Any idea as to how I'd go about using grep to replace double quotes with single while ensuring that quotes within quotes are double without also substituting apostrophes in possessives or abbreviations?

He said, "I was told, 'The Jones' songs of the '60s can't be compared to today's 'cos they're older,' by a real aficionado."

which needs to become:

He said, 'I was told, "The Jones' songs of the '60s can't be compared to today's 'cos they're older," by a real aficionado.'
I would start by replacing single quotes with different characters (## and ###, say). That leaves the opening apostrophe in the contracted date in limbo; depending on how much of that sort of thing I expect, I might just let it go, then fix it later.

Then replace all the opening double quotes with opening singles; then do the closing sets.

Finally, search on the ## and ### you set up before and convert those to opening and closing doubles respectively (the inner quotes are the ones that end up double).

Then root out the opening apostrophes and fix those yourself. (Or maybe someone knows a better way.)
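
A minimal Python sketch of this placeholder swap, for anyone who wants to try it outside a layout program. It assumes the text already uses curly quotes and contains no apostrophes (an apostrophe is the same character as a closing single quote, so it would get swapped too). The function name is made up for illustration; ## and ### are the placeholders suggested above, and the parked inner quotes come back as doubles to match Mike's target.

```python
def swap_quote_levels(text: str) -> str:
    """Swap outer double quotes and inner single quotes (curly quotes only)."""
    # 1. Park the inner single quotes on placeholders so step 2 cannot touch them.
    text = text.replace("\u2018", "##").replace("\u2019", "###")
    # 2. Outer doubles become singles: opening first, then closing.
    text = text.replace("\u201c", "\u2018").replace("\u201d", "\u2019")
    # 3. Restore the parked quotes as doubles. The longer placeholder goes
    #    first, because "##" is a prefix of "###".
    text = text.replace("###", "\u201d").replace("##", "\u201c")
    return text


if __name__ == "__main__":
    print(swap_quote_levels("\u201cShe said, \u2018Go,\u2019 and left.\u201d"))
```

Without the placeholders, step 2 would produce single quotes that are indistinguishable from the original inner ones, which is why the parking step comes first.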

   
08-05-2009, 11:24 AM   #5
don Arnoldy
Curmudgeon
 
Join Date: Oct 2004
Posts: 491

Quote:
Originally Posted by ktinkel
I would start by replacing single quotes with different characters (## and ###, say)...
One could start by replacing any single quote sandwiched by alpha characters with a marker character—this would catch "can't," "today's," and "they're" in the example. Then, you could turn those back after completing the steps you enumerated. This would leave "Jones'," "'60s," and "'cos" in the example to be corrected manually. Distinguishing initial and terminal apostrophes from single quotes was (as I remember) one of the big stumbling blocks to the implementation of "smart quotes."
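
A rough Python sketch of that "sandwiched apostrophe" pass, assuming plain ASCII letters on both sides and either a straight or a curly apostrophe. The marker @@ and both function names are hypothetical; any marker that cannot occur in the copy would do.

```python
import re

APOSTROPHE_MARK = "@@"  # hypothetical marker; must not occur in the copy

# A straight or curly single quote with a letter on each side.
_SANDWICHED = re.compile("(?<=[A-Za-z])['\u2019](?=[A-Za-z])")


def protect_inner_apostrophes(text: str) -> str:
    """Replace apostrophes between two letters with the marker."""
    return _SANDWICHED.sub(APOSTROPHE_MARK, text)


def restore_apostrophes(text: str) -> str:
    """Turn the markers back into curly apostrophes once the quotes are done."""
    return text.replace(APOSTROPHE_MARK, "\u2019")


if __name__ == "__main__":
    sample = "The Jones' songs of the '60s can't be compared to today's"
    print(protect_inner_apostrophes(sample))
```

On Mike's sentence this protects can't, today's and they're, and leaves Jones', '60s and 'cos (plus the real quotation marks) for the later steps.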

--don

   
08-05-2009, 12:03 PM   #6
Howard Allen
Member
 
Join Date: Oct 2007
Location: Calgary, Alberta, Canada
Posts: 824

After thinking about it, and doing some experiments, I'd follow this workflow (assuming the quotes and apostrophes are all proper "curly quotes" to begin with):

1) Find & replace all left double quotes with a proxy character (say #)
2) Find & replace all right double quotes with a second proxy character (say ~)
3) Find & replace all left single quotes (these can only be left single quotes) with a third proxy character (say ^).
4) Find & replace right single quotes followed by a word character (these can only be apostrophes) with a fourth proxy character (say *). Use the GREP search string \’(\w) and the replace string \*\1
5) Find & replace right single quotes preceded by a non-word character (these could only be right quotation marks) with a fifth proxy character (say @). Use the GREP search string (\W)\’ and the replace string \1\@
6) This will leave only the rare ambiguous cases where single right quote characters differ from apostrophes only by context (such as your Jones' example). You will have to manually search-and-replace these one at a time, but there shouldn't be many of them to deal with.
7) Find and replace all your proxy characters with their correct replacements.
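
The seven steps above could be sketched in Python roughly as follows. This is only an illustration under the same assumption (all quotes and apostrophes are already curly), plus the assumption that none of the proxy characters #, ~, ^, * or @ occurs in the text. The function name is hypothetical, and Python's \w is Unicode-aware, so results may differ slightly from a layout program's GREP.

```python
import re

LDQ, RDQ, LSQ, RSQ = "\u201c", "\u201d", "\u2018", "\u2019"


def swap_curly_quotes(text):
    """Return (converted text, count of ambiguous right single quotes)."""
    text = text.replace(LDQ, "#")                 # 1) left double quotes
    text = text.replace(RDQ, "~")                 # 2) right double quotes
    text = text.replace(LSQ, "^")                 # 3) left single quotes
    text = re.sub(RSQ + r"(\w)", r"*\1", text)    # 4) apostrophe before a word character
    text = re.sub(r"(\W)" + RSQ, r"\1@", text)    # 5) closing quote after a non-word character
    # 6) Whatever is still a right single quote is ambiguous (Jones' and the
    #    like); count it so it can be checked by hand.
    ambiguous = text.count(RSQ)
    # 7) Swap the proxies for their final characters.
    for proxy, final in (("#", LSQ), ("~", RSQ), ("^", LDQ), ("*", RSQ), ("@", RDQ)):
        text = text.replace(proxy, final)
    return text, ambiguous
```

In this sketch the ambiguous marks are left as they are, which is right when they are apostrophes; any that are really closing quotes still need fixing by hand.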

   
__________________
Howard

OSX 10.10.5
08-05-2009, 12:15 PM   #7
ktinkel
Founding Sysop
 
Join Date: Oct 2004
Location: In Connecticut, on the Housatonic River near its mouth at Long Island Sound.
Posts: 11,189

Watch out which proxy characters you choose. The reason I use doubled characters (hash marks or others) is that those doubles are unlikely to occur in the text, whatever the subject matter or lists of references it includes.
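
One way to check a proxy before using it is simply to search the text for it first. A small Python sketch of that check; the function name and the default candidate list are made up for illustration.

```python
def unused_proxies(text, candidates=("##", "###", "@@", "~~")):
    """Return the candidate proxy strings that do not occur anywhere in the text."""
    return [c for c in candidates if c not in text]


if __name__ == "__main__":
    sample = "Issue #12 costs ~5 pounds"
    # "#" and "~" occur in the sample, so only the doubled forms are safe here.
    print(unused_proxies(sample, ("#", "~", "##", "~~")))
```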

   
08-05-2009, 03:13 PM   #8
Michael Rowley
Member
 
Join Date: Jan 2005
Location: Ipswich (the one in England)
Posts: 5,105

Howard:
Quote:
assuming the quotes and apostrophes are all proper "curly quotes" to begin with
But Mike says that he often gets manuscripts in which all the quoted matter is enclosed by pairs of ", and all apostrophes are indicated by '. He has the task of replacing the pairs of " by single quotation marks (thus: ‘—’) and seeing that the apostrophes are the right way round (thus: ’— and —‘—). If he's doing work on copy finished in USA, he'll have a simpler task (but not much simpler).

   
__________________
Michael
08-05-2009, 06:10 PM   #9
Howard Allen
Member
 
Join Date: Oct 2007
Location: Calgary, Alberta, Canada
Posts: 824

Quote:
Originally Posted by Michael Rowley
Howard:
But Mike says that he often gets manuscripts in which all the quoted matter is enclosed by pairs of ", and all apostrophes are indicated by '. He has the task of replacing the pairs of " by single quotation marks (thus: ‘—’) and seeing that the apostrophes are the right way round (thus: ’— and —‘—). If he's doing work on copy finished in USA, he'll have a simpler task (but not much simpler).
OK, you've thrown down the gauntlet.

Try this on for size (this time, assuming NO curly quotes or apostrophes in the original text):

1) Find double quotes followed by any word character (this catches all the opening double quotes), replace with proxy character 1. GREP find string would be \"(\w) replace string would be \proxy1\1 (where proxy1 is the character of your choice)

2) Find double quotes followed by any non-word character (this catches all the closing double quotes), replace with proxy2. Find string would be \"(\W) replace string would be \proxy2\1

3) Find single quotes bounded on both sides by any word character (this catches all the regular apostrophes), replace with proxy3. Find string would be (\w)\'(\w) replace string would be \1\proxy3\2

4) This is the step that deals with ambiguous single quotes/apostrophes, but narrows the possibilities: Find single quotes preceded by s (possessive plurals, such as Jones', kids', etc.). This will also catch closing single quotes in those instances where the last word of the quoted text ends with s, but these should be few enough that a manual search-and-replace would be tolerable. Replace with proxy4, and do the manual search-and-replace at this stage. Find string would be s\' replace string would be s\proxy4

5) Find single quotes preceded by a space, followed by any characters except a CR, followed by a single quote followed by a non-word character. This finds opening single quotes, replaces them with proxy5, and closing single quotes, replacing them with proxy6. (Logically, there should be no cases in which an opening single quote is separated from its closing single quote by a carriage return). Find string would be _\'(.+)\'(\W) where _ is a space character, NOT an underscore! Replace string would be _\proxy5\1\proxy6\2

6) This should leave only leading apostrophes (such as '60s, 'cos etc.) which at this point can be replaced with their curly counterparts.

7) Replace all your other proxies with their correct counterparts.
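
For anyone who wants to test the logic outside a layout program, here is a rough Python translation of the seven steps. It is a sketch only: the proxy strings <<1>> to <<6>> and the function name are made up, and step 2 uses a lookahead that also accepts the end of the text, which the find string "(\W) would miss because it needs a following character. Step 4 is applied blindly here, without the manual check.

```python
import re

# Hypothetical proxy strings, not from the workflow above; they must not
# occur anywhere in the copy.
P1, P2, P3, P4, P5, P6 = [f"<<{n}>>" for n in range(1, 7)]


def straight_to_curly_swapped(text):
    """Straight quotes in; curly quotes out, with outer single and inner double."""
    # 1) Double quote before a word character: opening double.
    text = re.sub(r'"(\w)', P1 + r"\1", text)
    # 2) Double quote before a non-word character (or end of text): closing double.
    text = re.sub(r'"(?=\W|$)', P2, text)
    # 3) Single quote with a word character on each side: ordinary apostrophe.
    text = re.sub(r"(\w)'(\w)", r"\1" + P3 + r"\2", text)
    # 4) Single quote after s: treated as a possessive apostrophe (check by hand).
    text = re.sub(r"s'", "s" + P4, text)
    # 5) Space, quote, anything on the same line, quote, non-word character:
    #    an opening and closing single-quote pair.
    text = re.sub(r" '(.+)'(\W)", " " + P5 + r"\1" + P6 + r"\2", text)
    # 6) What is left should be leading apostrophes ('60s, 'cos).
    text = text.replace("'", "\u2019")
    # 7) Proxies to final characters: old doubles become singles, old singles doubles.
    finals = {P1: "\u2018", P2: "\u2019", P3: "\u2019",
              P4: "\u2019", P5: "\u201c", P6: "\u201d"}
    for proxy, final in finals.items():
        text = text.replace(proxy, final)
    return text
```

Two limits worth knowing: a double quote immediately followed by an opening single quote is read as a closing double in step 2, and the greedy (.+) in step 5 pairs the first opening quote on a line with the last closing one, so a paragraph with two separate inner quotations needs checking by hand.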

I won't guarantee that this workflow will catch 100% of the permutations and combinations, but it worked on Mike's example. I wouldn't be surprised if there are some rare contingencies that I've missed. If your contributor used double quotes for ALL levels of quotation marks (singles and doubles), it obviously wouldn't work--this would be a major challenge even for a manual search-and-replace!

And like Kathleen says, be careful what you choose for proxies!

Quote:
and seeing that the apostrophes are the right way round (thus: ’— and —‘—).
Er...it seems to me that apostrophes can only be one way round, can't they?

Cheers,

   
08-06-2009, 02:36 AM   #10
Mike
Staff
 
Join Date: Oct 2004
Location: Llanwrtyd Wells
Posts: 1,450

That looks promising. Many thanks.

I have a new typescript to deal with next week so I'll experiment with that and InDesign's FindChangeByList script. Many typescripts these days arrive with smart quotes courtesy of MS Word but ‘60s (rather than ’60s) is common and frequently there are batches of straight quotes for no apparent reason.

   
All times are GMT -8.

Contents copyright 2004–2014 Desktop Publishing Forum and its members.