DTP


 
Lively discussions on the graphic arts and publishing — in print or on the web


Old 12-17-2012, 09:24 AM   #1
UK_Smithy
Member
 
Join Date: Aug 2009
Posts: 7
Filtering Text with Basic Formatting Only

I've been working with QuarkXPress and InDesign for many years, but have not yet found a way to filter text files from MS Word so that ONLY very basic formatting is imported. For instance, I want to import Word files WITHOUT font name, size and spacing information, but I want to RETAIN the formatting data for Roman, Italic, Bold and Bold Italic. I want to remove everything else, so in essence I want a plain text file that contains the <b>, <i>, <bi> and <p> flags but no other formatting data.

Does anyone know if this is possible, and if so what are the best tools?

Thanks.
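
For anyone who wants to try this outside the layout program, here is a minimal Python sketch of the kind of filter described above. It reads the main XML part of a .docx file and writes plain text that carries only `<p>`, `<b>`, `<i>` and `<bi>` markers. Several things here are assumptions rather than anything established in this thread: the closing tags (`</b>` and so on), the function names, and the decision to look only at bold and italic set directly on a run. Bold or italic that comes from a Word style is not detected, and runs inside text boxes may be repeated.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the main document part of a .docx file.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def _on(rpr, name):
    """Return True if the run properties switch the toggle `name` ('b' or 'i') on."""
    if rpr is None:
        return False
    el = rpr.find(W + name)
    if el is None:
        return False
    # <w:b/> means on; <w:b w:val="0"/> (or "false"/"off") means explicitly off.
    return el.get(W + "val", "true") not in ("0", "false", "off")


def tagged_text(document_xml):
    """Turn the XML of word/document.xml into text with <p>, <b>, <i>, <bi> markers."""
    root = ET.fromstring(document_xml)
    lines = []
    for p in root.iter(W + "p"):
        parts = []
        for r in p.iter(W + "r"):
            text = "".join(t.text or "" for t in r.iter(W + "t"))
            if not text:
                continue
            rpr = r.find(W + "rPr")
            bold, italic = _on(rpr, "b"), _on(rpr, "i")
            tag = "bi" if bold and italic else "b" if bold else "i" if italic else ""
            # Font name, size and spacing are simply never copied.
            # Literal "<" characters in the text are not escaped in this sketch.
            parts.append(f"<{tag}>{text}</{tag}>" if tag else text)
        lines.append("<p>" + "".join(parts))
    return "\n".join(lines)


def docx_to_tagged_text(path):
    """Read a .docx (a zip archive) and return its tagged plain text."""
    with zipfile.ZipFile(path) as z:
        return tagged_text(z.read("word/document.xml"))
```

Calling `docx_to_tagged_text("chapter1.docx")` (a hypothetical file name) would return one line per paragraph, each starting with `<p>`. The output would still need converting into whatever tag syntax the layout program's import filter expects.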
Old 12-17-2012, 10:54 AM   #2
Steve Rindsberg
Staff
 
Join Date: Nov 2004
Posts: 6,734

I'm probably missing something, but it seems you could open the file in Word, save under a new name, then Ctrl+A to select all, change everything to a single font name, size and spacing. That shouldn't affect bolding/italic/etc.

   
__________________
Steve Rindsberg
====================
www.pptfaq.com
www.pptools.com
and stuff
Old 12-17-2012, 11:42 AM   #3
UK_Smithy
Member
 
Join Date: Aug 2009
Posts: 7

The problem with that solution is that the Word file still contains font and size definitions. Even if it's all the same font/size, the information is still there. I want to be able to import the file into InDesign or Quark in the same way as I would a plain text file, but with just the roman, italic, bold etc. styles and nothing else.

I'm importing Word files for a number of academic books, and the fonts and sizes must be standardised using Style Sheets, but I don't want to have to go through manually redefining the italic and bold bits, of which there are thousands.
Old 12-17-2012, 12:42 PM   #4
UK_Smithy
Member
 
Join Date: Aug 2009
Posts: 7

Put another way, all I want to do is import a Word document into Quark or InDesign and then apply Style Sheets cleanly but without losing the four basic text attributes.
Old 12-17-2012, 01:26 PM   #5
terrie
Staff
 
Join Date: Oct 2004
Posts: 8,931

Quote:
uk smithy: Does anyone know if this is possible, and if so what are the best tools?

I don't use Word, so I'm not sure whether this is possible or, if it is, whether it will give you what you want, but...'-}}

Can you save the file as an RTF (rich text format) and try importing the RTF into Quark/ID?

Terrie
Old 12-17-2012, 01:32 PM   #6
UK_Smithy
Member
 
Join Date: Aug 2009
Posts: 7

Thanks, but no, that doesn't work. The RTF file contains just as much formatting data as the Word file. I'm currently experimenting with InDesign's ability to export as 'Tagged Text'. When I open the exported file in a text editor (TextWrangler in my case), it shows all of the formatting and style data. If I can delete everything except the tags for the aforementioned styling, I should be able to import it back, adopting the InDesign Style Sheets 'cleanly' but also retaining the basic styles. I'll let you folks know if I'm successful.
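
The "delete everything except a few tags" step could be automated rather than done by hand in a text editor. Below is a rough Python sketch of that idea. The tag names in `KEEP_TAGS` are an assumption about how InDesign Tagged Text marks the typeface (bold, italic and so on), so check them against an actual export before relying on this. The function name is made up for illustration. The first line of the file (the encoding header) is passed through untouched, and any tag nested inside another tag, such as a style definition, is dropped together with its parent.

```python
# Assumed names of the typeface tag (verbose and abbreviated forms).
# Verify these against a real Tagged Text export.
KEEP_TAGS = {"cTypeface", "ct"}


def strip_tagged_text(text, keep=KEEP_TAGS):
    """Drop every <...> tag whose name is not in `keep`; leave the text itself alone."""
    # The first line is the file header and is kept as it is.
    header, sep, body = text.partition("\n")
    out = []    # characters and tags that survive
    tag = []    # the tag currently being read
    depth = 0   # nesting level of angle brackets
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            # Backslash escapes (for example a literal \<) are copied as a pair.
            (tag if depth else out).append(body[i:i + 2])
            i += 2
            continue
        if ch == "<":
            depth += 1
            tag.append(ch)
        elif ch == ">" and depth:
            depth -= 1
            tag.append(ch)
            if depth == 0:
                whole = "".join(tag)
                # The tag name is everything between "<" and the first ":".
                name = whole[1:-1].split(":", 1)[0]
                if name in keep:
                    out.append(whole)
                tag = []
        elif depth:
            tag.append(ch)
        else:
            out.append(ch)
        i += 1
    out.extend(tag)  # keep anything left over from an unclosed tag
    return header + sep + "".join(out)
```

Read the exported file into a string, pass it through `strip_tagged_text`, and save the result under a new name before placing it back into the layout. Whether the program accepts the stripped file without its style definitions is something to test on a copy first.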
Old 12-17-2012, 02:00 PM   #7
Michael Beloved
Member
 
Join Date: Sep 2008
Location: Brooklyn NY
Posts: 141

I have found that the only way to remove the hidden Word format markings completely is to move the contents into Notepad and then copy it from there and paste it in. The problem with this method is that you lose the elementary styling which you had before.

I found this out when I first began converting .docx files from Word 2007 to HTML format in preparation for making Kindle files.

After trying many solutions, I came to the conclusion that you cannot remove the Word markings in total. To have a clean file you have to begin with a txt file and style it in the desired program from day one.

   
__________________
michael beloved
Old 12-17-2012, 02:28 PM   #8
terrie
Staff
 
Join Date: Oct 2004
Posts: 8,931

Quote:
uk smithy: I'll let you folks know if I'm successful.

Sorry the RTF idea was a no go. Do let us know how it goes...

One of the reasons I have always liked WordPerfect is its Reveal Codes option, which allows you to see the internal codes and do a lot of playing with them, although I don't know whether more current versions still have it. I'm still using WordPerfect 8...

Terrie
Old 12-17-2012, 02:54 PM   #9
Howard Allen
Member
 
Join Date: Oct 2007
Location: Calgary, Alberta, Canada
Posts: 824

I feel your pain. I do a palaeontological abstracts volume every year, as well as a quarterly newsletter, and the submissions (almost all Word files) are liberally peppered with italicized Latin names, all in different fonts and styles. I want the italics, but not all the other junk.

I'm not sure if I've fully grasped your problem, however, because it seems to me that InDesign already does what you want if you simply "Place" the file using the "Remove Styles and Formatting from Text and Tables" import option with the "Preserve Local Overrides" box checked.

After it's placed in the ID document, I apply my paragraph style sheet to the text, and it's done. See the attached screenshots "before" (Word document, in 12 pt Times New Roman and 18 pt Arial) and "after" (ID document in 11 pt Minion Pro). All the italic and bold comes through with no fiddling. Note that your ID style sheet should specify only the font family, not a particular face (plain, bold, italic, etc.).

Am I barking up the wrong tree?
Attached Thumbnails: before.png, after.png

   
__________________
Howard

OSX 10.10.5
Old 12-17-2012, 03:05 PM   #10
UK_Smithy
Member
 
Join Date: Aug 2009
Posts: 7

Howard - thanks, you're barking up the right tree, that's spot on!

I hadn't understood the 'Preserve Local Overrides' box! I believe Quark 9 has a similar feature, but I only have version 8. ID obviously does have it, so even if I need to work in Quark I can almost certainly convert the text in ID first.

All that said, it does seem feasible to edit tagged text using a text editor, but that's still long-winded compared to InDesign's little gizmo.

Joy of joys - happy Christmas one and all!
All times are GMT -8.


Contents copyright 2004–2014 Desktop Publishing Forum and its members.