Removing invisibles

ngazidja
Posts: 152
Joined: 2005-01-23 17:12:16

Removing invisibles

Post by ngazidja »

A friend has sent me a Microsot Word document (I know, I told him) which has some serious problems. It took me forever to open, is 1.7Mb, has 400,000 characters and 163,000 words and runs to 70 pages. Hmm. I turned on invisibles and it appears every second character is something I haven't seen before, an invisible that looks like a box crossed out.

In order of importance:

1. How do I get rid of these (I can't cut and paste them into the find and replace, and since I don't know what they are I can't type them into the find box either)?

2. What are they?

3. How did they get there?

4. What is the meaning of life? (Just kidding, answers to 1-3 will do).
cchapin
Posts: 424
Joined: 2004-02-25 18:28:40
Location: Nagoya, Japan

Post by cchapin »

Here's an idea that might help. I don't really know.

1. Select one of these weird characters.
2. Open the system's Character Palette (Edit > Special Characters or else the language flag in the menu bar > Show Character Palette).
3. In the lower left corner of the Character Palette (at least in Tiger) is a button with a little gear and a triangle. Click that.
4. From the menu that pops up, select Show Character Selected in Application.
5. Open the Find dialog (Edit > Find > Find).
6. With the insertion point in the "Find what" field, click the Insert button at the bottom right of the Character Palette.
7. Back in the Find dialog (you might as well close the Character Palette at this point), make sure that the "Replace with" field is blank.
8. Click Replace All.

Let me know if this works.

--Craig
ngazidja
Posts: 152
Joined: 2005-01-23 17:12:16

Post by ngazidja »

Unfortunately I have 10.3.9, which doesn't have that character selected option...
martin
Official Nisus Person
Posts: 5228
Joined: 2002-07-11 17:14:10
Location: San Diego, CA

Re: Removing invisibles

Post by martin »

ngazidja wrote:1. How do I get rid of these?
You actually should be able to copy-paste one of these gremlins to the Find window. Just place the caret before one of them, hold down the Shift key, and press the left or right arrow key once. Even though nothing appears to be selected, you should have the character. Then copy-paste it to the Find window and do the replacement.
2. What are they?
If you are using the pre-release version of 2.7 you can find out what code points the little boxes are by using the menu Convert > To Unicode Code Points.
3. How did they get there?
Hard to know, could be a file conversion bug or perhaps they came along with the original text. If you can, send it onward to expressfeedback@nisus.com.
4. What is the meaning of life?
42 right?
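
Going back to question 2: outside Express, the same kind of check can be done with a few lines of script. This is only a rough sketch in Python, not what the Convert > To Unicode Code Points menu does internally; the function name `code_points` and the sample string are made up for illustration.

```python
def code_points(text):
    """Return a 'U+XXXX' label for every character in text."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Made-up sample: two letters with an invisible character between them.
sample = "a\u0000b"
print(code_points(sample))
```

Pasting a short run of the affected text in place of `sample` should show which code point the crossed-out box stands for.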
ngazidja
Posts: 152
Joined: 2005-01-23 17:12:16

Re: Removing invisibles

Post by ngazidja »

You actually should be able to copy-paste one of these gremlins to the Find window.
I know you won't believe this, but it didn't work the first time I tried... honest! (But I have done a bit of format switching and save-as-text since I received it.)
Convert > To Unicode Code Points.
U+0000. Now that I know that, what do I do?
If you can, send it onward to expressfeedback@nisus.com.
I can't send the whole doc, since it's not mine, but I can send an excerpt if that's any use. But it's basically a file with a U+0000 symbol between every character, as far as I can tell.
42 right?
What was the question again?
martin
Official Nisus Person
Posts: 5228
Joined: 2002-07-11 17:14:10
Location: San Diego, CA

Re: Removing invisibles

Post by martin »

ngazidja wrote:U+0000. Now that I know that, what do I do?
That's the ASCII/Unicode "null" character. Basically it shouldn't ever appear in document text. I've actually seen this before, in all cases because the text was mangled by some application before it ever touched Express.
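
One common way text gets mangled like this (an assumption here, nothing in this thread confirms it for this particular file) is that UTF-16 text is read as if it were a one-byte-per-character encoding. For plain English text, every second byte of UTF-16 is zero, so a null ends up after every character. A small Python sketch of the effect, using a made-up sample string:

```python
original = "Hi there"

# UTF-16 (little-endian) stores each of these characters as two bytes,
# and for plain ASCII letters the second byte is always zero.
raw = original.encode("utf-16-le")

# Decoding those bytes with a single-byte encoding turns every zero byte
# into a U+0000 character sitting between the real characters.
misread = raw.decode("latin-1")

print(repr(misread))
print(len(original), "characters became", len(misread))
```

That would match a document where the character count looks roughly doubled and it takes two arrow presses to move past one visible letter.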

What can you do? Probably not much if the document wasn't put together by you. Once you have it in Express you can use PowerFind Pro to eliminate the null characters:

1. Open the Find window.
2. Set the mode to PowerFind Pro.
3. Search for "\u0000" and replace it with nothing.
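
If the text has already been saved as plain text, the same replacement can be sketched outside Express. This is a minimal Python version of "replace U+0000 with nothing", not how PowerFind Pro works internally; `zap_nulls` is a hypothetical helper name.

```python
def zap_nulls(text):
    """Remove every U+0000 character and report how many were removed."""
    removed = text.count("\u0000")
    cleaned = text.replace("\u0000", "")
    return cleaned, removed


# Made-up sample with a null after every visible character.
cleaned, removed = zap_nulls("H\u0000e\u0000l\u0000l\u0000o\u0000")
print(cleaned, removed)
```

Only the null character is touched, so other invisibles such as tabs and line breaks stay where they are.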

Maybe we should add a Zap Gremlins command like BBEdit has.
ngazidja
Posts: 152
Joined: 2005-01-23 17:12:16

Post by ngazidja »

Wasn't there a "remove gremlins" facility in Classic? Or am I thinking of some other piece of software?
ngazidja
Posts: 152
Joined: 2005-01-23 17:12:16

Post by ngazidja »

(Like the one that mangled it?)
ngazidja
Posts: 152
Joined: 2005-01-23 17:12:16

Post by ngazidja »

Me again, just thought I'd summarize the process, FYI:

1. Received a "doc" file, couldn't open it with NWE.

2. Friend saved it as an rtf and sent again.

3. Opened it with NWE, but still couldn't find the problem - the invisibles didn't appear when switched on but, looking at it again now, I see there is something between each character (because it takes two arrow presses to move past one character).

4. Saved it as txt and opened again with NWE.

5. Invisible invisibles now visible and selectable, copyable, etc.

6. PowerFind Pro now removing several hundred thousand gremlins...

7. Gone for a coffee.

8. Document fine.
Mark XM
Posts: 51
Joined: 2003-01-23 18:54:50
Location: Xiamen, China

Re: Removing invisibles

Post by Mark XM »

ngazidja wrote:A friend has sent me a Microsot Word document (I know, I told him) which has some serious problems. It took me forever to open, is 1.7Mb, has 400,000 characters and 163,000 words and runs to 70 pages. Hmm. I turned on invisibles and it appears every second character is something I haven't seen before, an invisible that looks like a box crossed out.

I often import Microsot (Good version!) Wierd docs into NWE, in my case produced under Chinese Windo$e though it happens with other versions, which are problematic. I have two solutions that you could try.

(1) run them through MacLink plus and translate this one to RTF Mac and then try opening that.

(2) open it in TextEdit and save it out as an RTF file, or at worst as a plain text file.

Or do both in that order. It's a real bore, but it makes things possible.

Mark
martin
Official Nisus Person
Posts: 5228
Joined: 2002-07-11 17:14:10
Location: San Diego, CA

Post by martin »

If either of you can submit to me a document that has these gremlins in Express, but not in TextEdit, I would be grateful. I do need to see the whole file I'm afraid, since just copying a chunk of it and saving it as a new file won't reveal any decoding errors.
Mark XM
Posts: 51
Joined: 2003-01-23 18:54:50
Location: Xiamen, China

Post by Mark XM »

martin wrote:If either of you can submit to me a document that has these gremlins in Express, but not in TextEdit, I would be grateful. I do need to see the whole file I'm afraid, since just copying a chunk of it and saving it as a new file won't reveal any decoding errors.
Hi Martin, you've had most of my problem files in the course of our correspondence. My message was merely a suggestion to ngazidja that ways of getting to files that don't want to behave properly include using MacLink Plus to translate them and/or opening them first in TextEdit. I have found that TextEdit is pretty tolerant of peculiarities, so it can help in cases where Nisus won't open a file. I just thought that his gremlins might show up in TextEdit, which would enable him to zap them. :)

Yours
Mark
dhlr1
Posts: 5
Joined: 2004-03-13 05:07:11

gremlins

Post by dhlr1 »

I miss the remove gremlins Classic macro.