GlyphCopy

Unicode Character Detector

Paste text to find every zero-width, invisible, control, and direction-mark Unicode character it contains. Toggle the categories you want to remove and copy the cleaned text back. The detector is built for debugging AI output, copied PDFs, broken usernames, and config files, all without uploading your text.

Detect invisible Unicode characters

Paste any string in the textarea above. The detector scans every character, classifies it, and lists the position, code point, and Unicode name for anything that is invisible, zero-width, control-only, or otherwise unexpected. Smart punctuation such as curly quotes and em dashes is reported separately so you can decide whether to keep them.
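
As a rough illustration of that scan, here is a minimal Python sketch. It is not GlyphCopy's implementation: the name `scan` is hypothetical, and Unicode general categories (Cf, Cc, Zs) stand in for whatever tables the tool actually uses. Smart punctuation is left out to keep it short.

```python
import unicodedata

EXPECTED = "\n\r"  # line breaks are treated as normal text in this sketch

def scan(text):
    """Return (position, code point, name, general category) per flagged character."""
    findings = []
    for position, ch in enumerate(text):
        category = unicodedata.category(ch)
        is_format_or_control = category in ("Cf", "Cc") and ch not in EXPECTED
        is_unusual_space = category == "Zs" and ch != " "
        if is_format_or_control or is_unusual_space:
            # Control characters have no Unicode name, so fall back to a label.
            name = unicodedata.name(ch, "<control>")
            findings.append((position, f"U+{ord(ch):04X}", name, category))
    return findings

for finding in scan("ad\u200bmin\u00a0ok"):
    print(finding)
```

Positions here count code points. A browser implementation in JavaScript would more likely report UTF-16 code unit offsets unless it iterates by code point.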

What this tool can find

Zero-width characters

Zero-width characters take no horizontal space. They are common in invisible username tricks, AI output, and copied web pages. The detector finds U+200B, U+200C, U+200D, U+2060, and U+FEFF.
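
A small sketch of how the five code points listed above could be located with a regular expression. The names `ZERO_WIDTH` and `find_zero_width` are made up for this example.

```python
import re

# The five zero-width code points named in the text above.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")

def find_zero_width(text):
    """Return (position, code point) for each zero-width character found."""
    return [(m.start(), f"U+{ord(m.group()):04X}") for m in ZERO_WIDTH.finditer(text)]

print("admin" == "ad\u200bmin")          # the two strings render alike but differ
print(find_zero_width("ad\u200bmin\ufeff"))
```

Keep in mind that U+200D is legitimate inside emoji sequences and U+200C in scripts such as Persian, so finding one is not always a bug.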

Non-breaking and Unicode spaces

Non-breaking spaces (U+00A0) and the other Unicode space characters (U+2000–U+200A, U+202F, U+205F, U+3000) look similar to a normal space but behave differently in line breaking, width, and string comparison. They often slip into copied text from word processors and PDFs.
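
One possible way to normalize these characters is to replace each with a plain U+0020 space rather than delete it, so word boundaries survive. This is a sketch under that assumption; `normalize_spaces` is a hypothetical name, not part of the tool.

```python
import re

# The space characters listed in the text above.
UNICODE_SPACES = re.compile("[\u00a0\u2000-\u200a\u202f\u205f\u3000]")

def normalize_spaces(text):
    """Replace each unusual space with a plain ASCII space."""
    return UNICODE_SPACES.sub(" ", text)

print("10\u00a0kg" == "10 kg")                    # looks the same, compares unequal
print(normalize_spaces("10\u00a0kg") == "10 kg")  # equal after normalizing
```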

Direction marks

Left-to-right and right-to-left marks (U+200E, U+200F) plus the embedding, override, and isolate controls (U+202A–U+202E, U+2066–U+2069) can flip parts of your text in confusing ways. The detector flags every one of them.
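
Because these controls are invisible, one way to inspect them is to swap each for a visible placeholder. The sketch below does that; `reveal_bidi` and the `<U+XXXX>` placeholder format are illustrative choices, not the tool's output format.

```python
import re

# The direction marks and bidi controls listed in the text above.
BIDI_CONTROLS = re.compile("[\u200e\u200f\u202a-\u202e\u2066-\u2069]")

def reveal_bidi(text):
    """Replace each direction control with a visible <U+XXXX> placeholder."""
    return BIDI_CONTROLS.sub(lambda m: f"<U+{ord(m.group()):04X}>", text)

# A right-to-left override hidden inside a file name.
print(reveal_bidi("invoice\u202egpj.exe"))
```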

Control characters

Most control characters in the C0 range (under U+0020) are not meant to appear in normal text. The detector lists every occurrence so you can clean up logs, CSV files, or pasted shell output.
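
A minimal sketch of a C0 check, assuming line feed and carriage return are the only control characters treated as expected. The function name and the `allowed` parameter are hypothetical.

```python
def find_controls(text, allowed="\n\r"):
    """Return (position, code point) for C0 controls outside the allowed set."""
    return [
        (position, f"U+{ord(ch):04X}")
        for position, ch in enumerate(text)
        if ord(ch) < 0x20 and ch not in allowed
    ]

# A NUL byte and an ANSI escape sequence left over from pasted shell output.
print(find_controls("ok\x00\n\x1b[0m"))
```

Passing `allowed="\n\r\t"` would also keep tabs, which is often what you want for TSV files or indented logs.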

Smart punctuation

Curly quotes, em dashes, and ellipses are not invisible, but they often break code, URLs, and CSV files. The detector calls them out so you can decide whether to keep them or normalize them.
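
If you choose to normalize, a translation table is one simple approach. The mapping below is an assumption made for this example (for instance, turning an em dash into a single hyphen); the tool's own replacements may differ.

```python
# Hypothetical mapping from smart punctuation to ASCII equivalents.
SMART_PUNCTUATION = str.maketrans({
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
})

def normalize_punctuation(text):
    """Replace smart punctuation with plain ASCII characters."""
    return text.translate(SMART_PUNCTUATION)

print(normalize_punctuation("\u201cname\u201d: \u2018ada\u2019\u2026"))
```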

Why hidden characters cause problems

Code and config files

A zero-width space inside a JSON key, a YAML indent, or a SQL query is invisible in most editors, yet it can make a parser reject the file or silently turn a key into a different string. Hidden direction marks can make identifiers display in a different order during code review. Detecting and removing them avoids hours of debugging.
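
To make the JSON case concrete, the sketch below uses Python's standard `json` module. A zero-width space inside a key parses without error but produces a different key, while one placed between tokens makes the parser reject the document. The helper names are hypothetical.

```python
import json

def has_key(raw, key):
    """Parse raw JSON and report whether the exact key is present."""
    return key in json.loads(raw)

def parses(raw):
    """Report whether raw is accepted by the JSON parser."""
    try:
        json.loads(raw)
        return True
    except json.JSONDecodeError:
        return False

# Zero-width space inside the key: valid JSON, but the key is not "username".
print(has_key('{"user\u200bname": "ada"}', "username"))

# Zero-width space between tokens: not JSON whitespace, so parsing fails.
print(parses('{\u200b"username": "ada"}'))
```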

CSV and spreadsheets

Spreadsheets often pick up non-breaking spaces and curly quotes from copy-paste. These make exact-match lookups fail and add invisible length to cell values. The detector reveals each occurrence so you can clean the file before importing.
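
Here is a small, hypothetical example of the lookup failure, using Python's `csv` module. The column names, the sample row, and the helper names are invented for illustration.

```python
import csv
import io

def clean_cell(value):
    """Turn non-breaking spaces into plain spaces, then trim the ends."""
    return value.replace("\u00a0", " ").strip()

def load_prices(csv_text, clean=False):
    """Build a sku -> price lookup, optionally cleaning each sku first."""
    prices = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        sku = clean_cell(row["sku"]) if clean else row["sku"]
        prices[sku] = row["price"]
    return prices

# The sku cell ends with a non-breaking space picked up from copy-paste.
EXPORT = "sku,price\nA-100\u00a0,9.99\n"

print("A-100" in load_prices(EXPORT))              # lookup misses
print("A-100" in load_prices(EXPORT, clean=True))  # lookup succeeds
```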

URLs and search

A zero-width space hiding in a URL changes the address, which can break redirects and keep search engines from indexing the link. Text generated by AI tools is a frequent source of these characters.
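
One way to see why the link stops working is to percent-encode the path. In this sketch (the path is a made-up example), the hidden character becomes three extra encoded bytes, so the server receives a different path from the one you see.

```python
from urllib.parse import quote

def encoded_path(path):
    """Percent-encode a URL path as UTF-8, keeping slashes."""
    return quote(path)

print(encoded_path("/docs/setup"))
print(encoded_path("/docs\u200b/setup"))
```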

AI-generated text and copied documents

LLM output and PDFs often contain unusual spaces, smart quotes, and control characters. Run pasted snippets through the detector before publishing them.

How to remove hidden characters safely

Toggle the categories you want to remove, then copy the cleaned text. The default selection removes zero-width characters, direction marks, and control characters because they are almost always unwanted. Non-breaking spaces and smart punctuation are kept by default because they sometimes carry meaning in formatted writing.
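
The toggle behavior could be sketched roughly as follows. This is an illustration under assumptions, not the tool's code: the category names, the `clean` function, and the choice to replace unusual spaces with a plain space instead of deleting them are all hypothetical. The default selection mirrors the one described above.

```python
import re

# Each category maps to (character class, replacement).
CATEGORIES = {
    "zero_width": (r"[\u200b\u200c\u200d\u2060\ufeff]", ""),
    "direction": (r"[\u200e\u200f\u202a-\u202e\u2066-\u2069]", ""),
    # C0 controls except \n and \r. Tab handling is not stated in the text;
    # this sketch strips it.
    "control": (r"[\x00-\x09\x0b\x0c\x0e-\x1f]", ""),
    "spaces": (r"[\u00a0\u2000-\u200a\u202f\u205f\u3000]", " "),
}

DEFAULT_REMOVE = ("zero_width", "direction", "control")

def clean(text, remove=DEFAULT_REMOVE):
    """Apply the selected categories in order and return the cleaned text."""
    for name in remove:
        pattern, replacement = CATEGORIES[name]
        text = re.sub(pattern, replacement, text)
    return text

print(repr(clean("a\u200bb\u202e\x00c\u00a0d\n")))
print(repr(clean("a\u00a0b", remove=("spaces",))))
```

With the defaults, the non-breaking space and the trailing line break in the first call are left in place.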

Privacy: everything happens in your browser

The detector loads no external scripts when you paste, classifies each character locally, and never uploads the input. You can disconnect from the network after the page has loaded and the tool keeps working.

Frequently asked questions

What is a zero-width character?

A zero-width character takes no horizontal space when rendered. The most common are U+200B (Zero Width Space), U+200C (Zero Width Non-Joiner), U+200D (Zero Width Joiner), U+2060 (Word Joiner), and U+FEFF (Zero Width No-Break Space, also used as the byte order mark).

Why does my code or username break?

A hidden character in the middle of a string can prevent exact matches, change parsing, or flip a section of text. The detector finds the offending character and tells you its position.

Can this detect AI-generated text?

AI output sometimes contains unusual zero-width or punctuation characters. The detector finds them, but their presence alone cannot prove that the text was generated by an AI.

Does this detect every Unicode character?

It detects the categories that most commonly cause bugs and abuse: zero-width, control, direction marks, unusual spaces, and smart punctuation. Visible normal characters are not listed.

How do I remove invisible characters?

Toggle on the categories you want to remove and copy the cleaned text. Leave a category toggled off to keep its characters.

Does GlyphCopy upload my text?

No. The detector classifies and rewrites the text locally in your browser.

Why do smart quotes matter?

Smart quotes look almost identical to regular quotes but break exact matches in code, URLs, and CSVs. Cleaning them often fixes mysterious bugs.

What is a non-breaking space?

U+00A0 looks like a normal space but does not allow a line break at its position. It comes from copy-paste in word processors and HTML and is a common source of failed exact matches and unexpected string lengths.

What is a direction mark?

Direction marks (LTR/RTL) and isolates control how text segments flow visually. They can hide changes in code review or fool search by reordering text.

Can I keep normal line breaks?

Yes. The detector treats `\n` and `\r` as expected control characters and the cleaner does not strip them by default.
