For plain text (no markup), there are no special characters except < and &. Just make sure that your XML declaration names the correct encoding for the language or writing system you want to use, and that your editor actually saves the file in that encoding. See the question on non-Latin characters for a longer explanation.

To resolve invalid-character errors, make sure the XML input string conforms to the specification by excluding characters outside the permitted ranges. Typically these characters are not printable or readable; they are uncontrolled artifacts from external sources, which is to be expected when the input XML comes from other systems or from user input.

OK, let's separate the question of which characters are invalid from the question of which characters need escaping. The answer by @dolmen at "stackoverflow.com/questions/730133/invalid-characters-in-xml/5110103#5110103" is still valid, but needs to be updated for the XML 1.1 specification. Legal characters are tab, carriage return, line feed, and the legal characters of Unicode and ISO/IEC 10646. A common remedy in C# is a helper that removes invalid XML characters from a string and returns a new, valid string.

Certain characters are reserved for the markup of an XML document. If they appear unescaped in the data values carried by the same document, the markup is no longer well-formed. For this reason, the XML specification predefines entities that let these characters represent data instead of markup.
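The filtering described above (dropping characters that fall outside the ranges the specification allows) can be sketched as follows. This is a minimal illustration in Python rather than the C# the text mentions; the function name `strip_invalid_xml_chars` is hypothetical, and the ranges are those of the XML 1.0 `Char` production, so XML 1.1 input would need a different pattern.

```python
import re

# XML 1.0 "Char" production (section 2.2 of the specification):
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The negated character class matches every code point outside those ranges.
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return a copy of text with characters that XML 1.0 forbids removed."""
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    dirty = "name\x00 with\x0b control\tchars"
    print(repr(strip_invalid_xml_chars(dirty)))
```

Removing characters silently loses data, so depending on the application it may be preferable to replace them with a visible placeholder or to reject the input outright.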

In C#, use System.Security.SecurityElement.Escape or System.Net.WebUtility.HtmlEncode to escape these special characters.

In addition, certain code points, although valid everywhere in XML 1.0 and XML 1.1 documents, are restricted and discouraged in both versions of XML because they are permanently assigned to non-characters in Unicode and ISO/IEC 10646. Some XML parsers may even flag them as invalid in their character set decoder, and XML documents that contain them may not pass through certain restricted interfaces or may not be interchangeable. These non-characters can still be encoded in the standard UTFs (such as UTF-8), because the UTFs exclude only the surrogate code points.

The characters `:` and `_` are allowed at the beginning of a name. Section 2.2 of the XML specification defines which characters XML allows. You can also verify programmatically that every character in a string is valid in XML.

Markup formats such as HTML define additional named character entities, which serve as aliases for specific Unicode characters and make special characters easier to write. HTML 4 defines 252 named character entities; each of these characters can be referenced by name, by decimal numeric reference, or by hexadecimal numeric reference.

The approach described under "XmlWriter and lower ASCII characters" worked for me.

Comments may appear anywhere in a document outside other markup. Inside a comment, none of the five special characters needs to be, or can be, escaped or encoded.

The following characters are treated as name-start characters rather than name characters, because the property file classifies them as Alphabetic: [#x02BB-#x02C1], #x0559, #x06E5, #x06E6. Note also that not all parsers take this into account, and that XML documents containing control characters may be rejected.

The Namespaces in XML recommendation assigns a meaning to names that contain colons.
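The escaping of special characters mentioned above can be illustrated with Python's standard library, shown here as a rough counterpart to the C# calls named in the text rather than a description of how they behave. The helper names `escape_text`, `escape_attribute`, and `round_trip` are hypothetical; `xml.sax.saxutils.escape` handles `&`, `<`, and `>` by default and takes an optional mapping for further replacements, which this sketch uses for the two quote characters.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Extra replacements for the two quote characters, needed in attribute values.
_QUOTES = {'"': "&quot;", "'": "&apos;"}


def escape_text(value: str) -> str:
    """Escape &, < and > for use in element content."""
    return escape(value)


def escape_attribute(value: str) -> str:
    """Escape all five special characters for use in a quoted attribute."""
    return escape(value, _QUOTES)


def round_trip(value: str) -> tuple:
    """Embed the escaped value in a tiny document and parse it back."""
    document = f'<v a="{escape_attribute(value)}">{escape_text(value)}</v>'
    element = ET.fromstring(document)
    return element.text, element.get("a")


if __name__ == "__main__":
    print(escape_text("a < b & c > d"))
    print(round_trip('x < "y" & z'))
```

Escaping only makes markup characters safe as data. It does not make forbidden control characters legal, so those still have to be filtered or rejected separately.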

Therefore, authors should not use the colon in XML names except for namespace purposes, but XML processors must accept the colon as a name character.

[Definition of "match": (Of strings or names:) Two strings or names being compared MUST be identical. Characters with multiple possible representations in ISO/IEC 10646 (for example, characters with both precomposed and base+diacritic forms) match only if they have the same representation in both strings. No case folding is performed. (Of strings and rules in the grammar:) A string matches a grammatical production if it belongs to the language generated by that production. (Of content and content models:) An element matches its declaration when it conforms in the fashion described in the constraint [VC: Element Valid].]

A fifth predefined reference, &gt;, is also provided for the greater-than sign. Although this character rarely needs to be "escaped" as such, many people prefer to escape it for consistency with the less-than sign.
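The matching rule quoted above (identical code point sequences, no case folding) can be made concrete with a short sketch. The function name `names_match` is hypothetical, and normalizing to NFC before comparing is shown only as something an application might choose to do itself; the definition above does not perform it.

```python
import unicodedata


def names_match(a: str, b: str) -> bool:
    """Compare two names as exact code point sequences.

    No Unicode normalization and no case folding is applied.
    """
    return a == b


# The same visible character in two different representations.
precomposed = "caf\u00e9"   # U+00E9, a single precomposed code point
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

if __name__ == "__main__":
    print(names_match(precomposed, decomposed))   # different representations
    print(names_match("Item", "item"))            # differ only in case
    # An application that wants these to compare equal must normalize first.
    print(names_match(precomposed, unicodedata.normalize("NFC", decomposed)))
```

In practice this means element and attribute names should be written in one consistent normalization form and letter case throughout a document and its schema.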