Extract Embedded Files
Locate and retrieve hidden file attachments, data sheets, and invoice XML documents embedded inside your PDF files.
Extraction Settings
Files are scanned and extracted directly within your browser. Select options below to download.
Why Use Extract Embedded Files?
Full Attachment Parsing
Scans the PDF catalog's `/EmbeddedFiles` name tree to find every file attached to the document.
Batch ZIP Downloads
Download attachments one at a time, or pack them all into a single ZIP archive.
Privacy-First Sandboxing
Attachment parsing and binary extraction happen locally inside your web browser. No data ever leaves your device.
Zero Limits & Fast
Process and download files without sign-up, queues, size restrictions, or watermarks.
When to Extract Embedded Files from PDF
Extract E-Invoice Data
Recover structured ZUGFeRD or Factur-X XML data from billing documents for automated accounts payable.
Retrieve Source Tables
Extract raw Excel sheets, CSV data, or research spreadsheets nested inside publication PDFs.
Unpack Design Assets
Recover high-res image files, vector source files, or CAD attachments from layout portfolios.
Legal Exhibit Access
Access and open evidentiary files, media, or annex documents bundled inside primary legal briefs.
Access Interactive Media
Extract training audio, video clips, or sample scripts embedded inside interactive training packets.
Audit Hidden Files
Inspect incoming PDF documents to identify and audit any embedded attachments for safety and compliance.
About Extract Embedded Files
What are embedded files in PDFs?
An embedded file (also known as an attachment) is any file, such as a spreadsheet, image, XML document, or ZIP archive, that is stored directly inside a PDF document. Unlike images or text drawn onto a page, embedded files are kept as raw binary streams referenced from the PDF's internal catalog, much like email attachments.
How the extraction pipeline works
This tool uses `pdfjs-dist` to parse the PDF's document catalog. It resolves the catalog's `/Names` dictionary and walks the `/EmbeddedFiles` name tree, which maps each attachment name to its file specification. For each attachment, the tool retrieves the file name, the size, and the raw bytes (a `Uint8Array`), which can then be downloaded directly in the browser or packed into a ZIP archive.
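The retrieval step described above can be sketched as follows. This is a minimal illustration, not the tool's actual source: `AttachmentSource`, `AttachmentInfo`, and `listAttachments` are hypothetical names, and the sketch assumes the document object exposes a `getAttachments()` method like the one on the document proxy in `pdfjs-dist`, which resolves to a lookup table of attachments or to `null` when the PDF has none.

```typescript
// Shape of one attachment entry as assumed here: a file name plus raw bytes.
interface PdfAttachment {
  filename: string;
  content: Uint8Array;
}

// Hypothetical stand-in for a loaded PDF document; only the one method
// this sketch needs. A pdfjs-dist document proxy is assumed to fit it.
interface AttachmentSource {
  getAttachments(): Promise<Record<string, PdfAttachment> | null>;
}

// Hypothetical row for the attachment list shown to the user.
interface AttachmentInfo {
  name: string;
  sizeBytes: number;
  bytes: Uint8Array;
}

// Collect name, size, and raw bytes for every embedded file.
async function listAttachments(doc: AttachmentSource): Promise<AttachmentInfo[]> {
  const table = await doc.getAttachments();
  if (!table) return []; // the PDF has no embedded files
  return Object.values(table).map((attachment) => ({
    name: attachment.filename,
    sizeBytes: attachment.content.length,
    bytes: attachment.content,
  }));
}
```

In a browser, `doc` would typically come from loading the uploaded file's bytes with `pdfjs-dist`. Each `bytes` buffer can then be wrapped in a `Blob` for an individual download or handed to a ZIP library.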
Confidentiality and local parsing
Many PDF tools upload your documents to external servers, which can be a security risk for corporate documents. This tool runs entirely in the browser. All parsing and unpacking happen locally on your device, so financial spreadsheets, invoices, and legal contracts are never transmitted anywhere.
ZUGFeRD & Factur-X standard support
In European e-invoicing standards such as ZUGFeRD and Factur-X, invoices are sent as hybrid PDF/A-3 documents. Each file contains a human-readable PDF layout and an embedded XML file that carries the same invoice as structured data. This tool lets you extract that XML file so it can be imported into accounting software.
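Once the attachments are listed, picking out the invoice XML can be as simple as matching on the file name. The sketch below is illustrative only: `findInvoiceXml` and `NamedAttachment` are hypothetical names, and the file-name list reflects names commonly used by ZUGFeRD and Factur-X releases. It is an assumption that this list covers the invoices you receive, so check it against your own files.

```typescript
// Minimal attachment shape for this sketch: a file name plus raw bytes.
interface NamedAttachment {
  filename: string;
  content: Uint8Array;
}

// Assumed, non-exhaustive list of invoice XML file names (lowercase).
const INVOICE_XML_NAMES = ["factur-x.xml", "zugferd-invoice.xml", "xrechnung.xml"];

// Return the first attachment whose name matches a known invoice XML name,
// ignoring case, or undefined when the PDF carries no such file.
function findInvoiceXml(attachments: NamedAttachment[]): NamedAttachment | undefined {
  return attachments.find((attachment) =>
    INVOICE_XML_NAMES.includes(attachment.filename.toLowerCase()),
  );
}
```

Matching by name is a convenience, not a validation step. The XML should still be checked against the relevant schema before it is imported.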
Difference from hyperlinks
Hyperlinks in a PDF point either to other locations in the same document or to external web URLs, and external links need an internet connection and a browser to open. Embedded files, in contrast, are stored inside the PDF itself, so attachments always travel with the main document and can be opened completely offline.
Frequently Asked Questions
How do I extract embedded files from a PDF?
Upload your PDF to the dropzone. The tool parses the document structure in your browser, looks for attachments, and displays them as a list. You can then download individual files or click "Download All (ZIP)" to save them as a single ZIP archive.
What file types can be embedded in a PDF?
Any file format can be embedded as an attachment in a PDF. Common examples include invoice XML (ZUGFeRD), spreadsheets and tabular data (XLSX, CSV), high-resolution images (PNG, JPEG), text files (TXT), CAD drawings, Word documents, and even other PDFs.
Are my files kept private?
Yes, entirely. All attachment extraction logic runs locally within your browser using JavaScript. No files or contents are sent to external servers, which protects your privacy and helps you comply with security policies for confidential data.
Why can't I see attachments in my browser's PDF viewer?
Built-in browser PDF viewers (like those in Chrome, Edge, or Safari) focus on rendering page layouts and often do not include an interface for viewing or downloading file attachments. You would normally need desktop software such as Adobe Acrobat Reader to view them, but this tool lets you extract them in any browser.
Can I extract attachments from password-protected PDFs?
Not directly. Password-protected PDFs are encrypted, so their embedded file streams cannot be read without the password. First unlock the PDF with our "Unlock PDF" tool, then extract its attachments.
Is there a limit on file size or number of attachments?
No. The tool imposes no limits on file size, page count, or number of attachments, and you can extract dozens of embedded files at once for free. In practice, speed and maximum size depend only on the memory available to your browser.
Is this tool free to use?
Yes. All tools on our website are free to use, with no registration, email sign-up, or usage limits.