Extract Text from PDF Free - Copy Text Without Formatting Issues
Need to copy text from a PDF? Here are the best free methods to extract text from PDFs without formatting problems.
You found a great resource in a PDF, but you can't copy the text. Either the text is locked (copying is disabled), or when you do copy it, it comes out as a jumbled mess with line breaks everywhere and fonts that look wrong.
Extracting text from PDFs shouldn't be this hard. Here are the methods that actually work, from the simplest to the most robust.
Method 1: Direct Copy-Paste (Works for Unlocked PDFs)
Most PDFs let you copy text directly. The problem is how you do it.
The Wrong Way
Select all text with Ctrl+A, paste it into Word, and you get line breaks after every few words. Paragraphs are broken up. Tables turn into nonsense. The formatting is destroyed.
The Right Way
Instead of copying entire pages at once, copy smaller sections:
- Open your PDF reader (Adobe, Chrome, Edge, Preview, etc.)
- Click and drag to select just one paragraph or section at a time
- Press Ctrl+C / Cmd+C to copy
- Paste into your destination with Ctrl+V / Cmd+V
- Repeat for each section
This preserves paragraph structure better than selecting everything at once. It's slower, but you end up with readable text.
Copying from Tables
Tables in PDFs are the worst. Even when you copy them correctly, they paste as unstructured text. Cell boundaries disappear, and columns get mixed up.
If you need the table structure preserved, skip to the methods below that use OCR or specialized conversion tools.
Method 2: Google Docs (Free, Handles Formatting Well)
Google Docs has a hidden talent: it can open PDFs and extract text while preserving most formatting.
- Go to Google Docs in your browser
- Click the folder icon to upload a file
- Select your PDF and upload it
- Google will ask: "Do you want to convert this PDF to an editable Google Doc?"
- Click "Open with Google Docs"
- Wait for the conversion to complete
- You can now edit and copy the text directly
Google's conversion is surprisingly good. It handles:
- Paragraph structure
- Headings (they become bold/larger text)
- Lists (they become bulleted or numbered lists)
- Basic tables (sometimes)
Limitations: Complex layouts often get scrambled. Images are removed (you just get the text). The conversion isn't perfect, but for most documents, it's better than raw copy-paste.
Privacy: You're uploading the PDF to Google. If it contains sensitive information, consider another method.
Method 2.5: Microsoft Word (If You Have Office)
If you have Microsoft Word (part of Microsoft 365, formerly Office 365, or a standalone license), it can open PDFs directly and convert them to editable text.
- Open Microsoft Word
- File → Open → select your PDF
- Word will convert the PDF to an editable document
- The original PDF is left unchanged
- Edit and copy the text as needed
Word's PDF conversion is better than Google Docs in some ways, worse in others. It handles:
- Paragraph structure (usually well)
- Fonts and text sizes (preserved better than Google)
- Tables (hit or miss)
- Images (extracted and placed inline)
Limitations: Requires a paid Word subscription or license (though many workplaces have one). Complex layouts can still get scrambled.
Method 3: PDF to Text Converter Tools (Free, Text-Only)
Sometimes you don't need formatting — just the raw text. PDF to text converters strip everything except the words.
Online Converters
Sites like iLovePDF, Smallpdf, PDF2Go, and countless others offer "PDF to Text" or "PDF to TXT" conversion.
- Go to the converter's website
- Upload your PDF
- Click "Convert"
- Download the TXT file
The result is plain text with no formatting. This is actually useful when:
- You're feeding the text into another tool (like a translation API or text analyzer)
- You want to search through the content programmatically
- File size matters (TXT files are tiny)
- You're pasting into a system that strips formatting anyway
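For the "search programmatically" case, here is a minimal sketch that pipes extracted text into grep. It assumes the pdftotext tool covered in the next section is installed; the function name pdf_grep is made up for this example.

```shell
# pdf_grep PATTERN FILE...  (hypothetical helper name)
# Prints "file:line:text" for every case-insensitive match.
# Line numbers refer to the extracted text, not to PDF pages.
pdf_grep() {
  pattern=$1
  shift
  for pdf in "$@"; do
    # "-" as the output file makes pdftotext write to standard output
    pdftotext "$pdf" - 2>/dev/null |
      grep -i -n -- "$pattern" |
      while IFS= read -r hit; do
        printf '%s:%s\n' "$pdf" "$hit"
      done
  done
}
```

A call such as pdf_grep "invoice" *.pdf would list every matching line across the PDFs in the current folder.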
Command Line: pdftotext
If you're comfortable with the terminal, pdftotext is a fast, free command-line tool.
Install on Ubuntu/Debian:
sudo apt install poppler-utils
Install on Mac:
brew install poppler
Extract text:
pdftotext input.pdf output.txt
Preserve layout (columns, tables; still imperfect but better):
pdftotext -layout input.pdf output.txt
Method 4: OCR for Scanned PDFs (When You Can't Select Text)
If you can't select text at all, your PDF is probably a scanned image. The text is part of the image, not actual text you can copy. You need OCR (Optical Character Recognition).
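If you'd rather check from the terminal than by trying to select text, one rough test is to see whether pdftotext finds any words at all. This is a sketch, not a guarantee: the function name needs_ocr is invented here, and a scan with a few stray text objects (a stamp, a page number) would still count as "has text".

```shell
# needs_ocr FILE  (hypothetical helper name)
# Succeeds (exit status 0) when the PDF has no extractable words,
# which usually means it is a scanned image and needs OCR.
needs_ocr() {
  # "-" sends the extracted text to standard output; tr strips the
  # padding some versions of wc add around the count
  words=$(pdftotext "$1" - 2>/dev/null | wc -w | tr -d ' ')
  [ "$words" -eq 0 ]
}
```

For example: if needs_ocr scan.pdf; then echo "Run OCR first"; fi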
Google Drive (Free, Surprisingly Good)
Google's OCR is free and works better than you'd expect.
- Upload your PDF to Google Drive
- Right-click the PDF → "Open with" → "Google Docs"
- Google will OCR the document and create an editable Doc
- The original PDF is preserved — the OCR'd version is a new file
Google's OCR copes well with typical documents. It recognizes:
- Most common fonts
- Basic layout (columns, paragraphs)
- Mixed fonts and sizes
Handwriting and very old or damaged scans will still give it trouble.
Adobe Acrobat (Paid, Best OCR)
If you have Adobe Acrobat Pro, its OCR engine is excellent.
- Open your PDF in Adobe Acrobat
- Tools → "Edit PDF" (or "Scan & OCR")
- Acrobat will ask if you want to OCR the document
- Choose your OCR settings (language, output format)
- Click "Recognize Text"
- Wait for processing
- You can now copy and edit the text
Adobe's OCR is particularly good at:
- Recognizing text in images
- Preserving layout (columns, headers, footers)
- Handling low-quality scans
- Multilingual documents
Tesseract (Free, Open Source)
Tesseract is a free, open-source OCR engine, originally developed at HP and later sponsored by Google. It's powerful but requires setup.
Install on Ubuntu/Debian:
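sudo apt install tesseract-ocr
Install on Mac:
brew install tesseract
OCR a PDF (Tesseract reads images, not PDFs, so convert the pages to images with Ghostscript first):
gs -sDEVICE=png16m -dTextAlphaBits=4 -r300 -o output-%03d.png input.pdf
tesseract output-001.png output-001
The first command renders each page to a PNG (output-001.png, output-002.png, and so on). The second writes the recognized text of page 1 to output-001.txt; repeat it for each page image.
Tesseract is powerful but requires more technical knowledge than the other options.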
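Running tesseract by hand for every page gets tedious, so here is a minimal sketch that loops over all the pages and collects the text in one file. The function name ocr_pdf and the temporary page-NNN.png file names are assumptions made for this example; the Ghostscript settings are the same ones used above.

```shell
# ocr_pdf INPUT.pdf OUTPUT.txt  (hypothetical helper name)
ocr_pdf() {
  input=$1
  output=$2
  workdir=$(mktemp -d) || return 1
  # Render every page to a 300 dpi PNG: page-001.png, page-002.png, ...
  gs -q -sDEVICE=png16m -dTextAlphaBits=4 -r300 \
    -o "$workdir/page-%03d.png" "$input" || return 1
  : > "$output"   # start with an empty output file
  for page in "$workdir"/page-*.png; do
    [ -e "$page" ] || continue   # no pages were rendered
    # Using "stdout" as the output name makes tesseract print the text
    tesseract "$page" stdout >> "$output" 2>/dev/null
  done
  rm -rf "$workdir"
}
```

Usage would look like ocr_pdf scan.pdf scan.txt. For non-English documents, tesseract's -l option selects the language, provided that language pack is installed.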
Method 5: When Copying Is Disabled (Permission Restrictions)
Sometimes you can select text, but when you try to copy, nothing happens. The PDF has a permission restriction that disables copying.
Chrome's Print Trick (Works for Most Permission Passwords)
As mentioned in other guides, Chrome can strip permission passwords for free:
- Open the PDF in Chrome
- Press Ctrl+P / Cmd+P to print
- Change printer to "Save as PDF"
- Save the new PDF
- Open the new PDF — copying is now enabled
This removes permission restrictions (including disabled copying), but it doesn't help with PDFs that require a password just to open (user passwords).
Online "Unlock PDF" Tools
Sites like iLovePDF, Smallpdf, and PDF2Go offer "unlock PDF" features that remove permission restrictions.
- Upload your PDF
- Select "unlock" or similar option
- If it's a permission password, you won't need to enter anything
- Download the unlocked PDF
Warning: You're uploading your files. For sensitive documents, use a local tool like Adobe Acrobat or command-line qpdf/pdftk.
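For the local route, a minimal qpdf sketch might look like the following. The function name unlock_pdf is invented for this example, and it assumes qpdf is installed and that the PDF opens without a password (a permission password only).

```shell
# unlock_pdf INPUT.pdf OUTPUT.pdf  (hypothetical helper name)
unlock_pdf() {
  input=$1
  output=$2
  # Show which restrictions, if any, are set on the file
  qpdf --show-encryption "$input"
  # Write an unencrypted copy; the original file is left untouched
  qpdf --decrypt "$input" "$output"
}
```

If the file needs a password to open, qpdf has to be given it with its --password option, so this sketch would not be enough on its own.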
Common Problems and Fixes
Problem: Line Breaks After Every Word
Cause: PDFs typically store text as individually positioned lines rather than flowing paragraphs, so every visual line ends in a hard break when pasted. Copying an entire page at once makes it worse.
Fix: Copy smaller sections instead of whole pages, then manually merge paragraphs. Or use Google Docs/Word conversion which handles this better.
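If you already have a text file full of hard line breaks, the manual merge can be approximated with a few lines of awk. This is a rough sketch with an invented function name (unwrap_paragraphs); it assumes paragraphs are separated by blank lines and does not try to rejoin words hyphenated across lines.

```shell
# unwrap_paragraphs  (hypothetical helper name)
# Reads text on standard input and joins the lines of each paragraph
# into one line. Blank lines are kept as paragraph separators.
unwrap_paragraphs() {
  awk '
    /^[[:space:]]*$/ {            # blank line: flush the paragraph
      if (buf != "") print buf
      print ""
      buf = ""
      next
    }
    { buf = (buf == "") ? $0 : buf " " $0 }   # append line to paragraph
    END { if (buf != "") print buf }
  '
}
```

Usage would look like unwrap_paragraphs < pasted.txt > clean.txt.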
Problem: Text Comes Out as Garbled Characters
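Cause: The PDF's fonts use a custom encoding with no usable mapping back to standard characters, so your viewer can't tell which letters the shapes on the page represent. This is common with PDFs containing Chinese, Japanese, or Korean text, or produced by specialized software.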
Fix: Try Google Docs conversion (it has broader font support). If that fails, use Adobe Acrobat or try OCR even on non-scanned PDFs.
Problem: Tables Are Completely Broken
Cause: PDFs usually don't store tables as tables. What looks like a table is just text positioned in rows and columns, so the layout information is lost when you copy it.
Fix: Use specialized PDF to Excel converters (like those covered in other guides) or Adobe Acrobat, which does better at detecting table structures.
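For simple tables, a free heuristic is sometimes good enough: extract with pdftotext -layout, then treat runs of two or more spaces as column boundaries. The sketch below (function name layout_to_tsv is invented here) produces tab-separated text you can paste into a spreadsheet. It will misfire on empty cells, wrapped cell text, and cells separated by only one space.

```shell
# layout_to_tsv  (hypothetical helper name)
# Reads pdftotext -layout output on standard input and writes
# tab-separated columns.
layout_to_tsv() {
  awk '{
    sub(/^ +/, "")        # drop leading indentation
    sub(/ +$/, "")        # drop trailing spaces
    gsub(/  +/, "\t")     # two or more spaces become one tab
    print
  }'
}
```

Usage would look like pdftotext -layout input.pdf - | layout_to_tsv > table.tsv.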
Problem: Can't Copy Anything — Text Not Selectable
Cause: The PDF is a scanned image, not actual text. It needs OCR.
Fix: Use Google Drive, Adobe Acrobat, or Tesseract to OCR the document first, then copy the text.
Quick Reference: Which Method Should You Use?
"Just need to copy a few paragraphs"
- Direct copy-paste, section by section
"Need to preserve formatting as much as possible"
- Google Docs (free) or Microsoft Word (if you have it)
"Just want the raw text, no formatting"
- Online PDF to text converter or pdftotext command line
"Can't select text at all — it's a scanned document"
- Google Drive OCR (free, works well enough)
- Adobe Acrobat OCR (best quality, paid)
- Tesseract (free, technical)
"Copying is disabled — permission restriction"
- Chrome print-to-PDF trick (free, works for most)
- Online unlock tool (free, uploads files)
- Adobe Acrobat or qpdf/pdftk (local)
"Need to copy a table with structure intact"
- Adobe Acrobat (best chance)
- Specialized PDF to Excel converter
The Bottom Line
Extracting text from PDFs is simple when the PDF is unlocked and you copy small sections. It gets complicated with scanned documents, permission restrictions, or complex layouts.
Start with direct copy-paste. If that fails, try Google Docs or Word conversion. For scanned PDFs, use OCR. For permission issues, Chrome's print trick or an unlock tool will usually do the job.
And remember: some PDFs were never meant to have their text extracted. Scanned images of text, heavily formatted documents, and PDFs with custom encodings may always give you trouble. In those cases, you might need to type the text manually or find an alternative source.