Turn Any PDF Into a WordPress Blog Post While You Sleep

TL;DR: This n8n workflow transforms PDF documents into polished WordPress blog posts using AI text extraction, GPT-powered content generation, and automated image creation with Pollinations.ai. The human-in-the-loop Gmail approval step ensures quality control before publication, while Telegram and email notifications keep stakeholders informed. Perfect for content teams drowning in whitepapers who need to ship blog posts faster than David needs to debug a webhook.

Difficulty: ⭐⭐⭐ (Level 3 - Intermediate)
Who's This For? Content managers juggling PDFs, marketing teams repurposing research, solopreneurs automating their blog
Problem Solved: Manually rewriting PDF content into blog posts eats hours you don't have
Template Link: n8n.io/workflows/3010
Tools Required: n8n, OpenAI API, WordPress site, Gmail account, Telegram bot (optional), imgbb account
Setup Time: 45-60 minutes
Time Saved: 3-4 hours per blog post

The PDF Graveyard Problem

David once told me he had seventeen PDFs sitting in a folder labeled "blog ideas" that hadn't been touched in six months. When I asked why, he said writing blog posts from scratch felt like "translating ancient Sumerian while someone's toddler screams in the background."

I suggested he just copy-paste the PDF text into WordPress and clean it up. He looked at me like I'd suggested he build a CMS from scratch using punch cards.

Turns out there's a middle path. This workflow takes any PDF, extracts the text, hands it to GPT to rewrite as a proper blog post with structure and SEO juice, generates a featured image automatically, and drops the whole thing into WordPress as a draft. The twist? Before it publishes, Gmail sends you an approval email so you can say yes or no without logging into WordPress.

David's folder is now down to three PDFs. Progress.

What This Workflow Actually Does

At its core, this workflow is a content assembly line that starts with a PDF and ends with a WordPress post ready to ship. The process runs through six major stations, each handling a specific transformation.

First, you upload a PDF through a web form that n8n hosts for you. This form lives at a webhook URL, meaning anyone with the link can drop a PDF and trigger the workflow. The moment that file arrives, n8n's Extract From File node (see n8n Extract From File documentation) pulls out all the text content, whether it's a research paper, a whitepaper, or last quarter's investor deck.

That raw text then flows into a LangChain node connected to OpenAI's GPT-4o-mini model. The prompt is carefully structured to demand specific formatting: an H1 title under ten words with no colon, an introduction between 150-200 words, six to eight main chapters with H2 headings and 300-400 words each, and a conclusion that wraps it all up. The AI doesn't just summarize; it rewrites with personality, adds transitions, and formats everything in clean HTML.

Once the AI has produced a title, the image branch of the workflow hits Pollinations.ai with an HTTP request. This free image generation API takes the blog post title as a prompt and returns a vibrant, AI-generated image. The workflow downloads this image as binary data, converts it to base64, and uploads it to imgbb for temporary hosting. Why imgbb? Because WordPress needs a publicly accessible URL to fetch the image before setting it as the featured image.

Once the content and image are ready, n8n creates a draft post in WordPress using the native WordPress node. It sets the title, injects the HTML content, and attaches the featured image by ID after uploading it through WordPress's media API (official WordPress REST API Media reference). At this point, you have a complete blog post sitting in your drafts folder, but it hasn't gone live yet.

Here's where the human-in-the-loop approval kicks in. The workflow sends you an email via Gmail with the full blog post content in the body. You read it and click "Approve" or "Reject" directly in the email, and that click reports your decision back to the waiting n8n execution. If you approve, the workflow publishes the post. If you reject, it logs an error and sends a Telegram notification so you know something needs fixing.

Finally, once published, the workflow sends confirmations via both Gmail and Telegram. The email contains the full post text, while the Telegram message includes a preview image and the first 400 characters of markdown-converted content. Stakeholders get notified, and you get peace of mind that the post actually shipped.

Quick Start Guide

Before diving into the node-by-node setup, gather your credentials. You'll need an OpenAI API key with access to GPT-4o-mini, a WordPress site with REST API enabled and application password credentials, a Gmail account with OAuth or app password configured, and an imgbb API key from their free tier. If you want Telegram notifications, create a bot via BotFather and grab the chat ID for your notification channel.

Import the template from n8n.io/workflows/3010 into your n8n instance. The workflow will appear with placeholder credentials marked in red. Click each node that requires authentication and connect your accounts. The Form Trigger node needs no credentials but will generate a unique webhook URL once you activate the workflow. Copy this URL because you'll need it to upload PDFs.

Customize the AI prompt inside the "Write Blog Post" LangChain node to match your content voice. The default prompt produces formal, structured posts, but you can adjust tone, length, and formatting requirements by editing the message parameter. Test the workflow by uploading a sample PDF through the form URL, then watch the execution log to see each step complete. If everything works, you'll receive an approval email within 30-60 seconds.

Building the Workflow Step by Step

The journey starts with the Form Trigger node, which n8n configures as a webhook that accepts file uploads. Set the path to something memorable like "/pdf-to-blog" and configure the form fields to accept a single PDF file with a label like "Upload PDF File." Enable the "Required Field" option so the form won't submit without a file attached. When someone visits your webhook URL, they'll see a clean form titled "PDF2Blog" with the description "Transform PDFs into captivating blog posts."

Connect the Form Trigger output to an Extract From File node set to "PDF" operation mode. Map the binary property name to "Upload_PDF_File" which matches the form field name. This node uses pdf-parse under the hood to pull text from the PDF, handling most standard PDF formats including scanned documents with embedded text layers. The extracted content flows out as a text string in the json.text property.

Branch the workflow into two paths here. The main path goes to the LangChain node for content generation, while a secondary path waits to handle images later. In the LangChain node, connect a ChatOpenAI sub-node configured with your API key and model set to "gpt-4o-mini." Set the response format to "text" in the options. The prompt should instruct the AI to analyze the PDF text and create a blog post following specific structural requirements.

For Advanced Readers: The LangChain prompt uses n8n's expression syntax to inject the extracted text: ={{ $json.text }}. The message parameter contains the full prompt with markdown formatting for structure. You can add custom instructions like "Use a conversational tone" or "Include practical examples" by appending them to the existing prompt text.
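To make the prompt structure concrete, here is a minimal sketch of how such a prompt could be assembled. The helper name buildBlogPrompt and the exact wording of each rule are hypothetical; the template's real prompt lives in the node's message parameter and may be phrased differently.

```javascript
// Hypothetical helper: assembles a prompt with the structural rules described above.
// The wording is illustrative, not the template's actual prompt.
function buildBlogPrompt(pdfText, extraInstructions = []) {
  const rules = [
    "Write the post in clean HTML.",
    "Start with an <h1> title under ten words, with no colon.",
    "Add an introduction of 150-200 words.",
    "Write six to eight chapters, each with an <h2> heading and 300-400 words.",
    "End with a conclusion.",
    ...extraInstructions, // custom tone or style rules are appended last
  ];
  return [
    "Analyze the following PDF text and rewrite it as a blog post.",
    "",
    "Requirements:",
    ...rules.map((rule, i) => `${i + 1}. ${rule}`),
    "",
    "PDF text:",
    pdfText, // in n8n this would be the {{ $json.text }} expression
  ].join("\n");
}

const prompt = buildBlogPrompt("Sample extracted text.", ["Use a conversational tone."]);
console.log(prompt);
```

Keeping the rules in a list makes it easy to append or remove one instruction at a time while you tune the voice.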

After the AI generates the blog post, pipe the output into a Code node named "Get Blog Post." This node uses JavaScript to parse the HTML content and extract the title from the first H1 tag using regex. The script returns a json object with two properties: title and content. This separation allows later nodes to reference the title and body independently.

For Advanced Readers: The regex pattern /<h1>(.*?)<\/h1>/s captures content between H1 tags using a non-greedy match. The /s flag enables dotall mode so the pattern works even if the title spans multiple lines. The extracted title gets stored in json.title while the full HTML stays in json.content.
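A minimal sketch of what the "Get Blog Post" logic could look like, written as a plain function so it runs outside n8n. The function name splitBlogPost is hypothetical, and trimming whitespace around the title is an extra assumption rather than something the template is known to do.

```javascript
// Sketch of the title/body split. In an n8n Code node the HTML would come from
// the previous node's output; here it is passed in as a plain string.
function splitBlogPost(html) {
  // Non-greedy match between <h1> tags; the s flag lets "." span newlines.
  const match = html.match(/<h1>(.*?)<\/h1>/s);
  // Assumption: trim stray whitespace so multi-line titles come out clean.
  const title = match ? match[1].trim() : "";
  return { title, content: html };
}

const sample = "<h1>\n  Ship Posts Faster\n</h1>\n<p>Intro text.</p>";
const post = splitBlogPost(sample);
console.log(post.title);
// Inside a Code node, the result would be wrapped as: return [{ json: post }];
```

Returning an empty title when no H1 is found, instead of throwing, lets a later validation step decide what to do with a malformed post.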

Insert an If node to validate the AI output. Configure it to check two conditions: {{ $json.title }} is not empty AND {{ $json.content }} is not empty. This prevents the workflow from trying to publish incomplete posts if the AI fails or times out. Route the "true" output to continue the workflow, and route the "false" output to a Telegram error notification node.
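The same check can be expressed in code, which helps if you ever fold the validation into a Code node. This is a sketch only: the function names are hypothetical, and treating whitespace-only strings as empty is a stricter assumption than the If node's plain "is not empty" test.

```javascript
// Returns true only when both fields carry real text.
function isPublishable(post) {
  const hasText = (value) => typeof value === "string" && value.trim().length > 0;
  return hasText(post.title) && hasText(post.content);
}

// Mirrors the If node's two outputs: "true" continues, "false" goes to the alert.
function nextStep(post) {
  return isPublishable(post) ? "continue" : "telegram-error";
}

console.log(nextStep({ title: "Ship Posts Faster", content: "<h1>Ship Posts Faster</h1>" }));
console.log(nextStep({ title: "", content: "<p>Body without a heading</p>" }));
```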

On the image generation branch, add an HTTP Request node whose URL is https://image.pollinations.ai/prompt/ followed by the prompt text: {{ $('Get Blog Post').item.json.title }} and avoid adding text and keep the image vibrant. The title expression and the trailing style instruction together form the image prompt, so the URL changes with every post. Pollinations.ai returns a JPEG image, and n8n stores it as binary data when you set the response format to binary.
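Titles can contain characters such as ? or & that have special meaning in a URL. The sketch below builds the same kind of prompt URL with explicit encoding. The function name buildImageUrl is hypothetical, and the encoding step is an addition for safety, not something the template is confirmed to do.

```javascript
const POLLINATIONS_BASE = "https://image.pollinations.ai/prompt/";

// Builds the image URL from the post title plus the fixed style instruction.
function buildImageUrl(title) {
  const prompt = `${title} and avoid adding text and keep the image vibrant`;
  // encodeURIComponent escapes spaces, "?", "&" and similar characters
  // so the whole prompt stays inside the URL path.
  return POLLINATIONS_BASE + encodeURIComponent(prompt);
}

const imageUrl = buildImageUrl("Ship Posts Faster & Smarter?");
console.log(imageUrl);
```

In an n8n expression the equivalent would be wrapping the prompt text in encodeURIComponent inside the curly braces.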

Connect the image output to another HTTP Request node configured for imgbb's upload API. Set the method to POST, URL to https://api.imgbb.com/1/upload, and add query parameters for your imgbb API key and expiration time (600 seconds works well). In the body parameters, set "image" to ={{ $json.data }} (the base64 image data from the previous node after passing through a "Get Base64" node). imgbb returns a JSON response with a public URL you'll use for WordPress.
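For readers who want to see the shape of that upload, here is a rough sketch that converts image bytes to base64 and describes the imgbb request without sending it. The function name buildImgbbRequest and the YOUR_IMGBB_KEY placeholder are hypothetical; how the form field is encoded on the wire is left to whatever HTTP client sends it.

```javascript
// Describes the imgbb upload: key and expiration in the query string,
// base64 image data in the "image" form field.
function buildImgbbRequest(imageBytes, apiKey, expirationSeconds = 600) {
  const query = new URLSearchParams({
    expiration: String(expirationSeconds),
    key: apiKey,
  });
  return {
    method: "POST",
    url: `https://api.imgbb.com/1/upload?${query}`,
    // Equivalent of the "Get Base64" step: binary bytes to a base64 string.
    form: { image: Buffer.from(imageBytes).toString("base64") },
  };
}

// The first three bytes of a JPEG file, used as stand-in image data.
const request = buildImgbbRequest(Uint8Array.from([0xff, 0xd8, 0xff]), "YOUR_IMGBB_KEY");
console.log(request.url);
console.log(request.form.image);
```

The public link should then be read from the JSON response; check imgbb's API documentation for the exact field names before relying on them.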

Now create the WordPress draft. Add a WordPress node set to "Create" operation for posts. Set the title to ={{ $('Get Blog Post').item.json.title }} and content to ={{ $('Get Blog Post').item.json.content }}. In additional fields, set status to "draft" so it doesn't publish immediately. Enable "Always Output Data" so the node still emits an item even when it returns nothing, and set "On Error" to "Continue Regular Output" so a failed request doesn't halt the whole run.

For Advanced Readers: The WordPress node returns a post ID in json.id after creation. You'll need this ID to attach the featured image. The n8n expression syntax {{ $('NodeName').item.json.property }} lets you reference data from earlier nodes by name, making it easy to pull the title and content from "Get Blog Post" even though several nodes have executed in between.
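Under the hood, creating a draft comes down to one REST call. The sketch below shows roughly what that request looks like if you built it by hand. The function name buildDraftRequest and the example.com site URL are hypothetical, and authentication (an application password) is left out for brevity.

```javascript
// Describes a "create draft" call against the WordPress REST API.
function buildDraftRequest(siteUrl, post) {
  const base = siteUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    method: "POST",
    url: `${base}/wp-json/wp/v2/posts`,
    headers: { "Content-Type": "application/json" },
    // status "draft" keeps the post out of public view until approval.
    body: JSON.stringify({ title: post.title, content: post.content, status: "draft" }),
  };
}

const draftRequest = buildDraftRequest("https://example.com/", {
  title: "Ship Posts Faster",
  content: "<h1>Ship Posts Faster</h1><p>Intro text.</p>",
});
console.log(draftRequest.url);
```

The response to a call like this carries the new post's id, which is the value later nodes read from json.id.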

Add two more HTTP Request nodes to handle the WordPress featured image. The first uploads the image to WordPress media library using a POST request to https://[YOUR-SITE]/wp-json/wp/v2/media with WordPress API authentication. Set the Content-Disposition header to attachment; filename="cover-image-{{ $('Create Wordpress Post').item.json.id }}.jpeg" and send the binary image data in the body. This returns a media ID.

The second HTTP Request sets the featured image by POSTing to https://[YOUR-SITE]/wp-json/wp/v2/posts/{{ $('Create Wordpress Post').item.json.id }} with a query parameter featured_media={{ $json.id }} (the media ID from the upload). Now your draft post has both content and a cover image.
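The two calls can be sketched as plain request descriptions. Function names and the example.com URL are hypothetical, the image/jpeg content type is an assumption based on Pollinations returning a JPEG, and authentication headers are omitted.

```javascript
// Step 1: upload the cover image to the media library.
function buildMediaUpload(siteUrl, postId) {
  const base = siteUrl.replace(/\/+$/, "");
  return {
    method: "POST",
    url: `${base}/wp-json/wp/v2/media`,
    headers: {
      "Content-Type": "image/jpeg", // assumed, since the image is a JPEG
      "Content-Disposition": `attachment; filename="cover-image-${postId}.jpeg"`,
    },
    // The raw image bytes would be sent as the request body.
  };
}

// Step 2: point the draft at the uploaded media item.
function buildAttachRequest(siteUrl, postId, mediaId) {
  const base = siteUrl.replace(/\/+$/, "");
  return {
    method: "POST",
    url: `${base}/wp-json/wp/v2/posts/${postId}?featured_media=${mediaId}`,
  };
}

const upload = buildMediaUpload("https://example.com", 42);
const attach = buildAttachRequest("https://example.com", 42, 77);
console.log(upload.headers["Content-Disposition"]);
console.log(attach.url);
```

The order matters: the media ID used in the second request only exists once the first one has succeeded.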

For the human-in-the-loop approval, add a Gmail node set to "Send and Wait" operation. Configure the recipient as your review email address, subject line as Approval Required for "{{ $json.title }}", and message body as ={{ $json.content }} (the full blog post HTML). Choose the two-button approval type (stored as approvalType "double" in the workflow JSON) so you get explicit Approve and Reject buttons in the email. Set a wait time limit of 45 minutes so the workflow doesn't hang forever if you don't respond.

Connect the Gmail output to another If node checking {{ $json.data.approved }} equals true. Route the "true" path to the final publishing steps and the "false" path to error handling. On the "true" path, add a Merge node that combines the blog post data with image data from earlier branches, then splits into two final notification nodes: one Gmail and one Telegram.
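The approval check itself is tiny, but it is worth deciding what happens when no answer arrives. A minimal sketch, assuming the response has the data.approved shape described above and that anything other than an explicit true (including a timeout with no data) counts as a rejection. The function name routeApproval is hypothetical.

```javascript
// Maps the approval response onto the two If-node outputs.
function routeApproval(response) {
  // Only an explicit boolean true publishes; missing data falls through to error.
  const approved = response?.data?.approved === true;
  return approved ? "publish" : "error";
}

console.log(routeApproval({ data: { approved: true } }));
console.log(routeApproval({ data: { approved: false } }));
console.log(routeApproval({}));
```

Defaulting to "error" keeps an unanswered email from ever publishing a post by accident.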

The Gmail notification node sends the final post content to stakeholders. The Telegram node uses "Send Photo" operation with the image binary data and a caption showing the first 400 characters of markdown-converted content. Add a Markdown node before Telegram to convert the HTML content to markdown using ={{ $('Get Blog Post').item.json.content }} as input.
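The caption step is a simple truncation. Here is a sketch, assuming the markdown arrives as a single string. The function name buildCaption is hypothetical, and counting by code point (so an emoji is never cut in half) is an extra nicety rather than something the template is known to do.

```javascript
// Keeps the first `limit` characters of the markdown for the photo caption.
function buildCaption(markdown, limit = 400) {
  const chars = Array.from(markdown); // splits by code point, not UTF-16 unit
  return chars.length <= limit ? markdown : chars.slice(0, limit).join("");
}

const longMarkdown = "# Ship Posts Faster\n\n" + "a".repeat(500);
const caption = buildCaption(longMarkdown);
console.log(caption.length);
```

A 400-character preview also leaves comfortable headroom under Telegram's caption length limit.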

For Advanced Readers: The Merge node uses "Combine by Position" mode to align data from parallel branches. This ensures the image data from the Pollinations path syncs with the post data from the content generation path. Without this merge, the Telegram notification wouldn't have access to the image binary for the photo attachment.
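Conceptually, combining by position pairs item 1 with item 1, item 2 with item 2, and so on. The sketch below imitates that pairing on simplified items with json and binary parts. The function name and the sample data are hypothetical, and dropping unpaired items is an assumption about the default behavior.

```javascript
// Pairs items from two branches by index and merges their json and binary parts.
function combineByPosition(left, right) {
  const count = Math.min(left.length, right.length); // unpaired items are dropped
  const merged = [];
  for (let i = 0; i < count; i++) {
    merged.push({
      json: { ...left[i].json, ...right[i].json },
      binary: { ...left[i].binary, ...right[i].binary },
    });
  }
  return merged;
}

const postBranch = [{ json: { title: "Ship Posts Faster", content: "<p>Body</p>" } }];
const imageBranch = [{ json: { url: "https://example.com/cover.jpeg" }, binary: { data: "image-bytes" } }];
const combined = combineByPosition(postBranch, imageBranch);
console.log(combined[0]);
```

After the merge, a single item carries both the post fields and the image, which is what the Telegram photo step needs.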

Key Learnings

The first major concept here is multi-stage approval gates in no-code workflows. Most automation runs fire-and-forget, but adding human review points lets you maintain quality control without sacrificing speed. Gmail's "Send and Wait" operation turns email into a synchronous decision point, effectively pausing workflow execution until you click a button. This pattern works for any scenario where automated output needs human judgment before taking action.

Second, binary data handling across API boundaries teaches you how modern no-code platforms manage file uploads and downloads. When the form receives a PDF, n8n holds the uploaded file as binary data. When Pollinations.ai generates an image, that comes back as binary too. Converting between binary, base64, and URL references lets different services communicate about the same file without manually downloading and re-uploading. Understanding this flow means you can chain together any services that work with files, from image processors to document converters.

Third, conditional branching based on data validation prevents cascading failures in complex workflows. The If nodes checking for empty title or content stop the workflow from trying to publish garbage if the AI model has an off day. Rather than crashing with an error, the workflow gracefully routes to a notification path that tells you something went wrong. This defensive programming approach is critical when workflows touch production systems like your public blog.

Frequently Asked Questions

How much does this workflow cost to run per post?

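The main cost is OpenAI API usage, and it is small. At GPT-4o-mini's list prices at the time of writing (roughly $0.15 per million input tokens and $0.60 per million output tokens), a post with ~1,500 input tokens and ~2,000 output tokens costs a fraction of a cent, and even a long PDF with a full-length post should stay in the range of a few cents at most. Pollinations.ai image generation is free, imgbb has a free tier with a 32MB upload limit, and the WordPress REST API is included with your hosting. Total cost per post: well under $0.50.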
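If you want to sanity-check the API cost for your own PDFs, the arithmetic is token counts times per-token price. A minimal sketch: the prices below are assumed list prices for GPT-4o-mini in USD per million tokens and may be out of date, so substitute the current figures from OpenAI's pricing page. The function name estimateCostUsd is hypothetical.

```javascript
// Assumed list prices in USD per one million tokens; verify before relying on them.
const PRICE_PER_MILLION = { input: 0.15, output: 0.6 };

// Cost = (input tokens x input price + output tokens x output price) / 1,000,000
function estimateCostUsd(inputTokens, outputTokens, prices = PRICE_PER_MILLION) {
  return (inputTokens * prices.input + outputTokens * prices.output) / 1_000_000;
}

const cost = estimateCostUsd(1500, 2000);
console.log(`Estimated cost per post: $${cost.toFixed(4)}`);
```

Longer PDFs raise the input side of the sum, so rerun the estimate with your own token counts before processing a large backlog.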

What types of PDFs work best with this workflow?

The workflow handles text-based PDFs (generated from Word, Google Docs, or design tools) very well. Scanned PDFs with embedded OCR text layers also work. However, purely image-based scanned documents without OCR will not extract text properly. For best results, use PDFs with selectable text and clear structure (headings, paragraphs). Complex multi-column layouts or heavily formatted PDFs may lose some formatting during extraction.

Can I customize the WordPress post format (categories, tags, custom fields)?

Yes! The WordPress node in n8n supports additional fields including categories, tags, custom taxonomies, featured image position, excerpt, and custom fields. You can either hardcode these values in the node configuration or dynamically extract them from the PDF content using additional AI prompts (e.g., "suggest 3 tags for this post"). The workflow template uses minimal settings, but you can extend it to match your site's taxonomy structure.

What happens if the AI generates poor content or the workflow fails mid-execution?

The workflow includes several safety mechanisms. First, the If node validates that both title and content are present before creating the WordPress draft. Second, the Gmail "Send and Wait" approval step lets you review the full post before it goes live—if the AI wrote nonsense, you click "Reject" and the post stays in drafts. Third, all nodes have error handling configured to continue execution and send Telegram alerts on failure. If something breaks (API timeout, authentication issue), you'll get notified immediately rather than discovering a broken workflow days later.

What's Next

You've built a PDF-to-blog pipeline that most content teams would pay thousands for. David still has those last three PDFs sitting in his folder, but now he has no excuse. The workflow is live, the webhook is ready, and all he has to do is drag-drop-approve.

Here's your challenge: ship one blog post using this workflow before Friday. Pick a PDF you've been sitting on (we all have them) and run it through. When you get the approval email, don't overthink it. Click approve and let it publish. The world needs more content from people who actually ship.

If you want to level up, add a Slack notification node that pings your team channel when posts go live. Or connect a Google Sheets node to log every PDF you process with timestamp, title, and WordPress URL for tracking. Or integrate with a social media scheduler so published posts automatically tweet themselves.

David's working on getting his folder down to zero PDFs. You can beat him to it.