What Can You Do With This Plugin?
Crawlomatic Multisite Scraper Post Generator Plugin for WordPress is a cutting-edge autoblogging plugin that uses website crawling and scraping to turn your website into an autoblogging or even a money-making machine!
Get content from almost any webpage! You no longer need APIs, which require registration and provide limited access, and you can also retrieve data from websites that do not offer an API. Schedule it once and let it publish your posts on autopilot, 24/7.
How does it work?
This plugin will crawl the seed URL you give it (crawling means that it follows the links the webpage contains) and will visit and extract content from each crawled URL. The crawling process is customizable: you can set the crawling depth, the crawling rate, and the maximum number of crawled articles, restrict crawling to links with a specific class or ID, and more.
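The crawling options described above can be illustrated with a minimal Python sketch. This is not the plugin's own code (the plugin is a WordPress plugin); it is a breadth-first crawl under stated assumptions. The names `crawl`, `LinkCollector`, `max_depth`, `max_pages`, and `link_class` are hypothetical, the `PAGES` dictionary stands in for real HTTP requests, and rate limiting is left out for brevity.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, optionally only those with a given class."""

    def __init__(self, required_class=None):
        super().__init__()
        self.required_class = required_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if self.required_class and self.required_class not in classes:
            return
        if attr.get("href"):
            self.links.append(attr["href"])


def crawl(seed_url, fetch, max_depth=1, max_pages=10, link_class=None):
    """Breadth-first crawl from seed_url; returns the visited URLs in visit order.

    fetch is any callable that maps a URL to an HTML string (or None on failure).
    """
    visited, order = set(), []
    queue = deque([(seed_url, 0)])
    while queue and len(order) < max_pages:
        url, depth = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        html = fetch(url)
        if html is None:
            continue
        order.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        parser = LinkCollector(link_class)
        parser.feed(html)
        for href in parser.links:
            queue.append((urljoin(url, href), depth + 1))
    return order


# Hypothetical in-memory "site" used instead of real network access.
PAGES = {
    "https://example.com/": '<a class="post" href="/a">A</a><a href="/about">About</a>',
    "https://example.com/a": '<a class="post" href="/b">B</a>',
    "https://example.com/b": "<p>leaf</p>",
    "https://example.com/about": "<p>about</p>",
}

if __name__ == "__main__":
    print(crawl("https://example.com/", PAGES.get, max_depth=1, link_class="post"))
```

With `link_class="post"` and `max_depth=1`, only the seed page and the pages it links to through `post`-class links are visited; raising `max_depth` or dropping `link_class` widens the crawl, and `max_pages` caps the total.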
Crawlomatic v2.0 update
In the v2.0 update, a new live scraper shortcode was added to the plugin: [crawlomatic-scraper]. This new feature makes the plugin an easy-to-implement web data extractor for WordPress. As a result, it can be used to display real-time data from any website directly in your posts, pages or sidebar. It also temporarily caches the scraped content, so your website will not overuse server resources. You can use this plugin to include real-time stock quotes, cricket or soccer scores, or any other generic content from public websites!
New features included in this update:
- Scraped output can be displayed through a custom template tag or through a shortcode in a page, post or sidebar (using a text widget).
- Configurable caching of scraped data. The cache timeout can be defined in minutes for each scrape.
- Configurable user agent, which can be set for each scrape.
- Configurable default settings such as enabling, user agent, timeout, caching and error handling.
- Multiple ways to query content – CSS selector, XPath, regex or auto detection.
- A wide range of arguments for parsing content.
- Option to pass POST arguments to a URL to be scraped.
- Dynamic conversion of scraped content to a specified character encoding, so that data can be scraped from sites that use a different charset.
- Create scraped pages on the fly by dynamically generating the URLs to scrape, or the POST arguments to send, from your page’s GET or POST arguments.
- Callback function for advanced parsing of scraped data.
Check the official documentation of the v2 update, browse through the examples and check the FAQ to craft a well-optimized web scraper.
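The configurable caching listed above (a cache timeout defined in minutes per scrape) can be pictured with a small Python sketch. It only illustrates the general idea of a time-limited cache and is not how the plugin stores its data; `ScrapeCache`, `get_or_scrape`, and the `clock` parameter are hypothetical names introduced for this example.

```python
import time


class ScrapeCache:
    """Keeps scraped content per URL for a fixed number of minutes."""

    def __init__(self, timeout_minutes, clock=time.monotonic):
        self.timeout_seconds = timeout_minutes * 60
        self.clock = clock  # injectable so the timeout can be tested without waiting
        self._entries = {}  # url -> (stored_at, content)

    def get_or_scrape(self, url, scrape):
        """Return cached content if it is still fresh, otherwise call scrape(url)."""
        now = self.clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self.timeout_seconds:
            return entry[1]
        content = scrape(url)
        self._entries[url] = (now, content)
        return content
```

While an entry is fresh, repeated page views reuse the stored content instead of requesting the remote site again, which is what keeps resource usage low. Once the timeout has elapsed, the next request scrapes the page again and refreshes the entry.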
More about the plugin
You can scrape content from almost every website that you can open in your browser. If the content is loaded using JavaScript, the plugin can be combined with PhantomJS to also scrape JavaScript-generated content.
Also, you can automatically run an unlimited number of custom website crawls and scrapes.
Other plugin features:
- v2.5.1 update: Scrape WooCommerce product variants from other WooCommerce/Shopify stores
- v2.5.0 update: Scrape search engine results for your custom keyword searches, from Google or from Bing. Check the tutorial video of this new feature.
- v2.4.1 update: Scrape product image galleries for WooCommerce products (for non-product post types, post attachments will be created from the scraped images)
- v2.3.5 update: Execute your own JavaScript code on the scraped HTML and scrape the results – this feature is available only when a headless browser (Puppeteer/Tor/PhantomJS) or HeadlessBrowserAPI is used for scraping
- v2.2.1 update: Crawl RSS feeds for links and scrape articles listed in them
- v2.2.0 update: Use HeadlessBrowserAPI to scrape JavaScript-generated HTML content from any website without installing anything (besides this plugin) on your server – tutorial video
- v2.1.0 update: Scrape .onion websites from the Dark Web using the Tor Browser and Puppeteer! – tutorial video
- v2.0.0 update: Live Scraper shortcode added for even more crawling control and scraping power: [crawlomatic-scraper]
- v1.7.1 update: Sitemap crawling supported – video tutorial
- v1.6.5 update: Visual content selector support added – video tutorial
- v1.6.0 update: Added the ability to take screenshots of crawled pages and use them in the content of generated posts – video tutorial
- v1.5.2 update: Ability to shorten outgoing (post source) links (and monetize them), using Shorte.st link shortener service – example of shortened link
- v1.4.8 update: Added JavaScript execution support for crawled pages – requires PhantomJS to be installed on the server (a PhantomJS installation guide is available) – video tutorial
- v1.4.4 update: Added the ability to set multiple proxies for crawling pages. The plugin will select one at random at each page access
- v1.4.0 update: Added the ability to paginate crawling (crawling for articles will continue on the next page of the seed page).
- v1.4.0 update: Added the ability to import product prices for crawled products (WooCommerce compatible) + dropshipping price automatic modification – video tutorial
- v1.4.0 update: Added the ability to increase imported product price by a fixed number or to multiply it with a predefined number (great value for dropshipping!)
- v1.2.8 update: Added support for importing paginated posts (into a single crawled post) – video tutorial
- v1.2.4 update: Added the ability to set proxies for crawling pages
- v1.2.3 update: Added an option to crawl the page from Google cache when direct crawling fails (for example, when it is blocked)
- Google Translate support – select the language in which you want to post your articles
- Text Spinner support – automatically modify generated text, changing words with their synonyms – built-in, The Best Spinner, SpinRewriter, WordAI, TurkceSpin and others – great SEO value!
- Customizable generated post status (published, draft, pending, private, trash)
- shortcode to list all posts generated by this plugin: [crawlomatic-list-posts type =>…