nodejs|April 08, 2020|2 min read

How to check whether a website link has your URL backlink or not - NodeJs implementation

TL;DR

Read a list of URLs from a text file and use Node.js to fetch each page, parse the HTML, and check if it contains your website's backlink.

Introduction

I had my SEO backlink work done by a freelancer. It came to about 3000 links, and the links that freelancers provide are usually broken. So I wanted to test every single one of them to check that each URL is actually live and contains a link back to my website.

NodeJs automation

I wrote a simple Node.js automation that reads a list of URLs from a text file and, one by one, checks that each URL is reachable and that the page contains my backlink.

Input

  1. A text file (urls.txt) containing the list of URLs, one per line
  2. My website name: xyz.com
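
Since a hand-delivered list of a few thousand links tends to contain blank lines, stray whitespace, and malformed entries, it can help to clean the list before fetching anything. The following is a minimal sketch of that step, not part of the original project: `parseUrlList` and the sample data are hypothetical names, and it assumes every entry should be an absolute http or https URL.

```javascript
// Hypothetical helper (not part of the original project): turn the raw
// contents of the URL list file into clean arrays of valid and invalid entries.
function parseUrlList(text) {
    const valid = [];
    const invalid = [];
    for (const rawLine of text.split('\n')) {
        const line = rawLine.trim(); // also strips a trailing '\r' from Windows files
        if (line === '') continue;   // skip blank lines
        try {
            const parsed = new URL(line); // throws on malformed input
            if (parsed.protocol === 'http:' || parsed.protocol === 'https:') {
                valid.push(line);
            } else {
                invalid.push(line); // assumption: only http(s) pages are worth fetching
            }
        } catch (err) {
            invalid.push(line);
        }
    }
    return { valid, invalid };
}

// Made-up sample input standing in for the contents of urls.txt.
const sample = 'https://example.com/page-1\r\n\nnot a url\nftp://example.com/file\nhttp://example.org/\n';
const result = parseUrlList(sample);
console.log('valid:', result.valid);
console.log('invalid:', result.invalid);
```

Entries that land in `invalid` can be reported straight away without spending a network request on them.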

Code

Following is the directory structure:

project
    - app.js
    - src/http/url_checker.js
    - package.json

package.json

{
  "name": "check_links_seo",
  "version": "1.0.0",
  "description": "For checking link validity work given by freelancers",
  "main": "app.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "Gorav Singal",
  "license": "ISC",
  "dependencies": {
    "async": "^3.2.0",
    "cheerio": "^1.0.0-rc.3",
    "request": "^2.88.2",
    "request-promise": "^4.2.5"
  }
}

app.js

const urlChecker = require('./src/http/url_checker');
const fs = require('fs');

// Read one URL per line; trim whitespace (including '\r') and skip blank lines
const urls = fs.readFileSync('urls.txt', 'utf8')
    .split('\n')
    .map(line => line.trim())
    .filter(line => line.length > 0);

// Remember to put your website here (the match is case-sensitive)
const myWeb = 'xyz.com';

urlChecker.checkYourLinkInUrls(urls, myWeb)
    .then(() => {
        console.log('Successfully finished...');
    })
    .catch(err => {
        console.error(err);
    });

url_checker.js

const rp = require('request-promise');
const cheerio = require('cheerio');
const async = require('async');

class UrlChecker {
    checkYourLinkInUrls(urls, desiredWebsite) {
        return new Promise((resolve, reject) => {
            async.eachLimit(urls, 1, (url, callback) => {
                return this.__checkYourLinkInUrl(url, desiredWebsite)
                    .then(function (res) {
                        if (!res) {
                            console.log('failed', url);
                        }
                        else {
                            console.log('success', url);
                        }
                        callback();
                    }).catch(function (err) {
                        callback(err);
                    });
            }, function (err) {
                if (err) {
                    reject(err);
                } else {
                    resolve();
                }
            });
        });
    }

    __checkYourLinkInUrl(url, desiredWebsite) {
        // console.log('Checking url: ', url);
        return rp(url)
            .then(html => {
                // Plain substring match: true if the site name appears anywhere in the page source
                return html.indexOf(desiredWebsite) > -1;
                // const $ = cheerio.load(html);
                // const links = $('a');

                // let found = false;
                // $(links).each(function(i, link){
                //     const web = $(link).attr('href');
                //     console.log(web);
                //     // console.log($(link).text() + ':\n  ' + $(link).attr('href'));
                //     if (web.startsWith(desiredWebsite)) {
                //         found = true;
                //         return found;
                //     }
                // });
                // // console.log($(links));
                // return found;
            })
            .catch(err => {
                // Any request failure (network error, non-2xx status) counts as "backlink not found"
                // console.error('Error in url', url, err);
                return false;
            });
    }
}

module.exports = new UrlChecker();

Note: The code above only checks whether the page's HTML contains my website name anywhere. The commented-out code checks the actual links (the href of each anchor tag) instead, but it is a bit more expensive in both computation and memory.
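
To illustrate the difference between matching the name anywhere and matching an actual link, here is a dependency-free sketch of a link-level check. It is not the author's commented-out cheerio code: it swaps in a simple regular expression to pull out href values, which is not a full HTML parser and will miss unquoted or unusual markup. `hasBacklink` and the sample pages are hypothetical, and it assumes a link counts as a backlink when its hostname is the target site or one of its subdomains.

```javascript
// Hypothetical helper (not from the original project): returns true when at
// least one <a href="..."> on the page points at desiredWebsite or a subdomain.
function hasBacklink(html, pageUrl, desiredWebsite) {
    const target = desiredWebsite.toLowerCase();
    // Simplistic pattern: matches quoted href values inside <a ...> tags only.
    const hrefPattern = /<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gi;
    for (const match of html.matchAll(hrefPattern)) {
        let hostname;
        try {
            // Resolve relative links against the page URL; URL lowercases the hostname.
            hostname = new URL(match[2], pageUrl).hostname;
        } catch (err) {
            continue; // ignore hrefs that are not valid URLs
        }
        if (hostname === target || hostname.endsWith('.' + target)) {
            return true;
        }
    }
    return false;
}

// Made-up pages: one only mentions the name in text, one really links to it.
const mentionOnly = '<p>Visit xyz.com</p><a href="https://other.com/">x</a>';
const realLink = '<a class="ref" href="https://www.XYZ.com/post">my site</a>';
const lookalike = '<a href="https://notxyz.com/">x</a>';

console.log(hasBacklink(mentionOnly, 'https://blog.example.com/p', 'xyz.com')); // false
console.log(hasBacklink(realLink, 'https://blog.example.com/p', 'xyz.com'));    // true
console.log(hasBacklink(lookalike, 'https://blog.example.com/p', 'xyz.com'));   // false
```

Comparing hostnames rather than raw strings avoids counting a page that merely mentions the name in its text, or a different domain that happens to end with the same characters.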

Run code

node app.js
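
With a few thousand URLs, checking one page at a time (the code above passes a limit of 1 to async.eachLimit) can make a run slow. As a rough sketch of one way to speed it up, the following runs several checks at once and collects a result per URL so failures can be summarized at the end. `checkAll` and `fakeCheck` are hypothetical names, and the checker here is a stand-in so the example runs without network access.

```javascript
// Hypothetical helper (not from the original project): run checkFn over all
// urls with at most `limit` checks in flight, and collect one result per URL.
async function checkAll(urls, limit, checkFn) {
    const results = new Array(urls.length);
    let next = 0; // index of the next URL to hand out

    async function worker() {
        while (next < urls.length) {
            const index = next++;
            try {
                results[index] = { url: urls[index], ok: Boolean(await checkFn(urls[index])) };
            } catch (err) {
                // A failed check counts as "backlink not found", as in the original code.
                results[index] = { url: urls[index], ok: false };
            }
        }
    }

    const workerCount = Math.min(limit, urls.length);
    await Promise.all(Array.from({ length: workerCount }, () => worker()));
    return results;
}

// Stand-in checker: throws for "broken" URLs, succeeds only for "good" ones.
const fakeCheck = async (url) => {
    if (url.includes('broken')) throw new Error('request failed');
    return url.includes('good');
};

const demo = checkAll(
    ['http://a.example/good', 'http://b.example/broken', 'http://c.example/none'],
    2,
    fakeCheck
).then((results) => {
    const failed = results.filter((r) => !r.ok).map((r) => r.url);
    console.log(`checked ${results.length}, failed ${failed.length}`);
    return results;
});
```

In app.js, the stand-in could be replaced by something like `(url) => urlChecker.__checkYourLinkInUrl(url, myWeb)`. Keeping the limit small is a reasonable default, since many of the pages may sit on the same few hosts.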

Thanks for reading…
