Python Poetry for Quickly Indexing Blogger Tools on FreeBSD

Python is a versatile language used for a wide range of applications. However, managing dependencies and choosing the right Python packages for a job can be complicated. As the Python ecosystem grows rapidly, developers need to manage the packages and dependencies of each project so that they suit the project and remain compatible with each other.

Poetry provides solutions to these challenges. Poetry is a package manager for Python that allows developers to manage dependencies, create virtual environments, and build and publish packages with a single command-line tool.

In this tutorial, you will learn the basics of using Poetry to build a tool that submits Blogger URLs to Google so they can be indexed quickly. The examples in this article were run on a FreeBSD server.


1. What is Poetry Python

Poetry is a tool for dependency management and packaging in Python. It lets you declare the libraries your project depends on and then installs and updates those libraries for you. By resolving dependency conflicts and handling installations and updates, it keeps your projects simpler and better organized. Poetry manages all of a project's dependencies from a single pyproject.toml file. In other words, Poetry uses pyproject.toml in place of setup.py.
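
To make the role of pyproject.toml more concrete, here is a minimal sketch that parses a hypothetical Poetry-style pyproject.toml and lists the dependencies it declares. The project name, author and version constraints are made up for illustration. Note that tomllib is part of the standard library only from Python 3.11; on older interpreters the third-party tomli package offers the same loads function, which is why the import has a fallback.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # older interpreters: third-party "tomli" package

# Hypothetical pyproject.toml content, similar in shape to what
# "poetry new" followed by "poetry add requests" would produce.
SAMPLE_PYPROJECT = """
[tool.poetry]
name = "index"
version = "0.1.0"
description = ""
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.10"
requests = "^2.31.0"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
"""


def list_dependencies(pyproject_text: str) -> dict[str, str]:
    """Return the table declared under [tool.poetry.dependencies]."""
    data = tomllib.loads(pyproject_text)
    return dict(data["tool"]["poetry"]["dependencies"])


if __name__ == "__main__":
    for name, constraint in list_dependencies(SAMPLE_PYPROJECT).items():
        print(f"{name}: {constraint}")
```

Everything Poetry needs to know about the project sits in this one file, which is what allows it to stand in for setup.py.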

As a Python project management tool, py-poetry provides the following features:
  1. Automatic management of virtual environments.
  2. Fast build and publish.
  3. Dependency management with reproducible installations and a conflict resolver.
  4. Simpler projects, because dependency complexity is resolved for you.
  5. Commands to manage, set up, run and deploy a project.
  6. Help with organizing and tidying up your projects.

Apart from that, Poetry also has several advantages:
  1. You have full control over dependencies because dependency versions are pinned in a lock file.
  2. Dependency versions stay compatible with the project.
  3. It is easy to use, install and set up new virtual environments.
  4. The file structure is very simple.
  5. It's easy to add new dependencies to a project.
  6. Project metadata is easy to access and understand from pyproject.toml.

When this article was written, the latest Poetry package was "py39-poetry-1.6.1".


2. py-poetry Installation

Poetry is designed to work on multiple platforms, including FreeBSD, DragonFly, macOS, Linux and Windows. As of FreeBSD 13.2, py-poetry only runs with python310. Follow the steps below to install py-poetry.
root@ns6:~ # pkg install lang/python310
Since py-poetry runs on Python 3.10, installing Python is the first step. After python310 has been installed successfully, continue by installing the py-poetry supporting libraries.

To install the supporting libraries, connect to the FreeBSD server with PuTTY from Windows and run the following command in the shell:
root@ns6:~ # pkg install py39-poetry-core py39-poetry-dynamic-versioning py39-poetry-semver py39-poetry-types py39-poetry2setup
After all of the above dependencies have been installed, continue by installing py-poetry itself. We recommend using the ports system so that all required libraries are built and installed correctly.

Type the command below in the PuTTY command line shell.
root@ns6:~ # cd /usr/ports/devel/py-poetry
root@ns6:/usr/ports/devel/py-poetry # make install clean
So that FreeBSD uses Python 3.10 as the default "python" command, create a symlink as shown below.
root@ns6:/usr/ports/devel/py-poetry # rm -R -f /usr/local/bin/python
root@ns6:/usr/ports/devel/py-poetry # ln -s /usr/local/bin/python3.10 /usr/local/bin/python
root@ns6:/usr/ports/devel/py-poetry # reboot
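
After the reboot, it can be worth confirming that the "python" command really resolves to the interpreter you expect. The sketch below is one way to check this from Python itself; the helper name meets_minimum and the constant MINIMUM_VERSION are our own, and the minimum of 3.10 is an assumption taken from the requirement stated above.

```python
import shutil
import sys

MINIMUM_VERSION = (3, 10)  # assumption: the Python 3.10 requirement stated above


def meets_minimum(version_info, minimum=MINIMUM_VERSION) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    # shutil.which looks up the command in the directories listed in PATH.
    print("python resolves to:", shutil.which("python"))
    print("running version:", ".".join(map(str, sys.version_info[:3])))
    print("meets minimum:", meets_minimum(sys.version_info))
```

Run it with the "python" command so that the version reported is the one behind the symlink.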


3. Create py-poetry Virtual Environment

On FreeBSD, it is highly recommended to run py-poetry inside a Python virtual environment. This article does not explain Python venv and pip in detail, so read the earlier article about pip and venv first if you are not familiar with them. We will go straight to the main topic, indexing Blogger URLs.

In general, to create a py-poetry Virtual Environment, you can follow the steps below.
root@ns6:~ # mkdir -p /root/.cache
root@ns6:~ # cd /root/.cache
root@ns6:~/.cache # python -m venv pypoetry
root@ns6:~/.cache # cd pypoetry
root@ns6:~/.cache/pypoetry # source bin/activate.csh
Now you are active in the py-poetry Virtual Environment.
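
If you are unsure whether the virtual environment is really active, a short check like the one below can help. It is a minimal sketch that relies on the fact that, inside a venv, sys.prefix points at the environment directory while sys.base_prefix still points at the base Python installation. The function name in_virtualenv is our own.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Outside a venv both values are identical.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print(f"No virtual environment active; using {sys.base_prefix}")
```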


4. py-poetry Search Engine Indexing Tools.

In this section, we'll look at how to use Poetry to create scripts that send index requests for your site's URLs to search engines in bulk and ask search engines to crawl your pages faster.

We assume you have already set up Python venv and pip on the FreeBSD server and activated the virtual environment. Run the commands below inside the py-poetry virtual environment to install Poetry, create the project and add its dependencies.
(pypoetry) root@ns6:~/.cache/pypoetry # curl -sSL https://install.python-poetry.org | python3 -
(pypoetry) root@ns6:~/.cache/pypoetry # poetry self update
(pypoetry) root@ns6:~/.cache/pypoetry # poetry new index
(pypoetry) root@ns6:~/.cache/pypoetry # poetry init
(pypoetry) root@ns6:~/.cache/pypoetry # poetry add requests
(pypoetry) root@ns6:~/.cache/pypoetry # poetry install --no-root

a. Google Indexing API

The Google Indexing API is a Google tool that lets site owners notify Google directly when pages are added or removed. It allows you to submit up to 200 pages per day for crawling and subsequent indexing, with a maximum of 100 pages per iteration.
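
The two limits above suggest a simple way to plan a day's work: cap the list at the daily quota, then split it into batches. The following is a minimal sketch of that idea, not part of the indexing script itself. The names plan_batches, DAILY_QUOTA and BATCH_SIZE are hypothetical, and the values are the ones quoted in the text.

```python
from typing import Iterator

DAILY_QUOTA = 200  # per-day limit quoted in the text
BATCH_SIZE = 100   # per-iteration limit quoted in the text


def plan_batches(urls: list[str], daily_quota: int = DAILY_QUOTA,
                 batch_size: int = BATCH_SIZE) -> Iterator[list[str]]:
    """Yield batches of URLs for one day without exceeding the daily quota."""
    todays_urls = urls[:daily_quota]  # anything beyond the quota waits for another day
    for start in range(0, len(todays_urls), batch_size):
        yield todays_urls[start:start + batch_size]


if __name__ == "__main__":
    # 250 made-up URLs: 200 fit into today's quota, 50 are left over.
    urls = [f"https://example.com/post-{i}.html" for i in range(250)]
    batches = list(plan_batches(urls))
    print([len(batch) for batch in batches])  # [100, 100]
    print(len(urls) - sum(len(batch) for batch in batches), "URLs left for another day")
```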

To use the Google Indexing API, you need a JSON key file, which you can create on Google Cloud Platform. Read the previous articles on how to create the JSON file and practice the steps on your FreeBSD server before you continue.

Copy the JSON file to the folder "/root/.cache/pypoetry/index"; we use WinSCP to copy the file. In this example, the JSON file is named "blog-project-43540-d31ba843867c.json".
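
Before running the indexing script, you may want to confirm that the copied file is a usable service account key. The sketch below checks for a few fields that a Google Cloud service account key file normally contains. The function name check_key_file and the choice of fields to check are our own, and the sample values are placeholders rather than real credentials.

```python
import json

# Fields that a Google Cloud service account key file normally contains.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(raw_json: str) -> list[str]:
    """Return a list of problems found in the key file text (empty if none)."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if data.get("type") not in (None, "service_account"):
        problems.append(f"unexpected type: {data['type']}")
    return problems


if __name__ == "__main__":
    # Placeholder content for illustration; in practice, read your own key file
    # with open(...).read() and pass the text to check_key_file().
    sample = json.dumps({
        "type": "service_account",
        "project_id": "my-project",
        "client_email": "indexer@my-project.iam.gserviceaccount.com",
    })
    issues = check_key_file(sample)
    print("Key file looks fine" if not issues else "\n".join(issues))
```

The sample deliberately leaves out the private key, so the check reports one missing field.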



After that we create a file called "/root/.cache/pypoetry/index/api.py". In the "api.py" file, we enter the script below.
(pypoetry) root@ns6:~/.cache/pypoetry/index # ee api.py

import csv
import json
import time

import httplib2
from loguru import logger
from oauth2client.service_account import ServiceAccountCredentials


class GoogleIndexationAPI:
    def __init__(self, key_files: list[str], urls_file_path: str):
        self.key_files = key_files
        self._key_file_index = 0
        self.urls_list = urls_file_path
        self._urls_processed = 0

    @property
    def key_file(self):
        return self.key_files[self._key_file_index]

    @property
    def urls_processed(self):
        return self._urls_processed

    @urls_processed.setter
    def urls_processed(self, value):
        self._urls_processed = value

    def update_key_file_index(self):
        # Switch to the next key file once the current one has hit its quota.
        self._key_file_index += 1
        if self._key_file_index > len(self.key_files) - 1:
            logger.error("No key files left. Run the script again tomorrow.")
            raise StopIteration()
        logger.warning('Key file updated')
        logger.info("Sleeping for 5 seconds.")
        time.sleep(5)

    def get_urls(self):
        try:
            with open(self.urls_list, 'r', encoding='utf-8') as f:
                # Skip blank lines so that empty URLs are never sent to the API.
                urls = [line.strip() for line in f if line.strip()]
            return urls
        except Exception as e:
            logger.error(f'Error while reading urls from file ::: {e}')
            raise SystemExit(1)

    def send_request(self, url):
        """
        Makes a request to Google Indexing API
        :return: Content: Response from API
        """
        api_scopes = ["https://www.googleapis.com/auth/indexing"]
        api_endpoint = "https://indexing.googleapis.com/v3/urlNotifications:publish"
        credentials = ServiceAccountCredentials.from_json_keyfile_name(self.key_file, scopes=api_scopes)
        method = 'URL_UPDATED'
        try:
            http = credentials.authorize(httplib2.Http())
            r_content = json.dumps({"url": url, "type": method}).encode('utf-8')
            response, content = http.request(api_endpoint, method="POST", body=r_content)
            log = [url, method, response.status, content.decode('utf-8')]
            return log
        except Exception as e:
            logger.error(f'{e}, {type(e)}')
            return None

    def parse_response(self, content):
        """Parses error response"""
        try:
            json_line = json.loads(content)
            result = [json_line['error']['message'], json_line['error']['status'], self.key_file]
        except Exception as e:
            result = ['API response parse error', e]
        return result

    def indexation_worker(self):
        logger.info('Processing... Please wait')

        urls = self.get_urls()

        with open('report_log.csv', 'w', encoding='utf-8', newline='') as f:
            my_csv = csv.writer(f, delimiter='\t')
            header = ['URL', 'METHOD', 'STATUS_CODE', 'ERROR_MESSAGE', 'ERROR_STATUS', 'KEY FILE']
            my_csv.writerow(header)

            for url in urls:
                logger.debug(f'Sending {url}')
                result = self.send_request(url)
                if not result:
                    logger.info('Empty response, skipping the url')
                    continue

                log = result[0:3]

                if result[2] == 200:
                    self.urls_processed += 1
                elif result[2] == 429:
                    self.update_key_file_index()

                if result[2] != 200:
                    log.extend(self.parse_response(result[3]))

                my_csv.writerow(log)
                logger.debug(log)


def index_api():
    # NOTE: add ALL your service account credentials JSON files
    key_files = [
        'blog-project-43540-d31ba843867c.json',
    ]

    # NOTE: put the URLs you want to index into this file
    urls_file = 'urls.txt'

    api = GoogleIndexationAPI(key_files=key_files, urls_file_path=urls_file)
    try:
        api.indexation_worker()
    except StopIteration:
        logger.error("Exiting...")

    logger.info(f"Done! We've sent {api.urls_processed} URLs to Googlebot. You can check report in report_log.csv")


if __name__ == '__main__':
    index_api()
Create another file called "/root/.cache/pypoetry/index/urls.txt". In the "urls.txt" file, add the URLs you want to submit for indexing, one per line, as in the example below.

https://www.unixwinbsd.site/2023/12/quick-way-to-configure-openssh-server.html
https://www.unixwinbsd.site/2023/12/how-to-learn-to-create-blog-with-go_0385964493.html
https://www.unixwinbsd.site/2023/12/how-to-learn-to-create-blog-with-go_0324087457.html
https://www.unixwinbsd.site/2023/12/freebsd-openssh-key-authentication-ssh.html
https://www.unixwinbsd.site/2023/12/how-to-learn-to-create-blog-with-go.html
https://www.unixwinbsd.site/2023/12/quick-start-guide-to-pandoc-file.html
https://www.unixwinbsd.site/2023/12/implementing-apache-web-socket-on.html
https://www.unixwinbsd.site/2023/12/ansible-on-freebsd-faster-setup.html
https://www.unixwinbsd.site/2023/12/freebsd-practical-instructions-for.html
https://www.unixwinbsd.site/2023/12/installing-freebsd-dhcp-server-with-pf.html
https://www.unixwinbsd.site/2023/11/how-to-install-gimp-on-ghostbsd-freebsd.html
https://www.unixwinbsd.site/2023/11/how-to-copy-move-remove-files-and.html
https://www.unixwinbsd.site/2023/11/download-video-youtube-with-freebsd.html
https://www.unixwinbsd.site/2023/11/implementing-dokuwiki-on-freebsd.html
https://www.unixwinbsd.site/2023/11/internet-router-gateway-with-freebsd-pf.html
https://www.unixwinbsd.site/2023/11/nextcloud-installation-on-freebsd-with.html
https://www.unixwinbsd.site/2023/11/implementing-python-modules-for-unbound.html
https://www.unixwinbsd.site/2023/11/how-to-create-freebsd-ipfw-firewall.html
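
Because every line in urls.txt consumes part of the daily quota, it can pay to clean the list before submitting it. The sketch below shows one possible approach: it strips whitespace, drops blank lines and duplicates, and keeps only absolute http or https URLs. The function name clean_urls is hypothetical and not part of api.py.

```python
from urllib.parse import urlsplit


def clean_urls(lines: list[str]) -> list[str]:
    """Strip whitespace, drop blanks and duplicates, keep only http(s) URLs."""
    seen = set()
    cleaned = []
    for line in lines:
        url = line.strip()
        if not url or url in seen:
            continue
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # skip anything that is not an absolute web URL
        seen.add(url)
        cleaned.append(url)
    return cleaned


if __name__ == "__main__":
    raw = [
        "https://www.unixwinbsd.site/2023/12/quick-way-to-configure-openssh-server.html\n",
        "\n",
        "https://www.unixwinbsd.site/2023/12/quick-way-to-configure-openssh-server.html\n",
        "not-a-url\n",
    ]
    print(clean_urls(raw))  # only the first URL survives, once
```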

After that, install the Google indexing API dependencies.
(pypoetry) root@ns6:~/.cache/pypoetry/index # pip install loguru
(pypoetry) root@ns6:~/.cache/pypoetry/index # pip install httplib2
(pypoetry) root@ns6:~/.cache/pypoetry/index # pip install oauth2client
After everything is configured, run the indexing script with the command "poetry run python api.py".
(pypoetry) root@ns6:~/.cache/pypoetry/index # poetry run python api.py
2023-12-10 09:02:43.914 | INFO     | __main__:indexation_worker:76 - Processing... Please wait
2023-12-10 09:02:43.915 | DEBUG    | __main__:indexation_worker:86 - Sending https://www.unixwinbsd.site/2023/12/quick-way-to-configure-openssh-server.html
2023-12-10 09:02:45.731 | DEBUG    | __main__:indexation_worker:103 - ['https://www.unixwinbsd.site/2023/12/quick-way-to-configure-openssh-server.html', 'URL_UPDATED', 200]
2023-12-10 09:02:45.732 | DEBUG    | __main__:indexation_worker:86 - Sending https://www.unixwinbsd.site/2023/12/how-to-learn-to-create-blog-with-go_0385964493.html
2023-12-10 09:02:47.097 | DEBUG    | __main__:indexation_worker:103 - ['https://www.unixwinbsd.site/2023/12/how-to-learn-to-create-blog-with-go_0385964493.html', 'URL_UPDATED', 200]
2023-12-10 09:02:47.097 | DEBUG    | __main__:indexation_worker:86 - Sending https://www.unixwinbsd.site/2023/12/how-to-learn-to-create-blog-with-go_0324087457.html
2023-12-10 09:02:48.458 | DEBUG    | __main__:indexation_worker:103 - ['https://www.unixwinbsd.site/2023/12/how-to-learn-to-create-blog-with-go_0324087457.html', 'URL_UPDATED', 200]
2023-12-10 09:02:48.458 | DEBUG    | __main__:indexation_worker:86 - Sending https://www.unixwinbsd.site/2023/12/freebsd-openssh-key-authentication-ssh.html
2023-12-10 09:02:49.814 | DEBUG    | __main__:indexation_worker:103 - ['https://www.unixwinbsd.site/2023/12/freebsd-openssh-key-authentication-ssh.html', 'URL_UPDATED', 200]
2023-12-10 09:02:49.814 | DEBUG    | __main__:indexation_worker:86 - Sending https://www.unixwinbsd.site/2023/12/how-to-learn-to-create-blog-with-go.html
2023-12-10 09:02:51.156 | DEBUG    | __main__:indexation_worker:103 - ['https://www.unixwinbsd.site/2023/12/how-to-learn-to-create-blog-with-go.html', 'URL_UPDATED', 200]
2023-12-10 09:04:45.648 | INFO     | __main__:index_api:121 - Done! We've sent 105 URLs to Googlebot. You can check report in report_log.csv
Congratulations, you have successfully submitted 105 URLs to Googlebot. You can check the results in report_log.csv.
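
The report is a tab-delimited file, so a few lines of Python are enough to summarize it. The sketch below counts rows per status code, assuming the column order written by api.py (URL, METHOD, STATUS_CODE, and so on). The function name summarize_report and the sample rows, including the error text, are made up for illustration.

```python
import csv
import io
from collections import Counter


def summarize_report(report_file) -> Counter:
    """Count rows per STATUS_CODE in a tab-delimited report."""
    reader = csv.reader(report_file, delimiter="\t")
    next(reader, None)  # skip the header row
    # Column index 2 is STATUS_CODE, matching the header written by api.py.
    return Counter(row[2] for row in reader if len(row) > 2)


if __name__ == "__main__":
    # Hypothetical report content; in practice pass
    # open("report_log.csv", encoding="utf-8", newline="") instead.
    sample = io.StringIO(
        "URL\tMETHOD\tSTATUS_CODE\tERROR_MESSAGE\tERROR_STATUS\tKEY FILE\n"
        "https://example.com/a.html\tURL_UPDATED\t200\n"
        "https://example.com/b.html\tURL_UPDATED\t200\n"
        "https://example.com/c.html\tURL_UPDATED\t429\tQuota exceeded\tRESOURCE_EXHAUSTED\tkey.json\n"
    )
    for status, count in summarize_report(sample).items():
        print(status, count)
```

A large share of 429 rows would suggest that the daily quota for the key file has been used up.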


b. Ping Sitemap

You may have already submitted your sitemap in Google Search Console; if so, rest assured that the search engine knows about your sitemap. Even so, you can ping the sitemap at least once a day, or several times a day, so that it is refreshed.

To do this, create a file called "/root/.cache/pypoetry/index/ping.py", and enter the script below in the "ping.py" file.
(pypoetry) root@ns6:~/.cache/pypoetry/index # ee ping.py

from time import sleep

import requests
from loguru import logger


def ping():
    urls = [
        'https://www.unixwinbsd.site/sitemap.xml',
    ]
    base = 'https://www.google.com/ping?sitemap='

    for url in urls:
        ping_url = f'{base}{url}'
        logger.info(f'Requesting url {ping_url}')
        r = requests.get(ping_url, timeout=10)  # do not hang forever on a stalled connection
        if r.status_code == 200:
            logger.info('\t >>> Good. Sitemap sent.')
        else:
            logger.warning(f'Bad request. Status code: {r.status_code}')
        sleep(0.2)  # sleep for 0.2 seconds to be less spammy


if __name__ == '__main__':
    ping()
Replace "https://www.unixwinbsd.site/sitemap.xml" with the sitemap URL of your own website. After that, run the ping script with the command "poetry run python ping.py".
(pypoetry) root@ns6:~/.cache/pypoetry/index # poetry run python ping.py
2023-12-10 10:02:26.089 | INFO     | __main__:ping:15 - Requesting url https://www.google.com/ping?sitemap=https://www.unixwinbsd.site/sitemap.xml
2023-12-10 10:02:26.340 | INFO     | __main__:ping:18 -          >>> Good. Sitemap sent.
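
The ping script appends the sitemap address to the base URL as-is, which works for a simple address like the one above. If your sitemap URL carries its own query string, percent-encoding it first avoids its parameters being read as parameters of the ping request. The following is a minimal sketch of that variation; the function name build_ping_url is our own, and the base URL is the one used in ping.py.

```python
from urllib.parse import quote

PING_BASE = "https://www.google.com/ping?sitemap="  # same base as in ping.py


def build_ping_url(sitemap_url: str, base: str = PING_BASE) -> str:
    """Return the ping URL with the sitemap address percent-encoded."""
    # safe="" also encodes ":" and "/", so the whole address travels as one value.
    return base + quote(sitemap_url, safe="")


if __name__ == "__main__":
    print(build_ping_url("https://www.unixwinbsd.site/sitemap.xml"))
```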

That covers the basics of Python Poetry: setting up the project structure, managing dependencies with Poetry, and connecting to Google's indexing services. Now you just need to wait for Googlebot to crawl your site and index your Blogger or website URLs.
Iwan Setiawan

