My blog posts

Here is my blog with information about my recent projects in IT automation, AI and web design. 

I am setting up an automatic news mashup that collects articles from news sites through GDELT and summarizes them with the txtai API. My vision is an automatic news mashup generator that can help improve a website's SEO ranking by creating many articles or blog posts in a short time.

GDELT project
GDELT v2 API

txtai is a Python framework that uses machine learning, natural language processing, and the latest transformer models to process the articles and create a summary. It can also be used for several other tasks. Read more about it here.
GDELT is the Global Database of Events, Language and Tone, a multilingual search and analysis project for researching press and media reporting.
It can be accessed in different ways. I am using the old API to query news by keyword. You build up a URL, query it, and then collect the output or present it as HTML.
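To make the URL-building step concrete, here is a minimal Python sketch. It assumes the GDELT DOC 2.0 endpoint and its `query`, `mode`, `format` and `maxrecords` parameters; the older API mentioned above may use a different base URL, so check the GDELT documentation first. The function name and the keyword `automation` are made up for illustration.

```python
from urllib.parse import urlencode

# Assumed base URL of the GDELT DOC 2.0 API (verify against the GDELT docs).
GDELT_DOC_API = "https://api.gdeltproject.org/api/v2/doc/doc"


def build_gdelt_url(keyword, country="sweden", language="swedish", max_records=50):
    """Build a keyword query URL that asks for a JSON list of articles."""
    # Country and language are passed as operators inside the query string.
    query = f"{keyword} sourcecountry:{country} sourcelang:{language}"
    params = {
        "query": query,
        "mode": "artlist",      # list of matching articles
        "format": "json",       # machine-readable output instead of HTML
        "maxrecords": max_records,
    }
    return f"{GDELT_DOC_API}?{urlencode(params)}"


if __name__ == "__main__":
    # "automation" is only a placeholder keyword.
    print(build_gdelt_url("automation"))
```

The printed URL can be pasted into a browser or into a Node-RED HTTP request node to inspect the result.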

I will use Node-RED to query GDELT, go through the output, and filter the links so that only news sites with free content remain. The flow then clicks the GDPR consent button so that it can continue and download the content.

I planned to set up the txtai API on a Windows PC with an Nvidia GPU, but the startup script didn't work. There was only a description of how to start it on Linux. I searched the GitHub repository for answers, and also the documentation for FastAPI, which is used for the API. They had some information, but not enough to solve it, so the next step is to ask for help.

For now I will continue by spinning up a Docker container. I will not get the GPU speed then, but I can run it in batches instead. I had Docker Desktop and WSL2 installed, but they had stopped working. I got lots of problems with both WSL and Docker, so I have to uninstall everything and start from scratch. In the meantime I am continuing the work on my laptop to get a working concept.
Here I started with Node-RED and created a flow that connects to GDELT and downloads news for Sweden, in Swedish, with the keyword.
Then I filtered the URLs down to news that is free, and with a wdio puppeteer node I click on consent and then download the news. The next step is to set up the txtai Docker container on the laptop, connect to the API from the Node-RED flow, and pass in the news from the site for processing.
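The filtering step could look something like this in Python. This is only a sketch: my flow does the filtering in Node-RED, and the allow-list below uses made-up `.example` domains as placeholders for the sites that are actually free to read.

```python
from urllib.parse import urlparse

# Hypothetical allow-list; replace with the news sites you know are free.
FREE_DOMAINS = {"free-news.example", "open-paper.example"}


def is_free(url, allowed=FREE_DOMAINS):
    """Return True if the URL's host is on the allow-list."""
    host = (urlparse(url).hostname or "").lower()
    # Treat www.site.example and site.example as the same site.
    if host.startswith("www."):
        host = host[4:]
    return host in allowed


def filter_free(urls, allowed=FREE_DOMAINS):
    """Keep only the article URLs that point at free news sites."""
    return [url for url in urls if is_free(url, allowed)]
```

An allow-list is the simplest choice here because new paywalled sites are then dropped by default instead of slipping through.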

Today I managed to set up WSL2 and Docker Desktop.

Txtai API web page
GDELT v2 API

The steps involved in setting this up were the following:

- Set up a subdomain on Cloudflare
- Create, configure, build and run a Docker container for an nginx reverse proxy
- Add the subdomain and port to the nginx proxy
- Configure the txtai workflow to translate the text, summarize it, and translate the summary back to Swedish.
- Create, configure, build and run the Docker container for the txtai API.
- Access the API from the web page to see that it is working.
- Connect to it from a Node-RED platform to load articles from GDELT.
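
Once the container is running, a quick check from Python could look like the sketch below. It assumes the txtai API exposes a `POST /workflow` endpoint that takes a JSON body with `name` and `elements`, and that it listens on `http://localhost:8000`. The workflow name `summarize_sv` is a placeholder for whatever the workflow is called in your configuration.

```python
import json
from urllib.request import Request, urlopen


def build_workflow_request(text, base_url="http://localhost:8000", name="summarize_sv"):
    """Build (but do not send) a POST request that runs a named txtai workflow."""
    # "summarize_sv" is a hypothetical workflow name; use the one from your config.
    body = json.dumps({"name": name, "elements": [text]}).encode("utf-8")
    return Request(
        f"{base_url}/workflow",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_workflow(text, **kwargs):
    """Send the request and return the decoded JSON response."""
    # Summarizing on CPU can be slow, so allow a generous timeout.
    with urlopen(build_workflow_request(text, **kwargs), timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_workflow("...article text...")` should return the workflow output if the assumptions above match your setup. The same request can be made from a Node-RED HTTP request node.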


The performance of the txtai API was acceptable on CPU. It is also possible to run it in a Docker container that can use a GPU, which could be the next step to improve performance.


I have started connecting Txtai API to the Node-RED platform to process text that is extracted from the GDELT API.

IT automation using Node-RED and Txtai API