Intelek Data Pipes

Data collection software

All your data in one place

Intelek Collector solves the problem of data dispersion by importing your data from any source and integrating it into a single centralized information system, the source of all your organization's knowledge, ready for subsequent exploitation and analysis.


What types of data sources are there?

Today's companies accumulate very diverse tools and technologies, so their data ends up stored in very different systems built on different storage technologies. These data sources fall into five main types, each of which calls for different data capture techniques.

1. Third party management software

Organizations commonly rely on management software developed by third parties, whose underlying databases cannot be accessed directly. These products very often provide an interface for accessing the data programmatically, that is, an API. Each product has its own API, with completely different rules for accessing the data, which makes developing a separate data collection program for each API very expensive.

Intelek Collector allows you to:

  • Import data from the cloud software you use: ERP, CRM, eCommerce, and many more.

  • Scale automatically to synchronize large data volumes at high frequencies.

  • Adapt flexibly to new and unusual software APIs.
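
To illustrate why per-API development is costly and how a common record shape reduces that cost, here is a minimal Python sketch. It is not Intelek Collector's implementation: the two payloads, the `Customer` schema and the adapter names are all hypothetical, chosen only to show two unrelated APIs being normalized into one structure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Customer:
    """Common shape every source is normalized into (hypothetical schema)."""
    source: str
    customer_id: str
    name: str
    email: str


# Hypothetical payloads standing in for what two unrelated APIs might return.
CRM_RESPONSE = [{"Id": 17, "FullName": "Ada Lovelace", "EmailAddress": "ADA@example.com"}]
SHOP_RESPONSE = [
    {"customer": {"uid": "c-9", "first": "Alan", "last": "Turing", "mail": "alan@example.com"}}
]


def from_crm(record: dict) -> Customer:
    # The assumed CRM uses flat, capitalized field names.
    return Customer("crm", str(record["Id"]), record["FullName"], record["EmailAddress"].lower())


def from_shop(record: dict) -> Customer:
    # The assumed shop API nests the customer and splits the name in two.
    inner = record["customer"]
    full_name = f"{inner['first']} {inner['last']}"
    return Customer("shop", inner["uid"], full_name, inner["mail"].lower())


# One small adapter per source; everything downstream only sees Customer.
ADAPTERS: Dict[str, Callable[[dict], Customer]] = {"crm": from_crm, "shop": from_shop}


def collect(source: str, records: Iterable[dict]) -> List[Customer]:
    """Normalize raw records from a named source into the common shape."""
    adapter = ADAPTERS[source]
    return [adapter(record) for record in records]


if __name__ == "__main__":
    for customer in collect("crm", CRM_RESPONSE) + collect("shop", SHOP_RESPONSE):
        print(customer)
```

In this sketch, supporting a new API means writing one more adapter function and registering it, while storage and analysis code stay unchanged.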

2. On-premise databases

It is common for organizations to have several custom-developed applications, each with its own database, independent of the others. This makes it hard to build a centralized data warehouse, since it requires connecting to several data sources of very different natures and in separate locations. Intelek Collector allows you to:

  • Import data that resides in a local database, be it MySQL, Oracle, PostgreSQL, Microsoft SQL Server, or other database systems.

  • Connect to databases on a local network or in the cloud.
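
As a rough illustration of importing rows from a relational database, the Python sketch below reads a table in batches and yields each row as a dictionary. It uses an in-memory SQLite database from the standard library purely as a stand-in; the `orders` table and the `iter_rows` helper are hypothetical, and connecting to MySQL, Oracle, PostgreSQL or SQL Server would require the corresponding driver and connection settings, which are not shown here.

```python
import sqlite3
from typing import Iterator


def iter_rows(conn: sqlite3.Connection, table: str, batch_size: int = 500) -> Iterator[dict]:
    """Yield every row of `table` as a dict, fetching `batch_size` rows at a time."""
    # Table names cannot be passed as SQL parameters, so validate before interpolating.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cursor = conn.cursor()
    cursor.execute(f"SELECT * FROM {table}")
    columns = [column[0] for column in cursor.description]
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield dict(zip(columns, row))


def demo_connection() -> sqlite3.Connection:
    """Build a small in-memory database standing in for an on-premise source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)",
        [(1, "acme", 120.5), (2, "globex", 80.0), (3, "acme", 42.0)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    connection = demo_connection()
    for record in iter_rows(connection, "orders", batch_size=2):
        print(record)
    connection.close()
```

Fetching in batches keeps memory use bounded when the source table is large, which matters when many independent databases feed one warehouse.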

3. Document, PDF or image connectors

Valuable information is sometimes locked in PDF documents, often produced by scanning physical documents, or in images, whether in the text they contain or in their graphic elements. Intelek Collector allows you to:

  • Extract valuable information from PDF files or scanned documents.

  • Extract tables from PDF files such as invoices and delivery notes.

  • Extract text from images (.png, .jpg).

  • Extract information from the objects that appear in the images.

4. Obtain data from a web page (scraping)

Data found on the web pages of your competitors and other online sites can be very valuable for decision-making. This data can be captured on a periodic or on-demand basis and fed into your data warehouse to:

  • Perform analytics by enriching your data.

  • Monitor changes in competitor data.

  • Compare the data of your competitors with that of your company.
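
The monitoring use case above can be sketched in a few lines of Python. The example parses two snapshots of a page and reports which prices changed between them. The HTML markup (`li` elements with `data-name` and `data-price` attributes) is a hypothetical layout invented for this sketch; a real page would need its own extraction rules, and the page would first have to be downloaded, for instance with `urllib.request.urlopen`.

```python
from html.parser import HTMLParser
from typing import Dict, Tuple


class ProductParser(HTMLParser):
    """Collect {product name: price} from <li class="product" ...> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.prices: Dict[str, float] = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "li" and attributes.get("class") == "product":
            self.prices[attributes["data-name"]] = float(attributes["data-price"])


def extract_prices(html: str) -> Dict[str, float]:
    parser = ProductParser()
    parser.feed(html)
    return parser.prices


def price_changes(
    previous: Dict[str, float], current: Dict[str, float]
) -> Dict[str, Tuple[float, float]]:
    """Return {name: (old price, new price)} for products present in both snapshots."""
    return {
        name: (previous[name], current[name])
        for name in previous.keys() & current.keys()
        if previous[name] != current[name]
    }


# Two hypothetical snapshots of the same competitor page, captured on different days.
SNAPSHOT_MONDAY = """
<ul>
  <li class="product" data-name="kettle" data-price="19.99">Kettle</li>
  <li class="product" data-name="toaster" data-price="34.50">Toaster</li>
</ul>
"""
SNAPSHOT_TUESDAY = """
<ul>
  <li class="product" data-name="kettle" data-price="17.49">Kettle</li>
  <li class="product" data-name="toaster" data-price="34.50">Toaster</li>
</ul>
"""

if __name__ == "__main__":
    changes = price_changes(extract_prices(SNAPSHOT_MONDAY), extract_prices(SNAPSHOT_TUESDAY))
    print(changes)
```

Storing each snapshot in the data warehouse, rather than only the differences, also makes it possible to compare competitor prices with your own over time.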

5. Data Providers

Sometimes the most useful data for decision-making is not available internally and cannot be extracted from the web either. In these situations it may be worth acquiring it from private data providers, at a price adapted to the needs of the project, to help you:

  • Understand how certain behaviors of citizens affect your business, for example changes in the population's food consumption in summer or at Christmas.

  • Identify the events that most affect the operation of your business, for example the impact of weather on product distribution logistics.

  • Automate market studies that contextualize your business.

Management of periodic data collection tasks

Create multiple sync tasks from the sources you have enabled.

  • Create as many tasks as you want for the same data source.

  • Select the entities you are interested in downloading in the process.

  • Choose the database or data lake in which to store the data of each task.

  • Define the execution schedule with crontab expressions.
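
For readers unfamiliar with crontab syntax, the Python sketch below shows how a standard five-field expression (minute, hour, day of month, month, day of week) can be checked against a point in time. It is a simplified illustration, not the scheduler used by the product: it supports `*`, lists, ranges and steps, but not month or weekday names, and the function names are hypothetical.

```python
from datetime import datetime
from typing import Set

# (minimum, maximum) for minute, hour, day of month, month, day of week (0 = Sunday)
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field: str, low: int, high: int) -> Set[int]:
    """Expand one crontab field such as '*/15', '1-5' or '0,30' into a set of values."""
    values: Set[int] = set()
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start_text, end_text = base.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = int(base)
            # A single value with a step (e.g. '5/15') runs up to the field maximum.
            end = high if step_text else start
        if step < 1 or not (low <= start <= end <= high):
            raise ValueError(f"invalid crontab field: {field!r}")
        values.update(range(start, end + 1, step))
    return values


def matches(expression: str, moment: datetime) -> bool:
    """Return True if `moment` falls on the schedule described by `expression`."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected five fields: minute hour day-of-month month day-of-week")
    minute, hour, day, month, weekday = (
        expand_field(field, low, high) for field, (low, high) in zip(fields, FIELD_RANGES)
    )
    cron_weekday = (moment.weekday() + 1) % 7  # cron counts from Sunday, Python from Monday
    time_ok = moment.minute in minute and moment.hour in hour and moment.month in month
    day_ok, weekday_ok = moment.day in day, cron_weekday in weekday
    # Standard cron rule: if both day fields are restricted, matching either is enough.
    if fields[2] != "*" and fields[4] != "*":
        return time_ok and (day_ok or weekday_ok)
    return time_ok and day_ok and weekday_ok


if __name__ == "__main__":
    # Every 15 minutes during the 02:00 hour, every day.
    print(matches("*/15 2 * * *", datetime(2024, 1, 1, 2, 30)))
```

For example, `0 6 * * 1` would describe a task that runs at 06:00 every Monday, and `*/15 * * * *` one that runs every 15 minutes.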

Monitoring the execution of the data collection tasks

With the Data Collector monitoring module you can check the status of the data collection tasks at any time, including:

  • Volume of data downloaded.

  • Date and time of the last data acquisition.

  • Date and time of the next scheduled data collection task.

  • Data preparation and cleansing tasks scheduled after the collection task (and expected completion time).

Statistics

CPU consumption statistics for each data collection task.

Errors

Monitoring of errors in the execution of processes.

Data volume

Insights on the volume of data downloaded over time.

Interested?

We will be more than happy to listen to you and turn your needs into data-driven growth opportunities for your business.

Contact us