Company

Techunting

Address: Mendoza, Mendoza
Category: Research and development

Job description


1. Data Acquisition and Integration:

  • Source, gather, and integrate data from various internal and external platforms and databases.
  • Collaborate with data providers and stakeholders to understand and ensure data availability and quality.

2. Data Cleaning:

  • Identify, diagnose, and resolve any data inconsistencies, anomalies, and missing data.
  • Design and implement data cleaning procedures to enhance data quality and reliability.
  • Document any data transformations, anomalies, and resolutions to maintain data integrity.

3. Database Management:

  • Design and maintain scalable and optimized database schemas for storing and retrieving machine learning datasets.

4. Creation of Queries:

  • Develop complex SQL queries to extract, transform, and load (ETL) data tailored to specific machine learning tasks.
  • Create data views and aggregations to simplify data access and usage by machine learning teams.
  • Optimize query performance to ensure swift data retrieval.

5. Process Optimization & Data Pipelines:

  • Design, implement, and maintain ETL pipelines for seamless data flow across systems.
  • Optimize existing data processes for speed, cost-efficiency, and reliability.
  • Automate recurring tasks and jobs to ensure timely data availability for machine learning projects.
  • Monitor data pipelines to keep them running smoothly, troubleshoot any issues, and resolve them quickly.

6. Support for Machine Learning Testing:

  • Work with the team to understand data needs for model development and testing.
  • Assist in debugging data-related issues in machine learning pipelines, such as data leakage, imbalances, or missing values.

Requirements and Qualifications:

  • Bachelor's or higher degree in a related field (e.g., Data Science, Computer Science, Statistics).
  • Knowledge of ETL tools (e.g., Apache Nifi, Talend, Informatica) for data acquisition and transformation.
  • Experience using APIs and connectors for data extraction from various sources.
  • Proficiency in database management systems (e.g., MySQL, PostgreSQL, MongoDB) and in SQL.
  • Ability to design and maintain optimized database schemas.
  • Proficiency in writing complex SQL queries for data extraction and transformation.
  • Experience with task automation and orchestration tools, such as Apache Airflow.
  • Ability to use data monitoring and logging tools to ensure proper data flow.
  • Capability to troubleshoot and provide efficient solutions in case of data flow interruptions.
  • Understanding of data requirements for machine learning projects and the ability to identify data issues in ML pipelines.
  • Ability to document data transformations and issue resolutions to maintain data integrity.
  • Skill in optimizing ETL processes and queries for efficient performance.
  • Knowledge of a programming language such as Python or Java is beneficial for automation and customization tasks.
  • Familiarity with data cleaning tools.
  • Awareness of data security best practices and the ability to implement security measures in data flows.
  • Strong problem-solving skills and attention to detail.
  • Excellent communication and collaboration skills to work with cross-functional teams.
Reference code: 544528. Techunting - The previous day - 2024-01-31 05:04
