Data Engineer

About the job

We are looking for a Data Engineer with 3+ years of experience in a Data Engineering role.

Your key responsibilities

Data Pipeline Architecture and Development

Design, build, test, and maintain highly scalable data pipelines that serve machine learning models and analytics.
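
For a flavor of this work, here is a minimal sketch of such a pipeline, assuming PySpark; the bucket paths and column names are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("feature-pipeline").getOrCreate()

# Read raw events (hypothetical path), derive simple per-user features
# for a downstream ML model, and write the result as Parquet.
events = spark.read.json("s3://example-bucket/raw/events/")
features = (
    events
    .filter(F.col("event_type") == "purchase")
    .groupBy("user_id")
    .agg(
        F.count("*").alias("purchase_count"),
        F.avg("amount").alias("avg_amount"),
    )
)
features.write.mode("overwrite").parquet("s3://example-bucket/features/user_purchases/")
```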

Data Integration

Work closely with data scientists, ML engineers, and stakeholders to ensure that data is accessible, consistent, and reliable for ongoing projects.

API and Data Services

Develop and maintain APIs for data access and manipulation, and integrate with external data services as needed.
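
As a rough sketch of such a data-access API, here is a minimal read-only endpoint, assuming FastAPI and an in-memory stand-in for a real feature store:

```python
from fastapi import FastAPI, HTTPException

app = FastAPI()

# Hypothetical in-memory stand-in for a real feature store lookup.
FEATURES = {"user-123": {"purchase_count": 5, "avg_amount": 42.0}}

@app.get("/features/{user_id}")
def get_features(user_id: str):
    """Return precomputed features for one user, or 404 if unknown."""
    if user_id not in FEATURES:
        raise HTTPException(status_code=404, detail="user not found")
    return FEATURES[user_id]
```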

Data Storage

Manage and optimize data storage solutions, including relational databases, search engines such as Elasticsearch, and NoSQL databases, to support the requirements of machine learning models.

Understand data engines and data structures in order to design effective solutions for transactional, analytics, and search workloads.
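
For example, a search-oriented store can be exercised as follows; this sketch assumes the elasticsearch-py 8.x client, a local cluster, and a hypothetical products index:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index a document, refresh so it is searchable, then run a full-text query.
es.index(index="products", id="1", document={"name": "blue widget", "price": 9.99})
es.indices.refresh(index="products")
resp = es.search(index="products", query={"match": {"name": "widget"}})
for hit in resp["hits"]["hits"]:
    print(hit["_source"])
```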

Data Quality and Governance

Implement processes to monitor data quality and ensure production data is always accurate and available for key stakeholders.
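
One lightweight way to implement such monitoring is a set of explicit checks run inside the pipeline itself; a sketch assuming pandas and hypothetical column names:

```python
import pandas as pd

def run_quality_checks(df: pd.DataFrame) -> list[str]:
    """Return a list of data-quality issues found in a batch of orders."""
    issues = []
    if df["order_id"].isnull().any():
        issues.append("null order_id values")
    if df.duplicated(subset=["order_id"]).any():
        issues.append("duplicate order_id rows")
    if (df["amount"] < 0).any():
        issues.append("negative order amounts")
    return issues

# Example batch with deliberate problems: a duplicate id and a negative amount.
batch = pd.DataFrame({"order_id": [1, 2, 2], "amount": [10.0, -5.0, 7.5]})
for issue in run_quality_checks(batch):
    print("WARN:", issue)
```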

Collaboration and Support

Collaborate with ML engineers to resolve data-related technical issues, and provide architectural guidance and solutions.

Security and Compliance

Ensure compliance with data security and privacy policies.

Documentation

Maintain clear and up-to-date documentation, including data dictionaries, metadata, and architectural diagrams.

Skills and attributes for success

  • Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field; or equivalent work experience.
  • 3+ years of experience in a Data Engineering role.
  • Proficiency in SQL and in programming languages such as Python, Java, and Scala.
  • Hands-on experience with big data technologies such as Hadoop, Spark, and Flink.
  • Familiarity with machine learning frameworks such as TensorFlow, PyTorch, or similar.
  • Strong understanding of data warehousing concepts, ETL processes, and data modeling.
  • Experience with API development and integration with data services.
  • Experience with cloud platforms such as AWS and GCP.
  • Knowledge of DevOps practices, CI/CD methods, and containerization technologies such as Docker and Kubernetes.
  • Experience with real-time data processing.

Technical stack

  • Programming Languages: Python, Java, Scala, SQL, Bash
  • Big Data Technologies: Hadoop, Spark, Flink
  • Databases: MySQL, PostgreSQL, MongoDB, Cassandra, HBase, Redis
  • Cloud Platforms: Azure
  • API Development: RESTful APIs, GraphQL, OpenAPI
  • Data Services: Kafka, RabbitMQ (see the sketch after this list)
  • Containers: Docker, Kubernetes
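
For the data-services side of the stack, here is a minimal Kafka producer sketch, assuming the kafka-python client, a local broker, and a hypothetical events topic:

```python
import json
from kafka import KafkaProducer

# Serialize dicts to JSON bytes before sending to the broker.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("events", {"user_id": "user-123", "event_type": "purchase", "amount": 42.0})
producer.flush()
```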