Description
Job Overview:
We are seeking an experienced Data Engineer to join our growing data team. The ideal candidate will have 3-5 years of experience developing and managing data pipelines, optimizing data infrastructure, and collaborating with data scientists and analysts to support data-driven decision-making. This role focuses on building and maintaining scalable data architectures that ensure accurate and efficient data flow across systems.
Key Responsibilities:
- Design and Develop Data Pipelines: Build, maintain, and optimize scalable data pipelines to ingest, transform, and load data across various platforms (cloud and on-premises).
- ETL Process Management: Lead the development and optimization of ETL (Extract, Transform, Load) processes to ensure seamless data flow, integrity, and consistency across internal and external systems (a brief illustrative sketch follows this list).
- Database Management: Manage and optimize relational and NoSQL databases (e.g., MySQL, PostgreSQL, MongoDB, Cassandra) for performance, reliability, and scalability.
- Cloud Data Infrastructure: Work with cloud platforms such as AWS, Google Cloud Platform (GCP), or Azure to deploy and scale data solutions. Familiarity with cloud data services (e.g., AWS Redshift, GCP BigQuery) is required.
- Data Warehousing: Build and maintain data warehousing solutions, ensuring the efficient and reliable storage of large datasets.
- Data Quality Assurance: Ensure the accuracy, consistency, and availability of data by implementing data quality and validation checks across data pipelines.
- Automation and Monitoring: Automate routine tasks and implement monitoring to ensure data systems and pipelines run efficiently with minimal downtime.
- Collaborate with Cross-functional Teams: Work closely with data scientists, business analysts, and product teams to understand data requirements and deliver solutions that enable data analysis and insights.
- Data Governance and Security: Implement best practices for data governance, security, and compliance, ensuring adherence to local regulations such as the Philippine Data Privacy Act of 2012 (RA 10173).
- Documentation: Maintain clear and up-to-date documentation for data pipelines, infrastructure, and data models to ensure knowledge sharing and transparency across teams.
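To make the pipeline and data-quality responsibilities above concrete, here is a minimal sketch of an Airflow DAG with an extract-validate-load shape. It is illustrative only: the DAG name orders_etl, the sample rows, and the not-null rule are assumptions for the example, not details of this role.

```python
# Illustrative sketch only; names, sample data, and the validation rule are hypothetical.
from datetime import datetime, timedelta

from airflow.decorators import dag, task


@dag(
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
)
def orders_etl():
    @task
    def extract() -> list[dict]:
        # Placeholder for pulling raw rows from a source system (API, S3, OLTP DB).
        return [{"order_id": 1, "amount": 125.50}, {"order_id": 2, "amount": 89.99}]

    @task
    def validate(rows: list[dict]) -> list[dict]:
        # Data-quality gate: fail the run if a required field is missing.
        bad = [r for r in rows if r.get("amount") is None]
        if bad:
            raise ValueError(f"{len(bad)} rows failed the not-null check on 'amount'")
        return rows

    @task
    def load(rows: list[dict]) -> None:
        # Placeholder for writing validated rows to a warehouse (Redshift, BigQuery, etc.).
        print(f"loaded {len(rows)} validated rows")

    load(validate(extract()))


orders_etl()
```

The retry settings and the dedicated validation task mirror the automation, monitoring, and data-quality duties listed above.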
Required Skills and Qualifications:
- Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field.
- 3-5 years of experience in data engineering or a similar role, ideally in a fast-paced, data-driven environment.
- Proficiency in SQL and experience working with relational databases (e.g., MySQL, PostgreSQL, SQL Server); a short SQL-from-Python snippet follows this list.
- Strong programming skills in Python, Java, or Scala for building data pipelines, automating tasks, and processing large datasets.
- Experience with ETL and workflow tools (e.g., Apache Airflow, Talend, Apache NiFi) for automating data workflows and orchestrating data transformations.
- Hands-on experience with cloud data services such as AWS (Redshift, S3, Lambda), Google Cloud Platform (BigQuery, Cloud Storage), or Microsoft Azure.
- Familiarity with NoSQL databases (e.g., MongoDB, Cassandra) for working with unstructured data.
- Experience with data warehousing solutions like Amazon Redshift, Snowflake, or Google BigQuery.
- Strong understanding of data governance principles and data privacy laws, especially the Philippine Data Privacy Act of 2012.
- Excellent problem-solving skills and the ability to troubleshoot data issues across systems.
- Familiarity with version control systems (e.g., Git) for collaborative development.
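As a minimal sketch of the SQL-plus-Python pairing listed above: the snippet below uses the standard-library sqlite3 driver so it is self-contained and runnable, though on the job the same pattern would target MySQL or PostgreSQL. The orders table and the not-null probe are made-up examples.

```python
# Illustrative only: a typical data-quality probe expressed in SQL, driven from Python.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (order_id, amount) VALUES (?, ?)",
    [(1, 125.50), (2, None), (3, 89.99)],
)

# Count rows violating a not-null expectation on a required column.
row = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE amount IS NULL"
).fetchone()
print(f"rows with NULL amount: {row[0]}")  # -> 1
conn.close()
```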
Preferred Skills:
- Experience with big data technologies (e.g., Apache Spark, Hadoop, Kafka).
- Knowledge of containerization and orchestration tools such as Docker and Kubernetes for deploying and managing applications.
- Experience with CI/CD tools for automating data workflows and deployments.
- Familiarity with data visualization tools such as Tableau, Power BI, or Looker for presenting and analyzing data.
- Basic understanding of machine learning workflows and how to prepare data for model development.