Data Engineer (SQL/Database/Python)
182 Le Dai Hanh Street, Ward 15, District 11, Ho Chi Minh City
Not specified
2019-10-30 to 2019-10-31
- Must have:
- BA/BS degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
- 2+ years of experience as a Data Architect, Data Engineer, Data Scientist, Data Analyst or similar role.
- Advanced SQL knowledge and experience with relational databases, including query authoring and working familiarity with a variety of database systems.
- Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- Experience with big data tools (e.g. Hadoop, Spark, Kafka, or similar).
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
- Experience with data pipeline and workflow management tools (e.g. Azkaban, Luigi, Airflow, or similar).
- Experience with AWS cloud services (e.g. EC2, EMR, RDS, Redshift).
- Experience with stream-processing systems (e.g. Storm, Spark-Streaming or similar).
- Experience with object-oriented/functional scripting languages (e.g. Python, Java, C++, Scala).
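As a hypothetical illustration of the two core skills above (SQL query authoring driven from a scripting language), here is a minimal sketch using Python's standard-library sqlite3 module; the `orders` table and its rows are invented for illustration:

```python
import sqlite3

# An in-memory SQLite database stands in for a relational store (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 120.0), ("bob", 80.0), ("alice", 40.0)],
)

# Query authoring: total spend per customer, highest first.
rows = conn.execute(
    "SELECT customer, SUM(amount) AS total FROM orders "
    "GROUP BY customer ORDER BY total DESC"
).fetchall()
print(rows)  # [('alice', 160.0), ('bob', 80.0)]
```

In practice the same pattern applies against Postgres or Redshift via their respective client libraries; parameterized queries (the `?` placeholders) are used instead of string interpolation.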
- Having any of the following experience/knowledge is a plus:
- SaaS product development.
- Experience with eCommerce, CRM, Inventory Management, or Order Processing…
- Familiarity with data visualization tools (e.g. Tableau, D3.js and R)
- Business Intelligence skills (e.g. MS SSAS, DAX, Power BI)
- Statistical knowledge (e.g. basic Python or R statistics libraries)
- ETL development knowledge (e.g. MS SSIS, Airflow)
We are looking for a Data Engineer to join our Data Science team. The hire will be responsible for:
- Extract, transform, and load data from a wide variety of data sources.
- Design, build, optimize, and maintain infrastructure for optimal extraction, transformation, and loading of data.
- Create and maintain optimal data pipeline architecture.
- Provide ETL applications and scripts across a range of data sources and stores in support of batch data processing and reporting.
- Assemble large, complex data sets that meet functional/non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Design the Data Warehouse based on best practices and industry standards.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and big data technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Develop and maintain tools to monitor performance and schedule data-related tasks.
- Work with stakeholders including the Executive, Product, Data Science, and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centers and cloud platform regions.
- Create data tools that assist the Data Science team in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Research and test new data processing technologies and data architectures.
- Write technical documents.
- Manage multiple petabyte-scale clusters and develop systems to handle security, disaster recovery, and replication.
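The extract-transform-load responsibility above can be sketched, at toy scale, in pure Python; the CSV payload, the `orders` schema, and the filtering threshold are all invented for illustration, and SQLite stands in for a real warehouse:

```python
import csv
import io
import sqlite3

# Extract: parse raw CSV (here an in-memory string; a real batch job would read files or APIs).
RAW = "order_id,amount\n1,19.5\n2,5.0\n3,42.5\n"
records = list(csv.DictReader(io.StringIO(RAW)))

# Transform: cast types and filter out low-value orders (threshold is invented).
cleaned = [
    (int(r["order_id"]), float(r["amount"]))
    for r in records
    if float(r["amount"]) >= 10.0
]

# Load: write the cleaned rows into a relational store and verify the totals.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", cleaned)
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(total)  # 62.0
```

At production scale the same extract/transform/load stages would be expressed as tasks in a workflow manager such as Airflow and run against the stores named in the requirements (Postgres, Redshift, Cassandra).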