Data Scientist
Sendo.vn
Ho Chi Minh
Not specified
2017-07-01 -> 2017-07-02
- Experience with Machine Learning techniques (e.g., classification, clustering and feature engineering) is a big advantage
- Experience with Artificial Intelligence (e.g., Natural Language Processing - especially for the Vietnamese language, and Computer Vision) is a big advantage
- Good knowledge of algorithms and data structures
- Critical thinking and good problem-solving skills
- Good programming skills in one or more languages, especially Java, Python or R
- Good knowledge of databases. Experience with NoSQL databases such as MongoDB and HBase is an advantage
- Experience with software development on Linux
- Master's or PhD degree in Computer Science, Statistics, Mathematics or a related field is an advantage
- Experience with the following technologies is an advantage
- Develop new algorithms and improve existing ones for Sendo.vn’s various data products, which run on multiple platforms (desktop website, mobile website, and Android and iOS mobile apps). Potential projects include improving or implementing the following systems:
  - A search engine to return the products that are most relevant to a user’s search keyword, based on language data (e.g., semantics extracted from Vietnamese search keywords) and past user behavior data.
  - A product ranking system to predict the best-selling products of each category (e.g., fashion and electronics), based on past transaction data and user behavior data.
  - An advertising system based on users’ search keywords (which is similar to Google AdWords, and requires large-scale, real-time processing capabilities).
  - A recommender system to predict products that a user may like, based on his or her past behavior data.
  - A system to detect counterfeit products, based on product attributes (e.g., price), textual cues (e.g., product description) and visual cues (e.g., product images).
- To develop the above systems, you will have access to Sendo.vn’s various data sources:
  - User behavior logs (e.g., search keyword data, product impression data and product click data)
  - Transaction data (e.g., completed orders and cancelled orders)
  - Product databases (e.g., product descriptions and images)
  - Third-party data (e.g., Google Analytics data)
- Work with the Head of Data Science through all stages of the algorithm development process:
  - Build training and test datasets from the aforementioned data sources
  - For each system, develop an algorithm that balances accuracy and speed (Sendo.vn’s websites receive millions of visits per month, so speed is of great importance)
  - Work with data engineers to deploy the developed algorithms in production
  - Design A/B tests to measure the impact of the developed algorithms on e-commerce metrics such as click-through rate (CTR) and conversion rate (CR)
- Perform ad-hoc analyses of user behavior logs and transaction data to derive important business insights and metrics, e.g., customer retention rate and customer lifetime value
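
As an illustration of the A/B testing work mentioned above, here is a minimal sketch of how a difference in CTR between two variants might be checked with a two-proportion z-test. The function name, its parameters and the sample counts are hypothetical and chosen only for illustration. This is one common approach, not necessarily the method used at Sendo.vn.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the CTR difference of variant B vs. A.

    Hypothetical helper: assumes independent impressions and samples
    large enough for the normal approximation to hold.
    """
    ctr_a = clicks_a / views_a
    ctr_b = clicks_b / views_b

    # Pooled CTR under the null hypothesis that both variants perform equally.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)

    # Standard error of the difference between the two proportions.
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))

    z = (ctr_b - ctr_a) / se

    # Two-sided p-value from the standard normal distribution:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: control (A) vs. new algorithm (B).
    z, p = two_proportion_z_test(1000, 10000, 1200, 10000)
    print(f"z = {z:.2f}, p = {p:.4g}")
```

A small p-value suggests the observed CTR difference is unlikely to be due to chance alone. The same test applies to conversion rate by substituting orders for clicks and visits for impressions.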