Senior Principal Big Data Engineer
Primary Location: Beijing, China
Additional Location(s): Beijing, China
We are looking for a passionate data engineer to develop robust, scalable data models and optimize the consumption of the data sources we need to generate unique insights about our systems. You will solve some of the hardest business problems by leveraging ML/AI. Most of our use cases start with 10 million+ rows of data.
We are transforming Marketing at Dell. Come join Dell's Customer Engagement Platform.
About the position
Our data scientists & engineers are helping to shape our marketing strategy for a digital world.
What are our goals in Marketing?
- Drive perception;
- Drive demand;
- Grow the buyer base;
- Build the best content.
The models and solutions you deliver will be the essential ingredients we use to supply our customers and sales teams with relevant product recommendations, and to drive creative breakthroughs for our business.
You will work with other engineers and data scientists to solve some of the hardest business problems. You will learn what it takes to build, deploy, and scale ML/AI models in the real world (it’s a hard problem, we assure you). You will build the analytical datasets on which models are built. This entails two key tasks: feature engineering on large datasets and optimization of our pipelines to drive scalability. If you are already good at this, we will make you better.
We offer you a fast-paced work environment in a diverse team with exposure to Dell teams across the globe. You will be part of a dynamic, collaborative, and cross-functional team focused on making significant business impacts.
Your key responsibilities:
- Feature engineering: Stitch together and aggregate multiple large datasets, working with data scientists to reach the desired format. You will need to optimize both queries and architecture to support big datasets.
- Data pipelining: Once the initial datasets are prepared, they normally run through a further pipeline to prepare them for modelling. Here you will create the features that make ML/AI algorithms work (e.g. translating text into category variables). Expect to work with datasets with billions of rows and thousands of columns. Python, R, and Spark are usually the preferred tools.
- Lead solution design (often working with multiple business and engineering teams), define timelines, and deliver on commitments with minimal supervision.
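To make the first two responsibilities concrete, here is a minimal sketch of the kind of work involved, using pandas on a small hypothetical dataset (the column names and values are illustrative only; at the billion-row scale described above, the same logic would run on Spark rather than pandas):

```python
import pandas as pd

# Hypothetical sample of event-level marketing data (illustrative names).
events = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "product_line": ["laptop", "monitor", "laptop", "server", "server", "laptop"],
    "spend": [1200.0, 300.0, 950.0, 5000.0, 4200.0, 1100.0],
})

# Feature engineering: aggregate event-level rows into per-customer features.
features = events.groupby("customer_id").agg(
    total_spend=("spend", "sum"),
    order_count=("spend", "count"),
)

# Data pipelining: translate a text column into category variables
# (one-hot encoding) so downstream ML/AI algorithms can consume it.
dummies = pd.get_dummies(events["product_line"], prefix="bought")
features = features.join(dummies.groupby(events["customer_id"]).max())

print(features)
```

The aggregation step and the encoding step mirror the two tasks above: building an analytical dataset, then transforming it into model-ready features.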
You would need to have (Essential Requirements):
- Deep hands-on experience with data processing software (such as Hadoop, Spark, Pig, Hive) and data processing paradigms and tools (MapReduce, Flume).
- Advanced SQL and Python/R coding skills.
- Education (Bachelor’s degree or higher) in Computer Science, Mathematics, or a related technical field, or equivalent practical experience.
- 6+ years of professional experience in Data or BI engineering dealing with large, complex data scenarios.
In addition to the above, it’s a plus if you have:
- Experience in software development in one or more languages such as Scala, Java, Go, C++, or similar. Experience with graph databases.
- Proven ability to work with varied data infrastructures – including relational databases, column stores, NoSQL databases, and file-based storage solutions.
- Ability to set up containerized services using e.g. Harmony, Kubernetes and Docker.
- Exposure to machine learning or machine learning pipelines.