Job description
We are looking for a Big Data Engineer to join our partner company and help their team improve and maintain their big data pipelines at scale. The team contributes directly to the success of the company's Identity Engine by building systems that process terabytes of transaction data every day.
The network ingestion team provides the data for our data scientists, both as labeled transactions for experimentation and model training and as computed features for evaluating transactions against the deployed models in production. This involves various processing steps on a large volume of incoming transactions, including the calculation of complex features called network signals, which we execute as distributed computations.
In the Big Data Engineer role you will:
- Design and develop our mission-critical big data processing systems using Apache Spark and related technologies.
- Develop a deep understanding of Spark and its internals to continually optimize our computations for runtime and cost efficiency.
- Maintain a sufficiently generic yet simple and economical solution.
- Insist on the highest coding standards; follow and create best practices for clean code and architecture.
- Maintain a sense of urgency, manage risks to project timelines, and propose creative strategies for delivering constant business value.
- Develop a deep understanding of the data and a good sense of signal versus noise to help the business shape new products.
Requirements
Our ideal Big Data Engineer will have:
- Experience in building complex ETLs, Data Warehousing or custom pipelines from multiple data sources, including proper monitoring, alerting, verification, and metrics in a commercial environment.
- Experience with the Apache Spark ecosystem in a production environment.
- Experience working with AWS EMR, Databricks and/or similar platforms.
- Passion for diving deep into data and insights.
- Strong sense of urgency and bias for results.
- 1+ years of software development experience.
- AWS Cloud experience with EMR, S3, EC2 in a big data environment.
- Proven track record of building multi-tenant, scalable enterprise software in the cloud.
- Understanding of JVM fundamentals and garbage collection optimization.
- Bachelor’s degree in Computer Science or a related area.
Benefits
- learning & development opportunities
- calm office environment
- yearly budget for conferences and training
- wide range of cafeteria (flexible fringe) benefits
- health insurance
- stock options