You’ll be responsible for designing and implementing scalable, high-performance data systems while helping to define best practices across the organization.
From building pipelines to enabling advanced analytics and ML use cases, your work will directly shape how we leverage data for our global clients.
Responsibilities:
Design and implement scalable data architectures (ETL/ELT)
Build and maintain data pipelines for ingestion, processing, and consumption
Develop APIs and data services to support analytics and product use cases
Work with AWS and/or Azure to deploy modern data platforms (data lakes, warehouses, streaming)
Automate data workflows with a focus on resilience, observability, and performance
Collaborate with Data Scientists, Analysts, and Product teams to bring models into production
Participate in architecture decisions and code reviews
Explore and implement emerging approaches (Data Mesh, MLOps, RAG, etc.) 🚀
Requirements:
5–7+ years of experience in Data Engineering or backend/data-focused development
Strong expertise in Python (Pandas, NumPy, SQLAlchemy)
Advanced SQL skills (optimization, complex queries, performance tuning)
Experience with Apache Spark and Airflow
Solid experience with data warehouses (Snowflake, BigQuery, Redshift)
Strong knowledge of AWS and/or Azure ecosystems
Experience building production-ready pipelines with CI/CD and Docker
Experience working in agile environments
Nice to have:
Experience with Kafka or real-time streaming
Knowledge of Scala or Java (for Spark)
Experience with NoSQL databases (MongoDB, Redis, Cassandra)
Familiarity with Kubernetes and Terraform
Exposure to MLOps or AI systems (RAG, ML pipelines)
Experience in startups or fast-paced environments
What we value:
Scalable architecture & performance optimization
Strong problem-solving and systems thinking
Ownership and autonomy
Clear communication and ability to work cross-functionally